basis, but in the past five years two new
tools have come to my attention that,
although they may not have been specifically designed with code spelunking in mind, both make significant
contributions to the field. The tools are
Doxygen [2] and DTrace [6]. Here, I discuss
each tool and how it can help us understand large code bases.
Doxygen. Right at the top of the Doxygen Web page [2] we find the following:
“Doxygen is a documentation system
for C++, C, Java, Objective-C, Python,
IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.” As the blurb says, Doxygen was
designed with documenting source
code in mind—and it is quite a good
system for documenting source code
so that the output is usable as man pages and manuals—but it has a few features that make it applicable to code
spelunking, too.
Doxygen reads in all,
or part, of a source tree, looking for
documentation tags that it can extract
and turn into nicely formatted output
suitable for documenting a program.
It can produce Unix man pages, LaTeX,
HTML, RTF, PostScript, and PDF.
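To make the idea of documentation tags concrete, here is a minimal sketch of a C function annotated for Doxygen. The function clamp_to_range and the file name clamp.c are hypothetical and exist only for illustration. The comment block uses the /** ... */ form and the @file, @brief, @param, and @return commands, all of which Doxygen recognizes.

```c
#include <assert.h>

/**
 * @file clamp.c
 * @brief Hypothetical example file showing Doxygen documentation tags.
 */

/**
 * @brief Clamp a value to the inclusive range [lo, hi].
 *
 * @param value The value to clamp.
 * @param lo    Lower bound of the range.
 * @param hi    Upper bound of the range; must not be less than lo.
 * @return lo if value is below the range, hi if it is above the range,
 *         otherwise value unchanged.
 */
int clamp_to_range(int value, int lo, int hi)
{
    assert(lo <= hi); /* enforce the precondition documented above */
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}
```

Run over a file like this, Doxygen would pick up the tagged comment and attach the description, parameters, and return value to the function's entry in whichever output format was requested.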
What is most interesting for the
code spelunker is Doxygen’s ability to
extract information from any source
code by running pre-processors over
the code in question. Doxygen is a static analysis tool in that it analyzes the
source code of a program but does not
look into the program state while it is
running. The great thing about a static
analysis tool is that it can be run at any
time and does not require that the software be executing. In analyzing something like an operating system, this is
extremely helpful.
ILLUSTRATION BY JOHN HERSEY
The features that make Doxygen
most relevant to our work are those
related to how data is extracted from
the source code. When you start out
with the intention of documenting your
own code with Doxygen, you are already
working with the system, and very little
extra needs to be done. If you’re code
spelunking an unknown code base
then you will need to be more aggres-