basis, but in the past five years two new tools have come to my attention that, although they may not have been specifically designed with code spelunking in mind, make significant contributions to the field. The tools are Doxygen [2] and DTrace [6]. Here, I discuss each tool and how it can help us understand large code bases.
Doxygen. Right at the top of the Doxygen Web page [2] we find the following: “Doxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.” As the blurb says, Doxygen was designed with documenting source code in mind, and it is quite a good system for turning source code into usable man pages and manuals, but it has a few features that make it applicable to code spelunking, too.
Doxygen reads in all or part of a source tree, looking for documentation tags that it can extract and turn into nicely formatted output suitable for documenting a program. It can produce Unix man pages, LaTeX, HTML, RTF, PostScript, and PDF.
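To make the idea of documentation tags concrete, here is a minimal sketch of a Python function annotated with Doxygen-style special comments. The function `clamp` and its parameters are hypothetical, invented only for illustration. The `##` comment opener and the `@brief`, `@param`, and `@return` commands are the ones Doxygen documents for Python sources.

```python
## @brief Restrict a value to a closed interval.
#
#  Hypothetical example function, not taken from any real code base.
#
#  @param value  The number to restrict.
#  @param low    Lower bound of the interval.
#  @param high   Upper bound of the interval.
#  @return       value, or the nearest bound if it lies outside the interval.
def clamp(value, low, high):
    if low > high:
        raise ValueError("low must not exceed high")
    # Pull the value up to the lower bound, then down to the upper bound.
    return max(low, min(value, high))
```

Run over a file like this one, Doxygen should attach the comment block to `clamp` in whatever output format is selected. The same tag vocabulary applies, with different comment syntax, to the other languages it supports.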
What is most interesting for the code spelunker is Doxygen’s ability to extract information from any source code by running pre-processors over the code in question. Doxygen is a static analysis tool in that it analyzes the source code of a program but does not look into the program state while it is running. The great thing about a static analysis tool is that it can be run at any time and does not require that the software be executing. In analyzing something like an operating system, this is extremely helpful.
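The difference between static and dynamic analysis can be illustrated with a small sketch. This is not how Doxygen is implemented. It uses Python's standard `ast` module, and the source text and function names (`read_block`, `mount`, `parse_header`) are hypothetical. The point is that the code under study is only parsed, never run, so it does not matter that `parse_header` is undefined.

```python
import ast

# Hypothetical source text to analyze; it is parsed, never executed.
SOURCE = '''
def read_block(dev, n):
    return dev.read(n)

def mount(dev):
    header = read_block(dev, 0)
    return parse_header(header)
'''


def list_calls(source):
    """Map each top-level function name to the sorted names it calls."""
    tree = ast.parse(source)
    result = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls = set()
            # Walk the function body looking for call expressions.
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call):
                    func = sub.func
                    if isinstance(func, ast.Name):
                        calls.add(func.id)        # plain call: f(...)
                    elif isinstance(func, ast.Attribute):
                        calls.add(func.attr)      # method call: obj.f(...)
            result[node.name] = sorted(calls)
    return result


if __name__ == "__main__":
    for name, calls in list_calls(SOURCE).items():
        print(name, "->", ", ".join(calls))
```

A table of which function calls which is the kind of information a code spelunker wants from a static tool, and it can be gathered without building or booting the system being studied.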
ILLUSTRATION BY JOHN HERSEY
The features that make Doxygen most relevant to our work are those related to how data is extracted from the source code. When you start out with the intention to document your own code with Doxygen you are already working with the system and very little extra needs to be done. If you’re code spelunking an unknown code base then you will need to be more aggres-
References: