from which we can derive clues as to how the software is structured. One clue that is relatively easy to see is another hot spot in the packet output code, namely tcp_output(), which is called from seven different routines.
The kind of information that Doxygen can show comes at a price. Generating the graphs shown here, which required analyzing 136 files comprising 125,000 lines of code, took 45 minutes on a dual-core 2.5GHz MacBook Pro laptop. Most of that time went to generating the call and caller graphs, which are by far the most useful pieces of information to a code spelunker. 5
DTrace. One of the most talked-about system tools of the last few years is DTrace, a project from Sun Microsystems released under the CDDL that has been ported to the FreeBSD and Mac OS X operating systems. Regardless of whether the designers of DTrace were specifically targeting code spelunking when they wrote their tool, it is clearly applicable.
DTrace has several components: a command line program, a language, and a set of probes that give information about various events that occur throughout the system. The system was designed such that it could be run against an application for which the user had no source code.
DTrace is the next logical step in the line of program-tracing tools that came before it, such as ktrace and truss. What DTrace brings to code spelunking is a much richer set of primitives, both in its probes and in the D language, which makes it easier for code spelunkers to answer the questions they have. A program like ktrace shows only the system calls that a program executes while it’s running, which are all of the application’s interactions with the operating system. On a typical OS these number in the low hundreds, and while they can give clues to what a complex piece of software is doing, they are not the whole story. Nor can ktrace trace the operating system itself, something that can now be accomplished using DTrace.
When people discuss DTrace they often point out the large number of probes available, which on Mac OS X is more than 23,000. This is somewhat misleading. Not all of the probes are
Table 1. Comparing the sizes of the systems as discussed in 2003 and today.

Program              Version     Files    Lines        Change in lines
Apache Web server    1.3         471      158,332
Apache Web server    2.2.8       1108     374,993      136%
Emacs                21          2586     1,317,915
Emacs                22          2598     1,771,282    34%
FreeBSD kernel       5.1         4758     2,140,517
FreeBSD kernel       7.0         6723     3,556,087    66%
Linux kernel         2.4.20-8    12417    5,223,290
Linux kernel         2.6.25-3    19483    8,098,992    55%
Python               2.2.3       1158     356,314
Python               2.5.2       2379     910,573      155%
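
The "Change in lines" column can be reproduced from the line counts alone. The sketch below is ours rather than part of the original measurements: the function name percent_growth is hypothetical, and it assumes the published percentages were rounded down to a whole number, which is what the Python row (155%, not 156%) suggests.

```python
# Line counts copied from Table 1:
# (program, old version, old lines, new version, new lines).
TABLE_1 = [
    ("Apache Web server", "1.3", 158_332, "2.2.8", 374_993),
    ("Emacs", "21", 1_317_915, "22", 1_771_282),
    ("FreeBSD kernel", "5.1", 2_140_517, "7.0", 3_556_087),
    ("Linux kernel", "2.4.20-8", 5_223_290, "2.6.25-3", 8_098_992),
    ("Python", "2.2.3", 356_314, "2.5.2", 910_573),
]


def percent_growth(old_lines: int, new_lines: int) -> int:
    """Return the growth in lines as a percentage, rounded down.

    Rounding down is an assumption made to match the table.
    """
    return (new_lines - old_lines) * 100 // old_lines


if __name__ == "__main__":
    for program, old_ver, old_lines, new_ver, new_lines in TABLE_1:
        growth = percent_growth(old_lines, new_lines)
        print(f"{program}: {old_ver} -> {new_ver}: {growth}%")
```

Run as a script, this prints one line per program, which can be checked against the last column of the table.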
Table 2. Features in the Doxyfile.

Feature           Meaning
EXTRACT_ALL       Extract everything you can from the source code.
SOURCE_BROWSER    Create a full cross-reference of the source code.
CLASS_DIAGRAMS    Create class diagrams and inheritance graphs.
HAVE_DOT          Create useful code spelunking graphs.
CALL_GRAPH        Make a call graph following all function calls.
CALLER_GRAPH      Output a graph of the caller dependencies.
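
As a rough illustration of how the features in Table 2 end up in a configuration file, the following sketch prints them as Doxyfile assignments. Setting every tag to YES is our assumption, since the table names the features but not their values, and the helper render_doxyfile_fragment is a hypothetical name, not part of Doxygen.

```python
# Tags from Table 2. Enabling each one with YES is an assumption;
# the table lists the features but does not show their values.
SPELUNKING_TAGS = [
    "EXTRACT_ALL",
    "SOURCE_BROWSER",
    "CLASS_DIAGRAMS",
    "HAVE_DOT",
    "CALL_GRAPH",
    "CALLER_GRAPH",
]


def render_doxyfile_fragment(tags):
    """Return Doxyfile text with one aligned 'TAG = YES' line per tag."""
    width = max(len(tag) for tag in tags)
    return "\n".join(f"{tag.ljust(width)} = YES" for tag in tags) + "\n"


if __name__ == "__main__":
    # Print the fragment so it can be merged into an existing Doxyfile.
    print(render_doxyfile_fragment(SPELUNKING_TAGS), end="")
```

The call and caller graphs depend on HAVE_DOT, which in turn needs the Graphviz dot program to be installed, so those three tags are best treated as a group.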
Table 3. Providers available in Mac OS X.

Provider     Purpose
dtrace       Probes related to dtrace itself
fbt          Entry and exit points for functions
io           I/O probes
lockstat     Probes related to locking
plockstat    pthread lock related probes
proc         Process-specific information
profile      Profiling and performance data
syscall      Information on system calls
vminfo       Virtual memory probes
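
A provider from Table 3 is the first of four colon-separated fields that DTrace uses to describe a probe: provider:module:function:name, where an empty field matches anything. The sketch below only splits such a description into its parts. The class and function names are hypothetical, the example probe strings are illustrative rather than taken from a real system, and requiring all four fields is a simplification on our part.

```python
from typing import NamedTuple


class ProbeDescription(NamedTuple):
    provider: str
    module: str
    function: str
    name: str


def parse_probe(description: str) -> ProbeDescription:
    """Split 'provider:module:function:name' into its four fields.

    An empty field is kept as "" (DTrace treats it as a wildcard).
    This sketch insists on exactly four fields for simplicity.
    """
    fields = description.split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {description!r}")
    return ProbeDescription(*fields)


if __name__ == "__main__":
    # Illustrative descriptions; the module field is left empty in both.
    for text in ("syscall::open:entry", "fbt::tcp_output:entry"):
        print(parse_probe(text))
```

Keeping the four fields separate makes it easy to group or filter a long probe listing by provider before deciding which probes are worth enabling.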
immediately usable, and in reality, having such an embarrassment of riches makes picking the most useful probes for a particular job difficult. A probe is some piece of code in an application, library, or the operating system that can be instrumented to record information on behalf of the user. The probes are broken down into several categories based on what they record. Each
probe is delineated by its Provider, Module, Function, and Name. Providers are named after systems such as io, lockstat, proc, profile, syscall, vminfo, and dtrace itself. There are several distinct providers available in Mac OS X, although naively printing them all will show you that several exist on a per-process basis. The per-process probes show information on core data within