Code Spelunking Redux
although naively printing them all will show you that
several exist on a per-process basis. The per-process probes
show information on core data within the process, as well
as on locks. Some of the providers available on Mac OS X
are shown here:
Provider    Purpose
dtrace      Probes related to dtrace itself
fbt         Entry and exit points for functions
io          I/O probes
lockstat    Probes related to locking
plockstat   pthread lock-related probes
proc        Process-specific information
profile     Profiling and performance data
syscall     Information on system calls
vminfo      Virtual memory probes

Probes are associated not only with providers but also
with modules, which contain the code you want to instrument,
as well as with functions, which are also subject to
observation. The name of a trace point specifies a single
probe. All of these categories are put together in a DTrace
script or command line to form a sort of address that
specifies what the engineer is trying to observe. The
canonical form is provider:module:function:name, with an
empty section indicating a wildcard. Although the two manuals
from Sun are excellent references on DTrace,9,10 a quick
example will demonstrate how it can be used for code
spelunking.

When presented with a new and unknown system to spelunk,
one of the first things to find out is which services the
program uses. Programs such as ktrace and truss can show
this type of information, but DTrace greatly extends this
ability. We will now find out which services the ls program
requires to execute, as well as which ones are used most
often.

1: pid$target:::entry
2: {
3:     @[probefunc] = count();
4: }

The script here is written in the D language and should be
relatively easy to decipher for anyone familiar with C. The
script contains a single clause, which counts the entry into
any call that the ls program makes. Where a C programmer
might find a function name and argument list, we instead see
what is called a probe description. The probe description is
a way of selecting the probes for which DTrace will record
data. The probe description on line 1 selects the entry into
any call for the associated process. When the calls.d script
is executed with dtrace, the $target macro variable is
replaced with the process ID of the program that is given
after the -c command-line argument:

> sudo dtrace -s calls.d -c ls
dtrace: script 'calls.d' matched 5906 probes
[output of ls command removed for brevity]
dtrace: pid 7008 has exited
strcoll                 148
strcoll_l               148
__error                 326
free                    353
strcmp                  381
wcwidth                 614
_none_mbrtowc           662
mbrtowc                 662
putchar                 705
pthread_getspecific    1424
DTrace also allows the tracing of live processes by
replacing -c with -p and the program name with a live
process ID. Here we show the abbreviated output from
the execution of ls under DTrace. Only the last several
lines, those with high entry counts, are shown. From this
snapshot we can see that ls does a lot of work with the
string functions strcoll and strcmp, and if we were trying
to optimize the program we might look first at where
these functions were called.
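
One way to follow up on where these functions are called is
to aggregate on the user-level stack instead of the function
name. The following is only a minimal sketch, not part of the
original calls.d script; the file name callers.d and the
stack depth of 5 are assumptions chosen for illustration.

```d
/*
 * callers.d (hypothetical file name): count the distinct
 * user-level call stacks that lead into strcoll.
 * The empty module field acts as a wildcard.
 */
pid$target::strcoll:entry
{
    /* Key the aggregation on the top 5 user stack frames. */
    @callers[ustack(5)] = count();
}
```

Run the same way as calls.d (sudo dtrace -s callers.d -c ls),
this should print each distinct stack with its count when ls
exits, pointing at the call sites worth examining first.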
With thousands of predefined probe points, and the
ability to create probes dynamically for user processes, it’s
obvious that DTrace is the most powerful code-spelunking tool developed in the past decade.
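
As an illustration of the predefined probe points, the
syscall provider can answer the earlier question about which
services a program uses at the system-call level. This is a
hedged sketch rather than a script from the article; the file
name syscalls.d is an assumption.

```d
/*
 * syscalls.d (hypothetical file name): count system calls
 * made by the process started with -c (or attached with -p).
 */
syscall:::entry
/pid == $target/    /* predicate: only the target process */
{
    @calls[probefunc] = count();
}
```

Because the syscall probes fire for every process on the
system, the predicate between the slashes limits counting to
the target process; the pid provider used in calls.d needs no
such filter because its probes belong to a single process.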
CONTINUING CHALLENGES
In reviewing the tools mentioned here—as well as those
that are not—a few challenges remain apparent. The first
32 November/December 2008 ACM QUEUE