Code Spelunking Redux

although naively printing them all will show you that several exist on a per-process basis. The per-process probes show information on core data within the process, as well as on locks. Some of the providers available on Mac OS X are shown here:

The script here is written in the D language and should be relatively easy to decipher for anyone familiar with C. The script contains a single clause, which counts the entry into any call that the ls program makes. Where a C programmer might find a function name and argument list, we instead see what we will call a predicate (the DTrace manuals use the term probe description). The predicate is a way of selecting the probes for which DTrace will record data. The predicate on line 1 selects the entry into any call for the associated process. When the calls.d script is executed with dtrace, the $target macro variable in pid$target is replaced with the process ID of the program that is given after the -c command-line argument:

> sudo dtrace -s calls.d -c ls
dtrace: script 'calls.d' matched 5906 probes

Provider     Purpose
dtrace       Probes related to dtrace itself
fbt          Entry and exit points for functions
io           I/O probes
lockstat     Probes related to locking
plockstat    pthread lock-related probes
proc         Process-specific information
profile      Profiling and performance data
syscall      Information on system calls
vminfo       Virtual memory probes

[output of ls command removed for brevity]

dtrace: pid 7008 has exited

strcoll                  148
strcoll_l                148
__error                  326
free                     353
strcmp                   381
wcwidth                  614
_none_mbrtowc            662
mbrtowc                  662
putchar                  705
pthread_getspecific     1424

Probes are associated not only with providers but also with modules, which identify the code you want to instrument, and with functions, which are also subject to observation. The name of a trace point specifies a single probe. All of these categories are put together in a DTrace script or command line to form a sort of address that specifies what the engineer is trying to observe. The canonical form is provider:module:function:name, with an empty section indicating a wildcard. Although the two manuals from Sun are excellent references on DTrace [9, 10], a quick example will demonstrate how it can be used for code spelunking.

When presented with a new and unknown system to spelunk, one of the first things to find out is which services the program uses. Programs such as ktrace and truss can show this type of information, but DTrace greatly extends this ability. We will now find out which services the ls program requires to execute, as well as which ones are used most often.

1: pid$target:::entry
2: {
3:     @[probefunc] = count();
4: }
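To make the addressing on line 1 of the script concrete, the following Python sketch models how a selector such as pid7008:::entry (pid$target with the process ID substituted) can be read as four fields, with each empty field acting as a wildcard. This is only an illustrative model, not DTrace's implementation: the names Probe, parse_description, and matches are invented here, the module name libexample.dylib is hypothetical, and the sketch assumes the full four-field form is always written out.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class Probe(NamedTuple):
    """One probe, addressed by the four fields of the canonical form."""
    provider: str
    module: str
    function: str
    name: str


def parse_description(description: str) -> Probe:
    """Split 'provider:module:function:name'; an empty field becomes '*'."""
    fields = description.split(":")
    if len(fields) != 4:
        # Simplifying assumption: only the full four-field form is handled.
        raise ValueError(f"expected four ':'-separated fields: {description!r}")
    return Probe(*(field or "*" for field in fields))


def matches(description: str, probe: Probe) -> bool:
    """Return True when every field of the probe fits the description."""
    pattern = parse_description(description)
    return all(fnmatchcase(value, glob) for value, glob in zip(probe, pattern))


# Hypothetical probe for illustration; the module name is made up.
strcmp_entry = Probe("pid7008", "libexample.dylib", "strcmp", "entry")

print(parse_description("pid7008:::entry"))
print(matches("pid7008:::entry", strcmp_entry))
print(matches("pid7008::strcmp:return", strcmp_entry))
```

Leaving the module and function fields empty is what lets a single line select the entry probe of every function in the process, which is consistent with the thousands of probes the script reports matching.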

DTrace also allows the tracing of live processes by replacing -c with -p and the program name with a live process ID. Here we show the abbreviated output from the execution of ls under DTrace. Only the last several lines, those with high entry counts, are shown. From this snapshot we can see that ls does a lot of work with the string functions strcoll and strcmp, and if we were trying to optimize the program we might look first at where these functions were called.
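The counting done by line 3 of the script, @[probefunc] = count(), can be pictured as a table keyed by function name whose value is incremented each time a matching probe fires. The Python sketch below models that behavior under stated assumptions: aggregate_entries is an invented name, the event list is made-up sample data rather than real ls output, and the result is sorted in ascending order of count to mirror the ordering of the listing in this article.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def aggregate_entries(events: Iterable[str]) -> List[Tuple[str, int]]:
    """Count entry events per function and order them by ascending count."""
    counts: Counter = Counter()
    for probefunc in events:
        # One increment per fired probe, keyed by the function name.
        counts[probefunc] += 1
    return sorted(counts.items(), key=lambda item: item[1])


# Hypothetical stream of function-entry events, in the order they fired.
events = ["strcmp", "putchar", "strcmp", "free", "putchar", "strcmp"]

for function, count in aggregate_entries(events):
    print(f"{function:>12} {count:>6}")
```

Because the busiest functions sort to the bottom, reading only the last few lines of the output is enough to see where a program spends most of its calls.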

With thousands of predefined probe points, and the ability to create probes dynamically for user processes, it’s obvious that DTrace is the most powerful code-spelunking tool developed in the past decade.

CONTINUING CHALLENGES

In reviewing the tools mentioned here—as well as those that are not—a few challenges remain apparent. The first

32 November/December 2008 ACM QUEUE

