these traces isn’t immediately apparent. Unlike the emulator, you do not
control the (hardware) implementation of the microprocessor running
your software.
One may object that such verbose
logging is not particularly feasible for
real programs. Although hard-drive
storage continues to decrease in cost,
I/O bandwidth does have limits. If
storage is truly limited, there are a few
other approaches. For example, the
traces between snapshots could be
stored only in RAM and not written to
the log file. Traces could be recreated
as necessary (by rerunning the target
from a given state). Alternatively, only
the most recent trace could be kept
in main memory, where it can be scanned quickly when needed.
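The storage-saving options above can be sketched in a few lines of Python. This is a minimal illustration, not the emulator discussed in the article: `ToyMachine`, `run_with_snapshots`, and `recreate_trace` are hypothetical names, and the sketch assumes the target is fully deterministic so that rerunning from a snapshot reproduces the same trace.

```python
import copy
from collections import deque


class ToyMachine:
    """Hypothetical deterministic target with a tiny amount of state."""

    def __init__(self):
        self.state = {"pc": 0, "acc": 1}

    def step(self):
        # Execute one deterministic "instruction" and return a trace record.
        self.state["acc"] = (self.state["acc"] * 3 + 1) % 1000
        self.state["pc"] += 1
        return (self.state["pc"], self.state["acc"])


def run_with_snapshots(machine, steps, interval, recent_size):
    """Run the target, saving a snapshot every `interval` steps.

    Only the most recent `recent_size` trace records are kept in RAM;
    nothing is written to a log file.
    """
    snapshots = {}                      # step number -> saved state
    recent = deque(maxlen=recent_size)  # bounded in-memory trace
    for i in range(steps):
        if i % interval == 0:
            snapshots[i] = copy.deepcopy(machine.state)
        recent.append(machine.step())
    return snapshots, recent


def recreate_trace(snapshots, start, count):
    """Regenerate `count` trace records by rerunning from a snapshot."""
    machine = ToyMachine()
    machine.state = copy.deepcopy(snapshots[start])
    return [machine.step() for _ in range(count)]
```

The trade-off is time for space: older trace records cost a rerun from the nearest snapshot, while the most recent ones are available immediately.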
The complete immutability of our
CPU implementation is an elaborate
illusion in many cases. All Java (and
.NET) programs are actually running
on virtual CPUs. Modifying these virtual machines to record trace information and state snapshots is conceivable—no hardware would have to be
changed. It is merely a matter of convincing the owners of the virtual machine implementations to make the
necessary changes.
Even native executables are a step
removed from the real hardware. All
modern operating systems enforce a
separation between user and kernel
modes of execution. A process is, in
fact, a virtual entity. The capability already
exists to snapshot a process’s
execution state (Unix’s venerable core
dump). With some additional operating-system
support, it is quite feasible
to take these snapshot states and restore
them as real processes that can
continue their execution.
Higher-Level Traces
Tracing memory accesses is helpful
for programs written in languages such as assembler or
C, where memory and CPU transfers correspond easily with actual source code.
It is more common, however, to be working
with languages a step removed from
the machine, where the C code is an
interpreter of the actual language in
use. In this case the low-level actions of
the code do not easily map to recognizable actions in the interpreted language.
The use of I/O is also more complicated in modern systems. Programs do
not read and write directly to hardware
I/O locations. Instead, device interaction is mediated through the operating system.
Tracing can be applied to these
higher-level programs in an obvious
fashion: change the trace to record
higher-level events. The exact events to
capture would depend on the kind of
program. A GUI program may need to
capture mouse, keyboard, and window
events. A program that manipulates
files would capture open/close and
read/write operations. Code built on
top of a database might log SQL statements and results.
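As a rough sketch of what recording higher-level events might look like for a file-manipulating program, the wrapper below logs open, read, write, and close operations instead of individual memory accesses. The class name `TracedFile` and the tuple layout of the trace records are assumptions made for illustration only.

```python
import io


class TracedFile:
    """Hypothetical wrapper that records high-level file events."""

    def __init__(self, name, fileobj, trace):
        self._name = name
        self._file = fileobj
        self._trace = trace
        self._trace.append(("open", name))

    def read(self, size=-1):
        data = self._file.read(size)
        # Record both the request and its result so the trace can be checked later.
        self._trace.append(("read", self._name, size, data))
        return data

    def write(self, data):
        written = self._file.write(data)
        self._trace.append(("write", self._name, data, written))
        return written

    def close(self):
        self._file.close()
        self._trace.append(("close", self._name))


# Usage: an in-memory stream stands in for a real file.
trace = []
f = TracedFile("demo.txt", io.StringIO("hello world"), trace)
f.read(5)
f.close()
```

Recording results alongside requests matters: a trace that only lists the calls made is closer to a simple event log.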
A good trace differs from a simple
log of events. The trace must provide
enough information that the correctness
of execution can be verified using
only the trace. It should be possible to
construct a reference implementation
that can read the trace and automatically
verify that the correct decisions
are made. Experience with emulators
suggests you might first code the reference
implementation and use it to
verify the production code. (You may
avoid the premature optimization trap
and discover that the simplified reference
implementation is a satisfactory
solution.) Writing a reference might
seem to involve as much effort as the
production version, but there are always
requirements that do not change
the functional output. For example,
the production code may require a
lookup table to be persistent, whereas
the reference can use a simpler
in-memory hash table.
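To make the lookup-table example concrete, here is a minimal sketch of a reference implementation that replays a trace and checks it. The trace format, tuples of `(operation, key, value)` with `"put"` and `"get"` operations, and the function name `verify_trace` are assumptions chosen for this illustration; a production trace would carry whatever the real table records.

```python
def verify_trace(trace):
    """Replay a lookup-table trace against an in-memory reference.

    Returns the index of the first record whose recorded result differs
    from the reference, or None if the whole trace is consistent.
    """
    reference = {}  # simple in-memory hash table standing in for the persistent one
    for index, (op, key, value) in enumerate(trace):
        if op == "put":
            reference[key] = value
        elif op == "get":
            # `value` is the result the production code reported for this lookup.
            if reference.get(key) != value:
                return index
        else:
            raise ValueError(f"unknown trace operation: {op!r}")
    return None


# Usage: the third record claims a stale value was returned.
suspect = [("put", "a", 1), ("put", "a", 2), ("get", "a", 1)]
first_bad = verify_trace(suspect)
```

Because the check needs nothing but the trace, it can run long after the production program has exited, and the index it returns points at the first place where the two implementations disagree.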
Conclusion
Adding snapshots, tracing, and playback to existing debugging environments would significantly reduce the
time required to find and correct stubborn bugs. Low-level code is seeing
some progress in this area; for some
platforms, gdb has recently been
given the ability to reverse execution.
Since CPU operations are not reversible, this means there are now ways
of capturing trace information for
compiled programs. If the ability to
save and reload snapshots were
added, gdb could become a traceable
debugger.
Detailed CPU state traces are extremely helpful in optimizing and debugging emulators, but the technique
can be applied to ordinary programs
as well. The method may be applied almost directly if a reference implementation is available for comparison. If
this is not the case, traces are still useful for debugging nonlocal problems.
The extra work of adding tracing facilities to your program will be rewarded
in reduced debugging time.
Related articles
on queue.acm.org
No Source Code? No Problem!
Peter Phillips and George Phillips
http://queue.acm.org/detail.cfm?id=945155
Debugging AJAX in Production
Eric Schrock
http://queue.acm.org/detail.cfm?id=1515745
Debugging in an Asynchronous World
Michael Donat
http://queue.acm.org/detail.cfm?id=945134
Peter Phillips received a bachelor’s degree in computer
science from the University of British Columbia. He has
15 years of experience in software development and has
spent the past 10 years of that working on console games.
© 2010 ACM 0001-0782/10/0500 $10.00