Is it getting any easier to understand
other people’s code?
By George V. Neville-Neil
It has been five years since I first wrote about code
spelunking,9 and though systems continue to grow in
size and scope, the tools we use to understand those
systems are not growing at the same rate. In fact, I
believe we are steadily losing ground. So why should
we go over the same ground again? Is this subject
important enough to warrant two articles in five years?
I believe it is.
The oft-quoted Moore’s Law about the increasing
power of computers actually works against the code
spelunker. The more powerful computers become,
the more we demand that they do, which increases
the complexity of the software that runs on them.
Processor speeds increase and that means more lines
of code can now be run in the same amount of time.
Available memory gets larger so we can now keep
more state or code in memory. Disks get larger and
require less power (in the case of flash), and suddenly
we’re able to carry around what were once considered
huge amounts of data in our pockets. What was termed
the “software crisis” in the 1970s has never really
abated, because each time software
engineers came up with a new way of
working that reduced complexity, the
industry moved forward and demanded more.
Complexity increases in many directions, including lines of code, numbers
of modules, and numbers of systems
and sub-systems. More complex systems require more lines of code to implement. As they grow their systems,
software teams often integrate more
code from outside resources, which
leads to complex interactions between
systems that may not have been designed with massive integration in mind.
These numbers should not be surprising to any software engineer, but
they are a cause for concern. Although
it was unlikely that the numbers would
shrink, all but one of them grew by
more than 50%, and although the number of lines may have grown linearly,
the interactions among the new components that these numbers represent
have not grown in a linear fashion. If
we assume that all modules in a system
can interact freely with all other modules, then we have a system in which
the potential number of interactions is
expressed as n(n - 1)/2, a formula that
should be familiar to those who work in
networking as it represents a fully connected network. If a system grows from
100 modules to 200 modules, a 100%
growth rate, then the number of potential connections grows from 4,950 to
19,900, a 302% growth rate.
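
The arithmetic above is easy to check. What follows is a minimal sketch, assuming the worst case described in the text, in which every module may interact with every other module. The function names `potential_connections` and `growth_percent` are illustrative and do not come from any tool discussed in this article.

```python
def potential_connections(n: int) -> int:
    """Return the number of distinct module pairs, n(n - 1)/2."""
    if n < 0:
        raise ValueError("module count must be non-negative")
    # One of n and n - 1 is always even, so integer division is exact.
    return n * (n - 1) // 2


def growth_percent(before: int, after: int) -> float:
    """Return the percentage increase from `before` to `after`."""
    return (after - before) / before * 100


if __name__ == "__main__":
    small = potential_connections(100)   # 4950
    large = potential_connections(200)   # 19900
    print(f"connections: {small} -> {large}")
    print(f"module growth: {growth_percent(100, 200):.0f}%")
    print(f"connection growth: {growth_percent(small, large):.0f}%")
```

Doubling the module count roughly quadruples the number of potential pairings, which is why a modest-looking increase in components can make a system disproportionately harder to explore.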
One reliable measure of the number
of interfaces into a system is the number of system calls provided to user programs by an operating system kernel.
Since the publication of my first article
on code spelunking the Linux kernel
has grown from just shy of 200 system
calls to 313, an increase of more than
50% (see Table 1).7
Two New Tools
My first article on code spelunking
covered several tools, including global,3 Cscope,1 gprof,4 ktrace,8 and truss.11
I continue to use these tools on a daily