practice

DOI: 10.1145/1400181.1400194

Is it getting any easier to understand
other people’s code?

By George V. Neville-Neil

Code Spelunking Redux
It has been five years since I first wrote about code spelunking,⁹ and though systems continue to grow in size and scope, the tools we use to understand those systems are not growing at the same rate. In fact, I believe we are steadily losing ground. So why should we go over the same ground again? Is this subject important enough to warrant two articles in five years? I believe it is.

The oft-quoted Moore’s Law about the increasing power of computers actually works against the code spelunker. The more powerful computers become, the more we demand that they do, which increases the complexity of the software that runs on them. Processor speeds increase, which means more lines of code can be run in the same amount of time. Available memory gets larger, so we can keep more state or code in memory. Disks get larger and require less power (in the case of flash), and suddenly we’re able to carry around what were once considered huge amounts of data in our pockets. What was termed the “software crisis” in the 1970s has never really abated, because each time software engineers came up with a new way of working that reduced complexity, the industry moved forward and demanded more.

Complexity increases in many directions, including lines of code, numbers of modules, and numbers of systems and sub-systems. More complex systems require more lines of code to implement. As they grow their systems, software teams often integrate more code from outside resources, which leads to complex interactions between systems that may not have been designed with massive integration in mind.

These numbers should not be surprising to any software engineer, but they are a cause for concern. Although it was unlikely that the numbers would shrink, all but one of them grew by more than 50%. And although the number of lines may have grown linearly, the interactions among the new components that these numbers represent have not. If we assume that all modules in a system can interact freely with all other modules, then the potential number of interactions is n(n − 1)/2, an expression that should be familiar to those who work in networking, as it gives the number of links in a fully connected network. If a system grows from 100 modules to 200 modules, a 100% growth rate, then the number of potential connections grows from 4,950 to 19,900, a 302% growth rate.
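The arithmetic in this paragraph can be checked with a short sketch. The function names below are hypothetical, chosen only for illustration, and the code assumes the same simplification as the text: every module may interact with every other module.

```python
def potential_interactions(modules: int) -> int:
    """Distinct pairs among `modules` fully connected modules: n(n - 1)/2."""
    return modules * (modules - 1) // 2


def growth_percent(before: int, after: int) -> float:
    """Percentage increase going from `before` to `after`."""
    return (after - before) / before * 100


# Doubling the module count (100 -> 200) is a 100% increase in modules...
before = potential_interactions(100)
after = potential_interactions(200)

# ...but roughly a 302% increase in potential pairwise interactions.
print(before, after, round(growth_percent(before, after)))  # 4950 19900 302
```

Because the pair count grows with the square of the module count, each further doubling roughly quadruples the number of potential interactions.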

One reliable measure of the number of interfaces into a system is the number of system calls provided to user programs by an operating system kernel. Since the publication of my first article on code spelunking, the Linux kernel has grown from just shy of 200 system calls to 313, an increase of more than 50% (see Table 1).⁷

Two New Tools

My first article on code spelunking covered several tools, including global,³ Cscope,¹ gprof,⁴ ktrace,⁸ and truss.¹¹ I continue to use these tools on a daily
