interactions with other processes, and determine changes in a system's configuration. By listing metrics and error messages, logs can reveal a sickly application (such as one with unusually high latency or memory use) or expose one that fails due to insufficient privileges. Such clues can help programmers pinpoint a specific application as a contributing factor in a more complex failure.

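To make this concrete, the following Python sketch scans a log for the two symptoms just mentioned: error responses and unusually high latency. The log format, the latency field, and the threshold are assumptions made for the example, not something any particular tool prescribes.

import re
import sys

# Hypothetical log line format, for example:
#   2018-11-02 10:15:03 GET /api/items 503 latency_ms=2417
# Adjust the patterns to whatever your application actually writes.
ERROR_RE = re.compile(r"\s5\d\d\s")           # HTTP 5xx status codes
LATENCY_RE = re.compile(r"latency_ms=(\d+)")  # per-request latency
THRESHOLD_MS = 1000                           # cutoff for "unusually high"

def scan(log_lines):
    """Print error entries and unusually slow requests."""
    for line in log_lines:
        if ERROR_RE.search(line):
            print("server error:", line.rstrip())
        m = LATENCY_RE.search(line)
        if m and int(m.group(1)) > THRESHOLD_MS:
            print("slow request:", line.rstrip())

if __name__ == "__main__":
    with open(sys.argv[1]) as log:
        scan(log)

Run against an application's log, such a filter quickly shows whether the suspect process is erroring out or slowing down, and therefore whether it deserves closer scrutiny.
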
Virtualization and system simulators. One family of technologies that can help debug software running on hardware that does not match a given development environment includes virtual machines, emulators, and system simulators. With virtual machines and operating system virtualization systems (such as Docker), software development teams can create a single environment that can be used for development, debugging, and production deployment. Such containers are also useful when a programmer wants to find and eliminate configuration-related errors, as sketched below. Moreover, development environments for some commonly used embedded platforms (such as smartphones) come with an emulator, allowing programmers to experience the capabilities of the target hardware from the comfort of a desktop. Finally, when a team is developing software and hardware together, a full system simulator (such as Simics17) will provide a high-fidelity view of the complete platform stack.

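To make the hunt for configuration-related errors concrete, here is a minimal sketch of one possible approach: dump each environment's configuration in a canonical, diff-friendly form and compare the outputs. The choice of what to dump is illustrative; a real investigation would add whatever settings the application depends on.

import os
import platform
import sys

def dump_config():
    """Print configuration facts, one per line, in a stable order."""
    print("python:", sys.version.split()[0])
    print("platform:", platform.platform())
    for name in sorted(os.environ):
        print("env %s=%s" % (name, os.environ[name]))

if __name__ == "__main__":
    dump_config()

Running the same script in the development container and in the production one, and then diffing the two outputs, surfaces environment differences that may explain a failure occurring in only one of them.
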
The number of possible faults in a software system can easily challenge the limits of human ingenuity. Debugging the corresponding failures thus requires an arsenal of tools, techniques, methods, and strategies. Here I have outlined some I find particularly effective, but there are many others I consider useful, as well as many specialized ones that may work wonders in a particular context.

Each debugging session represents a new venture into the unknown. Programmers should work systematically, starting with an approach that matches the failure's characteristics, but adapt it quickly as they uncover more about the failure's probable cause. They should not hesitate to switch from Web searching, to logging, to single-stepping, to constructing a unit test, or to using a specialized tool. No bug can elude a programmer who perseveres. And keep in mind that the joy of fixing a fault is proportional to the work the programmer puts into debugging the failure.

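When the switch lands on constructing a unit test, the test that reproduces the failure is worth keeping as a regression guard. Here is a minimal sketch using Python's unittest module; the parse_price function and its empty-input failure are hypothetical stand-ins for whatever routine a real debugging session implicates.

import unittest

def parse_price(text):
    """Convert a price string such as "4.20" into integer cents."""
    # Hypothetical routine standing in for the code under suspicion.
    if not text.strip():
        raise ValueError("empty price")
    euros, _, cents = text.partition(".")
    return int(euros) * 100 + int(cents or 0)

class ParsePriceTest(unittest.TestCase):
    def test_whole_euros(self):
        self.assertEqual(parse_price("4"), 400)

    def test_euros_and_cents(self):
        self.assertEqual(parse_price("4.20"), 420)

    def test_empty_input_is_reported(self):
        # Pins down the originally observed failure so it cannot recur.
        with self.assertRaises(ValueError):
            parse_price("")

if __name__ == "__main__":
    unittest.main()

Once the failing behavior is captured this way, the test keeps paying off: it documents the fault, verifies the fix, and guards against the bug's reappearance.
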
My thanks to Moritz Beller, Alexander
Lattas, Dimitris Mitropoulos, Tushar
Sharma, and the anonymous reviewers
for insightful comments on earlier versions of this article.
References
1. Ayewah, N., Hovemeyer, D., Morgenthaler, J. D., Penix,
J., and Pugh, W. Using static analysis to find bugs.
IEEE Software 25, 5 (Sept. 2008), 22–29.
2. Bailis, P., Alvaro, P., and Gulwani, S. Research for
practice: Tracing and debugging distributed systems;
programming by examples. Commun. ACM 60, 7 (July 2017).
3. Beller, M., Spruit, N., Spinellis, D., and Zaidman, A.
On the dichotomy of debugging behavior among
programmers. In Proceedings of the 40th International
Conference on Software Engineering (Gothenburg,
Sweden, May 27–June 3). ACM Press, New York, 2018.
4. Beschastnikh, I., Wang, P., Brun, Y., and Ernst, M.D.
Debugging distributed systems. Commun. ACM 59, 8
(Aug. 2016), 32–37.
5. Bessey, A., Block, K., Chelf, B., Chou, A., Fulton, B.,
Hallem, S., Henri-Gros, C., Kamsky, A., McPeak, S., and
Engler, D. A few billion lines of code later: Using static
analysis to find bugs in the real world. Commun. ACM
53, 2 (Feb. 2010), 66–75.
6. Böhme, M., Soremekun, E.O., Chattopadhyay, S.,
Ugherughe, E., and Zeller, A. Where is the bug and
how is it fixed? An experiment with practitioners. In
Proceedings of the 11th Joint Meeting on Foundations
of Software Engineering (Paderborn, Germany, Sept.
4–8). ACM Press, New York, 2017, 117–128.
7. Branco, R.R. Ltrace internals. In Proceedings of the
Linux Symposium, A.J. Hutton and C. C. Ross, Eds.
(Ottawa, ON, Canada, June 27–30, 2007), 41–52.
8. Cadar, C. and Sen, K. Symbolic execution for software
testing: Three decades later. Commun. ACM 56, 2
(Feb. 2013), 82–90.
9. Cantrill, B. and Bonwick, J. Real-world concurrency.
Commun. ACM 51, 11 (Nov. 2008), 34–39.
10. Duvall, P.M., Matyas, S., and Glover, A. Continuous
Integration: Improving Software Quality and Reducing
Risk. Pearson Education, Boston, MA, 2007.
11. Eigler, F. C. Problem solving with Systemtap. In
Proceedings of the Linux Symposium, A. J. Hutton
and C. C. Ross, Eds. (Ottawa, ON, Canada, July
19–22, 2006), 261–268; https://www.kernel.org/doc/
12. Engblom, J. A review of reverse debugging. In
Proceedings of the 2012 System, Software, SoC and
Silicon Debug Conference (Vienna, Austria, Sept.
19–20). Electronic Chips & Systems Design Initiative,
Gières, France, 2012, 28–33.
13. Graham, S.L., Kessler, P.B., and McKusick, M.K. An
execution profiler for modular programs. Software:
Practice & Experience 13, 8 (Aug. 1983), 671–685.
14. Gregg, B. and Mauro, J. DTrace: Dynamic Tracing in
Oracle Solaris, Mac OS X, and FreeBSD. Prentice Hall
Professional, Upper Saddle River, NJ, 2011.
15. Kernighan, B. W. Sometimes the old ways are best.
IEEE Software 25, 6 (Nov. 2008), 18–19.
16. LeBlanc, T.J. and Mellor-Crummey, J.M. Debugging
parallel programs with Instant Replay. IEEE
Transactions on Computers C-36, 4 (Apr. 1987), 471–482.
17. Magnusson, P. S., Christensson, M., Eskilson, J.,
Forsgren, D., Hallberg, G., Hogberg, J., Larsson, F.,
Moestedt, A., and Werner, B. Simics: A full system
simulation platform. Computer 35, 2 (Feb. 2002), 50–58.
18. Margosis, A. and Russinovich, M.E. Windows
Sysinternals Administrator’s Reference. Microsoft
Press, Redmond, WA, 2011.
19. Mernik, M., Heering, J., and Sloane, A.M. When and
how to develop domain-specific languages. ACM
Computing Surveys 37, 4 (Dec. 2005), 316–344.
20. Nasehi, S. M., Sillito, J., Maurer, F., and Burns, C.
What makes a good code example?: A study of
programming Q&A in StackOverflow. In Proceedings
of the 28th IEEE International Conference on Software
Maintenance (Riva del Garda, Trento, Italy, Sept.
23–30). IEEE Press, 2012, 25–34.
21. Nethercote, N. and Seward, J. Valgrind: A framework
for heavyweight dynamic binary instrumentation. In
Proceedings of the 28th ACM SIGPLAN Conference on
Programming Language Design and Implementation
(San Diego, CA, June 10–13). ACM Press, New York,
2007, 89–100.
22. Neumann, P.G. Computer Related Risks. Addison-Wesley, Reading, MA, 1995.
23. Nielson, F., Nielson, H.R., and Hankin, C. Principles of
Program Analysis. Springer, Berlin, Germany, 2015.
24. O’Dell, D. H. The debugging mind-set. Commun. ACM
60, 6 (June 2017), 40–45.
25. Orebaugh, A., Ramirez, G., and Beale, J. Wireshark &
Ethereal Network Protocol Analyzer Toolkit. Syngress,
Cambridge, MA, 2006.
26. Patil, H., Pereira, C., Stallcup, M., Lueck, G., and
Cownie, J. PinPlay: A framework for deterministic
replay and reproducible analysis of parallel programs.
In Proceedings of the Eighth Annual IEEE/ACM
International Symposium on Code Generation and
Optimization (Toronto, ON, Canada, Apr. 24–28). ACM
Press, New York, 2010, 2–11.
27. Perscheid, M., Siegmund, B., Taeumel, M., and
Hirschfeld, R. Studying the advancement in debugging
practice of professional software developers. Software
Quality Journal 25, 1 (Mar. 2017), 83–110.
28. Runeson, P. A survey of unit-testing practices. IEEE
Software 23, 4 (July 2006), 22–29.
29. Sack, P., Bliss, B.E., Ma, Z., Petersen, P., and Torrellas,
J. Accurate and efficient filtering for the Intel Thread
Checker race detector. In Proceedings of the First
Workshop on Architectural and System Support for
Improving Software Dependability (San Jose, CA,
Oct. 21–25). ACM Press, New York, 2006, 34–41.
30. Serebryany, K., Bruening, D., Potapenko, A., and
Vyukov, D. AddressSanitizer: A fast address sanity
checker. In Proceedings of the 2012 USENIX Annual
Technical Conference (Boston, MA, June 13–15).
USENIX Association, Berkeley, CA, 2012, 309–318.
31. Spinellis, D. Code Reading: The Open Source
Perspective. Addison-Wesley, Boston, MA, 2003.
32. Spinellis, D. Working with Unix tools. IEEE Software
22, 6 (Nov./Dec. 2005), 9–11.
33. Spinellis, D. Debuggers and logging frameworks. IEEE
Software 23, 3 (May/June 2006), 98–99.
34. Spinellis, D. Differential debugging. IEEE Software 30,
5 (Sept./Oct. 2013), 19–21.
35. Stahl, T. and Völter, M. Model-Driven Software
Development: Technology, Engineering, Management.
John Wiley & Sons, Inc., New York, 2006.
36. Tseitlin, A. The anti-fragile organization. Commun.
ACM 56, 8 (Aug. 2013), 40–44.
37. Wilkes, M. The Birth and Growth of the Digital
Computer. Lecture delivered at the Digital Computer
Museum, available through the Computer History
Museum, Catalog Number 102695269, Sept. 1979.
38. Zeller, A. Automated debugging: Are we close?
Computer 34, 11 (Nov. 2001), 26–31.
39. Zeller, A. Isolating cause-effect chains from computer
programs. In Proceedings of the 10th ACM SIGSOFT
Symposium on Foundations of Software Engineering
(Charleston, SC, Nov. 18–22). ACM Press, New York,
2002, 1–10.
40. Zeller, A. Why Programs Fail: A Guide to Systematic
Debugging, Second Edition. Morgan Kaufmann,
Burlington, MA, 2009.
41. Zeller, A. and Hildebrandt, R. Simplifying and isolating
failure-inducing input. IEEE Transactions on Software
Engineering 28, 2 (Feb. 2002), 183–200.
Diomidis Spinellis (email@example.com) is a professor in,
and head of, the Department of Management Science
and Technology at the Athens University of Economics
and Business, Athens, Greece, and author of Effective
Debugging: 66 Specific Ways to Debug Software and
Systems, Addison-Wesley, 2016.
© 2018 ACM 0001-0782/18/11 $15.00