high-quality set of positive test cases that encode program
requirements. In other work, we experimented with held-out indicative workloads, fuzz testing, and held-out exploits
to demonstrate that our server repairs address the causes of
problems without being fragile memorizations of the negative
input and without failing common requests (i.e., because of
the positive tests). Much remains to be done in this area, however, such as automatically documenting or proving properties
of the generated repairs.
Future work. Beyond these immediate steps, there are other
more ambitious areas for future work. For example, we plan
to develop a generic set of repair templates so the GP has an
additional source of new code to use in mutation, beyond those
statements that happen to be in the program. Our AST program
representation could be extended in various ways, for example,
by including data structure definitions and variable declarations. Similarly, we are currently experimenting with assem-bly- and bytecode-level repairs. Finally, we are interested in
testing the method on more sophisticated errors, such as race
conditions, and in learning more about bugs that need to be
repaired, such as their size and distribution, and how we might
identify which ones are good candidates for the GP technique.
7. concLusion
We credit much of the success of this technique to design
decisions that limit the search space. Restricting attention to
statements, focusing genetic operations along the weighted
path, reusing existing statements rather than inventing new
ones, and repairing existing programs rather than creating
new ones, all help to make automatic repair of errors using
GP tractable in practice.
The dream of automatic programming has eluded computer scientists for at least 50 years. The methods described
in this paper do not fulfill that dream by evolving new programs from scratch. However, they do show how to evolve
legacy software in a limited setting, providing at least a small
down payment on the dream. We believe that our success
in evolving automatic repairs may say as much about the
state of today’s software as it says about the efficacy of our
method. In modern environments, it is exceedingly difficult
to understand an entire software package, test it adequately,
or localize the source of an error. In this context, it should
not be surprising that human programming often has a
large trial and error component, and that many bugs can be
repaired by copying code from another location and pasting
it in to another. Such debugging is not so different from the
approach we have described in this paper.
1. al-ekram, r., adma, a., baysal, o.
diffX: an algorithm to detect changes
in multi-version Xml documents.
in Conference of the Centre for
Advanced Studies on Collaborative
Research. ibm Press,
2005, 1–11.
2. arcuri, a., yao, X. a novel
co-evolutionary approach to
automatic software bug fixing. in
IEEE Congress on Evolutionary
Computation (2008), 162–168.
3. ball, t., rajamani, s.k. automatically
validating temporal safety properties
of interfaces. in SPIN Workshop on
Model Checking of Software (may
2001), 103–122.
4. dallmeier, v., Zeller, a., meyer,
b. generating fixes from object
behavior anomalies. in International
Conference on Automated Software
Engineering (2009).
5. demsky, b., ernst, m.d., guo, P. J.,
mccamant, s., Perkins, J.h., rinard,
m. inference and enforcement of data
structure consistency specifications.
in International Symposium on
Software Testing and Analysis (2006),
233–244.
6. ernst, m.d., Perkins, J.h., guo, P.J.,
mccamant, s., Pacheco, c., tschantz,
m.s., Xiao, c. the daikon system for
dynamic detection of likely invariants.
Sci. Comput. Program (2007).
7. forrest, s., hofmeyr, s.a., somayaji,
a., longstaff, t.a. a sense of self for
unix processes. in IEEE Symposium
on Security and Privacy (1996),
120–128.
8. forrest, s., Weimer, W., nguyen, t.,
le goues, c. a genetic programming
approach to automated software
repair. in Genetic and Evolutionary
Computing Conference (2009),
947–954.
9. harman, m. the current state and
future of search based software
engineering. in International
Conference on Software Engineering
(2007), 342–357.
10. hovemeyer, d., Pugh, W. finding
bugs is easy. in Object-Oriented
Programming Systems, Languages,
and Applications Companion (2004),
132–136.
11. Jones, J.a., harrold, m.J. empirical
evaluation of the tarantula automatic
fault-localization technique. in
International Conference on
Automated Software Engineering
(2005), 273–282.
12. koza, J.r. Genetic Programming:
on the Programming of Computers
by Means of Natural Selection. mit
Press, 1992.
13. liblit, b., aiken, a., Zheng, a.X.,
Jordan, m.i. bug isolation via remote
program sampling. in Programming
Language Design and
Implementation (2003),
acknowledgments
We thank David E. Evans, Mark Harman, John C. Knight,
Anh Nguyen-Tuong, and Martin Rinard for insightful discussions. This work is directly based on the seminal ideas of
John Holland and John Koza.
This research was supported in part by National Science
Foundation Grants CCF 0621900, CCR-0331580, CNS
0627523, and CNS 0716478, Air Force Office of Scientific
Research grant FA9550-07-1-0532, as well as gifts from Microsoft
Research. No official endorsement should be inferred. The
authors thank Cris Moore for help finding the Zune code.
Westley Weimer ( weimer@virginia.edu),
university of virginia.
Claire Le Goues ( legoues@virginia.edu),
university of virginia.
Stephanie Forrest ( forrest@cs.unm.edu),
university of new mexico.
ThanhVu nguyen ( tnguyen@cs.unm.edu),
university of new mexico.