deploying a code base, CI provides the
prerequisites for repair tools that use
test suites as correctness specifications.
Repair can become an activity in CI systems that suggests patches in response
to regression test failures, such as for
Alex, our hypothetical programmer.
Are we there yet? Existing techniques for automated repair of correctness bugs are typically evaluated for
effectiveness using bugs taken from
open source projects. Because many
techniques require input tests to trigger the bug under repair and to evaluate the technique, such programs and
bugs must be associated with one or
more failing test cases. These bugs
are typically collected systematically
by going back in time through code
histories to identify bug-fixing commits and the regression tests associated with them. Open source projects
whose bugs have been studied in this
way include popular Java projects, such as
various Apache libraries,
Log4J, and the Rhino JavaScript interpreter, as well as popular C projects,
such as the PHP and Python interpreters, the Wireshark network protocol analyzer, and the libtiff library.
Recently, the Repairnator project [33]
has presented a bot that monitors
for software errors and automatically finds fixes using repair tools. Another recent work from Facebook [15]
describes experiences in integrating
repair as part of continuous integration: a repair tool monitors test failures, reproduces them, and automatically looks for patches. Once patches
are found, they are presented to the
developers for validation. Currently,
the effort focuses on automatically repairing crashes in Android apps; however, the project plan is to extend the
work to general-purpose repair.
Repairing security vulnerabilities.
Many security vulnerabilities are exploitable
memory errors or programming errors,
and hence a relevant target
for automated repair. Key software,
including popular libraries processing
file formats or operating system
utilities, is regularly and rigorously
checked for vulnerabilities in response
to frequent updates using grey-box
fuzz testing tools, such as American
Fuzzy Lop (AFL) [b]. Microsoft recently
[b] http://lcamtuf.coredump.cx/afl/
Next, she must speculate about strategies
to possibly fix the problem. For some
of these strategies, the developer will
evaluate a potential patch by applying
it and checking whether the associated
test cases then pass; if not, she might
use the failing test cases to conduct additional
debugging activities. Finally, the
developer must select a patch and apply
it to the code base. The difficulty of all
these tasks is compounded by the fact
that complex software projects tend to
contain legacy code, code written by other
members of an organization, or even
code written by third parties.
The promise of automated program
repair is in reducing the burden of
these tasks by suggesting likely correct
patches for software bugs. At a high
level, such techniques take as input a
program and some specification of the
correctness criteria that the fixed program should meet. Most research techniques use test suites for this purpose:
one or more failing tests indicate a bug
to be fixed, while passing tests indicate
behavior that should not change. The
end goal is a set of program changes
(typically to source code) that leads all
tests to pass, fixing the bug without
breaking other behavior.
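The test-driven setup described above can be sketched as a small generate-and-validate loop. This is only an illustrative sketch, not the method of any particular repair tool: the buggy function, the two hand-written candidate patches, and the helper names (`passes_all`, `repair`) are all hypothetical, and a real tool would derive candidates by mutating or synthesizing code rather than listing them by hand.

```python
from typing import Callable, List, Tuple

Program = Callable[[int], int]
Test = Tuple[int, int]  # (input, expected output)


def buggy_abs(x: int) -> int:
    # Hypothetical bug: negative inputs are returned unchanged.
    return x


# Hypothetical candidate patches, listed by hand for illustration.
def candidate_negate(x: int) -> int:
    return -x


def candidate_conditional(x: int) -> int:
    return -x if x < 0 else x


def passes_all(program: Program, tests: List[Test]) -> bool:
    """True if the program produces the expected output on every test."""
    return all(program(arg) == expected for arg, expected in tests)


def repair(candidates: List[Program], tests: List[Test]) -> List[Program]:
    """Keep the candidates that make the whole test suite pass."""
    return [c for c in candidates if passes_all(c, tests)]


# The test suite acts as the correctness specification.
TESTS: List[Test] = [(3, 3), (0, 0), (-4, 4)]

# Failing tests indicate the bug; the remaining tests pin down
# behavior that a patch must not change.
failing = [t for t in TESTS if buggy_abs(t[0]) != t[1]]

plausible = repair([candidate_negate, candidate_conditional], TESTS)
print("failing tests:", failing)
print("plausible patches:", [c.__name__ for c in plausible])
```

Here `candidate_negate` fixes the failing test but breaks the previously passing test on input 3, so only `candidate_conditional` survives validation.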
The grand challenge in today’s research on automated program repair
is the problem of weak specifications.
Since detailed formal specifications of
intended program behavior are typically unavailable, program repair is driven
by weak correctness criteria, such as
a test suite. As a result, the generated
patches may overfit the given test suite
and may not generalize to tests outside
the test suite [29].
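A toy example may make this concrete. The sketch below is hypothetical and is not taken from the article: `buggy_max`, both patches, and both test lists are invented for illustration. It shows a patch that passes every test given to the repair process yet fails on tests held out from it.

```python
def buggy_max(a, b):
    # Hypothetical bug: the second argument is ignored.
    return a


def overfitted_patch(a, b):
    # Special-cases the one failing input instead of fixing the logic.
    if (a, b) == (1, 2):
        return 2
    return a


def general_patch(a, b):
    # Fixes the underlying logic.
    return a if a >= b else b


def passes_all(program, tests):
    """True if the program produces the expected output on every test."""
    return all(program(*args) == expected for args, expected in tests)


# Tests available to the repair process (the weak specification).
REPAIR_TESTS = [((5, 3), 5), ((1, 2), 2)]
# Tests outside the given suite, used only to judge generalization.
HELD_OUT_TESTS = [((0, 7), 7), ((4, 4), 4)]

for patch in (overfitted_patch, general_patch):
    print(
        patch.__name__,
        "passes repair tests:", passes_all(patch, REPAIR_TESTS),
        "| passes held-out tests:", passes_all(patch, HELD_OUT_TESTS),
    )
```

Both patches are indistinguishable when judged by the repair tests alone; only the held-out tests reveal that the first one merely memorizes the failing input.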
In the rest of this article, we discuss
some of the technical developments in
automated program repair, including
an illustration of the overfitting problem. We start by sketching some of the
use cases of automated program repair.
Use Cases
This section discusses four practical
use cases of automated repair, and reports initial experience based on current repair techniques.
Fixing bugs throughout development. Existing continuous integration
(CI) pipelines, such as Jenkins, are an
important stepping stone for integrating repair into the development process. By regularly building, testing, and