the primary repair. Let Test(C) = 1 if the program obtained
by applying the changes in C to the original program passes
all positive and negative test cases; let Test(C) = 0 otherwise.
A one-minimal subset C Í Cp is a set such that Test(C) = 1 and
"ci Î C. Test(C\{ci}) = 0. That is, a one-minimal subset produces a program that passes all test cases, but dropping any
additional elements causes the program to fail at least one
test case. Checking if a set is valid involves a fitness evaluation (a call to Test).
We use delta debugging25 to efficiently compute a one-minimal subset of changes from the primary repair. Delta
debugging is conceptually similar to binary search, but it
returns a set instead of a single number. Intuitively, starting with {c1, …, cn}, it might check {c1, …, cn/2}: if that half
of the changes is sufficient to pass the Test, then {c1+n/2, …,
cn} can be discarded. When no more subsets of size n/2
can be removed, subsets of size n/4 are considered for
removal, until eventually subsets of size 1 (i.e., individual
changes) are tested. Finding the overall minimal valid set
by brute force potentially involves O(2n) evaluations; delta
debugging finds a one-minimal subset in O(n2). 25, Proposition 12
However, we typically observe a linear number of tests in
our experiments. This smaller set of changes is presented
to the developers as the final repair in the form of a standard program patch. In our experiments, the final repair is
typically at least an order-of-magnitude smaller than the
primary repair.
3. iLLustRation
On 31 December 2008, a bug was widely reported in Microsoft Zune media players, causing them to freeze up.b The
fault was a bug in the following program fragment:c
1 void zunebug(int days) {
2 int year = 1980;
3 while (days > 365) {
4 if (isLeapYear (year)){
5 if (days > 366) {
6 days −= 366;
7 year += 1;
8 }
9 else {
10 }
11 }
12 else {
13 days −= 365;
14 year += 1;
15 }
16 }
17 printf(“the year is %d\n”, year);
18 }
b See Microsoft Zune affected by “Bug,” BBC News, December 2008, http://
news.bbc.co.uk/2/hi/technology/7806683.stm.
c Downloaded from http://pastie.org/349916 (Jan. 2009). Note that the original program source code does not make lines 9–10 explicit: our AST represents missing blocks, such as those in if statements without else clauses,
as blocks containing zero statements.
When the value of the input days is the last day of a leap
year (such as 10,593, which corresponds to 31 December
2008), the program enters an infinite loop on lines 3–16.
We now walk through the evolution of a repair for this
program. We first produce its AST and determine the
weighted path, using line numbers to indicate statement
IDs. The positive test case zunebug ( 1,000) visits lines 1–8,
11–18. The negative test case zunebug ( 10,593) visits lines
1–16, and then repeats lines 3, 4, 8, and 11 infinitely.
For the purposes of this example, the negative test cases
consist of the inputs 366 and 10,593, which cause an infinite
loop (instead of the correct values, 1980 and 2008), and the
positive test cases are the inputs 1,000, 2,000, 3,000, 4,000,
and 5,000, which produce the correct outputs 1982, 1985,
1988, 1990, and 1993.
We consider one variant, V, which is initialized to be identical to the original program. In Generation 1, two operations
mutate V: the conditional statement “if (days > 366)
{days −= 366; year += 1;}” is inserted between lines
6 and 7 of the original program; and the statement
“days −= 366” is inserted between lines 10 and 11. Note
that the first insertion includes not just the if but its entire
subtree. This produces the following code fragment:
5 if (days > 366) {
6 days −= 366;
7 if (days > 366) {
8 days −= 366;
9 year += 1;
10 }
11 year += 1;
12 }
13 else {
14 }
15 days −= 366;
// insert #1
// insert #1
// insert #1
// insert #1
// insert #2
This modified program passes the negative test case 366
(year 1980) and one positive test case 1000.
Variant V survives Generations 2, 3, 4, 5 unchanged, but
in Generation 6, it is mutated with the following operations: lines 6–10 are deleted, and “days −= 366” is inserted
between lines 13 and 14:
5 if (days > 366) {
6 // days −= 366;
7 // if (days > 366) {
8 // days −= 366;
9 // year += 1;
10 // }
11 year += 1;
12 }
13 else {
14 days −= 366;
15 }
16 days −= 366;
// delete
// delete
// delete
// delete
// delete
// insert