continues to be in agreement, they can continue executing
concurrently with the error isolation process. Once the runtime patch generation process has completed, it signals the
running replicas to reload their runtime patches. Thus, subsequent allocations in the same process will be patched on
the fly without interrupting execution.
3. 4. 3. Cumulative Mode
While the replicated mode can isolate and correct errors on
the fly in deployed applications, it may not be practical in all
situations. For example, replicating applications with high
resource requirements may cause unacceptable overhead.
In addition, multithreaded or nondeterministic applications
can exhibit different allocation activity and so cause object
ids to diverge across replicas. To support these applications,
Exterminator uses its third mode of operation, cumulative
mode, which isolates errors without replication or multiple
identical executions.
When operating in cumulative mode, Exterminator reasons about objects grouped by allocation and deallocation
sites instead of individual objects, since objects are no longer
guaranteed to be identical across different executions.
Because objects from a given site only occasionally cause
errors, often at low frequencies, Exterminator requires more
executions than in replicated or iterative mode in order to
identify these low-frequency errors without a high false-positive rate. Instead of storing heap images from multiple runs,
Exterminator computes relevant statistics about each run
and stores them in its patch file. The retained data are on the
order of a few kilobytes per execution, compared to tens or
hundreds of megabytes for each heap image.
4. iteRative anD RePLicateD eRRoR isoLation
Exterminator employs two different families of error isolation algorithms: one set for replicated and iterative modes,
and another for cumulative mode.
When operating in its replicated or iterative modes,
Exterminator’s probabilistic error isolation algorithm operates by searching for discrepancies across multiple heap
images. Exterminator relies on corrupted canaries stored
in freed objects to indicate the presence of an error. A corrupted canary (one that has been overwritten) can mean two
things. If the same object (identified by object id) across all
heap images has the same corruption, then the error is likely
to be a dangling pointer. If canaries are corrupted in multiple
freed objects, then the error is likely to be a buffer overflow.
Exterminator limits the number of false positives for both
overflows and dangling pointer errors.
4. 1. Buffer overflow detection
Exterminator examines heap images looking for discrepancies
across the heaps, both in overwritten canaries and in live objects.
If an object is not equivalent across the heaps, Exterminator
considers it to be a candidate victim of an overflow.
To identify victim objects, Exterminator compares the
contents of equivalent objects, as identified by their object
id across all heaps. Exterminator builds an overflow mask
that comprises the discrepancies found across all heaps.
However, because the same logical object may legitimately
differ across multiple heaps, Exterminator must take care
not to consider these occurrences as overflows.
First, a freed object may differ across heaps because it was
filled with canaries only in some of the heaps. Exterminator
uses the canary bitmap to identify this case.
Second, an object can contain pointers to other objects,
which are randomly located on their respective heaps.
Exterminator uses both deterministic and probabilistic
techniques to distinguish integers from pointers. Briefly, if
a value interpreted as a pointer points inside the heap area
and points to the same logical object across all heaps, then
Exterminator considers it to be the same logical pointer, and
thus not a discrepancy. Exterminator also handles the case
where pointers point into dynamic libraries, which newer
versions of Linux place at pseudorandom base addresses.
Finally, an object can contain values that legitimately differ from process to process. Examples of these values include
process ids, file handles, pseudorandom numbers, and pointers in data structures that depend on addresses (e.g., some
red-black tree implementations). When Exterminator examines an object and encounters any word that differs at the
same position across all the heaps, it considers it to be legitimately different, and not an indication of buffer overflow.
For small overflows, the risk of missing an overflow by
ignoring overwrites of the same objects across multiple
heaps is low:
Theorem 1. Let k be the number of heap images, S the length
(in number of objects) of the overflow string, and h the number
of objects on the heap. Then the probability of an overflow overwriting an object on all k heaps is
P(identical overflow) v h × (S/h)k.
Proof. This result holds for a stronger adversary than
usual—rather than assuming a single contiguous overflow,
we allow an attacker to arbitrarily overwrite any S distinct
objects. Consider a given object a. On each heap, S objects
are corrupted at random. The probability that object i is cor-
rupted on a single heap is (S/h). Call E the event that object
i
i is corrupted across all heaps; the probability P(E ) is (S/h)k.
i
The probability that at least one object is corrupted across all
the heaps is P(∪ E ), which by a straightforward union bound
ii
is at most Σ P(E ) = h × (S/h)k. □
ii
We now bound the worst-case false-negative rate for buffer overflows; that is, the odds of not finding a buffer overflow
because it failed to overwrite any canaries.
Theorem 2. Let M be the heap multiplier, so a heap is never
more than 1/M full. The likelihood that an overflow of length b
bytes fails to be detected by comparison against a canary is at
most:
− −k ≤ M
1 +
1 P(missed overflow) 1.
2 M 256b
Proof. Each heap is at least (M − 1)/M free. Since DieFast fills
free space with canaries with P = 1/2, the fraction of each heap
filled with canaries is at least (M − 1)/2M. The likelihood of a