Exterminator: Automatically
Correcting Memory Errors with
High Probability

by Gene novark, Emery d. berger, and benjamin G. Zorn

abstract

Programs written in C and C++ are susceptible to memory errors, including buffer overflows and dangling pointers. These errors, which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously costly to repair. Tracking down their location in the source code is difficult, even when the full memory state of the program is available. Once the errors are finally found, fixing them remains challenging: even for critical security-sensitive bugs, the average time between initial reports and the issuance of a patch is nearly 1 month.

We present Exterminator, a system that automatically corrects heap-based memory errors without programmer intervention. Exterminator exploits randomization to pinpoint errors with high precision. From this information, Exterminator derives runtime patches that fix these errors both in current and subsequent executions. In addition, Exterminator enables collaborative bug correction by merging patches generated by multiple users. We present analytical and empirical results that demonstrate Exterminator’s effectiveness at detecting and correcting both injected and real faults.

1. intRoDuction

The use of manual memory management and unchecked memory accesses in C and C++ leaves applications written in these languages susceptible to a range of memory errors. These include buffer overruns, where reads or writes go beyond allocated regions, and dangling pointers, when a program deallocates memory while it is still live. Memory errors can cause programs to crash or produce incorrect results. Worse, attackers are frequently able to exploit these memory errors to gain unauthorized access to systems.

Debugging memory errors is notoriously difficult. Reproducing the error requires an input that exposes it. Since inputs are often unavailable from deployed programs, developers must either concoct such an input or find the problem via code inspection. Once a test input is available, software developers typically execute the application with heap debugging tools like Purify7 and Valgrind, 10 which may slow execution by an order of magnitude. When the bug is ultimately discovered, developers must construct and carefully test a patch to ensure that it fixes the bug without introducing any new ones. This process can be costly and time-consuming. For example, according to Symantec, the average time between the discovery of a critical, remotely

exploitable memory error and the release of a patch for enterprise applications is 28 days. 17

Because memory errors are so difficult to find and fix, researchers have proposed many solutions that fall roughly into two categories: detection, which prevents errors from being exploited and potentially allows them to be debugged more easily; and toleration, where the effects of errors are mitigated. Fail-stop systems are compiler-based approaches that may require access to source code, and abort programs when they perform illegal operations like buffer overflows. 1, 2, 6, 9

Fault-tolerant runtime systems, which attempt to hide the effect of errors, have also been proposed. Rinard’s failure-oblivious systems are also compiler-based, but manufacture read values and drop or cache illegal writes for later reuse. 13, 14 The Rx system12 uses logging and replay, with potential perturbation, to provide fault tolerance. Our previous work, DieHard, 3, 4 uses heap over-provisioning, layout randomization, and optional voting-based replication to reduce the likelihood that an error will have any effect (see Section 3. 1 for an overview). DieHard provides probabilistic memory safety, giving the application the illusion of having an infinite heap with a well-defined probability.

contributions: This paper presents exterminator, a runtime system that not only tolerates but also detects, isolates, and corrects two classes of heap-based memory errors with high probability. Exterminator requires neither source code nor programmer intervention, and fixes existing errors without introducing new ones. To our knowledge, this system is the first of its kind.

Exterminator relies on an efficient probabilistic debugging allocator that we call DieFast. DieFast is based on DieHard’s allocator, which ensures that heaps are independently randomized. However, while DieHard can only probabilistically tolerate errors, DieFast probabilistically detects them.

When Exterminator discovers an error, it dumps a heap image that contains the complete state of the heap. Exterminator’s probabilistic error isolation algorithm processes one or more heap images to try to locate the source of the error. This error isolation algorithm has provably low false-positive and false-negative rates. Buffer overflows and dangling pointer errors can be distinguished because they tend to produce distinct patterns of heap corruption.

A previous version of this article appeared in Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, NY, pp. 1–11.

References:

Archives