Technical Perspective
A Compiler’s story
By Greg Morrisett
Doi: 10.1145/1538788.1538813
in tHe early
1970s, pioneers like
Floyd, Dijkstra, and Hoare argued that
programs should be formally specified
and proven correct. But for the past
40 years, most of the computer science community has discounted this
vision as an unrealistic goal. Perhaps
the most important reason has been
simple economics: Throughout the
1980s and 1990s, the industry tended
to be more interested in factors such
as time-to-market than issues involving correctness.
But the context has changed dramatically over the intervening years:
Security and reliability have become
key concerns in the fastest growing
sectors like embedded systems where
lives may be at stake. Even in non-crit-ical domains, developers must worry
about bugs, including buffer overruns
and race conditions that can lead to
security exploits. Researchers have
developed a variety of tools, including
strong static type checkers, software
model checkers, and abstract interpreters, all of which can (and are) used
to help enforce a range of safety properties. Consequently, formal methods
are in wide use today, albeit disguised
within tools.
However, these tools typically operate at the source-level (or, at best,
a VM bytecode level), and not at machine code level. To scale, they must
make assumptions about how a compiler will refine the source code to machine code. For example, even though
the C language specification does not
specify an order of evaluation for function arguments, most analysis tools
assume the compiler will use a left-to-right strategy, since analyzing all
possible evaluation strategies would
take too much time. This mismatch of
assumptions, or a bug in the compiler
itself, can easily render the analysis
tool useless.
In the following paper, Xavier Leroy addresses these problems by describing a compiler he built and verified using the Coq proof development
system. Although he is not the first
to build a verified translator, Leroy’s
compiler is notable and exciting for
three key reasons: First, it maps a useful source language (a significant subset of C) to PowerPC assembly, making
it directly usable for a range of embedded applications. Second, his compiler includes a number of analyses and
optimizations, such as liveness analysis and graph-coloring register allocation, that makes the resulting code
competitive with gcc -0. Third, the
proof of correctness is mechanically
checked, yielding the highest level of
assurance we can hope to achieve. In
short, developers can be assured that,
in spite of optimization, the output of
Leroy’s compiler will behave the same
as the source code input.
There are a number of hidden contributions to this work, beyond the
construction of a fully verified, optimizing compiler: The compiler was
built in a modular, pipelined fashion
as a series of translations between
well-specified intermediate languages, making it possible to add new optimizations or reuse components for
other projects. For example, the specification for the C-subset can be reused
to build new verified tools, such as a
source-level analysis.
The compiler also demonstrates
judicious use of translation validation
in lieu of full verification. With translation validation, we can use unverified components to compute something (for example, an assignment of
variables to registers) and need only
check the output is valid (no interfering variables are assigned to the same
register). Only the checker must be
verified to ensure soundness, and this
is often much easier than validating
a full analysis and transformation.
Translation validation is one engineering technique that can drastically
reduce the burden of building verified
systems.
This paper also shows how far we
must go before verification becomes
routinely feasible for production compilers or any other setting. Foremost is
the cost of constructing and maintaining the proof as the code evolves. Leroy’s proof of correctness is about five
to six times as big as the compiler itself, making it difficult to significantly
change the code without breaking the
proof. However, the proof was constructed largely by hand, and for the
most part, does not take advantage of
semi-automated decision procedures
or proof search, a research area that
has seen tremendous progress over
the past decade. Indeed, work by other researchers following Leroy’s lead
suggests we can potentially cut the
size of the proofs by up to an order of
magnitude.
Perhaps the biggest challenge we
face is specification. Compilers have
a fairly clean notion of “correctness”
(the output code should behave the
same as the input code), but most
software systems do not. For example, what does it mean for an operating system or Web browser to be correct? At best we can hope to formalize
some safety and security properties
that these systems should obey, and
be willing to adapt these properties
as our understanding of failures and
attacks improves. In turn, this demands a verification architecture that
allows specifications to be modified
and adapted almost as frequently as
the code. Fortunately, verified compilers make it possible to do this sort of
adaptation using high-level languages
without sacrificing assurance for the
generated machine code.
Consequently, I think we are on
the verge of a new engineering paradigm for safety- and security-critical
software systems, where we rely upon
formal, machine-checked verification
for certification, instead of human audits. Leroy’s compiler is an impressive
step toward this goal.
Greg Morrisett is the Allen B. Cutting Professor of
Computer Science and associate dean for Computer
Science and Engineering at harvard University.
© 2009 ACM 0001-0782/09/0700 $10.00