researchhighlights
in-place. Likewise, a monadic programming style enables us
to encode exceptions and state in a legible, compositional
manner.
The main advantage of this unconventional approach,
compared with implementing the compiler in a conventional imperative language, is that we do not need a program
logic (such as Hoare logic) to connect the compiler’s code
with its logical specifications. The Coq functions implementing the compiler are first-class citizens of Coq’s logic
and can be reasoned on directly by induction, simplifications, and equational reasoning.
To obtain an executable compiler, we rely on Coq’s
extraction facility, 15 which automatically generates Caml
code from Coq functional specifications. Combining the
extracted code with hand-written Caml implementations
of the unverified parts of the compiler (such as the parser),
and running all this through the Caml compiler, we obtain a
compiler that has a standard, cc-style command-line interface, runs on any platform supported by Caml, and generates PowerPC code that runs under MacOS X. (Other target
platforms are being worked on.)
3. 5. Performance
To assess the quality of the code generated by CompCert, we
benchmarked it against the GCC 4.0.1 compiler at optimization levels 0, 1, and 2. Since standard benchmark suites use
features of C not supported by CompCert, we had to roll our
own small suite, which contains some computational kernels, cryptographic primitives, text compressors, a virtual
machine interpreter and a ray tracer. The tests were run on a
2 GHz PowerPC 970 “G5” processor.
As the timings in Figure 2 show, CompCert generates
code that is more than twice as fast as that generated by
GCC without optimizations, and competitive with GCC at
optimization levels 1 and 2. On average, CompCert code is
only 7% slower than gcc −01 and 12% slower than gcc −02.
The test suite is too small to draw definitive conclusions, but
figure 2: Relative execution times of compiled code.
gcc -00 CompCert
1
these results strongly suggest that while CompCert is not
going to win a prize in high performance computing, its performance is adequate for critical embedded code.
Compilation times of CompCert are within a factor of
2 of those of gcc −01, which is reasonable and shows that
the overheads introduced to facilitate verification (many
small passes, no imperative data structures, etc.) are
acceptable.
4. ReGisTeR ALLoCATion
To provide a more detailed example of a verified compilation pass, we now present the register allocation pass of
CompCert and outline its correctness proof.
4. 1. The RTL intermediate language
Register allocation is performed over the RTL intermediate representation, which represents functions as a CFG of
abstract instructions, corresponding roughly to machine
instructions but operating over pseudo-registers (also
called “temporaries”). Every function has an unlimited
supply of pseudo-registers, and their values are preserved
across function call. In the following, r ranges over pseudo-registers and l over labels of CFG nodes.
Instructions:
i ::= nop (l ) no operation (go to l )
→
| op(op, r , r, l ) arithmetic operation
→
| load (k, mode, r , r, l) memory load
→
| store(k, mode, r , r, l) memory store
→
| call(sig, (r | id), r , r, l) function call
→
| tailcall(sig,(r | id), r ) function tail call
→
| cond(cond, r , l , l ) conditional branch
true false
| return | return(r) function return
Control-flow graphs:
g ::= l → i
finite map
gcc -01
gcc -02
0
Almabench
Arith m etic co ding
Binary trees
Fannkuch
AES cipher
K-nucleotide
Lem pel-Ziv
Lempel-Ziv- Welch
Mandelbrot
N-body
FFT
Quicksort
SHA1 hash
Virtual m achine
Num ber sieve
Ray tracer
Spectral test