and spaceouts) on only 3.5% of all
checks. On WDF drivers, 98% of defects reported by SDV are true bugs,
and non-results are reported on only
0.04% of all checks. During the development cycle of Windows 7, SDV 2.0
was applied as a quality gate to drivers
written by Microsoft and sample drivers shipped with the WDK. SDV was
applied later in the cycle after all other
tools, yet found 270 real bugs in 140
WDM and WDF drivers. All bugs found
by SDV in Microsoft drivers were fixed
by Microsoft. We do not have reliable
data on bugs found by SDV in third-party device drivers.
Here, we give performance statistics from a recent run of SDV on 100
drivers and 80 SLIC rules. The largest
driver in the set is about 30,000 lines
of code, and the total size of all drivers
is 450,000 lines of code. The total runtime for the 8,000 runs (each driver-rule combination is a run) is about 30
hours on an eight-core machine. We
kill a run if it exceeds 20 minutes, and
SDV yields useful results (either a bug
or a pass) on over 97% of the runs. We
thus find SDV checks drivers with acceptable performance, yielding useful
results on a large fraction of the runs.
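The time-budgeted batch described above can be sketched as a small harness. The function below is illustrative only (not SDV's actual infrastructure); the command and exit-code convention are assumptions for the example:

```python
import subprocess
import sys

# Illustrative sketch, not SDV's actual harness: run one driver/rule
# check as a child process, kill it if it exceeds a time budget (SDV
# uses 20 minutes, i.e. 1200 seconds), and classify the outcome.
def run_check(cmd, budget_seconds=1200):
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=budget_seconds)
    except subprocess.TimeoutExpired:
        return "non-result"   # timed out: neither a bug nor a pass
    return "pass" if proc.returncode == 0 else "bug"

# Tiny demonstration with stand-in commands in place of a verifier:
print(run_check([sys.executable, "-c", "pass"], budget_seconds=5))
print(run_check([sys.executable, "-c", "raise SystemExit(1)"], budget_seconds=5))
```

Running each (driver, rule) pair through such a gate is what makes the 3% of killed runs show up as non-results rather than hanging the batch.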
Limitations. SLAM and SDV have several notable limitations. Even with CEGAR, SLAM is unable to handle very large programs (with hundreds of thousands of lines of code). However, we found that SDV gives useful results for control-dominated properties on programs with tens of thousands of lines of code.
Though SLAM handles pointers in a
sound manner, in practice, it is unable to prove properties that depend
on establishing invariants of heap
data structures. SLAM handles only
sequential programs, though others have extended SLAM to deal with
bounded context switches in concurrent programs.40 Our experience with
SDV shows that in spite of these limitations, SLAM is very successful in the
domain of device-driver verification.
Related Work
SLAM builds on decades of research in
formal methods. Model checking15,16,41
has been used extensively to algorithmically check temporal logic properties of models. Early applications of
model checking were in hardware38
and protocol design.32 In compiler and
programming languages, abstract interpretation21 provides a broad and generic framework to compute fixpoints
using abstract lattices. The particular
abstraction used by SLAM was called
“predicate abstraction” by Graf and
Saïdi.29 Our contribution was to show
how to perform predicate abstraction
on C programs with such language
features as pointers and procedure
calls in a modular manner.2,3 The
predicate-abstraction algorithm uses
an automated theorem prover. Our initial implementation of SLAM used the
Simplify theorem prover.23 Our current
implementation uses the Z3 theorem
prover.22
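As a concrete illustration, consider abstracting the C statement x = x + 1 over the two predicates x > 0 and x == 0. SLAM discharges the required validity queries with a theorem prover; the sketch below instead approximates them by exhaustive checking over a small integer range, which is enough for this toy example:

```python
# Toy predicate abstraction of the statement "x = x + 1" over the
# predicates p0: x > 0 and p1: x == 0.  SLAM answers the underlying
# validity queries with a theorem prover (Simplify, later Z3); here we
# approximate by checking every x in a small range.
PREDS = [lambda x: x > 0, lambda x: x == 0]

def abstract_transition():
    """Map each Boolean valuation of PREDS to the set of valuations
    possible after executing x = x + 1."""
    table = {}
    for x in range(-50, 50):
        before = tuple(p(x) for p in PREDS)
        after = tuple(p(x + 1) for p in PREDS)
        table.setdefault(before, set()).add(after)
    return table

t = abstract_transition()
print(t[(True, False)])   # from x > 0: always (True, False)
print(t[(False, True)])   # from x == 0: always (True, False)
print(t[(False, False)])  # from x < 0: two possible successors
```

The valuation where both predicates are false (x < 0) has two possible successors, since x + 1 may or may not equal zero; that is precisely the nondeterminism that appears in the Boolean program SLAM generates.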
The Bandera project explored the
idea of user-guided finite-state abstractions for Java programs20 based on
predicate abstraction and manual abstraction but without automatic refinement of abstractions. It also explored
the use of program slicing for reducing
the state space of models. SLAM was
influenced by techniques used in Bandera to check typestate properties on
all objects of a given type.
SLAM’s Boolean program model
checker (Bebop) computes fixpoints
on the state space of the generated
Boolean program that can include recursive procedures. Bebop uses the
Context Free Language Reachability algorithm,42,43 implementing it symbolically using Binary Decision Diagrams.11
Bebop was the first symbolic model
checker for pushdown systems. Since
then, other symbolic checkers have
been built for similar purposes,25,36 and
Boolean programs generated by SLAM
have been used to study and improve
their performance.
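To illustrate the procedure summaries behind Bebop, the sketch below computes the entry/exit relation of a recursive procedure in a made-up one-bit Boolean program, using explicit sets of state pairs where Bebop would use BDDs:

```python
# Summary computation for a Boolean program with one global bit g and
# one recursive procedure (an invented example, not Bebop's input):
#
#     proc p() { if (*) { g := !g; p(); } }    // * = nondeterministic choice
#
# A summary relates the state at entry to the state at exit; it is
# computed once as a fixpoint and reused at every call site.
def compose(r, s):
    """Relational composition: a -> c if a -> b in r and b -> c in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def summary_of_p():
    flip = {(g, not g) for g in (False, True)}   # effect of g := !g
    summ = set()                                  # least fixpoint from empty
    while True:
        # p either returns immediately (identity) or flips g, then calls itself
        new = {(g, g) for g in (False, True)} | compose(flip, summ)
        if new == summ:
            return summ
        summ = new

print(sorted(summary_of_p()))  # all four pairs: p may preserve or flip g
```

The loop converges in a few iterations because the relations over Boolean states form a finite lattice; the same fixpoint structure, computed over BDDs, lets Bebop handle recursion without unfolding the call stack.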
SLAM and its practical application to checking device drivers have been enthusiastically received by the research community, and several related projects have been started by research groups in universities and industry. At Microsoft, the ESP and Vault projects were started in the same group as SLAM, exploring different ways of checking API usage rules.37 The Blast project31 at the University of California, Berkeley, proposed a technique called “lazy abstraction” to optimize constructing and maintaining the abstractions across the iterations of the CEGAR loop. McMillan39 proposed “interpolants” as a more systematic and general way to perform refinement; Henzinger et al.30 found that predicates generated from interpolants have nice local properties, which were then used to implement local abstractions in Blast.
Conclusion
The past decade has seen a resurgence
of interest in the automated analysis of
software for the dual purpose of defect
detection and program verification, as
well as advances in program analysis,
model checking, and automated theorem proving. A unique SLAM contribution is the complete automation of
CEGAR for software written in expres-