other entry points and explore the pro-
gram graph, perhaps visiting proce-
dure bodies multiple times. This can
be expensive. While accurate analysis
of 10k LOC can be a leading research
achievement, 10k is tiny compared to
software found in the wild. A single
company can have tens of millions of
lines of code. Progress toward big code
called for a radical departure.
Bi-Abduction and Facebook Infer
In 2008 Calcagno asked: What is the main
obstacle blocking application of SpaceInvader and similar tools to programs in the
millions of LOC? O’Hearn answered: The
need for the human to supply preconditions. He proposed that a “truly modular” analysis based on local reasoning
could accept a program component with
no human annotations, and generate
a pre/post spec where the precondition
approximates the footprint. The analysis
would then “stitch” these specifications
together to obtain results for larger program parts. The analysis would be compositional, in that a spec for a procedure
could be obtained without knowing its
callers, and the hypothesis was that it
would scale because procedures could be
visited independently. This implied giving up on whole-program analysis.
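To make the shape of such an analysis concrete, here is a minimal sketch in OCaml (the language Infer itself is written in). Everything in it is illustrative rather than Infer's actual code: the spec type, the stubbed infer_spec, and the toy call graph are all hypothetical. The point is only the control structure, in which each procedure body is visited once and its spec is computed from the specs of its callees, never from its callers.

```ocaml
(* Illustrative only: spec contents are opaque strings and infer_spec is a
   stub standing in for symbolic execution with bi-abduction. *)
type spec = { pre : string; post : string }

(* Hypothetical helper: a real analyzer would analyze the procedure body,
   consulting the table of already-computed callee specs at call sites. *)
let infer_spec (proc : string) (callee_specs : (string * spec) list) : spec =
  ignore callee_specs;
  { pre = "footprint of " ^ proc; post = "result of " ^ proc }

(* Procedures paired with the procedures they call, listed with callees
   first, so a single pass visits each body exactly once. *)
let program = [ ("push", []); ("pop", []); ("client", [ "push"; "pop" ]) ]

(* Summarize each procedure using only the specs of its callees; callers
   are never consulted, which is what makes the analysis compositional. *)
let summaries =
  List.fold_left
    (fun table (proc, callees) ->
       let callee_specs =
         List.filter (fun (name, _) -> List.mem name callees) table
       in
       (proc, infer_spec proc callee_specs) :: table)
    [] program

let () =
  List.iter
    (fun (proc, s) -> Printf.printf "%s: {%s} ... {%s}\n" proc s.pre s.post)
    (List.rev summaries)
```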
Calcagno, O’Hearn, Distefano and
Yang set to work on realizing a truly
modular analysis. Yang developed a
scheme based on gleaning information
from failed proofs to discover a footprint. Distefano made a breakthrough
on the stitching issue for the modular
analysis that involved a new inference
problem:
Bi-abduction: given A and B, find
?frame and ?anti-frame such that

A ∗ ?anti-frame ⊢ B ∗ ?frame

where ⊢ is read 'entails' or 'implies.'
The inference of ?frame (the leftover
part in A but not B) was present in
Smallfoot, and is used in many tools.
The ?anti-frame part (the missing bit
needed to establish B) is abduction,
or inference of hypotheses, an infer-
ence problem identified by the philos-
opher Charles Peirce in his conceptu-
al analysis of the scientific method. As
a simple example,

x ↦ y ∗ ?anti-frame ⊢ y ↦ – ∗ ?frame

can be solved with

?anti-frame = y ↦ –  and  ?frame = x ↦ y.
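For the restricted fragment where an assertion is just a separating conjunction of points-to facts with variable addresses, bi-abduction can be sketched very directly. The following OCaml toy is only an illustration (it is not the algorithm of the bi-abduction paper, which must handle aliasing, inductive list predicates, and more): it computes an anti-frame and a frame by simple cell matching, and checks the example just shown.

```ocaml
(* A toy assertion language: a symbolic heap is a separating conjunction of
   points-to facts  x |-> v, where the address is a program variable and the
   value is either a variable or the "don't care" wildcard written - above. *)
type value = Var of string | Any
type pointsto = { addr : string; value : value }
type heap = pointsto list

(* One cell supplies another if the addresses agree and the required value is
   either a wildcard or the same variable. *)
let cell_matches ~have ~want =
  have.addr = want.addr && (want.value = Any || want.value = have.value)

(* Bi-abduction for this fragment, for  A * ?anti-frame |- B * ?frame :
   the anti-frame is what B needs but A lacks; the frame is what A has
   but B does not mention. *)
let biabduce (a : heap) (b : heap) : heap * heap =
  let anti_frame =
    List.filter (fun want -> not (List.exists (fun have -> cell_matches ~have ~want) a)) b
  and frame =
    List.filter (fun have -> not (List.exists (fun want -> cell_matches ~have ~want) b)) a
  in
  (anti_frame, frame)

(* The example from the text:  x |-> y * ?anti-frame |- y |-> - * ?frame
   is solved with  ?anti-frame = y |-> -  and  ?frame = x |-> y. *)
let () =
  let anti_frame, frame =
    biabduce [ { addr = "x"; value = Var "y" } ] [ { addr = "y"; value = Any } ]
  in
  assert (anti_frame = [ { addr = "y"; value = Any } ]);
  assert (frame = [ { addr = "x"; value = Var "y" } ])
```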
With bi-abduction we can automate
the local reasoning idea by abducing
assertions that describe preconditions,
and using frame inference to keep speci-
fications small. Let us illustrate with the
program we started the paper with. We
begin symbolic execution with nothing
in the precondition, and we ask a bi-
abduction question, using the current state emp as the A part of the bi-abduction query and the pre of the small axiom for [x] = y as B:

emp ∗ ?anti-frame ⊢ x ↦ – ∗ ?frame

This is solved with ?anti-frame = x ↦ – and ?frame = emp.
Now, we move the abduced anti-frame
to the overall precondition, we take
one step of symbolic execution using
the small axiom for Pointer Write from
Figure 2, we install the post of the small
axiom as the pre of the next instruction,
and we continue, asking the next bi-abduction query with the new state x ↦ y as A:

x ↦ y ∗ ?anti-frame ⊢ y ↦ – ∗ ?frame
The formula y ↦ – in the bi-abduction query is the precondition of the small axiom for the pointer write [y] = x: we abduce it as the anti-frame, and add it to the overall precondition. The frame rule tells us that the inferred frame x ↦ y is unaltered by [y] = x when it is separately conjoined with y ↦ –, and this with the small axiom gives us our overall postcondition in the inferred spec

{x ↦ – ∗ y ↦ –} [x] = y; [y] = x {x ↦ y ∗ y ↦ x}
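The derivation above can also be phrased as a few lines of code. The following OCaml sketch (the same toy fragment as the earlier sketch, and again only an illustration, not Infer's analysis) symbolically executes a straight-line sequence of pointer writes, abducing each missing cell into the overall precondition and framing the untouched cells around each small axiom; running it on [x] = y; [y] = x reproduces the spec just derived.

```ocaml
(* Same toy fragment as the previous sketch. *)
type value = Var of string | Any
type pointsto = { addr : string; value : value }
type heap = pointsto list

type instr = Store of string * string   (* Store (x, y) models  [x] = y *)

let cell_matches ~have ~want =
  have.addr = want.addr && (want.value = Any || want.value = have.value)

let biabduce (a : heap) (b : heap) : heap * heap =
  ( List.filter (fun want -> not (List.exists (fun have -> cell_matches ~have ~want) a)) b,
    List.filter (fun have -> not (List.exists (fun want -> cell_matches ~have ~want) b)) a )

(* One step: the small axiom for [x] = y is  {x |-> -} [x] = y {x |-> y}.
   Abduce whatever part of  x |-> -  the current state is missing, add it to
   the overall precondition, and carry the frame unchanged across the write. *)
let step (pre, state) (Store (x, y)) =
  let anti_frame, frame = biabduce state [ { addr = x; value = Any } ] in
  (pre @ anti_frame, { addr = x; value = Var y } :: frame)

(* Run  [x] = y; [y] = x  starting from the empty precondition emp. *)
let () =
  let pre, post =
    List.fold_left step ([], []) [ Store ("x", "y"); Store ("y", "x") ]
  in
  (* Inferred spec: {x |-> - * y |-> -} [x] = y; [y] = x {x |-> y * y |-> x} *)
  assert (pre = [ { addr = "x"; value = Any }; { addr = "y"; value = Any } ]);
  assert (post = [ { addr = "y"; value = Var "x" }; { addr = "x"; value = Var "y" } ])
```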
So, starting from specifications for
primitive statements, we can infer both
a precondition and a postcondition for
a compound statement by repeated ap-
plications of bi-abduction and the frame
rule. This facility leads to a high degree
of automation. Also, note that the pre-
condition here is more general than the
one at the start of the paper, because it
does not mention 0. Bi-abductive analy-
sis not infrequently finds more general specifications than a top-down analysis that dives into procedures at call sites; finding general specs is important for both scalability and precision.

The main bi-abduction paper12 contributed proof techniques and algorithms for abduction, and a novel compositional algorithm for generating pre/post specs of program components. Experimental results scaled to hundreds of thousands of lines, and a part of Linux of 3M lines. This form of analysis finds preconditions supporting safety proofs of clusters of procedures as well as indicating potential bugs where proofs failed.

This work led to the program proof startup Monoidics, founded by Calcagno, Distefano and O'Hearn in 2009. Monoidics developed and marketed the Infer tool, based on the abductive technique. Monoidics was acquired by Facebook in 2013, at which point Calcagno, Distefano, and O'Hearn moved to Facebook with the Monoidics engineering team (www.fbinfer.com).

The compositional nature of Infer turned out to be a remarkable fit for Facebook's software development process.11 A codebase with millions of lines is altered thousands of times per day in "code diffs" submitted by the programmers. Instead of doing a whole-program analysis for each diff, Infer analyzes changes (the diffs) compositionally, and reports regressions as a bot participating in the internal code review process. Using bi-abduction, the frame rule picks off (an approximation of) just enough state to analyze a diff, instead of considering the entire global program state. The way that compositional analysis supports incremental diff analysis is even more important than the ability to scale; a linear-time analysis operating on the whole program would usually be too slow for this deployment model. Indeed, Infer has evolved from a standalone SL-based analyzer to a general framework for compositional analyses (http://fbinfer.com/docs/checkers.html and appendix; https://bit.ly/2CQD9CU).

Conclusion
Some time during 2001, while sitting together in his back garden, Reynolds turned to me and exclaimed: "The