rigorous upper and lower bounds,”
Hales explains.
Similar issues of trust greeted the
first computerized proof of the theorem that argued only four colors are
needed to distinguish between adjacent regions on a 2D map. Forty years
ago, working at the University of Illinois at Urbana-Champaign, Kenneth
Appel and Wolfgang Haken developed
computer programs to demonstrate
there were no counterexamples to the
theorem. Still, the programs were tedious to check by hand.
In 2005, Georges Gonthier of Microsoft Research and Benjamin Werner of
French research institute Inria used a
computerized proof assistant, Coq,
to develop a proof that did not rely on
counterexamples generated by programs. Moving to a position where the
Coq kernel could be trusted involved
less manual inspection.
“Most proof systems are built on
a very small trusted base. In terms of
what is carrying out inferences, that
code base is very small. Then on top
of that, you write very complicated
software that parses the code, but ul-
timately, it calls this very small thing
that actually checks the proof,” Avi-
gad explains. “People have also built
mechanisms where you can have a
complex proof system that checks the
proof and then have another checker
check the proof independently. The
probability of there being a mistake
in one and the other having the same
mistake is very low.”
Avigad adds: “Mathematicians are
great pragmatists. People will do what
they need to do and if they see some-
thing is useful, they will apply it. But
right now the feeling is that there is a
very steep learning curve. You spend
some time learning logic and formal
methods, then spend some time learn-
ing to prove what are really quite trivial
formal theorems.”
A key stumbling block is the level of
detail required by the computer to pre-
pare a formal proof. Despite the appar-
ent rigor of mathematical notation and
language, “there is a lot that’s left im-
plicit,” Avigad says. “In a proof, there
are many small inferences. Specifying
each one explicitly is not something we
typically do.”
Automated proof assistants need
far more detail from the mathemati-
cian, says Tobias Nipkow, a member
of the theorem-proving group at the
Technical University of Munich’s de-
partment of information technology.
“We are still at the stage where in or-
der to formalize anything non-trivial,
you frequently have to include a signif-
icant amount of foundational materi-
al that a mathematician would simply
take for granted.”
One approach is to build online
archives of formalized mathematical
knowledge that could be accessed by
proof tools. An example of such a re-
pository is the Archive of Formal Proofs
( http://afp.sourceforge.net/), which uses
Sourceforge to hold a collection of proof
libraries and examples written for the
Isabelle prover software, one of the
tools used for the FlySpeck project.
“Equally important are more powerful automatic proof procedures. This
would enable mathematicians to take
bigger steps in their proofs and reduce the tedium of computer-assisted
proofs,” Nipkow adds.
Carette says a key problem in de-
veloping more usable proof assistants
with the help of online repositories is
the breadth of techniques needed to
produce the software. “The technical
challenges involved in these two enter-
prises are quite different. They tend to
be done by different people.”
A further issue is the tendency for
branches of mathematics to form si-
los that use quite different techniques
to approach the problem of building
proofs. Large-scale proofs such as Fly-
Speck rely on extensive computation,
use. Tools can take some time to adopt:
tools such as (the algebra systems) Ma-
ple and Mathematica took a solid 20
years before they became pervasive.
“Personally, once I got over the hur-
dle of learning how these things work,
it now speeds me up. I can be bold in
my conjectures and the computer will
tell me that I’m wrong. The computer
is good at spotting problems in the
small details: things that humans are
typically really bad at.”
Jeremy Avigad, a professor in the
philosophy and mathematical sci-
ences departments at Carnegie Mellon
University, says of formal proof tech-
nology: “I believe that it will become
commonplace. It’s a natural progres-
sion. We care that our math is precise
and correct and we now have a technol-
ogy that helps in that regard. But the
technology is not yet ready for prime
time. There is a gap: it’s usability.”
For mathematicians working on
problems that seemed insurmount-
able if tackled purely by hand, the us-
ability gap was less of an issue than
trust in the results by others. The
length and complexity of a proof of
the 1611 conjecture by Johannes
Kepler on the most efficient method
for packing spheres that was developed
by Thomas Hales while working at the
University of Michigan in the late 1990s
led to a reviewing process that took
four years to complete. Even after that
length of time, reviewers claimed they
could only be 99% certain of the proof’s
correctness. To fully demonstrate the
proof’s correctness, Hales started a col-
laborative project called FlySpeck built
on the use of automated proof-checking
software. The group completed its work
early in 2015 with the publication of a
paper that described the process used.
In building the original proof of
the Kepler conjecture, Hales and his
colleagues had to develop software
that performed computation in a
way that lent itself to building a reliable proof. A large part of the proof
lies in a lengthy series of inequalities
that needed to be demonstrated using computation. Rather than rely on
conventional floating-point arithmetic and the imprecision that introduces, the group had to develop software
to perform interval arithmetic and
automated differentiation using Taylor expansions “so we were able to get
“Once I got over the
hurdle of learning
how these things
work, it now speeds
me up. I can be bold
in my conjectures
and the computer
will tell me that
I’m wrong.”