shows an example of this for a simple procedure; much
of this chaining is automated in CHL. The crash condition for a procedure is the disjunction (i.e., “or”) of the
crash conditions of all components of that procedure,
as illustrated by the red arrows in Figure 7. Finally, proving the correctness of a procedure together with its recovery function requires proving that the procedure’s crash
condition implies the recovery precondition, and that
recovery itself is idempotent.
4. PROTOTYPE IMPLEMENTATION
The implementation follows the organization shown in
Figure 1. FSCQ and CHL are implemented using Coq, which
provides a single programming language for implementations, specifications, and proofs. Figure 8 breaks down the
source code of FSCQ and CHL. Because Coq provides a single language, proofs are interleaved with source code and
are difficult to separate. The development effort took several
researchers about a year and a half; most of it was spent on
proofs and specifications. Checking the proofs takes 11 h on
an Intel i7-3667U 2.00 GHz CPU with 8 GB DRAM. The proofs
are complete; we used Coq’s Print Assumptions command to verify that FSCQ did not introduce any unproven
axioms or assumptions.
CHL. CHL is implemented as a domain-specific language inside of Coq, much like a macro language (or, in
the more technical language of proof assistants, we use a
shallow embedding). We specified the semantics of this
language and proved that it is sound. For example, we
proved the standard Hoare-logic specifications for the for
and if combinators. We also proved the specifications of
disk_read, disk_write (whose spec is in Figure 3),
and disk_sync manually, starting from CHL’s execution
and crash model. Much of the automation (e.g., the chaining of pre- and postconditions) is implemented using
Ltac, Coq’s domain-specific language for proof search.
FSCQ. We implemented FSCQ also inside of Coq, writ-
ing the specifications using CHL. We proved that the imple-
mentation obeys the specifications, starting from the basic
operations in CHL. FSCQ’s write-ahead log simplified speci-
fication and implementation tremendously, because much
or neither was. The special ret variable indicates whether
the system reached a completed or a recovered state
and in this case enables callers of atomic_two_write
to conclude that, if atomic_two_write completed with-
out crashes, it updated both blocks (i.e., updating none
of the blocks is allowed only if the system crashed and
recovered).
Note that distinguishing the completed and recovered
states allows the specification to state stronger properties
for completion than recovery. Also note that the recovery-execution version of atomic_two_write does not have
a crash condition: if the computer crashes, it will run
log_recover again, and the specification describes what
happens when the computer eventually stops crashing and
log_recover can run to completion.
In this example, the recovery procedure is just log_
recover, but the recovery procedure of a system built on
top of the transaction system may be composed of several
recovery procedures. For example, recovery in a file system
consists of first reading the superblock from disk to locate
the log and then running log_recover.
3. 5. Proving
In order to prove the correctness of a procedure, CHL
follows the standard Hoare-logic approach of decomposing the procedure into smaller units (e.g., control-flow
constructs or lower-level functions with already-proven
specifications) and chaining their pre- and postconditions according to the procedure’s control flow. Figure 7
log_recover
PRE
POST
RECOVER
if bnum >= NDIRECT:
indirect = log_read(inode.blocks[NDIRECT])
return indirect[bnum - NDIRECT]
else:
return inode.blocks[bnum]
if
log_read
return
return
Figure 7. Example control flow of a CHL procedure that looks up the address of a block in an inode, with support for indirect blocks. (The
actual code in FSCQ checks for some additional error cases.) Gray boxes represent the specifications of procedures. The dark red box
represents the recovery procedure. Green and pink boxes represent preconditions and crash conditions, respectively. Blue boxes represent
postconditions. Dashed arrows represent logical implication.
SPEC atomic_two_write(a1, v1, a2, v2) ; log_recover()
PRE disk: log_rep(No Txn, start_state)
start_state: a1 ; vx ; a2 ; vy ; others
POST disk: log_rep(No Txn, new_state) ∨
(ret = RECOVERED ∧ log_rep(No Txn, start_state))
new_state: a1 ; v1 ; a2 ; v2 ; others
Figure 6. Specification for atomic_two_write with recovery. The
operator indicates the combination of a regular procedure and a
recovery procedure.