5.2. Bug discussion
To understand whether FSCQ eliminates real problems
that arise in current file systems, we consider broad categories of bugs that have been found in real-world file systems [24, 38] and discuss whether FSCQ's theorems eliminate them:
1. Violating file or directory invariants, such as all link
counts adding up [36] or the absence of directory cycles.
2. Improper handling of unexpected corner cases, such
as running out of blocks during rename.
3. Mistakes in logging and recovery logic, such as not
issuing disk writes and syncs in the right order.
4. Misusing the logging API, such as freeing an indirect
block and clearing the pointer to it in different transactions.
5. Low-level programming errors, such as integer overflows [21] or double frees.
6. Resource allocation bugs, such as losing disk blocks [37]
or returning ENOSPC when there is available space.
7. Returning incorrect error codes.
8. Bugs due to concurrent execution of system calls, such
as races between two threads allocating blocks.
Some categories of bugs (#1–5) are eliminated by FSCQ’s
theorems and proofs. For example, FSCQ’s representation
invariant for the entire file system enforces a tree shape for
it, and the specification guarantees that it will remain a tree
(without cycles) after every system call.
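To make the tree-shape guarantee concrete, the following is a minimal sketch in Haskell, for illustration only (FSCQ's actual representation invariant is stated in Coq), of a directory abstraction in which cycles cannot even be expressed:

    import qualified Data.Map as Map

    type Name = String

    -- Purely illustrative directory abstraction: a directory maps names to
    -- subtrees rather than to arbitrary inode numbers, so every value of
    -- this type is acyclic by construction.
    data FsTree
      = TreeFile String                 -- file contents
      | TreeDir (Map.Map Name FsTree)   -- children, each itself a tree

When each system call's postcondition describes the new state as another value of such a tree, a post-state containing a directory cycle simply has no representation.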
With regard to resource allocation (#6), FSCQ guarantees that resources are never lost, but our prototype's specification does not require that the system be out of resources
in order to return an out-of-resource error. Strengthening
the specification (and proving it) would eliminate this
class of bugs.
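As a purely illustrative sketch (in Haskell, with hypothetical names; FSCQ's real specifications are Coq predicates), such a strengthened postcondition for block allocation might be phrased as:

    import qualified Data.Set as Set

    type BlockNum = Int
    data AllocResult = Allocated BlockNum | OutOfSpace

    -- Sketch of the stronger postcondition: an out-of-space result is
    -- allowed only when the free set is truly empty, and any returned
    -- block must have come from the free set.
    allocPost :: Set.Set BlockNum -> AllocResult -> Bool
    allocPost free OutOfSpace    = Set.null free
    allocPost free (Allocated b) = b `Set.member` free

Proving that the allocator satisfies a postcondition of this shape would rule out spurious out-of-space errors.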
Incorrect error codes (#7) can be a problem for our
FSCQ prototype in cases where we did not specify what
exact code (e.g., EINVAL or ENOTDIR) should be returned.
Extending the specification to include specific error codes
could avoid these bugs, at the cost of more complex specifications and proofs. On the other hand, FSCQ can never
have a bug where an operation fails without returning an error code.
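For illustration, pinning down error codes amounts to making the failing code a function of the file-system state and the call's arguments. A Haskell sketch of such a reference lookup (hypothetical names, not FSCQ's Coq definitions):

    data Errno = ENOENT | ENOTDIR | EINVAL
      deriving (Eq, Show)

    data Node = File | Dir [(String, Node)]

    -- Reference lookup whose error code is fully determined by the tree
    -- and the path, so an implementation returning any other code would
    -- violate the specification.
    lookupSpec :: Node -> [String] -> Either Errno Node
    lookupSpec node []      = Right node
    lookupSpec File (_ : _) = Left ENOTDIR   -- path descends into a file
    lookupSpec (Dir entries) (name : rest) =
      case lookup name entries of
        Nothing    -> Left ENOENT            -- missing path component
        Just child -> lookupSpec child rest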
We measure ext4 in three additional configurations: the
journal_async_commit mode, which uses checksums
to commit in one disk sync instead of two (“ext4-journal-
async” in our experiments); the data=ordered mode,
which is incompatible with journal_async_commit
(“ext4-ordered”); and the default data=ordered and
async mode, which does not sync to disk on every system call.
We ran all of these experiments on a quad-core Intel
i7-3667U 2.00 GHz CPU with 8 GB DRAM running Linux
3.19. The file system was stored on a separate partition
on an Intel SSDSCMMW180A3L flash SSD. Running
the experiments on an SSD ensures that potential file-system CPU bottlenecks are not masked by a slow rotational disk. We compiled FSCQ’s Haskell code using
GHC 7.10.2.
Results. The results of running our experiments are
shown in Figure 9. The first conclusion is that FSCQ’s performance is close to that of the xv6 file system. The small gap
between FSCQ and xv6 arises because FSCQ's Haskell
implementation uses about four times more CPU time than
xv6's. This overhead could be reduced by generating C or assembly code
instead of Haskell. Second, FUSE imposes little overhead,
judging by the difference between ext4 and ext4-fuse. Third,
both FSCQ and xv6 lag behind ext4. This is because our
benchmarks are bottlenecked by syncs to the SSD, and because
ext4 has a more efficient logging design that defers
applying the log contents until the log fills up, rather than
applying them at each commit. As a result, ext4 can commit a transaction with
two disk syncs, compared to four disk syncs for FSCQ and
xv6. For example, mailbench requires 10 transactions per
message, and the SSD can perform a sync in about 2.8 ms.
This matches the observed performance of ext4 (64 ms per
message) and of xv6 and FSCQ (103 and 118 ms per message, respectively).
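A back-of-the-envelope check of these numbers, using only the figures quoted above (a Haskell sketch; the helper names are ours, not part of the benchmark):

    -- 10 transactions per message, ~2.8 ms per SSD sync (from the text)
    syncMs, txnsPerMsg :: Double
    syncMs     = 2.8
    txnsPerMsg = 10

    -- Predicted delivery time per message, given syncs per commit.
    predictedMsgMs :: Double -> Double
    predictedMsgMs syncsPerTxn = txnsPerMsg * syncsPerTxn * syncMs

    -- predictedMsgMs 2 == 56  ms  (ext4: two syncs per commit; observed ~64 ms)
    -- predictedMsgMs 4 == 112 ms  (xv6/FSCQ: four syncs; observed 103 and 118 ms)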
Finally, there is room for even further optimizations:
ext4’s journal_async_commit commits with one disk
sync instead of two, achieving almost twice the throughput in some cases; data=ordered avoids writing file data
twice, achieving almost twice the throughput in other cases;
and asynchronous mode achieves much higher throughput
by avoiding disk syncs altogether (at the cost of not persisting data right away).
Figure 9. Running time for each phase of the application benchmark suite (git clone, make xv6, make lfs, largefile, smallfile, and cleanup) and for delivering 200 messages with mailbench.