Future applications will need to be much more aware of failures, and much more careful in responding to them. The high-performance computing community accurately describes what needs to be done:
“We already mentioned the lack of coordination between software layers with regards to errors and fault management. Currently, when a software layer or component detects a fault it does not inform the other parts of the software running on the system in a consistent manner. As a consequence, fault-handling actions taken by this software component are hidden to the rest of the system. …In an ideal wor[l]d, if a software component detects a potential error, then the information should propagate to other components that may be affected by the error or that control resources that may be responsible for the error.”7
In particular, as regards storage, APIs should copy Amazon's S3 by providing optional data-integrity capabilities that allow applications to perform end-to-end checks. These APIs should be enhanced to allow the application to provide an optional nonce that is prepended to the object data before the message digest reported to the application is computed. This would allow applications to exclude the possibility that the reported digest has been remembered rather than recomputed.
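The nonce-based check can be sketched as follows. This is a minimal illustration, not the S3 API: the `digest_with_nonce` helper is hypothetical, standing in for the proposed enhancement, and SHA-256 is chosen arbitrarily (S3 itself reports MD5-based ETags). Because the nonce is fresh for every request, a service that merely remembered a digest of the bare data could not produce a matching answer; it must re-read the data.

```python
import hashlib
import os

def digest_with_nonce(nonce: bytes, data: bytes) -> str:
    """Digest computed over the nonce prepended to the object data."""
    h = hashlib.sha256()
    h.update(nonce)   # fresh client-supplied nonce first...
    h.update(data)    # ...then the stored object's contents
    return h.hexdigest()

# The application picks a fresh random nonce for each integrity check.
nonce = os.urandom(16)
data = b"example object contents"

# What the enhanced storage API would be asked to report back
# (the service must recompute this over nonce + data).
reported = digest_with_nonce(nonce, data)

# The application recomputes locally and compares end to end.
assert reported == digest_with_nonce(nonce, data)
```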
Acknowledgments
Grateful thanks are due to Eric Allman, Kirk McKusick, Jim Gettys, Tom Lipkis, Mark Compton, Petros Maniatis, Mary Baker, Fran Berman, Tsutomu Shimomura, and the late Jim Gray. Some of this material was originally presented at iPRES 2008 and subsequently published in the International Journal of Digital Curation.32
This work was funded by the member libraries of the LOCKSS Alliance,
and the Library of Congress’ National
Digital Information Infrastructure
and Preservation Program. Errors and
omissions are the author’s own.
Related articles
Triple-Parity RAID and Beyond
Hard Disk Drives: The Good, the Bad and the Ugly!
You Don't Know Jack about Disks
References
1. Adams, D. The Hitchhiker's Guide to the Galaxy. British Broadcasting Corp., 1978.
2. Amazon. Amazon S3 API Reference (Mar. 2006);
3. Andersen, D.G., Franklin, J., Kaminsky, M., Phanishayee, A., Tan, L., Vasudevan, V. FAWN: A fast array of wimpy nodes. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (2009), 1–14.
4. Anderson, D. Hard drive directions (Sept. 2009);
5. Bairavasundaram, L., Goodson, G., Schroeder, B., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H. An analysis of data corruption in the storage stack. In Proceedings of the 6th Usenix Conference on File and Storage Technologies (2008).
6. Baker, M., Shah, M., Rosenthal, D.S.H., Roussopoulos, M., Maniatis, P., Giuli, T.J., Bungale, P. A fresh look at the reliability of long-term digital storage. In Proceedings of EuroSys 2006 (Apr. 2006).
7. Cappello, F., Geist, A., Gropp, B., Kale, S., Kramer, B., Snir, M. Toward exascale resilience. Technical Report TR-JLPC-09-01, INRIA-Illinois Joint Laboratory on Petascale Computing (July 2009).
8. CERN. Worldwide LHC Computing Grid, 2008; http://
9. Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th Usenix Symposium on Operating System Design and Implementation (2006), 205–218.
10. Christensen, C.M. The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail. Harvard Business School Press, Cambridge, MA (June 1997).
11. Corbett, P., English, B., Goel, A., Grcanac, T., Kleiman, S., Leong, J., Sankar, S. Row-diagonal parity for double disk failure correction. In Proceedings of the 3rd Usenix Conference on File and Storage Technologies (Mar. 2004).
12. Elerath, J. Hard-disk drives: The good, the bad, and the ugly. Commun. ACM 52, 6 (June 2009).
13. Elerath, J.G., Pecht, M. Enhanced reliability modeling of RAID storage systems. In Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (2007), 175–184.
14. Engler, D. A system's hackers crash course: Techniques that find lots of bugs in real (storage) system code. In Proceedings of the 5th Usenix Conference on File and Storage Technologies (Feb. 2007).
15. Haber, S., Stornetta, W.S. How to time-stamp a digital document. Journal of Cryptology 3, 2 (1991), 99–111.
16. Hafner, J.L., Deenadhayalan, V., Belluomini, W., Rao, K. Undetected disk errors in RAID arrays. IBM Journal of Research & Development 52, 4/5 (2008).
17. Jiang, W., Hu, C., Zhou, Y., Kanevsky, A. Are disks the dominant contributor for storage failures? A comprehensive study of storage subsystem failure characteristics. In Proceedings of the 6th Usenix Conference on File and Storage Technologies (2008).
18. Kelemen, P. Silent corruptions. In 8th Annual Workshop on Linux Clusters for Super Computing (2007).
19. Klima, V. Finding MD5 collisions - a toy for a notebook. Cryptology ePrint Archive, Report 2005/075; http://
20. Krioukov, A., Bairavasundaram, L.N., Goodson, G.R., Srinivasan, K., Thelen, R., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H. Parity lost and parity regained. In Proceedings of the 6th Usenix Conference on File and Storage Technologies (2008).
21. Maniatis, P., Roussopoulos, M., Giuli, T.J., Rosenthal, D.S.H., Baker, M., Muliadi, Y. Preserving peer replicas by rate-limited sampled voting. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (Oct. 2003), 44–59.
22. Marshall, C. "It's like a fire. You just have to move on": Rethinking personal digital archiving. In Proceedings of the 6th Usenix Conference on File and Storage Technologies (2008).
David S.H. Rosenthal has been an engineer in Silicon Valley for a quarter of a century, including as a Distinguished Engineer at Sun Microsystems and employee #4 at NVIDIA. For the last decade he has been working on the problems of long-term digital preservation under the auspices of the Stanford Library.
© 2010 ACM 0001-0782/10/1100 $10.00