tents very quickly, useful for scrubbing,
as well as answering queries.
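One common way to implement such a scrub is a background pass that recomputes a digest of each stored object and compares it against a previously recorded value. Here is a minimal sketch, assuming per-file SHA-256 digests are kept in a manifest mapping paths to hex digests; the manifest layout and names are assumptions for illustration, not taken from the article.

```python
import hashlib

def scrub(manifest: dict[str, str]) -> list[str]:
    """Recompute each file's SHA-256 and return the paths whose digests
    no longer match the stored values (i.e., detected damage)."""
    damaged = []
    for path, expected in manifest.items():
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
                h.update(chunk)
        if h.hexdigest() != expected:
            damaged.append(path)
    return damaged
```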
This is all encouraging, but once it
became possible to study the behavior
of disk storage at a large scale, it became
clear that system-level reliability fell far
short of the media specifications. This
should make us cautious about predicting a revolution from flash or any
other new storage technology.
Economics
Ever since Clayton Christensen published The Innovator’s Dilemma10 it
has been common knowledge that
disk-drive cost per byte halves every
two years. So you might argue that you don't need to know how many copies you need to keep your data safe for the long term; you just need to know how many you need to keep it safe for the next few years. After that, you can keep
more copies.
In fact, what has happened is that the capacity at constant cost has been doubling every two years, which is not quite the same thing. As long as this exponential grows faster than the rate at which you generate new data, adding copies through time is a feasible strategy.
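To make the arithmetic concrete, here is a small sketch. The budget, starting archive size, and 25% annual data-growth rate are invented for illustration; only the two-year doubling of capacity at constant cost comes from the text.

```python
# Illustration only: the budget, archive size, and 25%/year data growth are
# assumed numbers; the two-year doubling of capacity per dollar is the
# article's premise.

def copies_affordable(years=10, budget_buys_tb_now=100.0, archive_tb_now=50.0,
                      doubling_period_years=2.0, data_growth_per_year=0.25):
    """Print how many full copies a constant annual budget buys each year."""
    for year in range(years + 1):
        # Capacity a constant budget buys, growing on the two-year-doubling curve.
        buyable_tb = budget_buys_tb_now * 2 ** (year / doubling_period_years)
        # Size of the archive, growing at the assumed rate.
        archive_tb = archive_tb_now * (1 + data_growth_per_year) ** year
        print(f"year {year:2d}: budget buys {buyable_tb:8.1f} TB, "
              f"archive {archive_tb:7.1f} TB, copies ~{buyable_tb / archive_tb:.1f}")

copies_affordable()
```

With these assumed numbers the budget goes from covering about 2 copies to about 7 after a decade; the break-even point is an archive growing at the annualized doubling rate of roughly 41% per year.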
Alas, exponential curves can be deceiving. Moore’s Law has continued to
deliver smaller and smaller transistors.
A few years ago, however, it effectively
ceased delivering faster and faster CPU
clock rates. It turned out that, from a
business perspective, there were more
important things to spend the extra
transistors on than making a single
CPU faster. Like putting multiple CPUs
on a chip.
At a recent Library of Congress
meeting, Dave Anderson of Seagate
warned4 that something similar is
about to happen to hard disks. The necessary technologies, HAMR (heat-assisted magnetic recording) and BPM (bit-patterned media), are in place to deliver the 2013 disk generation (that is, a consumer 3.5-inch drive holding 8TB).

Table 1. The time to read an entire disk of various generations.

    Year            1990    2000    2006    2009     2013
    Time to read     240     720    6450    8000    12800

Disks have been getting bigger, but they have not been getting equivalently faster. This is to be expected; the data rate depends on the inverse of the diameter of a bit, but the capacity depends on the inverse of the area of a bit.
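The note to Table 1 can be made slightly more precise, under the assumption (not spelled out in the article) of a fixed platter count and rotation speed. If the linear dimension of a bit shrinks by a factor $s$, the bits passing under the head per revolution, and hence the data rate $R$, grow like $s$, while the capacity $C$ grows like $s^2$. The time to read a full disk therefore grows with capacity:

\[
  T_{\text{read}} = \frac{C}{R} \propto \frac{s^2}{s} = s \propto \sqrt{C},
\]

so a drive with four times the capacity takes roughly twice as long to read end to end.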
But the business case for building it
is weak. The cost of the transition to
BPM in particular is daunting.24 Laptops, netbooks, and now tablets are
destroying the market for the desktop
boxes that 3.5-inch drives go into. And
very few consumers fill up the 2009
2TB disk generation, so what value
does having an 8TB drive add? Let
alone the problem of how to back up
an 8TB drive on your desk!
What is likely to happen—indeed,
is already happening—is that the consumer market will transition rather
quickly to 2.5-inch drives. This will
eliminate the high-capacity $100 3.5-
inch drive, since it will no longer be
produced in consumer quantities.
Consumers will still buy $100 drives,
but they will be 2.5 inches and have
perhaps one-third the capacity. For a
while the $/byte curve will at best flatten, and more likely go up. The problem this poses is that large-scale disk
farms are currently built from consumer 3.5-inch drives. The existing players
in the market have bet heavily on the
exponential cost decrease continuing;
if they’re wrong, it will be disruptive.
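To see why the $/byte curve would rise, take the article's rough figures: a $100 3.5-inch drive of the 2009 generation holds 2TB, while a $100 2.5-inch drive with perhaps one-third that capacity holds about 0.67TB. On those approximate numbers,

\[
  \frac{\$100}{2\,\text{TB}} \approx \$0.05/\text{GB}
  \qquad\text{versus}\qquad
  \frac{\$100}{0.67\,\text{TB}} \approx \$0.15/\text{GB},
\]

so the cost per byte of the consumer drive roughly triples until 2.5-inch capacities catch up.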
The Bigger Picture
Our inability to compute how many
backup copies we need to achieve a reliability target is something we are just
going to have to live with. In practice,
we are not going to have enough backup copies, and stuff will get lost or damaged. This should not be a surprise, but
somehow it is. The fact that bits can be
copied correctly leads to an expectation that they always will be copied correctly, and then to an expectation that
digital storage will be reliable. There is
an odd cognitive dissonance between
this and people’s actual experience of
digital storage, which is that loss and
damage are routine occurrences.22
The fact that storage is not reliable
enough to allow us to ignore the problem of failures is just one aspect of a
much bigger problem looming over
computing as it continues to scale up.
Current long-running petascale high-performance computing applications
require complex and expensive checkpoint and restart schemes, because
the probability of a failure during execution is so high that restarting from