know more about the data and the applications. Look at the history of an
earlier technology we all know about:
RAID. There are multiple reasons to
do RAID. You do it for availability, to
protect the data, and for performance
benefits. There are also areas where
RAID does not provide any benefits.
When we ask our customers why they
are doing RAID, nobody knows which
of the benefits are more important to them.
We’ve spent all this time sending
them to training classes, teaching
them about the various RAID levels,
and how you calculate the XORs. What
they know is if they want to protect
their data, they’ve got to turn it up to
RAID5, and if they’ve got money lying around, they want to turn it up to
RAID10. They don’t know why they’re
doing that, they’re just saying, “This is
what I’m supposed to do so I’ll do it.”
There isn’t the deeper understanding
of how the data and applications are
being used. The model is not there.
MARGO SELTZER: I don’t think that’s
going to change. We’re going to have
to figure out the RAID equivalent for
power management because I don’t
think people are going to figure out
their data that way. It’s not something
that people know or understand.
KIRK MCKUSICK: Or they’re going to
put flash in front of the disk, so you
can have the disk power down. You can
dump it into flash and then update the
disk when it becomes available.
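The arrangement McKusick describes can be sketched roughly as a write-back buffer. This is only a toy illustration under stated assumptions, not something the panel presented: the class `FlashBufferedDisk` and its methods are hypothetical names, the dictionaries stand in for real devices, and spin-up latency is ignored. Writes land in flash without waking the disk, and pending writes are drained once the disk is available; a read that misses flash still has to wake the disk.

```python
class FlashBufferedDisk:
    """Toy model (hypothetical): flash absorbs writes while the disk is spun down."""

    def __init__(self):
        self.disk = {}         # block number -> data; stands in for the platters
        self.flash = {}        # pending writes that have not reached the disk yet
        self.spinning = False  # assume the disk starts powered down

    def write(self, block, data):
        # Writes never need to wake the disk; flash absorbs them.
        self.flash[block] = data

    def read(self, block):
        # The newest copy of a block may still be sitting in flash.
        if block in self.flash:
            return self.flash[block]
        # A miss in flash has to wake the disk.
        self.spin_up()
        return self.disk.get(block)

    def spin_up(self):
        self.spinning = True

    def spin_down(self):
        self.spinning = False

    def flush(self):
        # Once the disk is available, drain the pending writes to it.
        self.spin_up()
        self.disk.update(self.flash)
        self.flash.clear()
```

In this sketch, a workload of writes followed by reads of recently written blocks leaves `spinning` false throughout; only `flush()` or a read miss wakes the disk.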
ERIC BREWER: Many disks have some
NVRAM (Non-Volatile RAM) in them
anyway, so I feel like one could absorb
the write burst while the drive wakes
up. We should be able to hide that.
At least in my consumer case, I know
that one disk can handle my read load.
Enterprise is more complicated, but
that’s a lot of disks we can shut down.
STEVE KLEIMAN: I disagree. Flash caches can help with a lot of applications
being consumed in the enterprise.
However, because there is a 10-to-1
cost factor, there are areas where flash
adds no benefit. You have to let the disk
show through so that cache misses are
addressed. That is very hard to predict.
We’ve long passed the point where
you can delete something. Typically,
you don’t know what is important and
what is not and you can’t spend the
time and money to figure it out. So
you end up keeping everything, which
means in some sense everything’s
equally valued. The problem is that you
need a certain minimum level of reliability
or redundancy built into all the data
because it’s hard to distinguish what
is important and what’s not. It’s not
just RAID. People are going to want to
have a disaster recovery strategy. They’re
not going to have just one copy of this
thing, RAID or no RAID.
ERIK RIEDEL: At a recent event in my
department to discuss storage power,
we had a vendor presentation that
showed a CPU scaling system. When
system administrators feel they are getting
close to peak power they can access
a master console and turn back all the
processors by 20%. That’s a system that
they have live running today. And they
do it without fear. They figure that applications
are balanced and somehow
all the applications (the Web servers,
the database servers) will adjust to everything
running 20% slower.
When our group saw that, it became
clear that we are going to have to figure
out what the equivalent of that is for
storage. We need to be able to architect
storage systems so that an administrator
has the option of saying, “I need it
to consume 20% or 30% less power for
the next couple of hours.”
MACHE CREEGER: A mantra that I
learned early on in databases was more
spindles are better. More spindles allow
you to have more parallelism and a wider
data path. What you’re all saying now
is that we have to challenge that. More
spindles are better, but at what cost?
Yes, I can run a database on one spindle,
but it’s not going to be a particularly
responsive one. It won’t have all the
performance of a 10-spindle database,
but it’s going to be cheaper to run.
STEVE KLEIMAN: If you think about the
database example, I don’t know about
that. You can put most of the working
set on flash. You don’t have to worry
about spinning it.
MARGO SELTZER: That’s the key insight
here. Flash has two attractive properties: It handles random I/O load really
well and it’s also very power efficient.
I think you have to look at how that’s
going to play into the storage hierarchy
and how it’s going to help.
In some cases you may be using flash
as a performance enhancer, as a power
enhancer, or both. This gets back to