the L2ARC writes slowly to its flash devices and data on the system may be
modified quickly (especially with the
use of flash as a log device), the contents of the L2ARC may not reflect the
same data stored on disk. During normal operation, dirtied and stale entries
are marked as such so they are ignored.
After a system reset, stale data may still be read off the cache device, but
metadata kept on the device and ZFS’s built-in checksums identify this
condition, and the system recovers seamlessly by reading the correct data
from disk.
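The recovery path described above can be sketched in a few lines. This is only a toy model under stated assumptions, not ZFS code: the class `ChecksummedCache`, its methods, and the choice of SHA-256 are all hypothetical names and choices made for illustration. It shows stale entries being skipped while their invalid marks are known, and being caught by a checksum mismatch after a simulated reset has discarded those marks.

```python
import hashlib


class ChecksummedCache:
    """Toy second-level cache; all names here are hypothetical, not ZFS code."""

    def __init__(self, disk):
        self.disk = disk        # authoritative store: block id -> bytes
        self.expected = {}      # trusted checksums: block id -> hex digest
        self.cache = {}         # cache device contents, possibly stale
        self.invalid = set()    # entries marked stale during normal operation

    @staticmethod
    def _checksum(data):
        return hashlib.sha256(data).hexdigest()

    def write(self, block, data):
        """Update the disk copy and mark any cached copy as stale."""
        self.disk[block] = data
        self.expected[block] = self._checksum(data)
        if block in self.cache:
            self.invalid.add(block)

    def fill(self, block):
        """Copy the current disk contents of a block onto the cache device."""
        self.cache[block] = self.disk[block]
        self.invalid.discard(block)

    def reset(self):
        """Simulate a system reset: the in-memory stale marks are lost."""
        self.invalid.clear()

    def read(self, block):
        """Return (data, source), falling back to disk on any doubt."""
        if block in self.cache and block not in self.invalid:
            data = self.cache[block]
            if self._checksum(data) == self.expected[block]:
                return data, "cache"
            # Checksum mismatch: the cached copy is stale, so drop it.
            del self.cache[block]
        return self.disk[block], "disk"
```

In this sketch the checksum lives outside the cache device, so a stale cached block can never validate against it; the cost of a stale entry is one wasted cache read followed by a normal disk read.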
For working sets that are larger than
the DRAM capacity, flash offers an avenue to access that working set much
faster than could otherwise be done
by disks of any speed. Even for working sets that could comfortably fit in
DRAM, if the absolute performance of
DRAM isn’t necessary, it may be more
economical to skimp on DRAM for the
main ARC and instead cache the data
on flash. As this use of flash meshes
perfectly with its natural strengths,
suitable devices can be produced quite
cheaply and still have a significant performance advantage over fast disks.
Although flash is still more expensive
than fast disks per unit storage, caching even a very large working set in
flash is often cheaper than storing all
data on fast disks.
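The economic argument above comes down to simple arithmetic, sketched here. The function names and every price in the usage example are hypothetical placeholders chosen only to make the arithmetic visible; they are not figures from this article or from any vendor.

```python
def storage_costs(total_gb, working_set_gb,
                  fast_disk_per_gb, slow_disk_per_gb, flash_per_gb):
    """Compare two designs and return (all_fast_disk_cost, hybrid_cost).

    all_fast_disk: every byte lives on fast disks.
    hybrid: every byte lives on cheap disks, and the working set is
    additionally cached in flash.
    """
    all_fast = total_gb * fast_disk_per_gb
    hybrid = total_gb * slow_disk_per_gb + working_set_gb * flash_per_gb
    return all_fast, hybrid


def break_even_fraction(fast_disk_per_gb, slow_disk_per_gb, flash_per_gb):
    """Largest working-set fraction for which the hybrid design is cheaper."""
    return (fast_disk_per_gb - slow_disk_per_gb) / flash_per_gb


# Hypothetical prices per GB: flash costs more than fast disk, as in the text.
all_fast, hybrid = storage_costs(
    total_gb=10_000, working_set_gb=500,
    fast_disk_per_gb=2.0, slow_disk_per_gb=0.5, flash_per_gb=10.0)
```

With these placeholder prices the hybrid design costs half as much as the all-fast-disk design, and it stays cheaper as long as the working set is under 15% of the total data. Real prices will shift the break-even point, but the shape of the comparison is the same.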
The Impact of Flash
By combining the use of flash as an
intent-log to reduce write latency with
flash as a cache to reduce read latency,
we can create a system that performs
far better and consumes less power
than other systems of similar cost. It is
now possible to construct systems with
a precise mix of write-optimized flash,
flash for caching, DRAM, and cheap
disks designed specifically to achieve
the right balance of cost and performance for any given workload, with data
automatically handled by the appropriate level of the hierarchy. It is also possible to address specific performance
problems with directed rather than
general solutions. Through the use of
smarter software, we can build systems
that integrate different technologies to
extract the best qualities of each. Further, the use of smarter software will
allow flash vendors to build solutions
for specific problems rather than gussying up flash to fit the anachronistic
constraints of a hard drive. ZFS is just
one example among many of how one
could apply flash as a log and a cache
to deliver total system performance.
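To make the combination of a flash log and a flash cache concrete, here is a minimal sketch of such a hierarchy. It is an illustration under loud assumptions, not how ZFS is implemented: `HybridPool` and all of its attributes are hypothetical, the DRAM cache is a plain LRU, and blocks evicted from DRAM are simply copied to the flash cache.

```python
from collections import OrderedDict


class HybridPool:
    """Toy storage hierarchy: DRAM, flash log, flash cache, cheap disk."""

    def __init__(self, dram_blocks):
        self.dram_blocks = dram_blocks
        self.dram = OrderedDict()  # primary read cache, kept in LRU order
        self.flash_cache = {}      # second-level read cache on flash
        self.flash_log = []        # intent log on write-optimized flash
        self.pending = {}          # acknowledged writes not yet on disk
        self.disk = {}             # slow, cheap, authoritative storage

    def write(self, block, data):
        """Acknowledge a synchronous write once it is in the flash log."""
        self.flash_log.append((block, data))
        self.pending[block] = data
        # Any cached copies of this block are now stale.
        self.dram.pop(block, None)
        self.flash_cache.pop(block, None)
        return "flash log"

    def flush(self):
        """Write pending data to disk; the log entries are then obsolete."""
        self.disk.update(self.pending)
        self.pending.clear()
        self.flash_log.clear()

    def read(self, block):
        """Return (data, tier) from the fastest tier holding the block."""
        if block in self.pending:
            return self.pending[block], "dram"
        if block in self.dram:
            self.dram.move_to_end(block)
            return self.dram[block], "dram"
        if block in self.flash_cache:
            data, tier = self.flash_cache[block], "flash cache"
        else:
            data, tier = self.disk[block], "disk"
        self._remember(block, data)
        return data, tier

    def _remember(self, block, data):
        """Cache a block in DRAM, spilling the coldest blocks to flash."""
        self.dram[block] = data
        while len(self.dram) > self.dram_blocks:
            old, old_data = self.dram.popitem(last=False)
            self.flash_cache[old] = old_data
```

The caller never chooses a tier: writes are acknowledged from the log, and reads are served from whichever level currently holds the block, which is the sense in which data is handled automatically by the appropriate level of the hierarchy.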
Most generally, this new flash tier can
be thought of as a radical form of hierarchical storage management (HSM)
without the need for explicit management. Although these solutions offer
concrete methods of integrating flash
into a storage system, they also raise
a number of questions and force us to
reconsider many aspects of the system.
For example, how should we connect
flash to the system? SSDs are clearly an
easy approach, but there may be faster
interfaces such as the memory bus.
More broadly, how will this impact the
balance of a system? As more requests
are serviced from flash, it may be possible to provision systems with far more
network connectivity to clients than
bus connectivity to disks.
In that vein, flash opens the possibility of using disks that are even slower,
cheaper, and more power efficient. We
can now scoff at a 15,000RPM drive as
an untargeted half-measure for a variety of problems, but couldn’t the same
argument be applied to a 7,200RPM
drive? Just because it’s at the low end
of the performance curve doesn’t mean
it’s at the bottom. The 5,400RPM drive
is quite common today and consumes
less power still. Can the return of the
3,600RPM drive be far behind? The cost
of power has continued to rise, but even
if that trend were to plateau, a large
portion of the total cost of ownership
of a storage system is directly tied to its
power use—and that’s to say nothing
of the increased market emphasis on
green design. Flash provides solutions
that require us to rethink how we build
systems and challenge us to develop
smarter software to drive those systems;
the result will be faster systems that are
cheaper and greener.
Adam Leventhal (ahl@sun.com) is a staff engineer
on Sun Microsystems’ Fishworks advanced product
development team, San Francisco, CA.
Props to Neil Perrin for developing slogs, to Brendan
Gregg for developing the L2ARC, and to Jeff Bonwick and
Matt Ahrens for reinventing storage with ZFS.