the L2ARC writes slowly to its flash devices and data on the system may be modified quickly (especially with the use of flash as a log device), the contents of the L2ARC may not reflect the same data stored on disk. During normal operation, dirtied and stale entries are marked as such so they are ignored. After a system reset, though stale data may be read off the cache device, metadata kept on the device and ZFS’s built-in checksums are used to identify this condition and seamlessly recover by reading the correct data from disk.
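The recovery path described above can be sketched in a few lines. This is a toy model rather than ZFS code: the `CachedReader` class, its dictionary-backed `disk` and `cache`, and the use of SHA-256 as the checksum are all assumptions made for illustration. The idea it shows is that the reader already knows the expected checksum of a block, so a stale copy on the cache device fails verification and the read falls through to disk.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Stand-in block checksum (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).digest()


class CachedReader:
    """Hypothetical read path: try the flash cache, fall back to disk."""

    def __init__(self, disk: dict, cache: dict):
        self.disk = disk      # block id -> bytes, the authoritative copy
        self.cache = cache    # block id -> bytes, possibly stale after a reset
        self.fallbacks = 0    # number of reads recovered from disk

    def read(self, block_id: int, expected: bytes) -> bytes:
        data = self.cache.get(block_id)
        if data is not None:
            if checksum(data) == expected:
                return data   # cached copy matches the expected checksum
            # Stale or corrupt cache entry: discard it and recover from disk.
            del self.cache[block_id]
            self.fallbacks += 1
        data = self.disk[block_id]
        if checksum(data) != expected:
            raise IOError(f"block {block_id} failed verification on disk")
        return data


# The cache still holds an old version of block 1 after a simulated reset.
disk = {1: b"current contents"}
cache = {1: b"old contents"}
reader = CachedReader(disk, cache)
result = reader.read(1, checksum(b"current contents"))
```

In this sketch the caller never sees the stale data. The mismatch is detected on the read itself, which is what makes the recovery seamless.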
For working sets that are larger than the DRAM capacity, flash offers an avenue to access that working set much faster than could otherwise be done by disks of any speed. Even for working sets that could comfortably fit in DRAM, if the absolute performance of DRAM isn’t necessary, it may be more economical to skimp on DRAM for the main ARC and instead cache the data on flash. As this use of flash meshes perfectly with its natural strengths, suitable devices can be produced quite cheaply and still have a significant performance advantage over fast disks. Although flash is still more expensive than fast disks per unit storage, caching even a very large working set in flash is often cheaper than storing all data on fast disks.
The Impact of Flash
By combining the use of flash as an intent-log to reduce write latency with flash as a cache to reduce read latency, we can create a system that performs far better and consumes less power than other systems of similar cost. It is now possible to construct systems with a precise mix of write-optimized flash, flash for caching, DRAM, and cheap disks designed specifically to achieve the right balance of cost and performance for any given workload, with data automatically handled by the appropriate level of the hierarchy. It is also possible to address specific performance problems with directed rather than general solutions. Through the use of smarter software, we can build systems that integrate different technologies to extract the best qualities of each. Further, the use of smarter software will allow flash vendors to build solutions for specific problems rather than gussying up flash to fit the anachronistic constraints of a hard drive.
ZFS is just one example among many of how one could apply flash as a log and a cache to deliver total system performance. Most generally, this new flash tier can be thought of as a radical form of hierarchical storage management (HSM) without the need for explicit management. Although these solutions offer concrete methods of integrating flash into a storage system, they also raise a number of questions and force us to reconsider many aspects of the system. For example, how should we connect flash to the system? SSDs are clearly an easy approach, but there may be faster interfaces such as the memory bus. More broadly, how will this impact the balance of a system? As more requests are serviced from flash, it may be possible to provision systems with far more network connectivity to clients than bus connectivity to disks.
In that vein, flash opens the possibility of using disks that are even slower, cheaper, and more power efficient. We can now scoff at a 15,000RPM drive as an untargeted half-measure for a variety of problems, but couldn’t the same argument be applied to a 7,200RPM drive? Just because it’s at the low end of the performance curve doesn’t mean it’s at the bottom. The 5,400RPM drive is quite common today and consumes less power still. Can the return of the 3,600RPM drive be far behind? The cost of power has continued to rise, but even if that trend were to plateau, a large portion of the total cost of ownership of a storage system is directly tied to its power use—and that’s to say nothing of the increased market emphasis on green design. Flash provides solutions that require us to rethink how we build systems and challenge us to develop smarter software to drive those systems; the result will be faster systems that are cheaper and greener.
Adam Leventhal (ahl@sun.com) is a staff engineer on Sun Microsystems’ Fishworks advanced product development team, San Francisco, CA.
Props to Neil Perrin for developing slogs, to Brendan Gregg for developing the L2ARC, and to Jeff Bonwick and Matt Ahrens for reinventing storage with ZFS.