Flash Devices onto Servers
By David Roberts, Taeho Kgil, and Trevor Mudge
Flash is a widely used storage device in portable mobile
devices such as smart phones, digital cameras, and MP3 players. It provides high density and low power, properties that
are appealing for other computing domains. In this paper,
we examine its use in the server domain. Wear-out has the
potential to limit the use of Flash in this domain. To seriously
consider Flash in the server domain, architectural support
must exist to address this lack of reliability. This paper first
provides a survey of current and potential Flash usage models
in a data center. We then advocate using Flash as an extended
system memory usage model—OS managed disk cache—and
describe the necessary architectural changes. Specifically, we
propose two key changes. The first improves performance
and reliability by splitting Flash-based disk caches into separate read and write regions. The second improves reliability
by employing a programmable Flash memory controller. It
changes the error code strength (number of correctable bits)
and the number of bits that a memory cell can store (cell density) in response to the demands of the application.
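As a rough illustration of the first change, the following Python sketch models a disk cache split into a read region holding clean blocks and a write region holding dirty blocks. It is a toy model under stated assumptions, not the design evaluated in this paper: the class name `SplitFlashDiskCache`, the LRU replacement policy, the write-back-on-eviction behavior, and the dictionary standing in for the disk are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class SplitFlashDiskCache:
    """Toy model of a disk cache split into read and write regions.

    Hypothetical sketch: LRU replacement and write-back on eviction are
    assumptions made for illustration, not details taken from the paper.
    """

    def __init__(self, read_capacity, write_capacity, disk):
        self.read_region = OrderedDict()   # clean blocks, oldest first
        self.write_region = OrderedDict()  # dirty blocks, oldest first
        self.read_capacity = read_capacity
        self.write_capacity = write_capacity
        self.disk = disk                   # dict standing in for the disk

    def read(self, block):
        # The newest copy may be dirty, so check the write region first.
        for region in (self.write_region, self.read_region):
            if block in region:
                region.move_to_end(block)  # mark as most recently used
                return region[block]
        data = self.disk[block]            # miss: fetch from disk
        self.read_region[block] = data
        if len(self.read_region) > self.read_capacity:
            self.read_region.popitem(last=False)  # clean block, just drop it
        return data

    def write(self, block, data):
        # Any clean copy in the read region is now stale, so invalidate it.
        self.read_region.pop(block, None)
        self.write_region[block] = data
        self.write_region.move_to_end(block)
        if len(self.write_region) > self.write_capacity:
            victim, dirty = self.write_region.popitem(last=False)
            self.disk[victim] = dirty      # write back the evicted dirty block
```

In this sketch, read misses fill only the read region and writes land only in the write region, so evicting a clean block never triggers a disk write and write traffic is confined to one part of the Flash.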
Data centers are an integral part of today’s computing platforms. As cloud computing initiatives provide IT capabilities
that incorporate software as a service, they require internet service
providers such as Google and Yahoo to build large-scale data
centers hosting millions of servers. Energy efficiency has become
a first-class concern because of the increasing cost of operating a data center. Data centers based on off-the-shelf general-purpose processors are unnecessarily power hungry, require
expensive cooling systems, and occupy a large space. In fact,
the cost of powering and cooling these data centers makes up
a significant portion of the operating cost. Figure 1 breaks down
the annual operating cost for data centers. It shows that
the cost of powering and cooling servers is a growing share
of the overall operating cost of a data center.
System memory power (DRAM power) and disk power
contribute as much as 50% to the overall power consumption in a data center. Further, current trends suggest that
this percentage will continue to increase at a rapid rate as
we integrate more memory modules (DRAM) and disk drives
to improve throughput.
Fortunately, there are emerging memory devices in the
technology pipeline that may address this concern. These
devices typically display high density and consume low idle
power. Flash, Phase Change RAM (PCRAM) and Magnetic
RAM (MRAM) are examples.
In particular, Flash is an attractive technology that is
already deployed heavily in various computing platforms.
98 Communications of the ACM | April 2009 | Vol. 52 | No. 4
Today, NAND Flash can be found in handheld devices such
as smart phones, digital cameras, and MP3 players. This
has been made possible by its high density and low
power properties. These properties result from the simple structure of
Flash cells and from Flash's nonvolatility. Its popularity has meant that
it is the focus of aggressive process scaling and innovation.
The rapid rate of improvement in density has become the
primary driver to consider Flash in other usage models. There
are several Flash usage models in the data center that are currently being examined by industry and academia that address
rising power and cooling costs, among other things. Two common usage models are disk caches and storage devices. Some
efforts have led to product development,19 while others have
influenced storage and memory device standards.
This paper provides an overview of the benefits of integrating Flash onto a server. Specifically, in this paper:
1. We provide an analysis of current and potential Flash
usage models for servers.
2. We argue that the extended system memory model10 is
the best usage model to reduce data center energy when
the contribution of system memory power exceeds the
contribution of disk power.
3. We review two architectural modifications to improve
NAND-based disk caches.11 First, we show that by splitting Flash-based disk caches into read and write regions,
overall performance and reliability can be improved.
Figure 1: IDC estimates for annual cost spent on powering and
cooling servers and purchasing new servers. (Axes: installed base of servers, in millions; spending, in billions of dollars. Series: power and cooling; new server spending; installed base of servers.)
A previous version of this paper, entitled “Improving NAND
Flash-based Disk Caches” was published in Proceedings
of the International Symposium on Computer Architecture