DOI: 10.1145/1498765.1498791
By David Roberts, Taeho Kgil, and Trevor Mudge
Flash is widely used for storage in portable mobile devices such as smart phones, digital cameras, and MP3 players. It provides high density and low power, properties that are appealing for other computing domains. In this paper, we examine its use in the server domain. Wear-out has the potential to limit the use of Flash in this domain. For Flash to be seriously considered in the server domain, architectural support must exist to address this lack of reliability. This paper first provides a survey of current and potential Flash usage models in a data center. We then advocate using Flash as extended system memory, in the form of an OS-managed disk cache, and describe the necessary architectural changes. Specifically, we propose two key changes. The first improves performance and reliability by splitting Flash-based disk caches into separate read and write regions. The second improves reliability by employing a programmable Flash memory controller that changes the error code strength (number of correctable bits) and the number of bits that a memory cell can store (cell density) in response to the demands of the application.
Data centers are an integral part of today’s computing platforms. As cloud computing initiatives deliver IT capabilities that incorporate software as a service, internet service providers such as Google and Yahoo must build large-scale data centers hosting millions of servers. Energy efficiency has become a first-class concern as the cost of operating a data center rises. Data centers based on off-the-shelf general-purpose processors are unnecessarily power hungry, require expensive cooling systems, and occupy a large space. In fact, powering and cooling these data centers accounts for a significant portion of the operating cost. Figure 1 breaks down the annual operating cost for data centers. It shows that the cost of powering and cooling servers makes up a growing share of the overall operating cost of a data center.
System memory power (DRAM power) and disk power contribute as much as 50% to the overall power consumption in a data center. Further, current trends suggest that this percentage will continue to increase at a rapid rate as we integrate more memory modules (DRAM) and disk drives to improve throughput.
Fortunately, there are emerging memory devices in the technology pipeline that may address this concern. These devices typically display high density and consume low idle power. Flash, Phase Change RAM (PCRAM) and Magnetic RAM (MRAM) are examples.
In particular, Flash is an attractive technology that is already deployed heavily in various computing platforms.
98 Communications of the ACM | April 2009 | Vol. 52 | No. 4
Today, NAND Flash can be found in handheld devices such as smart phones, digital cameras, and MP3 players. This has been made possible by its high density and low power. These properties result from the simple structure of Flash cells and their nonvolatility. Its popularity has made it the focus of aggressive process scaling and innovation.
The rapid rate of improvement in density has become the primary driver to consider Flash in other usage models. Industry and academia are currently examining several Flash usage models in the data center that address rising power and cooling costs, among other things. Two common usage models are disk caches and storage devices. Some efforts have led to product development, 8, 19 while others have influenced storage and memory device standards. 16, 18
This paper provides an overview of the benefits of integrating Flash onto a server. Specifically, in this paper:
1. We provide an analysis of current and potential Flash usage models for servers.
2. We argue that the extended system memory model 10 is the best usage model to reduce data center energy when the contribution of system memory power exceeds the contribution of disk power.
3. We review two architectural modifications to improve NAND-based disk caches. 11 First, we show that by splitting Flash-based disk caches into read and write regions, overall performance and reliability can be improved.
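The idea of splitting a Flash-based disk cache into read and write regions can be pictured with a small sketch. The code below is illustrative only and is not the authors' design: the class names `LRURegion` and `SplitFlashDiskCache`, the LRU replacement policy, the write-back-on-eviction behavior, and the Python dict standing in for the disk are all assumptions made for this example. It shows only the structural point that clean blocks fetched from disk and dirty blocks written by the application are kept in separate regions.

```python
from collections import OrderedDict


class LRURegion:
    """Fixed-capacity LRU map from disk block number to cached data (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, block):
        """Return cached data and mark it most recently used, or None on a miss."""
        if block not in self.blocks:
            return None
        self.blocks.move_to_end(block)
        return self.blocks[block]

    def put(self, block, data):
        """Insert or update a block; return the evicted (block, data) pair, if any."""
        self.blocks[block] = data
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            return self.blocks.popitem(last=False)  # least recently used entry
        return None

    def discard(self, block):
        self.blocks.pop(block, None)


class SplitFlashDiskCache:
    """Hypothetical disk cache with separate read and write regions."""

    def __init__(self, read_capacity, write_capacity, disk):
        self.read_region = LRURegion(read_capacity)    # clean blocks only
        self.write_region = LRURegion(write_capacity)  # dirty blocks only
        self.disk = disk  # a dict standing in for the backing disk (assumption)

    def read(self, block):
        # The write region holds the newest copy, so check it first.
        data = self.write_region.get(block)
        if data is not None:
            return data
        data = self.read_region.get(block)
        if data is not None:
            return data
        data = self.disk[block]
        # Evicted read-region blocks are clean and can simply be dropped.
        self.read_region.put(block, data)
        return data

    def write(self, block, data):
        # Invalidate any stale clean copy, then keep the dirty block in the write region.
        self.read_region.discard(block)
        evicted = self.write_region.put(block, data)
        if evicted is not None:
            old_block, old_data = evicted
            self.disk[old_block] = old_data  # write back the dirty block on eviction
```

In this sketch, blocks are assumed never to hold `None` as data, since `None` signals a miss. A caller would construct the cache with two capacities, for example `SplitFlashDiskCache(2, 2, disk)`, and the relative sizing of the two regions is left open here.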
Figure 1: IDC estimates for annual cost spent on powering and cooling servers and purchasing new servers. 17
[Chart. Axes: spending (billions of dollars), 0 to 125; installed base of servers (millions), 0 to 50; years 1996 to 2010. Series: power and cooling, new server spending, installed base of servers.]
A previous version of this paper, entitled “Improving NAND Flash-based Disk Caches,” was published in Proceedings of the International Symposium on Computer Architecture (ISCA 2008).