5. RESULTS
Figure 8: Breakdown of system memory and disk power and network
bandwidth for architectures with/without a Flash-based disk cache.
Legend: mem RD power, mem WR power, mem IDLE power, disk power,
network bandwidth.
5.1. System memory and disk energy efficiency
Figure 8 shows a breakdown of power consumption in the
system memory and disk drive (left y-axis), along with the
measured network bandwidth (right y-axis). Throughput
measured as network bandwidth is a good indicator of overall
system performance because it represents the amount of data
that the server can handle in each configuration. We
calculated power for a DRAM-only system memory and for a
heterogeneous (DRAM + Flash) system memory that uses Flash
as a secondary disk cache with hard disk drive support. We
assume equal die area for the DRAM-only and the DRAM + Flash
system memories. Figure 8 shows the reduction in disk drive
power and system memory power that results from adopting
Flash. Our primary power savings for system memory come from
using Flash instead of DRAM for a large portion of the disk
cache. The power savings for disk come from reducing the
number of disk accesses, thanks to the bigger overall disk
cache made possible by adopting Flash. We also see improved
throughput with Flash because it has lower access latency
than disk.
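As a minimal sketch of how a power breakdown of this kind can be turned into a single savings figure, the snippet below sums the four components named in the Figure 8 legend for two configurations and reports the relative reduction. The class name, function name, and all wattage values are hypothetical placeholders chosen for illustration; they are not measurements from this study.

```python
from dataclasses import dataclass


@dataclass
class PowerBreakdown:
    """Average power (W) per component, mirroring the Figure 8 legend."""
    mem_read: float
    mem_write: float
    mem_idle: float
    disk: float

    def total(self) -> float:
        # Overall memory + disk power is the sum of the stacked components.
        return self.mem_read + self.mem_write + self.mem_idle + self.disk


def relative_saving(baseline: PowerBreakdown, candidate: PowerBreakdown) -> float:
    """Fraction of baseline memory + disk power saved by the candidate."""
    return 1.0 - candidate.total() / baseline.total()


# Placeholder values only; NOT measurements from the paper.
dram_only = PowerBreakdown(mem_read=0.5, mem_write=0.5, mem_idle=2.0, disk=5.0)
dram_flash = PowerBreakdown(mem_read=0.25, mem_write=0.25, mem_idle=1.0, disk=2.5)

print(f"saving: {relative_saving(dram_only, dram_flash):.0%}")
```

Keeping the components separate, rather than storing only a total, makes it possible to see whether a saving comes from memory idle power or from reduced disk activity.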
5.2. Impact of BCH code strength on system performance
We have already mentioned that BCH latency incurs an additional delay beyond the initial access latency. We simulated
the performance of the SPECWeb99 and dbt2 benchmarks
to observe the effect of increasing code strength that would
occur as Flash wears out. It is assumed that all Flash blocks
have the same ECC strength applied. We also measured performance for code strengths (more than 12 bits per page)
that are beyond our Flash memory controller’s capabilities
to fully capture the performance trends.
From Figure 9, we can see that throughput degrades
slowly with ECC strength. dbt2 suffers a greater performance
loss than SPECWeb99 after 15 bits per page. The disk-bound
nature of dbt2 makes it more sensitive to ECC strength.
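To illustrate why throughput can degrade only slowly as code strength rises, the following sketch models average access latency for a Flash disk cache backed by a hard disk. It is a toy model under stated assumptions: BCH decode delay is taken to grow linearly with the number of correctable bits, and the per-bit delay, Flash read latency, disk latency, and hit rate are all hypothetical values, not parameters from our simulator.

```python
def bch_decode_latency_us(t_bits: int, per_bit_us: float = 5.0) -> float:
    """Hypothetical model: decode delay grows linearly with correctable bits."""
    return per_bit_us * t_bits


def avg_access_latency_us(
    t_bits: int,
    flash_read_us: float = 25.0,   # assumed raw Flash page read latency
    disk_us: float = 5000.0,       # assumed hard disk access latency
    flash_hit_rate: float = 0.9,   # assumed fraction of accesses served by Flash
) -> float:
    """Average latency when Flash hits pay the BCH delay and misses go to disk."""
    flash_us = flash_read_us + bch_decode_latency_us(t_bits)
    return flash_hit_rate * flash_us + (1.0 - flash_hit_rate) * disk_us


# Sweep code strength (bits per page) and print the modeled average latency.
for t in (1, 4, 8, 12, 16):
    print(t, round(avg_access_latency_us(t), 1))
```

Under these assumed numbers the disk term dominates the average, so the added BCH delay shifts the result only modestly; a different hit rate or latency ratio would change that balance.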
[Figure 8 plot area. Panel (a): dbt2. Left y-axis: overall power (W); right y-axis: normalized network bandwidth. Configurations: DDR2 512MB + 60GB HDD versus DDR2 256MB + Flash 1GB + 60GB HDD, and DDR2 512MB + 60GB HDD versus DDR2 128MB + Flash 2GB + 60GB HDD.]
5.3. Improved Flash lifetime with reliability support in
Flash memory controller
Figure 10 shows a comparison of the normalized number
of accesses required to reach the point of total Flash failure
where none of the Flash pages can be recovered. We compare our programmable Flash memory controller with a
BCH 1-bit error correcting controller. Our studies show that
for typical workloads, our programmable Flash memory
controller extends lifetime by a factor of 20 on average. For
a workload that would previously limit Flash lifetime to 6
months, we show it can now operate for more than 10 years
using our programmable Flash memory controller. This was
accompanied by a graceful increase in overall access latency
as Flash wore out.
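The lifetime figures above follow from simple scaling, sketched below under the assumption that the workload's access rate stays constant, so that lifetime is proportional to the number of accesses the Flash can sustain before failure. The function name is hypothetical, and the factor of 20 is the reported average, so an individual workload may see a different extension.

```python
def extended_lifetime_years(baseline_months: float, extension_factor: float) -> float:
    """Lifetime in years after scaling a baseline lifetime given in months."""
    return baseline_months * extension_factor / 12.0


# A 6-month baseline scaled by the average factor of 20.
print(extended_lifetime_years(6, 20))  # 10.0
```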
[Figure 8, panel (b): SPECWeb99]
6. CONCLUSIONS AND FUTURE WORK
This paper presents the challenges and opportunities in integrating Flash onto a server platform. Flash is an attractive
candidate for integration because it reduces power consumption in system memories and disk drives while improving
overall throughput. This in turn can reduce the operating
cost of a server platform, which is a growing concern in a data
center. We presented three key usage models of Flash and
examined an architecture for the “extended system memory”
usage model. Our proposed architecture carefully manages
the Flash and uses it as a secondary disk cache split into a
separate read cache and write cache. We observed a dramatic
improvement in power consumption and performance. In
our simulation studies, a Flash-based disk cache improved
the dbt2 database benchmark performance by over 25%
while reducing memory and disk power by 44%. For a web
server benchmark, performance improvement was around
11% with a power reduction of 73%. This does not account
for potentially larger systemwide energy savings obtained
from speeding up system response and increasing idle time.
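As a rough worked example of how the power and performance figures above combine, the sketch below estimates memory and disk energy per unit of work relative to the baseline. It assumes that, for a fixed amount of work, run time scales inversely with throughput, and it treats "over 25%" and "around 11%" as exactly 25% and 11%; the function name is hypothetical and the results are illustrative, not reported measurements.

```python
def energy_per_unit_work(power_reduction: float, throughput_gain: float) -> float:
    """Relative memory + disk energy per unit of work (baseline = 1.0).

    Energy = power x time, and time for fixed work scales as 1 / throughput.
    """
    return (1.0 - power_reduction) / (1.0 + throughput_gain)


dbt2 = energy_per_unit_work(power_reduction=0.44, throughput_gain=0.25)
web = energy_per_unit_work(power_reduction=0.73, throughput_gain=0.11)
print(f"dbt2: {dbt2:.3f}, web server: {web:.3f}")
# prints "dbt2: 0.448, web server: 0.243"
```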
Assuming that a server can enter a low-power mode while
April 2009 | Vol. 52 | No. 4 | Communications of the ACM