Servers clearly benefit from all three usage models that
essentially integrate Flash as a faster hard disk or disk cache. All
usage models help (1) reduce unnecessary standby power from
hard disk drives and (2) improve overall throughput by reading
and writing from disk cache instead of a hard disk drive.
In the remainder of this paper, we examine Flash-based
disk cache architectures that improve Flash manageability
and reliability in the extended system memory usage model.
We believe this usage model is effective in addressing the
increasing power consumption in system memory. Our studies on servers have revealed the system memory architecture
to be the critical component in delivering high throughput
in a data center.
3. PROPOSED ARCHITECTURE
3.1. Architecture of the Flash-based disk cache
The right side of Figure 3 shows the Flash-based disk cache
architecture for the extended system memory usage model.
Compared to a conventional DRAM-only architecture shown
on the left side of Figure 3, our proposed architecture uses a
two-level disk cache, composed of a relatively small DRAM in
front of a dense Flash. The much lower access time of DRAM
allows it to act as a cache for the Flash without significantly
increasing power consumption. A Flash memory controller
is also required for reliability management.
Our design uses a NAND Flash that stores 2 bits per cell
(MLC) and is capable of switching from MLC to SLC mode
using techniques proposed in Flex-OneNAND6 and Cho.
Finally, our design uses variable-strength ECC to improve
reliability while adding the smallest possible delay.
Figure 3: 1GB DRAM is replaced with a smaller 256MB DRAM and
1GB NAND-based Flash. Additional components are added to control
the Flash. (a) Standard system; (b) Proposed system.
Operating System Support: Our proposed architecture
requires additional data structures to manage the Flash
blocks and pages. These tables are read from the hard disk
drive and stored in DRAM at run-time to reduce access
latency and mitigate wear-out. Together, they describe
whether pages exist in DRAM or Flash, and specify the various Flash memory configuration options for reliability. For
example, the FlashCache Hash Table allows the operating
system to quickly look up the location of a file page. The
Flash Page Status Table keeps track of the ECC strength,
MLC/SLC mode, and access frequency for each page. Each
erase block has an entry in the Block Status Table that
records how worn out the block is. Finally, the Global Status
Table tracks how effectively the Flash-based disk cache is
satisfying requests; this is the metric the system tries to
maximize at run time.
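The four tables can be sketched as simple in-memory structures. This is a minimal illustration only: the field names, types, and the hit-rate metric used for the Global Status Table are our assumptions, not the paper's actual layout.

```python
# Sketch (assumed layout) of the four OS-managed tables described above.
from dataclasses import dataclass

@dataclass
class PageStatus:           # Flash Page Status Table: one entry per Flash page
    ecc_strength: int = 1   # ECC level currently applied to the page
    slc_mode: bool = False  # True if the page's cells run in SLC mode
    access_count: int = 0   # access frequency

@dataclass
class BlockStatus:          # Block Status Table: one entry per erase block
    erase_count: int = 0    # proxy for how worn out the block is

@dataclass
class GlobalStatus:         # Global Status Table: single entry
    hits: int = 0
    requests: int = 0
    def hit_rate(self) -> float:  # assumed figure of merit the OS maximizes
        return self.hits / self.requests if self.requests else 0.0

# FlashCache Hash Table: file page id -> (location, page index)
flash_cache: dict[int, tuple[str, int]] = {}
flash_cache[42] = ("flash", 7)    # file page 42 cached at Flash page 7

g = GlobalStatus(hits=90, requests=100)
```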
The storage overhead of the four tables is less than 2%
of the Flash size. The FlashCache Hash Table and Flash Page
Status Table are the primary contributors because an entry
is needed for each Flash page. Our Flash-based disk cache
is managed in software (OS code) using the tables described
above. We found the performance overhead in executing
this code to be minimal.
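A back-of-envelope calculation makes the under-2% claim plausible. The page size and per-entry byte counts below are our assumptions (the paper does not give them), chosen only to show the order of magnitude for the two per-page tables:

```python
# Rough storage-overhead check for the per-page tables, with assumed sizes.
FLASH_BYTES  = 1 << 30   # 1GB Flash, as in Figure 3
PAGE_BYTES   = 2 << 10   # 2KB NAND page (assumed)
HASH_ENTRY   = 16        # bytes per FlashCache Hash Table entry (assumed)
STATUS_ENTRY = 4         # bytes per Flash Page Status Table entry (assumed)

pages    = FLASH_BYTES // PAGE_BYTES            # 524,288 pages
overhead = pages * (HASH_ENTRY + STATUS_ENTRY)  # total per-page metadata
fraction = overhead / FLASH_BYTES
print(f"{fraction:.2%}")                        # under 2% with these sizes
```

Even doubling the assumed entry sizes keeps the metadata within the stated 2% bound.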
Splitting Flash into Read and Write Regions: We divide
the Flash into a read disk cache and a write disk cache. Read
caches are less susceptible to out-of-place writes, which
reduce the read cache capacity and increase the risk of garbage collection. An out-of-place write happens when existing
data is modified, because Flash has to be erased before it
can be written to a second time. It is simple to invalidate the
old data page (using the Page Status Table and modifying
the Hash Table) then write new data into a previously erased
page. However, the invalid pages accumulate as wasted
space that will have to be garbage collected later. By splitting
Flash into read and write regions, we were able to cut down on
time-consuming garbage collections.
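The out-of-place write described above amounts to two table updates plus a write to a free page. The sketch below is a simplified model under our own naming; the real tables hold more state (ECC strength, MLC/SLC mode, wear counts):

```python
# Minimal model of an out-of-place write: Flash pages cannot be rewritten
# until their block is erased, so the old copy is marked invalid in the
# Page Status Table and the Hash Table is redirected to an erased page.
page_status = {0: "valid", 1: "erased"}   # simplified Flash Page Status Table
hash_table  = {"fileA": 0}                # FlashCache Hash Table: file -> page

def out_of_place_write(file_id: str, free_page: int) -> None:
    old = hash_table[file_id]
    page_status[old] = "invalid"     # wasted space until garbage collection
    page_status[free_page] = "valid" # new data lands in the erased page
    hash_table[file_id] = free_page  # lookups now find the new copy

out_of_place_write("fileA", 1)
```

The invalid page at index 0 persists as wasted capacity until a garbage collection reclaims its block, which is exactly the pressure the read/write split is designed to contain.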
Figure 4 shows an example that highlights the benefits
of splitting the Flash-based disk cache into a read and write
cache. The left side shows the behavior of a unified Flash-based disk cache and the right side shows the behavior of
splitting the Flash-based disk cache into a read and write
cache. Figure 4 assumes we have five pages per block and
five total blocks in a Flash-based disk cache. Garbage collection proceeds by reading all valid data from blocks containing invalid pages, erasing those blocks and then sequentially
re-writing the valid data. In this example, when the Flash-based disk cache is split into a read and write cache, only two
blocks are candidates for garbage collection. This dramatically reduces Flash reads, writes, and erases compared to a
unified Flash-based disk cache that considers all five Flash
blocks. Our studies also show that the overall disk cache
miss rate is reduced substantially for online transaction processing (OLTP) applications by splitting the Flash.
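Figure 4's point can be reproduced with a toy model: garbage collection must visit every block containing an invalid page, and the split confines invalid pages to the write region. The page layouts below are illustrative, not the exact contents of Figure 4:

```python
# Toy comparison of garbage-collection candidates: five blocks of five
# pages each, as in the Figure 4 example.
def gc_candidates(blocks):
    """Blocks with any invalid page must be read, erased, and rewritten."""
    return [b for b, pages in blocks.items() if "invalid" in pages]

# Unified cache: out-of-place writes scatter invalid pages across all blocks.
unified = {b: ["valid"] * 4 + ["invalid"] for b in range(5)}

# Split cache: read region (blocks 0-2) stays clean; only the write
# region (blocks 3-4) accumulates invalid pages.
split = {
    0: ["valid"] * 5, 1: ["valid"] * 5, 2: ["valid"] * 5,
    3: ["valid", "invalid", "valid", "invalid", "valid"],
    4: ["invalid", "invalid", "valid", "valid", "valid"],
}
print(len(gc_candidates(unified)), len(gc_candidates(split)))  # 5 vs 2
```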
3.2. Architecture of the Flash memory controller
Flash needs architectural support to improve reliability and
lifetime when used as a cache. Figure 6 shows a high-level
block diagram of a programmable Flash memory controller
that addresses this need. Requests from the operating system
APRIL 2009 | VOL. 52 | NO. 4 | COMMUNICATIONS OF THE ACM