Figure 6: Local flash drives versus hybrid drives in network-attached storage.
(Diagram labels, two arrangements: CPU + RAM, flash disk, traditional disk.)
and never were merged to form very
large runs on disk (shown as horizontal
boxes), and the available RAM is used
to merge a very large number of runs
exploiting the small page size optimal
for flash devices.
Third, Gray and Putzolu offered
further rules of thumb, such as the
10-byte rule for trading memory
and CPU power. These rules also
warrant revisiting for both costs and
energy. Compared with 1987, the
most fundamental change may be
that CPU power should be measured
not in instructions but in cache line
replacements. Trading off space and
time seems like a new problem in
an environment with multiple levels
in the memory hierarchy. A modern
memory hierarchy might be very deep:
multiple levels of CPU caches, main
memory (possibly in a NUMA design),
flash devices, and finally performance-optimized “enterprise” disks and
capacity-optimized “consumer” disks.
The lower levels may rely on various
software techniques with different
trade-offs between performance and
reliability, such as striping, mirroring,
single-redundancy RAID-5, dual-redundancy RAID-6, log-structured file
systems, and write-optimized B-trees.
Fourth, what are the best data
movement policies? One extreme is
a database administrator explicitly
moving entire files, tables, or indexes
between flash memory and traditional
disk. Another extreme is automatic
movement of individual pages,
controlled by a replacement policy
such as LRU. Intermediate policies may
focus on the roles of individual pages
within a database or on the current
query-processing activity. For example,
all catalog pages may be moved as a
unit after schema changes to facilitate
fast recompilation of all cached query
execution plans, and all relevant upper
B-tree levels may be prefetched and
cached in RAM or in flash memory
during execution of query plans relying
on index-to-index navigation. The
variety of possibilities may overwhelm
automatic policies and may require
hints or directives from applications or
database software.
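The fully automatic extreme, page-at-a-time movement under LRU, can be sketched in a few lines. This is a minimal illustration rather than a description of any particular database system: the class name `LRUBufferPool`, its `capacity` parameter, and the `evicted` list are hypothetical names introduced here, and the "fast tier" could stand for RAM above flash or flash above disk.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Hypothetical fast tier (RAM or flash) holding at most `capacity` pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> None, ordered oldest to newest use
        self.evicted = []           # page ids pushed down to the slower tier

    def access(self, page_id):
        """Touch a page; return True on a hit, False if it had to be fetched."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return True
        self.pages[page_id] = None           # fetch from the slower tier
        if len(self.pages) > self.capacity:
            victim, _ = self.pages.popitem(last=False)  # least recently used
            self.evicted.append(victim)
        return False


pool = LRUBufferPool(capacity=2)
for page in (1, 2, 1, 3):
    pool.access(page)
print(list(pool.pages), pool.evicted)
```

An intermediate policy of the kind described above would replace the single recency order with hints, for example by exempting catalog pages or upper B-tree levels from eviction.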
Fifth, what are the secondary and
tertiary effects of introducing flash
memory into the memory hierarchy of
a database server? For example, short
access times permit a lower multiprogramming level, because only
short I/O operations must be hidden
by asynchronous I/O and context
switching. A lower multiprogramming
level in turn may reduce contention for
memory in sort and hash operations,
locks (concurrency control for database
contents), and latches (concurrency
control for in-memory data structures).
Should this effect prove significant, the
effort and complexity of using a fine
granularity of locking may be reduced.
Page-level concurrency control may
also be sufficient simply as a result
of small page sizes. Similarly, in-page data structures may require
less optimization, although some
techniques may apply to small pages
(optimized for flash) within large pages
(optimized for disks)—for example,
clustering records versus clustering
fields.¹
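The link between access time and multiprogramming level can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a simple model in which each thread alternates one CPU burst with one I/O wait, so roughly 1 + (I/O time / CPU time) threads keep a processor busy. The function name and the latency figures are hypothetical and chosen only for illustration.

```python
def threads_to_hide_io(io_time_us, cpu_time_us):
    """Estimate threads needed per CPU so that I/O waits are fully overlapped.

    Assumes each thread alternates one CPU burst of cpu_time_us with one
    I/O wait of io_time_us (both in microseconds, as positive integers).
    """
    # Ceiling division without floating point: -(-a // b) == ceil(a / b).
    return 1 + -(-io_time_us // cpu_time_us)


# Hypothetical figures: 100 us of CPU work per I/O request.
print(threads_to_hide_io(8000, 100))  # disk-like 8 ms access
print(threads_to_hide_io(100, 100))   # flash-like 0.1 ms access
```

Under these assumed numbers the disk case calls for dozens of concurrent threads and the flash case for only a couple, which is the sense in which contention for memory, locks, and latches may fall.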
Sixth, will hardware architecture
considerations invalidate some of
the findings and conclusions of this
article? For example, disks are currently
separated from the main processors
(for example, in network-attached
storage or storage-area networks). Will
flash devices be placed with the main
processors? If so, is it still a good idea
to use flash devices as extended disk
rather than extended buffer pool?
Figure 6 shows two of these alternatives.
In the top arrangement, questions arise
about the scope and effectiveness of
centralized storage management, the
granularity of failures and replacement,
and so on, whereas many of these
questions have much more obvious
answers in the bottom arrangement.
Seventh, how will flash memory
affect in-memory database systems?
Will they become more scalable,
affordable, and popular based on
memory inexpensively extended with
flash memory rather than RAM? Will
they become less popular as a result of
very fast traditional database systems
using flash memory instead of (or in
addition to) disks? Can a traditional
code base using flash memory instead
of traditional disks compete with
a specialized in-memory database
system in terms of performance, total
cost of ownership, development and
maintenance costs, or time to market of
features and releases? What techniques
in the buffer pool are required to
achieve performance competitive with
in-memory databases? For example,
the upper levels of B-tree indexes
can be pinned in the buffer pool and
augmented with memory addresses
of all child pages (or their buffer
descriptors) also pinned in the buffer
pool, and auxiliary structures may
enable efficient interpolation search
instead of binary search.
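As a reminder of what interpolation search buys over binary search, here is a minimal sketch over a sorted list of integer keys, such as the separator keys within a pinned B-tree page. It is a textbook version, not the auxiliary structure the text alludes to, and the function name is an assumption made for this example.

```python
def interpolation_search(keys, target):
    """Return the index of target in the sorted integer list keys, or -1."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            # All remaining keys are equal; avoid dividing by zero.
            return lo if keys[lo] == target else -1
        # Guess the position by assuming keys are spread evenly in [lo, hi].
        pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


keys = [10, 20, 30, 40, 50]
print(interpolation_search(keys, 40), interpolation_search(keys, 35))
```

With evenly distributed keys the first probe often lands on or next to the target, whereas binary search always needs a logarithmic number of probes; with skewed keys the advantage can disappear.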
Finally, techniques similar to
generational garbage collection may
benefit storage hierarchies. Selective
reclamation²² applies not only to
unreachable in-memory objects but
also to buffer-pool pages and favored
locations on permanent storage. Such
research also may provide guidance
for log-structured file systems, wear
leveling for flash memory, and write-optimized B-trees on RAID storage.
Conclusion
The 20-year-old five-minute rule for
RAM and disks still holds, but for
ever-larger disk pages. Moreover, it
should be augmented by two new
five-minute rules: one for small pages moving between RAM and flash
memory and one for large pages moving between flash memory and traditional disks. For small pages moving
between RAM and disk, Gray and Putzolu were amazingly accurate in predicting a five-hour break-even point
two decades into the future.
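The break-even interval behind these rules follows the familiar form: (pages per MB of RAM / accesses per second per device) × (price per device / price per MB of RAM). The sketch below evaluates that expression. The function and parameter names are introduced here for illustration, and the input values are hypothetical placeholders, not the prices or device characteristics used in this article.

```python
def break_even_seconds(pages_per_mb_ram, accesses_per_sec_per_device,
                       price_per_device, price_per_mb_ram):
    """Break-even reference interval in seconds for keeping a page in RAM.

    A page referenced more often than once per this interval is cheaper to
    keep in memory than to re-read from the lower-level device.
    """
    return (pages_per_mb_ram / accesses_per_sec_per_device) * \
           (price_per_device / price_per_mb_ram)


# Hypothetical inputs: 4 KB pages (256 per MB), a device sustaining
# 100 accesses/s at $100, and RAM at $0.05 per MB.
interval = break_even_seconds(256, 100, 100.0, 0.05)
print(f"{interval:.0f} seconds, about {interval / 60:.0f} minutes")
```

The same function covers each pair of adjacent levels: substitute flash for the device to relate RAM and flash, or substitute the price of flash for the price of RAM to relate flash and disk. Larger pages reduce the pages-per-MB term, which is why the interval depends so strongly on page size.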
Research into flash memory and
its place in system architectures is
urgent and important. Within a few
years, flash memory will be used to
fill the gap between traditional RAM
and traditional disk drives in many
operating systems, file systems, and
database systems.
Flash memory can be used to extend