practice
The amounts of data processed by applications are
constantly growing. With this growth, scaling storage
becomes more challenging. Every database system
has its own trade-offs. Understanding them is crucial,
as it helps in selecting the right one from so many
available choices.
Every application is different in terms of read/write
workload balance, consistency requirements,
latencies, and access patterns. Familiarizing yourself
with database and storage internals facilitates
architectural decisions, helps explain why a system
behaves a certain way, helps troubleshoot problems
when they arise, and helps fine-tune the
database for your workload.
It is impossible to optimize a system in all directions. In an ideal world
there would be data structures guaranteeing the best read and write performance with no storage overhead but, of
course, in practice that is not possible.
This article takes a closer look at
two storage system design approaches
used in a majority of modern databases:
read-optimized B-trees [3] and write-optimized LSM (log-structured merge)-trees [8]. It describes their use cases
and trade-offs.
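The trade-off between read and write performance can be illustrated with a minimal sketch. The two classes below are hypothetical, deliberately simplified stand-ins rather than B-trees or LSM-trees: `SortedStore` keeps keys ordered so reads are cheap but writes must shift entries, while `AppendLog` makes writes cheap but pays for it with slower reads and extra space for stale versions. All names are assumptions made for illustration.

```python
import bisect


class SortedStore:
    """Read-optimized stand-in: keys stay sorted, lookups use binary search."""

    def __init__(self):
        self._keys = []
        self._values = []

    def put(self, key, value):
        # Locating the slot is O(log n), but a new key shifts later entries: O(n).
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value  # overwrite in place, no extra space used
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)  # O(log n) lookup
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def size(self):
        return len(self._keys)


class AppendLog:
    """Write-optimized stand-in: writes are appended, reads scan backward."""

    def __init__(self):
        self._records = []

    def put(self, key, value):
        # O(1) append; older versions of the key remain in the log.
        self._records.append((key, value))

    def get(self, key):
        # O(n) scan from the newest record so the latest write wins.
        for k, v in reversed(self._records):
            if k == key:
                return v
        return None

    def size(self):
        return len(self._records)


for store in (SortedStore(), AppendLog()):
    store.put("a", 1)
    store.put("b", 2)
    store.put("a", 3)
    print(type(store).__name__, store.get("a"), store.size())
```

Both stores return the latest value for `"a"`, but after the same three writes the sorted store holds two entries while the log holds three. Neither layout is best for reads, writes, and space at once.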
B-Trees
B-trees are a popular read-optimized
indexing data structure and general-
Algorithms Behind Modern Storage Systems
DOI: 10.1145/3209210
Article development led by queue.acm.org
Different uses for read-optimized B-trees and write-optimized LSM-trees.
BY ALEX PETROV