Increasingly in computing systems, when you write something
into durable storage, it needs reorganization later.
Personally, I’m pretty darned disorganized and I lose stuff a lot.
This causes extensive searching, sometimes to no avail.
It is, however, easier to “store” stuff by setting it down
wherever I feel like it.
In computing, there is an interesting trend where writing creates a need
to do more work. You need to reorganize, merge, reindex, and more to
make the stuff you wrote more useful. If you don’t, you must search or do
other work to support future reads.
Indexing within a database. My first programming job was to implement a
database system. In 1978, my colleague and I didn’t even know what that was!
We voraciously read every paper we could lay our hands on from ACM’s Special
Interest Group on Management of Data and ACM Transactions on Database Systems.
We learned about this interesting and confusing concept of a relational
database and how indexing can optimize access while being transparent to the
application. Of course, updating an index meant another two disk accesses,
since the indices of a B+ tree didn’t fit in memory. We understood that the
additional work to make database changes was worth it if you were ever going
to read the data later.
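The trade-off described above can be sketched in a few lines. This is a minimal illustration, not how any real database works: `ToyTable`, `insert`, `find`, and `run` are hypothetical names, plain dictionaries stand in for B+ trees, and simple counters stand in for disk accesses.

```python
from collections import defaultdict


class ToyTable:
    """A toy in-memory table with optional per-column indexes (hypothetical)."""

    def __init__(self, indexed_columns=()):
        self.rows = []
        # One dict per indexed column: value -> list of row positions.
        self.indexes = {col: defaultdict(list) for col in indexed_columns}
        self.write_ops = 0      # structures touched by inserts
        self.rows_examined = 0  # rows looked at by lookups

    def insert(self, row):
        position = len(self.rows)
        self.rows.append(row)
        self.write_ops += 1  # writing the row itself
        for col, index in self.indexes.items():
            index[row[col]].append(position)
            self.write_ops += 1  # one extra write per index maintained

    def find(self, col, value):
        if col in self.indexes:
            # Indexed read: jump straight to the matching rows.
            positions = self.indexes[col].get(value, [])
            self.rows_examined += len(positions)
            return [self.rows[p] for p in positions]
        # No index on this column: scan every row.
        self.rows_examined += len(self.rows)
        return [r for r in self.rows if r[col] == value]


def run(indexed_columns):
    """Insert 1,000 rows, do one lookup by id, and report the two counters."""
    table = ToyTable(indexed_columns)
    for i in range(1000):
        table.insert({"id": i, "city": f"city{i % 10}"})
    table.find("id", 42)
    return table.write_ops, table.rows_examined


if __name__ == "__main__":
    for cols in [(), ("id",), ("id", "city")]:
        writes, examined = run(cols)
        print(f"indexes={cols!r}: write ops={writes}, rows examined={examined}")
```

In this sketch, every added index raises the cost of each insert by one write, while a lookup on an indexed column examines only the matching rows instead of the whole table. An index on a column that is never queried (here `city`) adds write cost with no read benefit, which is the heart of the question of how much to index.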
The next perplexing question was:
How much should be indexed? Should
we index every column? When should
a pair of columns be indexed together?
The more indexing we did, the faster the
read queries would become. The more
Write Amplification vs. Read Perspiration
DOI: 10.1145/3359334
Article development led by queue.acm.org
The trade-offs between write and read.
BY PAT HELLAND