An artist’s depiction of graphene fragments, flat sheets of carbon attached to zinc atoms,
which may be used in the manufacture of molecular memories.
Institute (EBI), are focusing on DNA.
Others, including a research group at
the Massachusetts Institute of Technology (MIT), are examining molecular
storage methods.
Both approaches have begun to take
shape over the last few years—although
the feasibility of DNA storage was first
demonstrated in 1988. Over the next
decade, new approaches to data storage could transform the way organizations, and society, manage and store
huge volumes of data.
For perspective, all the data humans
produce in a year could fit into about
four grams of DNA. “There is an opportunity to create storage systems that
are a million to a billion times more
compact than existing technology and
provide a level of longevity that is unheard of today,” Church points out.
The DNA of Storage
The need for more efficient data
storage methods is rooted in today’s
radically changing world. According to IBM, humans collectively produce about 2.5 exabytes of data each
day; market research firm IDC says
roughly three zettabytes of data exist
in the digital world. Remarkably, 90%
of the data in the world has been created over the last two years alone, say
researchers at IBM. All this data requires increasingly large data centers
and storage networks. It also presents
challenges as storage devices and media change and data technologies become obsolete and prone to failure.
Researchers hope to significantly
alter the equation. Church and fellow researchers, including Sri Kosuri,
a senior scientist at the Wyss Institute, and Yuan Gao, an associate professor of biomedical engineering at
Johns Hopkins University, are forging into new territory with DNA storage research. They used sophisticated
sequencing techniques to encode
Church’s book in 96-bit blocks, each
containing a 19-bit address to assist
with the reassembly process.
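
The addressing idea can be illustrated with a minimal Python sketch. It is not the researchers' code: it assumes each block pairs a 19-bit address with 96 bits of payload (the text could also be read as counting the address inside the 96 bits), and the names `split_into_blocks` and `reassemble` are hypothetical.

```python
from typing import List

ADDRESS_BITS = 19  # address width given in the text
PAYLOAD_BITS = 96  # assumption: 96 data bits per block, address stored alongside


def to_bits(data: bytes) -> str:
    """Render bytes as a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)


def split_into_blocks(data: bytes) -> List[str]:
    """Cut data into fixed-size payloads, each prefixed with its block index."""
    bits = to_bits(data)
    remainder = len(bits) % PAYLOAD_BITS
    if remainder:
        bits += "0" * (PAYLOAD_BITS - remainder)  # zero-pad the final block
    count = len(bits) // PAYLOAD_BITS
    if count > 2 ** ADDRESS_BITS:
        raise ValueError("too many blocks for a 19-bit address")
    blocks = []
    for index in range(count):
        address = f"{index:0{ADDRESS_BITS}b}"
        payload = bits[index * PAYLOAD_BITS:(index + 1) * PAYLOAD_BITS]
        blocks.append(address + payload)
    return blocks


def reassemble(blocks: List[str], n_bytes: int) -> bytes:
    """Sort blocks by address, strip the addresses, and drop the padding."""
    ordered = sorted(blocks, key=lambda block: int(block[:ADDRESS_BITS], 2))
    bits = "".join(block[ADDRESS_BITS:] for block in ordered)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))
```

Because every block carries its own position, the blocks can be read back in any order and still be sorted into the original text, which is the role the address plays in reassembly.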
The book’s binary code was translated into a code based
on the four constituents of DNA: adenine (A),
guanine (G), cytosine (C), and
thymine (T), with each base standing in for
a binary digit. The non-living DNA contained
54,898 data blocks, each stored on an
individual strand of DNA. The team
then sent the data to Agilent Technologies,
which used a 3D printer to attach
the data to the DNA strands and
build a physical storage device. Then
the team accurately decoded the text
and read it back. Remarkably, a billion
copies of the book easily fit into
the moisture on the bottom of a glass
or small tube.
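
How binary digits might map onto the four bases can be sketched in a few lines of Python. The mapping is an assumption made for illustration, since the article does not spell it out: one bit per base, with A or C standing for 0 and G or T standing for 1, and the alternate base chosen whenever the preferred one would repeat its neighbor. The function names are hypothetical.

```python
ZERO_BASES = ("A", "C")  # assumed: either base encodes a 0
ONE_BASES = ("G", "T")   # assumed: either base encodes a 1


def bits_to_dna(bits: str) -> str:
    """Encode a string of '0'/'1' characters as a base sequence, one bit per base."""
    strand = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary digit: {bit!r}")
        preferred, alternate = ONE_BASES if bit == "1" else ZERO_BASES
        # Use the alternate base when the preferred one would repeat the previous base.
        repeats = bool(strand) and strand[-1] == preferred
        strand.append(alternate if repeats else preferred)
    return "".join(strand)


def dna_to_bits(strand: str) -> str:
    """Decode a base sequence back into bits: A/C -> 0, G/T -> 1."""
    bits = []
    for base in strand:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)
```

Under these assumptions, `bits_to_dna("0011010000")` yields `ACGTAGACAC`, and `dna_to_bits` recovers the original bits. Giving each bit two candidate bases spends half the theoretical capacity, but it lets the encoder avoid long runs of a single base.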