Designing for Digital Archives
Elizabeth Churchill
Yahoo! Research | churchill@acm.org
with Jeff Ubois
Fujitsu Labs of America | jeff@ubois.com
David Gartner
Have you amassed a collection
of photos and other media without quite knowing how to manage it? Have you spent hours
trying to locate a precious or
extremely important file? Have
you ever wished you’d backed
up your files after a computer
crash?
More and more of our work
and personal content is digital.
And mobile, digital technologies
like camera phones are changing the nature of capture and
collection—what and how we
collect. We are living in a world
of continuous accumulation.
This is relatively new. Ten
years ago fewer people had
home computers, fewer services
existed, and we weren’t surrounded by all those appealing, shiny devices that promise
to record our every action in
case we want to take a step
down memory lane or revisit
an article written a while back
to snaffle some useful content.
Back then terms like “moblogging,”
“lifelogging,” “microblogging,”
and “lifestreaming” were
not in common parlance.
Ironically, this ease of capture
and replication actually makes it
more likely that we’ll lose stuff.
The sheer volume of data we are
able to collect makes organization daunting and specific content difficult to locate. Frankly,
the logically extreme vision of
life as constant accumulation
offered by Gordon Bell and his
collaborator Jim Gemmell, with
their MyLifeBits project, is apt
to make anyone with old-time
curatorial sensibilities erupt in
hives.
Amplifying the challenge is
the fact that content tends to
accumulate in various places—
on internal or external flash
and other portable drives; on
recording devices themselves
(cameras, audio recorders,
phones); and hosted at ISPs and
by services like YouTube and
Flickr. Few people have a centralized repository of all their
stuff. We curate, consolidate,
and/or back up randomly or not
at all, and have muddled mental
models regarding file formats,
backup, and archive practices and services. Prospective
retrospective—that is, imagining now what we will want to
remember in the future—is
hard; we have a limited ability
to gauge such future value. So
we have a propensity to defer
decisions about whether something is worth keeping or not.
Consequently, most of us are
what Microsoft’s Cathy Marshall
and her collaborators have
called “lazy preservationists,”
who rely on “opportunism, optimism, and benign neglect.” And
most of us are living in a world
of digital bloat, our untamed
and insecure data strewn all
over the place. We skip along on
a wing and a prayer, explaining away catastrophes and
rethinking data importance in
the face of loss: “I guess it must
not have been important if I
lost it.” Sometimes this kind of
loss and revision is therapeutic.
Sometimes it is not. Sometimes
we spend hours reconstructing content or creating passable replacements. For our own
archives this is personally troubling, but as a culture it is positively terrifying that our data
and our memories are at risk.
Some see this problem as a
commercial opportunity. GYMA
(Google, Yahoo, Microsoft, AOL)
are exploring the business of
archiving, backup, and storage services; others, like
Seagate’s Mirra Personal Server,
Apple’s .Mac account, and EMC’s
Mozy, promise storage and a
“data cloud” where our stuff will
be safe … forever. Or until we
fail to pay the subscription fee.
Or until they have business or
technical problems. Or, as happened to one of our own
interactions columnists, some malicious miscreant masquerades
as you and, with a click of a button
or two, deletes all your precious
material. Under most terms of
service agreements, users have
no recourse and companies
have no obligation to restore the
“lost” material even if backups
exist.
We need to develop a finer