forms with users whose applications
require greater performance, and the
tip associated the most powerful computational platforms with the users
requiring the greatest performance
for “hero” applications. The same approach can be used to create a Data
Pyramid (see Figure 2) to frame today’s
digital information and stewardship
options.
The Data Pyramid outlines the
spectrum of data-collection and data-stewardship alternatives. The bottom
includes data of individual (“local”)
value whose stewards focus primarily
on individual needs (such as personal
tax records and digital family photographs and videos). We back this up
on our hard drives, with an additional
copy off-site if we are methodical, but
little of this data will ever be considered of great societal value.
At the top is data of widespread and/or
societal value whose stewards are
primarily public-interest institutions
(such as government agencies, libraries, museums, archives, and universities). Included are official records,
data infeasible or too expensive to replace (such as the Shoah Collection of
Holocaust survivor testimony, college.usc.edu/vhi/,
and digital photographs
from the most recent NASA space voyage). Much of it must be preserved over
the long term by trusted institutions. It
is typically replicated many times, the
focus of explicit plans for preservation, and hosted by only the most reliable cyberinfrastructure.
In the middle of the Pyramid is data
of value to a specific community whose
stewards range from individuals to
community groups to companies to
public-interest institutions. It includes
digital records from your local hospital, scientific research data preserved
in community repositories, and digital
copies of motion pictures preserved for
decades, commercially valuable in the
future in the form of, say, the “director’s
cut.” In every sector, groups are
beginning to grapple with the responsibility of creating plans for data stewardship that are cost-effective, support
reliable digital preservation, and are
not subject to the whims of markets
and/or community social dynamics.
The Data Pyramid makes it easy to
see that multiple solutions for sustainable digital preservation must be
Digital Data Terms and Conditions
The following definitions are derived from a number of sources, including
the American Library Association (www.lita.org/ala/), National Information
Assurance Glossary (www.cnss.gov/), and Joint Information Systems Committee
Digital Information briefing paper (www.jisc.ac.uk):
Appraisal
Evaluation and selection of digital material for long-term curation and
preservation. Documented policies, guidance, and legal requirements may
require that it be done securely;
Authentication
Security measure designed to establish the validity of a transmission, message,
or originator or a means of verifying an individual’s authority to receive specific
categories of information;
Curation
Digital curation, broadly interpreted, is about maintaining and adding value to
a trusted body of digital information for current and future use. It builds on the
underlying concepts of digital preservation while emphasizing opportunities
for added value and knowledge through annotation and continuing resource
management;
Digital Rights Management
The use of technologies to control how digital content is used and reused;
Ingest
Controlled or secure transfer of material to an archive, repository, data center,
or other custodial environment in adherence to documented guidance, policies,
or legal requirements;
Integrity
The condition when data is unchanged from its source and has not been
accidentally or maliciously modified, altered, or destroyed;
Metadata
Documentation relating to data content, structure, provenance (history), and
context (such as experimental parameters and environmental conditions).
Standards for metadata provide a basis for widespread community data sharing;
and
Preservation Action
Actions undertaken to ensure the long-term viability and availability of the
authoritative nature of digital material. Preservation actions should ensure
the material remains authentic, reliable, and usable while its integrity is
maintained; such actions include validation, assigning preservation metadata,
assigning representation information, and ensuring acceptable data structures
and file formats.
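
The integrity definition above can be made concrete with a fixity check, a common way to detect whether stored data has changed. What follows is only a minimal sketch, not a method described in this article: it assumes a SHA-256 digest is recorded at ingest, and the function names and sample byte strings are hypothetical. A digest comparison detects accidental change; detecting malicious change also requires that the recorded digest itself be stored where it cannot be altered.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare data against the digest recorded when it was ingested."""
    return sha256_digest(data) == recorded_digest


# At ingest: record a digest alongside the material (hypothetical sample data).
original = b"sample archival record, version 1"
recorded = sha256_digest(original)

# Later validation: an identical copy passes, an altered copy fails.
intact_copy = b"sample archival record, version 1"
altered_copy = b"sample archival record, version 2"
print(is_unchanged(intact_copy, recorded))
print(is_unchanged(altered_copy, recorded))
```

In practice a repository would run such a check periodically against every replica, so that a damaged copy can be replaced from an intact one.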