financial complexity to the pursuit of scientific efforts that are frequently underfunded.
10. Establish methods for verification and performance testing. A critical requirement is the ability to determine compliance. Not having compliance testing significantly weakens the archival value by undermining the reliability and integrity of the image data. Performance testing using prototypical test cases assists the design process by flagging proposed community image designs that will have severe performance problems. Defining baseline test cases will quickly identify software problems in the API.
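The compliance checking described above can be sketched in a few lines. This is a minimal illustration that uses a plain Python dict as a stand-in for an HDF5 group/attribute tree (a real checker would walk the file with h5py or the HDF5 C library); the group and attribute names are hypothetical, not part of any published design.

```python
# Minimal sketch of a compliance check for a community image design.
# A nested dict stands in for an HDF5 file's group/attribute hierarchy;
# the required attribute names below are illustrative assumptions.

REQUIRED_ATTRS = {"dimensions", "element_type", "units"}

def check_compliance(tree, community):
    """Return a list of problems; an empty list means the file complies."""
    problems = []
    group = tree.get(community)
    if group is None:
        return [f"missing root-level community group '{community}'"]
    for name, image in group.items():
        missing = REQUIRED_ATTRS - set(image.get("attrs", {}))
        for attr in sorted(missing):
            problems.append(f"{community}/{name}: missing attribute '{attr}'")
    return problems

# Baseline test case: one compliant image, one with missing attributes.
sample = {
    "em_community": {
        "tomogram_01": {"attrs": {"dimensions": (512, 512, 200),
                                  "element_type": "float32",
                                  "units": "nm"}},
        "tomogram_02": {"attrs": {"dimensions": (256, 256, 100)}},
    }
}

print(check_compliance(sample, "em_community"))
```

Baseline cases like `sample` double as regression tests: running them against each revision of the design quickly surfaces both non-compliant files and software problems in the API itself.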
11. Establish ongoing administrative support. Formal design processes can take considerable time to complete, but some needs, such as technical support, consultation, publishing technical documentation, and managing the registration of community image designs, require immediate attention. Establishing a mechanism for imaging communities to register their HDF5 root-level groups as community-specific data domains will provide an essential cornerstone for image design and avoid namespace collisions with other imaging communities.
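A registration mechanism of this kind can be sketched as a simple registry that maps root-level group names to the communities that own them and rejects collisions. The class, group names, and community names below are illustrative assumptions, not an existing service.

```python
# Sketch of a registry for community-specific HDF5 root-level group names.
# Registering each community's root group centrally prevents two imaging
# communities from claiming the same namespace in a shared file layout.

class CommunityRegistry:
    def __init__(self):
        self._owners = {}  # root group name -> owning community

    def register(self, root_group, community):
        """Claim a root-level group name; raise on a namespace collision."""
        owner = self._owners.get(root_group)
        if owner is not None and owner != community:
            raise ValueError(f"'{root_group}' already registered to {owner}")
        self._owners[root_group] = community

    def owner(self, root_group):
        return self._owners.get(root_group)

registry = CommunityRegistry()
registry.register("cryo_em", "Electron Microscopy Data Bank")
registry.register("light_microscopy", "Open Microscopy Environment")
print(registry.owner("cryo_em"))  # Electron Microscopy Data Bank
```

In practice the registry would be a published table maintained by the administrative body, but the invariant is the same: one root-level name, one community.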
12. Examine how other formal standards have evolved. Employ the successful strategies and avoid the pitfalls. Developing strategies and alliances with these standards groups will further strengthen the design and adoption of a scientific image standard.
13. Establishing the correct forum is crucial and will require the guidance of a professional standards organization (or organizations) that perceives the development of such an image standard as part of its mission to serve the public and its membership. Broad consensus and commitment by the scientific, governmental, business, and professional communities is the best, and perhaps only, way to accomplish this.
Out of necessity, bioscientists are independently assessing and implementing HDF5, but no overarching group is responsible for establishing a comprehensive bio-imaging format, and there are few best practices to rely on. Thus, there is a real possibility that biologists will continue with incompatible methods for solving similar problems, such as not having a common image format.
The failure to establish a scalable n-dimensional scientific image standard that is efficient, interoperable, and archival will result in a less-than-optimal research environment and a less certain future for image repositories. The strategic danger of not having a comprehensive scientific image storage framework is the massive generation of unsustainable bio-images. The long-term risks and costs of comfortable inaction will therefore likely be enormous.
The challenge for the biosciences
is to establish a world-class imaging
specification that will endow these
indispensable and nonreproducible
observations with long-term maintenance and high-performance computational access. The issue is not whether the biosciences will adopt HDF5 as
a useful imaging framework—that is
already happening—but whether it is
time to gather the many separate pieces of the currently highly fragmented
patchwork of biological image formats and place them under HDF5 as a
common framework. This is the time
to unify the imagery of biology, and we
encourage readers to contact the authors with their views.
This work was funded by the National Center for Research Resources (P41-RR-02250), National Institute of General Medical Sciences (5R01GM079429), Department of Energy (ER64212-1027708-0011962), National Science Foundation (DBI-0610407, CCF-0621463), National Institutes of Health (1R13RR023192-01A1, R03EB008516), The HDF Group R&D Fund, Center for Computation and Technology at Louisiana State University, Louisiana Information Technology Initiative, and NSF/EPSCoR (EPS-0701491, CyberTools).
Matthew T. Dougherty is at the National Center for Macromolecular Imaging, specializing in cryo-electron microscopy, visualization, and animation.
Michael J. Folk is president of The HDF Group.
Erez Zadok is associate professor at Stony Brook University, specializing in computer storage systems performance and design.
Herbert J. Bernstein is professor of computer science at Dowling College, active in the development of IUCr standards.
Frances C. Bernstein is retired from Brookhaven National Laboratory after 24 years at the Protein Data Bank, active in macromolecular data representation and validation.
Kevin W. Eliceiri is director at the Laboratory for Optical and Computational Instrumentation, University of Wisconsin-Madison, active in the development of tools for bio-image informatics.
Werner Benger is visualization research scientist at Louisiana State University, specializing in astrophysics and computational fluid dynamics.
Christoph Best is project leader at the European Bioinformatics Institute, specializing in electron microscopy image informatics.