quirements, projecting that in order to
achieve acceptable performance with
a 1TB data set we would need a GB/sec
sequential read speed from the disks,
translating to about 20 servers at the
time. Jim was a firm believer in using
“bricks,” or the cheapest, simplest
building blocks money could buy. We
started experimenting with low-level
disk IO on our inexpensive Dell servers, and our disks were soon much quieter and performing more efficiently.
Astronomy and the SkyServer
Toward the end of 2000 data started
arriving from the SDSS telescope in
Apache Point, NM (see Figure 1), and
Jim said simply, “Let’s get to work.” So
during Christmas and the New Year’s
holiday we converted the whole object-oriented database schema to a Microsoft SQL Server-based schema. We
modified many of our loading scripts
by looking at Tom Barclay’s TerraServer code and soon, with Jim’s guidance,
had a simple SQL Server version of the
SDSS database. 24
The SDSS project was at first reluctant to even consider switching technologies, so for about a year the SQL
Server database we had designed was a
“cowboy” implementation, not part of
the official SDSS data release. Coincidentally, Intel gave us a pool of servers
for experimenting with the database,
along with a show-and-tell meeting in
San Francisco a few months after the
first bits of data started to come in
from the telescope. We decided to create a simple graphical interface on top
of the database, similar to the one on
the TerraServer, to enable astronomers
and anyone else to visually browse the
sky. My son, Tamas (13 at the time),
came along for the Intel meeting and
helped man the booth, telling us, “No
self-respecting schoolkid would use
such an interface,” and that it had to be
much more visually stimulating and
interactive.
Jim gave one of his characteristically big laughs; we then looked at one
another and realized we had our target
audience. Even if astronomers were
not ready, we would design a database
and integrated Web site for schoolchildren. This was the moment we set
out to build the SkyServer to connect
the database to the pixels in the sky.
The name was an obvious play on TerraServer, and we pitched it to the SDSS
project as a tool for education and outreach, as well as for serious scientific
investigation. When the first batch of
SDSS data was officially declared public in 2001, the SkyServer, then running
on computers donated by Compaq,
appeared side by side with the official
database for astronomers. We wrote
simple scripts to create false-color
images from the raw astronomy data
and adopted the TerraServer scheme
to build an image pyramid consisting
of successive sets of tiles at different
magnifications.
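To make the image pyramid idea concrete, here is a minimal sketch in Python. It is not the SkyServer or TerraServer code: the function names, the factor-of-two reduction between levels, the 2x2 pixel averaging, and the tile size are all assumptions chosen for illustration, and the image is a plain list of lists of numbers.

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block of pixels.

    Assumption for this sketch: odd trailing rows/columns are dropped.
    """
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
            for c in range(w)
        ]
        for r in range(h)
    ]


def cut_tiles(image, tile_size):
    """Split an image into tiles keyed by (tile_row, tile_col)."""
    tiles = {}
    for r in range(0, len(image), tile_size):
        for c in range(0, len(image[0]), tile_size):
            tiles[(r // tile_size, c // tile_size)] = [
                row[c:c + tile_size] for row in image[r:r + tile_size]
            ]
    return tiles


def build_pyramid(image, tile_size):
    """Return one tile set per level; level 0 is full resolution."""
    levels = []
    while True:
        levels.append(cut_tiles(image, tile_size))
        h, w = len(image), len(image[0])
        # Stop once the image fits in one tile or cannot be halved again.
        if max(h, w) <= tile_size or min(h, w) < 2:
            break
        image = downsample(image)
    return levels
```

Each level holds the same patch of sky at half the resolution of the level below it, so a browser can fetch only the handful of tiles needed for the current zoom level rather than the full image.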
By the next year (2002), everyone
realized that the SkyServer engine was
much more robust and scalable than
expected. Ani Thakar, a research scientist at Johns Hopkins, made a superhuman effort to convert the whole
existing framework to SQL Server. 22
Jim insisted on “two-phase loading,”
that is, we would load each new batch
of data into its own separate little database, then run data-cleaning code
and accept the data only if it passed all
the tests. This foresight turned out to
be enormously useful; once the data
started coming through the hose, we
could recover from errors (there were
lots of them) much more easily. We
soon had the framework and the ability to load hundreds of GB of data in a
reasonable amount of time, marking
the transition of the SkyServer team
from “cowboys” to “ranchers.”
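The two-phase loading pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SDSS loader: it uses SQLite from the standard library as a stand-in for SQL Server, and the table layout, column names, and the two validation checks are hypothetical.

```python
import sqlite3

# Hypothetical schema used only for this sketch.
SCHEMA = "CREATE TABLE objects (obj_id INTEGER, ra_deg REAL, dec_deg REAL)"


def check_ids_present(conn):
    """Every row must carry an object identifier."""
    sql = "SELECT COUNT(*) FROM objects WHERE obj_id IS NULL"
    return conn.execute(sql).fetchone()[0] == 0


def check_coordinates_in_range(conn):
    """Right ascension must lie in [0, 360), declination in [-90, 90]."""
    sql = ("SELECT COUNT(*) FROM objects WHERE ra_deg < 0 OR ra_deg >= 360 "
           "OR dec_deg < -90 OR dec_deg > 90")
    return conn.execute(sql).fetchone()[0] == 0


CHECKS = [check_ids_present, check_coordinates_in_range]


def load_batch(main, rows):
    """Load rows into main only if they pass every check.

    Returns the names of the failed checks (empty list on success).
    """
    # Phase 1: load the batch into its own small, throwaway database.
    staging = sqlite3.connect(":memory:")
    try:
        staging.execute(SCHEMA)
        staging.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
        failed = [check.__name__ for check in CHECKS if not check(staging)]
        if failed:
            return failed  # main database is left untouched
        # Phase 2: the batch passed every test, so publish it.
        clean = staging.execute(
            "SELECT obj_id, ra_deg, dec_deg FROM objects").fetchall()
        with main:  # commits on success, rolls back on error
            main.executemany("INSERT INTO objects VALUES (?, ?, ?)", clean)
        return []
    finally:
        staging.close()
```

Because a rejected batch never reaches the main database, recovering from a bad load amounts to discarding the staging copy and trying again with corrected data.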
Curtis Wong, manager of the Microsoft Next Media Research Group, then
redesigned the SkyServer’s interface.
His seemingly minor modifications of
our style sheets had a huge effect on the
entire site’s look and feel; it suddenly
came alive. Many volunteers, including
former Johns Hopkins student Steve
Landy and physics teacher Rob Sparks,
helped add content. Jordan Raddick, a
science writer, created a new section
of the Web site, with educational exercises and formal class materials for all
students, from kindergarten to high
school. Professional astronomers also
appreciated the power of the visual
tools, and the site quickly became popular, even in this community.
The next major step came with the
emergence of Microsoft’s .NET Web
services. Jim invited our development
team (at Johns Hopkins) to San Francisco to the VSLive Conference (Janu-