Society | DOI: 10.1145/1831407.1831415
Should Code Be Released?
Software code can provide important insights into
the results of research, but it’s up to individual scientists
whether their code is released—and many opt not to.
On any given day, medical researchers at Carnegie Mellon University (CMU) may be investigating new ways to thwart the development
of epilepsy or designing an implantable biosensor to improve the early
detection of diseases such as cancer
and diabetes. As with any disciplined
pursuit of science, such work is subject
to rigorous rounds of peer review, in
which documents revealing methodology, results, and other key details are
examined.
But, assuming software was created
for the research, should a complete
disclosure of the computer code be
included in the review process? This
is a debate that doesn’t arrive with
any ready answers—not on the campus grounds of CMU or many other
institutions. Scott A. Hissam, a senior
member of the technical staff at CMU’s
Software Engineering Institute, sees
validity in both sides of the argument.
“From one perspective, revealing the
code is the way it should be in a perfect
world, especially if the project is taking
public money,” says Hissam, who, as
a coauthor of Perspectives on Free and
Open Source Software, has explored the
topic. “But, in practice, there are questions. The academic community earns
needed credentialing by producing
original publications. Do you give up
the software code immediately? Or do
you wait until you’ve had a sufficient
number of publications? If so, who determines what a sufficient number is?”
Another dynamic that adds complexity to the discussion is that scientific researchers are not software developers. They often write their own code,
but generally don’t follow the same
practices, procedures, and standards
as professional software programmers.
“Researchers who are trying to cure
cancer or study tectonic plates will
write software code to do a specific task
in a lab,” Hissam says. “They aren’t
concerned about the same things that
computer programmers are, such as
scalability and design patterns and
software architecture. So imagine how
daunting of a task it would be to review
and try to understand how such a code
was written.”
U.K.’s Climategate
This issue has gained considerable attention ever since Climategate, which
involved the illegal hacking of researchers’ email accounts last year at
the Climate Research Unit at the University of East Anglia, one of the world’s
leading institutions on global climate
change. More than 1,000 email messages and 2,000 documents were
hacked, and source code was released.
Global warming contrarians have contended the email reveals that scientists
manipulated data, among other charges. Climate Research Unit scientists
have denied these allegations and independent reviews conducted by both
the university and the House of Commons’ Science and Technology Select
Committee have cleared the scientists
of any wrongdoing.
By Dennis McCafferty
Still, Darrel Ince, professor of computing at the U.K.’s Open University, cited the Climate Research Unit’s work as
part of his argument that code should
be revealed. He wrote in the Manchester
Guardian that the university’s climate-research team depended on code that
has been described as undocumented,
baroque, and lacking in data needed
to pass information from one program
and research team to another.
Ince noted that Les Hatton, a professor at the Universities of Kent and
Kingston, has conducted an analysis of
several million lines of scientific code
and found that the software possessed
a high level of detectable inconsistencies. For instance, Hatton found that
interface inconsistencies between
software modules that pass data from
one part of a program to another happen, on average, at the rate of one in
every seven interfaces in Fortran and
one in every 37 interfaces in C.
“This is hugely worrying when you
realize that one error—just one—will
usually invalidate a computer program,” Ince wrote. Those posting
comments on the Guardian Web site
have been largely supportive of his arguments. “The quality of academic
software code should absolutely be
scrutinized and called out whenever
needed,” wrote one commenter. “It
should be the de facto criteria for accepting papers,” wrote another.
Still, not all were in agreement. “I
work in scientific software,” wrote one
commenter. “The sort of good pro-