Enormous datasets and the widespread use of machine learning are relatively new additions to mainstream science, and it is simply a matter of time before more stringent methodologies emerge, they say.
“We are in the midst of a reformation. The research community is identifying challenges to reproducibility and
implementing a variety of solutions to
improve. It is an exciting time, not a
worrying one,” Nosek argues.
Zhang also says there’s no reason
to push the panic button; scientific
methods are messy, difficult, and iterative. “We need to embrace changes.
We need to be more selective and careful about avoiding mistakes that lead
to irreproducible results and invalid
conclusions. Right now, this crisis
represents enormous opportunities
for statisticians, data scientists, computer scientists, and others to develop
a more robust framework for research.”
Adds Ioannidis, “I’m optimistic that
we will find ways to solve the problem
of irreproducibility. We will learn how
to use today’s tools more effectively,
and come up with better methodologies. But it’s something we must confront and address.”
Ioannidis, J.P.A.
Why Most Published Research Findings Are False, PLOS Medicine, Aug. 30, 2005.

Berk, R., Brown, L., Buja, A., Zhang, K., and Zhao, L.
Valid Post-Selection Inference, The Annals of Statistics, 2013, Vol. 41, No. 2, pp. 802–837.

Aschwanden, C.
Science Isn’t Broken: It’s Just a Hell of a Lot Harder Than We Give It Credit For, FiveThirtyEight, Aug. 19, 2015.

Halsey, L.G., Curran-Everett, D., Vowler, S.L., and Drummond, G.B.
The Fickle P Value Generates Irreproducible Results, Nature Methods, March 2015, Vol. 12, No. 3, p. 179.
Samuel Greengard is an author and journalist based in
West Linn, OR, USA.
© 2019 ACM 0001-0782/19/9 $15.00
˲ Were studies blinded?
˲ Were all results shown?
˲ Were experiments repeated?
˲ Were positive and negative controls included?
˲ Were reagents validated?
˲ Were the statistical tests appropriate?
By boosting due diligence upfront, Begley argues, it is possible to ensure a much higher level of veracity and validity in research results. The same techniques also apply to analytics and machine learning in business and industry, where users often lack the scientific grounding to ensure the methods they use are sound.
In the scientific community, greater scrutiny can also take the form of
more vigorous peer reviews and greater oversight from journals. In some
cases, researchers are publishing results that haven’t been reviewed at all;
they essentially are rubber-stamping
their own work. This has contributed
to an increased number of retractions
and corrections in journals. The
Journal of Medical Ethics, for example,
documented a 10-fold increase in retractions of scientific papers in the
PubMed database between 2000 and 2010.
More rigorous statistical methodologies, as well as better use of machine
learning, are also critical. As a result,
researchers are studying ways to improve analysis. For example, instead
of conducting exploratory data analysis on an entire data set, researchers
might use data splitting—essentially,
separating a training dataset and test
dataset and keeping the test dataset hidden until the end, when the exploratory results are ready to be confirmed.
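The data-splitting idea can be sketched in a few lines of Python. This is a minimal illustration, not a specific tool the researchers mention; the function name and the use of plain integers as stand-in records are assumptions for the example:

```python
import random

def split_dataset(data, test_fraction=0.2, seed=0):
    """Split records into a training set and a held-out test set.

    Exploratory analysis uses only the training portion; the test
    portion stays hidden until the end, when it is examined once
    to confirm the findings.
    """
    rng = random.Random(seed)
    shuffled = data[:]                 # copy so the input is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

records = list(range(100))             # stand-in for real observations
train, test = split_dataset(records)
assert len(train) == 80 and len(test) == 20
assert not set(train) & set(test)      # no leakage between the splits
```

The key discipline is procedural rather than technical: the test portion must not influence any exploratory choices made on the training portion.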
Another approach involves taking an
original training dataset and randomizing it in a way that mimics future datasets by adding random noise repeatedly. If researchers can aggregate all
the results and the discovery remains
stable (meaning it appears across many
different randomized datasets), then
it’s likely to be reproducible.
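That stability check can be sketched as follows. Here the "discovery" is the sign of a correlation and the perturbation is Gaussian noise; both choices, along with the function names and thresholds, are illustrative assumptions rather than the researchers' actual procedure:

```python
import random
import statistics

def correlation_sign(xs, ys):
    """Return +1, -1, or 0 for the sign of the sample covariance."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (cov > 0) - (cov < 0)

def discovery_is_stable(xs, ys, n_rounds=200, noise_sd=0.5,
                        seed=0, threshold=0.9):
    """Re-randomize the data many times by adding Gaussian noise,
    keeping the discovery only if it recurs in most perturbed datasets."""
    rng = random.Random(seed)
    original = correlation_sign(xs, ys)
    hits = 0
    for _ in range(n_rounds):
        noisy_xs = [x + rng.gauss(0, noise_sd) for x in xs]
        noisy_ys = [y + rng.gauss(0, noise_sd) for y in ys]
        hits += (correlation_sign(noisy_xs, noisy_ys) == original)
    return hits / n_rounds >= threshold

# A strong linear relationship should survive the perturbations.
xs = list(range(50))
ys = [2 * x + 1 for x in xs]
assert discovery_is_stable(xs, ys)
```

A finding driven by noise in one particular sample would flip sign across the randomized copies and fail the threshold, which is the intuition behind the aggregation step described above.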
Although the inability to reproduce
scientific results has grown in recent
years, observers say most researchers strive for accurate findings and
the problem is largely solvable.
TO EXPAND ACCESS
TO THE INTERNET
the 1980s was a
bubble, and it
was the era of
University of California, Santa
Barbara (UCSB). “I liked the
cutting-edge aspect of
technology, which brought me
to computer science,” she adds.
Belding earned her
master’s degree and doctorate
in electrical and computer
engineering from UCSB. On
completion of her Ph.D.,
she joined the faculty of the
computer science department
at UCSB, where she has been
working ever since.
The focus of Belding’s research is on mobile and wireless networking, including network performance analysis, and information and communication technologies for development (ICTD).
“I have always been in mobile
wireless networking,” Belding
explains. “I started in protocol
development, and then got into
network performance analysis.”
She recalls that about a
decade ago, “I wanted to do
something with social impact,
and applied my wireless
networking expertise to bring
Internet access to more people
and communities worldwide.”
Belding adds that her work is now largely concentrated on Native American groups within the U.S.
Some of Belding’s
interests have moved beyond
networking. All of her ICTD
work falls under the category
of computing for social good,
or computer science that has
a high social impact. Other
projects on which she is
currently collaborating include analyzing hate speech and gender-based violence online and in social media.
“I like what I am doing,
and will continue to work on
socially impactful projects,”