from that instrumentation, such as using cryoEM (cryo-electron microscopy)
to generate structural data in biology,
that lets us now look at molecules that
up until now have been very difficult
to look at.” Recently, he adds, “there’s
been a tremendous infusion of technology in biology,” making it possible, for example, to interrogate a tissue
and determine the types of cells in the
tissue and their spatial organization.
Many big challenges still exist, such
as learning how individual cells work
together in the tumor microenvironment and how they contribute to the
overall aggressiveness of cancer and its
ability to resist therapies, Kibbe adds.
The opportunity to work with the
DOE meant exposure to a tremendous
amount of computational expertise and
thinking about problems in deep learn-
ing and natural language processing
(NLP), as well as being able to do very
detailed simulations, he says. Taking
the available cancer data and using it to
build mechanistically informed models
and predictive models will enable re-
searchers to better understand, as they
perturb a particular cell, how that per-
turbation is going to impact the tissue
and the biological system. It will also
tell researchers whether they can “do a
better job providing patients with opti-
mal therapies based on the modeling.”
For the NCI/DOE collaboration, the
goal is not to understand individual
cells and tissues, but to determine whether
researchers can glean from a huge population
how patients respond when they are
given a particular therapy. “That’s a
data aggregation problem and a natu-
ral language processing problem,”
Kibbe says. “The DOE has a lot of ex-
pertise in looking not only at energy
grids, but thinking about integrating
data from a number of different sourc-
es and technologies, and building up
simulations and models.”
One pilot by Argonne National Lab-
oratory focuses on deep learning and
building predictive models for drug
treatment response using different cell
lines and patient-derived xenografts
(tissue grafts from a donor of a different
species than the recipient). “We’re trying
to build models where we can predict
whether tumors we haven’t screened
will respond to a drug,” explains Rick
Stevens, associate laboratory director
for computing, environment, and life
sciences research at Argonne, who is
spearheading the deep learning pilot.
This is the underlying concept of preci-
sion medicine.
Tumor cells have thousands of dif-
ferent types of molecules and tens of
thousands of genes that change all the
time, so there are fundamental points
that researchers don’t understand, Ste-
vens explains. Building a model based
on principles of what is happening
in cancer cells is incomplete; if a re-
searcher tried to make predictions of
how a cancer cell will respond without
taking into consideration the proper-
ties of the treatments, it wouldn’t be as
effective. That’s where the team hopes
deep learning applied to drug combi-
nation therapies will be useful.
A second pilot, at Lawrence Liver-
more National Laboratory, is aimed
at understanding the predictive paths
in the Ras cancer gene, mutations of
which are responsible for about 30%
of all cancers, Stevens says. Work
there is also focused on the oncogene
which, when mutated, becomes a
driver of cancer. “It’s one of
the core targets we’re trying to under-
stand [as well as] how to drug it,” says
Stevens. “It’s stuck in the ‘on’ posi-
tion; it’s like a switch and it tells your
cells when to divide.”
A third pilot, under way at Oak
Ridge National Laboratory, is mining
data from millions of patient records
in search of large-scale patterns to op-
timize drug treatments. The pilot is
working with the Surveillance, Epidemiology
and End Results (SEER) Registries,
which NCI has used since 1974
to assess the incidence and outcomes
for cancer patients across the country
and which cover roughly 30% of the U.S.
population, says Stevens. However, the
challenge is that because SEER was built
over 40 years ago, it “has seen a lot of
technologies, and the hope is we can
transform the SEER Registries into
something that has very different char-
acteristics” using NLP and deep learn-
ing features.
This is where the partnership with
DOE will be especially valuable, says
Kibbe, because the department has a
lot of expertise working with sensor
networks and data aggregation,
interrogation, and analysis.
The common thread among all
three pilots is that each has a deep
learning component, Stevens
says. To fund the initiatives, he and his
co-investigators received $5 million in
fall 2016 from the Exascale Comput-
ing Project (ECP) to build a deep neu-
ral network code called the CANcer
Distributed Learning Environment
(CANDLE). This year, Argonne, Lawrence
Livermore, and Oak Ridge will each
deploy their highest-performing available
supercomputers, and the
teams will use these systems to start
evaluating existing open source software
from various vendors and testing
machine learning capabilities. That
way, Stevens notes, they won’t have to
reinvent the wheel.
“We’ll add what we need on top of
the frameworks and make it possible
to use the large-scale hardware we have
and feed it back into the open source
community,” Stevens says. “A wonderful
feature of the artificial intelligence com-
munity is that it’s very open. You have
collaborations that span companies that
are competing with each other,” includ-
ing Microsoft, Google, and Facebook.
The teams working on the three pi-
lots will “run big benchmark problems
on the DOE hardware,” and will have
the first code release that can serve all
three pilots and eventually other appli-
cation areas in the summer, he says.
One of the problems, in Stevens’ case,
is a classification problem, in which tu-
mor expression data, known as SNP (sin-
gle nucleotide polymorphisms) data,
is used to try to determine what type of
cancer is being studied from the SNPs
alone. “That hasn’t been done before;
it’s related, but not the same to classifi-