News
Science | DOI: 10.1145/1592761.1592768
Gary Anthes
Deep Data Dives Discover Natural Laws
Computer scientists have found a way to bootstrap science, using evolutionary
computation to find fundamental meaning in massive amounts of raw data.
Mining scientific data for patterns and relationships has been a common practice for decades, and the use of
self-mutating genetic algorithms is
nothing new, either. But now a pair of
computer scientists at Cornell University have pushed these techniques into
an entirely new realm, one that could
fundamentally transform the methods
of science at the frontiers of research.
Photograph by Jonathan Hiller, Cornell University
Writing in a recent issue of the
journal Science, Hod Lipson and Michael Schmidt describe how they
programmed a computer to take unstructured and imperfect lab measurements from swinging pendulums and
mechanical oscillators and, with just
the slightest initial direction and no
knowledge of physics, mechanics, or
geometry, derive equations representing fundamental laws of nature.
Conventional machine learning systems usually aim to generate predictive
models that might, for example, calculate the future position of a pendulum
given its current position. However, the
equations unearthed by Lipson and
Schmidt represented basic invariant
relationships—such as the conservation of energy—of the kind that govern
the behavior of the universe.
Cornell University’s Michael Schmidt, left, and Hod Lipson with one of the double pendulums used in their experiments.
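To make the difference between a predictive model and an invariant concrete, here is a minimal Python sketch. It is not taken from the Cornell work: it simulates an idealized single pendulum, and the rod length, gravity value, starting angle, and function names are all illustrative assumptions. The angle swings back and forth, but the total energy computed from the same numbers barely moves.

```python
import math

G, LENGTH = 9.81, 1.0  # assumed values: gravity (m/s^2) and rod length (m)

def simulate(theta0, steps=10000, dt=0.001):
    """Velocity-Verlet integration of an ideal, frictionless pendulum.

    Returns a list of (angle, angular velocity) pairs, one per time step.
    """
    theta, omega = theta0, 0.0
    accel = -(G / LENGTH) * math.sin(theta)
    states = []
    for _ in range(steps):
        theta += omega * dt + 0.5 * accel * dt * dt
        new_accel = -(G / LENGTH) * math.sin(theta)
        omega += 0.5 * (accel + new_accel) * dt
        accel = new_accel
        states.append((theta, omega))
    return states

def energy(theta, omega):
    """Total energy per unit mass: kinetic plus gravitational potential."""
    return 0.5 * (LENGTH * omega) ** 2 - G * LENGTH * math.cos(theta)

states = simulate(theta0=1.0)
angles = [theta for theta, _ in states]
energies = [energy(theta, omega) for theta, omega in states]

angle_spread = max(angles) - min(angles)
energy_spread = max(energies) - min(energies)
print(f"angle varies by {angle_spread:.3f} rad, energy by {energy_spread:.2e} J/kg")
```

A predictive model would try to reproduce the list of angles. An invariant such as the `energy` function summarizes the whole motion in a single quantity that stays put.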
The technique may come just in
time as scientists are increasingly
confronted with floods of data from
the Internet, sensors, particle accelerators, and the like in quantities that
defy conventional attempts at analysis.
“The technology to collect all that data
has far, far surpassed the technology
to analyze it and understand it,” says
Schmidt, a doctoral candidate and
member of the Cornell Computational
Synthesis Lab. “This is the first time a
computer has been used to go directly
from data to a free-form law.”
The Lipson/Schmidt work features
two key advancements. The first is
their search for invariants, or “conservations,” rather than for predictive models. “All laws of nature are essentially
laws of conservation and symmetry,”
says Lipson, a professor of mechanical
engineering. “So looking for invariants
is fundamental.”
Given crude initial conditions and
some indication of what variables to
consider, the genetic program churned
through a large number of possible
equations, keeping and building on
the most promising ones at each iteration and eliminating the others. The
project’s second key advance was finding a way to identify the large number
of trivial equations that, while true and
invariant, are coincidental and not directly related to the behavior of the system being studied.
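The keep-and-build-on loop described above can be sketched in a few dozen lines of Python. This is a toy under stated assumptions, not the Cornell program: the system is a hypothetical unit-mass spring with stiffness `K`, the rates of change are taken from the known equations of motion rather than estimated from noisy measurements, and every name here (`random_tree`, `mutate`, `score`, `evolve`) is invented for illustration. The score is also a simplified stand-in: it asks whether a candidate expression stays constant along the motion, and gives the worst possible mark to expressions that do not depend on the variables at all, which is one crude way to screen out trivial candidates.

```python
import math
import random

K = 4.0  # assumed spring stiffness; state is (x, v) with dx/dt = v, dv/dt = -K*x
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
LEAVES = ["x", "v", 1.0, 2.0, K]

def random_tree(rng, depth=3):
    """Build a random expression with at most `depth` nested operators."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x, v):
    """Evaluate an expression tree at the state (x, v)."""
    if tree == "x":
        return x
    if tree == "v":
        return v
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x, v), evaluate(right, x, v))

def mutate(tree, rng, depth=3):
    """Replace one randomly chosen subtree, keeping the overall depth bounded."""
    if depth == 0 or not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth - 1), right)
    return (op, left, mutate(right, rng, depth - 1))

def score(tree, samples, h=1e-4):
    """0 means the expression is constant along the motion; 1 is the worst.

    Expressions with a (numerically) zero gradient are treated as trivial.
    """
    total = 0.0
    for x, v in samples:
        fx = (evaluate(tree, x + h, v) - evaluate(tree, x - h, v)) / (2 * h)
        fv = (evaluate(tree, x, v + h) - evaluate(tree, x, v - h)) / (2 * h)
        grad = math.hypot(fx, fv)
        if grad < 1e-9:
            total += 1.0
            continue
        flow_x, flow_v = v, -K * x  # rates of change of x and v
        total += abs(fx * flow_x + fv * flow_v) / (grad * math.hypot(flow_x, flow_v))
    return total / len(samples)

def evolve(samples, rng, pop_size=60, generations=40):
    """Keep the best quarter each generation and refill with mutated copies."""
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: score(t, samples))
        history.append(score(population[0], samples))
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    population.sort(key=lambda t: score(t, samples))
    return population[0], history

rng = random.Random(0)
# Exact trajectory of the assumed spring: x = cos(2t), v = -2 sin(2t)
samples = [(math.cos(2 * t), -2 * math.sin(2 * t)) for t in (0.1 * i for i in range(1, 31))]
best, history = evolve(samples, rng)
print(best, history[-1])
```

Because the best candidates always survive into the next generation, the best score can only improve or hold steady from one iteration to the next. Whether a run this small actually lands on the energy-like expression `v*v + K*x*x` depends on the random seed and the search budget.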
Lipson and Schmidt found that
trivial equations could be weeded out
by looking at ratios of rates of change
in the variables under consideration.
The program was written to favor those