News
Science | DOI: 10.1145/1592761.1592768
Gary Anthes
Deep Data Dives Discover Natural Laws
Computer scientists have found a way to bootstrap science, using evolutionary
computation to find fundamental meaning in massive amounts of raw data.

Mining scientific data for patterns and relationships has been a common practice for decades, and the use of self-mutating genetic algorithms is nothing new, either. But now a pair of computer scientists at Cornell University have pushed these techniques into an entirely new realm, one that could fundamentally transform the methods of science at the frontiers of research.

Photograph by Jonathan Hiller, Cornell University

Writing in a recent issue of the journal Science, Hod Lipson and Michael Schmidt describe how they programmed a computer to take unstructured and imperfect lab measurements from swinging pendulums and mechanical oscillators and, with just the slightest initial direction and no knowledge of physics, mechanics, or geometry, derive equations representing fundamental laws of nature.

 

Conventional machine learning systems usually aim to generate predictive models that might, for example, calculate the future position of a pendulum given its current position. However, the equations unearthed by Lipson and Schmidt represented basic invariant relationships, such as the conservation of energy, of the kind that govern the behavior of the universe.

Cornell University’s Michael Schmidt, left, and Hod Lipson with one of the double pendulums used in their experiments.

The technique may come just in time as scientists are increasingly confronted with floods of data from the Internet, sensors, particle accelerators, and the like in quantities that defy conventional attempts at analysis. “The technology to collect all that data has far, far surpassed the technology to analyze it and understand it,” says Schmidt, a doctoral candidate and member of the Cornell Computational Synthesis Lab. “This is the first time a computer has been used to go directly from data to a free-form law.”

The Lipson/Schmidt work features two key advancements. The first is their search for invariants, or “conservations,” rather than for predictive models. “All laws of nature are essentially laws of conservation and symmetry,” says Lipson, a professor of mechanical engineering. “So looking for invariants is fundamental.”

Given crude initial conditions and some indication of what variables to consider, the genetic program churned through a large number of possible equations, keeping and building on the most promising ones at each iteration and eliminating the others. The project’s second key advance was finding a way to identify the large number of trivial equations that, while true and invariant, are coincidental and not directly related to the behavior of the system being studied.
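The article includes no code, so what follows is only an illustrative sketch of the selection idea described above, not Lipson and Schmidt's program. It assumes simulated rather than measured data from a single pendulum, a fixed hand-written list of candidate expressions in place of an evolving population, and a made-up score, `relative_spread`, that favors expressions whose value barely changes along the trajectory. Every function name and constant here is hypothetical.

```python
import math

G_OVER_L = 9.81  # assumed ratio g/l for a 1 m pendulum (illustrative value)


def simulate_pendulum(theta0, omega0, dt=0.001, steps=5000):
    """Integrate theta'' = -(g/l) * sin(theta) with velocity Verlet.

    Returns a list of (theta, omega) samples standing in for lab measurements.
    """
    theta, omega = theta0, omega0
    samples = []
    for _ in range(steps):
        samples.append((theta, omega))
        accel = -G_OVER_L * math.sin(theta)
        omega_half = omega + 0.5 * dt * accel
        theta = theta + dt * omega_half
        accel_new = -G_OVER_L * math.sin(theta)
        omega = omega_half + 0.5 * dt * accel_new
    return samples


def relative_spread(candidate, samples):
    """Standard deviation of the candidate's value, scaled by its mean magnitude.

    A value near zero means the expression is (nearly) invariant on this data.
    """
    values = [candidate(theta, omega) for theta, omega in samples]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / (abs(mean) + 1e-12)


# Hand-written candidates; a genetic program would generate and mutate these.
CANDIDATES = {
    "omega^2 + theta^2": lambda th, om: om ** 2 + th ** 2,
    "0.5*omega^2 - (g/l)*cos(theta)": lambda th, om: 0.5 * om ** 2 - G_OVER_L * math.cos(th),
    "sin(theta)^2 + cos(theta)^2": lambda th, om: math.sin(th) ** 2 + math.cos(th) ** 2,
}


def rank_candidates(samples):
    """Return (score, name) pairs sorted from most to least invariant."""
    return sorted((relative_spread(f, samples), name) for name, f in CANDIDATES.items())


if __name__ == "__main__":
    data = simulate_pendulum(theta0=1.0, omega0=0.0)
    for score, name in rank_candidates(data):
        print(f"{score:.3e}  {name}")
```

Under these assumptions the energy-like expression scores close to zero, but so does the identity sin² + cos² = 1, which stays constant for any data at all. That is the kind of trivial invariant the second advance has to filter out: a score based on constancy alone cannot tell the two apart.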

Lipson and Schmidt found that trivial equations could be weeded out by looking at ratios of rates of change in the variables under consideration. The program was written to favor those
