Photograph by Remy Musser
Software Helps Linguists Reconstruct, Decipher Ancient Languages
Linguists who once spent an entire career reconstructing a major
language family now can accomplish that in just a few hours.
While previously it might have
taken a linguist their entire career to
reconstruct a major language family,
now software can complete a large
experiment, one that may involve a
sixth of the world’s languages, in just
a few hours.
The achievement is not about speed,
cautions Dan Klein, associate professor
of computer science at the University
of California, Berkeley. It’s about being
able to do things in a large-scale, data-driven manner without losing all the
important insights that historical linguists have gained in working on these
sorts of problems for decades, he says.
Indeed, linguistic researchers compare these techniques to those used for
gene sequence evolution.
“This achievement should not be
compared to, for example, Deep Blue,
IBM’s chess-playing computer,” Klein
insists. “This is not a human-versus-machine story in which humans used
to be better until finally a computer was
able to take the crown. This is a story of
computation giving human linguists
new tools that supplement their weaknesses and let them work in new ways.”
The efforts of Klein and his colleagues are described in their paper,
“Automated Reconstruction of Ancient
Languages Using Probabilistic Models
of Sound Change,” published by the National Academy of Sciences in February.
According to Klein, the work’s main
contribution was a new tool that researchers can use to automatically reconstruct the vocabularies of ancient
languages using only their modern-language descendants.
The goal, he says, is not to just rewind the clock; rather, it is to better
understand the processes that give rise
to language change, and to model how
the evolution of language proceeds.
“And so we want to know things like
what kind of sound changes are more
likely and what kind of sound changes
go together,” he explains.
To test the system, the team applied
it to 637 languages currently spoken
in Asia and the Pacific, and recreated
the early language from which they all
descended. The result? More than 85%
of the system’s reconstructions were
within one character of the manual
reconstruction provided by a linguist
who specialized in Austronesian languages. Of course, the differences are not necessarily errors.
How does the system work?
The way we produce words differs from the way our ancestors pronounced those same words. As time
goes by, minute, ongoing alterations
help turn an ancestral language like
Latin into modern descendants like
French, Italian, and Portuguese.
The sound changes are almost
always regular, with similar words
Society | DOI: 10.1145/2507771.2507778 Paul Hyman
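One way to read the article's "within one character" comparison is as an edit distance between the automatic and the manual reconstruction of each word. The sketch below assumes a plain Levenshtein distance with unit costs; the article does not say which measure the team used. The function names and the word pairs are invented placeholders, not data from the study.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def share_within_one(pairs):
    """Fraction of (automatic, manual) pairs at most one edit apart."""
    close = sum(1 for auto, manual in pairs if edit_distance(auto, manual) <= 1)
    return close / len(pairs)


# Invented placeholder strings (assumption), not reconstructions from the paper.
pairs = [("kato", "kato"), ("pelu", "pely"), ("sina", "sna"), ("taro", "moru")]
print(share_within_one(pairs))  # 0.75
```

Under this reading, a pair counts toward the 85% figure when at most one character has to be inserted, deleted, or substituted. As the article notes, a remaining difference is not necessarily an error on the system's side.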