language processing, points out that
some observers were dismissive about
Deep Blue’s victory, suggesting that
the system’s capability was due largely
to brute-force reasoning rather than
machine learning. The same criticism,
she says, cannot be leveled at Watson
because the overall system needed to
determine how to assess and integrate
diverse responses.
“Watson incorporates machine
learning in several crucial stages of its
processing pipeline,” Lee says. “For
example, reinforcement learning was
used to enable Watson to engage in
strategic game play, and the key problem
of determining how confident to
be in an answer was approached using
machine-learning techniques, too.”
Lee says that while there has been
substantial research on the particular
problems the “Jeopardy!” challenge
involved for Watson, that prior work
should not diminish the team’s
accomplishment in advancing the state
of the art to Watson’s championship
performance. “The contest really
showcased real-time, broad-domain
question-answering, and provided as
comparison points two extremely formidable
contestants,” she says. “Watson
represents an absolutely extraordinary
achievement.”
Lee suggests that with language-processing
technologies now maturing,
Watson being the most recent example,
the field appears to have passed through
an important early stage. It now faces
an unprecedented opportunity to help
sift through the massive amounts
of user-generated content online, such
as opinion-oriented information in
product reviews or political analysis.
While natural-language processing
is already used, with varying degrees
of success, in search engines and
other applications, it might be some
time before Watson’s unique question-answering capabilities will help
sift through online reviews and other
user-generated content. Even so, that
day might not be too far off, as IBM
has already begun work with Nuance
Communications to commercialize
the technology for medical applications. The idea is for Watson to assist
physicians and nurses in finding information buried in medical tomes, prior
cases, and the latest science journals.
The first commercial offerings from
the collaboration are expected to be
available within two years.

“Natural language understanding remains a tremendously difficult challenge, and while Watson demonstrated a powerful approach, we have only scratched the surface,” says David Ferrucci.

Beyond medicine, likely application
areas for Watson’s technology would
be in law, education, or the financial
industry. Of course, as with any technology, glitches and inconsistencies
will have to be worked out for each new
domain. Glitches notwithstanding,
technology analysts say that Watson-like technologies will have a significant
impact on computing in particular and
human life in general. Ferrucci, for his
part, says these new technologies likely
will mean a demand for higher-density
hardware and for tools to help developers understand and debug machine-learning systems more effectively.
Ferrucci also says it’s likely that user
expectations will be raised, leading to
systems that do a better job at interacting in natural language and sifting
through unstructured content.
To this end, explains Ferrucci, the
DeepQA team is moving away from attempting
to squeeze ever-diminishing
performance improvements out of
Watson in terms of parsers and local
components. Instead, they are focusing
on how to use context and information
to evaluate competing interpretations
more effectively. “What we learned is
that, for this approach to extend beyond
one domain, you need to implement a
positive feedback loop of extracting basic
syntax and local semantics from language,
learning from context, and then
interacting with users and a broader
community to acquire knowledge that
is otherwise difficult to extract,” he
says. “The system must be able to bootstrap
and learn from its own failing
with the help of this loop.”
In an ideal future, says Ferrucci, Watson
will operate much like the ship computer
on “Star Trek,” where the input
can be expressed in human terms and
the output is accurate and understandable.
Of course, the “Star Trek” ship computer
was largely humorless and devoid
of personality, responding to queries
and commands with a consistently even
tone. If the “Jeopardy!” challenge serves
as a small glimpse of things to come for
Watson (in particular, Watson’s precise
wagers, which produced laughter
in the audience, and Watson’s visualization
component, which appeared to express
the state of a contemplative mind
through moving lines and colors), the
DeepQA team’s focus on active learning
might also include a personality loop so
Watson can accommodate subtle emotional
cues and engage in dialogue with
the kind of good humor reminiscent of
the most personable artificial intelligences
in fiction.
Based in Los Angeles, Kirk L. Kroeker is a freelance
editor and writer specializing in science and technology.
© 2011 ACM 0001-0782/11/07 $10.00