[Figure: Web QA system architecture, illustrated with the question “Who is the CEO of IBM?” Stages shown: question classification (question type: WHO IS; answer type: PERSON); query modulation into Google queries (“CEO of IBM” is; became “CEO of IBM”; “CEO of IBM”); querying Google; answer matching over text from the returned Web pages (for example, “Samuel Palmisano recently became the CEO of IBM”), producing candidate answers (Samuel Palmisano; recently; Samuel; Sam Palmisano; Sam; She; became); and triangulation, yielding the ranked answers: 1. Samuel Palmisano; 2. Sam Palmisano; 3. Sam.]
ers simply cannot enjoy the quantity
of information available on the Web,
since they are unable to glance through
the pages of snippets that are returned
by search engines. The best available
reader software and refreshable Braille
screens do not provide enough bandwidth for real-time interaction.
Although Google and Microsoft have
announced that they’ve added QA features to their engines, these capabilities
are limited, as we found in the simple
experiment we report here and reconfirmed at the time of publication. Since
many practitioners are familiar with the
concept of online QA, we review only
the recent advances in automated open-domain (Web) QA and the challenges
faced by QA. We contrast the most noticeable (in terms of academic research
interest and media attention) systems
available on the Web and compare their
performance, as a “team,” against two
leading search portals: Google.com and
MSN.com.
Technology Foundation
For the past decade, the driving force
behind many QA advances has been
the annual competition-like Text Retrieval Conference (TREC). 8 The participating
systems must identify precise
answers to factual questions (such as
“who,” “when,” and “where”), list questions (such as “What countries produce
avocados?”), and definitions (such as
“What is bulimia?”).
The following distinctions separate
QA from a fixed corpus (also called
“closed domain,” as in TREC competitions) and QA from the entire Web (
typically referred to as “open corpus” or
open-domain QA):
Existence of simpler variants. The
Web typically involves many possible
ways for answers to begin, allowing QA
fact-seeking systems to look for the lowest-hanging fruit, or simplest statements of facts, making the task easier at
times;
Expectation of context. Users of Web-based fact-seeking engines do not necessarily need answers extracted precisely.
In fact, we’ve personally observed from
our interaction with practitioners (
recruited from among our MBA students)
that they prefer answers in context to
help verify that they are not spurious;
and
Speed. Web-based fact-seeking engines must be quick, whereas the TREC competition does not impose real-time
constraints. This emphasizes simple,
computationally efficient algorithms
and implementations (such as simple
pattern matching vs. “deep” linguistic
analysis).
A typical Web QA system architecture
is illustrated by the NSIR system (see the
Figure here), 5 one of the earliest Web QA
systems (1999–2005) developed at the
University of Michigan, and the more
recent Arizona State University Question Answering system (ASU QA). 6 When
given a natural-language question
(such as “Who is the largest producer
of software?”), the system recognizes a
certain grammatical category (such as
“what is,” “who is,” and “where was”),
as well as the semantic category of the
expected answer (“organization” in this
example). NSIR uses machine-learning
techniques and a trainable classifier to
look for specific words in the questions
(such as “when” and “where”), as well as
parts of speech (POS) of the other words
supplied by the well-known Brill POS
tagger. 2 For example, in the question
“What ocean did Titanic sink in?,” the
tagger identifies “ocean” as a noun and
“sink” as a verb. The trained classifier
classifies the expected answer type as
“location.”
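The classification step can be illustrated with a minimal sketch. It is not NSIR's actual classifier: the tiny POS lexicon, the training questions, and the count-based scoring rule below are all hypothetical stand-ins for a real tagger and a real trained model. It only shows how question words and POS-tagged content words might serve as features for predicting the expected answer type.

```python
from collections import Counter, defaultdict

# Hypothetical mini-lexicon standing in for the output of a real POS tagger.
TOY_POS = {
    "ocean": "NOUN", "sea": "NOUN", "president": "NOUN", "war": "NOUN",
    "sink": "VERB", "born": "VERB", "wrote": "VERB", "end": "VERB",
}

# Hypothetical training questions labeled with an expected answer type.
TRAIN = [
    ("What sea did the Bismarck sink in?", "location"),
    ("Where was Mozart born?", "location"),
    ("Who is the president of France?", "person"),
    ("Who wrote Hamlet?", "person"),
    ("When did the war end?", "date"),
    ("When was Mozart born?", "date"),
]


def features(question):
    """Return the question word plus POS-tagged words found in the toy lexicon."""
    words = question.lower().rstrip("?").split()
    feats = ["qword=" + words[0]]
    for word in words[1:]:
        pos = TOY_POS.get(word)
        if pos is not None:
            feats.append(pos + "=" + word)
    return feats


def train(examples):
    """Count how often each feature occurs with each answer type."""
    counts = defaultdict(Counter)
    for question, label in examples:
        counts[label].update(features(question))
    return counts


def classify(counts, question):
    """Pick the answer type whose training features overlap most with the question."""
    feats = features(question)
    return max(counts, key=lambda label: sum(counts[label][f] for f in feats))


MODEL = train(TRAIN)
```

With this toy data, `classify(MODEL, "What ocean did Titanic sink in?")` returns `"location"`, because the question shares the features `qword=what` and `VERB=sink` with a training question of that type. A real system would replace the overlap count with a properly trained classifier.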
ASU QA matches the question to one
of the trained regular expressions. For
example, the question “What ocean did
Titanic sink in?” matches “What <C>
did <T> <V>,” where <C> is any word
that becomes the expected semantic
category (“ocean”), <T> is the word or
phrase that becomes the question target
(“Titanic”), and <V> is the verb phrase
(“sink in”). While NSIR and ASU QA use
only a few grammatical and semantic
categories, some other (non-Web) systems involve more fine-tuned taxonomies. For example, Falcon, 7 one of the
most successful TREC systems, is based
on a pre-built hierarchy of dozens of semantic types of expected answers, subdividing the category “person” further
into “musician,” “politician,” “writer,”
“athlete,” and more.
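The pattern “What <C> did <T> <V>” described above can be sketched as a single hand-written regular expression. This is only an illustration under stated assumptions: ASU QA's patterns are trained rather than written by hand, and the names `PATTERN` and `parse_question`, as well as the heuristic that the verb phrase is the last one or two words, are hypothetical.

```python
import re

# Hand-written approximation of the pattern "What <C> did <T> <V>".
# Assumption: <C> is a single word and <V> is the last one or two words.
PATTERN = re.compile(
    r"^What\s+(?P<category>\w+)\s+did\s+"
    r"(?P<target>.+?)\s+"
    r"(?P<verb>\w+(?:\s+\w+)?)\??$",
    re.IGNORECASE,
)


def parse_question(question):
    """Return the category, target, and verb phrase, or None if the pattern fails."""
    match = PATTERN.match(question.strip())
    if match is None:
        return None
    return {
        "category": match.group("category").lower(),
        "target": match.group("target"),
        "verb": match.group("verb"),
    }
```

For the question “What ocean did Titanic sink in?”, `parse_question` returns the category “ocean”, the target “Titanic”, and the verb phrase “sink in”. The last-two-words heuristic is crude and will split multiword targets incorrectly in some questions, which is one reason a real system would rely on a larger set of trained patterns.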
Web QA systems generally do not
crawl or index the Web themselves.
They typically use the “meta engine”
approach: send one or more queries to
commercial engines providing application programming interfaces (APIs) specifically designed for this purpose. The
query-modulation step in the Figure
creates requests for the search engine