Natural language understanding problem. In this article, we focus on the
following natural language understanding problem: Given an utterance x in a
context c, output the desired action y.
Figure 1 shows the setup for a question
answering application, in which case
x is a question, c is a knowledgebase,
and y is the answer. In a robotics application, x is a command, c represents
the robot’s environment, and y is the
desired sequence of actions to be carried out by the robot.29
To build such a system, assume that we are given a set of n examples {(xi, ci, yi)} for i = 1, …, n. We would like
to use these examples to train a model
that can generalize to new unseen
utterances and contexts.
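Concretely, each training example bundles an utterance x, a context c, and an action y. A minimal sketch in Python (the `Example` type and the toy knowledgebase are illustrative, not from the article):

```python
from typing import Any, NamedTuple

class Example(NamedTuple):
    """One training example: utterance x, context c, desired action y."""
    x: str   # natural language utterance
    c: Any   # context, e.g., a knowledgebase or a robot's environment
    y: Any   # desired action, e.g., an answer or a sequence of robot actions

# Question answering instance: the context is a toy knowledgebase
# and the desired action is simply the answer.
kb = {"primes": [2, 3, 5, 7, 11, 13]}
train = [
    Example(x="What is the largest prime less than 10?", c=kb, y=7),
]
```

A learner would fit a model to such triples and be evaluated on unseen utterances and contexts.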
Semantic parsing components. This
article focuses on a statistical semantic parsing approach to the above-mentioned problem, where the key is
to posit an intermediate logical form
z that connects x and y. Specifically, z
captures the semantics of the utterance x, and it also executes to the
action y (in the context of c). In our running example, z would be max(primes
∩ (−∞, 10)). Our semantic parsing
framework consists of the following
five components (see Figure 2):
1. Executor: computes the denotation (action) y = ⟦z⟧c given a logical form z and context c. This defines the semantic representation (logical forms along with their denotations).
2. Grammar: a set of rules G that
produces D(x, c), a set of candidate derivations of logical forms.
3. Model: specifies a distribution pθ(d | x, c) over derivations d, parameterized by θ.
4. Parser: searches for high-probability derivations d under the model pθ.
5. Learner: estimates the parameters θ (and possibly rules in G) given the training examples {(xi, ci, yi)}.
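To make component 1 concrete, here is a minimal executable sketch of an executor in Python. The encoding of logical forms as nested tuples and the operator names (`set`, `lt`, `intersect`, `max`) are illustrative choices, not the article's notation:

```python
# Executor: computes the denotation y = ⟦z⟧c of a logical form z in context c.
# Logical forms are nested tuples, e.g.:
#   ("max", ("intersect", ("set", "primes"), ("lt", 10)))
# which encodes max(primes ∩ (−∞, 10)).

def execute(z, c):
    op = z[0]
    if op == "set":        # look up a named set in the context
        return set(c[z[1]])
    if op == "lt":         # (−∞, n), restricted to the numbers in the context
        return {v for vs in c.values() for v in vs if v < z[1]}
    if op == "intersect":  # set intersection of two sub-denotations
        return execute(z[1], c) & execute(z[2], c)
    if op == "max":        # largest element of a sub-denotation
        return max(execute(z[1], c))
    raise ValueError(f"unknown operator: {op}")

kb = {"primes": [2, 3, 5, 7, 11, 13]}
z = ("max", ("intersect", ("set", "primes"), ("lt", 10)))
print(execute(z, kb))  # 7
```

Running it on z = max(primes ∩ (−∞, 10)) yields 7, the denotation of the logical form in this context.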
We now instantiate each of these
components for our running example: “What is the largest prime less than 10?”
Early rule-based systems such as Winograd’s SHRDLU could respond to requests such as “Find a block which is taller than the one you are holding and put it into the box.” However,
as the systems were based on handcrafted rules, it became increasingly difficult to generalize beyond the narrow
domains and handle the intricacies of general language.
Rise of machine learning. In the early
1990s, influenced by the successes of
statistical techniques in the neighboring speech recognition community, the
field of NLP underwent a statistical
revolution. Machine learning offered
a new paradigm: Collect examples of
the desired input–output behavior
and then fit a statistical model to these
examples. The simplicity of this paradigm coupled with the increase in data
and computation allowed machine
learning to prevail.
What fell out of favor was not only
rule-based methods, but also the
natural language understanding
problems. In the statistical NLP era,
much of the community’s attention
turned to tasks such as document classification, part-of-speech tagging, and syntactic parsing, which
fell short of full end-to-end understanding. Even question answering
systems relied less on understanding and more on a shallower analysis coupled with a large collection of
unstructured text documents,
typified by the TREC competitions.
Statistical semantic parsing. The spirit
of deep understanding was kept alive
by researchers in statistical semantic parsing.19, 24, 33, 36, 37 A variety of different
semantic representations and learning algorithms were employed, but all
of these approaches relied on having
a labeled dataset of natural language
utterances paired with annotated logical forms, for example:
Utterance: What is the largest prime
less than 10?
Logical form: max(primes ∩ (−∞, 10))
Weak supervision. Over the last few
years, two exciting developments have
really spurred interest in semantic parsing. The first is reducing the amount
of supervision from annotated logical
forms to answers:
Utterance: What is the largest prime less than 10?
Answer: 7
This form of supervision is much easier
to obtain via crowdsourcing. Although
the logical forms are not observed, they
are still modeled as latent variables,
which must be inferred from the
answer. This results in a more difficult learning problem, but Liang et al.
showed it is possible to solve it without annotated logical forms.
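The latent-variable setup can be sketched as follows: candidate logical forms for the utterance (here written as plain Python functions standing in for grammar derivations) are executed against the context, and the ones whose denotation matches the observed answer are the consistent candidates a learner would upweight. All names and candidates below are illustrative:

```python
# Weak supervision: logical forms are latent; only (utterance, answer) pairs
# are observed. Candidates that execute to the observed answer are treated
# as consistent and would receive more probability mass during learning.

kb = {"primes": [2, 3, 5, 7, 11, 13]}

candidates = {
    "max(primes ∩ (−∞, 10))": lambda c: max(p for p in c["primes"] if p < 10),
    "min(primes ∩ (−∞, 10))": lambda c: min(p for p in c["primes"] if p < 10),
    "max(primes)":            lambda c: max(c["primes"]),
}

answer = 7  # observed supervision for "What is the largest prime less than 10?"

consistent = [name for name, z in candidates.items() if z(kb) == answer]
print(consistent)  # ['max(primes ∩ (−∞, 10))']
```

Note that spurious logical forms can also execute to the correct answer by accident; distinguishing them from the intended one is part of what makes this learning problem harder.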
Scaling up. The second development
is the scaling up of semantic parsers
to more complex domains. Previous
semantic parsers had only been trained
on limited domains such as U.S. geography, but the creation of broad-coverage knowledgebases such as Freebase8
set the stage for a new generation of
semantic parsers for question answering. Initial systems required annotated logical forms,11 but soon, systems became trainable from answers.4, 5, 18
Semantic parsers have even been extended beyond fixed knowledgebases to semi-structured tables.26 With the
ability to learn semantic parsers from
question-answer pairs, it is easy to collect datasets via crowdsourcing. As a
result, semantic parsing datasets have
grown by an order of magnitude.
In addition, semantic parsers have
been applied to a number of applications outside question answering: robot navigation,2, 29 identifying objects in a scene,15, 23 converting natural language to regular expressions, and many others.
Outlook. Today, semantic parsing is a vibrant field, but it is still young and grappling with the
complexities of the natural language
understanding problem. Semantic
parsing poses three types of challenges:
• Linguistic: How should we represent the semantics of natural language and construct it compositionally from the natural language?
• Statistical: How can we learn semantic parsers from weak supervision and generalize well to new examples?
• Computational: How do we efficiently search over the combinatorially large space of possible logical forms?
In the remainder of this article, we present a general framework for semantic parsing, introducing the key components. The framework is pleasantly modular, with different choices of the components corresponding to existing semantic parsers in the literature.