The same issues also confront systems
based on deep-lookahead search.
While many planning algorithms
have strong theoretical properties,
such as soundness, they search over
action models that include their own
assumptions. Goal specifications are likewise incomplete [29]. If these unspoken assumptions are incorrect, then a formally
correct plan may still be disastrous.
Consider a planning algorithm
that has generated a sequence of ac-
tions for a remote, mobile robot. If the
plan is short with a moderate number
of actions, then the problem may be
inherently intelligible, and a human
could easily spot a problem. However,
larger search spaces could be cogni-
tively overwhelming. In these cases,
local explanations offer a helpful simplification technique, just as they did when explaining machine learning. The vocabulary issue is likewise crucial: how does one succinctly
and abstractly summarize a complete
search subtree? Depending on the
choice of explanatory foil, different
answers are appropriate.
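One concrete, if toy, instance of foil-sensitive explanation for search: in a minimax setting, answering "why action A rather than action B?" reduces to contrasting the backed-up values and principal variations of the two subtrees. The sketch below is purely illustrative; the tree, the action names, and the `explain_choice` helper are our own inventions, not an established API.

```python
# Toy sketch (illustrative only): contrastive "why A, not B?" explanations
# for minimax search over a small, hand-built game tree.

def minimax(tree, maximizing=True):
    """Return (value, principal_variation) for a tree given as nested dicts.
    Leaves are numbers; internal nodes map action names to subtrees."""
    if isinstance(tree, (int, float)):
        return tree, []
    best_action, best_value, best_pv = None, None, None
    for action, subtree in tree.items():
        value, pv = minimax(subtree, not maximizing)
        better = (best_value is None or
                  (value > best_value if maximizing else value < best_value))
        if better:
            best_action, best_value, best_pv = action, value, pv
    return best_value, [best_action] + best_pv

def explain_choice(tree, chosen, foil):
    """Contrast the chosen root action with a user-supplied foil by
    comparing the two subtrees (the opponent moves next, so minimize)."""
    v_chosen, pv_chosen = minimax(tree[chosen], maximizing=False)
    v_foil, pv_foil = minimax(tree[foil], maximizing=False)
    return (f"{chosen} backs up to {v_chosen} via {pv_chosen}; "
            f"{foil} only reaches {v_foil} because the opponent replies "
            f"{pv_foil[0]}.")

# A two-ply tree: our move, then the opponent's reply (opponent minimizes).
tree = {
    "A": {"x": 3, "y": 5},   # opponent picks x -> value 3
    "B": {"x": 8, "y": 1},   # opponent picks y -> value 1
}
value, pv = minimax(tree)
print(explain_choice(tree, "A", "B"))
```

Even this toy contrast illustrates the vocabulary problem: the explanation above summarizes each subtree by a single number and a single line of play, discarding everything else in the search.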
One line of work describes an algorithm for generating the minimal explanation that patches a user's partial understanding of a domain [37]. Work on mixed-initiative planning [7] has demonstrated the importance of supporting interactive dialog with a planning system.
Since many AI systems (for example, [35]) combine deep search and machine learning, additional challenges will result from the need to explain how these components interact. We envision supporting a range of follow-up and drill-down actions after presenting a user with an initial explanation:
• Redirecting the answer by changing
the foil. “Sure, but why didn’t you predict class C?”
• Asking for more detail (that is, a
more complex explanatory model),
perhaps while restricting the explanation to a subregion of feature space.
“I’m only concerned about women
over age 50 ...”
• Asking for a decision’s rationale. “What made you believe this?” To which the system might respond by displaying the labeled training examples that were most influential in reaching that decision, for example, ones identified by influence functions [19] or nearest-neighbor methods.
• Query the model’s sensitivity by
asking what minimal perturbation to
certain features would lead to a different output.
• Changing the vocabulary by adding (or removing) a feature in the explanatory model, either from a predefined set, by using methods from machine teaching, or with concept-based methods.
• Perturbing the input example to see
the effect on both prediction and explanation. In addition to aiding understanding of the model (directly testing
a counterfactual), this action lets an affected user contest the initial prediction: “But officer, one
of those prior DUIs was overturned ...?”
• Adjusting the model. Based on new
understanding, the user may wish to
correct the model. Here, we expect to
build on tools for interactive machine learning [1] and explanatory debugging [20, 21], which have explored interactions for adding new training examples, correcting erroneous labels in
existing data, specifying new features,
and modifying shape functions. As
mentioned in the previous section, it
may be challenging to map user adjustments, which are made in reference to an explanatory model, back into the original, inscrutable model.
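Several of these actions are straightforward to prototype. For instance, the sensitivity query can be approximated by brute-force search over candidate feature values for the smallest single-feature change that flips the model's decision. The sketch below is our illustration, using a hypothetical loan-style rule as the black-box model:

```python
# Illustrative sketch of a sensitivity query: find the smallest change to
# a single feature that flips a (hypothetical) classifier's decision.

def minimal_flip(predict, x, feature_grid):
    """Scan each feature's candidate values; return the smallest change
    (by absolute difference) that alters predict(x), or None."""
    base = predict(x)
    best = None
    for f, candidates in feature_grid.items():
        for v in candidates:
            if v == x[f]:
                continue
            x2 = dict(x, **{f: v})          # perturb one feature
            if predict(x2) != base:
                delta = abs(v - x[f])
                if best is None or delta < best[2]:
                    best = (f, v, delta)
    return best  # (feature, new_value, change_magnitude)

# Hypothetical rule-based model: approve iff income - 10*debts >= 50.
predict = lambda x: x["income"] - 10 * x["debts"] >= 50
applicant = {"income": 45, "debts": 1}
grid = {"income": range(0, 101, 5), "debts": range(0, 6)}
print(minimal_flip(predict, applicant, grid))
```

For the applicant above, no feasible reduction in debts flips the decision, so the minimal perturbation is raising income to the decision boundary.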
To make these ideas concrete, Figure 7 presents a possible dialog as a user tries to understand the robustness of a deep neural dog/fish classifier built atop Inception v3 [39]. As the figure shows: (1) The computer correctly predicts that the image depicts a fish. (2) The user requests an explanation, which is provided using LIME [33]. (3) The user, concerned that the classifier is paying more attention to the background than to the fish itself, asks to see the training data that influenced the classifier; the nearest neighbors are computed using influence functions. While there are anemones in those images, it also seems that the system is recognizing a clownfish. (4) To gain confidence, the user edits the input image to remove the background, resubmits it to the classifier, and checks the resulting prediction and explanation.
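The LIME step in such a dialog can be sketched in miniature: sample random on/off masks over superpixel regions, query the black box on each masked input, and fit a local linear surrogate whose positive weights mark "green" (supporting) regions and negative weights "red" (opposing) ones. This is our simplified reconstruction, not the LIME library; the `score` black box below is a stand-in:

```python
# Miniature LIME-style surrogate (a simplification, not the LIME library):
# explain a black-box score over an input split into "superpixel" regions.
import random
import numpy as np

def local_surrogate(black_box, n_regions, n_samples=500, seed=0):
    """Fit a linear model mapping region on/off masks to black-box scores.
    Positive weights support the prediction; negative weights oppose it."""
    rng = random.Random(seed)
    masks = [[rng.randint(0, 1) for _ in range(n_regions)]
             for _ in range(n_samples)]
    X = np.array(masks, dtype=float)
    y = np.array([black_box(m) for m in masks])
    X1 = np.hstack([X, np.ones((n_samples, 1))])   # add intercept column
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)     # least-squares fit
    return w[:-1]                                  # per-region weights

# Hypothetical black box: region 0 (the fish) raises the FISH score,
# region 2 (background clutter) lowers it; region 1 is irrelevant.
score = lambda m: 2.0 * m[0] - 1.0 * m[2] + 0.5
weights = local_surrogate(score, n_regions=3)
print(weights.round(2))   # region 0 "green", region 2 "red"
```

Because the stand-in black box is exactly linear, the surrogate recovers its coefficients; for a real classifier the fit would only hold locally, which is precisely the point of a local explanation.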
Explaining Combinatorial Search
Most of the preceding discussion has focused on intelligible machine learning, which is just one type of artificial intelligence.
Figure 7. An example of an interactive explanatory dialog for gaining insight into a DOG/FISH image classifier. The questions and answers are shown in English text for illustration only; an interactive GUI, for example, one building on the ideas of Krause et al. [20], would likely be a better realization.
C: I predict FISH. See below: green regions argue for FISH, while red pushes toward DOG. There’s more green. [explanation overlay shown]
H: (Hmm. Seems like it might be just recognizing anemone texture!) Which training examples are most influential to the prediction?
C: These ones: [training images shown]
H: What happens if the background is removed? E.g., [edited image shown]
C: I still predict FISH, because of these green superpixels: [explanation overlay shown]