is immediately apparent, allowing appropriate skepticism (despite high test
accuracy) and easier debugging.
Recall that GA2Ms are more expressive than simple GAMs because they
include pairwise terms. Figure 4c
depicts such a term for the features
age and cancer. This explanation indicates that among the patients who
have cancer, the younger ones are at
higher risk, perhaps because younger patients who develop cancer tend to be critically ill. Again, since doctors can readily inspect these terms, they can tell whether the learner has reached unexpected conclusions.
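To make the additive structure concrete, the following Python sketch shows how a GA2M-style score decomposes into univariate terms plus a single age-by-cancer interaction term, each of which can be inspected on its own. The shape functions here are invented, hand-written stand-ins for illustration, not functions learned from data.

```python
import math

# Illustrative sketch of a GA2M-style additive risk model.
# The shape functions below are hypothetical stand-ins, not learned ones.

def f_age(age):
    # univariate shape function for age (hypothetical)
    return 0.03 * (age - 50)

def f_cancer(has_cancer):
    # univariate shape function for the cancer indicator (hypothetical)
    return 1.2 if has_cancer else 0.0

def f_age_cancer(age, has_cancer):
    # pairwise term: among cancer patients, younger age adds extra risk
    return 0.8 * max(0.0, (60 - age) / 60) if has_cancer else 0.0

def ga2m_score(age, has_cancer, intercept=-2.0):
    # the prediction is a sum of term contributions, so each term can be
    # read off (or plotted as a shape function or heatmap) in isolation
    terms = {
        "intercept": intercept,
        "f(age)": f_age(age),
        "f(cancer)": f_cancer(has_cancer),
        "f(age, cancer)": f_age_cancer(age, has_cancer),
    }
    logit = sum(terms.values())
    return 1.0 / (1.0 + math.exp(-logit)), terms

prob, terms = ga2m_score(age=35, has_cancer=True)
print(f"predicted risk: {prob:.2f}")
for name, value in terms.items():
    print(f"  {name:>16}: {value:+.2f}")
```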
Limitations. As described, GA2M
models are restricted to binary classification, and so explanations are
clearly contrastive—there is only one
choice of foil. One could extend GA2M
to handle multiple classes by training
n one-vs-rest classifiers or building
a hierarchy of classifiers. However,
while these approaches would yield
a working multi-class classifier, we
don’t know if they preserve model intelligibility, nor whether a user could
effectively adjust such a model by editing the shape functions.
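As a rough illustration of the one-vs-rest option, the sketch below wraps an assumed binary trainer (the name train_binary_ga2m is a placeholder) in the usual one-vs-rest scheme; it says nothing about whether the resulting ensemble remains as intelligible or as editable as a single binary GA2M.

```python
from typing import Callable, Sequence

# Hypothetical sketch: a one-vs-rest wrapper around binary additive models.
# `train_binary_ga2m` stands in for any trainer that returns a scoring
# function mapping a feature vector to P(positive | x); it is assumed here.

def one_vs_rest(train_binary_ga2m: Callable, X: Sequence, y: Sequence, classes: Sequence):
    # train one binary model per class, treating that class as "positive"
    models = {
        c: train_binary_ga2m(X, [1 if label == c else 0 for label in y])
        for c in classes
    }

    def predict(x):
        # pick the class whose one-vs-rest model assigns the highest score;
        # each per-class model stays individually inspectable, but whether
        # the combination stays intelligible is an open question
        return max(classes, key=lambda c: models[c](x))

    return models, predict
```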
Furthermore, recall that GA2Ms decompose their prediction into effects
of individual terms, which can be visualized. However, if users are confused
about what terms mean, they will not
understand the model or be able to
ask meaningful “what-if” questions.
Moreover, if there are too many features, the model’s complexity may be
overwhelming. Lipton notes that the effort required to simulate some models (such as decision trees) may grow logarithmically with the number of features,25 but for GA2M the number of visualizations to inspect could
increase quadratically. Several methods might help users manage this
complexity; for example, the terms
could be ordered by importance;
however, it’s not clear how to estimate importance. Possible methods
include using an ablation analysis to
compute influence of terms on model
performance or computing the maximum contribution of terms as seen in
the training samples. Alternatively, a
domain expert could group terms semantically to facilitate perusal.
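One of the importance heuristics mentioned above, ranking terms by the largest contribution they make to any training example, is simple to sketch; the term-function interface below is assumed for illustration.

```python
def rank_terms_by_max_contribution(term_fns, X_train):
    """Order GA2M terms by the largest absolute contribution any training
    sample receives from them -- one of the heuristics mentioned above.
    `term_fns` maps a term name to a function of a feature vector; this
    interface is assumed for illustration. An ablation analysis, the other
    option mentioned, would instead retrain or rescore with each term removed."""
    importance = {
        name: max(abs(fn(x)) for x in X_train)
        for name, fn in term_fns.items()
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
```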
However, when the number of features grows into the millions, as occurs with classifiers over text, audio, image, and video data, existing intelligible models do not perform nearly as well as inscrutable methods, like deep neural networks. Since these models combine millions of features in complex, nonlinear ways, they are beyond human capacity to simulate.
Understanding Inscrutable Models
There are two ways that an AI model
may be inscrutable. It may be provided as a blackbox API, such as Microsoft Cognitive Services, which uses
machine learning to provide image-recognition capabilities but does not
allow inspection of the underlying
model. Alternatively, the model may
be under the user’s control yet extremely complex, such as a deep neural network, where a user has access to myriad learned parameters but cannot reasonably interpret them. How can one best explain such models to a human user?
The comprehensibility/fidelity trade-off. A good explanation of an event is
both easy to understand and faithful,
conveying the true cause of the event.
Unfortunately, these two criteria almost always conflict. Consider the
predictions of a deep neural network
with millions of nodes: a complete
and accurate trace of the network’s prediction would be far too complex to understand, but any simplification that a person could grasp would sacrifice fidelity.
Finding a satisfying explanation,
therefore, requires balancing the competing goals of comprehensibility and
fidelity. Lakkaraju et al.22 suggest formulating an explicit optimization of
this form and propose an approximation algorithm for generating global explanations in the form of compact sets
of if-then rules. Ribeiro et al. describe
a similar optimization algorithm that
balances faithfulness and coverage in
its search for summary rules.
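The flavor of such an optimization can be sketched as a greedy search that trades fidelity to the black-box model against rule-set size. This is a generic illustration of the comprehensibility/fidelity balance, not the specific algorithms of Lakkaraju et al. or Ribeiro et al., and the rule representation is assumed.

```python
def greedy_rule_summary(candidate_rules, X, blackbox_predict, max_rules=5, penalty=0.01):
    """Greedy sketch of the fidelity/comprehensibility trade-off: pick rules
    that best reproduce the black-box model's labels, penalizing rule count.
    Each rule is an assumed (condition_fn, label) pair."""
    def fidelity(rules):
        hits = 0
        for x in X:
            # the first matching rule fires; unmatched inputs count as misses
            for cond, label in rules:
                if cond(x):
                    hits += int(label == blackbox_predict(x))
                    break
        return hits / len(X)

    chosen = []
    while len(chosen) < max_rules:
        best, best_score = None, fidelity(chosen) - penalty * len(chosen)
        for rule in candidate_rules:
            if rule in chosen:
                continue
            score = fidelity(chosen + [rule]) - penalty * (len(chosen) + 1)
            if score > best_score:
                best, best_score = rule, score
        if best is None:
            break  # no remaining rule improves the fidelity/size objective
        chosen.append(best)
    return chosen
```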
Indeed, all methods for rendering an
inscrutable model intelligible require
mapping the complex model to a simpler one.28 Several high-level approaches to mapping have been proposed.
Local explanations. One way to
simplify the explanation of a learned
model is to make it relative to a single
input query. Such explanations, which are termed local33 or instance-based, need only be faithful to the model’s behavior in the neighborhood of that query.
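A common way to realize this idea, sketched below in a generic form rather than as any particular published method, is to perturb the query, label the perturbations with the inscrutable model, and fit a small weighted linear surrogate whose coefficients summarize the model’s local behavior.

```python
import numpy as np

def local_linear_explanation(x, blackbox_predict, n_samples=500, sigma=0.1):
    """Sketch of a local explanation for query `x` (a 1-D NumPy array):
    perturb the query, label the perturbations with the black-box model,
    and fit a proximity-weighted linear surrogate. The coefficients
    indicate local feature influence; this is the generic local-surrogate
    idea, with all names and parameters chosen for illustration."""
    rng = np.random.default_rng(0)
    X_local = x + sigma * rng.standard_normal((n_samples, len(x)))
    y_local = np.array([blackbox_predict(z) for z in X_local])
    # weight perturbed samples by their proximity to the query
    w = np.exp(-np.sum((X_local - x) ** 2, axis=1) / (2 * sigma ** 2))
    # weighted least squares via the normal equations, with a bias column
    A = np.hstack([X_local, np.ones((n_samples, 1))])
    W = np.diag(w)
    coef, *_ = np.linalg.lstsq(A.T @ W @ A, A.T @ W @ y_local, rcond=None)
    return coef[:-1]  # per-feature local influence (bias term dropped)
```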