model. However, to do this based on a learned function, the function would need to be trained in the presence of smokers or other smoke-producing agents, while defining smoke as an input to the function and ensuring that smoke mediates the relationship between fire and alarm, a task that requires external manipulation.
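To make the contrast with reasoning on a model concrete, here is a minimal sketch in Python. The causal chain fire -> smoke -> alarm comes from the discussion in this section; the Boolean mechanisms and all function names are illustrative assumptions, not anything from the original text.

```python
# Sketch of reasoning on a small causal model of the alarm scenario.
# Structure (fire -> smoke -> alarm) follows the text; the mechanisms
# and names below are illustrative assumptions.

def smoke(fire: bool, smoker_present: bool = False) -> bool:
    # Smoke is produced by fire or by another smoke-producing agent.
    return fire or smoker_present

def alarm(smoke_present: bool) -> bool:
    # The alarm is triggered by smoke, not directly by fire.
    return smoke_present

def predict_alarm(fire: bool, smoker_present: bool = False) -> bool:
    # Chain the mechanisms: alarm depends on fire only through smoke.
    return alarm(smoke(fire, smoker_present))

# Seen situation: fire produces smoke, which triggers the alarm.
print(predict_alarm(fire=True))                         # True
# Novel situation: a smoker but no fire. The model still predicts an
# alarm, without ever having seen a smoker in any data.
print(predict_alarm(fire=False, smoker_present=True))   # True
```

The point of the sketch is only that the prediction for the smoker case falls out of the model's structure by reasoning, with no retraining involved.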
As Pearl told me, model-based explanations are also important because they give us a sense of “understanding” or “being in control” of a phenomenon. For example, knowing that a certain diet prevents heart disease does not satisfy our desire for understanding unless we know why. Knowing that the diet works by lowering the cholesterol level in the blood partially satisfies this desire because it opens up new possibilities of control. For instance, it drives us to explore cholesterol-lowering drugs, which may be more effective than diet. Such control possibilities are implicit in models but cannot be inferred from a learned, black-box function, as it has no access to the necessary information (such as that cholesterol level mediates the relationship between diet and heart disease).
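As a hedged illustration of how such control possibilities can be read directly off a model, the sketch below encodes the diet -> cholesterol -> heart-disease chain from the paragraph above and intervenes on the mediator. The qualitative mechanisms, the drug intervention, and all names are invented for illustration; they are not claims about medicine or about the article's models.

```python
# Sketch: reading control possibilities off the causal chain
# diet -> cholesterol -> heart disease. Mechanisms are illustrative
# assumptions, not real medical relationships.

def cholesterol_level(healthy_diet: bool, drug: bool = False) -> str:
    # Either the diet or a cholesterol-lowering drug lowers cholesterol.
    return "low" if (healthy_diet or drug) else "high"

def heart_disease_risk(cholesterol: str) -> str:
    # Risk depends on diet only through cholesterol (the mediator).
    return "reduced" if cholesterol == "low" else "elevated"

# Observed regularity: the diet reduces risk.
print(heart_disease_risk(cholesterol_level(healthy_diet=True)))              # reduced

# Control possibility implicit in the model: intervene on the mediator
# directly with a cholesterol-lowering drug, bypassing diet altogether.
print(heart_disease_risk(cholesterol_level(healthy_diet=False, drug=True)))  # reduced
```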
A number of researchers contacted me about the first draft of this section, which was focused entirely on explanations, to draw my attention to additional policy considerations that seem to require models. Like explanations, they all fell under the label “reasoning about AI systems,” but this time with the aim of ensuring that the developed systems satisfy certain properties. At the top of these properties were safety and fairness, particularly as they relate to AI systems that are driven only by data. These considerations constitute further examples where models may be needed, not only to explain or to compensate for a lack of data, but to ensure we are able to build the right AI systems and reason about them rigorously.
A Theory of Cognitive Functions
One reaction I received to my model-based vs. function-based perspective came during a workshop dedicated to deep learning at the Simons Institute for the Theory of Computing in March 2017. The workshop
There is today a growing body of work on explaining functions, where the vocabulary of explanations is restricted to the function inputs. For example, in medical diagnosis, an explanation may point to important inputs (such as age, weight, and heart attack history) when explaining why the function is recommending surgery. The function may have many additional inputs, so the role of an explanation is to deem them irrelevant. In vision applications, such explanations may point to a specific part of the image that led to recognizing an object; again, the role of an explanation is to deem some pixels irrelevant to the recognition. These explanations are practically useful, but due to their limited vocabulary and the limited information they can access, they could face challenges when encountering novel situations. Moreover, they may not be sufficient when one is seeking explanations for the purpose of understanding or control.
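For concreteness, here is a minimal sketch of an input-based explanation of the kind just described: it scores each input of a black-box function by whether perturbing that input flips the output, and deems the remaining inputs irrelevant. The diagnosis function, the patient record, and the perturbation rule are all hypothetical, chosen only to illustrate the idea.

```python
# Sketch of an input-based explanation: the explanation vocabulary is
# restricted to the function's inputs. The black-box function and the
# sensitivity rule below are illustrative assumptions.

def black_box(inputs: dict) -> bool:
    # Stand-in for a learned diagnosis function that recommends surgery.
    return inputs["age"] > 60 and inputs["heart_attack_history"]

def explain(f, inputs: dict) -> dict:
    # Score each input by whether a crude perturbation flips the output.
    baseline = f(inputs)
    relevance = {}
    for name, value in inputs.items():
        perturbed = dict(inputs)
        # Negate Booleans, zero out numbers (an illustrative choice).
        perturbed[name] = (not value) if isinstance(value, bool) else 0
        relevance[name] = f(perturbed) != baseline
    return relevance

patient = {"age": 72, "weight": 80, "heart_attack_history": True}
print(black_box(patient))           # True: surgery recommended
print(explain(black_box, patient))  # {'age': True, 'weight': False, 'heart_attack_history': True}
```

Note that the explanation's vocabulary is limited to the inputs age, weight, and heart attack history; it says nothing about why the recommendation would be correct in the world.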
Consider a function that predicts the sound of an alarm based on many inputs, including fire. An input-based explanation may point to fire as the culprit behind the alarm sound. Such an explanation effectively relies on comparing this scenario to similar scenarios in the data, in which the sound of the alarm was heard soon after fire was detected; these scenarios are summarized by the function parameters. While this may explain why the function reached a certain conclusion, it does not explain why the conclusion (alarm sound) may be true in the physical world.w Nor does it explain how fire triggers the alarm; is it, say, through smoke or through heat? The importance of these distinctions surfaces when novel situations arise that have not been seen before. For example, if the alarm is triggered by smoke, then inviting a smoker into our living room might trigger an alarm even in the absence of fire. In this case, pointing to fire as an explanation of the sound would be problematic. Humans arrive at such conclusions without ever seeing a smoker, which can also be achieved through reasoning on an appropriate model.
w The function imitates data instead of reasoning about a model of the physical world.
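To make this limitation concrete, here is a minimal sketch of such a data-driven predictor, reduced to the regularity available in its training data. The toy data set, the majority-vote rule, and all names are assumptions made for illustration; the sketch is not the article's method or any particular learning algorithm.

```python
# Sketch of the function-based view: a toy "learned" predictor that
# summarizes co-occurrence data in which the alarm always followed fire
# and no smoker ever appeared. Data and rule are illustrative assumptions.

TRAINING_DATA = [
    # (fire, alarm) pairs: the only regularity available to the learner.
    (True, True), (True, True), (False, False), (False, False),
]

def learned_alarm_predictor(fire: bool) -> bool:
    # Predict the majority alarm outcome observed for this value of fire.
    outcomes = [alarm for f, alarm in TRAINING_DATA if f == fire]
    return sum(outcomes) > len(outcomes) / 2

# In-distribution: fire predicts alarm, and an input-based explanation
# would point to fire as the culprit.
print(learned_alarm_predictor(fire=True))   # True

# Novel situation: a smoker but no fire. There is no smoke variable and
# no notion of mediation, so the function predicts no alarm.
print(learned_alarm_predictor(fire=False))  # False
```

An input-based explanation of this predictor can only ever point to fire, since fire is its only input; the mediating role of smoke is simply not representable.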