which I believe carried merit, were completely silenced when probabilistic approaches started solving commonsense reasoning problems that had defied logical approaches for more than a decade. The bullied-by-success community then made even more far-reaching choices in this case, as symbolic logic almost disappeared from the AI curricula. Departments that were viewed as world centers for representing and reasoning with symbolic logic barely offered any logic courses as a result. Now we are paying the price. As one example: Not realizing that probabilistic reasoning attributes numbers to Boolean propositions in the first place, and that logic is at the heart of probabilistic reasoning except in its simplest form, we have now come to the conclusion that we need to attribute probabilities to more complex Boolean propositions and even to first-order sentences. The resulting frameworks are referred to as "first-order probabilistic models" or "relational probabilistic models," and there is a great need for skill in symbolic logic to advance these formalisms (see the short sketch at the end of this passage). The only problem is that this skill has almost vanished from within the AI community.

The blame for this phenomenon cannot be assigned to any particular party. It is natural for the successful to be overjoyed and sometimes also inflate that success. It is expected that industry will exploit such success in ways that may redefine the employment market and influence the academic interests of graduate students. It is also understandable that the rest of the academic community may play along for the sake of its survival: win a grant, get a paper in, attract a student. While each of these behaviors seems rational locally, their combination can be harmful to scientific inquiry and hence irrational globally. Beyond raising awareness about this recurring phenomenon, decision makers at the governmental and academic levels bear a particular responsibility for mitigating its negative effects. Senior members of the academic community also bear the responsibility of putting current developments in historical perspective, to empower junior researchers in pursuing their genuine academic interests instead of just yielding to current fashions.t
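To make concrete the point that probabilistic reasoning attributes numbers to Boolean propositions, and that logic sits inside the probability computation, here is a minimal sketch. The three variables and the toy distribution are invented for illustration, not taken from the article: the probability of a Boolean formula is the total weight of the possible worlds that satisfy it, so evaluating the formula in each world is the logical step at the heart of the calculation.

```python
# A minimal sketch (invented example): probabilistic reasoning assigns
# numbers to Boolean propositions. The probability of a formula is the
# total weight of the truth assignments ("worlds") that satisfy it.

from itertools import product

variables = ["rain", "sprinkler", "wet_grass"]

def weight(world):
    """Assumed toy distribution over worlds, factored for illustration."""
    p_rain = 0.3 if world["rain"] else 0.7
    p_sprinkler = 0.5
    if world["wet_grass"]:
        p_wet = 0.9 if (world["rain"] or world["sprinkler"]) else 0.1
    else:
        p_wet = 0.1 if (world["rain"] or world["sprinkler"]) else 0.9
    return p_rain * p_sprinkler * p_wet

def probability(formula):
    """Sum the weights of all worlds in which the Boolean formula holds."""
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if formula(world):          # the symbolic step: does this world satisfy the formula?
            total += weight(world)
    return total

# Probability of the Boolean proposition: wet_grass AND NOT rain
print(probability(lambda w: w["wet_grass"] and not w["rain"]))
```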
Policy Considerations
Let me now address some policy concerns with regard to focusing all our attention on functions instead of also on models. A major concern here relates to interpretability and explainability. If a medical-diagnosis system recommends surgery, we would need to know why. If a self-driving car kills someone, we would also need to know why. If a voice command unintentionally shuts down a power-generation system, it would need to be explained as well. Answering "Why?" questions is central to assigning blame and responsibility and lies at the heart of legal systems. It is also now recognized that opacity, or lack of explainability, is "one of the biggest obstacles to widespread adoption of artificial intelligence."u
Models are more interpretable than functions.v Moreover, models offer a wider class of explanations than functions, including explanations of novel situations and explanations that can form a basis for "understanding" and "control." This is due to models having access to information that goes beyond what can be extracted from data. To elaborate on these points, I first need to explain why a function may not qualify as a model, a question I received during a discussion on the subject.
Consider an engineered system that allows us to blow air into a balloon that then raises a lever positioned on top of the balloon. The input to this system is the amount of air we blow (X), while the output is the position of the lever (Y). We can learn a function that captures the behavior of the system by collecting X-Y pairs and then estimating the function Y = f(X). While this function may be all we need for certain applications, it would not qualify as a model, as it does not capture the system mechanism. Modeling that mechanism is essential for certain explanations (Why is the change in the lever position not a linear function of the amount of air blown?) and for causal reasoning more generally (What if the balloon is pinched?). One may try to address these issues by adding more inputs to the function, but this may blow up the function size, among other difficulties; more on this next.
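To make the distinction concrete, here is a minimal sketch in Python. The toy physics, the pinching effect, and all names below are illustrative assumptions rather than anything from the article: a function is fit from observed X-Y pairs, while the mechanistic description can also answer an interventional question.

```python
# A minimal sketch (not from the article) contrasting a learned input-output
# function with a mechanistic description of the balloon-and-lever system.
# The physics, names, and numbers below are illustrative assumptions.

import math

def lever_position(air, pinched=False):
    """Assumed mechanism: blown air inflates a balloon whose height lifts
    the lever; pinching the balloon reduces how much it inflates."""
    inflation = 0.5 if pinched else 1.0                     # assumed pinching effect
    radius = (3.0 * inflation * air / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 2.0 * radius                                     # lever rises with balloon height

# Function view: collect X-Y pairs and estimate Y = f(X) (here, a least-squares line).
xs = [x / 10.0 for x in range(1, 51)]
ys = [lever_position(x) for x in xs]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def f(x):
    """The learned function: predicts lever position from air blown."""
    return slope * x + intercept

# The function predicts reasonably within the observed regime...
print("f(3.0) =", round(f(3.0), 3), "  system:", round(lever_position(3.0), 3))

# ...but it offers no handle for the intervention "what if the balloon is pinched?"
# and no account of why the response is nonlinear; the mechanism answers both.
print("pinched system(3.0) =", round(lever_position(3.0, pinched=True), 3))
```

The fitted line answers prediction queries in the regime it was trained on, but only the mechanistic description explains why the response is nonlinear or supports the pinching intervention.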
In his book The Book of Why: The New Science of Cause and Effect, Judea Pearl explained further the differences between a (causal) model and a function, even though he did not use the term "function" explicitly. In Chapter 1, he wrote: "There is only one way a thinking entity (computer or human) can work out what would happen in multiple scenarios, including some that it has never experienced before. It must possess, consult, and manipulate a mental causal model of that reality." He then gave an example of a navigation system based on either reasoning with a map (model) or consulting a GPS system that gives only a list of left-right turns for arriving at a destination (function). The rest of the discussion focused on what can be done with the model but not the function. Pearl's argument particularly focused on how a model can handle novel scenarios (such as encountering roadblocks that invalidate the function's recommendations) while pointing to the combinatorial impossibility of encoding such contingencies in the function, as it must have a bounded size.
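Pearl's map-versus-directions contrast can be sketched in a few lines of Python. The road network, place names, and the roadblock below are invented for illustration; the only point is that a map (here, a graph) supports re-planning when a route is invalidated, whereas a memorized list of turns does not.

```python
# A small illustrative sketch (not from Pearl's book): a "map" as a graph
# that supports re-planning, versus a fixed list of turns. The road network
# and the roadblock are made-up examples.

from collections import deque

road_map = {                      # model: which places connect to which
    "home":   ["1st_st", "2nd_st"],
    "1st_st": ["home", "bridge"],
    "2nd_st": ["home", "tunnel"],
    "bridge": ["1st_st", "office"],
    "tunnel": ["2nd_st", "office"],
    "office": ["bridge", "tunnel"],
}

def plan(graph, start, goal):
    """Breadth-first search: consult the map to produce a route."""
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

# Function view: a memorized list of turns, valid only for one scenario.
gps_directions = plan(road_map, "home", "office")
print("memorized route:", gps_directions)

# Novel scenario: the bridge is blocked. The turn list is now simply wrong,
# but the map can be consulted again to produce a new route.
blocked = {k: [v for v in vs if v != "bridge"]
           for k, vs in road_map.items() if k != "bridge"}
print("re-planned route:", plan(blocked, "home", "office"))
```

Precomputing a turn list for every possible combination of blockages would require enumerating contingencies up front, which is the combinatorial blow-up a bounded-size function runs into.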
t I made these remarks over a dinner table that included a young machine learning researcher, whose reaction was: "I feel much better now." He was apparently subjected to this phenomenon by support-vector-machine (SVM) researchers during his Ph.D. work, when SVMs were at their peak and considered "it" at the time. Another young vision researcher, pressed on whether deep learning is able to address the ambitions of vision research, said, "The reality is that you cannot publish a vision paper today in a top conference if it does not contain a deep learning component, which is kind of depressing."

u DARPA's push to make artificial intelligence explain itself. The Wall Street Journal (Aug. 10, 2017); http://on.wsj.com/2vmZKlM; DARPA's program on "explainable artificial intelligence"; https://www.darpa.mil/program/explainable-artificial-intelligence; and the E.U. general data protection regulation on "explainability"; https://www.privacy-regulation.eu/en/r71.htm

v I am referring here to learned and large functions of the kind that stand behind some of the current successes (such as neural networks with thousands or millions of parameters). This excludes simple or well-understood learned functions and functions synthesized from models, as they can be interpretable or explainable by design.