ity into this consequential question:
How effective will function-based
approaches be when applied to new
and broader applications than those
already targeted, particularly those
that mandate more stringent measures of success? The question has
two parts: The first concerns the class
of cognitive tasks whose corresponding functions are simple enough to allow compact representations that can
be evaluated efficiently (as in neural
networks) and whose estimation is
within reach of current thresholds—
or thresholds we expect to attain in,
say, 10 to 20 years. The second alludes to the fact that these functions
are only approximations of cognitive
tasks; that is, they do not always get it
right. How suitable or acceptable will
such approximations be when targeting cognitive tasks that mandate
measures of success that are tighter
than those required by the currently
targeted applications?
The Power of Success
Before I comment on policy considerations, let me highlight a relevant
phenomenon that recurs in the history of science, with AI no exception.
I call it the “bullied-by-success” phenomenon, in reference to the subduing of a research community into
mainly pursuing what is currently successful, at the expense of sufficiently pursuing
what may be more successful
or needed in the future.
Going back to AI history, some of
the perspectives promoted during
the expert-systems era can be safely
characterized today as having been
scientifically absurd. Yet, due to the
perceived success of expert systems
then, these perspectives had a dominating effect on the course of scientific
dialogue and direction, leading to a
bullied-by-success community.s I saw a
similar phenomenon during the transition from logic-based approaches
to probability-based approaches for
commonsense reasoning in the late
1980s. Popular arguments then, like
“People don’t reason probabilistically,”
s A colleague could not help but joke that the broad
machine learning community is being bullied
today by the success of its deep learning sub-community, just as the broader AI community
has been bullied by the success of its machine
learning sub-community.
man speech, and African grey parrots
can generate sounds that mimic human speech to impressive levels. Yet
none of these animals has the cognitive abilities and intelligence typically
attributed to humans.
One of the reactions I received to
such remarks was: “I don’t know of any
animal that can play Go!” This was in
reference to the AlphaGo system, which
set a milestone in 2016 by beating the
world champion in the game. Indeed,
we do not know of animals that can play
a game as complex as Go. But first recall
the difference between performance
and intelligence: A calculator outperforms humans at arithmetic without
possessing human or even animal cognitive abilities. Moreover, contrary to
what seems to be widely believed, AlphaGo is not a neural network since
its architecture is based on a collection
of AI techniques that have been in the
works for at least 50 years.o This includes
the minimax technique for two-player
games, stochastic search, learning from
self-play, use of evaluation functions
to cut off minimax search trees, and
reinforcement learning, in addition to
two neural networks. While a Go player
can be viewed as a function that maps a
board configuration (input) to an action
(output), the AlphaGo player was not
built by learning a single function from
input-output pairs; only some of its
components were built that way.p The
issue here is not only about assigning
credit but about whether a competitive
Go function can be small enough to be
represented and estimated under current data-gathering, storage, and computational thresholds. It would be
quite interesting if this were the case,
but we do not yet know the answer. I
should also note that AlphaGo is a
great example of what one can achieve
today by integrating model-based and
function-based approaches.
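To make concrete what a depth-limited minimax search with an evaluation-function cutoff looks like, here is a minimal sketch. The toy game (remove one or two stones; whoever takes the last stone wins) and every name in it (`NimGame`, `minimax`, `best_move`) are hypothetical illustrations of the general technique under my own simplifying assumptions, not AlphaGo's actual implementation, which couples such search with learned neural networks and the other techniques listed above.

```python
# Sketch of depth-limited minimax with an evaluation-function cutoff.
# The toy game and its interface are hypothetical stand-ins for the
# far larger state space of a game such as Go.

class NimGame:
    """Toy two-player game: remove 1 or 2 stones; taking the last wins."""

    def legal_moves(self, stones):
        return [m for m in (1, 2) if m <= stones]

    def apply(self, stones, move):
        return stones - move

    def is_terminal(self, stones):
        return stones == 0

    def evaluate(self, stones, maximizing):
        # Heuristic score from the maximizer's perspective: the player
        # to move wins iff the pile size is not a multiple of 3.
        to_move_wins = stones % 3 != 0
        return 1 if to_move_wins == maximizing else -1


def minimax(state, depth, maximizing, game):
    """Value of `state`, cutting off at `depth` with game.evaluate()."""
    if depth == 0 or game.is_terminal(state):
        # The evaluation function replaces searching the full subtree.
        return game.evaluate(state, maximizing)
    children = (
        minimax(game.apply(state, m), depth - 1, not maximizing, game)
        for m in game.legal_moves(state)
    )
    return max(children) if maximizing else min(children)


def best_move(state, depth, game):
    """A player as a function: board configuration in, action out."""
    return max(
        game.legal_moves(state),
        key=lambda m: minimax(game.apply(state, m), depth - 1, False, game),
    )
```

Note that `best_move` is literally a function from a board configuration to an action, echoing the function view of a Go player discussed above, yet its value is computed by search rather than learned from input-output pairs.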
Pushing Thresholds
One cannot of course preclude the
possibility of constructing a competitive Go function or similarly complex
o Oren Etzioni of the Allen Institute for Artificial
Intelligence laid out this argument during a
talk at UCLA in March 2016 called Myths and
Facts about the Future of AI.
p AlphaZero, the successor to AlphaGo, used one
neural network instead of two and data generated through self-play, setting another milestone.
functions, even though we may not
be there today, given current thresholds. But this raises the question: If it
is a matter of thresholds, and given
current successes, why not focus all
our attention on moving thresholds
further? While there is merit to this
proposal, which seems to have been
adopted by key industries, it does
face challenges that stem from both
academic and policy considerations.
I address academic considerations
next while leaving policy considerations to a later section.
From an academic viewpoint, the
history of AI tells us to be quite cautious, as we have seen similar phenomena before. Those of us who have
been around long enough can recall
the era of expert systems in the 1980s.
At that time, we discovered ways to
build functions using rules that were
devised through “knowledge engineering” sessions, as they were then
called. The functions created through
this process, called “expert systems”
and “knowledge-based systems,” were
claimed to achieve performance that
surpassed human experts in some
cases, particularly in medical diagnosis.q The phrase “knowledge is power”
symbolized a jubilant
state of affairs, resembling what “deep
learning” has come to symbolize today.r The period following this era
came to be known as the “AI Winter,”
as we could finally delimit the class of
applications that yielded to such systems, and that class fell well short of
AI ambitions.
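As a caricature of the “functions built from rules” described above, here is a minimal forward-chaining sketch. The rules and all names (`RULES`, `forward_chain`) are hypothetical illustrations of the general pattern, not any particular historical system.

```python
# A caricature of an expert-system rule base with forward chaining:
# rules fire whenever all their conditions hold, adding conclusions
# until a fixed point is reached. Rules and facts are hypothetical.

RULES = [
    # (conditions that must all hold, conclusion to add)
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_breath"}, "refer_to_clinic"),
]


def forward_chain(facts, rules):
    """Repeatedly apply rules until no new conclusions follow."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts
```

The system as a whole is again a function, mapping observed facts to conclusions, but one assembled by hand from engineered rules rather than estimated from data.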
While the current rate of progress on neural networks has been
impressive, it has not been sustained
long enough to allow sufficient visibil-
q One academic outcome of the expert system
era was the introduction of a dedicated master’s degree at Stanford University called the
“Master’s in AI” that was separate from the
master’s in computer science and had significantly looser course requirements. It
was a two-year program, with the second
year dedicated to building an expert system.
I was a member of the very last class that
graduated from the program before it was
terminated and recall that one of its justifications was that classical computer science
techniques can be harmful to the “heuristic” thinking needed to effectively build expert systems.
r The phrase “knowledge is power” is apparently due to English philosopher Sir Francis
Bacon (1561–1626).