Figure 1. An abbreviated taxonomy of the evolution of neural networks shows a progression from the simple one-layer perceptron to the multilayered perceptron (MLP), to convolutional and recurrent networks with memory, and to adaptable reinforcement-learning algorithms.
Back in favor because of the MLP breakthrough, neural networks advanced rapidly. We have gone well beyond recognizing numbers and handwriting, to networks that recognize and label faces in photographs. New methods have since been added that allow recognition in moving video images; see the figure for some of the keywords.
Q: What propelled the advances?
Two things. The first is the abundance of data: from online social networks such as Twitter and Facebook, from large-scale sensor networks such as smartphones giving positional data for traffic maps, and from searches for correlations between previously separate large databases. The questions that could be answered if we could process all that data, by recognizing and recommending, were a very strong motivating force.
The other big factor is the proliferation of low-cost, massively parallel hardware such as the Nvidia GPU (graphics processing unit) used in graphics cards. GPUs rapidly process the large matrices that represent the positions of objects; they are super-fast linear-algebra machines. Training a network involves computing its connection matrices, and using a network involves evaluating matrix multiplications. GPUs do both of these things really well.
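To make that concrete, here is a minimal sketch in Python, assuming NumPy, of what using a trained network amounts to computationally. The layer sizes and the random weights, which stand in for a trained network's connection matrices, are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((784, 256))   # input-to-hidden connection matrix (illustrative size)
W2 = rng.standard_normal((256, 10))    # hidden-to-output connection matrix

def forward(x_batch):
    """One forward pass: two matrix multiplications with a nonlinearity between them."""
    hidden = np.maximum(x_batch @ W1, 0.0)   # matrix product, then ReLU
    return hidden @ W2                       # matrix product: class scores

batch = rng.standard_normal((32, 784))       # a batch of 32 inputs
scores = forward(batch)                      # shape (32, 10)

Every input batch flows through the network as a chain of matrix products, and a GPU evaluates those products in parallel, which is exactly the workload graphics hardware was built for.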
Q: These networks are now used for critical functions such as medical diagnosis or crowd surveillance to detect possible terrorists. Some military strategists talk about using them as automatic fire-control systems. How can we trust the networks?

This is a hot question. We know that a network is quite reliable when its inputs come from its training set. But these critical systems will have inputs corresponding to new, often unanticipated situations. There are numerous examples where a network gives poor responses for untrained inputs. This is called the "fragility" problem. How can we know that the network's response will not cause a disaster or catastrophe?

The answer is we cannot. The "programs" learned by neural networks are, in effect, enormous, incomprehensible matrices of weights connecting millions of neurons. There is no more hope of figuring out what these weights mean, or how they will respond to a new input, than of finding a person's motivations by scanning the brain's connections. We have no means to "explain" why a medical network reached a particular conclusion. At this point, we can only experiment with the network to see how it performs for untrained inputs, building our trust slowly and deliberately.
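That experiment can be sketched concretely. Below is a minimal probe in Python, assuming scikit-learn; the dataset, the network architecture, and the noise inputs are illustrative choices.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Train a small network on handwritten digits, then check it on held-out
# data drawn from the same distribution as its training set.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
net.fit(X_train, y_train)
print("accuracy on held-out, in-distribution inputs:", net.score(X_test, y_test))

# Probe with "untrained" inputs: pure noise in the same 8x8 pixel format.
rng = np.random.default_rng(0)
noise = rng.uniform(0, 16, size=(5, 64))   # digit pixel values range from 0 to 16
print("confidence on noise:", net.predict_proba(noise).max(axis=1))  # frequently high

The mismatch is the point: the network can score well on data from its training distribution while still assigning high confidence to inputs unlike anything it has seen, which is the fragility described above.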
Q: Computers were in the news 20 years ago for beating grandmaster chess players, and today for beating the world's Go champion. Do these advances signal a time when machines are more intelligent than humans?