Action discovery. An example of learning through interaction with the environment, used to demonstrate autonomous action discovery,26 was having BabyX learn to play the classic video game “Pong” (see Figure 7). We connected motor neurons in BabyX to the
bat controls and overlaid the visual
output of the game on the camera’s
input. Motor babbling causes the virtual infant to inadvertently move the
bat, much like a baby might flail its
arms about. Trajectories of the ball
are learned as spatiotemporal patterns on neural network maps. If the
bat hits the ball, a rewarding reaction
results, reinforcing the association
of the current motor state with the
trajectory. This association further
results in the bat being moved in anticipation of where the ball is going.
Without further modification to the
model, it is possible for the user to
actively encourage BabyX’s choices
(releasing virtual dopamine), providing a nice example of “naturally supervised” reinforcement learning.
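The learning loop described above — motor babbling, association of motor state with the ball's trajectory, and reward-driven reinforcement — can be sketched in simplified form. The following is a minimal, hypothetical Python sketch, not the actual BabyX neural model; the state labels, the `choose_action`/`reinforce` helpers, and the toy reward rule are all illustrative assumptions:

```python
import random
from collections import defaultdict

random.seed(0)

# Association strengths between an observed ball state and a motor action,
# strengthened by a virtual "dopamine" reward when the bat hits the ball.
# (An illustrative stand-in for BabyX's neural-map associations.)
associations = defaultdict(float)

ACTIONS = (-1, 0, +1)  # move bat down, hold, move bat up

def choose_action(ball_state, babble_rate=0.3):
    """Mostly exploit learned associations; otherwise 'motor babble'."""
    learned = [associations[(ball_state, a)] for a in ACTIONS]
    if random.random() < babble_rate or not any(learned):
        return random.choice(ACTIONS)  # inadvertent flailing
    return max(ACTIONS, key=lambda a: associations[(ball_state, a)])

def reinforce(ball_state, action, reward):
    """A rewarding outcome strengthens the state-action association."""
    associations[(ball_state, action)] += reward

# Toy training loop: the "environment" rewards moving the bat toward
# the ball ('above' -> +1, 'below' -> -1), standing in for a hit.
for _ in range(500):
    state = random.choice(("above", "below"))
    action = choose_action(state)
    hit = (state == "above" and action == +1) or \
          (state == "below" and action == -1)
    reinforce(state, action, 1.0 if hit else 0.0)

# After training, the learned associations move the bat in anticipation
# of where the ball is going.
print(choose_action("above", babble_rate=0.0))  # -> 1
print(choose_action("below", babble_rate=0.0))  # -> -1
```

A user's encouragement would correspond to adding an extra external term to the `reward` argument, which is what makes the loop "naturally supervised" rather than purely intrinsic.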
These examples show BabyX learning through interaction with a human user and the shared environment. While basic, these instances of intrinsic action discovery, association, and reinforcement learning (unsupervised and “naturally” supervised) are fundamental to developing generalized autonomous learning systems.
Observations. As interaction is central to these phenomena, we have demonstrated and tested BabyX in several public forums, where we have observed emotion extending from BabyX into a shared experience, with a “passive” audience reacting as vicarious participants. Audience behavior is absorbed
into the simulation and is not apart
from it, making the experience differ-
ent from a film, game, or pre-rendered
simulation. The demonstrator elicits
behavior from BabyX through visual
and vocal activity and tries to direct her
attention. Affective expressions and
voice stimulate reward and affective
circuits. If BabyX is abandoned, de-
pending on oxytocin levels, her stress
system can activate a cascade of virtual
hormones, and she becomes increas-
ingly distressed. BabyX can be trained
to recognize certain images that can be
associated with vocalizations. When
the demonstrator gains BabyX's attention and shows her a “First Words Book,” if an image causes a strong enough activation, the image will trigger BabyX to voice an associated word.
Observing in a real, unscripted environment, people anticipate and seek emotional responses from BabyX. As such engagement happens, they are often transformed from observers into engaged participants. An example of this was seen at the 2015 SIGGRAPH conference, where Sagar31 demonstrated BabyX. While the audience was mainly informed professionals, their reaction to BabyX was audible and visceral. Responses repeatedly observed included a sharp negative reaction when the demonstrator offered to demonstrate the pain response, as if someone was about to “hurt” the baby. There was no sense that the audience rationally thought the baby was real, though it immediately reacted as if she were. Even within a formal academic presentation, and with the pain response a valid part of any brain model, the audience reacted as if the demonstrator was about to be cruel. Interestingly, this was followed by an emotional display of relief in the form of laughter. As laughter is infectious, the demonstrator laughed, which was registered by BabyX's sensory inputs, causing her to be “happier.” The audience thus became a part of the feedback loop that changed both parties' emotional states. The implication is that a witness to a BabyX session is to
Figure 6. BabyX (version 3). Screenshot of a sensorimotor online learning session in which multiple inputs and outputs of the model can be viewed simultaneously, including scrolling displays, spike rasters, plasticity, activity of specific neurons, camera input, and animated output.
Figure 7. BabyX (version 3). Learning to play the video game “Pong” through action discovery
and online reinforcement learning.