typically used for real-time rendering, as in Jimenez.15
Deformation. Achieving coherent movement of the skin is especially challenging due to the complex deformations involved in broad expressions and the highly non-linear motion of facial tissue. Many computer-generated faces in games and films do not address these characteristics; for example, the lips on a character may move while the surrounding areas of the face remain static, causing an unnatural effect. Unlike skeletal muscles, facial muscles are embedded in the mobile facial tissue, meaning facial muscle activation must be treated as a system. Arguably the most coherent and generally useful way to drive facial animation is through parameterization of individual muscle activity (such as in Ekman and Friesen's Facial Action Coding System10).
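For concreteness, a FACS-style parameterization encodes an expression as activation levels on individual action units rather than ad hoc per-region mesh edits. The action-unit numbers and muscle names below follow Ekman and Friesen; the activation values and dictionary layout are purely illustrative:

    # An expression as a vector of FACS action-unit (AU) activations.
    # AU numbering follows Ekman and Friesen; values are illustrative.
    HAPPINESS = {
        "AU6": 0.7,    # cheek raiser (orbicularis oculi)
        "AU12": 0.9,   # lip corner puller (zygomaticus major)
    }
    SURPRISE = {
        "AU1": 0.8,    # inner brow raiser
        "AU2": 0.8,    # outer brow raiser
        "AU5": 0.6,    # upper lid raiser
        "AU26": 0.5,   # jaw drop
    }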
The facial deformations used in animatable faces are typically represented through deforming geometry using weighted joints or weighted shape combination ("blendshape") methods.25 While effective, these methods can suffer from combinatorial explosion in representing the complex range of facial expressions. The highest-quality models used in the visual-effects industry incorporate a large number of blendshapes to form linear approximations to non-linear deformations. Creating these models is labor intensive, so a number of researchers have approached the problem using flesh simulations.34,37,40
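As a minimal sketch of the weighted shape combination such rigs use (assuming plain NumPy arrays; the names and array layout are hypothetical, not any production rig's format):

    import numpy as np

    def blend(neutral, deltas, weights):
        """Weighted blendshape combination: the deformed mesh is the
        neutral mesh plus a weighted sum of per-shape vertex offsets."""
        # neutral: (V, 3) vertex positions of the neutral face
        # deltas:  (S, V, 3) per-blendshape offsets from neutral
        # weights: (S,) activations in [0, 1], e.g. driven by AU intensities
        return neutral + np.tensordot(weights, deltas, axes=1)

In practice the shape count S grows quickly because combinations of expressions typically need dedicated corrective shapes, which is one source of the combinatorial explosion noted above.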
Facial Motor System
To design an autonomous digital facial system, it is important to understand how faces are controlled. Traveling inward from the facial nerves, we reach the facial nucleus in the brainstem. The facial nucleus receives its main inputs from both subcortical and cortical areas through different pathways. A person's emotional and voluntary facial expressions appear to arise from distinct neural circuits.8,13
The implication is that voluntary expression cannot access a genuine emotional motor pattern, which is why it is not possible to fully produce a genuine emotional expression through volition. Similarly, stroke patients with damage to certain primary motor and pre-motor areas cannot produce a symmetrical voluntary smile yet can smile normally in response to jokes.13

Expressions are generated by neural patterns in both the subcortical and cortical regions. In the subcortical area, circuits include those for laughing and crying. Evidence suggests certain basic emotional expressions like these do not have to be learned. In comparison, voluntary facial movements (such as those involved in speech and culture-specific expressions) are learned through experience and rely predominantly on cortical motor control.

Our psychobiological facial framework aims to reflect that facial expressions consist of both innate and learned elements and are driven by quite independent brain-region simulations.

Building a Holistic Model

The human face mirrors both the brain and the body, revealing mental state (such as through mental attention in eye direction) and physiological state (such as through the position of the eyelids and the color of the skin). The dynamic behavior of the face emerges from many systems interacting on multiple levels, from high-level social interaction to low-level biology.

To drive a biologically based, lifelike autonomous character, one would need to model multiple aspects of a nervous system. Depending on the level of implementation, a non-exhaustive list includes models of the sensory and motor systems, reflexes, perception, emotion and modulatory systems, attention, learning and memory, rewards, decision making, and goals. We seek to define an architecture that is able to interconnect all of these models as a virtual nervous system.

Several biologically inspired cognitive architectures have been developed; see Goertzel et al.11 for a survey. Most are non-graphical, focusing on cognition over affect or physiological states. It stands to reason that the more biologically based and realistic an architecture is, the more likely it is ultimately to reproduce biological behavior. An example is the "Leabra" framework,23 which constructs low-level biologically based neural network models and connects them to model higher-level aspects of cognition. This modeling approach is appealing for its ability to suggest how low-level biological explanations link to high-level behavior (such as goal setting).
Building embodied nervous systems that can learn through real-time sensorimotor interaction is being explored in the field of developmental robotics.8 Social-interaction models have been explored with anthropomorphized "social robots" (such as Leonardo and Kismet4). Developmental robotics, in particular, seeks to explore the theory of embodied cognition: how the mind develops through real-time sensorimotor interaction.
Our approach to building live interactive virtual agents takes a similar direction, whereby we embody, through realistic computer graphics, a biologically based model of behavior. We ground experience through interaction and place particular emphasis on face-to-face interaction, which is difficult to achieve in robotics due to mechanical constraints. The result is a system that can be reduced to more biological detail as well as expanded to incorporate higher-level complex systems. As there are many competing theories of how different brain and behavioral systems function, we opt for flexibility and develop a "system to build systems" in a Lego-like manner.
Brain Language
BL28 is a modular simulation framework we have been developing for the past five years to integrate neural networks with real-time computer graphics and sensing. It is designed for maximum flexibility and can be connected with other architectures through a simple API. It consists of a library of time-stepping modules and connectors and supports a wide range of computational neuroscience models, as in Trappenberg.38 Models supported by BL range from simple leaky integrators to spiking neurons, mean-field models, and self-organizing maps. These can be interconnected to form larger neural networks (such as recurrent networks and convolutional networks like those used in deep learning), as sketched below.
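As an illustration only (the article does not give BL's API, so every name and signature here is hypothetical), a time-stepping leaky-integrator module with simple connectors might look like this:

    import numpy as np

    class LeakyIntegrator:
        """A time-stepping module: tau * du/dt = -u + summed weighted inputs."""
        def __init__(self, size, tau=0.1):
            self.u = np.zeros(size)   # internal state vector
            self.tau = tau
            self.links = []           # connectors: (source module, weight matrix)

        def connect(self, source, weights):
            # A connector routes another module's output into this module.
            self.links.append((source, np.asarray(weights, dtype=float)))

        def output(self):
            return np.tanh(self.u)    # firing-rate nonlinearity

        def step(self, dt):
            drive = sum(w @ src.output() for src, w in self.links)
            self.u += dt * (-self.u + drive) / self.tau

    # Two modules wired into a small recurrent pair and stepped in time.
    a, b = LeakyIntegrator(4), LeakyIntegrator(4)
    a.connect(b, np.eye(4))
    b.connect(a, 0.5 * np.eye(4))
    a.u = np.random.randn(4)          # perturb one module so dynamics unfold
    for _ in range(1000):
        a.step(0.01)
        b.step(0.01)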
Our main interest is in online learning, in which the network learns during live interaction from both spatial and temporal data. A key strength of BL is its tight integration with computer graphics as a visualization tool. Complex dynamic