these measurement approaches have
not been characterized, or simply fail
in natural settings. For example, facial
expression recognition may be reliable
for videos with simple behaviors and
when the face is frontal to the camera,
but, in the case of out-of-plane head
rotation and co-occurring facial actions, recognition can perform poorly.
Physiological sensing approaches are
seriously hampered during physical activities. As machine learning and affective computing research advance, objective measurement techniques will
improve. In the meantime, practical
systems can still be deployed based on
automated facial and speech analysis.
However, designers need to take these
limitations into account.
One challenge with real-world systems that respond to emotions is that
expressions of emotion are often very
subtle or sparse. This may mean that
it is challenging to develop automated
detection systems with high recall (that
is, the fraction of emotion responses
detected) and low false positive (alarm)
rates. In social interactions, many nonverbal behaviors (for example, smiling)
will be more frequent than when people
are alone. Thus, it may be more practical to design systems that respond to
both social and emotional cues.
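To make the recall/false-positive trade-off concrete, here is a minimal sketch (the data and function name are hypothetical, not from the article) that scores a binary emotion-response detector over a sequence of time windows:

```python
# Illustrative sketch (hypothetical data): recall and false-positive rate
# for a binary emotion-response detector evaluated per time window.

def recall_and_fpr(truth, predicted):
    """truth/predicted are equal-length lists of 0/1 flags per window."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return recall, fpr

# Sparse expressions: only 2 of 10 windows contain a true emotion response,
# so a single miss halves recall while one spurious firing inflates alarms.
truth     = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
predicted = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(recall_and_fpr(truth, predicted))  # (0.5, 0.125)
```

The example shows why sparsity hurts: with so few true events, each missed response costs a large fraction of recall.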
The sparsity and lack of specificity
within unimodal cues (that is, a facial
expression) are key reasons why multimodal affective computing systems
have been found to be consistently better than unimodal ones. 10 In some settings (for example, call center analysis)
the availability of visual cues might be
limited. In others, various modalities
might not be available. The most effective
systems will be those that leverage the
most information, both about the individual and the context she is in.
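One common way to leverage several modalities is weighted late fusion of per-modality confidence scores, which also degrades gracefully when a modality is unavailable. A minimal sketch, with made-up scores and weights:

```python
# Illustrative late-fusion sketch (all scores and weights are hypothetical).
# Each modality emits a confidence in [0, 1] that an emotional response
# occurred; missing modalities (None) are dropped and the weights are
# renormalized over whatever remains.

def fuse(scores, weights):
    available = [(s, w) for s, w in zip(scores, weights) if s is not None]
    if not available:
        return None  # no modality observed at all
    total = sum(w for _, w in available)
    return sum(s * w for s, w in available) / total

# face, voice, physiology -- voice unavailable (e.g., a silent video)
print(fuse([0.8, None, 0.4], [0.5, 0.3, 0.2]))  # ~0.686
```

Renormalizing over the available modalities is one simple design choice; learned fusion models are another, at the cost of more training data.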
Large interpersonal variability exists
in nonverbal behaviors. Thus, person-specific models can bring many benefits. Longitudinal studies are needed
for this type of modeling. To date, such
studies have been few and far between.
We need to design new mechanisms
for incentivizing individuals to interact
with a system or to be passively monitored for extended periods of time. Ultimately, the most successful affective
computing technology will be able to
build personalized models that leverage online learning to update over time.
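For illustration only, a person-specific model of this kind can be sketched as a running baseline updated online as observations from one individual arrive; the smoothing factor and the smile-intensity feature are hypothetical choices, not recommendations from the article:

```python
# Illustrative sketch: a person-specific baseline maintained with an
# exponential moving average, so features can be normalized online as
# more data from the same individual arrives.

class PersonalBaseline:
    def __init__(self, alpha=0.1):
        self.alpha = alpha  # made-up smoothing factor
        self.mean = None

    def update(self, x):
        """Fold a new observation into the running baseline."""
        self.mean = x if self.mean is None else (
            (1 - self.alpha) * self.mean + self.alpha * x)

    def normalize(self, x):
        """Deviation of x from this person's learned baseline."""
        return 0.0 if self.mean is None else x - self.mean

b = PersonalBaseline()
for smile_intensity in [0.2, 0.3, 0.25]:
    b.update(smile_intensity)
print(b.normalize(0.9))  # large positive deviation: unusually expressive
```

The same observation would yield a small deviation for a habitually expressive person, which is the point of person-specific modeling.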
Emotion Labels
One of the most significant choices in
designing an affective computing system is how to represent or classify emotional states. Emotion theorists have
long debated the exact definition of
emotion, and many models and taxonomies of emotion have been proposed.
Common approaches include discrete,
dimensional, and cognitive-appraisal
models; other approaches include rational, communicative, and anatomic
representations of affect. 22
Discrete models. Discrete categorizations of emotion posit that there are
“affect” programs that drive a set of
core basic emotions and the associated
cognitive, physiological, and behavioral processes. 39 Several categorizations have been proposed, but
by far the most commonly used set is
the so-called “basic” list of emotions of
anger, fear, sadness, disgust, surprise,
and joy. These states can be represented
as regions within a dimensional space.
In practice, the challenge with discrete
models of emotion arises from the state
definitions. Even “basic” states do not
occur frequently in many situations.
Designers must a priori consider which
states might be relevant and/or commonly observed in their context.
Dimensional models. The most
commonly used dimensional model
of affect is the circumplex—a circular,
two-dimensional space in which points
close to one another are highly correlated. Valence (pleasantness) and arousal
(activation) are the most frequently
selected descriptions used for the two
axes of the circumplex; however, the
appropriate principal axes are still debated. Another model uses “Positive
Affect” (PA) and “Negative Affect” (NA)
each with an activation component.
Dimensional models are appealing,
as they do not confine the output to a
specific label but can be interpreted in
more continuous ways. For example, in
some applications, none of the “basic”
emotion labels may apply to an observed emotional response, but that response will still lie somewhere within
the dimensional space. Nevertheless, a
designer will still need to carefully consider which axes are most appropriate
for their use case.
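The idea that discrete states occupy regions of the dimensional space can be illustrated with a small sketch; the (valence, arousal) coordinates below are rough hypothetical placements, not values from the article or any validated instrument:

```python
# Illustrative sketch: discrete emotions as regions of the valence/arousal
# circumplex. Coordinates are hypothetical placements for demonstration.
import math

PROTOTYPES = {
    "joy":      ( 0.8,  0.5),
    "anger":    (-0.6,  0.8),
    "fear":     (-0.7,  0.7),
    "sadness":  (-0.7, -0.5),
    "disgust":  (-0.6,  0.2),
    "surprise": ( 0.1,  0.9),
}

def nearest_label(valence, arousal):
    """Map a (valence, arousal) point to the closest discrete prototype."""
    return min(PROTOTYPES,
               key=lambda e: math.dist((valence, arousal), PROTOTYPES[e]))

print(nearest_label(0.6, 0.4))     # joy
print(nearest_label(-0.65, -0.4))  # sadness
```

A point far from every prototype still receives continuous (valence, arousal) coordinates, which is the flexibility dimensional models offer over a forced discrete label.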
Appraisal models. Cognitive-appraisal models consider the influence of a person’s evaluations on their emotions. Specifically, emotions are elicited and differentiated based on a person’s evaluation of a stimulus (that is, an event or object). In this case, a person’s appraisal of a situation will affect their emotional response to a stimulus. People in different contexts experiencing the same stimulus will not necessarily experience the same emotion.
Appraisal models employ a more formalized approach to context. This is very
important, given that only a very small
number of behaviors are universally
interpretable (and even those theories
have been vigorously debated). It is likely that a hybrid dimensional-appraisal
model will be the most useful approach.
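As a toy illustration of the appraisal idea, the rules below (hypothetical simplifications, loosely inspired by appraisal dimensions such as goal congruence, agency, and certainty) map the same stimulus to different emotions under different appraisals:

```python
# Toy appraisal sketch: hypothetical rules, not a validated model.
# The same stimulus yields different emotions under different appraisals.

def appraise(goal_congruent, caused_by_other, certain):
    if goal_congruent:
        return "joy" if certain else "hope"
    # goal-incongruent outcomes
    if caused_by_other:
        return "anger"
    return "sadness" if certain else "fear"

# Same stimulus (say, a flight cancellation), different contexts:
print(appraise(goal_congruent=False, caused_by_other=True,  certain=True))   # anger
print(appraise(goal_congruent=False, caused_by_other=False, certain=False))  # fear
print(appraise(goal_congruent=True,  caused_by_other=False, certain=True))   # joy
```

A hybrid dimensional-appraisal system might use rules like these to modulate where a response falls in the valence/arousal space given the context.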
Although academics have experimented extensively with computational models of emotion, there are
no commercially available software
tools for recognizing emotion (either
from verbal or nonverbal modalities)
that use an appraisal-based model of
emotion. Incorporating context and
personalization into assessment of
the emotional state of an individual
is arguably the next big technical and
design challenge for commercial software systems that wish to recognize the
emotion of a user.
Emotional Agents
Several articles have been written on
the benefits of conversational agents
for more naturalistic human-computer
interactions. 7, 8 This research movement partly came from a belief that
traditional WIMP (windows, icons,
mouse, and pointer) user interfaces
were too difficult to navigate and learn 14
and not natural enough. Here, we focus
on the addition of emotional sentience
to the agent to explore what additional
benefits might be achieved with the addition of intelligent affect sensing and
appropriate agent-based responses.
Dialogue systems. The first examples of affective agents were dialogue
systems. In the 1960s, Eliza was an
agent capable of limited natural language understanding 37 that simulated
a psychotherapist. Recently, chat systems have become popular and are being used in many forms, from mental
health therapy to customer support.
The practical application of these dialogue systems has been made possible
by advancements in natural language
processing (NLP). The barrier to create