• Stills from a video of three completed speaking chairs in action.
See more at the Six Speaking Chairs website:
http://imd.dundee.ac.uk/sixspeakingchairs/video.html/
[3] Schlosberg, H. “Three Dimensions of Emotion.” The Psychological
Review 61, 2 (1954): 81–88.
September + October 2010
[4] Buchenau, M. and Fulton Suri, J. “Experience Prototyping.” Paper
presented at the 3rd International Conference on Designing Interactive
Systems, August 17–19, New York (2000): 424–433.
[5] Murray, I. and Arnott, J. “Implementation and Testing of a System
for Producing Emotion-by-rule in Synthetic Speech.” Speech
Communication 16 (1995): 369–390.
interactions
TTS is found in the screen-reading software
used by many visually impaired people, in other
eyes-free interfaces (such as Apple’s iPod Shuffle),
and in automated telephone answering services.
But its most profound application is in communication devices used by people who cannot
speak. And it is here that the limitations of TTS
can be most disabling, because a lack of variation
in tone of voice can never be neutral. A lack of
expressiveness can itself send out a false message
that the person is emotionally impaired as well
as speech-impaired, or perhaps socially unsophisticated. Writing and speaking are fundamentally
different ways of conveying language, and yet
TTS treats them as if they were equivalent.
Chair No. 2, the Happy/Sad Chair, illustrates
an alternative approach. On a reclaimed wooden
dining chair, a tuning dial (from a 1950s Bush
radio) has been relabeled, the international radio
stations replaced by a two-dimensional mapping
of emotions, taken from psychological research
[3]. Inside the box, a potentiometer registers the
rotation of the tuning dial, and a separate slider
controls the degree of emotion. These inputs
drive a parametric model of prosody, using granular
synthesis and formant resynthesis in Max/MSP,
a flexible “experience prototype” standing
in for more sophisticated state-of-the-art speech
technology [4]. (Our speech technology, while
flexible and capable of a high level of nuance and
real-time control, sounds far from realistically
human. Besides being an iconic visual representation
of spoken announcements, the metal horn speakers
produce a low-fidelity sound that accentuates the
highly artificial quality of the speech. The emphasis
is not on its realism but on its expressiveness.)
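
The control path of the Happy/Sad Chair (dial position plus slider, driving prosody parameters) can be illustrated with a minimal sketch. This is not the Max/MSP patch used in the chair. The axis names (valence, arousal), the function names, the `Prosody` fields, and every numeric weight below are assumptions chosen for illustration; they do not come from the article or from the mapping in [3].

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Prosody:
    """Hypothetical prosody controls; 1.0 means 'unchanged from neutral'."""
    pitch_scale: float  # multiplier on baseline pitch
    rate_scale: float   # multiplier on speaking rate
    range_scale: float  # multiplier on pitch range


def dial_to_emotion(angle_deg: float, intensity: float) -> Tuple[float, float]:
    """Map dial rotation and slider position to a point in a 2-D emotion plane.

    angle_deg: dial rotation in degrees (0 is taken as 'happy' in this sketch).
    intensity: slider position, clamped to [0, 1]; 0 is neutral.
    Returns (valence, arousal), each in [-1, 1]. The axis names are assumed.
    """
    intensity = min(max(intensity, 0.0), 1.0)
    theta = math.radians(angle_deg)
    return intensity * math.cos(theta), intensity * math.sin(theta)


def emotion_to_prosody(valence: float, arousal: float) -> Prosody:
    """Turn an emotion point into prosody parameters.

    The linear weights are illustrative placeholders, not measured values.
    """
    return Prosody(
        pitch_scale=1.0 + 0.2 * valence + 0.1 * arousal,
        rate_scale=1.0 + 0.3 * arousal,
        range_scale=1.0 + 0.4 * valence,
    )


if __name__ == "__main__":
    # Dial turned halfway round (assumed 'sad'), slider at half strength.
    valence, arousal = dial_to_emotion(180.0, 0.5)
    print(emotion_to_prosody(valence, arousal))
```

With the slider at zero, every parameter stays at 1.0 whatever the dial position, which mirrors the idea of a separate control for the degree of emotion.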