Speech Technology with Subtle Tones of Voice
This project explores more expressive interaction with speech technology, specifically
interaction with the tone of voice of synthesized
speech. We hesitate to use the term “emotional”
speech synthesis because, as we will explain, we
are far more interested in the complex nuances
of everyday speech than basic emotions such as
sadness and fear.
All photographs by Graham Pullin and Andrew Cook
As interaction designers, our focus is not on
how to produce different tones of voice with
speech technology itself. (We know there are researchers
with more expertise looking into this, and we are
collaborating with a world-leading research center
on another project.) Instead, we are exploring the
implications for a user interface: How might someone who is not a speech technologist conceive of
tone of voice in the first place, and therefore select
or control it? This is a challenging question.
A Story in Six Chairs
Without any further preamble, we will unfold the
story of the project through the objects themselves, introducing the background, rationale, and
inspiration along the way...
Chair No. 1. The Exclaiming/Questioning Chair,
the first chair in the collection, is a reclaimed
wooden kitchen chair; a plain charcoal-gray box
that extends to one side has been fitted beneath the
seat. Set into the top surface of this box are three
keys from a computer keyboard, marked with a
period (full stop), an exclamation point, and a question
mark. An old-fashioned metal horn loudspeaker
projects from the front edge of the box. While
sitting in the chair, if you press the period key, the
loudspeaker emits the word “yes” in a level tone.
Press the “?” key and “yes” is delivered with a rising,
questioning intonation. Pressing the exclamation-point
key elicits a louder, more emphatic delivery.
[2] Crystal, D. The English Tone of Voice: Essays in Intonation, Prosody and Paralanguage. Edward Arnold, 1975.