illustrate how medical systems and
rehabilitative procedures promise to
provide a rich environment for the potential exploitation of hand-gesture
systems. Still, additional research and
evaluation procedures are needed to
encourage system adoption.
Entertainment. Computer games
are a particularly technologically
promising and commercially rewarding arena for innovative interfaces due
to the entertaining nature of the interaction. Users are eager to try new interface paradigms since they are likely
immersed in a challenging game-like
environment. 45 In a multi-touch device, control is delivered through the
user’s fingertips. Which finger touches
the screen is irrelevant; most important is where the touch is made and the
number of fingers used.
In computer-vision-based, hand-gesture-controlled games, 13 the system must respond quickly to user
gestures, the “fast-response” requirement. In games, computer-vision algorithms must be robust and efficient,
as opposed to applications (such as
inspection systems) with no real-time
requirement, and where recognition
performance is the highest priority.
Research efforts should thus focus on
tracking and gesture/posture recognition with high-frame-rate image processing (> 10 fps).
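The frame-rate threshold above implies a per-frame time budget: at 10 fps, tracking and recognition together must finish within 100 ms per frame. The following minimal Python sketch shows that arithmetic. The function names and the sample stage timings are illustrative assumptions and are not taken from any of the cited systems.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Return the time available per frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def meets_frame_rate(stage_times_ms, target_fps=10.0):
    """Check whether the summed per-frame stage times fit inside the budget."""
    return sum(stage_times_ms) < frame_budget_ms(target_fps)


# Hypothetical per-frame costs (ms): capture, hand tracking, gesture classification.
stages = [8.0, 35.0, 40.0]
print(frame_budget_ms(10.0))     # 100.0
print(meets_frame_rate(stages))  # True, since 83 ms < 100 ms
```

A pipeline whose stages sum to more than the budget would need a cheaper tracker or classifier, or a lower image resolution, to stay above the target rate.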
Another challenge is “gesture spotting and immersion syndrome,” where the system must distinguish useful gestures
from unintentional movement. One
approach is to select a particular gesture to mark the “start” of a sequence
of gestures, as in the “push to talk” approach in radio-based communication
where users press a button to start talking. In touchscreen mobile phones,
the user performs a “swipe” gesture to
start operating the device. To “end”
the interaction, the user may perform
the “ending” gesture or just “rest” the
hands on the side of the body. This
multi-gesture routine may be preferable to purely gaze-based interaction
where signaling the end of the interaction is a difficult problem, since users
cannot turn off their eyes. The problem
of discriminating between intentional
gestures and unintentional movement
is also known as the Midas Touch problem (http://www.diku.dk/hjemmesider/ansatte/panic/eyegaze/node27.html).
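The start/end marking scheme described above can be sketched as a two-state filter over a stream of already-recognized gesture labels. This is a minimal illustration under stated assumptions, not the method of any cited system: the class name, the label strings ("start", "end", "rest"), and the choice to ignore a repeated start marker while active are all hypothetical.

```python
class GestureSpotter:
    """Pass gestures through only between a start marker and an end marker."""

    def __init__(self, start_label="start", end_labels=("end", "rest")):
        self.start_label = start_label
        self.end_labels = set(end_labels)
        self.active = False

    def feed(self, label):
        """Return the label if it counts as an intentional command, else None."""
        if not self.active:
            # Idle state: everything except the start marker is ignored.
            if label == self.start_label:
                self.active = True
            return None
        if label in self.end_labels:
            # The ending gesture or resting hands closes the interaction.
            self.active = False
            return None
        if label == self.start_label:
            # Assumption: a repeated start marker while active is not a command.
            return None
        return label


def spot(labels):
    """Filter a label stream down to the intentional commands."""
    spotter = GestureSpotter()
    return [g for g in map(spotter.feed, labels) if g is not None]


stream = ["wave", "start", "left", "up", "rest", "left", "start", "down", "end"]
print(spot(stream))  # ['left', 'up', 'down']
```

The "wave" and the second "left" are discarded because they occur while the filter is idle, which is the behavior the "push to talk" analogy calls for.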
In the Mind-Warping augmented-reality fighting game, 45 where users interact with virtual opponents through
hand gestures, gesture spotting is
solved through voice recognition. The
start and end of a temporal gesture are
“marked” by voice, namely the start and end
of a Kung Fu yell. Kang et al. 21 addressed
the problem of gesture spotting in the
first-person-shooter Quake II. Such
games use contextual information like
gesture velocity and curvature to extract
meaningful gestures from a video
sequence. Bannach et al. 41 addressed
gesture spotting through a sliding
window and bottom-up approach in a
mixed-reality parking game. Schlömer
et al. 42 addressed accelerometer-based
gesture recognition for drawing and
browsing operations in a computer
game. Gesture spotting in many Nintendo
Wii games is handled by pressing
a button on the WiiMote control,
following the “push to talk” analogy.
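To make the idea of spotting from motion cues more concrete, the sketch below computes speed and a simple curvature proxy (heading change) from a sequence of 2D hand positions, then extracts runs of samples whose speed stays above a threshold. It is a rough stand-in, not the algorithm of Kang et al. or Bannach et al.; the function names, the fixed sampling interval `dt`, and the threshold values are assumptions made for illustration.

```python
import math


def speeds(points, dt):
    """Speed between consecutive (x, y) samples taken dt seconds apart."""
    return [math.dist(p, q) / dt for p, q in zip(points, points[1:])]


def turning_angles(points):
    """Absolute heading change (radians) at each interior point."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        h1 = math.atan2(b[1] - a[1], b[0] - a[0])
        h2 = math.atan2(c[1] - b[1], c[0] - b[0])
        d = abs(h2 - h1)
        angles.append(min(d, 2 * math.pi - d))  # wrap into [0, pi]
    return angles


def spot_segments(points, dt, min_speed, min_len=2):
    """Return (first, last) point-index pairs for runs moving at >= min_speed."""
    segments, start = [], None
    for i, v in enumerate(speeds(points, dt)):
        if v >= min_speed and start is None:
            start = i
        elif v < min_speed and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that is still open when the trajectory ends.
    if start is not None and len(points) - 1 - start >= min_len:
        segments.append((start, len(points) - 1))
    return segments


# Hypothetical trajectory: the hand rests, moves right, curves upward, rests again.
track = [(0, 0), (0, 0), (1, 0), (2, 0), (3, 1), (3, 1), (3, 1)]
print(spot_segments(track, dt=0.1, min_speed=5.0))  # [(1, 4)]
```

A real system would typically combine such cues with a classifier, since fast unintentional movement also exceeds a simple speed threshold.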
Intuitiveness is another important
requirement in entertainment systems.
In the commercial arena, most
Nintendo Wii games are designed
to mimic actual human motions in
sports games (such as golf, tennis, and
bowling). Wii games easily meet the
requirement of “intuitiveness,” even
as they violate the “come as you are”
requirement, since users must hold
the WiiMote, instead of using a bare
hand. Sony’s EyeToy for the PlayStation
and the Kinect sensor for Microsoft’s
Xbox 360 overcome this limitation
while achieving the same level of immersion
through natural gestures for
interaction. These interfaces use hand-body
gesture recognition (and, in Kinect, voice
recognition) to augment the
gaming experience.
In the research arena, the intuitive
aspect of hand-gesture vocabulary is
addressed in a children’s action game
called QuiQui’s Giant Bounce, 20 where
control gestures are selected through
a “Wizard of Oz” paradigm in which a
player interacts with a computer application
controlled by an unseen subject,
with five full-body gestures detected
through a low-cost USB Web camera.
“User adaptability and feedback”
is the most remarkable requirement
addressed in these applications. In
entertainment systems, users profit
from having to learn the gesture vocabularies employed by the games. A
For sign languages (such as American Sign Language), hand-gesture-recognition systems must be able to recognize a large lexicon of single-handed and two-handed gestures.