training session is usually required to
teach them how the gestures should
be performed, including speed, trajectory, finger configuration, and body
posture. While beginners need time to
learn the gesture-related functions, experienced users navigate through the
games at least as quickly as if they were
using a mechanical-control device or
attached sensors. 8, 37
Intelligent user interfaces that rely
on hand/body gesture technology face
special challenges that must be addressed before future commercial systems can gain popularity. Aside
from technical obstacles like reliability, speed, and low-cost implementation, hand-gesture interaction must
also address intuitiveness and gesture
spotting.
Crisis management and disaster
relief. Command-and-control systems
help manage public response to natural disasters (such as tornados, floods,
wildfires, and epidemic diseases) and
to human-caused disasters (such as
terrorist attacks and toxic spills). In such systems, emergency response must
be planned and coordinated by teams
of experts with access to large volumes
of complex data, in most cases through
traditional human-computer interfaces. One such system, the “Command
Post of the Future,” 47 uses pen-based
gestures. 11 Such hand-gesture interface
systems must reflect the requirements
of “fast learning,” “intuitiveness,”
“lexicon size and number of hands,”
and “interaction space” to achieve satisfactory performance. 26 The first two
involve natural interaction with geospatial information (gestures that are common and easy to remember); the last two
involve the system’s support of collaborative decision making among individuals. Multimodality (speech and
gesture), an additional requirement
for crisis-management systems, is not
part of our original list of requirements
since it includes modalities other than
gestures. The pioneering work was
Richard A. Bolt’s “Put-That-There” system, 6 providing multimodal voice input plus gesture to manipulate objects
on a large display.
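To make the fusion idea concrete, the following minimal Python sketch binds deictic words in a spoken command ("that," "there") to pointing gestures by temporal order, in the spirit of Put-That-There. The PointingEvent structure, the bind_deictics function, and the pairing rule are illustrative assumptions, not Bolt's implementation.

```python
from dataclasses import dataclass

@dataclass
class PointingEvent:
    """A pointing gesture resolved to display coordinates (illustrative)."""
    x: float
    y: float
    t: float  # seconds from the start of the utterance

def bind_deictics(utterance: str, pointings: list[PointingEvent]):
    """Pair deictic words ("that", "there") with pointing events in
    temporal order -- a crude stand-in for multimodal fusion."""
    deictics = [w for w in utterance.lower().split() if w in ("that", "there")]
    events = sorted(pointings, key=lambda p: p.t)
    return list(zip(deictics, [(p.x, p.y) for p in events]))

# "Put that there": the first point selects the object, the second the target.
bindings = bind_deictics("put that there",
                         [PointingEvent(120, 80, 0.4),
                          PointingEvent(560, 310, 1.1)])
print(bindings)  # [('that', (120, 80)), ('there', (560, 310))]
```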
DAVE_G, 40 a multimodal, multi-user geographical information system
(GIS), has an interface that supports
decision making based on geospatial
data to be shown on a large-screen display.
Potential users are detected as
soon as they enter the room (the “come
as you are” requirement) through a
face-detection algorithm; the detected facial region is used to build a skin-color model, which is then applied to subsequent images to track the hands and face. Motion cues are
combined with color information to
increase the robustness of the tracking
module. Spatial information is conveyed using “here” and “there” manipulative gestures that are, in turn, recognized through a hidden Markov model.
The system was extended to operate
with multiple users in the "XISM" system at Pennsylvania State University, 26 where users simultaneously interface with the GIS, allowing a realistic decision-making process; however, Krahnstoever et al. 26 provided no detail as to how the system disambiguates the tracking information of different users.
Other approaches to multi-user
hand-gesture interfaces have adopted
multi-touch control through off-the-shelf technology, 15, 31 allowing designers to focus on collaborative user
performance rather than on hand-gesture-recognition algorithms. These
systems give multiple users a rich
hand-gesture vocabulary for image
manipulation, including zoom, pan,
line drawing, and defining regions of
interest, satisfying the “lexicon size
and number of hands” requirement.
Spatial information about objects on
the GIS can be obtained by clicking
(touching) the appropriate object.
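A minimal sketch of such a multi-touch vocabulary follows; the TouchGIS class and its on_drag, on_pinch, and on_tap handlers are illustrative assumptions rather than any cited system's API. One-finger drag pans the map, a two-finger pinch zooms by the ratio of finger separations, and a tap queries the nearest map object.

```python
import math

class TouchGIS:
    """Illustrative multi-touch dispatcher for map manipulation."""

    def __init__(self):
        self.center = [0.0, 0.0]   # map coordinates at the view center
        self.scale = 1.0           # screen pixels per map unit

    def on_drag(self, dx, dy):
        # One-finger drag: pan the map opposite the finger motion.
        self.center[0] -= dx / self.scale
        self.center[1] -= dy / self.scale

    def on_pinch(self, p1, p2, q1, q2):
        # Two-finger pinch: zoom by the change in finger separation.
        before = math.dist(p1, p2)
        after = math.dist(q1, q2)
        if before > 0:
            self.scale *= after / before

    def on_tap(self, x, y, features):
        # Tap: return the map feature nearest the touched point.
        mx = self.center[0] + x / self.scale
        my = self.center[1] + y / self.scale
        return min(features, key=lambda f: math.dist((mx, my), f["pos"]),
                   default=None)

gis = TouchGIS()
gis.on_drag(30, -10)
gis.on_pinch((0, 0), (100, 0), (0, 0), (150, 0))  # zoom in 1.5x
print(gis.scale, gis.center)
```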
These applications combine collaborative hand-gesture interaction
with large visual displays. Their main
advantage is emphasizing user-to-user communication rather than human-computer interaction, so users employ their usual gestures without having to learn new vocabularies; for example, sweeping a hand across the desk can be used to clear the surface.
Human-robot interaction.
Hand-gesture recognition is critical for interaction with both fixed and mobile robots, as suggested by Kortenkamp et al. 25 Most important, gestures can be combined with
voice commands to improve robustness or provide redundancy and deal
with “gesture spotting.” Second, hand
gestures involve valuable geometric
properties for navigational robot tasks;
for example, the pointing gesture can
symbolize the “go there” command for