nized and the performance of the recognition is a well-known obstacle in
the design of gesture-based interfaces.
The more freely a system allows users
to express themselves, the less accurate it gets; conversely, the greater the
rigor in specifying gestures, the greater
the likelihood the system will perform
accurately.
A common approach to managing this trade-off is to define specific grammars or vocabularies for different contexts. Instead of maintaining a single large lexicon, the system dynamically activates different subsets of vocabularies and grammars according to the context. This partitioning reduces the complexity of gesture-recognition systems, since separate recognition algorithms can be applied to smaller gesture subsets. Context is captured in many ways, including hand position, interaction log, task, type of gesture, and how the user interacts with devices in the environment.
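This subset switching can be sketched in a few lines of code; the recognizer class, the context names, the gesture labels, and the confidence threshold below are hypothetical and intended only to illustrate how a small active lexicon replaces one large one.

    # A minimal sketch of context-dependent vocabulary activation (illustrative only).
    from dataclasses import dataclass, field

    @dataclass
    class GestureVocabulary:
        """A named subset of gestures valid in one interaction context."""
        name: str
        gestures: set = field(default_factory=set)

    class ContextualRecognizer:
        """Keeps a small active lexicon instead of one large global one."""

        def __init__(self):
            self.vocabularies = {}   # context name -> GestureVocabulary
            self.active = None       # currently active vocabulary

        def register(self, context, gestures):
            self.vocabularies[context] = GestureVocabulary(context, set(gestures))

        def switch_context(self, context):
            # Activating only the gestures relevant to the current task keeps
            # the search space (and confusion between classes) small.
            self.active = self.vocabularies[context]

        def recognize(self, candidate_label, score):
            # Placeholder decision rule: accept only labels in the active subset
            # whose classifier score clears a threshold.
            if self.active and candidate_label in self.active.gestures and score > 0.8:
                return candidate_label
            return None

    recognizer = ContextualRecognizer()
    recognizer.register("map_browsing", ["pan", "zoom_in", "zoom_out"])
    recognizer.register("menu", ["point", "select", "back"])
    recognizer.switch_context("menu")
    print(recognizer.recognize("select", 0.92))   # accepted
    print(recognizer.recognize("zoom_in", 0.95))  # rejected: not in the active context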
Methods for hand-gesture recognition. No single algorithm for hand-gesture recognition suits every application; the suitability of each approach depends on the application, domain, and physical environment. Nevertheless, integrating multiple methods lends robustness to hand-tracking algorithms; for example, when one tracker loses the hand due to occlusion, a different tracker using a different tracking paradigm can still be active.
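Such a fallback can be sketched as a simple tracker cascade; the tracker objects and their update(frame) interface below are hypothetical placeholders, not a particular library's API.

    # A minimal sketch of combining trackers so that one takes over when another
    # loses the hand (for example, under occlusion). Illustrative assumptions only.
    class TrackerCascade:
        """Query trackers in priority order and return the first confident estimate."""

        def __init__(self, trackers, min_confidence=0.5):
            self.trackers = trackers          # e.g., [color_tracker, depth_tracker]
            self.min_confidence = min_confidence

        def update(self, frame):
            for tracker in self.trackers:
                position, confidence = tracker.update(frame)
                if confidence >= self.min_confidence:
                    return position, confidence
            return None, 0.0                  # all trackers lost the hand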
Occlusion is usually disambiguated with stereo cameras that produce depth maps of the environment. Common approaches to hand-gesture tracking use color and motion cues. Human skin color is distinctive and helps distinguish the face and hands from other objects; trackers sensitive to both skin color and motion can achieve a high degree of robustness. 40
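As a concrete illustration of combining color and motion cues, the following sketch marks pixels that are both skin-colored and moving; the fixed HSV skin range and the frame-differencing threshold are simplifying assumptions, and a real tracker would adapt the color model to the user and lighting.

    # A minimal sketch of skin-color plus motion cues with OpenCV (illustrative values).
    import cv2
    import numpy as np

    # Illustrative HSV bounds for skin; these are assumptions, not a universal model.
    SKIN_LOWER = np.array([0, 40, 60], dtype=np.uint8)
    SKIN_UPPER = np.array([25, 180, 255], dtype=np.uint8)

    def hand_candidate_mask(frame_bgr, prev_gray):
        """Return (mask, gray) where mask marks pixels that are both skin-colored
        and moving, a simple way to suppress static skin-like background."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        skin = cv2.inRange(hsv, SKIN_LOWER, SKIN_UPPER)

        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        if prev_gray is None:
            motion = np.zeros_like(gray)
        else:
            diff = cv2.absdiff(gray, prev_gray)
            _, motion = cv2.threshold(diff, 15, 255, cv2.THRESH_BINARY)

        # Require both cues; morphological opening removes speckle noise.
        mask = cv2.bitwise_and(skin, motion)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        return mask, gray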
Regarding classification, gestures are the outcome of stochastic processes, so defining discrete representations for patterns of spatio-temporal gesture motion is difficult. Gesture templates can be determined by clustering gesture training sets to produce classification methods with accurate recognition performance; Kang et al. 21 described examples of such methods, including hidden Markov models, dynamic time warping, and finite state machines.
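As an illustration of one of these template-based classifiers, the sketch below matches a 2D hand trajectory against stored templates with dynamic time warping and a nearest-neighbor decision rule; the example trajectories and labels are hypothetical, and real templates would come from clustered training data.

    # A minimal sketch of gesture classification with dynamic time warping (DTW).
    import numpy as np

    def dtw_distance(a, b):
        """Classic O(len(a)*len(b)) DTW between two trajectories of (x, y) points."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(np.asarray(a[i - 1]) - np.asarray(b[j - 1]))
                cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
        return cost[n, m]

    def classify(trajectory, templates):
        """Return the label of the nearest template (nearest-neighbor decision rule)."""
        return min(templates, key=lambda label: dtw_distance(trajectory, templates[label]))

    templates = {
        "swipe_right": [(0, 0), (1, 0), (2, 0), (3, 0)],
        "circle": [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 0)],
    }
    print(classify([(0, 0), (1.1, 0.1), (2.0, -0.1), (3.2, 0.0)], templates))  # swipe_right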
Finally, Kang et al. 21 also addressed the problem of gesture spotting, using sliding windows to distinguish intentional gestures from other captured movements based on the recognition accuracy of the observed gestures.
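A gesture-spotting loop of this kind can be sketched as follows, assuming a hypothetical recognize(window) function that returns a label and a confidence score for a fixed-length segment of the motion stream; the window size, step, and threshold are illustrative.

    # A minimal sketch of sliding-window gesture spotting (illustrative parameters).
    def spot_gestures(stream, recognize, window_size=30, step=5, threshold=0.85):
        """Scan a continuous motion stream and keep only windows whose
        recognition confidence is high enough to count as intentional gestures."""
        detections = []
        for start in range(0, len(stream) - window_size + 1, step):
            window = stream[start:start + window_size]
            label, confidence = recognize(window)
            # Low-confidence windows are treated as unintentional motion and dropped.
            if confidence >= threshold:
                detections.append((start, start + window_size, label))
        return detections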
Intuitive gestures (selection and teaching) in interface design. Ideally, gestures in HCI should be intuitive and spontaneous. Psycholinguistics and the cognitive sciences have produced a significant body of work on human-to-human communication that can help identify intuitive means of interaction for HCI systems. A widely accepted solution for identifying intuitive gestures was suggested by Baudel et al.; 2 in Höysniemi et al.'s "Wizard-of-Oz" experiment, an external observer interprets the user's hand movements and simulates the system's response. 20 Called "teaching by demonstration," this approach is widely used for gesture learning. Rather than being chosen at the design stage of the interface, gestures are selected during real-time operation while interacting with the user, mimicking the process of parents teaching gestures to a toddler. 9 First, the parents show the toddler a gesture, then help the toddler
imitate the gesture by moving the toddler’s own hands. The toddler learns
the skill of producing the gesture by
focusing on his or her own active body
parts. Hand gestures play an important
role in human-human communication. Analysis of these gestures based
on experimental sociology and learning methodologies will lead to more robust, natural, intuitive interfaces.
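A teaching-by-demonstration step can be sketched as follows, reusing the template-matching idea above: the user performs a new gesture a few times, and an averaged, resampled trajectory becomes its template. The resampling length and the averaging rule are illustrative assumptions, not the procedure of any cited system.

    # A minimal sketch of adding a gesture template from runtime demonstrations.
    import numpy as np

    def resample(trajectory, n_points=16):
        """Resample a 2D trajectory to a fixed number of points by linear interpolation."""
        traj = np.asarray(trajectory, dtype=float)
        idx = np.linspace(0, len(traj) - 1, n_points)
        xs = np.interp(idx, np.arange(len(traj)), traj[:, 0])
        ys = np.interp(idx, np.arange(len(traj)), traj[:, 1])
        return np.stack([xs, ys], axis=1)

    def teach_gesture(label, demonstrations, templates):
        """Learn a new gesture template from a few user demonstrations."""
        samples = np.stack([resample(d) for d in demonstrations])
        templates[label] = samples.mean(axis=0)   # averaged demonstration as template
        return templates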
Acknowledgments
This research was performed while
the first author held a National Research Council Research Associateship Award at the Naval Postgraduate
School, Monterey, CA. It was partially
supported by the Paul Ivanier Center
for Robotics Research & Production
Management at Ben-Gurion University
of the Negev.
References
1. Bannach, D., Amft, O., Kunze, K.S., Heinz, E.A.,
Tröster, G., and Lukowicz, P. Waving real-hand
gestures recorded by wearable motion sensors
to a virtual car and driver in a mixed-reality
parking game. In Proceedings of the Second IEEE
Symposium on Computational Intelligence and
Games (Honolulu, Apr. 1–5, 2007), 32–39.
2. Baudel, T. and Beaudouin-Lafon, M. Charade:
Remote control of objects using free-hand gestures.
Commun. ACM 36, 7 (July 1993), 28–35.