contributed articles
DOI: 10.1145/1897816.1897838
Body posture and finger pointing are a natural
modality for human-machine interaction, but
first the system must know what it’s seeing.
By Juan Pablo Wachs, Mathias Kölsch,
Helman Stern, and Yael Edan
Vision-Based Hand-Gesture Applications
There is strong evidence that future human-computer interfaces will enable more natural, intuitive
communication between people and all kinds of
sensor-based devices, thus more closely resembling
human-human communication. Progress in the
field of human-computer interaction has introduced
innovative technologies that empower users to
interact with computer systems in increasingly
natural and intuitive ways; systems adopting them
show increased efficiency, speed, power, and
realism. However, users comfortable with traditional
interaction methods like mice and keyboards are
often unwilling to embrace new, alternative interfaces.
Ideally, new interface technologies should be
more accessible without requiring long periods of
learning and adaptation. They should also provide
more natural human-machine communication. As
described in Myron Krueger’s pioneering 1991 book
Artificial Reality,27 “natural interaction” means voice
and gesture. Pursuing this vision requires tools and features that mimic
the principles of human communication. Employing hand-gesture communication, such interfaces have
been studied and developed by many
researchers over the past 30 years in
multiple application areas. It is thus
worthwhile to review these efforts and
identify the requirements needed to
win general social acceptance.
Here, we describe the requirements
of hand-gesture interfaces and the
challenges in meeting the needs of various application types. System requirements vary depending on the scope of
the application; for example, an entertainment system does not need the
gesture-recognition accuracy required
of a surgical system.
We divide these applications into
four main classes—medical systems
and assistive technologies; crisis management and disaster relief; entertainment; and human-robot interaction—illustrating them through a set
of examples. For each, we present the
human factors and usability considerations needed to motivate use. Some
techniques are simple, often lacking
robustness in cluttered or dynamic
scenarios, indicating the potential for
further improvement. In each, the raw
data is real-time video streams of hand
gestures (vision-based), requiring effective methods for capturing and
processing images. (Not covered is the
literature related to voice recognition
and gaze-tracking control.)
key insights
• Gestures are useful for computer
interaction since they are the most
basic and expressive form of human
communication.
• Gesture interfaces for gaming based
on hand/body gesture technology
must be designed to achieve social and
commercial success.
• No single method for automatic hand-gesture recognition is suitable for every
application; each gesture-recognition
algorithm depends on user cultural
background, application domain, and
environment.