Gesture recognition

From Wikipedia, the free encyclopedia

Max Roberts being sensed by a simple gesture recognition algorithm detecting hand location and movement

Gesture recognition is a topic in computer science with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Many approaches use cameras and computer vision algorithms to interpret sign language. However, posture and proxemics can also be subjects of gesture recognition.[1]

Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even graphical user interfaces (GUIs), which still limit the majority of input to keyboard and mouse.

Gesture recognition enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. For example, a user can point a finger at the computer screen and the cursor will move accordingly. This could potentially make conventional input devices such as mice, keyboards and even touch screens redundant.
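
The finger-to-cursor idea can be illustrated with a minimal sketch. It assumes that some tracker (not shown, and not specified by this article) already reports the fingertip as normalized coordinates between 0 and 1; the function name make_cursor_mapper and the smoothing factor alpha are hypothetical choices for illustration only.

```python
def make_cursor_mapper(screen_w, screen_h, alpha=0.5):
    """Return a function that maps normalized fingertip positions to pixels.

    Assumption: (nx, ny) come from a hypothetical hand tracker and lie
    roughly in [0, 1], with (0, 0) at the top-left corner of the screen.
    """
    state = {"x": None, "y": None}

    def update(nx, ny):
        # Clamp to the screen so a hand drifting out of view stays on an edge.
        nx = min(max(nx, 0.0), 1.0)
        ny = min(max(ny, 0.0), 1.0)
        target_x = nx * (screen_w - 1)
        target_y = ny * (screen_h - 1)
        if state["x"] is None:
            # First observation: jump straight to the target.
            state["x"], state["y"] = target_x, target_y
        else:
            # Exponential smoothing reduces jitter from noisy tracking.
            state["x"] += alpha * (target_x - state["x"])
            state["y"] += alpha * (target_y - state["y"])
        return round(state["x"]), round(state["y"])

    return update


mapper = make_cursor_mapper(101, 101, alpha=0.5)
first = mapper(0.0, 0.0)
second = mapper(1.0, 1.0)
third = mapper(1.0, 1.0)
```

A smaller alpha gives a steadier but slower cursor; a value of 1.0 disables smoothing entirely.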

Gesture recognition can be conducted with techniques from computer vision and image processing.

Often the term gesture interaction is used to refer to inking or mouse gesture interaction, which is computer interaction through the drawing of symbols with a pointing device cursor. Strictly speaking, the term mouse strokes should be used instead of mouse gestures, since drawing a symbol implies written communication: making a mark to represent a symbol.
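
As a rough illustration of how a drawn stroke might be turned into a symbol, the sketch below reduces a pointer path to a short string of direction letters and looks it up in a table. This is one simple approach among many, not a method described by the sources; the names stroke_directions, recognize and GESTURES, the jitter threshold, and the command table are all hypothetical.

```python
import math


def stroke_directions(points, min_dist=10.0):
    """Reduce a pointer path to direction letters (U, D, L, R).

    Assumption: points are (x, y) screen pixels with y growing downward.
    """
    if not points:
        return ""
    directions = []
    last_x, last_y = points[0]
    for x, y in points[1:]:
        dx, dy = x - last_x, y - last_y
        if math.hypot(dx, dy) < min_dist:
            continue  # ignore small jitter until the pointer has moved enough
        if abs(dx) >= abs(dy):
            letter = "R" if dx > 0 else "L"
        else:
            letter = "D" if dy > 0 else "U"
        # Collapse repeated letters so "DDRR" becomes "DR".
        if not directions or directions[-1] != letter:
            directions.append(letter)
        last_x, last_y = x, y
    return "".join(directions)


# Hypothetical mapping from stroke shapes to commands.
GESTURES = {"R": "forward", "L": "back", "DR": "close tab"}


def recognize(points):
    """Return the command for a stroke, or None if it is not recognized."""
    return GESTURES.get(stroke_directions(points))


l_shape = [(0, 0), (0, 30), (0, 60), (40, 60), (80, 60)]
command = recognize(l_shape)
```

Here the path moves down and then right, so it is encoded as "DR" and mapped to the hypothetical "close tab" command.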

Uses of Gesture Recognition

Gesture recognition is useful for processing information from humans that is not conveyed through speech or typing. Computers can identify various types of gestures, which leads to uses such as the following:

  • Sign language recognition. Just as speech recognition can transcribe speech to text, certain types of gesture recognition software can transcribe the symbols represented through sign language into text.[2]
  • Directional indication through pointing. Pointing has a very specific purpose in our society: to reference an object or location based on its position relative to ourselves. Using gesture recognition to determine where a person is pointing is useful for identifying the context of statements or instructions. This application is of particular interest in the field of robotics.[3]
  • Control through facial gestures. Controlling a computer through facial gestures is a useful application of gesture recognition for users who may not physically be able to use a mouse or keyboard. Eye tracking in particular may be of use for controlling cursor motion or focusing on elements of a display.
  • Alternative computer interfaces. Forgoing the traditional keyboard and mouse setup, robust gesture recognition could allow users to accomplish frequent or common tasks using hand or face gestures made to a camera.
  • Immersive game technology. Gestures can be used to control interactions within video games to make the player's experience more interactive or immersive.
  • Virtual controllers. For systems where the act of finding or acquiring a physical controller could require too much time, gestures can be used as an alternative control mechanism. Controlling secondary devices in a car or controlling a television set are examples of such usage.[4]
  • Affective computing. In affective computing, gesture recognition is used in the process of identifying emotional expression through computer systems.
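
For the pointing use listed above, one simple way to estimate what a person is pointing at is to extend a line from the head through the hand until it meets a known surface. The sketch below shows that geometry only. It assumes that 3D head and hand positions are already available from some tracker, and that the target surface is a flat plane; the function name pointing_target and all inputs are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pointing_target(head, hand, plane_point, plane_normal):
    """Intersect the head-to-hand ray with a plane.

    Assumptions: all arguments are (x, y, z) tuples in the same coordinate
    frame; the plane is given by one point on it and its normal vector.
    Returns the 3D intersection point, or None if the person points away
    from the plane or parallel to it.
    """
    direction = tuple(h - e for h, e in zip(hand, head))
    denom = _dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    to_plane = tuple(p - e for p, e in zip(plane_point, head))
    t = _dot(to_plane, plane_normal) / denom
    if t <= 0:
        return None  # the plane is behind the pointing direction
    return tuple(e + t * d for e, d in zip(head, direction))


# A wall at z = 2, with the head at the origin and the hand ahead and to the right.
target = pointing_target(
    head=(0.0, 0.0, 0.0),
    hand=(0.5, 0.0, 1.0),
    plane_point=(0.0, 0.0, 2.0),
    plane_normal=(0.0, 0.0, 1.0),
)
```

In practice the estimated positions are noisy, so a real system would likely average over several frames before committing to a target.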

Input devices

The ability to track a person's movements and determine what gestures they may be performing can be achieved through various tools. Although a large amount of research has been done on image- and video-based gesture recognition, the tools and environments used vary between implementations.

  • Depth-aware cameras. Using specialized cameras, one can generate a depth map of what is being seen through the camera at short range, and use this data to approximate a 3D representation of the scene. These can be effective for detecting hand gestures due to their short-range capabilities.[5]
  • Stereo cameras. Using two cameras whose relations to one another are known, a 3D representation can be approximated from the output of the cameras. This method uses more traditional cameras, and thus does not have the same distance limitations as current depth-aware cameras. To get the cameras' relations, one can use a positioning reference such as a lexian-stripe or infrared emitters.[6]
  • Controller-based gestures. These controllers act as an extension of the body so that when gestures are performed, some of their motion can be conveniently captured by software. Mouse gestures are one such example, where the motion of the mouse is correlated to a symbol being drawn by a person's hand. The Wii Remote is another: changes in its acceleration over time can be analyzed to represent gestures.[7]
  • Single camera. A normal camera can be used for gesture recognition where the resources or environment would not be convenient for other forms of image-based recognition. Although not necessarily as effective as stereo or depth-aware cameras, a single camera makes the technology accessible to a wider audience.[8]
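
To make the depth-based devices above more concrete, the sketch below shows the standard pinhole-camera relations for turning a stereo disparity into a depth, and a pixel plus depth into a 3D point. It assumes an idealized, rectified stereo pair with a focal length given in pixels and a baseline in meters; the function names and the example numbers are illustrative and are not taken from the cited systems.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point seen by two rectified cameras: Z = f * B / d.

    Assumptions: disparity_px is the horizontal pixel shift of the same
    point between the left and right images, focal_px is the focal length
    in pixels, and baseline_m is the distance between the cameras in meters.
    """
    if disparity_px <= 0:
        return None  # no valid match, or the point is effectively at infinity
    return focal_px * baseline_m / disparity_px


def back_project(u, v, depth_m, focal_px, cx, cy):
    """Convert pixel (u, v) with known depth into camera-frame (X, Y, Z).

    (cx, cy) is the principal point, usually near the image center.
    """
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


# A hand pixel at (420, 240) in a 640x480 image, with a 50-pixel disparity.
z = depth_from_disparity(disparity_px=50, focal_px=500, baseline_m=0.1)
hand_xyz = back_project(420, 240, z, focal_px=500, cx=320, cy=240)
```

Because depth is inversely proportional to disparity, small disparity errors matter far more for distant points than for nearby ones. The same back-projection step applies to a depth-aware camera, which supplies the depth value directly.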

Challenges of Gesture Recognition

There are many challenges associated with the accuracy and usefulness of gesture recognition software. Image-based gesture recognition is limited by the equipment used and by image noise. Images or video may not be captured under consistent lighting or in the same location. Items in the background or distinct features of the users may make recognition more difficult. The variety of implementations for image-based gesture recognition may also limit the viability of the technology for general usage. For example, stereo cameras and depth-detecting cameras are not currently commonplace, and video or web cameras can give less accurate results because of their limited resolution.

Footnotes

  1. Matthias Rehm, Nikolaus Bee, Elisabeth André, Wave Like an Egyptian - Accelerometer Based Gesture Recognition for Culture Specific Interactions, British Computer Society, 2007
  2. Thad Starner, Alex Pentland, Visual Recognition of American Sign Language Using Hidden Markov Models, Massachusetts Institute of Technology
  3. Kai Nickel, Rainer Stiefelhagen, Visual recognition of pointing gestures for human-robot interaction, Image and Vision Computing, vol. 25, issue 12, December 2007, pp. 1875-1884
  4. William Freeman, Craig Weissman, Television control by hand gestures, Mitsubishi Electric Research Lab, 1995
  5. Yang Liu, Yunde Jia, A Robust Hand Tracking and Gesture Recognition Method for Wearable Visual Interfaces and Its Applications, Proceedings of the Third International Conference on Image and Graphics (ICIG’04), 2004
  6. Kue-Bum Lee, Jung-Hyun Kim, Kwang-Seok Hong, An Implementation of Multi-Modal Game Interface Based on PDAs, Fifth International Conference on Software Engineering Research, Management and Applications, 2007
  7. Thomas Schlömer, Benjamin Poppinga, Niels Henze, Susanne Boll, Gesture Recognition with a Wii Controller, Proceedings of the 2nd International Conference on Tangible and Embedded Interaction, 2008
  8. Wei Du, Hua Li, Vision based gesture recognition system with single camera, 5th International Conference on Signal Processing Proceedings, 2000