Articulated body pose estimation
Articulated body pose estimation in computer vision is the study of algorithms and systems that recover the pose of an articulated[1] body, which consists of joints and rigid parts, using image-based observations. It is one of the longest-standing problems in computer vision because of the complexity of the models that relate observation to pose, and because of the variety of situations in which such a system would be useful.[2][3]
The goal is to develop accurate, tether-less, vision-based articulated body pose estimation systems. The bodies in question may be the human body, the hand, or even other creatures. Such a system has several foreseeable applications, including:
- marker-less motion capture for human-computer interfaces
- physiotherapy
- 3D animation
- ergonomics studies
- robot control
- visual surveillance
One of the major difficulties in recovering pose from images is the high number of degrees of freedom (DOF) in the movement to be recovered. Any rigid object requires 6 DOF to fully describe its pose. Each additional rigid object connected to it adds at least 1 DOF. A human body contains no fewer than 10 large body parts, equating to more than 20 DOF. The difficulty is compounded by the problem of self-occlusion, where body parts occlude each other depending on the configuration. Other challenges include varying illumination, which affects appearance; varying subject attire and body type; the required camera configuration; and the required computation time.
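The DOF count described above can be sketched as a sum over a tree of body parts. This is only an illustration: the `BodyPart` class, the ten-part breakdown, and the per-joint DOF values (3 for ball-like joints, 1 for hinge-like joints) are assumptions chosen for the example, not figures taken from a particular body model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BodyPart:
    """A rigid part attached to its parent by a joint (hypothetical helper type)."""
    name: str
    joint_dof: int  # DOF of the joint to the parent; ignored for the root
    children: list[BodyPart] = field(default_factory=list)


def count_parts(part: BodyPart) -> int:
    """Number of rigid parts in the tree rooted at `part`."""
    return 1 + sum(count_parts(child) for child in part.children)


def total_dof(root: BodyPart, root_dof: int = 6) -> int:
    """The root rigid body contributes 6 DOF; every other part adds its joint's DOF."""
    def joints_below(part: BodyPart) -> int:
        return sum(child.joint_dof + joints_below(child) for child in part.children)

    return root_dof + joints_below(root)


def limb(upper: str, lower: str) -> BodyPart:
    # Assumed for illustration: 3-DOF joint at the torso, 1-DOF hinge below it.
    return BodyPart(upper, 3, [BodyPart(lower, 1)])


torso = BodyPart("torso", 0, [
    BodyPart("head", 3),  # assumed 3-DOF neck
    limb("left upper arm", "left forearm"),
    limb("right upper arm", "right forearm"),
    limb("left thigh", "left shin"),
    limb("right thigh", "right shin"),
])

print(count_parts(torso))  # 10
print(total_dof(torso))    # 25
```

Under these assumed joint types, ten parts already give 25 DOF, consistent with the estimate of more than 20 DOF for a human body.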
A typical system takes a model-based approach: an observation is made and provided as input to the model, which generates pose estimates. Different kinds of sensors have been explored for making the observation:
- Visible wavelength imagery
- Long-wave thermal infrared imagery
- Time-of-flight imagery
- Laser range scanner imagery
The various sensors produce intermediate representations that are used directly by the model. These representations include:
- Image appearance
- Voxel (volume element) reconstruction
- 3D surface point cloud
- 3D surface mesh
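As a toy illustration of the model-based approach, the sketch below fits a planar two-link chain to an observation by searching for the joint angles whose predicted part positions best match it. Everything here is an assumption made for the example: the two-link model, the use of 2D joint positions as a stand-in for an intermediate representation, and the exhaustive grid search. Practical systems use much richer models, observations, and optimizers.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Return (elbow, wrist) 2D positions of a planar two-link chain rooted at the origin."""
    elbow = (l1 * math.cos(theta1), l1 * math.sin(theta1))
    wrist = (
        elbow[0] + l2 * math.cos(theta1 + theta2),
        elbow[1] + l2 * math.sin(theta1 + theta2),
    )
    return elbow, wrist


def residual(pose, observed):
    """Sum of squared distances between predicted and observed joint positions."""
    predicted = forward_kinematics(*pose)
    return sum(
        (px - ox) ** 2 + (py - oy) ** 2
        for (px, py), (ox, oy) in zip(predicted, observed)
    )


def estimate_pose(observed, steps=360):
    """Exhaustive search over a regular grid of joint angles in [-pi, pi)."""
    angles = [2 * math.pi * i / steps - math.pi for i in range(steps)]
    candidates = ((a, b) for a in angles for b in angles)
    return min(candidates, key=lambda pose: residual(pose, observed))


# Synthetic observation generated from a known pose (angles in radians).
true_pose = (0.5, -0.8)
observed = forward_kinematics(*true_pose)
estimate = estimate_pose(observed)
print(estimate)
```

The search cost grows exponentially with the number of joint angles, which is one reason the high DOF count of a full body makes the problem hard.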
Related Technologies
A commercially successful but specialized computer vision-based articulated body pose estimation technique is optical motion capture. This approach involves placing markers at strategic locations on the individual to capture the 6 degrees-of-freedom pose of each body part.
Active Research Groups
A number of groups are actively pursuing this topic:
- Brown University
- Carnegie Mellon University
- MPI Saarbruecken
- Stanford
- University of California, San Diego
- University of Toronto
- University of Twente
- Ecole Centrale de Paris