Facial motion capture
From Wikipedia, the free encyclopedia
Facial motion capture is the process of electronically converting the movements of a person's face into a digital database using cameras or laser scanners. This database may then be used to produce computer graphics (CG) animation for movies, games, or real-time avatars. Because the motion of CG characters is derived from the movements of real people, the resulting character animation is more realistic and nuanced than animation created manually.
A facial motion capture database describes the coordinates or relative positions of reference points on the actor's face. The capture may be in two dimensions, in which case the capture process is sometimes called "expression tracking", or in three dimensions. Two-dimensional capture can be achieved using a single camera and low-cost capture software such as Zign Creations' Zign Track. This produces less sophisticated tracking, and is unable to fully capture three-dimensional motions such as head rotation. Three-dimensional capture is accomplished using multi-camera rigs or laser marker systems. Such systems are typically far more expensive, complicated, and time-consuming to use.
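As an illustration of what such a database might contain, the following minimal sketch stores one set of named reference points per frame and derives the motion of a point relative to its starting position. The point names, coordinate values, and the `Frame` and `displacement` names are hypothetical and do not reflect the format of any particular capture product.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (x, y) for two-dimensional capture, (x, y, z) for three-dimensional capture
Point = Tuple[float, ...]


@dataclass
class Frame:
    time_s: float             # capture time in seconds
    points: Dict[str, Point]  # hypothetical reference-point names -> coordinates


def displacement(frames: List[Frame], name: str) -> List[Point]:
    """Position of one reference point relative to its position in the first frame."""
    origin = frames[0].points[name]
    return [tuple(c - o for c, o in zip(f.points[name], origin)) for f in frames]


# Two illustrative 2D frames (made-up values): the lip corners move apart and upward
frames = [
    Frame(0.0, {"lip_corner_left": (10.0, 20.0), "lip_corner_right": (30.0, 20.0)}),
    Frame(0.5, {"lip_corner_left": (9.0, 18.0), "lip_corner_right": (31.0, 18.0)}),
]
left_motion = displacement(frames, "lip_corner_left")
```

A three-dimensional capture would use the same structure with a third coordinate per point.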
Facial motion capture is related to body motion capture, but is more challenging because it requires higher resolution to detect and track the subtle expressions produced by small movements of the eyes and lips. These movements are often less than a few millimeters, requiring greater resolution and fidelity, and different filtering techniques, than are usually used in full body capture. The additional constraints of the face also allow more opportunities for using models and rules.
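To illustrate why the choice of filter matters at this scale, the sketch below applies a plain centered moving average to a made-up marker trajectory containing a brief 2 mm movement. The moving average is an assumption chosen for simplicity, not a filter named by any capture system, and the sample values are hypothetical.

```python
def moving_average(samples, window):
    """Centered moving average; the window shrinks at the ends of the sequence."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


# Hypothetical vertical position (mm) of one marker: a one-frame 2 mm twitch
raw = [0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
smoothed = moving_average(raw, window=5)

peak_before = max(raw)
peak_after = max(smoothed)
```

In this example the five-frame window spreads the twitch over neighboring frames and reduces its peak to a fraction of the original 2 mm, which is the kind of loss that a filter tuned for larger body movements can cause on facial data.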
Two predominant technologies exist: marker-based and markerless tracking systems.
Marker-based systems apply 10 to 100 markers to the actor's face and track the marker movement with high-resolution cameras. This approach has been used on movies such as The Polar Express to allow an actor such as Tom Hanks to drive the facial expressions of several different characters. Unfortunately, it is relatively cumbersome and makes the actor's expressions overly driven once the smoothing and filtering have taken place.
Active LED marker technology is currently being used to drive facial animation in real time to provide user feedback.
Markerless technologies track natural features of the face, such as the nostrils, the corners of the lips and eyes, and wrinkles. This technology is discussed and demonstrated at CMU [1], IBM [2], an open source project at Sourceforge [3], the University of Manchester (where much of this started with Tim Cootes [4]) [5], and other locations, using active appearance models, principal component analysis, eigen tracking, and other techniques to track the desired facial features from frame to frame. This technology is much less cumbersome, and allows greater expression for the actor.
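As a rough illustration of how principal component analysis can be applied to facial features, the sketch below finds the dominant mode of variation in a few flattened landmark shapes and expresses a shape as a single coefficient along that mode. It is a simplified sketch rather than a full active appearance model: the landmark values are made up, the function names are hypothetical, and power iteration is used here only to keep the example free of external libraries.

```python
import math


def first_component(shapes, iterations=100):
    """Mean shape and dominant principal component, found by power iteration."""
    n, dim = len(shapes), len(shapes[0])
    mean = [sum(s[i] for s in shapes) / n for i in range(dim)]
    centered = [[v - m for v, m in zip(s, mean)] for s in shapes]
    # Assumes this starting vector is not orthogonal to the dominant component
    vec = [1.0] * dim
    for _ in range(iterations):
        # Multiply by the unnormalized covariance matrix without forming it
        proj = [sum(a * b for a, b in zip(row, vec)) for row in centered]
        vec = [sum(p * row[i] for p, row in zip(proj, centered)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return mean, vec


def project(shape, mean, component):
    """Coefficient of a shape along the component, measured from the mean shape."""
    return sum((v - m) * c for v, m, c in zip(shape, mean, component))


# Hypothetical shapes: [left_x, left_y, right_x, right_y] of the two lip corners
shapes = [
    [10.0, 20.0, 30.0, 20.0],  # neutral
    [9.0, 19.0, 31.0, 19.0],   # slight smile
    [8.0, 18.0, 32.0, 18.0],   # wider smile
]
mean, component = first_component(shapes)
coefficient = project(shapes[2], mean, component)
```

Describing each frame by a few such coefficients, instead of by every landmark coordinate, is what makes a statistical face model compact enough to fit to video from frame to frame.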
Vision-based approaches can also track pupil movement, eyelids, and occlusion of the teeth by the lips and tongue, which are obvious problems in most computer-animated features. Typical limitations of vision-based approaches are resolution and frame rate, both of which are becoming less of an issue as high-speed, high-resolution CMOS cameras become available from multiple sources.