Video tracking
From Wikipedia, the free encyclopedia
Video tracking is the process of locating a moving object (or multiple objects) over time using a camera. An algorithm analyses sequential video frames and outputs the location of moving targets within each frame.
The main difficulty in video tracking is associating target locations in consecutive video frames, especially when the objects move fast relative to the frame rate. Video tracking systems therefore usually employ a motion model which describes how the image of the target might change for different possible motions of the object being tracked.
Examples of simple motion models are:
- to track planar objects, the motion model is a 2D transformation (affine transformation or homography) of an image of the object (e.g. the initial frame)
- when the target is a rigid 3D object, the motion model defines its aspect depending on its 3D position and orientation
- for video compression, key frames are divided into macroblocks. The motion model is a disruption of a key frame, where each macroblock is translated by a motion vector given by the motion parameters
- the image of deformable objects can be covered with a mesh; the motion of the object is defined by the position of the nodes of the mesh.
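The first motion model above, a 2D affine transformation of a planar target, can be sketched as follows. This is a minimal illustration with numpy; the function name and the example rotation/translation values are chosen for demonstration and are not from any particular tracking library:

```python
import numpy as np

def apply_affine(points, A, t):
    """Map 2D points through an affine motion model x' = A @ x + t.

    points : (N, 2) array of image coordinates in the reference frame
    A      : (2, 2) linear part (rotation, scale, shear)
    t      : (2,) translation vector
    """
    return points @ A.T + t

# Example motion: the target rotates by 10 degrees and translates by (5, -3).
theta = np.deg2rad(10.0)
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([5.0, -3.0])

# Corners of the target in the initial frame; the motion model predicts
# where they should appear in the next frame.
corners = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 50.0], [0.0, 50.0]])
predicted = apply_affine(corners, A, t)
```

A tracker based on this model would search for the parameters (here `theta` and `t`) that best align the transformed template with the new frame.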
The role of the tracking algorithm is to analyse the video frames in order to estimate the motion parameters. These parameters characterize the location of the target.
Common algorithms
There are two major components of a visual tracking system: Target Representation and Localization, and Filtering and Data Association.
Target Representation and Localization is mostly a bottom-up process. Typically the computational complexity for these algorithms is low. The following are some common Target Representation and Localization algorithms:
- Blob tracking: Segmentation of object interior (for example blob detection, block-based correlation or optical flow)
- Kernel-based tracking (Mean-shift tracking): An iterative localization procedure based on the maximization of a similarity measure (Bhattacharyya coefficient).
- Contour tracking: Detection of object boundary (e.g. active contours or Condensation algorithm)
- Visual feature matching: Registration
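Blob tracking, the first item in the list above, can be sketched with simple background subtraction: foreground pixels are segmented by thresholding the difference against a background image, and the blob is localized at their centroid. The function below and its synthetic test frame are an illustrative sketch, not code from any particular library:

```python
import numpy as np

def blob_centroid(frame, background, threshold=30):
    """Locate a moving blob by background subtraction.

    Pixels whose intensity differs from the background by more than
    `threshold` are treated as foreground; the blob location is the
    centroid of those pixels. Returns (row, col) or None if no
    foreground pixels are found.
    """
    mask = np.abs(frame.astype(int) - background.astype(int)) > threshold
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    return float(ys.mean()), float(xs.mean())

# Synthetic example: a bright 10x10 square on a dark background.
background = np.zeros((100, 100), dtype=np.uint8)
frame = background.copy()
frame[40:50, 60:70] = 255

print(blob_centroid(frame, background))  # prints (44.5, 64.5)
```

Real blob trackers typically add connected-component labeling to separate multiple blobs and a learned or running-average background model; the centroid step remains the same.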
Filtering and Data Association is mostly a top-down process, which involves incorporating prior information about the scene or object, dealing with object dynamics, and evaluation of different hypotheses. The computational complexity for these algorithms is usually much higher. The following are some common Filtering and Data Association algorithms:
- Kalman filter: An optimal recursive Bayesian filter for linear dynamics subject to Gaussian noise.
- Particle filter: Useful for sampling the underlying state-space distribution of non-linear and non-Gaussian processes.
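A Kalman filter for tracking can be sketched with a constant-velocity model: the state holds the target's position and velocity, and only the position is measured. This is a minimal 1D illustration with assumed noise covariances, not a production tracker:

```python
import numpy as np

# Constant-velocity Kalman filter for a 1D target position.
# State x = [position, velocity]; only position is measured.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition model
H = np.array([[1.0, 0.0]])              # measurement model
Q = 0.01 * np.eye(2)                    # process noise covariance (assumed)
R = np.array([[1.0]])                   # measurement noise covariance (assumed)

x = np.zeros((2, 1))                    # initial state estimate
P = np.eye(2)                           # initial state covariance

def kalman_step(x, P, z):
    """One predict/update cycle given a scalar position measurement z."""
    # Predict: propagate the state through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement.
    y = np.array([[z]]) - H @ x         # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Track a target moving at unit speed; the estimate converges toward
# the true position and velocity as measurements arrive.
for z in [1.0, 2.0, 3.0, 4.0, 5.0]:
    x, P = kalman_step(x, P, z)
```

In a video tracker the measurement `z` would be the target location reported by one of the localization algorithms above; the filter smooths it and supplies a prediction for the next frame's search region.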
References
- D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-Based Object Tracking", IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 25, no. 5, May 2003.
- M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking", IEEE Trans. on Signal Processing, Vol. 50, no. 2, Feb. 2002.