Homography (computer vision)

Geometrical setup for homography: stereo cameras O₁ and O₂ both pointed at X in epipolar geometry. Drawing from Neue Constructionen der Perspective und Photogrammetrie by Hermann Guido Hauck (1845–1905).

In the field of computer vision, any two images of the same planar surface in space are related by a homography (assuming a pinhole camera model). This has many practical applications, such as image rectification, image registration, or computation of camera motion—rotation and translation—between two images. Once camera rotation and translation have been extracted from an estimated homography matrix, this information may be used for navigation, or to insert models of 3D objects into an image or video, so that they are rendered with the correct perspective and appear to have been part of the original scene (see Augmented reality).

3D plane to plane equation

We have two cameras a and b, looking at points $P_i$ in a plane. The projection ${}^bp_i$ of $P_i$ in b is transferred to the corresponding point ${}^ap_i$ in a (in homogeneous coordinates, up to a scale factor) by:

{}^ap_i = K_a \cdot H_{ba} \cdot K_b^{-1} \cdot {}^bp_i

where the homography matrix $H_{ba}$ is

H_{ba} = R - \frac{t n^T}{d}.

$R$ is the rotation matrix by which b is rotated in relation to a; $t$ is the translation vector from a to b; $n$ and $d$ are the normal vector of the plane and the distance to the plane, respectively. $K_a$ and $K_b$ are the cameras' intrinsic parameter matrices.

The figure shows camera b looking at the plane at distance $d$. Note: taking $n^T P_i + d = 0$ as the plane model, $n^T P_i$ is the projection of the vector $P_i$ onto the (unit) normal $n$ and equals $-d$. Hence $-\frac{n^TP_i}{d} = 1$, so $t = t \left(-\frac{n^TP_i}{d}\right)$, and substituting this into $R P_i + t$ gives $H_{ba} P_i = R P_i + t$ with $H_{ba} = R - \frac{t n^T}{d}$.
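The identity in the note above can be checked numerically. The following is a minimal sketch, not a reference implementation: the rotation, translation, plane and the intrinsic parameters (focal lengths and principal points) are made-up values, and the helper names are hypothetical. It builds $H_{ba}$ from $R$, $t$, $n$ and $d$, confirms that $H_{ba} P = R P + t$ for a point on the plane, and then transfers a pixel from b to a through $K_a H_{ba} K_b^{-1}$, normalizing by the third coordinate.

```python
import math

def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def plane_homography(R, t, n, d):
    """H = R - t n^T / d for the plane model n^T P + d = 0."""
    return [[R[i][j] - t[i] * n[j] / d for j in range(3)] for i in range(3)]

def intrinsics(f, cx, cy):
    """Assumed pinhole intrinsics K (square pixels, no skew) and its inverse."""
    K = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    K_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    return K, K_inv

def dehomogenize(v):
    """Divide by the third coordinate to obtain 2D image coordinates."""
    return [v[0] / v[2], v[1] / v[2]]

# Made-up relative pose: 10 degree rotation about the y axis plus a small shift.
theta = math.radians(10.0)
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [0.2, -0.1, 0.05]

# Plane n^T P + d = 0 in the frame of camera b: here the plane Z = 5.
n = [0.0, 0.0, -1.0]
d = 5.0

H = plane_homography(R, t, n, d)

# A point on the plane: n.P + d = -5 + 5 = 0.
P = [1.0, -2.0, 5.0]
HP = matvec(H, P)
RP_plus_t = [r + ti for r, ti in zip(matvec(R, P), t)]

# Pixel transfer from b to a; K_a and K_b are illustrative values.
K_a, _ = intrinsics(800.0, 320.0, 240.0)
K_b, K_b_inv = intrinsics(750.0, 300.0, 200.0)
p_b = dehomogenize(matvec(K_b, P)) + [1.0]   # homogeneous pixel in b
p_a_transfer = dehomogenize(matvec(K_a, matvec(H, matvec(K_b_inv, p_b))))
p_a_direct = dehomogenize(matvec(K_a, RP_plus_t))

print(HP, RP_plus_t)
print(p_a_transfer, p_a_direct)
```

The two printed pairs should agree up to floating-point rounding. The final division in `dehomogenize` absorbs the scale factor that the matrix product leaves undetermined.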

This formula is only valid if camera b has no rotation and no translation. In the general case where $R_a,R_b$ and $t_a,t_b$ are the respective rotations and translations of camera a and b, $R=R_a R_b^T$ and the homography matrix $H_{ba}$ becomes

H_{ba} = R_a R_b^T - R_a \frac{(t_b-t_a) n^T}{d}R_b^T = R_a \left(I - \frac{(t_b-t_a) n^T}{d}\right)R_b^T .

where d is the distance of the camera b to the plane.
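The general formula can be sanity-checked in the same way. The text does not spell out the pose convention, so this sketch assumes one: a world point $P_w$ has camera coordinates $R_a (P_w - t_a)$ and $R_b (P_w - t_b)$, with $n$ expressed in the world frame and the plane given by $n^T (P_w - t_b) + d = 0$. All numeric values and helper names are hypothetical. Under these assumptions, $H_{ba}$ should map the coordinates of a plane point in b to its coordinates in a.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def general_homography(R_a, t_a, R_b, t_b, n, d):
    """H_ba = R_a (I - (t_b - t_a) n^T / d) R_b^T."""
    diff = [t_b[i] - t_a[i] for i in range(3)]
    inner = [[(1.0 if i == j else 0.0) - diff[i] * n[j] / d for j in range(3)]
             for i in range(3)]
    return matmul(matmul(R_a, inner), transpose(R_b))

def to_camera(R, t, P_world):
    """Assumed convention: camera coordinates are R (P_world - t)."""
    return matvec(R, [P_world[i] - t[i] for i in range(3)])

# Made-up poses for the two cameras.
R_a, t_a = rot_y(10.0), [0.5, 0.0, 0.0]
R_b, t_b = rot_x(-5.0), [-0.3, 0.1, 0.2]

# World plane Z = 4 with normal n; d is the distance from camera b to it.
n = [0.0, 0.0, -1.0]
plane_z = 4.0
d = plane_z - t_b[2]

H_ba = general_homography(R_a, t_a, R_b, t_b, n, d)

P_world = [1.0, -0.5, plane_z]          # a point on the plane
P_a = to_camera(R_a, t_a, P_world)
P_b = to_camera(R_b, t_b, P_world)
H_P_b = matvec(H_ba, P_b)

print(H_P_b, P_a)
```

With a different pose convention (for example, camera coordinates $R P_w + t$) the signs and the placement of the rotations would change, so the check only supports the formula as read under the assumptions stated here.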

Mathematical definition

Homogeneous coordinates, which add one extra coordinate to each point, are used to represent projective transformations by means of matrix multiplication. With Cartesian coordinates, matrix multiplication cannot perform the division required for perspective projection. In other words, with Cartesian coordinates a perspective projection is a non-linear transformation.
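As a small worked illustration (the focal length $f$ and the camera-frame point $(X, Y, Z)$ are symbols introduced here for the example, not taken from the definitions in this article), an ideal pinhole projection in Cartesian coordinates is

(x, y) = \left(\frac{fX}{Z}, \frac{fY}{Z}\right),

which divides by $Z$ and therefore cannot be written as a matrix acting on $(X, Y, Z)$. In homogeneous coordinates the same map is a matrix product, with the division deferred to a final normalization:

\begin{bmatrix} fX\\fY\\Z\end{bmatrix} = \begin{bmatrix} f&0&0\\0&f&0\\0&0&1\end{bmatrix} \begin{bmatrix} X\\Y\\Z\end{bmatrix} \sim \begin{bmatrix} fX/Z\\fY/Z\\1\end{bmatrix}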

Given:

p_a = \begin{bmatrix} x_a\\y_a\\1\end{bmatrix}, p^\prime_b = \begin{bmatrix} w^{\prime}x_b\\w^{\prime}y_{b}\\w^{\prime}\end{bmatrix}, \mathbf{H}_{ab} = \begin{bmatrix} h_{11}&h_{12}&h_{13}\\h_{21}&h_{22}&h_{23}\\h_{31}&h_{32}&h_{33} \end{bmatrix}

Then:

p^\prime_b = \mathbf{H}_{ab}p_a \,

where

\mathbf{H}_{ba} = \mathbf{H}_{ab}^{-1}.

Also:

p_b = p^\prime_b/w^\prime = \begin{bmatrix} x_b\\y_b\\ 1\end{bmatrix}
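The definitions above translate directly into a few lines of code. This is a sketch under stated assumptions: the entries of `H_ab` are arbitrary illustrative numbers, and the function names are hypothetical. It multiplies by $\mathbf{H}_{ab}$, divides by $w^\prime$, and then maps the result back with $\mathbf{H}_{ba} = \mathbf{H}_{ab}^{-1}$.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through the 3x3 matrix H and divide by w'."""
    xw = H[0][0] * x + H[0][1] * y + H[0][2]
    yw = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ZeroDivisionError("w' is zero: the point maps to infinity")
    return xw / w, yw / w

def invert_3x3(M):
    """Inverse of a 3x3 matrix via the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular, so it is not a homography")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

# Illustrative homography (made-up entries).
H_ab = [[1.0, 0.2, 5.0],
        [0.1, 0.9, -3.0],
        [0.001, 0.002, 1.0]]
H_ba = invert_3x3(H_ab)

x_a, y_a = 100.0, 50.0
x_b, y_b = apply_homography(H_ab, x_a, y_a)      # a -> b
x_back, y_back = apply_homography(H_ba, x_b, y_b)  # b -> a

# Scaling H by any non-zero factor leaves the mapped point unchanged.
H_scaled = [[2.0 * v for v in row] for row in H_ab]
x_s, y_s = apply_homography(H_scaled, x_a, y_a)

print((x_b, y_b), (x_back, y_back), (x_s, y_s))
```

Because of the final division, $\mathbf{H}_{ab}$ is only defined up to scale, which is why the scaled matrix produces the same point.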

Affine homography

When the image region in which the homography is computed is small, or the image has been acquired with a large focal length, an affine homography is a more appropriate model of image displacements. An affine homography is a special case of the general homography in which the last row is fixed to

h_{31}=h_{32}=0, \; h_{33}=1.
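One consequence of this constraint can be illustrated with a short sketch (the matrices below are made-up examples and the helper names are hypothetical). With the last row fixed to $(0, 0, 1)$, the scale $w^\prime$ is always 1, so no perspective division takes place and midpoints of segments are preserved. A general homography with a non-zero $h_{31}$ does not preserve them.

```python
def homogeneous_image(H, x, y):
    """Return (x'w', y'w', w') for the point (x, y) before any division."""
    return (H[0][0] * x + H[0][1] * y + H[0][2],
            H[1][0] * x + H[1][1] * y + H[1][2],
            H[2][0] * x + H[2][1] * y + H[2][2])

def apply_homography(H, x, y):
    xw, yw, w = homogeneous_image(H, x, y)
    return xw / w, yw / w

def midpoint_preserved(H, p, q, tol=1e-9):
    """Check whether H maps the midpoint of p and q to the midpoint of their images."""
    m = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    hp, hq, hm = (apply_homography(H, *pt) for pt in (p, q, m))
    return (abs(hm[0] - (hp[0] + hq[0]) / 2.0) < tol and
            abs(hm[1] - (hp[1] + hq[1]) / 2.0) < tol)

# Affine homography: last row is (0, 0, 1).
A = [[1.2, 0.3, 10.0],
     [-0.2, 0.9, 4.0],
     [0.0, 0.0, 1.0]]

# Same matrix with a small perspective term h31, for comparison.
H = [[1.2, 0.3, 10.0],
     [-0.2, 0.9, 4.0],
     [0.01, 0.0, 1.0]]

p, q = (0.0, 0.0), (10.0, 20.0)
w_affine = homogeneous_image(A, *q)[2]
affine_ok = midpoint_preserved(A, p, q)
projective_ok = midpoint_preserved(H, p, q)

print(w_affine, affine_ok, projective_ok)
```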

External links

Serge Belongie & David Kriegman (2007) Explanation of Homography Estimation from Department of Computer Science and Engineering, University of California, San Diego.
A. Criminisi, I. Reid & A. Zisserman (1997) "A Plane Measuring Device", §3 Computing the Plane to Plane Homography, from Visual Geometry Group, Department of Engineering Science, University of Oxford.
Elan Dubrofsky (2009) Homography Estimation, Master's thesis, from Department of Computer Science, University of British Columbia.
Richard Hartley & Andrew Zisserman (2004) Multiple View Geometry from Visual Geometry Group, Oxford. Includes Matlab functions for calculating a homography and the fundamental matrix.
Manolis Lourakis (2007) homest, a GPL C/C++ library for robust, non-linear (based on the Levenberg–Marquardt algorithm) homography estimation from matched point pairs, from Institute of Computer Science, Foundation for Research and Technology – Hellas, Heraklion, Crete.
OpenCV is a complete (open and free) computer vision software library that has many routines related to homography estimation (cvFindHomography) and re-projection (cvPerspectiveTransform).
GIMP Tutorial – using the Perspective Tool by Billy Kerr on YouTube. Shows how to do a perspective transform using GIMP.
Allan Jepson (2010) Planar Homographies from Department of Computer Science, University of Toronto. Includes 2D homography from four pairs of corresponding points, mosaics in image processing, removing perspective distortion in computer vision, rendering textures in computer graphics, and computing planar shadows.
Plane transfer homography Course notes from CSE576 at University of Washington in Seattle.
Etienne Vincent & Robert Laganiere (2000) Detecting Planar Homographies in an Image Pair from School of Information Technology and Engineering, University of Ottawa. Describes an algorithm for detecting planes in images, uses random sample consensus (RANSAC) method, describes heuristics and iteration.