Procrustes analysis

In statistics, Procrustes analysis is a form of statistical shape analysis used to analyse the distribution of a set of shapes.

Here we consider only objects made up of a finite number k of points in n dimensions; these points are called landmark points.

The shape of an object can be considered as a member of an equivalence class formed by removing the translational, rotational and scaling components.

For example, translational components can be removed from an object by translating the object so that the mean of all the points lies at the origin. Likewise the scale component can be removed by scaling the object so that the sum of the squared distances from the points to the origin is 1.

Mathematically: take k points in two dimensions, say

((x_1,y_1),(x_2,y_2),\dots,(x_k,y_k))\,.

The mean of these points is (\bar{x},\bar{y}) where

\bar{x}=(x_1+x_2+\cdots+x_k)/k,\, \bar{y}=(y_1+y_2+\cdots+y_k)/k.

Now translate these points so that their mean moves to the origin, (x,y)\to(x-\bar{x},y-\bar{y}), giving the points (x_1-\bar{x},y_1-\bar{y}),\dots. Likewise, scale can be removed by finding the size of the object

s=\sqrt{(x_1-\bar{x})^2+(y_1-\bar{y})^2+\cdots}

and dividing the points by this size, giving the points ((x_1-\bar{x})/s,(y_1-\bar{y})/s),\dots. Other methods for removing the scale are also used.
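The centring and scaling steps above can be sketched in a few lines of Python. This is a minimal illustration for the two-dimensional case only; the function name `remove_translation_and_scale` and the example square are assumptions made for this sketch and do not come from any particular library.

```python
import math

def remove_translation_and_scale(points):
    """Centre 2-D landmarks on the origin and scale them to unit size.

    `points` is a list of (x, y) tuples (hypothetical input format).
    """
    k = len(points)
    mean_x = sum(x for x, _ in points) / k
    mean_y = sum(y for _, y in points) / k
    # Translate so that the mean of the landmarks lies at the origin.
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    # Size s: square root of the summed squared distances to the origin.
    s = math.sqrt(sum(x * x + y * y for x, y in centred))
    if s == 0.0:
        raise ValueError("all landmarks coincide, so the size is zero")
    return [(x / s, y / s) for x, y in centred]

# Example: a square with side 2 and one corner at the origin.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
normalised = remove_translation_and_scale(square)
```

After this step the landmarks have mean zero and their squared distances to the origin sum to 1, which is the normalisation described in the text.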

Removing the rotational component is more complex. Consider two objects with scale and translation removed, and let their points be ((x_1,y_1),\ldots) and ((w_1,z_1),\ldots). Fix one of these and rotate the other around the origin so that the sum of the squared distances between corresponding points is minimised. A rotation by angle \theta gives (u_1,v_1) = (\cos\theta\, w_1-\sin\theta\, z_1,\ \sin\theta\, w_1+\cos\theta\, z_1). The Procrustes distance is

d=\sqrt{(u_1-x_1)^2+(v_1-y_1)^2+\cdots}.

This distance can be minimised by using a least-squares technique to find the angle \theta that gives the minimum distance.
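For the two-dimensional case the least-squares angle has a closed form. Expanding the sum of squared distances shows that minimising it is the same as maximising \cos\theta \sum(w_i x_i + z_i y_i) + \sin\theta \sum(w_i y_i - z_i x_i), which gives \theta = \operatorname{atan2}(\sum(w_i y_i - z_i x_i), \sum(w_i x_i + z_i y_i)). The sketch below applies that formula; the function names are illustrative assumptions, and both inputs are assumed to be already centred and scaled to unit size.

```python
import math

def optimal_rotation_angle(fixed, moving):
    """Angle that rotates `moving` onto `fixed` in the least-squares sense."""
    num = sum(w * y - z * x for (x, y), (w, z) in zip(fixed, moving))
    den = sum(w * x + z * y for (x, y), (w, z) in zip(fixed, moving))
    return math.atan2(num, den)

def rotate(points, theta):
    """Rotate 2-D points about the origin by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * w - s * z, s * w + c * z) for w, z in points]

def procrustes_distance(fixed, moving):
    """Return (d, theta): the distance after the best rotation, and that angle."""
    theta = optimal_rotation_angle(fixed, moving)
    rotated = rotate(moving, theta)
    d = math.sqrt(sum((u - x) ** 2 + (v - y) ** 2
                      for (x, y), (u, v) in zip(fixed, rotated)))
    return d, theta

# Example: a centred unit-size square and a copy rotated by -0.5 radians.
a = 1 / math.sqrt(8)
fixed = [(a, a), (-a, a), (-a, -a), (a, -a)]
moving = rotate(fixed, -0.5)
d, theta = procrustes_distance(fixed, moving)
```

In this example the recovered angle is 0.5 radians and the distance is zero up to floating-point rounding, because the second object is an exact rotated copy of the first.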

Variations

There are many ways of representing the shape of an object. The shape of an object can be considered as a member of an equivalence class formed by taking the set of all configurations of k points in n dimensions, that is \mathbb{R}^{kn}, and factoring out the set of all translations, rotations and scalings. A particular representation of shape is found by choosing a particular representative of each equivalence class. For planar objects (n = 2) this gives a manifold of dimension 2k − 4, since translation removes two degrees of freedom and rotation and scaling remove one each; in general the dimension is kn − n − n(n − 1)/2 − 1. Procrustes analysis is one method of doing this, with a particular statistical justification.

Bookstein obtains a representation of shape by fixing the position of two points, called the baseline. One point is fixed at the origin and the other at (1,0); the remaining points form the Bookstein coordinates.
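A minimal sketch of this construction for planar landmarks follows. Treating each point as a complex number, the map p \to (p - p_a)/(p_b - p_a) sends the first baseline point p_a to the origin and the second, p_b, to (1,0). The function name `bookstein_coordinates` and the `base` parameter are assumptions made for illustration.

```python
def bookstein_coordinates(points, base=(0, 1)):
    """Bookstein coordinates of 2-D landmarks relative to a baseline.

    `base` holds the indices of the two baseline landmarks (hypothetical
    parameter); the first is sent to (0, 0) and the second to (1, 0).
    """
    i, j = base
    z = [complex(x, y) for x, y in points]
    origin = z[i]
    span = z[j] - z[i]
    if span == 0:
        raise ValueError("baseline landmarks coincide")
    # Dividing by `span` rotates and scales the baseline onto the unit segment.
    mapped = [(p - origin) / span for p in z]
    # The two baseline points are now constant, so only the rest are returned.
    return [(m.real, m.imag) for idx, m in enumerate(mapped) if idx not in (i, j)]

# Example: a triangle whose baseline runs from (1, 1) to (3, 1).
triangle = [(1.0, 1.0), (3.0, 1.0), (2.0, 3.0)]
coords = bookstein_coordinates(triangle)
```

For this triangle the single remaining landmark maps to (0.5, 1.0): it sits above the midpoint of the baseline at a height equal to the baseline length.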

It is also common to consider shape and scale together, that is, with only the translational and rotational components removed.

Examples

Shape analysis is used in biological data to identify the variations of anatomical features characterised by landmark data, for example in considering the shape of jaw bones.[1]

One study by David George Kendall examined the triangles formed by standing stones to determine whether these were often arranged in straight lines. The shape of a triangle can be represented as a point on the sphere, and the distribution of all shapes can be thought of as a distribution over the sphere. The sample distribution from the standing stones was compared with the theoretical distribution to show that the occurrence of straight lines was no more than average.[2]

References

  1. Nancy Marie Brown, "Exploring Space Shape", Research/Penn State, Vol. 15, No. 1 (March 1994).
  2. David G. Kendall, "A Survey of the Statistical Theory of Shape", Statistical Science, Vol. 4, No. 2 (May 1989), pp. 87-99.
  • F.L. Bookstein, Morphometric tools for landmark data, Cambridge University Press (1991).
  • J.C. Gower, G.B. Dijksterhuis, Procrustes Problems, Oxford University Press (2004).
  • K.V. Mardia, I.L. Dryden, Statistical Shape Analysis, Wiley, Chichester (1998).