Johnson–Lindenstrauss lemma

From Wikipedia, the free encyclopedia

In mathematics, the Johnson–Lindenstrauss lemma is a result concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz, and can even be taken to be an orthogonal projection.

The lemma has uses in compressed sensing, manifold learning, dimensionality reduction, and graph embedding. Much of the data stored and manipulated on computers, including text and images, can be represented as points in a high-dimensional space. However, the essential algorithms for working with such data tend to become bogged down very quickly as dimension increases. It is therefore desirable to reduce the dimensionality of the data in a way that preserves its relevant structure. The Johnson–Lindenstrauss lemma is a classic result in this vein.

The lemma is also tight up to a factor of log(1/ε), i.e. there exists a set of m points that requires dimension

 \Omega\left(\frac{\log(m)}{\varepsilon^2\log (1/\varepsilon)}\right)

in order to preserve the distances between all pairs of points. See [4].

Lemma

Given 0 < ε < 1, a set X of m points in R^N, and a number n > n_0 = O(ln(m)/ε^2), there is a Lipschitz function ƒ : R^N → R^n such that

(1-\varepsilon)||u-v||_{2} \leq ||f(u) - f(v)||_{2} \leq (1+\varepsilon)||u-v||_{2}

for all u,v \in X.
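The bound n_0 = O(ln(m)/ε^2) hides an absolute constant. A minimal sketch of computing a sufficient target dimension, assuming the constant 8 that comes out of the usual Gaussian-projection proof (the lemma itself fixes only the order of growth, and the function name is illustrative):

```python
import math

def jl_min_dim(m, eps, c=8.0):
    """A sufficient target dimension n > n_0 = O(ln(m)/eps^2).

    The constant c = 8 is an assumption taken from the standard
    Gaussian-projection proof; the lemma fixes only the order.
    """
    return math.ceil(c * math.log(m) / eps ** 2)

# The required dimension grows only logarithmically in the number
# of points m, but quadratically in the inverse distortion 1/eps.
print(jl_min_dim(10 ** 6, 0.1))
```

Note that the target dimension is independent of the ambient dimension N, which is what makes the lemma useful for very high-dimensional data.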

One proof of the lemma takes ƒ to be a projection onto a random subspace of dimension n, and exploits the phenomenon of concentration of measure.
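A sketch of this proof strategy (not the lemma's original construction): project onto a random Gaussian matrix scaled by 1/√n, so that squared lengths are preserved in expectation; concentration of measure then makes every individual pairwise distance concentrate sharply around its mean. The point counts, ambient dimension, and the proof constant 4 below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, N, eps = 50, 1000, 0.3            # illustrative sizes (assumptions)
X = rng.standard_normal((m, N))      # m points in R^N

# Target dimension using a constant from the standard proof (an
# assumption; the lemma itself states only n = O(ln(m)/eps^2)).
n = int(np.ceil(4 * np.log(m) / (eps ** 2 / 2 - eps ** 3 / 3)))

# Random projection with N(0, 1/n) entries: squared norms are
# preserved in expectation, and concentrate around their mean.
P = rng.standard_normal((N, n)) / np.sqrt(n)
Y = X @ P

def pairwise(A):
    """All pairwise Euclidean distances between the rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

iu = np.triu_indices(m, k=1)
ratio = pairwise(Y)[iu] / pairwise(X)[iu]
# With high probability every ratio lies in [1 - eps, 1 + eps].
print(ratio.min(), ratio.max())
```

Since the projection is oblivious to the data, the same matrix P works for any point set of size m with high probability, which is why such projections are useful in streaming and compressed-sensing settings.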

References