Carathéodory's theorem (convex hull)

From Wikipedia, the free encyclopedia

An illustration of Carathéodory's theorem for a square in R²

See also Carathéodory's theorem for other meanings

In convex geometry Carathéodory's theorem states that if a point x of R^d lies in the convex hull of a set P, there is a subset P′ of P consisting of no more than d+1 points such that x lies in the convex hull of P′. Equivalently, x lies in a r-simplex with vertices in P, where $r \leq d$ . The result is named for Constantin Carathéodory, who proved the theorem in 1911.

For example, consider a set P = {(0,0), (0,1), (1,0), (1,1)} which is a subset of R². The convex hull of this set is a square. Consider now a point x = (1/4, 1/4), which is in the convex hull of P. We can then construct a set {(0,0),(0,1),(1,0)} = P ′, the convex hull of which is a triangle and encloses p, and thus the theorem works for this instance, since |P′| = 3. It may help to visualise Carathéodory's theorem in 2 dimensions, as saying that we can construct a triangle consisting of points from P that encloses any point in P.

[edit] Proof

Let x be a point in the convex hull of P. Then, x is a convex combination of a finite number of points in P :

$\mathbf{x}=\sum_{j=1}^k \lambda_j \mathbf{x}_j$

where every x_j is in P, every λ_j is nonnegative, and $\sum_{j=1}^k\lambda_j=1$ .

Suppose k > d + 1 (otherwise, there is nothing to prove). Then, the points x₂ − x₁, ..., x_k − x₁ are linearly dependent, so there are real scalars μ₂, ..., μ_k, not all zero, such that

$\sum_{j=2}^k \mu_j (\mathbf{x}_j-\mathbf{x}_1)=\mathbf{0}.$

If μ₁ is defined as

$\mu_1:=-\sum_{j=2}^k \mu_j$

then

$\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}$

$\sum_{j=1}^k \mu_j=0$

and not all of the μ_j are equal to zero. Therefore, at least one μ_j>0. Then,

$\mathbf{x} = \sum_{j=1}^k \lambda_j \mathbf{x}_j-\alpha\sum_{j=1}^k \mu_j \mathbf{x}_j = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j$

for any real α. In particular, the equality will hold if α is defined as

$\alpha:=\min_{1\leq j \leq k} \left\{ \tfrac{\lambda_j}{\mu_j}:\mu_j>0\right\}=\tfrac{\lambda_i}{\mu_i}.$

Note that α>0, and for every j between 1 and k,

$\lambda_j-\alpha\mu_j \geq 0.$

In particular, λ_i − αμ_i = 0 by definition of α. Therefore,

$\mathbf{x} = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j$

where every $λ j - αμ j$ is nonnegative, their sum is one , and furthermore, $λ i - αμ i = 0$ . In other words, x is represented as a convex combination of at most k-1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.