Banach fixed-point theorem

In mathematics, the Banach fixed-point theorem (also known as the contraction mapping theorem or contraction mapping principle) is an important tool in the theory of metric spaces; it guarantees the existence and uniqueness of fixed points of certain self-maps of metric spaces, and provides a constructive method to find those fixed points. The theorem is named after Stefan Banach (1892–1945), and was first stated by him in 1922.[1]

Statement

Definition. Let (X, d) be a metric space. Then a map T : X → X is called a contraction mapping on X if there exists q ∈ [0, 1) such that

d(T(x),T(y)) \le q d(x,y)

for all x, y in X.

Banach Fixed Point Theorem. Let (X, d) be a non-empty complete metric space with a contraction mapping T : X → X. Then T admits a unique fixed point x* in X (i.e. T(x*) = x*). Furthermore, x* can be found as follows: start with an arbitrary element x_0 in X and define a sequence {x_n} by x_n = T(x_{n−1}) for n ≥ 1; then x_n → x*.
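The iteration described in the theorem can be sketched in a few lines of Python. This is an illustration only: the map T(x) = cos(x) on X = [0, 1], the function name `fixed_point_iteration`, and the stopping tolerance are assumptions chosen for the example, not part of the theorem. On [0, 1] the cosine maps the interval into itself and is a contraction with q = sin(1) ≈ 0.84, because |cos′(x)| = |sin(x)| ≤ sin(1) there.

```python
import math

def fixed_point_iteration(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_n = T(x_{n-1}) until successive iterates differ by at most tol."""
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if abs(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Assumed example: T = cos on [0, 1], started at x_0 = 0.
x_star = fixed_point_iteration(math.cos, 0.0)
print(x_star, math.cos(x_star))  # the two values agree up to the tolerance
```

Stopping when successive iterates are close is a practical choice; the theorem itself only asserts convergence of the full sequence.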

Remark 1. The following inequalities are equivalent and describe the speed of convergence:


\begin{array}{rcl}
d(x^*, x_n) &\leq& \frac{q^n}{1-q} d(x_1,x_0), \\
d(x^*, x_{n+1}) &\leq& \frac{q}{1-q} d(x_{n+1},x_n), \\
d(x^*, x_{n+1}) &\leq& q d(x^*,x_n).
\end{array}

Any such value of q is called a Lipschitz constant for T, and the smallest one is sometimes called "the best Lipschitz constant" of T.
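The three estimates of Remark 1 can be checked numerically. The sketch below again assumes the example map T(x) = cos(x) on [0, 1] with q = sin(1); the helper name `bounds_hold` and the rounding slack are hypothetical choices. The 200th iterate is used as a stand-in for x*, which is accurate to machine precision for this map.

```python
import math

T = math.cos
q = math.sin(1.0)  # Lipschitz constant of cos on [0, 1] (assumed example)

# Build iterates x_0, ..., x_200; the last one serves as a proxy for x*.
xs = [0.0]
for _ in range(200):
    xs.append(T(xs[-1]))
x_star = xs[-1]

def bounds_hold(n, slack=1e-12):
    """Check the three estimates of Remark 1 at index n, up to rounding slack."""
    err_n = abs(x_star - xs[n])
    err_next = abs(x_star - xs[n + 1])
    a_priori = err_n <= q**n / (1 - q) * abs(xs[1] - xs[0]) + slack
    a_posteriori = err_next <= q / (1 - q) * abs(xs[n + 1] - xs[n]) + slack
    linear = err_next <= q * err_n + slack
    return a_priori and a_posteriori and linear

print(all(bounds_hold(n) for n in range(50)))
```

The first estimate is usable before iterating (it needs only x_0 and x_1), while the second is usable as a stopping criterion during the iteration.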

Remark 2. The weaker condition d(T(x), T(y)) < d(x, y) for all x ≠ y is in general not enough to ensure the existence of a fixed point, as is shown by the map T : [1, ∞) → [1, ∞), T(x) = x + 1/x, which lacks a fixed point. However, if X is compact, then this weaker assumption does imply the existence and uniqueness of a fixed point, which can be found as a minimizer of d(x, T(x)). Indeed, a minimizer x exists by compactness, and it has to be a fixed point of T, since otherwise d(T(x), T(T(x))) < d(x, T(x)) would contradict minimality. It then easily follows that the fixed point is the limit of any sequence of iterations of T.
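The counterexample in Remark 2 is easy to explore numerically. The following sketch (the sample pairs and the iteration count are arbitrary choices) checks the strict inequality on a few pairs and shows that the iterates keep growing. Since T(x) − x = 1/x > 0, no point is fixed, and x_{n+1}^2 = x_n^2 + 2 + 1/x_n^2 ≥ x_n^2 + 2 forces the iterates to infinity.

```python
def T(x):
    """The map T(x) = x + 1/x on [1, infinity) from Remark 2."""
    return x + 1.0 / x

# T shrinks distances strictly: T(x) - T(y) = (x - y) * (1 - 1/(x*y)).
pairs = [(1.0, 2.0), (3.0, 3.5), (10.0, 100.0)]
strictly_shrinks = all(abs(T(x) - T(y)) < abs(x - y) for x, y in pairs)

# Yet the iterates never settle: x_{n+1}^2 >= x_n^2 + 2 at every step.
x = 1.0
for _ in range(1000):
    x = T(x)

print(strictly_shrinks, x)
```

The factor 1 − 1/(xy) is below 1 for every pair but approaches 1 as x and y grow, so no single q < 1 works on all of [1, ∞).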

Remark 3. When using the theorem in practice, the most difficult part is typically to define X properly so that T(X) ⊆ X.

Proofs

Banach's original proof

Let x_0 ∈ X be arbitrary and define a sequence {x_n} by setting x_n = T(x_{n−1}). Banach's original proof can be broken down into several lemmas:

Lemma 1. For all n ∈ N, d(x_{n+1}, x_n) ≤ q^n d(x_1, x_0).

Proof. We proceed by induction. The base case (n = 1) holds:

d(x_{1+1}, x_1) = d(x_2, x_1) = d(T(x_1), T(x_0)) \leq qd(x_1, x_0).

Suppose the statement holds for some k ∈ N. Then we have

\begin{array}{rclll}
d(x_{(k + 1) + 1}, x_{k + 1}) & = &d(x_{k + 2}, x_{k + 1}) \\
& = &d(T(x_{k + 1}), T(x_k)) \\
& \leq &q d(x_{k + 1}, x_k) \\
& \leq &q q^kd(x_1, x_0) && \text{Induction Hypothesis}\\
& = &q^{k + 1}d(x_1, x_0).
\end{array}

By the principle of mathematical induction, the Lemma holds for all n ∈ N.

Lemma 2. {x_n} is a Cauchy sequence in (X, d) and hence converges to a limit x* in X.

Proof. Let m, n ∈ N with m > n. Then

\begin{array}{rclll}
d(x_m, x_n) & \leq &d(x_m, x_{m-1}) + d(x_{m-1}, x_{m-2}) + \cdots + d(x_{n+1}, x_n) && \text{Triangle Inequality} \\
& \leq &q^{m-1}d(x_1, x_0) + q^{m-2}d(x_1, x_0) + \cdots + q^nd(x_1, x_0)  && \text{Lemma 1}\\
& = &q^n d(x_1, x_0) \sum_{k=0}^{m-n-1} q^k \\
& \leq &q^n d(x_1, x_0) \sum_{k=0}^\infty q^k \\
& = &q^n d(x_1, x_0) \left ( \frac{1}{1-q} \right ) && \text{Geometric Series}
\end{array}

Let ε > 0 be arbitrary. If d(x_1, x_0) = 0, the sequence is constant and there is nothing to prove. Otherwise, since q ∈ [0, 1), we can find a large N ∈ N so that

q^N < \frac{\varepsilon(1-q)}{d(x_1, x_0)}.

Therefore, for all m > n ≥ N, we may write:

d(x_m, x_n)  \leq q^n d(x_1, x_0) \left ( \frac{1}{1-q} \right ) < \left (\frac{\varepsilon(1-q)}{d(x_1, x_0)} \right ) d(x_1, x_0) \left ( \frac{1}{1-q} \right ) = \varepsilon.

Since ε > 0 was arbitrary, this proves that the sequence is Cauchy. As (X, d) is complete, the sequence converges to a limit x* in X.
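The choice of N in the proof above is explicit, so it can be computed ahead of time. The sketch below is one way to do so; the function name `iterations_needed` and its parameter names are hypothetical. It searches for the smallest N with q^N < ε(1 − q)/d(x_1, x_0) by direct comparison rather than by logarithms, which avoids rounding issues at the boundary.

```python
def iterations_needed(q, d0, eps):
    """Smallest N with q**N < eps * (1 - q) / d0, as in the proof of Lemma 2.

    q   -- contraction constant in [0, 1)
    d0  -- the initial step d(x_1, x_0)
    eps -- target tolerance, eps > 0
    """
    if not (0 <= q < 1) or eps <= 0:
        raise ValueError("need 0 <= q < 1 and eps > 0")
    if d0 == 0:
        return 0  # x_0 is already a fixed point
    bound = eps * (1 - q) / d0
    n = 0
    while q**n >= bound:
        n += 1
    return n

print(iterations_needed(0.5, 1.0, 1e-3))  # prints 11
```

For q = 0.5, d(x_1, x_0) = 1 and ε = 10^−3 the threshold is 5 · 10^−4, and 0.5^11 ≈ 4.9 · 10^−4 is the first power below it.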

Lemma 3. x* is a fixed point of T.

Proof. Take the limit of both sides of the recurrence x_n = T(x_{n−1}):

 \lim_{n\to\infty} x_n = \lim_{n\to\infty} T(x_{n-1})

Since T is a contraction mapping, it is continuous, so we may take the limit inside:

\lim_{n\to\infty} x_n = T\left(\lim_{n\to\infty} x_{n-1} \right).

Thus, x* = T(x*).

Lemma 4. x* is the only fixed point of T in (X, d).

Proof. Suppose y also satisfies T(y) = y. Then

0 \leq d(x^*, y) = d(T(x^*), T(y)) \leq q d(x^*, y).

Remembering that q ∈ [0, 1), the above implies that 0 ≤ (1−q)d(x*, y) ≤ 0, which shows that d(x*, y) = 0, whence by positive definiteness, x* = y.

Shorter proof

We now present a simpler proof that appeared in the Journal of Fixed Point Theory and Applications (see reference).

By the triangle inequality, for all x, y in X,

\begin{array}{rl}
d(x,y) &\le d(x,T(x)) + d(T(x),T(y)) + d(T(y),y) \\
&\le  d(x,T(x)) + q d(x,y) + d(T(y),y) 
\end{array}

Solving for d(x, y), we get the "Fundamental Contraction Inequality":

d(x,y) \le \frac{d(T(x), x) + d(T(y),y)}{1-q},

and we note that if x and y are both fixed points then this implies that d(x, y) = 0, so x = y, proving that T has at most one fixed point. Now define the mapping T^n by composing T with itself n times, and note by induction that it satisfies a Lipschitz condition with constant q^n. It remains to show that for any x_0 in X, the sequence {T^n(x_0)} is Cauchy and so converges to a point x* of X, which is a fixed point of T because T is continuous (as in Lemma 3 above). If in the Fundamental Contraction Inequality we replace x and y by T^n(x_0) and T^m(x_0), we find that

\begin{array}{rcl}
d(T^n(x_0), T^m(x_0)) &\le& \frac{d(T(T^n(x_0)), T^n(x_0)) + d(T(T^m(x_0)),T^m(x_0))}{1-q} \\
&=& \frac{d(T^n(T(x_0)), T^n(x_0)) + d(T^m(T(x_0)), T^m(x_0))}{1-q} \\
&\le& \frac{q^n d(T(x_0), x_0) + q^m d(T(x_0), x_0)}{1-q} \\
&=& \frac {q^n + q^m} {1-q} d(T(x_0) ,x_0) 
\end{array}

Since q < 1, the last expression converges to zero as n, m → ∞, proving that {T^n(x_0)} is Cauchy. Note also that letting m → ∞ gives

 d(T^n(x_0), x^*) \le \frac {q^n} {1-q} d(T(x_0) ,x_0)

which is the bound obtained in the first proof and gives the rate at which {T^n(x_0)} converges to x*.
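The Fundamental Contraction Inequality can be sanity-checked on a concrete contraction. The sketch below assumes an example affine map on R^2 with the Euclidean metric; the map, the sampling range, and the helper name `fundamental_bound` are all hypothetical. Its linear part has singular values 0.5 and 0.25, so q = 0.5 is a valid contraction constant.

```python
import math
import random

q = 0.5  # contraction constant of the assumed example map below

def T(p):
    """Affine map on R^2 with linear part [[0, 0.5], [0.25, 0]], so q = 0.5."""
    x, y = p
    return (0.5 * y + 1.0, 0.25 * x - 1.0)

def fundamental_bound(p, r):
    """Right-hand side (d(T(p), p) + d(T(r), r)) / (1 - q)."""
    return (math.dist(T(p), p) + math.dist(T(r), r)) / (1 - q)

random.seed(0)
points = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(200)]
inequality_holds = all(
    math.dist(p, r) <= fundamental_bound(p, r) + 1e-12  # small rounding slack
    for p in points
    for r in points
)
print(inequality_holds)
```

The inequality bounds the distance between two points purely by how far T moves each of them, which is why two fixed points must coincide.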

Applications

  1. Ω′ := (I+g)(Ω) is an open subset of E: precisely, for any x in Ω such that B(x, r) ⊂ Ω one has B((I+g)(x), r(1−k)) ⊂ Ω′;
  2. I+g : Ω → Ω′ is a bi-lipschitz homeomorphism;
precisely, (I+g)^{−1} is still of the form I + h : Ω′ → Ω with h a Lipschitz map of constant k/(1−k). A direct consequence of this result yields the proof of the inverse function theorem.
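The contraction principle enters here as follows, assuming (as the constant k/(1−k) suggests) that g is Lipschitz with constant k < 1: solving x + g(x) = y for x amounts to finding the fixed point of x ↦ y − g(x), which is a contraction with constant k. A minimal sketch on E = R, with the hypothetical choice g(x) = 0.5 sin(x), so that k = 0.5:

```python
import math

def g(x):
    """Assumed example perturbation: Lipschitz with constant k = 0.5."""
    return 0.5 * math.sin(x)

def inverse_of_I_plus_g(y, tol=1e-13, max_iter=1000):
    """Solve x + g(x) = y by iterating the contraction x -> y - g(x)."""
    x = y  # any starting point works; y is a convenient one
    for _ in range(max_iter):
        x_next = y - g(x)
        if abs(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

y = 2.0
x = inverse_of_I_plus_g(y)
print(x, x + g(x))  # the second value reproduces y up to the tolerance
```

In this example the inverse is obtained pointwise; the statement above says in addition that it depends on y in a Lipschitz way.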

Converses

Several converses of the Banach contraction principle exist. The following is due to Czesław Bessaga, from 1959:

Let f : X → X be a map of an abstract set such that each iterate f^n has a unique fixed point. Let q ∈ (0, 1); then there exists a complete metric on X such that f is contractive, and q is the contraction constant.

Indeed, very weak assumptions suffice to obtain a converse of this kind. For example, if f : X → X is a map on a T1 topological space with a unique fixed point a, such that for each x in X we have f^n(x) → a, then there already exists a metric on X with respect to which f satisfies the conditions of the Banach contraction principle with contraction constant 1/2.[2] In this case the metric is in fact an ultrametric.

Generalizations

There are a number of generalizations as immediate corollaries, which are of some interest for the sake of applications. Let T : X → X be a map on a complete non-empty metric space.

\sum\nolimits_n d(T^n(x),T^n(y))<\infty.
Then T has a unique fixed point.

However, in most applications the existence and uniqueness of a fixed point can be shown directly with the standard Banach fixed point theorem, by a suitable choice of the metric that makes the map T a contraction. Indeed, the above result by Bessaga strongly suggests looking for such a metric. See also the article on fixed point theorems in infinite-dimensional spaces for generalizations.

A different class of generalizations arises from suitable generalizations of the notion of metric space, e.g. by weakening the defining axioms for the notion of metric.[3] Some of these have applications, e.g., in the theory of programming semantics in theoretical computer science.[4]

Notes

  1. http://www.emis.de/journals/BJMA/tex_v1_n1_a1.pdf
  2. Hitzler, Pascal; Seda, Anthony K. (2001). "A ‘Converse’ of the Banach Contraction Mapping Theorem". Journal of Electrical Engineering 52 (10/s): 3–6.
  3. Hitzler, Pascal; Seda, Anthony (2010). Mathematical Aspects of Logic Programming Semantics. Chapman and Hall/CRC.
  4. Seda, Anthony K.; Hitzler, Pascal (2010). "Generalized Distance Functions in the Theory of Computation". The Computer Journal 53 (4): 443–464.

References


An earlier version of this article was posted on Planet Math. This article is open content.