Characterizations of the exponential function

From Wikipedia, the free encyclopedia

In mathematics, the exponential function can be characterized in many ways. The following characterizations (definitions) are most common. This article discusses why each characterization makes sense, and why the characterizations are independent of and equivalent to each other. As a special case of these considerations, we will see that the three most common definitions given for the mathematical constant e are also equivalent to each other.

Characterizations

The five most common definitions of the exponential function exp(x) = e^x are:

1. Define e^x by the limit
e^x = \lim_{n\to\infty} \left(1+\frac{x}{n}\right)^n.
2. Define e^x as the sum of the infinite series
e^x = \sum_{n=0}^\infty {x^n \over n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots
(Here, n! stands for the factorial of n.)
3. Define e^x to be the unique number y > 0 such that
\int_{1}^{y} \frac{dt}{t} = x.
4. Define e^x to be the unique solution to the initial value problem
y'=y,\quad y(0)=1.
5. The exponential function f(x) = e^x is the unique Lebesgue-measurable function with f(1) = e that satisfies:
f(x+y) = f(x) \cdot f(y) \, for all x and y
(Hewitt and Stromberg, 1965, exercise 18.46). Alternatively, it is the unique function with these properties that is continuous at at least one point (Rudin, 1976, chapter 8, exercise 6). (As a counter-example, if one assumes neither continuity nor measurability, it is possible to prove the existence of an everywhere-discontinuous, non-measurable function with this property by using a Hamel basis for the real numbers over the rationals, as described in Hewitt and Stromberg.)

These definitions are not limited to the exponential of real numbers, and in several cases can be extended to any Banach algebra.
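Characterization 4 can be tied to characterization 1 numerically. The following Python sketch (the names euler_exp and limit_exp are illustrative and do not come from the cited sources) applies the forward Euler method with n equal steps to y' = y, y(0) = 1. Each step multiplies y by (1 + x/n), so the result agrees with the n-th term of the limit in characterization 1, up to rounding.

```python
import math

def euler_exp(x: float, n: int) -> float:
    """Approximate y(x) for y' = y, y(0) = 1 with n forward-Euler steps."""
    h = x / n        # step size
    y = 1.0          # initial condition y(0) = 1
    for _ in range(n):
        y += h * y   # y_{k+1} = y_k + h * y_k, because y' = y
    return y

def limit_exp(x: float, n: int) -> float:
    """n-th term of the sequence (1 + x/n)^n from characterization 1."""
    return (1.0 + x / n) ** n

if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, euler_exp(1.0, n), limit_exp(1.0, n), math.exp(1.0))
```

This is only a floating-point illustration under the stated assumptions; it does not replace the existence and uniqueness theorem for the initial value problem.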

Why each characterization makes sense

Each characterization requires some justification to show that it makes sense. For instance, when the value of the function is defined by a sequence or series, the convergence of this sequence or series needs to be established.

Characterization 1

It can be shown that the sequence

a_n = \left(1+\frac{x}{n}\right)^n

is non-decreasing as soon as n > −x (so for every n when x ≥ 0) and is bounded above. Since every bounded sequence of real numbers that is eventually non-decreasing converges to a unique real number, this characterization makes sense.
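The monotonicity and boundedness claims can be spot-checked in floating point. The sketch below is illustrative only: the helper names a and is_non_decreasing are assumptions, and a finite check is not a proof.

```python
def a(x: float, n: int) -> float:
    """n-th term of the sequence (1 + x/n)^n."""
    return (1.0 + x / n) ** n

def is_non_decreasing(x: float, n_start: int, n_stop: int) -> bool:
    """Check a(x, n) <= a(x, n + 1) for n_start <= n < n_stop - 1."""
    terms = [a(x, n) for n in range(n_start, n_stop)]
    return all(u <= v for u, v in zip(terms, terms[1:]))

if __name__ == "__main__":
    print(is_non_decreasing(1.0, 1, 500))    # x > 0: checked from n = 1
    print(is_non_decreasing(-3.0, 4, 500))   # x < 0: checked from n > -x
    print(max(a(1.0, n) for n in range(1, 500)))
```

For negative x the early terms need not be ordered, which is why the second check starts at n = 4 for x = −3.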

Characterization 2

To show the infinite series converges at x = 1, it is enough to compare with a geometric series:

1 + 1 + {1 \over 2!} + {1 \over 3!} + {1 \over 4!} + \cdots \le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots = 3.

To show that the series converges for all x, we use the ratio test, which shows that the series has an infinite radius of convergence, since

\lim_{n\to\infty} \left|\frac{x^{n+1}/(n+1)!}{x^n/n!}\right|    = \lim_{n\to\infty} \left|\frac{x}{n+1}\right|    = 0 < 1 \mbox{ for all }x \mbox{.}
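The ratio x/(n+1) between consecutive terms also gives a convenient way to evaluate partial sums. The following sketch (the function name exp_series and the default of 30 terms are assumptions chosen for illustration) builds each term from the previous one and compares the result with the library value math.exp.

```python
import math

def exp_series(x: float, terms: int = 30) -> float:
    """Sum the first `terms` terms of the series for e^x."""
    total = 0.0
    term = 1.0                  # x^0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)     # consecutive terms differ by the factor x/(n+1)
    return total

if __name__ == "__main__":
    print(exp_series(1.0), math.e)
    print(exp_series(-2.0, 60), math.exp(-2.0))
```

Because the factor x/(n+1) tends to 0 for every fixed x, the terms eventually shrink faster than any geometric sequence, which is the content of the ratio test above.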

Characterization 3

In this case, we define the natural logarithm function ln(x) first, and then define exp(x) as the inverse of the natural logarithm. In other words, for all y > 0, define

\ln (y) = \int_{1}^{y} \frac{1}{t} \, dt.

Since 1/t is continuous for all t > 0, this function makes sense, and since 1/t is positive for all t > 0, this function is strictly increasing (hence, injective) for y > 0. (Note that if y < 1, then ln(y) is a negative number.) By the integral test and the divergence of the harmonic series, it follows that ln(y) → ∞ as y → ∞. Similarly, the change of variables t \mapsto 1/t shows that ln(y) → −∞ as y → 0 from the right. To sum up, ln maps the infinite interval (0, ∞) bijectively onto the real line (−∞, ∞). Hence, for any real number x, there must exist a unique number y > 0 such that ln(y) = x.
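The construction can be imitated numerically: approximate the integral, then invert it. The sketch below is a rough illustration, assuming a midpoint rule with a fixed number of steps and a bisection search; the names ln_integral and exp_by_inversion and the tolerances are hypothetical choices, not part of the characterization.

```python
import math

def ln_integral(y: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of 1/t from 1 to y (y > 0)."""
    h = (y - 1.0) / steps   # negative when y < 1, which gives the right sign
    return sum(h / (1.0 + (k + 0.5) * h) for k in range(steps))

def exp_by_inversion(x: float, tol: float = 1e-9) -> float:
    """Find y > 0 with ln_integral(y) = x by bisection."""
    lo, hi = 1.0, 1.0
    while ln_integral(hi) < x:   # ln is unbounded above, so this loop ends
        hi *= 2.0
    while ln_integral(lo) > x:   # ln is unbounded below near 0, so this ends too
        lo /= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if ln_integral(mid) < x:  # ln is strictly increasing
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(exp_by_inversion(1.0), math.e)
```

The bracketing loops rely on ln being unbounded in both directions, and the bisection relies on it being strictly increasing; these are exactly the properties established above.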

Equivalence of the characterizations

The following proof demonstrates the equivalence of characterizations 1, 2, and 3 given above. The proof consists of two parts. First, the equivalence of characterizations 1 and 2 is established, and then the equivalence of characterizations 1 and 3 is established. A final section relates characterization 5 to the others.

Equivalence of characterizations 1 and 2

The following argument is adapted from a proof in Rudin, theorem 3.31, pp. 63–65. The irrationality of e follows quickly from this proof; see theorem 3.32.

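Let x ≥ 0 be a fixed real number; the inequalities below use the fact that every term is then non-negative. (The case x < 0 can be reduced to this one using the identity (1 + x/n)^n (1 − x/n)^n = (1 − x^2/n^2)^n, whose right-hand side tends to 1 as n → ∞, together with the fact that the series of definition 2 satisfies e^x e^{−x} = 1.) Define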

s_n = \sum_{k=0}^n\frac{x^k}{k!},\ t_n=\left(1+\frac{x}{n}\right)^n.

By the binomial theorem,

t_n=\sum_{k=0}^n{n \choose k}\frac{x^k}{n^k}=1+x+\sum_{k=2}^n\frac{n(n-1)(n-2)\cdots(n-(k-1))x^k}{k!\,n^k}
=1+x+\frac{x^2}{2!}\left(1-\frac{1}{n}\right)+\frac{x^3}{3!}\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)+\cdots+\frac{x^n}{n!}\left(1-\frac{1}{n}\right)\cdots\left(1-\frac{n-1}{n}\right)\le s_n

so that

\limsup_{n\to\infty}t_n \le \limsup_{n\to\infty}s_n = e^x

where e^x is in the sense of definition 2. Here, we must use limsups, because we don't yet know that t_n actually converges. Now, for the other direction, note that by the above expression of t_n, if 2 ≤ m ≤ n, we have

1+x+\frac{x^2}{2!}\left(1-\frac{1}{n}\right)+\cdots+\frac{x^m}{m!}\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{m-1}{n}\right)\le t_n.

Fix m, and let n approach infinity. We get

s_m = 1+x+\frac{x^2}{2!}+\cdots+\frac{x^m}{m!} \le \liminf_{n\to\infty}t_n

(again, we must use liminfs because we don't yet know that t_n converges). Now, take the above inequality, let m approach infinity, and put it together with the other inequality. This becomes

\limsup_{n\to\infty}t_n \le e^x \le \liminf_{n\to\infty}t_n \le \limsup_{n\to\infty}t_n

so that

\lim_{n\to\infty}t_n = e^x.
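The two inequalities can be observed numerically for a non-negative x. The sketch below is only an illustration in floating point; the function names s and t mirror s_n and t_n, and the small tolerance in the comparison is an assumption to absorb rounding.

```python
import math

def s(x: float, n: int) -> float:
    """Partial sum s_n = sum_{k=0}^{n} x^k / k!."""
    total, term = 0.0, 1.0
    for k in range(n + 1):
        total += term
        term *= x / (k + 1)   # next term x^(k+1) / (k+1)!
    return total

def t(x: float, n: int) -> float:
    """t_n = (1 + x/n)^n."""
    return (1.0 + x / n) ** n

if __name__ == "__main__":
    x = 1.0
    for n in (1, 10, 100, 10_000, 1_000_000):
        # t_n <= s_n for x >= 0, and both approach e^x
        print(n, t(x, n), s(x, n), math.exp(x))
```

The output shows t_n approaching e^x much more slowly than s_n, which is consistent with t_n ≤ s_n for x ≥ 0.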

Equivalence of characterizations 1 and 3

Here, we define the natural logarithm function in terms of a definite integral as above. By the fundamental theorem of calculus,

\frac{d}{dx}\left( \ln x \right) = \frac{1}{x}.

Now, let x be any fixed real number, and let

y=\lim_{n\to\infty}\left(1+\frac{x}{n}\right)^n.

We will show that ln(y) = x, which implies that y = e^x, where e^x is in the sense of definition 3. We have

\ln y=\ln\lim_{n\to\infty}\left(1+\frac{x}{n}\right)^n=\lim_{n\to\infty}\ln\left(1+\frac{x}{n}\right)^n.

Here, we have used the continuity of ln(y), which follows from the continuity of 1/t:

\ln y=\lim_{n\to\infty}n\ln \left(1+\frac{x}{n}\right)=\lim_{n\to\infty}\frac{x\ln\left(1+(x/n)\right)}{(x/n)}

Here, we have used the result ln a^n = n ln a. This result can be established for n a natural number by induction, or using integration by substitution. (The extension to real powers must wait until ln and exp have been established as inverses of each other, so that a^b can be defined for real b as e^{b ln a}.)

=x\cdot\lim_{h\to 0}\frac{\ln\left(1+h\right)}{h} \quad \mbox{ where }h=\frac{x}{n}
=x\cdot\frac{d}{dt}\left( \ln t\right) \quad \mbox{ at }t=1
=x\cdot\frac{1}{t} \quad \mbox{ at }t=1
\!\, = x.
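The key limit, n ln(1 + x/n) → x, can be checked numerically. In this sketch the library function math.log1p stands in for the integral definition of ln, which is an assumption made for convenience; the name n_log_term is illustrative.

```python
import math

def n_log_term(x: float, n: int) -> float:
    """n * ln(1 + x/n), using log1p for accuracy when x/n is small."""
    return n * math.log1p(x / n)

def log_difference_quotient(h: float) -> float:
    """ln(1 + h) / h, the difference quotient of ln at t = 1."""
    return math.log1p(h) / h

if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, n_log_term(2.0, n))       # approaches x = 2
    for h in (1e-1, 1e-3, 1e-6):
        print(h, log_difference_quotient(h))  # approaches 1
```

The second loop corresponds to the substitution h = x/n: the difference quotient tends to the derivative of ln at 1, which is 1.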

Equivalence of characterizations 1 and 5

The following proof is a simplified version of the one in Hewitt and Stromberg, exercise 18.46. First, one proves that measurability (or here, Lebesgue-integrability) implies continuity for a non-zero function f(x) satisfying f(x + y) = f(x)f(y), and then one proves that continuity implies f(x) = e^{kx} for some k, and finally f(1) = e implies k = 1.

First, we prove a few elementary properties from f(x) satisfying f(x + y) = f(x)f(y) and the assumption that f(x) is not identically zero:

  • If f(x) is nonzero anywhere (say at x=y), then it is non-zero everywhere. Proof: f(y) = f(x) f(y - x) \neq 0 implies f(x) \neq 0.
  • f(0) = 1. Proof: f(x) = f(x + 0) = f(x)f(0) and f(x) is non-zero.
  • f(−x) = 1 / f(x). Proof: 1 = f(0) = f(x − x) = f(x)f(−x).
  • If f(x) is continuous anywhere (say at x=y), then it is continuous everywhere. Proof: f(x+\delta)-f(x) = f(x-y) [ f(y+\delta) - f(y)] \rightarrow 0 as \delta\rightarrow 0 by continuity at y.

The second and third properties mean that it is sufficient to prove f(x) = e^x for positive x.

If f(x) is a Lebesgue-integrable function, then we can define

g(x) = \int_0^x f(x') dx'.

It then follows that

g(x+y)-g(x) = \int_x^{x+y} f(x') dx' = \int_0^y f(x+x') dx' = f(x)g(y).

Since f(x) is nonzero, we can choose some y such that g(y) \neq 0 and solve for f(x) in the above expression. Therefore:

f(x+\delta)-f(x) = \frac{[g(x+\delta+y)-g(x+\delta)]-[g(x+y)-g(x)]}{g(y)}
=\frac{[g(x+y+\delta)-g(x+y)]-[g(x+\delta)-g(x)]}{g(y)}
=\frac{f(x+y)g(\delta)-f(x)g(\delta)}{g(y)}=g(\delta)\frac{f(x+y)-f(x)}{g(y)}.

The final expression must go to zero as \delta\rightarrow 0 since g(0) = 0 and g(x) is continuous. It follows that f(x) is continuous.
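The identities involving g can be sanity-checked on a known solution. The sketch below assumes the sample function f(x) = e^{Kx} with a hypothetical rate K = 0.7, for which g has the closed form (e^{Kx} − 1)/K; it then compares f(x + δ) − f(x) with the final expression g(δ)[f(x + y) − f(x)]/g(y).

```python
import math

K = 0.7  # hypothetical rate; f(x) = e^{Kx} satisfies f(x + y) = f(x) f(y)

def f(x: float) -> float:
    return math.exp(K * x)

def g(x: float) -> float:
    """Integral of f from 0 to x, in closed form: (e^{Kx} - 1) / K."""
    return math.expm1(K * x) / K

def increment_via_g(x: float, delta: float, y: float = 1.0) -> float:
    """g(delta) * (f(x + y) - f(x)) / g(y), the final expression in the text."""
    return g(delta) * (f(x + y) - f(x)) / g(y)

if __name__ == "__main__":
    x = 0.3
    for delta in (1e-1, 1e-3, 1e-6):
        print(delta, f(x + delta) - f(x), increment_via_g(x, delta))
```

Both columns shrink together as δ decreases, since g(δ) → 0. This only illustrates the identity for one smooth example; the proof itself needs nothing beyond integrability of f.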

Now, we prove that f(q) = e^{kq}, for some k, for all positive rational numbers q. Let q = n/m for positive integers n and m. Then

f\left(\frac{n}{m}\right)=f\left(\frac{1}{m}+\cdots+\frac{1}{m}\right)=f\left(\frac{1}{m}\right)^{n}

by elementary induction on n. Therefore, f(1/m)^m = f(1) and thus

f\left(\frac{n}{m}\right)=f(1)^{n/m}=e^{k(n/m)}.

for k = \ln [f(1)]\,. Note that if we are restricting ourselves to real-valued f(x), then f(x) = f(x/2)^2 is everywhere positive and so k is real.

Finally, by continuity, since f(x) = e^{kx} for all rational x, it must be true for all real x since the closure of the rationals is the reals (that is, we can write any real x as the limit of a sequence of rationals). If f(1) = e then k = 1. This is equivalent to characterization 1 (or 2, or 3), depending on which equivalent definition of e one uses.
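The rational step of the argument can be traced on an example. The sketch below assumes the sample solution f(x) = 2^x (any positive base would do; the choice and the helper name check_rational are hypothetical), sets k = ln f(1), and compares f(1/m)^n with e^{k(n/m)} for a rational q = n/m.

```python
import math
from fractions import Fraction

def f(x: float) -> float:
    """Sample solution of f(x + y) = f(x) f(y), assumed for illustration."""
    return 2.0 ** x

k = math.log(f(1.0))  # k = ln f(1)

def check_rational(q: Fraction) -> float:
    """Return |f(1/m)^n - e^{k n/m}| for a positive rational q = n/m."""
    n, m = q.numerator, q.denominator
    root = f(1.0 / m)        # f(1/m), which satisfies f(1/m)^m = f(1)
    via_root = root ** n     # f(n/m) = f(1/m)^n by the functional equation
    return abs(via_root - math.exp(k * float(q)))

if __name__ == "__main__":
    for q in (Fraction(1, 2), Fraction(3, 7), Fraction(22, 5)):
        print(q, check_rational(q))
```

With f(1) = e in place of f(1) = 2, the same computation gives k = 1, matching the conclusion above.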

References

  • Walter Rudin, Principles of Mathematical Analysis, 3rd edition (McGraw-Hill, 1976), chapter 8.
  • Edwin Hewitt and Karl Stromberg, Real and Abstract Analysis (Springer, 1965).