Law of large numbers/Proof

Main article: Law of large numbers

Given X1, X2, ..., an infinite sequence of i.i.d. random variables with finite expected value E(X1) = E(X2) = ... = μ < ∞, we are interested in the convergence of the sample average

\overline{X}_n=\tfrac1n(X_1+\cdots+X_n).
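
The convergence can be illustrated numerically. A minimal sketch, assuming numpy and taking the Xi to be exponentially distributed with mean μ = 1; the distribution is chosen purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    mu = 1.0  # mean of the Exp(1) distribution used for illustration

    # Sample averages for increasing n; each should be close to mu for large n.
    for n in (10, 100, 10_000, 1_000_000):
        x = rng.exponential(scale=1.0, size=n)  # X_1, ..., X_n i.i.d.
        print(n, x.mean())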

The weak law

Theorem: \overline{X}_n \, \xrightarrow{P} \, \mu \qquad\textrm{for}\qquad n \to \infty

Proof using Chebyshev's inequality

This proof uses the additional assumption of finite variance \operatorname{Var}(X_i)=\sigma^2 (the same for all i). The independence of the random variables implies that they are uncorrelated, so the variance of the sum X_1+\cdots+X_n is the sum of the individual variances, n\sigma^2, and therefore


\operatorname{Var}(\overline{X}_n) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.

By linearity of expectation, the mean of the sample average is the common mean μ of the sequence:


E(\overline{X}_n) = \mu.
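
Both identities can be checked by simulation. A minimal sketch, assuming numpy and again using Exp(1) variables, so that μ = 1 and σ² = 1 (an illustrative choice, not part of the proof):

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma2, n, reps = 1.0, 1.0, 50, 100_000

    # `reps` independent copies of the sample average of n Exp(1) variables.
    xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

    print(xbar.mean(), mu)         # empirical mean of the sample average, close to mu
    print(xbar.var(), sigma2 / n)  # empirical variance, close to sigma^2 / n = 0.02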

Applying Chebyshev's inequality to \overline{X}_n results in


\operatorname{P}( \left| \overline{X}_n-\mu \right| \geq \varepsilon) \leq \frac{\sigma^2}{{n\varepsilon^2}}.

This may be used to obtain the following:


\operatorname{P}( \left| \overline{X}_n-\mu \right| < \varepsilon) = 1 - \operatorname{P}( \left| \overline{X}_n-\mu \right| \geq \varepsilon) \geq 1 - \frac{\sigma^2}{\varepsilon^2 n}.

As n approaches infinity, the right-hand side approaches 1, and by the definition of convergence in probability (see Convergence of random variables) we obtain

\overline{X}_n \, \xrightarrow{P} \, \mu \qquad\textrm{for}\qquad n \to \infty
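
The bound itself can be compared with the actual tail probability by Monte Carlo. A minimal sketch, assuming numpy, Exp(1) summands (μ = σ² = 1) and ε = 0.1; all of these are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma2, eps, reps = 1.0, 1.0, 0.1, 1_000

    for n in (100, 1_000, 10_000):
        xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
        empirical = np.mean(np.abs(xbar - mu) >= eps)  # estimate of P(|X̄_n - mu| >= eps)
        bound = sigma2 / (n * eps ** 2)                # Chebyshev bound
        print(n, empirical, min(bound, 1.0))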

Proof using convergence of characteristic functions

By Taylor's theorem for complex functions, the characteristic function of any random variable X with finite mean μ can be written as

\varphi_X(t) = 1 + it\mu + o(t), \quad t \rightarrow 0.
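
The expansion can be checked against a distribution whose characteristic function is known in closed form. A minimal sketch, assuming X ~ Exp(1), for which φ_X(t) = 1/(1 − it) and μ = 1; the remainder φ_X(t) − 1 − itμ should vanish faster than t as t → 0:

    # Characteristic function of Exp(1): E[exp(itX)] = 1 / (1 - it), with mean mu = 1
    mu = 1.0

    def phi(t):
        return 1.0 / (1.0 - 1j * t)

    for t in (1e-1, 1e-2, 1e-3, 1e-4):
        remainder = phi(t) - (1.0 + 1j * t * mu)
        print(t, abs(remainder) / t)  # |remainder| / t -> 0, i.e. the remainder is o(t)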

All X1, X2, ... have the same characteristic function, so we will simply denote this φX.

Among the basic properties of characteristic functions are

\varphi_{\frac 1 n X}(t) = \varphi_X(\tfrac t n) \quad \textrm{and} \quad
 \varphi_{X+Y}(t) = \varphi_X(t)\,\varphi_Y(t) \quad \textrm{if } X \textrm{ and } Y \textrm{ are independent}.
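
Both properties can be illustrated with an empirical characteristic function. A minimal sketch, assuming numpy and independent Exp(1) samples (illustrative choices); the scaling identity holds exactly on a fixed sample, while the product identity holds up to Monte Carlo error:

    import numpy as np

    rng = np.random.default_rng(0)
    t, n, reps = 0.7, 3, 200_000

    x = rng.exponential(size=reps)
    y = rng.exponential(size=reps)  # independent of x

    def ecf(z, s):
        # Empirical characteristic function: average of exp(i s z) over the sample
        return np.mean(np.exp(1j * s * z))

    print(ecf(x / n, t), ecf(x, t / n))          # scaling property
    print(ecf(x + y, t), ecf(x, t) * ecf(y, t))  # independence property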

These rules can be used to calculate the characteristic function of \scriptstyle\overline{X}_n in terms of φX:

\varphi_{\overline{X}_n}(t)= \left[\varphi_X\left({t \over n}\right)\right]^n = \left[1 + i\mu{t \over n} + o\left({t \over n}\right)\right]^n \, \rightarrow \, e^{it\mu}, \quad \textrm{as} \quad n \rightarrow \infty.
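
This limit can also be checked numerically for a fixed t. A minimal sketch, again assuming X ~ Exp(1) with φ_X(t) = 1/(1 − it) and μ = 1, and taking t = 1:

    import numpy as np

    mu, t = 1.0, 1.0

    def phi(s):
        # Closed-form characteristic function of Exp(1)
        return 1.0 / (1.0 - 1j * s)

    target = np.exp(1j * t * mu)  # characteristic function of the constant mu
    for n in (10, 100, 1_000, 10_000):
        value = phi(t / n) ** n   # characteristic function of the sample average
        print(n, abs(value - target))  # decreases toward 0 as n grows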

The limit e^{it\mu} is the characteristic function of the constant random variable μ, and hence, by the Lévy continuity theorem, \scriptstyle\overline{X}_n converges in distribution to μ:

\overline{X}_n \, \xrightarrow{\mathcal D} \, \mu \qquad\textrm{for}\qquad n \to \infty.

Since μ is a constant, convergence in distribution to μ and convergence in probability to μ are equivalent (see Convergence of random variables). It follows that

\overline{X}_n \, \xrightarrow{P} \, \mu \qquad\textrm{for}\qquad n \to \infty.