Law of total expectation

The proposition in probability theory known as the law of total expectation, the law of iterated expectations, the tower rule, or the smoothing theorem, among other names, states that if X is an integrable random variable (i.e., a random variable satisfying E( |X| ) < ∞) and Y is any random variable, not necessarily integrable, on the same probability space, then

\operatorname{E}(X) = \operatorname{E}\left( \operatorname{E}(X \mid Y) \right),

i.e., the expected value of the conditional expected value of X given Y is the same as the expected value of X.
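
For example (a hypothetical illustration, not from the original article): suppose Y takes the value 1 with probability 0.6 and the value 2 with probability 0.4, and suppose E( X | Y = 1 ) = 10 and E( X | Y = 2 ) = 20. The law then gives

\operatorname{E}(X) = \operatorname{E}\left( \operatorname{E}(X \mid Y) \right) = 0.6 \cdot 10 + 0.4 \cdot 20 = 14.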

The nomenclature used here parallels the phrase law of total probability. See also law of total variance.

(The conditional expected value E( X | Y ) is a random variable in its own right, whose value depends on the value of Y. The conditional expected value of X given the event Y = y is a function of y; this is where the conventional case-sensitive notation of probability theory, which distinguishes the random variable Y from one of its values y, becomes important. If we write E( X | Y = y ) = g(y), then the random variable E( X | Y ) is just g(Y).)
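
To make this concrete, the following minimal Python sketch (not from the original article; the joint distribution is invented purely for illustration) computes g(y) = E( X | Y = y ) from a discrete joint probability mass function and then averages g over the distribution of Y:

# A minimal sketch with a hypothetical joint pmf: joint[(x, y)] = P(X = x, Y = y).
joint = {
    (0, 0): 0.1, (1, 0): 0.3,
    (0, 1): 0.2, (1, 1): 0.4,
}

def p_y(y):
    # Marginal probability P(Y = y).
    return sum(p for (_, y2), p in joint.items() if y2 == y)

def g(y):
    # Conditional expectation g(y) = E(X | Y = y).
    return sum(x * p for (x, y2), p in joint.items() if y2 == y) / p_y(y)

# E(X | Y) is the random variable g(Y); averaging it over Y recovers E(X).
e_iterated = sum(g(y) * p_y(y) for y in {y for (_, y) in joint})
e_direct = sum(x * p for (x, _), p in joint.items())
print(e_iterated, e_direct)  # both are 0.7, up to floating-point rounding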

Proof in the discrete case

\begin{align}
\operatorname{E}\left( \operatorname{E}(X \mid Y) \right)
&= \sum\limits_y \operatorname{E}(X \mid Y=y) \cdot \operatorname{P}(Y=y) \\
&= \sum\limits_y \left( \sum\limits_x x \cdot \operatorname{P}(X=x \mid Y=y) \right) \cdot \operatorname{P}(Y=y) \\
&= \sum\limits_y \sum\limits_x x \cdot \operatorname{P}(X=x \mid Y=y) \cdot \operatorname{P}(Y=y) \\
&= \sum\limits_y \sum\limits_x x \cdot \operatorname{P}(Y=y \mid X=x) \cdot \operatorname{P}(X=x) \\
&= \sum\limits_x x \cdot \operatorname{P}(X=x) \cdot \left( \sum\limits_y \operatorname{P}(Y=y \mid X=x) \right) \\
&= \sum\limits_x x \cdot \operatorname{P}(X=x) \\
&= \operatorname{E}(X).
\end{align}

Here the fourth equality uses the definition of conditional probability, \operatorname{P}(X=x \mid Y=y) \operatorname{P}(Y=y) = \operatorname{P}(X=x, Y=y) = \operatorname{P}(Y=y \mid X=x) \operatorname{P}(X=x); the interchange and regrouping of the two sums is justified by absolute convergence, which the integrability of X guarantees; and the inner sum over y in the fifth line equals 1, since it totals the conditional distribution of Y given X = x.
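
The identity can also be sanity-checked by simulation. The short Python sketch below (a hypothetical mixture model matching the numerical example above, not part of the original article) draws Y first, then draws X from a Y-dependent distribution, and compares the sample mean of X with the mixture of conditional means:

import random

random.seed(0)

# Hypothetical model: P(Y = 1) = 0.6 with E(X | Y = 1) = 10,
# and P(Y = 2) = 0.4 with E(X | Y = 2) = 20.
N = 100_000
total = 0.0
for _ in range(N):
    if random.random() < 0.6:
        total += random.gauss(10, 2)  # draw X given Y = 1
    else:
        total += random.gauss(20, 2)  # draw X given Y = 2

print(total / N)  # close to E(X) = 0.6 * 10 + 0.4 * 20 = 14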