Rao–Blackwell theorem

From Wikipedia, the free encyclopedia

In statistics, the Rao–Blackwell theorem describes a technique that can transform an arbitrarily crude estimator into an estimator that is at least as good, and in favorable cases optimal, by the mean-squared-error criterion or any of a variety of similar criteria.

The theorem is named after Calyampudi Radhakrishna Rao and David Blackwell. The process of transforming an estimator using the Rao–Blackwell theorem is sometimes called Rao–Blackwellization. (Pronunciation: Rao rhymes with "now".)

1 Definitions
2 The theorem
- 2.1 Mean-squared-error version
- 2.2 Convex loss generalization
3 Properties
4 Example
5 Completeness and the Rao–Blackwell process
- 5.1 Idempotence
- 5.2 Lehmann-Scheffé minimum variance

Definitions

An estimator δ(X) is an observable random variable (i.e. a statistic) used for estimating some unobservable quantity. For example, one may be unable to observe the average height of all male students at the University of X, but one may observe the heights of a random sample of 40 of them. The average height of those 40 (the "sample average") may be used as an estimator of the unobservable "population average".

A sufficient statistic T(X) is an observable random variable such that the conditional probability distribution of all observable data X given T(X) does not depend on any of the unobservable quantities such as the mean or standard deviation of the whole population from which the data X was taken. In the most frequently cited examples, the "unobservable" quantities are parameters that parametrize a known family of probability distributions according to which the data are distributed.

A Rao–Blackwell estimator δ₁(X) of an unobservable quantity θ is the conditional expected value E(δ(X) | T(X)) of some estimator δ(X) given a sufficient statistic T(X). Call δ(X) the "original estimator" and δ₁(X) the "improved estimator". It is important that the improved estimator be observable, i.e., that it not depend on θ. Generally, the conditional expected value of one function of these data given another function of these data does depend on θ, but the very definition of sufficiency given above entails that this one does not.

The mean squared error of an estimator is the expected value of the square of its deviation from the unobservable quantity being estimated.

The theorem

Mean-squared-error version

One case of the Rao–Blackwell theorem states:

The mean squared error of the Rao–Blackwell estimator does not exceed that of the original estimator.

In other words,

$\operatorname{E}((\delta_1(X)-\theta)^2)\leq \operatorname{E}((\delta(X)-\theta)^2).\,\!$

The essential tools of the proof besides the definition above are the law of total expectation and the fact that for any random variable Y, E(Y²) cannot be less than [E(Y)]². That inequality is a case of Jensen's inequality, although it may also be shown to follow instantly from the frequently mentioned fact that

$0 \leq \operatorname{Var}(Y) = \operatorname{E}((Y-\operatorname{E}(Y))^2) = \operatorname{E}(Y^2)-(\operatorname{E}(Y))^2.\,\!$
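The argument can be written out as follows. This is a sketch of the standard proof, with T abbreviating T(X). Since δ₁(X) = E(δ(X) | T) and θ is a constant, applying the inequality above conditionally on T gives

$(\delta_1(X)-\theta)^2 = \left(\operatorname{E}(\delta(X)-\theta \mid T)\right)^2 \leq \operatorname{E}((\delta(X)-\theta)^2 \mid T).\,\!$

Taking expected values of both sides and using the law of total expectation on the right-hand side yields

$\operatorname{E}((\delta_1(X)-\theta)^2) \leq \operatorname{E}(\operatorname{E}((\delta(X)-\theta)^2 \mid T)) = \operatorname{E}((\delta(X)-\theta)^2).\,\!$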

Convex loss generalization

The more general version of the Rao–Blackwell theorem speaks of the "expected loss"

$\operatorname{E}(L(\delta_1(X)))\leq \operatorname{E}(L(\delta(X)))\,\!$

where the "loss function" L may be any convex function. For the proof of the more general version, Jensen's inequality cannot be dispensed with.
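A sketch of that step, writing T for T(X) and treating the loss as a convex function of the estimate: the conditional form of Jensen's inequality gives

$L(\delta_1(X)) = L(\operatorname{E}(\delta(X) \mid T)) \leq \operatorname{E}(L(\delta(X)) \mid T),\,\!$

and taking expected values of both sides, with the law of total expectation applied on the right, gives the inequality for the expected loss. The mean-squared-error version is the special case in which the loss of an estimate a is (a − θ)².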

Properties

The improved estimator is unbiased if and only if the original estimator is unbiased, as may be seen at once by using the law of total expectation. The theorem holds regardless of whether biased or unbiased estimators are used.

The theorem seems very weak: it says only that the allegedly improved estimator is no worse than the original estimator. In practice, however, the improvement is often enormous.

Example

Phone calls arrive at a switchboard according to a Poisson process at an average rate of λ per minute. This rate is not observable, but the numbers X₁, ..., X_n of phone calls that arrived during n successive one-minute periods are observed. It is desired to estimate the probability $e^{-\lambda}$ that the next one-minute period passes with no phone calls.

An extremely crude estimator of the desired probability is

$\delta_0=\left\{\begin{matrix}1 & \mbox{if}\ X_1=0 \\ 0 & \mbox{otherwise,}\end{matrix}\right.$

i.e., it estimates the probability to be 1 if no phone calls arrived in the first minute and 0 otherwise. Despite the apparent limitations of this estimator, the result given by its Rao–Blackwellization may perhaps be unexpected.

The sum

$\Sigma_{i} X_{i} = X_1+\cdots+X_n\,\!$

can be readily shown to be a sufficient statistic for λ, i.e., the conditional distribution of the data X₁, ..., X_n, given this sum, does not depend on λ. Therefore, we find the Rao–Blackwell estimator

$\delta_1=E(\delta_0|\Sigma_{i} X_{i}).\,\!$

After doing some algebra we have

$\delta_1=\left(1-{1 \over n}\right)^{\Sigma_{i} X_{i}}.\,\!$
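The omitted algebra can be sketched as follows, using the standard facts that the X_i are independent Poisson variables with mean λ and that a sum of k of them is Poisson with mean kλ. Because δ₀ is an indicator, its conditional expected value is a conditional probability. For n ≥ 2 and any value s of the sum,

$\operatorname{P}(X_1=0 \mid \Sigma_{i} X_{i}=s) = \frac{\operatorname{P}(X_1=0)\,\operatorname{P}(X_2+\cdots+X_n=s)}{\operatorname{P}(X_1+\cdots+X_n=s)} = \frac{e^{-\lambda}\,e^{-(n-1)\lambda}((n-1)\lambda)^s/s!}{e^{-n\lambda}(n\lambda)^s/s!} = \left(1-{1 \over n}\right)^{s}.\,\!$

Every factor involving λ cancels, as sufficiency of the sum requires, so the result is observable.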

Since the total number X₁ + ... + X_n of calls arriving during the first n minutes has expected value nλ, one might not be surprised if this estimator has a fairly high probability (if n is large) of being close to

$\left(1-{1 \over n}\right)^{n\lambda}\approx e^{-\lambda}.$

So δ₁ is clearly a very much improved estimator of that last quantity.
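The size of the improvement can be checked numerically. The following is a minimal simulation sketch, not part of the original example: the function names are hypothetical, the values λ = 2 and n = 20 are arbitrary choices, and the Poisson sampler uses Knuth's multiplication method only because Python's standard library does not provide one.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def crude_estimate(counts):
    """delta_0: 1 if no calls arrived in the first minute, else 0."""
    return 1.0 if counts[0] == 0 else 0.0


def rao_blackwell_estimate(counts):
    """delta_1: (1 - 1/n) raised to the total number of calls."""
    n = len(counts)
    return (1.0 - 1.0 / n) ** sum(counts)


def compare_mse(lam=2.0, n=20, trials=5000, seed=0):
    """Estimate the mean squared error of both estimators by simulation."""
    rng = random.Random(seed)
    target = math.exp(-lam)  # the probability being estimated
    sq_err_crude = 0.0
    sq_err_rb = 0.0
    for _ in range(trials):
        counts = [poisson_sample(rng, lam) for _ in range(n)]
        sq_err_crude += (crude_estimate(counts) - target) ** 2
        sq_err_rb += (rao_blackwell_estimate(counts) - target) ** 2
    return sq_err_crude / trials, sq_err_rb / trials


if __name__ == "__main__":
    mse_crude, mse_rb = compare_mse()
    print(f"MSE of crude estimator:         {mse_crude:.5f}")
    print(f"MSE of Rao-Blackwell estimator: {mse_rb:.5f}")
```

Both estimators are evaluated on the same simulated samples, so the two averages differ only because of the estimators themselves and not because of sampling noise between runs.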

Completeness and the Rao–Blackwell process

Idempotence

If the sufficient statistic is also a complete statistic, i.e., one for which no function of the statistic other than the (almost surely) zero function is an unbiased estimator of zero, the Rao–Blackwell process is idempotent: applying it to the already improved estimator yields no further improvement and merely returns the same improved estimator.

Lehmann-Scheffé minimum variance

If the conditioning statistic is both complete and sufficient, and the original estimator is unbiased, then the Lehmann-Scheffé theorem implies that the Rao–Blackwell estimator is the unique "best unbiased estimator".
