Completeness (statistics)
From Wikipedia, the free encyclopedia
In statistics, completeness is a property of a statistic indicating that, in a certain sense, the statistic obtains optimal information about the unknown parameters characterizing the distribution of the underlying data.
It is closely related to statistical sufficiency and often occurs in conjunction with it.
Definition
Suppose a random variable X (which may be a sequence (X1, ..., Xn) of scalar-valued random variables) has a probability distribution belonging to a known family of probability distributions, parametrized by θ, which may be either vector- or scalar-valued, and let s(X) be any statistic based on X. Then s(X) is a complete statistic precisely if the only functions g for which E(g(s(X))) = 0 for every value of θ are functions which are themselves zero almost everywhere.
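In symbols, writing T = s(X) and E_θ, P_θ for expectation and probability when the parameter value is θ, the condition can be restated as follows (a standard formalization added here for reference):

```latex
% Completeness of T = s(X): the only functions of T that are unbiased
% estimators of zero are those that are zero almost everywhere.
E_{\theta}\!\left[g(T)\right] = 0 \ \text{ for all } \theta
\quad\Longrightarrow\quad
P_{\theta}\!\left(g(T) = 0\right) = 1 \ \text{ for all } \theta .
```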
Examples
A counterexample
Suppose (X1, X2) are independent, identically distributed random variables, normally distributed with expectation θ and variance 1.
Then
$$g(X_1, X_2) = X_1 - X_2$$
is an unbiased estimator of zero: its expectation is θ − θ = 0 whatever the value of θ, yet it is not zero almost everywhere. Therefore the pair (X1, X2) itself is not a complete statistic.
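As an illustrative check (a simulation sketch added here, not part of the original argument), the following estimates E(X1 − X2) by Monte Carlo for a few arbitrary values of θ; each estimate is close to zero even though X1 − X2 is not the zero function:

```python
import numpy as np

rng = np.random.default_rng(0)

# For several (arbitrary) parameter values theta, estimate E[X1 - X2]
# by simulation. The estimate is close to 0 for every theta, even
# though the function (x1, x2) -> x1 - x2 is not zero almost everywhere.
for theta in (-3.0, 0.0, 1.5, 10.0):
    x1 = rng.normal(loc=theta, scale=1.0, size=1_000_000)
    x2 = rng.normal(loc=theta, scale=1.0, size=1_000_000)
    print(theta, np.mean(x1 - x2))
```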
An example
The sum
$$s(X_1, X_2) = X_1 + X_2$$
is a complete statistic. To show this, one demonstrates that there is no non-zero function g such that the expectation of
$$g(X_1 + X_2)$$
remains zero regardless of the value of θ.
That fact may be seen as follows. The probability distribution of X1 + X2 is normal with expectation 2θ and variance 2. Its probability density function in x is therefore proportional to
$$\exp\left(-\frac{(x - 2\theta)^2}{4}\right).$$
The expectation of g above would therefore be a constant times
$$\int_{-\infty}^{\infty} g(x)\,\exp\left(-\frac{(x - 2\theta)^2}{4}\right)\,dx.$$
A bit of algebra (expanding the square in the exponent) reduces this to
$$k(\theta) \int_{-\infty}^{\infty} h(x)\,e^{x\theta}\,dx,$$
where $k(\theta) = e^{-\theta^2}$ is nowhere zero and
$$h(x) = g(x)\,e^{-x^2/4}.$$
As a function of θ this is a two-sided Laplace transform of h, and it cannot be identically zero unless h(x) is zero almost everywhere. Since the exponential factor $e^{-x^2/4}$ is nowhere zero, this can only happen if g(x) is zero almost everywhere.
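The algebraic reduction above can be checked symbolically; the sketch below (an illustrative aside using SymPy, not part of the article) verifies that the exponent −(x − 2θ)²/4 splits into a θ-only term, an x-only term, and the cross term xθ:

```python
import sympy as sp

x, theta = sp.symbols('x theta', real=True)

# Exponent of the density of X1 + X2 (up to a constant factor).
original = -(x - 2*theta)**2 / 4

# Claimed factorization: a theta-only term, an x-only term, and x*theta,
# so exp(original) = exp(-theta**2) * exp(-x**2/4) * exp(x*theta),
# i.e. k(theta) = exp(-theta**2) and h(x) = g(x) * exp(-x**2/4).
factored = -theta**2 - x**2/4 + x*theta

print(sp.simplify(original - factored))  # prints 0, confirming the identity
```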
Utility
Lehmann–Scheffé theorem
The major importance of completeness is in the application of the Lehmann–Scheffé theorem, which states that an unbiased estimator that is a function of a complete, sufficient statistic is the best estimator of its expected value, i.e., the one with the smallest expected loss for any convex loss function (in typical practice, the smallest mean squared error) among all estimators with the same expected value.
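As a concrete illustration (a simulation sketch under assumed settings, not taken from the article): for X1, ..., Xn i.i.d. N(θ, 1), the sample mean is an unbiased estimator of θ and a function of the complete sufficient statistic X1 + ... + Xn, while the single observation X1 is also unbiased but ignores most of the data. The sketch compares their mean squared errors:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 10, 200_000   # arbitrary illustrative settings

samples = rng.normal(loc=theta, scale=1.0, size=(reps, n))

mean_est = samples.mean(axis=1)  # unbiased; a function of the complete sufficient statistic sum(X)
first_obs = samples[:, 0]        # also unbiased for theta, but not a function of the sufficient statistic

print("MSE of sample mean:", np.mean((mean_est - theta) ** 2))  # close to 1/n = 0.1
print("MSE of X1 alone:  ", np.mean((first_obs - theta) ** 2))  # close to 1
```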
Basu's theorem
Completeness is also a prerequisite for the applicability of Basu's theorem: a statistic which is both complete and sufficient is independent of any ancillary statistic (one whose probability distribution does not depend on the parameter θ).
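For example, with X1, ..., Xn i.i.d. N(θ, 1), the sample mean is complete and sufficient for θ while the sample variance is ancillary, so Basu's theorem implies that the two are independent. The sketch below (an illustrative check with arbitrary settings) estimates their correlation, which should be close to zero:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 5.0, 8, 200_000   # arbitrary illustrative settings

samples = rng.normal(loc=theta, scale=1.0, size=(reps, n))

xbar = samples.mean(axis=1)        # complete sufficient statistic for theta
s2 = samples.var(axis=1, ddof=1)   # ancillary: its distribution does not involve theta

# Independence (Basu's theorem) implies zero correlation; the Monte Carlo
# estimate of the correlation should be near 0.
print(np.corrcoef(xbar, s2)[0, 1])
```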