Estimator

In statistics, an estimator is a function of the observable sample data that is used to estimate an unknown population parameter; an estimate is the result of actually applying that function to a particular set of data. Many different estimators are possible for any given parameter. Some criterion is used to choose between estimators, although it is often the case that no criterion clearly picks one estimator over another. To estimate a parameter of interest (e.g., a population mean, a binomial proportion, a difference between two population means, or a ratio of two population standard deviations), the usual procedure is as follows:

1. Select a random sample from the population of interest.

2. Calculate the point estimate of the parameter.

3. Calculate a measure of the estimate's variability, such as its standard error.

4. Report the estimate together with this measure of variability, often in the form of a confidence interval; a sketch of these steps in code is given below.
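The following Python sketch walks through these four steps for a population mean. The data are simulated and the 95% interval uses a normal approximation; both are assumptions made only for the illustration.

    import math
    import random

    # Hypothetical population: pretend its mean is unknown and estimate it.
    random.seed(0)
    sample = [random.gauss(50, 10) for _ in range(100)]   # step 1: draw a random sample

    n = len(sample)
    mean_hat = sum(sample) / n                            # step 2: point estimate of the mean

    # step 3: a measure of variability -- the standard error of the sample mean
    var_hat = sum((x - mean_hat) ** 2 for x in sample) / (n - 1)
    std_err = math.sqrt(var_hat / n)

    # step 4: report the estimate together with an approximate 95% confidence interval
    low, high = mean_hat - 1.96 * std_err, mean_hat + 1.96 * std_err
    print(f"estimate = {mean_hat:.2f}, 95% CI = ({low:.2f}, {high:.2f})")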

There are two types of estimators: point estimators and interval estimators.

Point estimators

For a point estimator \widehat{\theta} of parameter θ,

  1. The error of \widehat{\theta} is \widehat{\theta} - \theta
  2. The bias of \widehat{\theta} is defined as B(\widehat{\theta}) = \operatorname{E}(\widehat{\theta}) - \theta.
  3. \widehat{\theta} is an unbiased estimator of θ if and only if B(\widehat{\theta}) = 0 for all θ, or, equivalently, if and only if \operatorname{E}(\widehat{\theta}) = \theta for all θ.
  4. The mean squared error of \widehat{\theta} is defined as \operatorname{MSE}(\widehat{\theta}) = \operatorname{E}[(\widehat{\theta} - \theta)^2].
  5. \operatorname{MSE}(\widehat{\theta}) = \operatorname{var}(\widehat\theta) + (B(\widehat{\theta}))^2,
i.e. mean squared error = variance + square of bias.

where var(X) is the variance of X and E(X) is the expected value of X.
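The identity in item 5 can be verified directly by adding and subtracting \operatorname{E}(\widehat{\theta}) inside the square (θ is treated as a fixed constant, so B(\widehat{\theta}) is a constant as well):

\operatorname{MSE}(\widehat{\theta}) = \operatorname{E}\left[\left(\left(\widehat{\theta} - \operatorname{E}(\widehat{\theta})\right) + \left(\operatorname{E}(\widehat{\theta}) - \theta\right)\right)^2\right]

= \operatorname{E}\left[\left(\widehat{\theta} - \operatorname{E}(\widehat{\theta})\right)^2\right] + 2\left(\operatorname{E}(\widehat{\theta}) - \theta\right)\operatorname{E}\left[\widehat{\theta} - \operatorname{E}(\widehat{\theta})\right] + \left(\operatorname{E}(\widehat{\theta}) - \theta\right)^2

= \operatorname{var}(\widehat{\theta}) + \left(B(\widehat{\theta})\right)^2,

since \operatorname{E}[\widehat{\theta} - \operatorname{E}(\widehat{\theta})] = 0, so the middle term vanishes.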

The standard deviation of an estimator of θ (the square root of its variance), or an estimate of that standard deviation, is called the standard error of \widehat{\theta}.
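The standard error can be checked by simulation. In the Python sketch below (the normal population, sigma = 10 and n = 25 are arbitrary choices), the standard deviation of the sample mean over many repeated samples comes out close to the theoretical value sigma/sqrt(n), which is exactly what a standard error reported from a single sample is estimating.

    import math
    import random
    import statistics

    random.seed(2)
    sigma, n, trials = 10.0, 25, 20000      # arbitrary choices for the illustration

    # Sampling distribution of the estimator: draw many samples, keep each sample mean.
    means = []
    for _ in range(trials):
        sample = [random.gauss(0.0, sigma) for _ in range(n)]
        means.append(sum(sample) / n)

    print("sd of the estimator (simulated):      ", statistics.pstdev(means))
    print("theoretical standard error sigma/sqrt(n):", sigma / math.sqrt(n))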

Consistency

A consistent estimator is an estimator that converges in probability to the quantity being estimated as the sample size grows.

An estimator t_n (where n is the sample size) is a consistent estimator for parameter θ if and only if, for all ε > 0, no matter how small, we have

\lim_{n\to\infty}{\rm Prob}\left\{ \left| t_n-\theta\right|<\epsilon \right\}=1.

It is called strongly consistent if it converges almost surely to the true value.
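A small simulation illustrates the definition for the sample mean (the Uniform(0, 1) population, ε = 0.05 and the grid of sample sizes are arbitrary choices): the estimated probability that t_n lies within ε of θ approaches 1 as n grows.

    import random

    random.seed(3)
    true_mean, eps, trials = 0.5, 0.05, 2000   # Uniform(0, 1) has mean 0.5

    for n in (10, 100, 1000, 5000):
        inside = 0
        for _ in range(trials):
            sample_mean = sum(random.random() for _ in range(n)) / n
            inside += abs(sample_mean - true_mean) < eps
        print(f"n = {n:5d}   Prob(|t_n - theta| < {eps}) ~ {inside / trials:.3f}")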

Efficiency

The quality of an estimator is generally judged by its mean squared error.

However, occasionally one chooses the unbiased estimator with the lowest variance. Efficient estimators are those that have the lowest possible variance among all unbiased estimators. In some cases, a biased estimator may have a uniformly smaller mean squared error than any unbiased estimator, so restricting attention to unbiased estimators should not be taken too far. For that and other reasons, it is sometimes preferable not to limit oneself to unbiased estimators; see bias (statistics). Concerning such "best unbiased estimators", see also the Cramér-Rao inequality, the Gauss-Markov theorem, the Lehmann-Scheffé theorem, and the Rao-Blackwell theorem.
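The point about biased estimators is easy to see in a simulation (a sketch only; the normal population and the small sample size n = 8 are arbitrary choices): the maximum-likelihood variance estimator that divides by n is biased downward, yet it has a smaller mean squared error than the unbiased estimator that divides by n - 1.

    import random
    import statistics

    random.seed(5)
    true_var, n, trials = 4.0, 8, 50000     # arbitrary choices for the illustration

    sq_err_unbiased, sq_err_ml = [], []
    for _ in range(trials):
        sample = [random.gauss(0.0, true_var ** 0.5) for _ in range(n)]
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)
        sq_err_unbiased.append((ss / (n - 1) - true_var) ** 2)   # divide by n - 1
        sq_err_ml.append((ss / n - true_var) ** 2)               # divide by n (biased)

    print("MSE of the unbiased estimator: ", statistics.fmean(sq_err_unbiased))
    print("MSE of the biased ML estimator:", statistics.fmean(sq_err_ml))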

Robustness

See: Robust estimator, Robust statistics

Other properties

Sometimes, estimators must satisfy further restrictions (restricted estimators): e.g., one might require an estimated probability to lie between zero and one, or an estimated variance to be nonnegative. Sometimes this conflicts with the requirement of unbiasedness; see the example in Bias (statistics) concerning the estimation of exp(-2λ) based on a sample of size one from the Poisson distribution with mean λ.
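That Poisson example can be made concrete with a short sketch (λ = 1 is an arbitrary choice, and the Poisson sampler is written out only to keep the example self-contained): the estimator (-1)^X is unbiased for exp(-2λ), but every individual estimate it produces is +1 or -1, never a value inside (0, 1) where exp(-2λ) actually lies.

    import math
    import random

    random.seed(4)
    lam, trials = 1.0, 200000        # arbitrary rate for the illustration

    def poisson(rate):
        """Draw one Poisson(rate) variate by inversion (fine for small rates)."""
        threshold, k, p = math.exp(-rate), 0, 1.0
        while True:
            p *= random.random()
            if p <= threshold:
                return k
            k += 1

    # (-1)**X is unbiased for exp(-2*lambda), but each estimate is +1 or -1,
    # so a single estimate is never a plausible value for a quantity in (0, 1).
    estimates = [(-1) ** poisson(lam) for _ in range(trials)]
    print("average estimate:      ", sum(estimates) / trials)
    print("target exp(-2*lambda): ", math.exp(-2 * lam))
    print("values actually taken: ", sorted(set(estimates)))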
