Winsorized mean


A Winsorized mean is a statistical measure of central tendency, much like the mean and median, and even more similar to the truncated mean. It is the mean calculated after replacing given parts of a probability distribution or sample at the high and low ends with the most extreme remaining values. Typically an equal amount is replaced at each end; often 10 to 25 percent of the values at each end are replaced.
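The definition above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `winsorized_mean` and its `proportion` parameter are hypothetical, and it assumes that the number of values replaced at each end is the requested proportion of the sample size, rounded down.

```python
def winsorized_mean(values, proportion=0.1):
    """Mean after replacing the lowest and highest `proportion` of the
    values with the most extreme remaining values (illustrative sketch)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")
    # Number of values replaced at each end (assumption: round down).
    # The small epsilon guards against floating-point results such as 28.999...
    k = int(proportion * n + 1e-9)
    low, high = data[k], data[n - k - 1]
    # Clamp every value into [low, high], then average over all n values.
    clamped = [min(max(x, low), high) for x in data]
    return sum(clamped) / n


print(winsorized_mean([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1))  # 5.5
```

Unlike a truncated mean, the divisor stays at the full sample size, because the extreme values are replaced rather than discarded.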

Advantages

The Winsorized mean is a useful estimator because it is less sensitive to outliers than the mean but will still give a reasonable estimate of central tendency or mean for almost all statistical models. In this regard it is referred to as a robust estimator.
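A small numerical comparison may help illustrate the reduced sensitivity to outliers. The sample values and the helper `winsorize_10_percent` below are invented for illustration; the helper assumes a sample of 10 values, so that 10% corresponds to one value at each end.

```python
import statistics


def winsorize_10_percent(values):
    """Replace the smallest and largest of 10 values with their neighbours
    (hypothetical helper, valid only for samples of exactly 10 values)."""
    data = sorted(values)
    if len(data) != 10:
        raise ValueError("this sketch assumes exactly 10 values")
    data[0] = data[1]     # lowest value replaced by the second lowest
    data[-1] = data[-2]   # highest value replaced by the second highest
    return data


clean = [12, 14, 15, 15, 16, 17, 18, 19, 20, 21]
contaminated = [12, 14, 15, 15, 16, 17, 18, 19, 20, 210]  # one gross outlier

mean_clean = statistics.mean(clean)
mean_contaminated = statistics.mean(contaminated)
wins_clean = statistics.mean(winsorize_10_percent(clean))
wins_contaminated = statistics.mean(winsorize_10_percent(contaminated))

# The ordinary mean roughly doubles; the Winsorized mean does not move.
print(mean_clean, mean_contaminated)  # 16.7 35.6
print(wins_clean, wins_contaminated)  # 16.8 16.8
```

In this sample a single extreme value shifts the ordinary mean from 16.7 to 35.6, while the 10% Winsorized mean is 16.8 in both cases.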

Drawbacks

The Winsorized mean uses more information from the distribution or sample than the median. However, unless the underlying distribution is symmetric, the Winsorized mean of a sample is unlikely to produce an unbiased estimator for either the mean or the median.

Examples

  • If you have 10 numbers (from x_1, the smallest, to x_{10}, the largest), then a 10% Winsorized mean is \frac{x_2 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 + x_9}{10}. Note that the repetition of x_2 and x_9 is deliberate: x_2 replaces x_1 and x_9 replaces x_{10}.
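As a worked instance of the formula in the example above, consider the made-up sorted sample 3, 5, 6, 7, 8, 9, 10, 11, 13, 48 (these numbers are chosen purely for illustration). The smallest value, 3, is replaced by 5 and the largest, 48, by 13:

```latex
\bar{x}_{W}
  = \frac{5 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 13 + 13}{10}
  = \frac{87}{10}
  = 8.7,
\qquad
\bar{x}
  = \frac{3 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 13 + 48}{10}
  = \frac{120}{10}
  = 12.
```

Here \bar{x}_{W} denotes the 10% Winsorized mean and \bar{x} the ordinary mean; the symbols are notation introduced for this illustration only.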

Plf515 02:30, 24 November 2006 (UTC)