Truncated mean
From Wikipedia, the free encyclopedia
A truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and median. It is the mean calculated after discarding given parts of a probability distribution or sample at the high and low ends, typically an equal amount from each end.
For most statistical applications, 5 to 25 percent of the ends are discarded.
In some regions of Central Europe it is also known as a Windsor mean, but it should not be confused with the Winsorized mean, which is similar but distinct.
Notation
The index of the mean indicates the percentage of entries removed from each side.
For example, if you were to truncate a sample with 8 entries by 12.5%, you would discard the smallest and the largest entry (the first and last entries of the sorted sample) when calculating the truncated mean.
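The 12.5% example above can be sketched in Python. This is only an illustrative sketch: the function name `truncated_mean`, the sample values, and the choice to express the percentage as a fraction per side are assumptions, and the code assumes the fraction corresponds to a whole number of entries.

```python
def truncated_mean(values, fraction):
    """Mean after dropping `fraction` of the entries from EACH end of the sorted sample.

    Hypothetical helper for illustration; assumes fraction * len(values)
    is (close to) a whole number of entries.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(values)
    # Number of entries to drop per side; the small offset guards against
    # floating-point results such as 2.9999999999999996.
    k = int(fraction * len(ordered) + 1e-9)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Made-up sample with 8 entries; 12.5% per side removes one entry from each end.
sample = [4, 1, 7, 3, 100, 5, 6, 2]
print(truncated_mean(sample, 0.125))  # mean of 2, 3, 4, 5, 6, 7
```

Sorting first matters: "first" and "last" refer to positions in the ordered sample, not to the order in which the entries were recorded.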
Interpolation
When the requested percentage does not correspond to a whole number of entries, the trimmed mean of a sample cannot be calculated exactly. The usual approach is to calculate the two nearest trimmed means and interpolate between them (usually linearly).
For example, if you need to calculate the 15% trimmed mean of a sample containing 10 entries, you would calculate the 10% trimmed mean (removing 1 entry on either side of the sample) and the 20% trimmed mean (removing 2 entries on either side), and then interpolate between the two to determine the 15% trimmed mean.
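A minimal sketch of the linear interpolation described above, assuming the weight is the fractional part of the number of entries to remove per side. The function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def trimmed_mean_by_count(values, k):
    """Mean after dropping the k smallest and k largest entries."""
    ordered = sorted(values)
    if 2 * k >= len(ordered):
        raise ValueError("trimming would remove every entry")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def interpolated_trimmed_mean(values, fraction):
    """Linearly interpolate between the two nearest whole-entry trimmed means."""
    exact = fraction * len(values)   # e.g. 0.15 * 10 = 1.5 entries per side
    lower = math.floor(exact)        # nearest trim that removes fewer entries
    upper = math.ceil(exact)         # nearest trim that removes more entries
    low_mean = trimmed_mean_by_count(values, lower)
    if lower == upper:
        return low_mean              # whole number of entries: no interpolation needed
    high_mean = trimmed_mean_by_count(values, upper)
    weight = exact - lower           # position between the two trims, in [0, 1)
    return low_mean + weight * (high_mean - low_mean)


# Made-up sample with 10 entries.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 20, 100]
print(trimmed_mean_by_count(sample, 1))          # 10% trimmed mean
print(trimmed_mean_by_count(sample, 2))          # 20% trimmed mean
print(interpolated_trimmed_mean(sample, 0.15))   # halfway between the two
```

For this sample the 10% and 20% trimmed means are 6.875 and 5.5, so the interpolated 15% value lies midway between them.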
Advantages
The trimmed mean is a useful estimator because it is less sensitive to outliers than the mean but will still give a reasonable estimate of central tendency or mean for almost all statistical models. In this regard it is referred to as a robust estimator.
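The reduced sensitivity to outliers can be seen in a small comparison. The sample below is invented for illustration, and `trimmed_mean` is a hypothetical helper that removes a fixed number of entries from each end; it is a sketch, not a reference implementation.

```python
from statistics import mean


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest entries (illustrative helper)."""
    ordered = sorted(values)
    if 2 * k >= len(ordered):
        raise ValueError("trimming would remove every entry")
    return mean(ordered[k:len(ordered) - k])


# Seven ordinary values plus one extreme outlier (made-up data).
sample = [2, 3, 4, 5, 6, 7, 8, 1000]

print(mean(sample))             # pulled far upward by the single outlier
print(trimmed_mean(sample, 1))  # outlier discarded along with the smallest entry
```

Here the ordinary mean is 129.375, while the mean after removing one entry from each end is 5.5, which is much closer to the bulk of the data.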
Drawbacks
The truncated mean uses more information from the distribution or sample than the median, but unless the underlying distribution is symmetric, the truncated mean of a sample is unlikely to be an unbiased estimator of either the mean or the median.
Examples
The scoring method used in many sports that are evaluated by a panel of judges is a truncated mean: discard the lowest and the highest scores; calculate the mean value of the remaining scores.
The interquartile mean is another example, in which the lowest 25% and the highest 25% are discarded and the mean of the remaining scores is calculated.
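Both examples can be sketched in a few lines of Python. The function names and the scores are hypothetical, and the interquartile sketch assumes the sample size is divisible by 4 so that exactly one quarter can be removed from each end without interpolation.

```python
def judged_score(scores):
    """Panel scoring: discard the single lowest and highest score, average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores")
    ordered = sorted(scores)
    kept = ordered[1:-1]
    return sum(kept) / len(kept)


def interquartile_mean(values):
    """Mean of the middle 50% of the sorted sample.

    Assumes len(values) is a positive multiple of 4 (simplification for illustration).
    """
    n = len(values)
    if n == 0 or n % 4 != 0:
        raise ValueError("this sketch requires a sample size divisible by 4")
    ordered = sorted(values)
    k = n // 4
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


# Made-up scores from a panel of seven judges.
print(judged_score([9.5, 9.0, 8.5, 9.0, 7.0, 9.5, 10.0]))

# Made-up sample of eight values; the middle four are 3, 4, 5, 6.
print(interquartile_mean([1, 2, 3, 4, 5, 6, 7, 800]))
```

In the second call the extreme value 800 falls in the discarded top quarter, so it has no effect on the result.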