Median absolute deviation
From Wikipedia, the free encyclopedia
In statistics, the median absolute deviation (or "MAD") is a resistant measure of the variability of a univariate sample.
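For a univariate data set $X_1, X_2, \ldots, X_n$, the MAD is defined as
$$\operatorname{MAD} = \operatorname{median}_i\left( \left| X_i - \operatorname{median}_j(X_j) \right| \right),$$
that is, starting with the residuals (deviations) from the data's median, the MAD is the median of their absolute values.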
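The definition can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `mad` and the sample values are chosen here for the example and are not part of the definition.

```python
from statistics import median


def mad(data):
    """Median absolute deviation: the median of |x - median(data)|."""
    center = median(data)
    # Absolute residuals (deviations) from the data's median.
    abs_deviations = [abs(x - center) for x in data]
    return median(abs_deviations)


# Illustrative sample (an assumption for this sketch).
sample = [1, 1, 2, 2, 4, 6, 9]
# The median is 2, the absolute deviations are [1, 1, 0, 0, 2, 4, 7],
# and the median of those deviations is 1.
print(mad(sample))  # 1
```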
Uses
The MAD can be used to estimate the scale parameter of distributions for which the variance and standard deviation do not exist, such as the Cauchy distribution. Even for distributions whose variance exists, the MAD has advantages over the standard deviation. For instance, the MAD is more resilient to outliers in a data set. The standard deviation squares the distances from the mean before averaging them, so large deviations are weighted more heavily and a few outliers can dominate the result. In the MAD, the magnitude of the deviations of a small number of outliers is irrelevant: making an extreme value more extreme does not change the median of the absolute deviations.
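The resilience to outliers can be seen by replacing one observation with an extreme value and comparing both measures. The following is a small sketch using Python's standard `statistics` module; the helper `mad` and both data sets are made up for illustration.

```python
from statistics import median, stdev


def mad(data):
    """Median of the absolute deviations from the data's median."""
    center = median(data)
    return median([abs(x - center) for x in data])


# Hypothetical data: the second list replaces the largest value by an outlier.
clean = [1, 1, 2, 2, 4, 6, 9]
contaminated = [1, 1, 2, 2, 4, 6, 900]

# The MAD is the same for both lists, because only the ordering of the
# deviations around the median matters, not the size of the largest one.
print("MAD:  ", mad(clean), mad(contaminated))

# The sample standard deviation grows by roughly two orders of magnitude,
# because the outlier's squared distance from the mean dominates the sum.
print("stdev:", stdev(clean), stdev(contaminated))
```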
Relation to standard deviation
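As an estimate for the standard deviation $\sigma$, one takes
$$\hat{\sigma} = K \cdot \operatorname{MAD},$$
where $K$ is a constant. For normally distributed data $K$ is taken to be $1/\Phi^{-1}(3/4) \approx 1.4826$ (where $\Phi^{-1}$ is the inverse of the cumulative distribution function for the standard normal distribution), because the MAD is given by:
$$\frac{1}{2} = P\left(|X - \mu| \le \operatorname{MAD}\right) = P\left(\left|\frac{X - \mu}{\sigma}\right| \le \frac{\operatorname{MAD}}{\sigma}\right) = P\left(|Z| \le \frac{\operatorname{MAD}}{\sigma}\right),$$
where $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, and $Z = (X - \mu)/\sigma$ is standard normal.
Hence
$$\Phi\left(\frac{\operatorname{MAD}}{\sigma}\right) - \Phi\left(-\frac{\operatorname{MAD}}{\sigma}\right) = \frac{1}{2},$$
and, since $\Phi(-x) = 1 - \Phi(x)$:
$$\frac{\operatorname{MAD}}{\sigma} = \Phi^{-1}(3/4) \approx 0.6745.$$
In this case, the expectation of $\hat{\sigma}$ for large samples of normally distributed $X_i$ is approximately equal to the standard deviation of the normal distribution.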
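The constant for normal data can be checked numerically. The sketch below computes K from the inverse standard normal CDF with `statistics.NormalDist` and applies K times the MAD to a simulated normal sample. The sample size, seed, mean, and standard deviation are arbitrary choices for illustration, and the helper `mad` is defined here only for the example.

```python
import random
from statistics import NormalDist, median


def mad(data):
    """Median of the absolute deviations from the data's median."""
    center = median(data)
    return median([abs(x - center) for x in data])


# K = 1 / Phi^{-1}(3/4), with Phi^{-1} the inverse standard normal CDF.
K = 1.0 / NormalDist().inv_cdf(0.75)
print(f"K = {K:.4f}")  # K = 1.4826

# Simulated normal sample; all parameters below are assumptions of this sketch.
rng = random.Random(0)
true_mu, true_sigma, n = 10.0, 2.0, 20_000
sample = [rng.gauss(true_mu, true_sigma) for _ in range(n)]

# For a large normal sample, K * MAD should land close to the true sigma.
sigma_hat = K * mad(sample)
print(f"estimated sigma = {sigma_hat:.3f} (true value {true_sigma})")
```

For data that are not approximately normal, a different K would be needed, so the value 1.4826 should not be applied blindly.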