Median absolute deviation

From Wikipedia, the free encyclopedia

In statistics, the median absolute deviation (or "MAD") is a resistant measure of the variability of a univariate sample.

For a univariate data set X₁, X₂, ..., X_n, the MAD is defined as

$\operatorname{MAD} = \operatorname{median}_{i}\left(\ \left| X_{i} - \operatorname{median}_{j} (X_{j}) \right|\ \right), \,$

that is, starting with the residuals (deviations) from the data's median, the MAD is the median of their absolute values.
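The definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `mad` and the sample values are assumptions chosen for the example, not part of any reference implementation.

```python
from statistics import median


def mad(data):
    """Median absolute deviation of a univariate sample.

    Illustrative helper (name is an assumption): take the median of the
    data, then the median of the absolute deviations from it.
    """
    centre = median(data)
    return median([abs(x - centre) for x in data])


# Hypothetical sample: the median is 2, the absolute deviations are
# 1, 1, 0, 0, 2, 4, 7, and their median is 1.
sample = [1, 1, 2, 2, 4, 6, 9]
print(mad(sample))
```

For this sample the sorted absolute deviations are 0, 0, 1, 1, 2, 4, 7, so the MAD is 1.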

Uses

The MAD can be used to estimate the scale parameter of distributions for which the variance and standard deviation do not exist, such as the Cauchy distribution. Even for distributions whose variance exists, the MAD has advantages over the standard deviation; in particular, it is more resilient to outliers in a data set. In the standard deviation, the distances from the mean are squared, so large deviations are weighted more heavily in the average. In the MAD, the magnitudes of the deviations of a small number of outliers are irrelevant, because only their rank among the absolute deviations matters.
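The resilience to outliers can be illustrated with a small experiment. The sketch below is only an illustration: the data values and the helper name `mad` are assumptions, and the sample standard deviation comes from Python's standard `statistics.stdev`.

```python
from statistics import median, stdev


def mad(data):
    """Median of the absolute deviations from the median (illustrative helper)."""
    centre = median(data)
    return median([abs(x - centre) for x in data])


# Hypothetical data: the second list replaces the largest value with a
# gross outlier.
clean = [2, 3, 3, 4, 4, 5, 5, 6, 7]
dirty = clean[:-1] + [700]

# The MAD is the same for both lists, while the standard deviation of
# the contaminated list is far larger.
print("MAD:  ", mad(clean), mad(dirty))
print("stdev:", stdev(clean), stdev(dirty))
```

Replacing one value changes only the largest absolute deviation, which does not move the median of the deviations, so the MAD stays at 1 in both cases.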

Relation to standard deviation

As an estimate for the standard deviation σ, one takes

$\hat{\sigma}=K\cdot \operatorname{MAD},$

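where K is a constant scale factor that depends on the distribution. For normally distributed data, K is taken to be 1 / Φ^-1(3/4) ≈ 1.4826, where Φ^-1 is the inverse of the cumulative distribution function of the standard normal distribution. This value follows because the MAD of a normal distribution with mean μ and standard deviation σ is the half-width of the interval around μ that contains half of the probability: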

$\frac 12 =P(|X-\mu|\le \operatorname{MAD})=P\left(\left|\frac{X-\mu}{\sigma}\right|\le \frac {\operatorname{MAD}}\sigma\right)=P\left(|Z|\le \frac {\operatorname{MAD}}\sigma\right).$

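Hence, because P(|Z| ≤ c) = 2Φ(c) − 1 for c ≥ 0, setting this probability equal to 1/2 gives Φ(MAD/σ) = 3/4, so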

$\frac {\operatorname{MAD}}\sigma=\Phi^{-1}(3/4) \approx 0.6745$

and:

$\sigma \approx 1.4826\ \operatorname{MAD}.$

In this case, the expectation of 1.4826 · MAD for large samples of normally distributed X_i is approximately equal to the standard deviation σ of the normal distribution.
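The constant and the resulting estimator can be checked numerically. The following is a sketch under stated assumptions: it relies on `statistics.NormalDist` (available in Python 3.8 and later) for Φ^-1, and the names `K`, `mad` and `sigma_hat`, the seed, and the simulated parameters (mean 10, standard deviation 2) are all chosen for illustration.

```python
import random
from statistics import NormalDist, median

# K = 1 / Phi^-1(3/4), the consistency constant for normal data.
K = 1 / NormalDist().inv_cdf(0.75)


def mad(data):
    """Median of the absolute deviations from the median (illustrative helper)."""
    centre = median(data)
    return median([abs(x - centre) for x in data])


def sigma_hat(data):
    """Estimate the standard deviation as K * MAD, assuming normal data."""
    return K * mad(data)


# Simulated normal sample with a fixed seed so the run is repeatable.
rng = random.Random(0)
sample = [rng.gauss(10.0, 2.0) for _ in range(20000)]

print(round(K, 4))        # the constant, about 1.4826
print(sigma_hat(sample))  # should be close to the true value 2.0
```

With a sample this large the scaled MAD should land close to the true standard deviation, although it varies more from sample to sample than the ordinary sample standard deviation does for normal data.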

References

Hoaglin, David C.; Frederick Mosteller and John W. Tukey (1983). Understanding Robust and Exploratory Data Analysis. John Wiley & Sons, 404-414. ISBN 0-471-09777-2.
Russell, Roberta S.; Bernard W. Taylor III. (2006). Operations Management. John Wiley & Sons, 497-498. ISBN 0-471-69209-3.
Venables, W.N.; B.D. Ripley (1999). Modern Applied Statistics with S-PLUS. Springer, 128. ISBN 0-387-98825-4.