In statistics, the mid-range or mid-extreme of a set of statistical data values is the arithmetic mean of the maximum and minimum values in the data set,[1] or:

$$M = \frac{\max x + \min x}{2}.$$
As such, it is a measure of central tendency.
The midrange is highly sensitive to outliers and ignores all but two data points. It is therefore a very non-robust statistic (having a breakdown point of 0, meaning that a single observation can change it arbitrarily), and it is rarely used in statistical analysis.
The midhinge is the 25% trimmed mid-range, and is more robust, having a breakdown point of 25%.
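The contrast in robustness can be illustrated with a minimal Python sketch. The function names `midrange` and `midhinge` and the sample data are illustrative assumptions, and the quartiles use the "inclusive" method of the standard library's `statistics.quantiles`; other quartile conventions give slightly different midhinge values.

```python
import statistics


def midrange(values):
    """Arithmetic mean of the smallest and largest value."""
    values = list(values)
    if not values:
        raise ValueError("midrange of an empty data set is undefined")
    return (min(values) + max(values)) / 2


def midhinge(values):
    """Mean of the first and third quartiles (one quartile convention of several)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q1 + q3) / 2


data = [2, 4, 4, 5, 6, 7, 9]
contaminated = data + [1000]  # a single gross outlier

print(midrange(data))          # 5.5
print(midrange(contaminated))  # 501.0
print(midhinge(data), midhinge(contaminated))
```

A single added observation moves the mid-range from 5.5 to 501, while the midhinge shifts only slightly.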
Despite its drawbacks, in some cases it is useful: the midrange is a highly efficient estimator of μ, given a small sample of a sufficiently platykurtic distribution, but it is inefficient for mesokurtic distributions, such as the normal.
For example, for a continuous uniform distribution with unknown maximum and minimum, the mid-range is the UMVU (uniformly minimum-variance unbiased) estimator for the mean. The sample maximum and sample minimum, together with the sample size, are a sufficient statistic for the population maximum and minimum: the distribution of the other sample points, conditional on a given maximum and minimum, is just the uniform distribution between the maximum and minimum, and thus adds no information. Thus the mid-range, which is an unbiased estimator of the population mean and a function of this sufficient statistic, is in fact the UMVU estimator: using the sample mean just adds noise based on the uninformative distribution of points within this range.
Conversely, for the normal distribution, the sample mean is the UMVU estimator of the mean. Thus for platykurtic distributions, which can often be thought of as lying between a uniform distribution and a normal distribution, the informativeness of the middle sample points relative to the extreme values ranges from "equal" for the normal to "uninformative" for the uniform, and for different distributions one or the other (or some combination thereof) may be most efficient.
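The reversal between the uniform and normal cases can be checked with a small Monte Carlo sketch. The sample size (10), the number of trials, the seed, and the helper name `estimator_variances` are all arbitrary assumptions chosen for illustration; the simulation only suggests, and does not prove, the efficiency ordering.

```python
import random
import statistics


def midrange(sample):
    return (min(sample) + max(sample)) / 2


def estimator_variances(draw, n=10, trials=4000, seed=0):
    """Return (variance of the mid-range, variance of the sample mean)
    over repeated samples of size n drawn with draw(rng)."""
    rng = random.Random(seed)
    mids, means = [], []
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        mids.append(midrange(sample))
        means.append(statistics.fmean(sample))
    return statistics.variance(mids), statistics.variance(means)


# Both populations have mean 0, so each estimator targets the same value.
uniform_mid, uniform_mean = estimator_variances(lambda rng: rng.uniform(-1.0, 1.0))
normal_mid, normal_mean = estimator_variances(lambda rng: rng.gauss(0.0, 1.0))

print("uniform:", uniform_mid, uniform_mean)  # mid-range variance is smaller
print("normal: ", normal_mid, normal_mean)    # sample-mean variance is smaller
```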
A limited amount of experimental work on the efficiency of measures of central tendency for small samples by William D. Vinson reveals the following facts, where γ2 is the coefficient of excess kurtosis, defined as γ2 = (μ4/(μ2)²) − 3.
Excess kurtosis (γ2) | Most efficient estimator of μ |
---|---|
−1.2 to −0.8 | Midrange |
−0.8 to 2.0 | Arithmetic mean |
2.0 to 6.0 | Modified mean |
This generalization holds for sample sizes (n) from 4 to 20.
When n = 3, there can be no modified mean, and the mean is the most efficient measure of central tendency for values of γ2 from 2.0 to 6.0 as well as from −0.8 to 2.0.
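As a rough sketch, the excess kurtosis formula and the table lookup for n from 4 to 20 can be written in Python. The names `excess_kurtosis` and `suggested_estimator` are hypothetical, and the assignment of the shared boundary values −0.8 and 2.0 to the arithmetic mean is an assumption, since the table lists each boundary in two rows. The table refers to the kurtosis of the underlying distribution; a kurtosis estimated from a small sample is itself very noisy.

```python
def excess_kurtosis(values):
    """gamma2 = mu4 / mu2**2 - 3, using central moments with divisor n."""
    values = list(values)
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n
    return mu4 / mu2 ** 2 - 3


def suggested_estimator(gamma2):
    """Look up the table row for sample sizes 4 to 20.

    Boundary handling is an assumption: -0.8 and 2.0 go to the mean.
    Returns None outside the tabulated range.
    """
    if -1.2 <= gamma2 < -0.8:
        return "midrange"
    if -0.8 <= gamma2 <= 2.0:
        return "arithmetic mean"
    if 2.0 < gamma2 <= 6.0:
        return "modified mean"
    return None


# Theoretical excess kurtosis: uniform -1.2, normal 0, Laplace 3.
for name, gamma2 in [("uniform", -1.2), ("normal", 0.0), ("Laplace", 3.0)]:
    print(name, suggested_estimator(gamma2))
```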
For a sample of size n from the standard normal distribution, the mid-range M is unbiased, and has a variance given,[2] asymptotically for large n, by

$$\operatorname{var}(M) = \frac{\pi^2}{24 \ln n}.$$
For a sample of size n from the standard Laplace distribution, the mid-range M is unbiased, and has a variance given,[3] asymptotically for large n, by

$$\operatorname{var}(M) = \frac{\pi^2}{12}$$

and, in particular, the variance does not decrease to zero as the sample size grows.
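A simulation sketch can make the non-vanishing variance in the Laplace case concrete. The sample sizes (20 and 200), trial count, seed, and function names are illustrative assumptions. Standard Laplace variates are generated as the difference of two independent unit exponentials, and the results can be compared with π²/12 ≈ 0.82, the value implied by the Gumbel limits of the sample extremes.

```python
import math
import random
import statistics


def laplace_sample(rng, n):
    # The difference of two independent Exp(1) draws is standard Laplace.
    return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]


def midrange_variance(n, trials=2000, seed=1):
    """Monte Carlo estimate of the variance of the mid-range for sample size n."""
    rng = random.Random(seed)
    mids = []
    for _ in range(trials):
        sample = laplace_sample(rng, n)
        mids.append((min(sample) + max(sample)) / 2)
    return statistics.variance(mids)


var_small = midrange_variance(20)
var_large = midrange_variance(200)

print(var_small, var_large, math.pi ** 2 / 12)
```

Increasing the sample size tenfold leaves the estimated variance at roughly the same level, whereas the variance of the sample mean (2/n for the standard Laplace distribution) would fall by a factor of ten.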
For a sample of size n from a zero-centred uniform distribution, the mid-range M is unbiased, and nM has an asymptotic distribution which is a Laplace distribution.[4]
While the mean of a set of values minimizes the sum of squares of deviations and the median minimizes the average absolute deviation, the midrange minimizes the maximum deviation (defined as $\max_i |x_i - m|$ for a candidate central value $m$): it is a solution to a variational problem.
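The three minimization properties can be checked numerically with a brute-force grid search. This is only an illustrative sketch: the data set, the grid spacing of 0.01, and the helper names are assumptions, and the grid is chosen so that it contains the mean, median, and midrange of this particular data set.

```python
import statistics


def sum_of_squares(center, values):
    return sum((x - center) ** 2 for x in values)


def sum_of_absolute_deviations(center, values):
    return sum(abs(x - center) for x in values)


def max_deviation(center, values):
    return max(abs(x - center) for x in values)


data = [1.0, 2.0, 2.5, 4.0, 10.0]
grid = [i / 100 for i in range(0, 1101)]  # candidate centres 0.00, 0.01, ..., 11.00

best_squares = min(grid, key=lambda c: sum_of_squares(c, data))
best_absolute = min(grid, key=lambda c: sum_of_absolute_deviations(c, data))
best_maximum = min(grid, key=lambda c: max_deviation(c, data))

print(best_squares, statistics.fmean(data))              # both 3.9 (the mean)
print(best_absolute, statistics.median(data))            # both 2.5 (the median)
print(best_maximum, (min(data) + max(data)) / 2)         # both 5.5 (the midrange)
```

At the midrange the maximum deviation equals half the range, here 4.5; moving the centre in either direction pushes it further from one of the two extremes.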