Tukey-Lambda distribution

From Wikipedia, the free encyclopedia

Formalized by John Tukey, the Tukey-Lambda distribution is a continuous probability distribution defined in terms of its quantile function. It is typically used to identify an appropriate distribution for a data set (see the Comments section) rather than as a statistical model in its own right.

The Tukey-Lambda distribution has a shape parameter λ. As with other probability distributions, the Tukey-Lambda distribution can be transformed with a location parameter, μ, and a scale parameter, σ. Because the general form can be expressed in terms of the standard distribution (μ = 0, σ = 1), the formula below is given for the standard form.

Quantile function

The quantile function (i.e. the inverse of the cumulative distribution function) of the standard form of the Tukey-Lambda distribution is


F^{-1}(p) = 
\begin{cases}
\left[p^\lambda - (1 - p)^\lambda\right]/\lambda, & \mbox{if } \lambda \ne 0 \\
\log(p) - \log(1-p), & \mbox{if } \lambda = 0
\end{cases}

The probability density function (pdf) and cumulative distribution function (cdf) are both computed numerically, as the Tukey-Lambda distribution does not have a simple, closed form for either one for all values of the parameters.
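
The numerical evaluation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names and the bisection tolerance `tol` are hypothetical choices made here. The cdf is obtained by inverting the quantile function Q with bisection (Q is strictly increasing in p), and the pdf from the identity f(Q(p)) = 1/Q'(p), where differentiating the quantile function gives Q'(p) = p^(λ−1) + (1−p)^(λ−1) for every λ.

```python
import math


def tukey_lambda_quantile(p, lam):
    """Quantile function Q(p) of the standard Tukey-Lambda distribution."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if lam == 0:
        return math.log(p) - math.log(1.0 - p)
    return (p**lam - (1.0 - p)**lam) / lam


def tukey_lambda_cdf(x, lam, tol=1e-12):
    """Approximate F(x) by solving Q(p) = x for p with bisection."""
    # For lam > 0 the support is the bounded interval [-1/lam, 1/lam].
    if lam > 0:
        if x <= -1.0 / lam:
            return 0.0
        if x >= 1.0 / lam:
            return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tukey_lambda_quantile(mid, lam) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tukey_lambda_pdf(x, lam):
    """Approximate f(x) as 1 / Q'(p) evaluated at p = F(x)."""
    p = tukey_lambda_cdf(x, lam)
    if p <= 0.0 or p >= 1.0:
        return 0.0  # at or outside the edge of a bounded support
    return 1.0 / (p ** (lam - 1.0) + (1.0 - p) ** (lam - 1.0))


if __name__ == "__main__":
    # With lam = 0 the results should agree with the logistic distribution.
    print(tukey_lambda_cdf(1.0, 0.0), 1.0 / (1.0 + math.exp(-1.0)))
    print(tukey_lambda_pdf(0.0, 0.0))
```

Bisection is used here only because it is simple and robust; a production implementation would more likely use a faster root finder.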

Comments

The Tukey-Lambda distribution is actually a family of distributions that can approximate a number of common distributions. For example,

λ = −1 approximately Cauchy
λ = 0 exactly logistic
λ = 0.14 approximately normal
λ = 0.5 strictly concave (∩-shaped)
λ = 1 exactly uniform(−1, 1)
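
Some of the special cases listed above can be checked numerically from the quantile function alone. The following is a rough sketch, not a proof: the helper names `tl_quantile` and `pearson` and the probability grid are assumptions introduced for illustration. Because the standard Tukey-Lambda distribution with λ = 0.14 does not have unit variance, the comparison with the normal distribution uses the correlation between the two sets of quantiles, which is unaffected by location and scale.

```python
import math
from statistics import NormalDist


def tl_quantile(p, lam):
    """Standard Tukey-Lambda quantile function for 0 < p < 1."""
    if lam == 0:
        return math.log(p) - math.log(1.0 - p)
    return (p**lam - (1.0 - p)**lam) / lam


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Probabilities 0.01, 0.02, ..., 0.99 (an arbitrary illustrative grid).
probs = [i / 100 for i in range(1, 100)]

# lambda = 1: Q(p) = 2p - 1, the quantile function of uniform(-1, 1).
uniform_gap = max(abs(tl_quantile(p, 1.0) - (2 * p - 1)) for p in probs)

# lambda = 0.14: quantiles are close to a linear function of normal quantiles.
normal_q = [NormalDist().inv_cdf(p) for p in probs]
r_normal = pearson([tl_quantile(p, 0.14) for p in probs], normal_q)

# lambda near 0: the general formula approaches the logistic (lambda = 0) case.
logistic_gap = max(
    abs(tl_quantile(p, 1e-6) - tl_quantile(p, 0.0)) for p in probs
)

print(uniform_gap, r_normal, logistic_gap)
```

A correlation close to 1 for λ = 0.14 reflects that the match with the normal distribution is approximate, whereas the uniform and logistic cases are exact.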

The most common use of this distribution is to generate a Tukey-Lambda probability plot correlation coefficient (PPCC) plot of a data set. The PPCC plot shows, as a function of λ, the correlation coefficient of the corresponding Tukey-Lambda probability plot, and the value of λ with the maximum correlation suggests an appropriate model for the data. For example, if the maximum correlation occurs for a value of λ at or near 0.14, then a normal distribution is a reasonable model for the data. Values of λ less than this imply a heavy-tailed distribution (with λ = −1 approximating a Cauchy). That is, as the optimal value of λ goes from 0.14 to −1, increasingly heavy tails are implied. Similarly, as the optimal value of λ becomes greater than 0.14, shorter tails are implied.

Because the Tukey-Lambda distribution is symmetric, using the Tukey-Lambda PPCC plot to determine a reasonable distribution for the data applies only to symmetric distributions. A histogram of the data should provide evidence as to whether the data can be reasonably modeled with a symmetric distribution.
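
The PPCC procedure described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than a definitive implementation: the function names are hypothetical, the grid of λ values is arbitrary, and the plotting positions use Filliben's approximation to the uniform order statistic medians, which is one common choice that the text above does not specify. The demonstration sample is synthetic (shifted and scaled logistic quantiles), so the best λ on the grid should be 0.

```python
import math


def tl_quantile(p, lam):
    """Standard Tukey-Lambda quantile function for 0 < p < 1."""
    if lam == 0:
        return math.log(p) - math.log(1.0 - p)
    return (p**lam - (1.0 - p)**lam) / lam


def filliben_medians(n):
    """Approximate uniform order statistic medians (assumed plotting positions)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, lam):
    """Correlation between the ordered data and Tukey-Lambda quantiles."""
    ordered = sorted(data)
    theoretical = [tl_quantile(m, lam) for m in filliben_medians(len(ordered))]
    return pearson(ordered, theoretical)


def best_lambda(data, grid):
    """Return the grid value of lambda with the largest PPCC."""
    return max(grid, key=lambda lam: ppcc(data, lam))


# Grid of lambda values from -1.0 to 1.0 in steps of 0.1.
grid = [i / 10 for i in range(-10, 11)]

# Synthetic sample: logistic quantiles with location 3 and scale 2.
sample = [3.0 + 2.0 * tl_quantile(m, 0.0) for m in filliben_medians(50)]

lam_hat = best_lambda(sample, grid)
print(lam_hat, ppcc(sample, lam_hat))
```

Because the correlation coefficient is unchanged by shifting or rescaling the data, the location and scale of the sample do not affect which λ is selected.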

References

Joiner, Brian L. & Rosenblatt, Joan R. (1971), “Some Properties of the Range in Samples from Tukey's Symmetric Lambda Distributions”, Journal of the American Statistical Association 66 (334): 394–399, <http://links.jstor.org/sici?sici=0162-1459%28197106%2966%3A334%3C394%3ASPOTRI%3E2.0.CO%3B2-2>

This article incorporates text from a public domain publication of the National Institute of Standards and Technology, a U.S. government agency.