Box-Cox transformation

From Wikipedia, the free encyclopedia

In statistics, the Box-Cox transformation of the response variable Y is used to make a linear model more appropriate to the data. It can be used to attempt to impose linearity, reduce skewness, or stabilize the residual variance.

Background

For a positive response Y, the Box-Cox transformation is defined as:

\tau(Y;\lambda)=\begin{cases}(Y^\lambda-1)/\lambda & \mathrm{if}\ \lambda\neq 0, \\ \ln(Y) & \mathrm{if}\ \lambda=0.\end{cases}

For any λ, the transformation takes the value 0 at Y = 1, and its derivative with respect to Y there is 1. Sometimes Y is a rescaled version of another variable, chosen so that Y = 1 at a typical (for example, average) value.

The transformation is a power transformation, but defined in such a way that it is continuous in the parameter λ at λ = 0, since (Y^λ − 1)/λ tends to ln(Y) as λ tends to 0. It has proved popular in regression analysis, including econometrics.
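The definition and the properties noted above can be checked numerically. The following is a minimal sketch in plain Python; the function names `box_cox` and `inv_box_cox` are illustrative choices, not part of any particular library, and the use of `math.expm1` is one possible way to keep the computation accurate for small λ.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a single positive value y (illustrative helper)."""
    if y <= 0:
        raise ValueError("y must be positive")
    if lam == 0:
        return math.log(y)
    # (y**lam - 1) / lam, written with expm1 so it stays accurate near lam = 0
    return math.expm1(lam * math.log(y)) / lam

def inv_box_cox(z, lam):
    """Inverse of box_cox for the same lam (illustrative helper)."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)

# The transform is 0 at y = 1 for every lam.
for lam in (-1.0, 0.0, 0.5, 2.0):
    print("lam =", lam, " tau(1) =", box_cox(1.0, lam))

# As lam shrinks towards 0 the power branch approaches the logarithm.
for lam in (0.1, 0.01, 0.001):
    print("lam =", lam, " tau(3) - ln(3) =", box_cox(3.0, lam) - math.log(3.0))
```

The printed differences in the second loop shrink roughly in proportion to λ, which is the continuity at λ = 0 described above.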

Box and Cox also proposed a more general form of the transformation which incorporates a shift parameter α:

\tau(Y;\lambda, \alpha) = \begin{cases} [(Y + \alpha)^\lambda - 1]/\lambda & \mathrm{if}\ \lambda\neq 0, \\ \ln(Y + \alpha)& \mathrm{if}\ \lambda=0.\end{cases}

In regression analysis, one sometimes carries out a series of Box-Cox transformations of the response variable over a range of values of λ and α, and then compares the residual sums of squares at these values in order to choose the transformation that gives the best fit. The comparison is only meaningful if each transformed response is first rescaled to account for the Jacobian of the transformation, that is, divided by G^(λ−1), where G is the geometric mean of Y + α. The profile log-likelihood is then a decreasing function of the residual sum of squares, so this procedure amounts to approximate maximum likelihood estimation.
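The search over λ can be sketched as follows for a simple linear regression with no shift parameter. This is an illustration under stated assumptions, not a reference implementation: the data are synthetic (generated so that ln(Y) is linear in x), the helper names are hypothetical, and the criterion is the profile log-likelihood up to an additive constant, −(n/2) ln(RSS/n) + (λ − 1) Σ ln(Y), where the second term is the Jacobian of the transformation.

```python
import math
import random

def box_cox(y, lam):
    # Box-Cox transform of one positive value
    return math.log(y) if lam == 0 else math.expm1(lam * math.log(y)) / lam

def rss_simple_ols(x, z):
    """Residual sum of squares from regressing z on x with an intercept."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mz - slope * mx
    return sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))

def profile_loglik(x, y, lam):
    """Profile log-likelihood of lam, up to an additive constant."""
    n = len(y)
    z = [box_cox(yi, lam) for yi in y]
    rss = rss_simple_ols(x, z)
    jacobian = (lam - 1.0) * sum(math.log(yi) for yi in y)
    return -0.5 * n * math.log(rss / n) + jacobian

# Synthetic data (an assumption for illustration): ln(y) is linear in x,
# so the log transformation (lam = 0) is the appropriate one.
rng = random.Random(0)
x = [rng.uniform(0, 4) for _ in range(200)]
y = [math.exp(1.0 + 0.5 * xi + rng.gauss(0, 0.3)) for xi in x]

grid = [k / 20 for k in range(-40, 41)]          # lam from -2 to 2 in steps of 0.05
loglik = [profile_loglik(x, y, lam) for lam in grid]
lam_hat = grid[max(range(len(grid)), key=loglik.__getitem__)]
print("grid maximizer of the profile log-likelihood:", lam_hat)
```

Comparing raw residual sums of squares without the Jacobian term would favour whichever λ shrinks the scale of the transformed response most, which is why the term is included here.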

Example

The BUPA liver data set contains data on liver enzymes ALT and γGT. The data can be found via the classic data sets page. Suppose we are interested in using log(γGT) to predict ALT. A plot of the data appears in panel (a) of the figure. There appears to be non-constant variance, and a Box-Cox transformation might help.

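[Figure BUPA_BoxCox.JPG: (a) the data; (b) log-likelihood of the power parameter; (c) log-likelihood of the shift parameter; final panel: transformed data with regression line]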

The log-likelihood of the power parameter appears in panel (b). The horizontal reference line lies χ²₁(0.95)/2 ≈ 1.92 below the maximum, where χ²₁(0.95) is the 95th percentile of the chi-squared distribution with one degree of freedom, and can be used to read off an approximate 95% confidence interval for λ. A value close to zero appears to be appropriate, so we take logs.
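Reading the interval off the reference line can be sketched in code. The example below is a hedged illustration: `profile_interval` is a hypothetical helper, the log-likelihood values are a made-up quadratic rather than the BUPA data, and the approach assumes the profile is unimodal so that the values above the cutoff form a single interval.

```python
def profile_interval(grid, loglik, drop=1.92):
    """Approximate 95% interval: grid values whose log-likelihood lies
    within `drop` of the maximum (1.92 is half the 95th percentile of a
    chi-squared distribution with one degree of freedom)."""
    peak = max(loglik)
    inside = [lam for lam, ll in zip(grid, loglik) if ll >= peak - drop]
    lam_hat = grid[loglik.index(peak)]
    return min(inside), lam_hat, max(inside)

# Toy profile log-likelihood (an assumption for illustration only).
grid = [k / 100 for k in range(-100, 101)]
loglik = [-50.0 * (lam - 0.1) ** 2 for lam in grid]

lo, lam_hat, hi = profile_interval(grid, loglik)
print("estimate:", lam_hat, " approximate 95% interval:", (lo, hi))
```

The resolution of the interval is limited by the grid spacing, so a finer grid near the maximum gives sharper endpoints.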

The transformation could possibly be improved by adding a shift parameter to the log transformation. Panel (c) of the figure shows the log-likelihood of the shift parameter. In this case, the likelihood is maximized at a shift close to zero, suggesting that a shift parameter is not needed. The final panel shows the transformed data with a superimposed regression line.

Note that although Box-Cox transformations can make big improvements in model fit, there are some issues that the transformation cannot help with. In the current example, the data are rather heavy-tailed so that the assumption of normality is not realistic and a robust regression approach leads to a more precise model.

Econometric application

Economists often characterize production relationships by some variant of the Box-Cox transformation.

Consider a common representation of production Q as dependent on services provided by a capital stock K and by labor hours N:

\tau(Q)=\alpha \tau(K)+ (1-\alpha)\tau(N).\,

Solving for Q by inverting the Box-Cox transformation we find

Q=\big(\alpha K^\lambda + (1-\alpha) N^\lambda\big)^{1/\lambda},\,

which is known as the constant elasticity of substitution (CES) production function.
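The inversion step can be written out explicitly. The following worked derivation assumes λ ≠ 0 and uses only the definition of τ given in the Background section:

```latex
\frac{Q^\lambda-1}{\lambda}
  = \alpha\,\frac{K^\lambda-1}{\lambda} + (1-\alpha)\,\frac{N^\lambda-1}{\lambda}
\;\Longrightarrow\;
Q^\lambda - 1
  = \alpha K^\lambda + (1-\alpha) N^\lambda - \big[\alpha + (1-\alpha)\big]
\;\Longrightarrow\;
Q^\lambda = \alpha K^\lambda + (1-\alpha) N^\lambda .
```

Taking the 1/λ power of both sides gives the CES form. The constant terms cancel because the weights α and 1 − α sum to one.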

The CES production function is a homogeneous function of degree one.

When λ = 1 this produces the linear production function:

Q=\alpha K + (1-\alpha)N.\,

When λ → 0 this produces the famous Cobb-Douglas production function:

Q=K^\alpha N^{1-\alpha}.\,
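The two special cases and the homogeneity property can be checked numerically. The sketch below uses arbitrary illustrative values for K, N and α, and a hypothetical helper `ces`; it is a numerical check, not a proof.

```python
def ces(K, N, alpha, lam):
    """CES production function; lam = 0 is taken as the Cobb-Douglas limit."""
    if lam == 0:
        return K ** alpha * N ** (1 - alpha)
    return (alpha * K ** lam + (1 - alpha) * N ** lam) ** (1 / lam)

# Arbitrary illustrative inputs (assumptions, not data from any source).
K, N, alpha = 4.0, 9.0, 0.3

linear = alpha * K + (1 - alpha) * N
cobb_douglas = K ** alpha * N ** (1 - alpha)

# lam = 1 reproduces the linear production function.
print("lam = 1:", ces(K, N, alpha, 1.0), "linear:", linear)

# As lam shrinks towards 0 the CES value approaches Cobb-Douglas.
for lam in (0.1, 0.01, 0.001):
    print("lam =", lam, "CES - Cobb-Douglas =", ces(K, N, alpha, lam) - cobb_douglas)

# Homogeneity of degree one: scaling both inputs by t scales output by t.
t = 2.5
print("scaled:", ces(t * K, t * N, alpha, 0.5), "t * Q:", t * ces(K, N, alpha, 0.5))
```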


References

  • Box, G. E. P. and Cox, D. R. (1964) An analysis of transformations. Journal of the Royal Statistical Society, Series B, vol. 26, pp. 211-246.

The story of the writing of the paper is told in

  • DeGroot, M. H. (1987) A Conversation with George Box, Statistical Science, vol. 2, pp. 239-258.

External links

Box-Cox Transformation: An Overview, Pengfei Li