Delta method

The delta method is a technique for deriving an approximate probability distribution for a function of a statistical estimator from knowledge of the limiting distribution of that estimator. In many cases the limiting distribution of the initial estimator is a normal distribution with mean zero, so it suffices to obtain the variance of the function of this estimator. If B is an estimator for β, then the variance of a function h(B) is

\operatorname{Var}\left(h(B)\right) \approx \nabla h(\beta)^T \cdot \operatorname{Var}(B) \cdot \nabla h(\beta)
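For instance, in the univariate case with h(x) = e^x (an added illustration, not part of the original article), the gradient reduces to h′(β) = e^β and the formula gives

\operatorname{Var}\left(e^{B}\right) \approx e^{\beta} \cdot \operatorname{Var}(B) \cdot e^{\beta} = e^{2\beta} \cdot \operatorname{Var}(B)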

Derivation

We start from an estimator B that is asymptotically normal, that is, one whose centered and scaled version converges in distribution to a normal random variable:

\sqrt{n}\left(B-\beta\right) \,\xrightarrow{D}\, N\left(0, \operatorname{Var}(B) \right)

where n is the number of observations and \xrightarrow{D} denotes convergence in distribution. Suppose we want to approximate the variance of a function h of the estimator B. Keeping only the first two terms of the Taylor series of h about β, and using vector notation for the gradient, we can approximate h(B) as

h(B) \approx h(\beta) + \nabla h(\beta)^T \cdot (B-\beta)

which implies the variance of h(B) is approximately

\begin{align} \operatorname{Var}\left(h(B)\right) & \approx \operatorname{Var}\left(h(\beta) + \nabla h(\beta)^T \cdot (B-\beta)\right) \\   & = \operatorname{Var}\left(h(\beta) + \nabla h(\beta)^T \cdot B - \nabla h(\beta)^T \cdot \beta\right) \\   & = \operatorname{Var}\left(\nabla h(\beta)^T \cdot B\right) \\   & = \nabla h(\beta)^T \cdot \operatorname{Var}\left(B\right) \cdot \nabla h(\beta) \end{align}

where the last two lines follow by recalling, for a constant vector a, a constant scalar c, and a random vector X, the identity

\operatorname{Var}\left(a^T X + c\right) = a^T \cdot \operatorname{Var}(X) \cdot a,

and noting that β, and hence h(β) and ∇h(β)^T · β, are constants.
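As a concrete one-dimensional illustration (added here for exposition, not part of the original article), take h(x) = log x, so that ∇h(β) = 1/β. The Taylor expansion above reads log B ≈ log β + (B − β)/β, and the identity then gives

\operatorname{Var}\left(\log B\right) \approx \frac{1}{\beta} \cdot \operatorname{Var}(B) \cdot \frac{1}{\beta} = \frac{\operatorname{Var}(B)}{\beta^2}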

The delta method therefore implies that

\sqrt{n}\left(h(B)-h(\beta)\right) \,\xrightarrow{D}\, N\left(0, \nabla h(\beta)^T \cdot \operatorname{Var}(B) \cdot \nabla h(\beta) \right)

or in univariate terms,

\sqrt{n}\left(h(B)-h(\beta)\right) \,\xrightarrow{D}\, N\left(0, \operatorname{Var}(B) \cdot \left(h^\prime(\beta)\right)^2 \right).
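The univariate statement is easy to check by simulation. The following Python sketch (a minimal illustration added here, not part of the article; the distribution, sample sizes, and variable names are chosen only for the example) draws many replications of the sample mean B of exponential data and compares the empirical variance of h(B) = log B with the delta-method prediction (h′(β))² · Var(B):

import numpy as np

rng = np.random.default_rng(0)
n = 500            # observations per replication
reps = 20000       # Monte Carlo replications
beta = 2.0         # true mean of each observation
sigma2 = beta**2   # variance of an Exponential observation with mean beta

# Draw reps independent samples of size n and form the sample mean B of each.
samples = rng.exponential(scale=beta, size=(reps, n))
B = samples.mean(axis=1)

# Compare the observed variance of h(B) = log(B) with the delta-method
# prediction (h'(beta))^2 * Var(B) = (1/beta^2) * (sigma2 / n).
empirical = np.var(np.log(B))
predicted = sigma2 / (n * beta**2)
print(f"empirical: {empirical:.6f}   delta method: {predicted:.6f}")

With these settings both numbers should come out near 1/n = 0.002, and the agreement improves as n grows, reflecting the asymptotic nature of the approximation.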

Note

The delta method yields results nearly identical to the formulae presented in Klein (1953, p. 258):

\operatorname{Var} \left( h_r \right) = \sum_i    \left( \frac{ \partial h_r }{ \partial B_i } \right)^2   \operatorname{Var}\left( B_i \right) + \sum_i \sum_{j \neq i}    \left( \frac{ \partial h_r }{ \partial B_i } \right)   \left( \frac{ \partial h_r }{ \partial B_j } \right)   \operatorname{Cov}\left( B_i, B_j \right)
\operatorname{Cov}\left( h_r, h_s \right) = \sum_i    \left( \frac{ \partial h_r }{ \partial B_i } \right)   \left( \frac{ \partial h_s }{ \partial B_i } \right)   \operatorname{Var}\left( B_i \right) + \sum_i \sum_{j \neq i}    \left( \frac{ \partial h_r }{ \partial B_i } \right)   \left( \frac{ \partial h_s }{ \partial B_j } \right)   \operatorname{Cov}\left( B_i, B_j \right)

where h_r is the rth element of h(B) and B_i is the ith element of B. The only difference is that Klein stated these as identities, whereas they are actually approximations.
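In matrix notation (an equivalent restatement added here for compactness), both of Klein's formulas are entries of the single approximation

\operatorname{Cov}\left(h(B)\right) \approx J \cdot \operatorname{Var}(B) \cdot J^T

where J is the Jacobian matrix with entries J_{ri} = \partial h_r / \partial B_i and Var(B) is the covariance matrix of B; the variance formula is the rth diagonal entry, and the covariance formula is the (r, s) off-diagonal entry.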

References

Klein, L. R. (1953). A Textbook of Econometrics. p. 258.