Scoring algorithm


In statistics, Fisher's Scoring algorithm is a form of Newton's method used to solve maximum likelihood equations numerically.



Sketch of Derivation

Let Y_1,\ldots,Y_n be independent and identically distributed random variables with twice-differentiable p.d.f. f(y; θ), and suppose we wish to calculate the maximum likelihood estimator (M.L.E.) θ* of θ. First, suppose we have a starting point θ0 for our algorithm, and consider a Taylor expansion of the score function V(θ) (the gradient of the log-likelihood) about θ0:

V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)(\theta - \theta_0),

where

\mathcal{J}(\theta_0) = - \sum_{i=1}^n \left. \nabla \nabla^{\top} \right|_{\theta=\theta_0} \log f(Y_i ; \theta)

is the observed information matrix at θ0. Now, setting θ = θ*, using that V(θ*) = 0 and rearranging gives us:

\theta^* = \theta_{0} + \mathcal{J}^{-1}(\theta_{0})V(\theta_{0}).

We therefore use the algorithm

\theta_{m+1} = \theta_{m} + \mathcal{J}^{-1}(\theta_{m})V(\theta_{m}),

and under certain regularity conditions, it can be shown that \theta_m \rightarrow \theta^*.
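
As an illustration (not part of the original derivation), the following minimal Python sketch runs this iteration for an assumed i.i.d. Poisson(θ) sample, for which the score is V(θ) = Σ Y_i/θ − n and the observed information is \mathcal{J}(θ) = Σ Y_i/θ²; the model, data and function name are hypothetical.

    # Minimal sketch of the iteration theta_{m+1} = theta_m + J(theta_m)^{-1} V(theta_m)
    # for a hypothetical i.i.d. Poisson(theta) sample.
    import numpy as np

    def newton_scoring_poisson(y, theta0, tol=1e-10, max_iter=100):
        theta = theta0
        n, s = len(y), np.sum(y)
        for _ in range(max_iter):
            score = s / theta - n           # V(theta) = sum(y)/theta - n
            obs_info = s / theta ** 2       # J(theta) = sum(y)/theta^2, a scalar here
            step = score / obs_info         # J(theta)^{-1} V(theta)
            theta += step
            if abs(step) < tol:
                break
        return theta

    y = np.random.default_rng(0).poisson(3.0, size=500)
    print(newton_scoring_poisson(y, theta0=1.0))   # converges to the sample mean

Here the iteration converges to the sample mean, which is the M.L.E. of the Poisson mean.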

Fisher Scoring

In practice, \mathcal{J}(\theta) is usually replaced by \mathcal{I}(\theta)= \mathrm{E}[\mathcal{J}(\theta)], the Fisher information, thus giving us the Fisher scoring algorithm:

\theta_{m+1} = \theta_{m} + \mathcal{I}^{-1}(\theta_{m})V(\theta_{m}).
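
A common setting in which Fisher scoring is applied is logistic regression, where with the canonical logit link the expected and observed information coincide. The following Python sketch is an illustrative implementation under that assumed model; the simulated data and the function name are hypothetical, not taken from this article.

    # Minimal sketch of Fisher scoring for logistic regression:
    # beta_{m+1} = beta_m + I(beta_m)^{-1} V(beta_m),
    # with V(beta) = X^T (y - p) and I(beta) = X^T W X, W = diag(p(1-p)).
    import numpy as np

    def fisher_scoring_logistic(X, y, max_iter=25, tol=1e-8):
        beta = np.zeros(X.shape[1])
        for _ in range(max_iter):
            p = 1.0 / (1.0 + np.exp(-X @ beta))    # fitted probabilities
            score = X.T @ (y - p)                  # V(beta)
            W = p * (1.0 - p)                      # diagonal of the weight matrix
            info = X.T @ (X * W[:, None])          # I(beta) = X^T W X
            step = np.linalg.solve(info, score)    # I(beta)^{-1} V(beta)
            beta += step
            if np.linalg.norm(step) < tol:
                break
        return beta

    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(200), rng.normal(size=200)])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * X[:, 1]))))
    print(fisher_scoring_logistic(X, y))   # approximately recovers (0.5, 1.5)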


Application to Linear Models

The method of Fisher Scoring is often used in the theory of linear models. Suppose we have a standard linear model

Y = X\beta + \varepsilon,

where the errors \varepsilon_i \sim N(0,\sigma^2) are independent, and suppose we wish to estimate β. It can be shown[1] that

\beta^* \sim N(\beta, \sigma^2 \mathcal{I}^{-1}(\beta))

where β* is the M.L.E. of β. It is therefore desirable to find β* and use it as an estimator of β.
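
As a small illustration (using an assumed simulated design, not taken from the reference): for this model the score X^T(Y − Xβ)/σ² is linear in β and the information X^T X/σ² does not depend on β, so a single Fisher scoring step from any starting value returns the least-squares estimate (X^T X)^{-1} X^T Y. A minimal Python sketch:

    # Minimal sketch: one Fisher scoring step for the Gaussian linear model
    # lands on the least-squares / M.L.E. solution, regardless of the start.
    # (The simulated X, beta and sigma are illustrative assumptions.)
    import numpy as np

    rng = np.random.default_rng(2)
    n, p, sigma = 100, 3, 0.5
    X = rng.normal(size=(n, p))
    beta_true = np.array([1.0, -2.0, 0.5])
    Y = X @ beta_true + sigma * rng.normal(size=n)

    beta0 = np.zeros(p)                               # arbitrary starting value
    score = X.T @ (Y - X @ beta0) / sigma**2          # V(beta0)
    info = X.T @ X / sigma**2                         # I(beta0) = X^T X / sigma^2
    beta_star = beta0 + np.linalg.solve(info, score)  # one scoring step

    # Agrees with the ordinary least-squares estimate (X^T X)^{-1} X^T Y:
    print(np.allclose(beta_star, np.linalg.lstsq(X, Y, rcond=None)[0]))  # True

Because the update does not depend on the starting value, the algorithm converges in a single iteration for this model.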


References

  1. Davison, A. C. (2003). Statistical Models. Cambridge University Press.