Score (statistics)

From Wikipedia, the free encyclopedia

In statistics, the score or score function is the partial derivative, with respect to some parameter $θ$ , of the logarithm (commonly the natural logarithm) of the likelihood function. If the observation is $X$ and its likelihood is $L (θ; X)$ , then the score $V$ can be found through the chain rule:

$V = \frac{\partial}{\partial\theta} \log L(\theta;X) = \frac{1}{L(\theta;X)} \frac{\partial L(\theta;X)}{\partial\theta}.$

Note that $V$ is a function of both $θ$ and the observation $X$. Because it depends on the unknown parameter $θ$, it is in general not a statistic.
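The two forms of the score in the definition can be compared numerically. The following is a minimal sketch, assuming an illustrative model that is not part of the text above (a single observation from a normal distribution with mean $θ$ and unit variance) and using central finite differences in place of exact derivatives; the function names are hypothetical.

```python
import math

def likelihood(theta, x):
    """Likelihood of theta for one observation x, assuming x ~ N(theta, 1)."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)

def central_diff(g, theta, h=1e-5):
    """Central finite-difference approximation of dg/dtheta."""
    return (g(theta + h) - g(theta - h)) / (2.0 * h)

def score_from_log(theta, x):
    """Score as the derivative of the log-likelihood."""
    return central_diff(lambda t: math.log(likelihood(t, x)), theta)

def score_from_ratio(theta, x):
    """Score as (1/L) * dL/dtheta, the chain-rule form."""
    return central_diff(lambda t: likelihood(t, x), theta) / likelihood(theta, x)

theta, x = 0.3, 1.7
v_log = score_from_log(theta, x)
v_ratio = score_from_ratio(theta, x)
# For this assumed model the score has the closed form x - theta.
print(v_log, v_ratio, x - theta)
```

Both approximations should agree with each other and with the closed form $x - θ$ up to finite-difference error. The result changes with both `theta` and `x`, which illustrates why the score is not a statistic.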


Mean

The expected value of $V$, written $\mathbb{E}(V|\theta)$, is zero. To see this, write out the definition of expectation, using the fact that the probability density function of $X$ is just $L(θ; x)$, which is conventionally denoted by $f(x;θ)$ (in which the dependence on $x$ is more explicit). The corresponding cumulative distribution function is denoted by $F(x;θ)$. With this change of notation, and writing $f'_{\theta}(x;θ)$ for the partial derivative of $f$ with respect to $θ$,

$\mathbb{E}(V|\theta) =\int_X \frac{f'_{\theta}(x; \theta)}{f(x; \theta)} \, dF(x;\theta) =\int_X \frac{f'_{\theta}(x; \theta)}{f(x; \theta)} f(x; \theta) \, dx = \int_X \frac{\partial f(x; \theta)}{\partial \theta} \, dx$

where the integral runs over the whole sample space of $X$. If certain regularity conditions are met, so that differentiation with respect to $θ$ and integration over $x$ can be interchanged, the integral may be rewritten as

$\frac{\partial}{\partial\theta} \int_X f(x; \theta) \, dx = \frac{\partial}{\partial\theta}1 = 0.$

In words: the expected value of the score, evaluated at the true parameter value, is zero. Thus, if one were to repeatedly sample from some distribution and each time calculate the score at the true $θ$, the mean of the scores would tend to zero as the number of samples approached infinity.
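This repeated-sampling statement can be illustrated by simulation. The sketch below assumes a single Bernoulli trial $x \in \{0, 1\}$ with success probability $θ$, whose score is $x/θ - (1-x)/(1-θ)$; the sample size, seed, and function names are arbitrary illustrative choices.

```python
import random

def bernoulli_score(x, theta):
    """Score of one Bernoulli trial x in {0, 1}, evaluated at theta."""
    return x / theta - (1 - x) / (1 - theta)

def mean_score(true_theta, eval_theta, n_samples, rng):
    """Average score over n_samples draws from Bernoulli(true_theta)."""
    total = 0.0
    for _ in range(n_samples):
        x = 1 if rng.random() < true_theta else 0
        total += bernoulli_score(x, eval_theta)
    return total / n_samples

rng = random.Random(0)  # fixed seed so the run is reproducible
# Score evaluated at the true parameter: the average should be close to zero.
mean_at_truth = mean_score(0.3, 0.3, 100_000, rng)
# Score evaluated at a wrong parameter: the average settles away from zero.
mean_off_truth = mean_score(0.3, 0.5, 100_000, rng)
print(mean_at_truth, mean_off_truth)
```

The second average is included for contrast: when the score is evaluated at $θ = 0.5$ while the data come from $θ = 0.3$, its expectation is $0.3/0.5 - 0.7/0.5 = -0.8$ rather than zero.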

Variance

Main article: Fisher information

The variance of the score is known as the Fisher information and is written $\mathcal{I}(\theta)$ . Because the expectation of the score is zero, this may be written as

$\mathcal{I}(\theta) = \mathbb{E} \left\{\left. \left[ \frac{\partial}{\partial\theta} \log L(\theta;X) \right]^2 \right|\theta\right\}.$

Note that the Fisher information, as defined above, is not a function of any particular observation, as the random variable $X$ has been averaged out. This concept of information is useful when comparing two methods of observation of some random process.
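As an illustration of the definition, the expectation can be computed exactly when the observation takes only finitely many values. The sketch below assumes a single Bernoulli trial with success probability $θ$ (an illustrative choice) and uses exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def score(x, theta):
    """Score of one Bernoulli trial x in {0, 1}, evaluated at theta."""
    return Fraction(x) / theta - Fraction(1 - x) / (1 - theta)

def score_moments(theta):
    """Return (mean, second moment, variance) of the score, averaging over x."""
    pmf = {1: theta, 0: 1 - theta}
    mean = sum(p * score(x, theta) for x, p in pmf.items())
    second_moment = sum(p * score(x, theta) ** 2 for x, p in pmf.items())
    return mean, second_moment, second_moment - mean ** 2

theta = Fraction(1, 4)
mean, second_moment, variance = score_moments(theta)
# Because the mean is zero, the second moment and the variance coincide.
print(mean, second_moment, variance, 1 / (theta * (1 - theta)))
```

The observation $x$ has been summed out, so the result depends on $θ$ alone; for this assumed model it equals $1/(θ(1-θ))$.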

Example

Consider a Bernoulli process, with A successes and B failures; the probability of success is θ.

Then the likelihood L is

$L(\theta;A,B)=\frac{(A+B)!}{A!B!}\theta^A(1-\theta)^B,$

so the score V is given by taking the partial derivative of the log likelihood function as follows:

$V=\frac{\partial}{\partial\theta}\log\left[L(\theta;A,B)\right]= \frac{1}{L}\frac{\partial L}{\partial\theta}.$

This is a standard calculus problem: A and B are treated as constants. Then

$V=\frac{A}{\theta}-\frac{B}{1-\theta}.$
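The calculus step can be spelled out as follows (a sketch of the intermediate algebra, using only the symbols defined above). Taking logarithms turns the product into a sum,

$\log L(\theta;A,B)=\log\frac{(A+B)!}{A!B!}+A\log\theta+B\log(1-\theta),$

and differentiating term by term with respect to θ, the first term vanishes because it does not depend on θ, while

$\frac{\partial}{\partial\theta}A\log\theta=\frac{A}{\theta},\qquad \frac{\partial}{\partial\theta}B\log(1-\theta)=-\frac{B}{1-\theta},$

which together give the expression for V above.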

So if the score is zero, θ = A/(A + B), which is the maximum likelihood estimate of θ. We can now verify that the expectation of the score is zero. Writing n = A + B for the number of trials, the expectation of A is nθ and the expectation of B is n(1 − θ), so the expectation of V is

$\mathbb{E}(V|\theta) = \frac{n\theta}{\theta} - \frac{n(1-\theta)}{1-\theta} = n - n = 0.$

We can also check the variance of $V$. Since A + B = n (so B = n − A), the variance of A is nθ(1 − θ), and the constant term −n/(1 − θ) does not affect the variance, the variance of V is

$\operatorname{var}(V)=\operatorname{var}\left(\frac{A}{\theta}-\frac{n-A}{1-\theta}\right) =\operatorname{var}\left(A\left(\frac{1}{\theta}+\frac{1}{1-\theta}\right)\right) =\left(\frac{1}{\theta}+\frac{1}{1-\theta}\right)^2\operatorname{var}(A) =\frac{n}{\theta(1-\theta)}.$
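Both checks can be reproduced exactly by summing over every possible value of A. The sketch below assumes that A follows a binomial distribution with n trials and success probability θ, as in the example; the values n = 10 and θ = 2/5 and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb

def score(a, n, theta):
    """Score V = A/theta - B/(1 - theta) with B = n - A."""
    return Fraction(a) / theta - Fraction(n - a) / (1 - theta)

def score_mean_and_variance(n, theta):
    """Exact mean and variance of V, summing over A = 0, ..., n."""
    mean = Fraction(0)
    second_moment = Fraction(0)
    for a in range(n + 1):
        # Binomial probability of observing exactly a successes.
        p = comb(n, a) * theta ** a * (1 - theta) ** (n - a)
        v = score(a, n, theta)
        mean += p * v
        second_moment += p * v * v
    return mean, second_moment - mean * mean

n, theta = 10, Fraction(2, 5)
mean, variance = score_mean_and_variance(n, theta)
print(mean, variance, n / (theta * (1 - theta)))
# The score vanishes when theta equals A / (A + B), here 4 / 10.
print(score(4, n, theta))
```

Exact fractions are used instead of floating point so that the comparison with n/(θ(1 − θ)) is an equality rather than an approximation.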