Jeffreys prior

In Bayesian probability, the Jeffreys prior (named after Harold Jeffreys) is a non-informative prior distribution proportional to the square root of the Fisher information:

p(\theta) \propto \sqrt{I(\theta | y)}

and is invariant under reparameterization of θ.
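
For example, for a single observation of a Bernoulli trial with success probability θ, the Fisher information is

 I(\theta | y) = \frac{1}{\theta(1 - \theta)}

so the Jeffreys prior is

 p(\theta) \propto \theta^{-1/2}(1 - \theta)^{-1/2},

the (proper) Beta(1/2, 1/2) distribution.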

It is an important uninformative (objective) prior.

The Jeffreys prior corresponds to a parameterization φ of θ under which our knowledge is described by an improper uniform distribution. This implies that, asymptotically, the resulting likelihood function L(φ | X) is merely translated by changes in the data; by asymptotic normality, only its first moment (the location) varies as the data are updated.
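
Continuing the Bernoulli example, the transformation

 \phi = 2\arcsin\sqrt{\theta}

has constant Fisher information I(φ | y) = 1, so a uniform prior on φ corresponds exactly to the Jeffreys prior on θ, and the asymptotic normal likelihood in φ has a width that does not depend on the true parameter value.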

It can be derived as follows:

We seek an injective transformation φ of θ such that the prior under this transformation is uniform, giving us "no information"; such a parameterization is one in which the Fisher information is constant. We then use the following relation:

 I(\phi | y) = \left(\frac{d\theta}{d\phi}\right)^2I(\theta | y)

Requiring I(φ | y) to be constant, we conclude

 \frac{d\phi}{d\theta} \propto \sqrt{I(\theta | y)}
 \phi \propto \int \sqrt{I(\theta | y)} \, d\theta
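
The steps above can be checked symbolically. The sketch below (in Python with the sympy library; the Poisson model with mean lam is chosen purely as an illustration) computes the Fisher information, the Jeffreys prior, and the flat parameterization φ, and verifies that the Fisher information is constant in φ:

  # Symbolic check of the derivation for one Poisson observation y with mean lam.
  import sympy as sp

  lam, y = sp.symbols('lam y', positive=True)

  # Log-likelihood of a Poisson observation (the factorial term does not involve lam).
  loglik = y * sp.log(lam) - lam - sp.log(sp.factorial(y))

  # Fisher information I(lam | y) = -E[d^2 log L / d lam^2], using E[y] = lam.
  info_lam = -sp.diff(loglik, lam, 2).subs(y, lam)   # -> 1/lam

  # Jeffreys prior: p(lam) proportional to sqrt(I(lam | y)).
  jeffreys_prior = sp.sqrt(info_lam)                 # -> lam**(-1/2)

  # Flat parameterization: phi proportional to the integral of sqrt(I(lam | y)).
  phi = sp.integrate(jeffreys_prior, lam)            # -> 2*sqrt(lam)

  # Change of variables: I(phi | y) = (d lam / d phi)^2 * I(lam | y).
  dlam_dphi = 1 / sp.diff(phi, lam)
  info_phi = sp.simplify(dlam_dphi**2 * info_lam)    # -> 1 (constant)

  print(info_lam, jeffreys_prior, phi, info_phi)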

From a practical and mathematical standpoint, a valid reason to use this noninformative prior instead of others, such as those obtained as limits of conjugate families of distributions, is that it best represents the lack of knowledge once a particular parametric family is chosen, and it is connected with strong theoretical results in Bayesian statistics.

In general, use of Jeffreys priors violates the likelihood principle; some statisticians therefore regard their use as unjustified.[citation needed]
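
A standard illustration: binomial sampling (a fixed number n of trials) and negative binomial sampling (trials continued until a fixed number r of successes) can produce proportional likelihood functions for θ, yet they lead to different Jeffreys priors,

 p(\theta) \propto \theta^{-1/2}(1 - \theta)^{-1/2} \qquad \text{(binomial)}
 p(\theta) \propto \theta^{-1}(1 - \theta)^{-1/2} \qquad \text{(negative binomial)}

because the Fisher information depends on the sampling rule rather than on the realized likelihood alone.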

References

  • Jeffreys, H. (1939). Theory of Probability. Oxford University Press.