Jeffreys prior

In Bayesian probability, the Jeffreys prior (named after Harold Jeffreys) is a non-informative prior distribution proportional to the square root of the Fisher information:

p(\theta) \propto \sqrt{I(\theta | y)}

and is invariant under reparameterization of θ.
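
For example, for a single observation of a Bernoulli trial with success probability θ, the Fisher information is

 I(\theta | y) = \frac{1}{\theta(1 - \theta)}

so the Jeffreys prior is

 p(\theta) \propto \theta^{-1/2}(1 - \theta)^{-1/2},

the (proper) Beta(1/2, 1/2) distribution.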

It is an important uninformative (objective) prior.

The Jeffreys prior corresponds to a parameterization φ of θ under which our knowledge is described by an improper uniform distribution. This implies that, asymptotically, the resulting likelihood function L(φ | X) is merely translated by changes in the data; by asymptotic normality, only its first moment (the location) varies as the data are updated.
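
Continuing the Bernoulli example, the transformation

 \phi = 2\arcsin\sqrt{\theta}

has constant Fisher information I(φ | y) = 1, so a uniform prior on φ corresponds exactly to the Jeffreys prior on θ, and the asymptotic normal likelihood in φ has a width that does not depend on the true parameter value.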

It can be derived as follows:

We seek an injective transformation φ of θ such that the prior under this transformation is uniform, giving us "no information"; such a parameterization is one in which the Fisher information is constant. We then use the following relation:

 I(\phi | y) = \left(\frac{d\theta}{d\phi}\right)^2I(\theta | y)

Requiring I(φ | y) to be constant, we conclude

 \frac{d\phi}{d\theta} \propto \sqrt{I(\theta | y)}
 \phi \propto \int \sqrt{I(\theta | y)} \, d\theta
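
The steps above can be checked symbolically. The sketch below (in Python with the sympy library; the Poisson model with mean lam is chosen purely as an illustration) computes the Fisher information, the Jeffreys prior, and the flat parameterization φ, and verifies that the Fisher information is constant in φ:

  # Symbolic check of the derivation for one Poisson observation y with mean lam.
  import sympy as sp

  lam, y = sp.symbols('lam y', positive=True)

  # Log-likelihood of a Poisson observation (the factorial term does not involve lam).
  loglik = y * sp.log(lam) - lam - sp.log(sp.factorial(y))

  # Fisher information I(lam | y) = -E[d^2 log L / d lam^2], using E[y] = lam.
  info_lam = -sp.diff(loglik, lam, 2).subs(y, lam)   # -> 1/lam

  # Jeffreys prior: p(lam) proportional to sqrt(I(lam | y)).
  jeffreys_prior = sp.sqrt(info_lam)                 # -> lam**(-1/2)

  # Flat parameterization: phi proportional to the integral of sqrt(I(lam | y)).
  phi = sp.integrate(jeffreys_prior, lam)            # -> 2*sqrt(lam)

  # Change of variables: I(phi | y) = (d lam / d phi)^2 * I(lam | y).
  dlam_dphi = 1 / sp.diff(phi, lam)
  info_phi = sp.simplify(dlam_dphi**2 * info_lam)    # -> 1 (constant)

  print(info_lam, jeffreys_prior, phi, info_phi)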

From a practical and mathematical standpoint, a valid reason to use this noninformative prior instead of others, such as those obtained as limits of conjugate families of distributions, is that it best represents the lack of knowledge once a particular parametric family is chosen, and it is connected with strong theoretical results in Bayesian statistics.

In general, use of Jeffreys priors violates the likelihood principle; some statisticians therefore regard their use as unjustified.[citation needed]
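
A standard illustration: binomial sampling (a fixed number n of trials) and negative binomial sampling (trials continued until a fixed number r of successes) can produce proportional likelihood functions for θ, yet they lead to different Jeffreys priors,

 p(\theta) \propto \theta^{-1/2}(1 - \theta)^{-1/2} \qquad \text{(binomial)}
 p(\theta) \propto \theta^{-1}(1 - \theta)^{-1/2} \qquad \text{(negative binomial)}

because the Fisher information depends on the sampling rule rather than on the realized likelihood alone.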

References

  • Jeffreys, H. (1939). Theory of Probability. Oxford University Press.