Jeffreys prior

In Bayesian probability, the Jeffreys prior (named after Harold Jeffreys) is a noninformative prior distribution proportional to the square root of the Fisher information:

p(\theta) \propto \sqrt{I(\theta | y)}

and is invariant under reparameterization of θ.
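
This invariance can be verified directly: if φ is a one-to-one transformation of θ, the change-of-variables formula for densities, together with the transformation rule for the Fisher information given below, yields

p(\phi) = p(\theta) \left|\frac{d\theta}{d\phi}\right| \propto \sqrt{I(\theta | y) \left(\frac{d\theta}{d\phi}\right)^2} = \sqrt{I(\phi | y)}

so the Jeffreys prior has the same functional form in every parameterization.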

It is an important example of a noninformative (objective) prior.

It allows us to describe our knowledge of φ, a transformation of θ, with an improper uniform distribution. This also implies that the resulting likelihood function L(φ | X) is asymptotically translated by changes in the data (due to asymptotic normality, only the first moment varies as the data are updated).

It can be derived as follows:

We seek an injective transformation φ of θ such that the prior on φ is uniform; for this uniform prior to convey "no information", the Fisher information I(φ | y) must be constant. We then use the following relation:

I(\phi | y) = (\frac{d\theta}{d\phi})^2I(\theta | y)
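
This relation follows from applying the chain rule to the log-likelihood:

I(\phi | y) = E\left[\left(\frac{d \log L}{d\phi}\right)^2\right] = E\left[\left(\frac{d \log L}{d\theta}\,\frac{d\theta}{d\phi}\right)^2\right] = \left(\frac{d\theta}{d\phi}\right)^2 I(\theta | y)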

Setting I(φ | y) to a constant, we conclude that

\frac{d\phi}{d\theta} \propto \sqrt{I(\theta | y)}
\phi \propto \int \sqrt{I(\theta | y)} \, d\theta

The uniform prior on φ then transforms back to p(\theta) \propto \frac{d\phi}{d\theta} \propto \sqrt{I(\theta | y)}, which is the Jeffreys prior.
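
For example, for a single Bernoulli observation with success probability θ, the Fisher information is I(\theta | y) = \frac{1}{\theta(1 - \theta)}, so the Jeffreys prior is

p(\theta) \propto \theta^{-1/2}(1 - \theta)^{-1/2}

the Beta(1/2, 1/2) distribution, and the corresponding uniformizing transformation is

\phi \propto \int \frac{d\theta}{\sqrt{\theta(1 - \theta)}} = 2\arcsin\sqrt{\theta}

in which the Fisher information is constant.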

From a practical and mathematical standpoint, a valid reason to use this noninformative prior rather than others, such as those obtained as limits in conjugate families of distributions, is that it best represents the lack of knowledge once a particular parametric family is chosen, and it is linked to strong results in Bayesian statistics.