Invariant estimator

In statistics, an invariant estimator or equivariant estimator is a non-Bayesian estimator having a certain intuitively appealing quality, which is defined formally below. Roughly speaking, invariance means that the estimator's behavior is unchanged when both the measurements and the parameters are transformed in a compatible way. The requirement of invariance is sometimes imposed when seeking an estimator, leading to what is called the optimal invariant estimator.

Setting

Invariance is defined in the deterministic (non-Bayesian) estimation scenario. Under this setting, we are given a measurement x which contains information about an unknown parameter θ. The measurement x is modeled as a random variable having a probability density function f(x | θ) which depends on θ.

We would like to estimate θ given x. The estimate, denoted by a, is a function of the measurements and belongs to a set A. The quality of the result is defined by a loss function L = L(a,θ) which determines a risk function R = R(a,θ) = E[L(a,θ) | θ].

We denote the sets of possible values of x, θ, and a by X, Θ, and A, respectively.
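
As a concrete illustration of the risk function, the sketch below approximates R by simulation. The normal location model, the sample size and the two candidate estimators are illustrative assumptions only; the general setting does not prescribe them.

```python
import numpy as np

# Monte Carlo approximation of the risk R(a, theta) = E[ L(a, theta) | theta ],
# where a = delta(x). Illustrative assumptions: X = (X_1, ..., X_n) i.i.d. N(theta, 1)
# and L is squared-error loss.

def risk(delta, theta, n=10, reps=200_000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=theta, scale=1.0, size=(reps, n))   # repeated draws of X given theta
    return np.mean((delta(x) - theta) ** 2)                # average loss over the draws

sample_mean = lambda x: x.mean(axis=1)
sample_median = lambda x: np.median(x, axis=1)

print(risk(sample_mean, theta=1.0))     # about 1/n = 0.1
print(risk(sample_median, theta=1.0))   # larger, so the mean is the better rule here
```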

Definition

An invariant estimator is an estimator which obeys the following two rules:

  1. Principle of Rational Invariance: The action taken in a decision problem should not depend on the transformation applied to the measurement used.
  2. Invariance Principle: If two decision problems have the same formal structure (in terms of X, Θ, f(x | θ) and L), then the same decision rule should be used in each problem.

To define an invariant estimator formally, we first need some definitions concerning groups of transformations:

A group of transformations of X, to be denoted by G, is a set of (measurable) one-to-one and onto transformations of X onto itself which satisfies the following conditions:

  1. If g_1\in G and g_2\in G then g_1 g_2\in G (closure under composition)
  2. If g\in G then g^{-1}\in G (where g^{-1}(g(x))=x)
  3. e\in G (i.e. there is an identity transformation e(x)=x)

Datasets x_1 and x_2 in X are equivalent if x_1 = g(x_2) for some g\in G. All the points equivalent to one another form an equivalence class; such an equivalence class is called an orbit (in X). The orbit of x_0, denoted X(x_0), is the set X(x_0)=\{g(x_0):g\in G\}. If X consists of a single orbit then G is said to be transitive.
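
For example (an illustrative numerical spot-check, not part of the definitions), the translation group \{g_c: g_c(x)=x+c, c\in \Bbb{R}\} on X=\Bbb{R} satisfies the three conditions above, and the whole real line is a single orbit under it, so this group is transitive:

```python
import numpy as np

# The translation group on X = R: g_c(x) = x + c.
# Spot-check of the group conditions and of transitivity at a few sample points.

def g(c):
    return lambda x: x + c

x0, c1, c2 = 1.7, 2.5, -4.0

assert np.isclose(g(c1)(g(c2)(x0)), g(c1 + c2)(x0))   # 1. closure: g_{c1} o g_{c2} = g_{c1+c2}
assert np.isclose(g(-c1)(g(c1)(x0)), x0)              # 2. inverse: g_{-c} undoes g_c
assert np.isclose(g(0.0)(x0), x0)                     # 3. identity: g_0(x) = x

# Transitivity: any x1 is reached from x0 by g_c with c = x1 - x0,
# so X = R is a single orbit.
x1 = -3.2
assert np.isclose(g(x1 - x0)(x0), x1)
print("group conditions and transitivity hold at the sampled points")
```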

A family of densities F is said to be invariant under the group G if, for every g\in G and \theta\in \Theta, there exists a unique \theta^*\in \Theta such that Y = g(X) has density f(y | \theta^*). \theta^* will be denoted \bar{g}(\theta).

If F is invariant under the group G, then the loss function L(θ,a) is said to be invariant under G if for every g\in G and a\in A there exists an a^*\in A such that L(\theta,a)=L(\bar{g}(\theta),a^*) for all \theta \in \Theta. a^* will be denoted \tilde{g}(a).

\bar{G}=\{\bar{g}:g\in G\} is a group of transformations from Θ to itself and \tilde{G}=\{\tilde{g}: g \in G\} is a group of transformations from A to itself.

An estimation problem is invariant under G if there exist three such groups G, \bar{G}, \tilde{G} as defined above.

For an estimation problem that is invariant under G, an estimator δ(x) is an invariant estimator under G if, for all x\in X and g\in G, \delta(g(x)) = \tilde{g}(\delta(x)).
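
As a minimal sketch of this definition (the translation group, normal data and sample-mean estimator are illustrative assumptions only): with g_c(x) = x + c acting componentwise, \bar{g}_c(\theta)=\theta+c and \tilde{g}_c(a)=a+c, and the sample mean satisfies \delta(g(x)) = \tilde{g}(\delta(x)):

```python
import numpy as np

# Equivariance check for the translation group: delta(g_c(x)) should equal
# gtilde_c(delta(x)) = delta(x) + c for every c. The sample mean passes this check;
# an estimator such as x -> x[0]**2 would not, so it is not invariant under this group.

delta = lambda x: np.mean(x)      # candidate estimator

rng = np.random.default_rng(0)
x = rng.normal(size=6)            # an arbitrary measurement vector

for c in (-3.0, 0.5, 10.0):
    assert np.isclose(delta(x + c), delta(x) + c)   # delta(g_c(x)) == gtilde_c(delta(x))

print("the sample mean is an invariant (equivariant) estimator under translations")
```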

Properties

  1. The risk function of an invariant estimator δ is constant on orbits of Θ. Equivalently R(\theta,\delta)=R(\bar{g}(\theta),\delta) for all \theta \in \Theta and \bar{g}\in \bar{G}.
  2. The risk function of an invariant estimator with transitive \bar{G} is constant.

For a given problem, the invariant estimator with the lowest risk is termed the "best invariant estimator". A best invariant estimator cannot always be achieved; a special case in which it can be is the case in which \bar{G} is transitive.
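
The constancy of the risk can be seen numerically. The sketch below (using an assumed normal location model, squared-error loss, the translation group and the sample mean, none of which the properties require) returns essentially the same risk value at widely different θ:

```python
import numpy as np

# Property 2 in action: under the translation group, which is transitive on Theta,
# an invariant estimator has constant risk. Illustrative model: n i.i.d. N(theta, 1)
# observations, squared-error loss, estimator = sample mean.

def risk(theta, n=10, reps=200_000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=theta, scale=1.0, size=(reps, n))
    return np.mean((x.mean(axis=1) - theta) ** 2)

print([round(risk(t), 4) for t in (-7.0, 0.0, 3.5, 100.0)])   # essentially identical values
```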

Example: Location parameter

θ is a location parameter if the density of X is f(x − θ). For \Theta=A=\Bbb{R}^1 and L = L(a − θ), the problem is invariant under G=\bar{G}=\tilde{G}=\{g_c:g_c(x)=x+c, c\in \Bbb{R}\}. An invariant estimator in this case must satisfy \delta(x+c)=\delta(x)+c for all c\in \Bbb{R}, so it is of the form δ(x) = x + K (K\in \Bbb{R}). \bar{G} is transitive on Θ, so the risk is constant: R(θ,δ) = R(0,δ) = E[L(X + K) | θ = 0]. The best invariant estimator is the one that minimizes the risk R(θ,δ).

In the case that L is squared error, δ(x) = x − E[X | θ = 0].
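
For instance (an illustrative choice of density, not taken from the article): if X = θ + Z with Z following a standard exponential distribution, then E[X | θ = 0] = 1 and the best invariant estimator under squared error is δ(x) = x − 1. A small sketch that scans K and recovers this value numerically:

```python
import numpy as np

# Location model with a single measurement: X = theta + Z, Z ~ Exp(1), squared-error loss.
# Invariant estimators have the form delta(x) = x + K; the best one uses
# K = -E[X | theta = 0] = -1, i.e. delta(x) = x - 1.

rng = np.random.default_rng(0)
z = rng.exponential(scale=1.0, size=500_000)     # draws of X given theta = 0

risks = {K: np.mean((z + K) ** 2) for K in np.linspace(-2.0, 0.0, 21)}   # R(0, delta) per K
best_K = min(risks, key=risks.get)

print(best_K)           # close to -1 = -E[X | theta = 0]
print(risks[best_K])    # close to Var(Z) = 1, the minimal risk among invariant estimators
```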

Pitman estimator

Given the estimation problem: X=(X_1,\dots,X_n) has density f(x_1-\theta,\dots,x_n-\theta) and the loss is L(|a − θ|). This problem is invariant under G=\{g_c:g_c(x)=(x_1+c, \dots, x_n+c),c\in \Bbb{R}^1\}, \bar{G}=\{g_c:g_c(\theta)=\theta + c,c\in \Bbb{R}^1\} and \tilde{G}=\{g_c:g_c(a)=a + c,c\in \Bbb{R}^1\} (additive groups).

The best invariant estimator δ(x) is the one that minimizes \frac{\int_{-\infty}^{\infty}{L(\delta(x)-\theta)f(x_1-\theta,\dots,x_n-\theta)d\theta}}{\int_{-\infty}^{\infty}{f(x_1-\theta,\dots,x_n-\theta)d\theta}} (Pitman's estimator, 1939).

For the squared error loss case, we get \delta(x)=\frac{\int_{-\infty}^{\infty}{\theta f(x_1-\theta,\dots,x_n-\theta)d\theta}}{\int_{-\infty}^{\infty}{f(x_1-\theta,\dots,x_n-\theta)d\theta}}.

If x \sim N(\theta 1_n,I) (normal distribution) then \delta_{pitman} = \delta_{ML}=\frac{\sum{x_i}}{n}.
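
A numerical sketch of the squared-error formula above (the data-generating choices are illustrative assumptions): the ratio of the two one-dimensional integrals is computed directly and, for normal data, reproduces the sample mean as stated.

```python
import numpy as np
from scipy.integrate import quad

def pitman_square_error(x, density, pad=20.0):
    # Squared-error Pitman estimator:
    #   delta(x) = int t * prod_i f(x_i - t) dt / int prod_i f(x_i - t) dt.
    # The range is truncated to [min(x)-pad, max(x)+pad]; the neglected tails are
    # negligible for the densities used here.
    joint = lambda t: np.prod(density(x - t))
    lo, hi, pts = x.min() - pad, x.max() + pad, sorted(x)
    num = quad(lambda t: t * joint(t), lo, hi, points=pts)[0]
    den = quad(joint, lo, hi, points=pts)[0]
    return num / den

std_normal = lambda u: np.exp(-u ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, size=5)                  # theta = 2, unknown to the estimator
print(pitman_square_error(x, std_normal))        # agrees with the sample mean below
print(x.mean())
```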

If x \sim C(\theta 1_n,\sigma^2 I) (Cauchy distribution with scale parameter σ) then \delta_{pitman} \ne \delta_{ML}, and for n>1, \delta_{pitman}=\sum_{k=1}^n{x_k\left[\frac{\mathrm{Re}\{w_k\}}{\sum_{m=1}^{n}{\mathrm{Re}\{w_m\}}}\right]}, where w_k = \prod_{j\ne k}\left[\frac{1}{(x_k-x_j)^2+4\sigma^2}\right]\left[1-\frac{2\sigma}{(x_k-x_j)}i\right].
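
The Cauchy case can be checked numerically as well. The sketch below (sample size, seed and scale are illustrative assumptions) evaluates the Pitman estimator both by direct integration and by the closed form quoted above, and compares them with a numerically maximised likelihood; the two Pitman computations agree with each other and generally differ from the ML estimate.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def cauchy_joint(t, x, sigma):
    # Product of Cauchy(location t, scale sigma) densities over the sample.
    return np.prod(sigma / (np.pi * (sigma ** 2 + (x - t) ** 2)))

def pitman_numerical(x, sigma=1.0, pad=50.0):
    # Squared-error Pitman estimator by direct numerical integration.
    lo, hi, pts = x.min() - pad, x.max() + pad, sorted(x)
    num = quad(lambda t: t * cauchy_joint(t, x, sigma), lo, hi, points=pts)[0]
    den = quad(lambda t: cauchy_joint(t, x, sigma), lo, hi, points=pts)[0]
    return num / den

def pitman_closed_form(x, sigma=1.0):
    # The closed form quoted above (Cohen Freue, 2007); assumes distinct sample values.
    w = np.empty(len(x), dtype=complex)
    for k in range(len(x)):
        d = x[k] - np.delete(x, k)                # x_k - x_j for j != k
        w[k] = np.prod((1.0 - 2j * sigma / d) / (d ** 2 + 4.0 * sigma ** 2))
    return np.sum(x * w.real) / np.sum(w.real)

def cauchy_ml(x, sigma=1.0):
    # Location estimate maximising the likelihood (a local maximiser found inside
    # the sample range; adequate for this illustration).
    nll = lambda t: -np.log(cauchy_joint(t, x, sigma))
    return minimize_scalar(nll, bounds=(x.min(), x.max()), method="bounded").x

rng = np.random.default_rng(1)
x = rng.standard_cauchy(5) + 3.0                  # theta = 3, sigma = 1
print(pitman_numerical(x))
print(pitman_closed_form(x))
print(cauchy_ml(x))
```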

References

  • James O. Berger, Statistical Decision Theory and Bayesian Analysis. Springer Series in Statistics, 1980. ISBN 0-387-90471-9.
  • Gabriela V. Cohen Freue, "The Pitman estimator of the Cauchy location parameter", Journal of Statistical Planning and Inference 137 (2007), 1900–1913.