Radon–Nikodym theorem

From Wikipedia, the free encyclopedia

In mathematics, the Radon–Nikodym theorem is a result in measure theory which states that, given a measurable space (X,Σ), if a sigma-finite measure ν on (X,Σ) is absolutely continuous with respect to a sigma-finite measure μ on (X,Σ), then there is a measurable function f on X, taking values in [0,∞), such that

$\nu(A) = \int_A f \, d\mu$

for any measurable set A.
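On a finite set the statement can be checked by hand, which may help make the notation concrete. The sketch below is only an illustration under stated assumptions: the four-point space, the point masses, and the helper names (`measure`, `radon_nikodym`, `integrate`) are all hypothetical and not part of the theorem. With the power set as Σ, integration reduces to a weighted sum, and f is the ratio of point masses wherever μ is positive.

```python
from fractions import Fraction

# Hypothetical finite space X = {"a", "b", "c", "d"} with the power set as sigma-algebra.
# A measure is stored as a dictionary of point masses.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
nu = {"a": Fraction(1, 8), "b": Fraction(3, 8), "c": Fraction(1, 2), "d": Fraction(0)}


def measure(m, A):
    """Measure of the set A under the point-mass dictionary m."""
    return sum((m[x] for x in A), Fraction(0))


def is_absolutely_continuous(nu, mu):
    """On a finite space, nu << mu exactly when every mu-null point is nu-null."""
    return all(nu[x] == 0 for x in mu if mu[x] == 0)


def radon_nikodym(nu, mu):
    """Return f with nu(A) equal to the sum over x in A of f(x) * mu({x})."""
    if not is_absolutely_continuous(nu, mu):
        raise ValueError("nu is not absolutely continuous with respect to mu")
    # On mu-null points f may take any value; 0 is an arbitrary choice.
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}


def integrate(f, m, A):
    """Integral of f over A with respect to m (a finite weighted sum)."""
    return sum((f[x] * m[x] for x in A), Fraction(0))


f = radon_nikodym(nu, mu)
A = {"a", "c"}
print(measure(nu, A), integrate(f, mu, A))  # both equal 5/8
```

The arbitrary value chosen on the μ-null point "d" reflects the fact that f is only determined up to a μ-null set.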

The theorem is named after Johann Radon, who proved it in 1913 for the special case where the underlying space is R^N, and Otton Nikodym, who proved the general case in 1930.

1 Radon–Nikodym derivative
2 Applications
3 Properties
4 Further applications
- 4.1 Information divergences
5 The assumption of sigma-finiteness
6 Proof
7 References

Radon–Nikodym derivative

The function f satisfying the above equality is uniquely defined up to a μ-null set; that is, if g is another function which satisfies the same property, then f = g μ-almost everywhere. The function f is commonly written dν/dμ and is called the Radon–Nikodym derivative. The notation and the name reflect the fact that the function is analogous to a derivative in calculus, in the sense that it describes the rate of change of density of one measure with respect to another.

A similar theorem can be proven for signed and complex measures: namely, if μ is a nonnegative σ-finite measure, and ν is a finite-valued signed or complex measure such that $|\nu| \ll \mu$, then there is a μ-integrable real- or complex-valued function g on X such that

$\nu(A) = \int_A g \, d\mu,$

for any measurable set A.
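Two familiar special cases may help fix ideas; both are standard facts offered as illustration rather than as part of the statement above. If μ is Lebesgue measure on the real line and ν is the standard normal distribution, the Radon–Nikodym derivative is the usual probability density:

$\frac{d\nu}{d\mu}(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$

If instead μ is the counting measure on a countable set X and ν is any σ-finite measure on the subsets of X, the derivative is the point-mass function:

$\frac{d\nu}{d\mu}(x) = \nu(\{x\}).$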

Applications

The theorem is very important in extending the ideas of probability theory from probability masses and probability densities defined over real numbers to probability measures defined over arbitrary sets. It tells whether, and how, it is possible to change from one probability measure to another.

For example, it is necessary when proving the existence of conditional expectation for probability measures. The latter itself is a key concept in probability theory, as conditional probability is just a special case of it.
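The link to conditional expectation can be seen on a finite probability space. The following is a minimal sketch under assumed data: a fair die, the random variable Y equal to the face value, and the sub-sigma-algebra G generated by the partition into odd and even faces are all hypothetical choices, as is the helper name `conditional_expectation`. The measure ν(A) = ∫_A Y dP, restricted to G, is absolutely continuous with respect to P restricted to G, and its Radon–Nikodym derivative is E[Y | G].

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, Y = face value,
# and G generated by the partition {odd faces, even faces}.
P = {w: Fraction(1, 6) for w in range(1, 7)}
Y = {w: Fraction(w) for w in P}
partition = [{1, 3, 5}, {2, 4, 6}]


def conditional_expectation(Y, P, partition):
    """E[Y | G] as the derivative of nu(A) = int_A Y dP with respect to P on G.

    Assumes every block of the partition has positive probability.
    """
    cond = {}
    for block in partition:
        p_block = sum(P[w] for w in block)          # P(block)
        nu_block = sum(Y[w] * P[w] for w in block)  # nu(block) = integral of Y over block
        for w in block:
            cond[w] = nu_block / p_block            # constant on each atom of G
    return cond


cond = conditional_expectation(Y, P, partition)
print(cond[1], cond[2])  # 3 on odd faces, 4 on even faces
```

The result is G-measurable (constant on each block) and integrates to the same value as Y over every set in G, which is the defining property of conditional expectation.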

Amongst other fields, financial mathematics uses the theorem extensively. Such changes of probability measure are the cornerstone of the rational pricing of derivative securities and are used for converting actual probabilities into risk-neutral probabilities.
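A change of measure of this kind can be sketched in a one-period, two-outcome market. Every number below (initial price, up and down prices, interest rate, real-world probabilities, strike) is a hypothetical assumption chosen for illustration. The point is only that an expectation under the risk-neutral measure Q equals an expectation under the real-world measure P once the integrand is weighted by dQ/dP.

```python
from fractions import Fraction

# Hypothetical one-period market; all numbers are illustrative assumptions.
S0, S_up, S_down = Fraction(100), Fraction(120), Fraction(90)
r = Fraction(0)  # interest rate over the period

# Real-world measure P and risk-neutral measure Q on the outcomes {"up", "down"}.
P = {"up": Fraction(1, 2), "down": Fraction(1, 2)}
q_up = ((1 + r) * S0 - S_down) / (S_up - S_down)
Q = {"up": q_up, "down": 1 - q_up}

# Radon-Nikodym derivative dQ/dP; P charges every outcome, so Q << P.
dQ_dP = {w: Q[w] / P[w] for w in P}

# Payoff of a call option with strike 100 (hypothetical contract).
payoff = {"up": max(S_up - 100, 0), "down": max(S_down - 100, 0)}

# The same price computed under Q directly, and under P with the density weight.
price_under_Q = sum(payoff[w] * Q[w] for w in Q) / (1 + r)
price_under_P = sum(payoff[w] * dQ_dP[w] * P[w] for w in P) / (1 + r)
print(price_under_Q, price_under_P)  # both equal 20/3
```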

Properties

Let ν, μ, and λ be σ-finite measures on the same measurable space. If ν << λ and μ << λ (ν and μ are absolutely continuous with respect to λ), then

$\frac{d(\nu+\mu)}{d\lambda} = \frac{d\nu}{d\lambda}+\frac{d\mu}{d\lambda}\quad\lambda\text{-almost everywhere}.$

If ν << μ << λ, then

$\frac{d\nu}{d\lambda}=\frac{d\nu}{d\mu}\frac{d\mu}{d\lambda}\quad\lambda\text{-almost everywhere}.$

If μ << λ and g is a μ-integrable function, then

$\int_X g\,d\mu = \int_X g\frac{d\mu}{d\lambda}\,d\lambda.$

If μ << ν and ν << μ, then

$\frac{d\mu}{d\nu}=\left(\frac{d\nu}{d\mu}\right)^{-1}.$

If ν is a finite signed or complex measure with |ν| << μ, then

${d|\nu|\over d\mu} = \left|{d\nu\over d\mu}\right|.$
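These identities are easy to confirm on a finite space, where each derivative is a pointwise ratio of point masses. The sketch below assumes three hypothetical measures with strictly positive masses on a three-point set, so that all of them are mutually absolutely continuous; the helper `deriv` and the test function `g` are likewise illustrative names.

```python
from fractions import Fraction

X = ["a", "b", "c"]
# Hypothetical strictly positive point masses, so all three measures are
# mutually absolutely continuous and every derivative below exists.
lam = dict(zip(X, [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]))
mu = dict(zip(X, [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]))
nu = dict(zip(X, [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]))


def deriv(top, bottom):
    """Pointwise Radon-Nikodym derivative d(top)/d(bottom)."""
    return {x: top[x] / bottom[x] for x in X}


nu_plus_mu = {x: nu[x] + mu[x] for x in X}

# Linearity: d(nu + mu)/d(lam) = d(nu)/d(lam) + d(mu)/d(lam)
linear_ok = all(
    deriv(nu_plus_mu, lam)[x] == deriv(nu, lam)[x] + deriv(mu, lam)[x] for x in X
)
# Chain rule: d(nu)/d(lam) = d(nu)/d(mu) * d(mu)/d(lam)
chain_ok = all(
    deriv(nu, lam)[x] == deriv(nu, mu)[x] * deriv(mu, lam)[x] for x in X
)
# Reciprocal: d(mu)/d(nu) = (d(nu)/d(mu))^(-1)
inverse_ok = all(deriv(mu, nu)[x] == 1 / deriv(nu, mu)[x] for x in X)

# Change of variables: int g d(mu) = int g * d(mu)/d(lam) d(lam)
g = {"a": Fraction(2), "b": Fraction(-1), "c": Fraction(5)}
lhs = sum(g[x] * mu[x] for x in X)
rhs = sum(g[x] * deriv(mu, lam)[x] * lam[x] for x in X)

print(linear_ok, chain_ok, inverse_ok, lhs == rhs)
```

Exact rational arithmetic is used so that the equalities hold literally rather than up to rounding.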

Further applications

Information divergences

If μ and ν are measures over X with ν << μ, then the Kullback–Leibler divergence from μ to ν is defined to be

$D_{\mathrm{KL}}(\mu\|\nu) = - \int_X \log \left( \frac{d \nu}{d \mu} \right) \; d\mu. \!$
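For discrete probability measures the integral above becomes a finite sum. The following sketch assumes two hypothetical distributions with strictly positive masses on a two-point set (so both ν << μ and μ << ν hold), and compares the formula above with the familiar sum of μ(x) log(μ(x)/ν(x)); the function name `kl_divergence` is illustrative.

```python
import math

# Hypothetical probability measures on {"a", "b"} with strictly positive masses.
mu = {"a": 0.5, "b": 0.5}
nu = {"a": 0.25, "b": 0.75}

# Radon-Nikodym derivative d(nu)/d(mu) as a pointwise ratio.
dnu_dmu = {x: nu[x] / mu[x] for x in mu}


def kl_divergence(mu, dnu_dmu):
    """D_KL(mu || nu) = -integral of log(d(nu)/d(mu)) with respect to mu."""
    return -sum(math.log(dnu_dmu[x]) * mu[x] for x in mu)


d_kl = kl_divergence(mu, dnu_dmu)

# The same quantity in its more common discrete form, for comparison.
d_kl_textbook = sum(mu[x] * math.log(mu[x] / nu[x]) for x in mu)
print(d_kl, d_kl_textbook)
```

The two values agree because log(dν/dμ) = −log(μ(x)/ν(x)) pointwise when both masses are positive.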

For α > 0, α ≠ 1, the Rényi divergence of order α from μ to ν is defined to be

$D_{\alpha}(\mu\|\nu) = \frac{1}{\alpha-1} \log \left( \int_X \left( \frac{d \nu}{d \mu} \right)^{1-\alpha} \, d\mu \right).$

The assumption of sigma-finiteness

The Radon–Nikodym theorem assumes that the measure μ with respect to which one computes the rate of change of ν is sigma-finite. Here is an example in which μ is not sigma-finite and the conclusion of the theorem fails.

Consider the Borel sigma-algebra on the real line. Let the counting measure, μ, of a Borel set A be defined as the number of elements of A if A is finite, and +∞ otherwise. One can check that μ is indeed a measure. It is not sigma-finite, as not every Borel set is at most a countable union of finite sets. Let ν be the usual Lebesgue measure on this Borel algebra. Then, ν is absolutely continuous with respect to μ, since for a set A one has μ(A) = 0 only if A is the empty set, and then ν(A) is also zero.

Assume that the Radon–Nikodym theorem holds, that is, for some measurable function f one has

$\nu(A) = \int_A f \, \mathrm{d} \mu$

for all Borel sets. Taking A to be a singleton set, A = {a}, and using the above equality, one finds

$0 = f(a)\,$

for all real numbers a. This implies that the function f, and therefore the Lebesgue measure ν, is zero, which is a contradiction.
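Spelled out, the singleton computation uses only that μ({a}) = 1 and ν({a}) = 0:

$0 = \nu(\{a\}) = \int_{\{a\}} f \, \mathrm{d}\mu = f(a)\,\mu(\{a\}) = f(a),$

and the contradiction can then be read off from any set of positive Lebesgue measure, for instance the unit interval:

$1 = \nu([0,1]) = \int_{[0,1]} f \, \mathrm{d}\mu = \int_{[0,1]} 0 \, \mathrm{d}\mu = 0.$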

Proof

This section gives a measure-theoretic proof of the theorem. There is also a functional-analytic proof, using Hilbert space methods, that was first given by von Neumann.

For finite measures μ and ν, the idea is to consider functions f with f dμ ≤ dν. The supremum of all such functions, along with the monotone convergence theorem, then furnishes the Radon–Nikodym derivative. The fact that the remaining part of ν is singular with respect to μ follows from a technical fact about finite measures. Once the result is established for finite measures, extending to σ-finite, signed, and complex measures can be done naturally. The details are given below.

For finite measures

First, suppose that μ and ν are both finite-valued nonnegative measures. Let F be the set of those measurable functions f : X→[0, +∞] satisfying

$\int_A f\,d\mu\leq\nu(A)$

for every A ∈ Σ (this set is not empty, for it contains at least the zero function). Let f₁, f₂ ∈ F; let A be an arbitrary measurable set, A₁ = {x ∈ A | f₁(x) > f₂(x)}, and A₂ = {x ∈ A | f₂(x) ≥ f₁(x)}. Then one has

$\int_A\max\{f_1,f_2\}\,d\mu = \int_{A_1} f_1\,d\mu+\int_{A_2} f_2\,d\mu \leq \nu(A_1)+\nu(A_2)=\nu(A),$

and therefore, max{f₁,f₂} ∈ F.
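The closure of F under pointwise maxima can be checked by brute force on a small example. The sketch below assumes a hypothetical three-point space with the power set as Σ and illustrative point masses; `in_F` tests the defining inequality on every subset, and `f1`, `f2` are two arbitrarily chosen members of F.

```python
from fractions import Fraction
from itertools import chain, combinations

X = ["a", "b", "c"]
# Hypothetical point masses for mu and nu on a three-point space.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
nu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(1, 4)}


def measurable_sets():
    """All subsets of X (the power set plays the role of Sigma)."""
    return chain.from_iterable(combinations(X, k) for k in range(len(X) + 1))


def in_F(f):
    """Check that the integral of f over A is at most nu(A) for every A."""
    return all(
        sum(f[x] * mu[x] for x in A) <= sum(nu[x] for x in A)
        for A in measurable_sets()
    )


f1 = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(1, 2)}
f2 = {"a": Fraction(0), "b": Fraction(2), "c": Fraction(1)}
f_max = {x: max(f1[x], f2[x]) for x in X}

print(in_F(f1), in_F(f2), in_F(f_max))  # True True True
```

In this particular example the maximum already equals the ratio of point masses ν({x})/μ({x}) at every point, which is the largest element of F.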

Now, let {f_n}_n be a sequence of functions in F such that

$\lim_{n\to\infty}\int_X f_n\,d\mu=\sup_{f\in F} \int_X f\,d\mu.$

By replacing f_n with the maximum of the first n functions, one can assume that the sequence {f_n} is increasing. Let g be a function defined as

$g(x):=\lim_{n\to\infty}f_n(x).$

By Lebesgue's monotone convergence theorem, one has

$\int_A g\,d\mu=\lim_{n\to\infty} \int_A f_n\,d\mu \leq \nu(A)$

for each A ∈ Σ, and hence, g ∈ F. Also, by the construction of g,

$\int_X g\,d\mu=\sup_{f\in F}\int_X f\,d\mu.$

Now, since g ∈ F,

$\nu_0(A):=\nu(A)-\int_A g\,d\mu$

defines a nonnegative measure on Σ. Suppose ν₀ ≠ 0; then, since μ is finite, there is an ε > 0 such that ν₀(X) > ε μ(X). Let (P,N) be a Hahn decomposition for the signed measure ν₀ − ε μ. Note that for every A ∈ Σ one has ν₀(A∩P) ≥ ε μ(A∩P), and hence,

$\nu(A)=\int_A g\,d\mu+\nu_0(A) \geq \int_A g\,d\mu+\nu_0(A\cap P)$

$\geq \int_A g\,d\mu +\varepsilon\mu(A\cap P)=\int_A(g+\varepsilon1_P)\,d\mu.$

Also, note that μ(P) > 0; for if μ(P) = 0, then (since ν is absolutely continuous with respect to μ) ν₀(P) ≤ ν(P) = 0, so ν₀(P) = 0 and

$\nu_0(X)-\varepsilon\mu(X)=(\nu_0-\varepsilon\mu)(N)\leq 0,$

contradicting the fact that ν₀(X) > ε μ (X).

Then, since

$\int_X g\,d\mu \leq \nu(X) < +\infty,$

g + ε 1_P ∈ F and satisfies

$\int_X (g+\varepsilon 1_P)\,d\mu>\int_X g\,d\mu=\sup_{f\in F}\int_X f\,d\mu.$

This is impossible; therefore, the initial assumption that ν₀ ≠ 0 must be false. So ν₀ = 0, as desired.
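The mechanism of this step can be watched on a finite space by starting from an element of F that is deliberately not maximal. In the sketch below the space, the point masses, the starting function `g`, and the choice of ε as half of ν₀(X)/μ(X) are all hypothetical. On a finite space the positive set P of a Hahn decomposition for ν₀ − εμ can be taken to be the points where that signed measure is nonnegative. The construction then yields an element of F with strictly larger integral, which is exactly what cannot happen when g attains the supremum.

```python
from fractions import Fraction
from itertools import chain, combinations

X = ["a", "b", "c"]
# Hypothetical point masses; nu << mu because mu charges every point.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
nu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(1, 4)}


def subsets():
    return chain.from_iterable(combinations(X, k) for k in range(len(X) + 1))


def integral(f, A):
    """Integral of f over A with respect to mu."""
    return sum((f[x] * mu[x] for x in A), Fraction(0))


def in_F(f):
    return all(integral(f, A) <= sum(nu[x] for x in A) for A in subsets())


# A member of F that is NOT maximal (chosen for illustration only).
g = {x: Fraction(1) for x in X}

# nu_0(A) = nu(A) - int_A g d(mu), stored as point masses.
nu0 = {x: nu[x] - g[x] * mu[x] for x in X}

# Any eps with nu_0(X) > eps * mu(X) works; half the ratio is one choice.
eps = sum(nu0.values()) / (2 * sum(mu.values()))

# Positive set of a Hahn decomposition for nu_0 - eps * mu.
P = [x for x in X if nu0[x] - eps * mu[x] >= 0]

g_improved = {x: g[x] + (eps if x in P else 0) for x in X}
print(P, integral(g, X), integral(g_improved, X), in_F(g_improved))
```

With these numbers P is the single point "b", and the integral rises from 1 to 33/32 while the improved function remains in F.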

Now, since g is μ-integrable, the set {x∈X | g(x)=+∞} is μ-null. Therefore, if f is defined as

$f(x)=\begin{cases} g(x)&\mbox{if }g(x) < \infty\\0&\mbox{otherwise,}\end{cases}$

then f has the desired properties.

As for the uniqueness, let f,g : X→[0,+∞) be measurable functions satisfying

$\nu(A)=\int_A f\,d\mu=\int_A g\,d\mu$

for every measurable set A. Then, g − f is μ-integrable, and

$\int_A (g-f)\,d\mu=0.$

In particular, this holds for A = {x ∈ X | f(x) > g(x)} and for A = {x ∈ X | f(x) < g(x)}. It follows that

$\int_X (g-f)^+\,d\mu=0=\int_X (g-f)^-\,d\mu,$

and so (g − f)⁺ = 0 μ-almost everywhere; the same is true for (g − f)⁻, and thus f = g μ-almost everywhere, as desired.

For σ-finite positive measures

If μ and ν are σ-finite, then X can be written as the union of a sequence {B_n}_n of disjoint sets in Σ, each of which has finite measure under both μ and ν. For each n, there is a Σ-measurable function f_n : B_n→[0,+∞) such that

$\nu(A)=\int_A f_n\,d\mu$

for each Σ-measurable subset A of B_n. The function f on X that agrees with f_n on each B_n is then the required function.

As for the uniqueness, since each of the f_n is unique up to μ-almost everywhere equality, so is f.
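The gluing step can be written out as follows; this is a sketch in the notation above, with 1_{B_n} denoting the indicator function of B_n and each f_n extended by zero outside B_n:

$f := \sum_{n} f_n 1_{B_n}, \qquad \nu(A) = \sum_n \nu(A\cap B_n) = \sum_n \int_{A\cap B_n} f_n\,d\mu = \int_A f\,d\mu,$

where the first equality is the countable additivity of ν over the disjoint sets A ∩ B_n, and the last follows from the monotone convergence theorem applied to the partial sums.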

For signed and complex measures

If ν is a σ-finite signed measure, then it can be Hahn–Jordan decomposed as ν = ν⁺−ν⁻ where one of the measures is finite. Applying the previous result to those two measures, one obtains two functions, g,h : X→[0,+∞), satisfying the Radon–Nikodym theorem for ν⁺ and ν⁻ respectively, at least one of which is μ-integrable (i.e., its integral with respect to μ is finite). It is clear then that f = g−h satisfies the required properties, including uniqueness, since both g and h are unique up to μ-almost everywhere equality.

If ν is a complex measure, it can be decomposed as ν = ν₁+i ν₂, where both ν₁ and ν₂ are finite-valued signed measures. Applying the above argument, one obtains two real-valued functions, g,h : X→ℝ, satisfying the required properties for ν₁ and ν₂, respectively. Clearly, f = g + i h is the required function.
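The signed case can be sketched on a finite space, where the Jordan decomposition is just the split of each point mass into its positive and negative parts. The three-point space and the masses below are hypothetical. The last check illustrates the identity d|ν|/dμ = |dν/dμ| from the Properties section.

```python
from fractions import Fraction

X = ["a", "b", "c"]
# Hypothetical data: mu is a positive measure, nu a finite signed measure.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
nu = {"a": Fraction(1, 4), "b": Fraction(-1, 2), "c": Fraction(0)}

# Jordan decomposition nu = nu_plus - nu_minus, taken pointwise.
nu_plus = {x: max(nu[x], Fraction(0)) for x in X}
nu_minus = {x: max(-nu[x], Fraction(0)) for x in X}

# Nonnegative derivatives of the two parts, then their difference.
g = {x: nu_plus[x] / mu[x] for x in X}
h = {x: nu_minus[x] / mu[x] for x in X}
f = {x: g[x] - h[x] for x in X}

# nu(A) equals the integral of f over A with respect to mu.
A = ["a", "b"]
nu_A = sum(nu[x] for x in A)
integral_A = sum(f[x] * mu[x] for x in A)

# The total variation |nu| = nu_plus + nu_minus has derivative |f|.
total_variation = {x: nu_plus[x] + nu_minus[x] for x in X}
abs_ok = all(total_variation[x] / mu[x] == abs(f[x]) for x in X)

print(nu_A, integral_A, abs_ok)  # -1/4 -1/4 True
```

A complex measure would be handled the same way, applying the construction separately to its real and imaginary parts.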

References

Shilov, G. E., and Gurevich, B. L., 1978. Integral, Measure, and Derivative: A Unified Approach, Richard A. Silverman, trans. Dover Publications. ISBN 0486635198.

This article incorporates material from Radon-Nikodym theorem on PlanetMath, which is licensed under the GFDL.
