Rayleigh quotient

In mathematics, for a given complex Hermitian matrix A and nonzero vector x, the Rayleigh quotient R(A,x) is defined as:

R(A, x) = \frac{x^{*} A x}{x^{*} x}

For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugate transpose x^{*} to the usual transpose x'. Note that R(A, cx) = R(A, x) for any nonzero real scalar c. Recall that a Hermitian (or real symmetric) matrix has real eigenvalues. It can be shown that the Rayleigh quotient reaches its minimum value \lambda_{\operatorname{min}} (the smallest eigenvalue of A) when x is v_{\operatorname{min}} (the corresponding eigenvector). Similarly, R(A, x) \leq \lambda_{\operatorname{max}} and R(A, v_{\operatorname{max}}) = \lambda_{\operatorname{max}}. The Rayleigh quotient is used in the min-max theorem to obtain exact values of all eigenvalues. It is also used in eigenvalue algorithms to obtain an eigenvalue approximation from an eigenvector approximation; in particular, it is the basis for Rayleigh quotient iteration.
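The following is a minimal NumPy sketch (not part of the original article; the matrix and starting vector are arbitrary choices for illustration) that checks the bounds above and runs a bare-bones Rayleigh quotient iteration:

import numpy as np

def rayleigh_quotient(A, x):
    # R(A, x) = (x* A x) / (x* x) for Hermitian A and nonzero x
    x = np.asarray(x)
    return (x.conj() @ A @ x) / (x.conj() @ x)

# A small real symmetric (hence Hermitian) example.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
eigvals, eigvecs = np.linalg.eigh(A)      # eigenvalues in ascending order

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
r = rayleigh_quotient(A, x)

assert eigvals[0] - 1e-12 <= r <= eigvals[-1] + 1e-12   # lambda_min <= R <= lambda_max
assert np.isclose(rayleigh_quotient(A, 5.0 * x), r)     # scale invariance R(A, cx) = R(A, x)
assert np.isclose(rayleigh_quotient(A, eigvecs[:, 0]), eigvals[0])    # attained at v_min
assert np.isclose(rayleigh_quotient(A, eigvecs[:, -1]), eigvals[-1])  # attained at v_max

# Rayleigh quotient iteration: refine an approximate eigenpair by
# alternating a shifted linear solve with a Rayleigh quotient update.
x /= np.linalg.norm(x)
for _ in range(20):
    r = rayleigh_quotient(A, x)
    if np.linalg.norm(A @ x - r * x) < 1e-10:   # converged eigenpair
        break
    y = np.linalg.solve(A - r * np.eye(3), x)
    x = y / np.linalg.norm(y)
print(r, eigvals)   # r matches one of the eigenvalues to high accuracy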

Special case of covariance matrices

A covariance matrix Σ can be represented as the product A'A. Its eigenvalues are real and non-negative:

\Sigma v_i = \lambda_i v_i
A' A v_i = \lambda_i v_i
v_i' A' A v_i = \lambda_i v_i' v_i
\left\| A v_i \right\|^2 = \lambda_i \left\| v_i \right\|^2
\lambda_i = \frac{\left\| A v_i \right\|^2}{\left\| v_i \right\|^2} \geq 0
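As a quick numerical check of this identity (a sketch, not from the original article; the matrix A below is an arbitrary random choice):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))   # arbitrary real matrix
S = A.T @ A                       # Sigma = A'A, symmetric
lam, V = np.linalg.eigh(S)        # columns of V are eigenvectors

# Each eigenvalue equals ||A v_i||^2 / ||v_i||^2, hence is non-negative.
for lam_i, v_i in zip(lam, V.T):
    assert np.isclose(lam_i, np.linalg.norm(A @ v_i)**2 / np.linalg.norm(v_i)**2)
assert np.all(lam >= -1e-12)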

The eigenvectors are orthogonal to one another:

A' A v_i = \lambda_i v_i
v_j' A' A v_i = \lambda_i v_j' v_i
(A' A v_j)' v_i = \lambda_i v_j' v_i \quad \text{(using the symmetry of } A'A)
\lambda_j v_j' v_i = \lambda_i v_j' v_i
(\lambda_j - \lambda_i) v_j' v_i = 0
v_j' v_i = 0 \quad \text{(for distinct eigenvalues; in case of multiplicity, the basis can be orthogonalized)}
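The same kind of sketch (again with an arbitrary random matrix, not from the original article) confirms the orthogonality numerically:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
S = A.T @ A
lam, V = np.linalg.eigh(S)

# Eigenvectors for distinct eigenvalues are orthogonal; within a repeated
# eigenvalue's eigenspace, eigh returns an orthogonalized basis, so in
# either case V'V is the identity.
assert np.allclose(V.T @ V, np.eye(4))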

The Rayleigh quotient can be expressed as a function of the eigenvalues by expanding any vector x in the basis of eigenvectors:

x = \sum _{i=1} ^n \alpha _i v_i
\rho = \frac{x' A' A x}{x' x}
\rho = \frac{(\sum _{j=1} ^n \alpha _j v_j)' A' A (\sum _{i=1} ^n \alpha _i v_i)}{(\sum _{j=1} ^n \alpha _j v_j)' (\sum _{i=1} ^n \alpha _i v_i)}

which, by the orthogonality of the eigenvectors (taken with unit norm), becomes:

\rho = \frac{\sum _{i=1} ^n \alpha _i ^2 \lambda _i}{\sum _{i=1} ^n \alpha _i ^2}

Since R(A, kx) = R(A, x) for any scalar k \ne 0, maximizing \rho over all nonzero x is equivalent to the Lagrange problem of maximizing \sum _{i=1} ^n \alpha _i ^2 \lambda _i under the constraint that \sum _{i=1} ^n \alpha _i ^2 = 1.

Since all the eigenvalues are non-negative and the objective is linear in the \alpha _i ^2, the maximum is attained at an extreme point of the constraint set, namely when \alpha _1 = 1 and \alpha _i = 0 for all i > 1 (when the eigenvalues are ordered in decreasing magnitude); the maximum value is then the largest eigenvalue \lambda _1.

This property is the basis for principal components analysis and canonical correlation.
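A short sketch (not from the original article; the toy data set is an arbitrary choice) ties the pieces together: the quotient is the weighted average of eigenvalues derived above, and its maximizer over unit vectors is the top eigenvector, i.e. the first principal axis:

import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 4))   # toy data, one row per observation
X -= X.mean(axis=0)                 # center the columns
S = X.T @ X / (len(X) - 1)          # sample covariance matrix
lam, V = np.linalg.eigh(S)          # ascending eigenvalues

# rho(x) = x'Sx / x'x equals the alpha_i^2-weighted average of eigenvalues.
x = rng.standard_normal(4)
alpha = V.T @ x                     # coordinates of x in the eigenbasis
rho = (x @ S @ x) / (x @ x)
assert np.isclose(rho, (alpha**2 @ lam) / np.sum(alpha**2))

# No unit vector exceeds lambda_max, and the bound is attained at the
# top eigenvector V[:, -1] -- the first principal component direction.
U = rng.standard_normal((1000, 4))
U /= np.linalg.norm(U, axis=1, keepdims=True)
rhos = np.einsum('ij,jk,ik->i', U, S, U)   # rho for each unit vector
assert rhos.max() <= lam[-1] + 1e-12
assert np.isclose(V[:, -1] @ S @ V[:, -1], lam[-1])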

Use in Sturm-Liouville theory

Sturm-Liouville theory concerns the action of the linear operator

L(y) = \frac{1}{w(x)}\left(-\frac{d}{dx}\left[p(x)\frac{dy}{dx}\right] + q(x)y\right)

on the inner product space defined by

\langle{y_1,y_2}\rangle = \int_a^b{w(x)y_1(x)y_2(x)}dx

of functions satisfying some specified boundary conditions at a and b. In this case the Rayleigh quotient is

\frac{\langle{y,Ly}\rangle}{\langle{y,y}\rangle} = \frac{\int_a^b{y(x)\left(-\frac{d}{dx}\left[p(x)\frac{dy}{dx}\right] + q(x)y(x)\right)}dx}{\int_a^b{w(x)y(x)^2}dx}

This is sometimes presented in an equivalent form, obtained by separating the integral in the numerator and using integration by parts:

\frac{\langle{y,Ly}\rangle}{\langle{y,y}\rangle} = \frac{\int_a^b{y(x)\left(-\frac{d}{dx}\left[p(x)y'(x)\right]\right)}dx + \int_a^b{q(x)y(x)^2}dx}{\int_a^b{w(x)y(x)^2}dx}
= \frac{-y(x)\left[p(x)y'(x)\right]|_a^b + \int_a^b{y'(x)\left[p(x)y'(x)\right]}dx + \int_a^b{q(x)y(x)^2}dx}{\int_a^b{w(x)y(x)^2}dx}
= \frac{-p(x)y(x)y'(x)|_a^b + \int_a^b\left[p(x)y'(x)^2 + q(x)y(x)^2\right]dx}{\int_a^b{w(x)y(x)^2}dx}
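As a worked example (not from the original article), take the simplest Sturm-Liouville problem -y'' = \lambda y on [0, \pi] with y(0) = y(\pi) = 0, i.e. p = 1, q = 0, w = 1, whose smallest eigenvalue is \lambda_1 = 1 with eigenfunction \sin x. For a trial function satisfying the boundary conditions, the boundary term above vanishes and the Rayleigh quotient yields an upper bound on \lambda_1. A sketch in NumPy with the trial function y = x(\pi - x):

import numpy as np

# With the boundary term gone, R(y) reduces to
#   int_a^b (p y'^2 + q y^2) dx / int_a^b w y^2 dx = int y'^2 dx / int y^2 dx.
x = np.linspace(0.0, np.pi, 100001)
y = x * (np.pi - x)        # trial function, zero at both endpoints
yp = np.pi - 2.0 * x       # its exact derivative

R = np.sum(yp**2) / np.sum(y**2)   # rectangle rule; the dx factors cancel
print(R)   # ~10/pi^2 = 1.0132..., an upper bound on lambda_1 = 1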
