Eigenvalue algorithm

From Wikipedia, the free encyclopedia

In linear algebra, one of the most important problems is designing efficient and stable algorithms for finding the eigenvalues of a matrix. These eigenvalue algorithms may also find eigenvectors.

Characteristic polynomial

Given a square matrix A, an eigenvalue λ and its associated eigenvector v are, by definition, a pair obeying the relation

A\mathbf{v} = \lambda\mathbf{v}

Equivalently, (A−λI)v = 0, implying det(A−λI) = 0. This determinant can be expanded into a polynomial in λ, known as the characteristic polynomial of A. One common method for determining the eigenvalues of a small matrix is by finding roots of the characteristic polynomial.

Unfortunately, this method has some limitations. By the Abel-Ruffini theorem, a general polynomial of degree n > 4 cannot be solved by a finite sequence of arithmetic operations and radicals. Efficient root-finding algorithms do exist for higher-degree polynomials, but finding the roots of the characteristic polynomial can be an ill-conditioned problem even when the underlying eigenvalue problem is well-conditioned. For this reason, this method is rarely used.

The above discussion implies a restriction on all eigenvalue algorithms. It can be shown that for any polynomial, there exists a matrix (see companion matrix) having that polynomial as its characteristic polynomial (actually, there are infinitely many). If there did exist a finite sequence of arithmetic operations for exactly finding the eigenvalues of a general matrix, this would provide a corresponding finite sequence for general polynomials, in contradiction of the Abel-Ruffini theorem. Therefore, general eigenvalue algorithms are expected to be iterative.
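This polynomial-matrix equivalence can be illustrated numerically. The following sketch (numpy and the specific cubic are illustrative choices, not from the original) builds the standard companion matrix of a monic polynomial and checks that its eigenvalues are exactly the polynomial's roots.

```python
# Sketch: the companion matrix of a monic polynomial has that polynomial
# as its characteristic polynomial, so polynomial root-finding and
# eigenvalue computation are equivalent problems.
import numpy as np

def companion(coeffs):
    """Companion matrix of x^n + coeffs[0] x^(n-1) + ... + coeffs[-1]."""
    n = len(coeffs)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)            # sub-diagonal of ones
    C[:, -1] = -np.asarray(coeffs)[::-1]  # last column holds -c_0, -c_1, ...
    return C

# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
C = companion([-6.0, 11.0, -6.0])
eigs = np.sort(np.linalg.eigvals(C).real)
```

Since finding the eigenvalues of C is the same problem as finding the roots of the cubic, no finite exact procedure can exist for either in general.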

Power iteration

The basic idea of this method is to choose an (arbitrary) initial vector b and then repeatedly multiply it by the matrix, iteratively calculating Ab, A²b, A³b, …. Suppose the eigenvalues are ordered by magnitude, with λ1 being the largest and v1 an associated eigenvector. Then each iteration scales the component of b in the v1 direction by λ1 and every other component by a smaller factor (assuming |λ2| < |λ1|). Except for initial vectors in a set of measure zero, the result will converge to an eigenvector corresponding to the dominant eigenvalue. In practice, the vector should be normalized after every iteration.

By itself, power iteration is not very useful. Its convergence is slow except for special cases of matrices, and without modification, it can only find the largest or dominant eigenvalue (and the corresponding eigenvector). However, we can understand a few of the more advanced eigenvalue algorithms as variations of power iteration. In addition, some of the better algorithms for the generalized eigenvalue problem are based on power iteration.
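The method described above can be sketched in a few lines; the following is a minimal illustration (numpy, the 2×2 example matrix, and the Rayleigh-quotient eigenvalue estimate are assumptions for the sketch, not part of the original text).

```python
# Minimal power iteration: repeatedly apply A and normalize. For almost
# every start vector this converges to an eigenvector of the dominant
# eigenvalue, assuming |lambda_1| > |lambda_2|.
import numpy as np

def power_iteration(A, iters=500):
    b = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(iters):
        b = A @ b
        b /= np.linalg.norm(b)        # normalize every iteration
    # Rayleigh quotient estimates the dominant eigenvalue (b is unit-norm)
    return b @ A @ b, b

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, v = power_iteration(A)
```

The convergence rate is governed by the ratio |λ2|/|λ1|, which is why plain power iteration is slow when the two largest eigenvalues are close in magnitude.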

Matrix eigenvalues

In mathematics, and in particular in linear algebra, an important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (A - λI) v = 0 (where I is the identity matrix) has a non-zero solution v (namely an eigenvector), and so it is equivalent to the determinant det (A - λI) being zero. The function p(λ) = det (A - λI) is a polynomial in λ since determinants are defined as sums of products. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial.

It follows that we can compute all the eigenvalues of a matrix A by solving the equation pA(λ) = 0. If A is an n-by-n matrix, then pA has degree n, so A can have at most n eigenvalues; by the fundamental theorem of algebra, the equation has exactly n roots (zeroes), counted with multiplicity. Every real polynomial of odd degree has a real root, so for odd n, every real matrix has at least one real eigenvalue. For a real matrix of any size, the non-real eigenvalues come in conjugate pairs.
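For a small matrix this procedure can be carried out numerically; a sketch using numpy's `poly` and `roots` helpers (the example matrix is an illustrative choice):

```python
# Eigenvalues as roots of the characteristic polynomial.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
coeffs = np.poly(A)                 # coefficients of det(lambda*I - A)
eigs = np.sort(np.roots(coeffs).real)
```

For this symmetric example the two roots, 1 and 3, are both real, consistent with the conjugate-pair rule (no non-real eigenvalues appear).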

An example of a matrix with no real eigenvalues is the 90-degree rotation

\begin{bmatrix}0 & 1\\ -1 & 0\end{bmatrix}

whose characteristic polynomial is λ² + 1, so its eigenvalues are the complex-conjugate pair i, −i.

The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic polynomial, that is, pA(A) = 0.
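The Cayley-Hamilton theorem can be checked numerically; in the following sketch (numpy and the example matrix are illustrative assumptions), substituting A into its own characteristic polynomial yields the zero matrix up to rounding error.

```python
# Numerical check of the Cayley-Hamilton theorem: p_A(A) = 0.
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
c = np.poly(A)                      # char. poly coefficients, highest degree first
n = A.shape[0]
# p_A(A) = c[0] A^n + c[1] A^(n-1) + ... + c[n] I
pA = sum(ci * np.linalg.matrix_power(A, n - i) for i, ci in enumerate(c))
```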

Types

Eigenvalues of 2×2 matrices

An analytic solution for the eigenvalues of 2×2 matrices can be obtained directly from the quadratic formula: if

A = \begin{bmatrix} a  & b \\ c & d \end{bmatrix}

then the characteristic polynomial is

{\rm det} \begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix}=(a-\lambda)(d-\lambda)-bc=\lambda^2-(a+d)\lambda+(ad-bc)

so the solutions are

 \lambda = \frac{a + d}{2}  \pm \sqrt{\frac{(a + d)^2}{4} + bc - ad} = \frac{a + d}{2}  \pm \frac{\sqrt{4bc + (a - d)^2  }}{2}

Notice that the characteristic polynomial of a 2×2 matrix can be written in terms of the trace tr(A) = a + d and determinant det(A) = ad − bc as

{\rm det} \begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix}
  = {\rm det} \left[ A - \lambda I_{2}\right] 
  = \lambda^2- \lambda {\rm tr}(A)+ {\rm det}(A)

where I2 is the 2×2 identity matrix. The solutions for the eigenvalues of a 2×2 matrix can thus be written as

\lambda = \frac{1}{2} \left({\rm tr}(A) \pm \sqrt{{\rm tr}^2 (A) - 4 {\rm det}(A)} \right)

Note that this formula holds only for 2×2 matrices.
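The trace/determinant formula above translates directly into code; this sketch (numpy and the example matrix are illustrative) also admits complex eigenvalues via the `+0j` trick.

```python
# Closed-form eigenvalues of a 2x2 matrix from trace and determinant.
import numpy as np

def eig2x2(A):
    t, d = np.trace(A), np.linalg.det(A)
    disc = np.sqrt(t * t - 4.0 * d + 0j)   # +0j so complex pairs are allowed
    return (t + disc) / 2.0, (t - disc) / 2.0

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
l1, l2 = eig2x2(A)
```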

Eigenvalues of 3×3 matrices

If

A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}

then the characteristic polynomial of A is

\det \begin{bmatrix} a-\lambda & b & c \\ d & e-\lambda & f \\ g & h & i-\lambda \end{bmatrix}= -\lambda^3 + \lambda^2 ( a + e + i ) + \lambda ( db + gc + fh - ae - ai - ei ) + ( aei - afh - dbi + dch + gbf - gce ).

Alternatively, the characteristic polynomial of a 3×3 matrix can be written in terms of the trace tr(A) and determinant det(A) as

{\rm det} \left[ A - \lambda I_{3}\right]
= -\lambda^3 
  + \lambda^2 {\rm tr}(A) 
  + \lambda \frac{1}{2}\left[ {\rm tr}(A^2) - {\rm tr}^2(A) \right]
  + {\rm det}(A)

where I3 is the 3×3 identity matrix.

The eigenvalues of the matrix are the roots of this polynomial, which can be found using the method for solving cubic equations.

A formula for the eigenvalues of a 4×4 matrix could be derived analogously, using the formulae for the solutions of the quartic equation.
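The trace/determinant identity for the 3×3 characteristic polynomial can be coded up directly; the following sketch (numpy and the example matrix are assumptions) finds the roots numerically in place of the cubic formula.

```python
# 3x3 characteristic polynomial from the trace/determinant identity:
# p(l) = -l^3 + tr(A) l^2 + (tr(A^2) - tr(A)^2)/2 * l + det(A)
import numpy as np

def charpoly3(A):
    t1, t2, d = np.trace(A), np.trace(A @ A), np.linalg.det(A)
    return np.array([-1.0, t1, 0.5 * (t2 - t1 * t1), d])

A = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 4.0],
              [0.0, 4.0, 9.0]])
eigs = np.sort(np.roots(charpoly3(A)).real)
```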

Eigenvalues and eigenvectors of special matrices

For matrices satisfying A² = αI with α ≠ 0, one can write explicit formulas for the possible eigenvalues and for the projectors onto the corresponding eigenspaces.

P_+=\frac{1}{2}\left(I+\frac{A}{\sqrt{\alpha}}\right)
P_-=\frac{1}{2}\left(I-\frac{A}{\sqrt{\alpha}}\right)

with

AP_+=\sqrt{\alpha}P_+ \quad AP_-=-\sqrt{\alpha}P_-

and

P_+P_+=P_+ \quad P_-P_-=P_- \quad P_+P_-=P_-P_+=0

This provides the following resolution of the identity:

I=P_+ + P_-= \frac{1}{2}\left(I+\frac{A}{\sqrt{\alpha}}\right) 
+ \frac{1}{2}\left(I-\frac{A}{\sqrt{\alpha}}\right)

The multiplicity of the possible eigenvalues is given by the rank of the projectors.
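These projector formulas can be verified numerically; in this sketch (numpy and the particular matrix with A² = 4I are illustrative assumptions) all the stated identities hold.

```python
# Projectors onto the eigenspaces of a matrix satisfying A^2 = alpha*I.
import numpy as np

A = np.array([[0.0, 2.0],
              [2.0, 0.0]])          # A @ A = 4*I, so alpha = 4
alpha = 4.0
s = np.sqrt(alpha)
I = np.eye(2)
Pp = 0.5 * (I + A / s)              # projector for eigenvalue +sqrt(alpha)
Pm = 0.5 * (I - A / s)              # projector for eigenvalue -sqrt(alpha)
```

Here each projector has rank 1, so each eigenvalue ±2 has multiplicity one.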

Example computation

Eigenvalues and eigenvectors can be computed with the following algorithm.

Consider an n-square matrix A

1. Find the roots of the characteristic polynomial of A. These are the eigenvalues.
  • If n distinct roots are found, the matrix can be diagonalized.
2. For each eigenvalue λ, find a basis for the kernel of A − λI. These basis vectors are the eigenvectors.
  • Eigenvectors obtained from different eigenvalues are linearly independent.
  • Eigenvectors obtained within the eigenspace of a repeated root are also linearly independent.
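The two steps above can be sketched as follows (numpy, the SVD-based null-space computation, and the example matrix are assumptions made for the illustration, not part of the original text):

```python
# Step 1: eigenvalues as roots of the characteristic polynomial.
# Step 2: eigenvectors as a null-space basis of A - lambda*I (via SVD).
import numpy as np

def eig_via_charpoly(A, tol=1e-8):
    results = []
    for lam in np.roots(np.poly(A)):              # step 1
        M = A - lam * np.eye(A.shape[0])
        _, s, Vt = np.linalg.svd(M)
        kernel = Vt[s < tol * max(1.0, s[0])]     # step 2: rows span ker(M)
        results.append((lam, kernel))
    return results

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
pairs = eig_via_charpoly(A)
```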

Let us determine the eigenvalues of the matrix


A = \begin{bmatrix}
     0 & 1 & -1 \\
     1 & 1 &  0 \\
    -1 & 0 &  1 
\end{bmatrix}

which represents a linear operator R³ → R³.

Identifying eigenvalues

We first compute the characteristic polynomial of A:


p(\lambda) = \det( A - \lambda I) =
\det \begin{bmatrix}
    -\lambda &    1      &   -1 \\
        1    & 1-\lambda &    0     \\
       -1    &    0      & 1-\lambda
\end{bmatrix}
= -\lambda^3 + 2\lambda^2 +\lambda - 2.

This polynomial factors to p(λ) = − (λ − 2)(λ − 1)(λ + 1). Therefore, the eigenvalues of A are 2, 1 and −1.
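This factorization can be checked numerically (numpy is an assumption; the matrix is the one from the worked example above):

```python
# Checking the worked example: the eigenvalues of A are 2, 1 and -1.
import numpy as np

A = np.array([[ 0.0, 1.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [-1.0, 0.0,  1.0]])
eigs = np.sort(np.linalg.eigvals(A).real)
```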

Identifying eigenvectors

With the eigenvalues in hand, we can solve sets of simultaneous linear equations to determine the corresponding eigenvectors. Since we are solving the system (A - \lambda I)v = 0, if λ = 2 then,


\begin{bmatrix}
       -2    & 1   &   -1 \\
        1    & -1  &    0 \\
       -1    & 0   &   -1
\end{bmatrix}
\begin{bmatrix}
       v_1 \\
       v_2 \\
       v_3
\end{bmatrix} = 0.

Now, reducing (A - 2I) to row echelon form:


\begin{bmatrix}
       -2    & 1   &   -1 \\
        1    & -1  &    0 \\
       -1    & 0   &   -1
\end{bmatrix} \to
\begin{bmatrix}
        1 & 0 & 1 \\
        0 & 1 & 1 \\
        0 & 0 & 0
\end{bmatrix}

allows us to solve easily for the eigenspace E2:


\begin{bmatrix}
        1 & 0 & 1 \\
        0 & 1 & 1 \\
        0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
       v_1 \\
       v_2 \\
       v_3
\end{bmatrix} = 0 \to
\begin{cases}
    v_1 + v_3 = 0 \\
    v_2 + v_3 = 0
\end{cases}
\to E_2 = {\rm span}\begin{bmatrix}1 \\ 1 \\ -1\end{bmatrix}.

We can confirm that a simple example vector chosen from eigenspace E2 is a valid eigenvector with eigenvalue λ = 2:

A \begin{bmatrix} \; 1 \\ \; 1 \\ -1 \end{bmatrix} 
= \begin{bmatrix} \; 2 \\ \; 2 \\ -2 \end{bmatrix} 
= 2 \begin{bmatrix} \; 1 \\ \; 1 \\ -1 \end{bmatrix}.

Note that the number of free variables in the solution (the dimension of the eigenspace) equals n minus the number of pivots.
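The same eigenspace can be found as a null-space computation instead of by hand row reduction; this sketch (numpy and the SVD approach are illustrative assumptions) recovers the span of (1, 1, −1) for λ = 2.

```python
# Eigenvector for lambda = 2 as the null space of A - 2I, via SVD.
import numpy as np

A = np.array([[ 0.0, 1.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [-1.0, 0.0,  1.0]])
M = A - 2.0 * np.eye(3)
# Right singular vectors for (near-)zero singular values span the kernel.
_, s, Vt = np.linalg.svd(M)
v = Vt[-1]                  # direction of the smallest singular value
v = v / v[0]                # scale so the first entry is 1
```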

If A is a real matrix, the characteristic polynomial has real coefficients, but its roots are not necessarily all real; the complex eigenvalues come in conjugate pairs. For a real matrix, the eigenvectors of a non-real eigenvalue z, which are the solutions of (A − zI)v = 0, cannot be real.

If v1, ..., vm are eigenvectors with different eigenvalues λ1, ..., λm, then the vectors v1, ..., vm are necessarily linearly independent.

The spectral theorem for symmetric matrices states that if A is a real symmetric n-by-n matrix, then all its eigenvalues are real, and there exist n linearly independent eigenvectors for A which are mutually orthogonal. Symmetric matrices are commonly encountered in engineering.
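The spectral theorem can be seen in action with numpy's dedicated symmetric eigensolver (using the symmetric example matrix from this article; the call to `eigh` is an illustrative choice): the eigenvalues come out real and the eigenvector matrix is orthogonal.

```python
# For a real symmetric matrix, eigh returns real eigenvalues (ascending)
# and an orthonormal set of eigenvectors, as the spectral theorem promises.
import numpy as np

A = np.array([[ 0.0, 1.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [-1.0, 0.0,  1.0]])
w, V = np.linalg.eigh(A)    # columns of V are mutually orthogonal unit vectors
```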

Our example matrix from above is symmetric, and three mutually orthogonal eigenvectors of A are

v_1 = \begin{bmatrix}\; 1  \\ \;1 \\   -1 \end{bmatrix},\quad v_2 = \begin{bmatrix}\; 0\;\\   1 \\    1 \end{bmatrix},\quad v_3 = \begin{bmatrix}\; 2  \\  -1 \\ \; 1 \end{bmatrix}.

These three vectors form a basis of R³. With respect to this basis, the linear map represented by A takes a particularly simple form: every vector x in R³ can be written uniquely as

x = x1v1 + x2v2 + x3v3

and then we have

Ax = 2x1v1 + x2v2 − x3v3.
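This diagonal action in the eigenbasis can be checked numerically (numpy and the particular test vector x are illustrative assumptions): expressing x in the basis {v1, v2, v3}, scaling each coordinate by its eigenvalue, and mapping back gives exactly Ax.

```python
# Applying A in the eigenbasis: each coordinate is scaled by its eigenvalue.
import numpy as np

A = np.array([[ 0.0, 1.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [-1.0, 0.0,  1.0]])
V = np.array([[ 1.0, 0.0,  2.0],
              [ 1.0, 1.0, -1.0],
              [-1.0, 1.0,  1.0]])          # columns are v1, v2, v3
x = np.array([3.0, -1.0, 2.0])
c = np.linalg.solve(V, x)                  # coordinates x1, x2, x3
Ax = V @ (np.array([2.0, 1.0, -1.0]) * c)  # scale by eigenvalues 2, 1, -1
```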

Advanced methods

A popular method for finding eigenvalues is the QR algorithm, which is based on the QR decomposition. Most eigenvalue algorithms rely on first reducing the matrix A to Hessenberg or tridiagonal form; this reduction is usually accomplished via Householder reflections.
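A hedged sketch of this pipeline (numpy, the example matrix, and the unshifted iteration are illustrative assumptions; production solvers add shifts and deflation, omitted here): Householder reflections reduce A to Hessenberg form, then repeated QR steps (H = QR, H ← RQ) drive the matrix toward triangular form with the eigenvalues on the diagonal.

```python
# Hessenberg reduction via Householder reflections, then unshifted QR
# iteration. Each step H <- R @ Q is a similarity transform, so the
# eigenvalues are preserved throughout.
import numpy as np

def to_hessenberg(A):
    H = np.array(A, dtype=float)
    n = H.shape[0]
    for k in range(n - 2):
        v = H[k + 1:, k].copy()
        v[0] += (1.0 if v[0] >= 0 else -1.0) * np.linalg.norm(v)
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:
            continue                      # column already in Hessenberg form
        v /= norm_v
        # Apply the reflector P = I - 2 v v^T from both sides.
        H[k + 1:, :] -= 2.0 * np.outer(v, v @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
    return H

def qr_iteration_eigvals(A, iters=300):
    H = to_hessenberg(A)
    for _ in range(iters):
        Q, R = np.linalg.qr(H)
        H = R @ Q                         # similar to H: same eigenvalues
    return np.sort(np.diag(H))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
eigs = qr_iteration_eigvals(A)
```

Working on a Hessenberg matrix makes each QR step cheap (O(n²) instead of O(n³)), which is why the reduction is done first.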