Symbolic computation of matrix eigenvalues

In mathematics, and in particular in linear algebra, an important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (A - λI)v = 0 (where I is the identity matrix) has a non-zero solution v (namely an eigenvector), which in turn is equivalent to the determinant det(A - λI) being zero. The function p(λ) = det(A - λI) is a polynomial in λ, since the determinant is a sum of products of matrix entries. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial.

(Sometimes the characteristic polynomial is taken to be det(λI - A) instead, which is the same polynomial when n is even but has the opposite sign when n is odd. This convention has the slight advantage that the leading coefficient is always 1 rather than (-1)^n.)

It follows that we can compute all the eigenvalues of a matrix A by solving the equation p_A(λ) = 0. If A is an n-by-n matrix, then p_A has degree n, so A can have at most n eigenvalues. Conversely, the fundamental theorem of algebra says that this equation has exactly n complex roots (zeroes), counted with multiplicity. All real polynomials of odd degree have at least one real root, so for odd n, every real matrix has at least one real eigenvalue. For a real matrix of any size, the non-real eigenvalues come in complex-conjugate pairs.
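
As an illustration, both steps can be carried out in a computer algebra system. The following is a minimal sketch using the SymPy library (one choice among many; any symbolic package would do), applied to a small example matrix:

    import sympy as sp

    lam = sp.symbols('lambda')
    A = sp.Matrix([[2, 1], [1, 2]])

    # Characteristic polynomial p_A(lambda) = det(A - lambda*I)
    p = sp.expand((A - lam * sp.eye(2)).det())
    print(p)                  # lambda**2 - 4*lambda + 3

    # The eigenvalues are the roots of p_A
    print(sp.solve(p, lam))   # [1, 3]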

An example of a matrix with no real eigenvalues is the 90-degree rotation

\begin{bmatrix}0 & 1\\ -1 & 0\end{bmatrix}

whose characteristic polynomial is λ^2 + 1, so its eigenvalues are the complex-conjugate pair i, -i.

The Cayley–Hamilton theorem states that every square matrix satisfies its own characteristic polynomial, that is, p_A(A) = 0.
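
For the rotation matrix above, for example, p(λ) = λ^2 + 1, so the theorem asserts that A^2 + I = 0. A quick check, again sketched with SymPy:

    import sympy as sp

    A = sp.Matrix([[0, 1], [-1, 0]])   # the rotation matrix above

    # Cayley-Hamilton with p(lambda) = lambda**2 + 1: A**2 + I is the zero matrix
    print(A**2 + sp.eye(2))            # Matrix([[0, 0], [0, 0]])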

Eigenvalues of 2×2 matrices

An analytic solution for the eigenvalues of 2×2 matrices can be obtained directly from the quadratic formula: if

A = \begin{bmatrix} a  & b \\ c & d \end{bmatrix}

then the characteristic polynomial is

{\rm det} \begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix}=(a-\lambda)(d-\lambda)-bc=\lambda^2-(a+d)\lambda+(ad-bc)

Notice that the coefficients, up to sign, are the trace tr(A) = a + d and the determinant det(A) = ad - bc. The solutions are

\lambda = \frac{a + d}{2}  \pm \sqrt{\frac{(a + d)^2}{4} + bc - ad} = \frac{a + d}{2}  \pm \frac{\sqrt{4bc + (a - d)^2  }}{2}
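
This formula translates directly into code. A minimal sketch (the function name eigenvalues_2x2 is illustrative; cmath.sqrt is used so that a negative discriminant yields the complex eigenvalues):

    import cmath

    def eigenvalues_2x2(a, b, c, d):
        # Eigenvalues of [[a, b], [c, d]] via the quadratic formula above
        half_trace = (a + d) / 2
        half_disc = cmath.sqrt(4 * b * c + (a - d) ** 2) / 2
        return half_trace + half_disc, half_trace - half_disc

    print(eigenvalues_2x2(0, 1, -1, 0))   # (1j, -1j), matching the rotation matrix above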

A formula for the eigenvalues of a 3×3 or 4×4 matrix could be derived in an analogous way, using the formulae for the roots of a cubic or quartic equation.

Example computation

Eigenvalues and eigenvectors can be computed by the following algorithm.

Consider an n×n matrix A.

1. Find the roots of the characteristic polynomial of A. These are the eigenvalues.
  • If n distinct roots are found, then the matrix can be diagonalized.
2. For each eigenvalue λ, find a basis for the kernel of the matrix A - λI. The non-zero vectors in this kernel are the eigenvectors (see the sketch after this list).
  • Eigenvectors obtained from different eigenvalues are linearly independent.
  • The basis vectors obtained for a single eigenvalue of multiplicity greater than one are also linearly independent.
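
A minimal symbolic sketch of this algorithm, assuming the SymPy library (the helper name eigen_decompose is illustrative):

    import sympy as sp

    def eigen_decompose(A):
        lam = sp.symbols('lambda')
        n = A.shape[0]
        # Step 1: the eigenvalues are the roots of the characteristic polynomial
        p = (A - lam * sp.eye(n)).det()
        eigenvalues = sp.roots(p, lam)          # {root: multiplicity}
        # Step 2: for each eigenvalue, a basis of ker(A - lambda*I)
        # gives the corresponding eigenvectors
        return {ev: (A - ev * sp.eye(n)).nullspace() for ev in eigenvalues}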

Let us determine the eigenvalues of the matrix

A = \begin{bmatrix}      0 & 1 & -1 \\      1 & 1 &  0 \\     -1 & 0 &  1  \end{bmatrix}

which represents a linear operator R^3 → R^3.

Identifying eigenvalues

We first compute the characteristic polynomial of A:

p(\lambda) = \det( A - \lambda I) = \det \begin{bmatrix}     -\lambda &    1      &   -1 \\         1    & 1-\lambda &    0     \\        -1    &    0      & 1-\lambda \end{bmatrix} = -\lambda^3 + 2\lambda^2 +\lambda - 2.

This polynomial factors to p(λ) = − (λ − 2)(λ − 1)(λ + 1). Therefore, the eigenvalues of A are 2, 1 and −1.
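
Both the polynomial and its factorization can be reproduced symbolically; a short sketch with SymPy:

    import sympy as sp

    lam = sp.symbols('lambda')
    A = sp.Matrix([[0, 1, -1], [1, 1, 0], [-1, 0, 1]])

    p = (A - lam * sp.eye(3)).det()
    print(sp.factor(p))       # -(lambda - 2)*(lambda - 1)*(lambda + 1)
    print(sp.roots(p, lam))   # {2: 1, 1: 1, -1: 1}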

Identifying eigenvectors

With the eigenvalues in hand, we can solve sets of simultaneous linear equations to determine the corresponding eigenvectors. Since we are solving the system (A - \lambda I)v = 0, setting λ = 2 gives

\begin{bmatrix}        -2    & 1   &   -1 \\         1    & -1  &    0 \\        -1    & 0   &   -1 \end{bmatrix} \begin{bmatrix}        v_1 \\        v_2 \\        v_3 \end{bmatrix} = 0.

Now, reducing A - 2I to row echelon form:

\begin{bmatrix}        -2    & 1   &   -1 \\         1    & -1  &    0 \\        -1    & 0   &   -1 \end{bmatrix} \to \begin{bmatrix}         1 & 0 & 1 \\         0 & 1 & 1 \\         0 & 0 & 0 \end{bmatrix}

allows us to solve easily for the eigenspace E_2:

\begin{bmatrix}         1 & 0 & 1 \\         0 & 1 & 1 \\         0 & 0 & 0 \end{bmatrix} \begin{bmatrix}        v_1 \\        v_2 \\        v_3 \end{bmatrix} = 0 \to \begin{cases}     v_1 + v_3 = 0 \\     v_2 + v_3 = 0 \end{cases}
\to E_2 = {\rm span}\begin{bmatrix}1 \\ 1 \\ -1\end{bmatrix}.
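
The same eigenspace can be obtained mechanically as the kernel of A - 2I; a sketch with SymPy, whose nullspace method returns a basis of the kernel:

    import sympy as sp

    A = sp.Matrix([[0, 1, -1], [1, 1, 0], [-1, 0, 1]])

    # Basis of E_2 = ker(A - 2I); SymPy returns the scalar multiple
    # (-1, -1, 1)^T of the spanning vector found above
    print((A - 2 * sp.eye(3)).nullspace())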

We can confirm that a vector chosen from the eigenspace E_2 is a valid eigenvector with eigenvalue λ = 2:

A \begin{bmatrix} \; 1 \\ \; 1 \\ -1 \end{bmatrix}  = \begin{bmatrix} \; 2 \\ \; 2 \\ -2 \end{bmatrix}  = 2 \begin{bmatrix} \; 1 \\ \; 1 \\ -1 \end{bmatrix}

Note that the number of degrees of freedom of the solution (the dimension of the eigenspace) equals the number of columns without pivots.

If A is a real matrix, the characteristic polynomial will have real coefficients, but its roots will not necessarily all be real. The non-real eigenvalues come in complex-conjugate pairs. For a real matrix, the eigenvectors of a non-real eigenvalue z, which are the solutions of (A - zI)v = 0, cannot be real.

If v_1, ..., v_m are eigenvectors with distinct eigenvalues λ_1, ..., λ_m, then the vectors v_1, ..., v_m are necessarily linearly independent.

The spectral theorem for symmetric matrices states that if A is a real symmetric n-by-n matrix, then all its eigenvalues are real, and there exist n linearly independent eigenvectors for A which are mutually orthogonal. Symmetric matrices are commonly encountered in engineering.

Our example matrix from above is symmetric, and three mutually orthogonal eigenvectors of A are

v_1 = \begin{bmatrix}\; 1  \\ \;1 \\   -1 \end{bmatrix},\quad v_2 = \begin{bmatrix}\; 0\;\\   1 \\    1 \end{bmatrix},\quad v_3 = \begin{bmatrix}\; 2  \\  -1 \\ \; 1 \end{bmatrix}.

These three vectors form a basis of R^3. With respect to this basis, the linear map represented by A takes a particularly simple form: every vector x in R^3 can be written uniquely as

x = x_1v_1 + x_2v_2 + x_3v_3

and then we have

Ax = 2x_1v_1 + x_2v_2 - x_3v_3.
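
Equivalently, the change-of-basis matrix P whose columns are v_1, v_2, v_3 diagonalizes A; a final sketch with SymPy:

    import sympy as sp

    A = sp.Matrix([[0, 1, -1], [1, 1, 0], [-1, 0, 1]])
    P = sp.Matrix([[1, 0, 2], [1, 1, -1], [-1, 1, 1]])   # columns are v_1, v_2, v_3

    # P^(-1) * A * P is diagonal, with the eigenvalues 2, 1, -1 on the diagonal
    print(P.inv() * A * P)   # Matrix([[2, 0, 0], [0, 1, 0], [0, 0, -1]])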

See also