Pseudoinverse

In mathematics, and in particular linear algebra, the pseudoinverse A^+ of an m \times n matrix A is a generalization of the inverse matrix.[1] More precisely, this article describes the Moore-Penrose pseudoinverse, which was independently described by E. H. Moore[2] in 1920 and Roger Penrose[3] in 1955. Earlier, Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. The term generalized inverse is sometimes used as a synonym for pseudoinverse.

A common use of the pseudoinverse is to compute a 'best fit' (least squares) solution to a system of linear equations (see below under Applications). The pseudoinverse is defined and unique for all matrices whose entries are real or complex numbers. The pseudoinverse can be computed using the singular value decomposition.

Definition

The pseudoinverse A^+ of a matrix A is the unique matrix satisfying the following criteria:

  1. A A^+ A = A;
  2. A^+ A A^+ = A^+       (A^+ is a weak inverse for the multiplicative semigroup);
  3. (A A^+)^* = A A^+       (A A^+ is Hermitian); and
  4. (A^+ A)^* = A^+ A       (A^+ A is also Hermitian).

Here M^* is the conjugate transpose of a matrix M. For matrices whose elements are real numbers instead of complex numbers, M^* = M^T.
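
As a concrete numerical check, the following sketch verifies the four criteria for a small example matrix, using NumPy's numpy.linalg.pinv (which computes the Moore-Penrose pseudoinverse); the example matrix itself is an arbitrary choice:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])           # an arbitrary 3 x 2 example
    A_pinv = np.linalg.pinv(A)           # Moore-Penrose pseudoinverse

    # 1. A A+ A = A
    assert np.allclose(A @ A_pinv @ A, A)
    # 2. A+ A A+ = A+
    assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)
    # 3. (A A+)* = A A+   (Hermitian; symmetric for real matrices)
    assert np.allclose((A @ A_pinv).conj().T, A @ A_pinv)
    # 4. (A+ A)* = A+ A
    assert np.allclose((A_pinv @ A).conj().T, A_pinv @ A)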

An alternative way to define the pseudoinverse is via a limiting process:

A^+ = \lim_{\delta \to 0} (A^* A + \delta I)^{-1} A^* = \lim_{\delta \to 0} A^* (A A^* + \delta I)^{-1}

(see Tikhonov regularization). These limits exist even if (A A^*)^{-1} and (A^* A)^{-1} do not exist.
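
A minimal numerical sketch of this limit, assuming NumPy and an arbitrary rank-deficient example, so that A^* A itself has no ordinary inverse:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0],             # rank 1: every row is a multiple of the first
                  [3.0, 6.0]])
    A_pinv = np.linalg.pinv(A)

    for delta in (1e-1, 1e-4, 1e-8):
        approx = np.linalg.inv(A.T @ A + delta * np.eye(2)) @ A.T
        print(delta, np.linalg.norm(approx - A_pinv))   # error shrinks as delta -> 0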

Properties

  • Pseudoinversion is reversible; it is its own inverse: (A^+)^+ = A.
  • The pseudoinverse of a zero matrix is its transpose.
  • Pseudoinversion commutes with transposition, conjugation, and taking the conjugate transpose:
(A^T)^+ = (A^+)^T,
\overline{A}^+ = \overline{A^+}, and
(A^*)^+ = (A^+)^*.
  • The pseudoinverse of a scalar multiple of A is the reciprocal multiple of A^+:
(\alpha A)^+ = \alpha^{-1} A^+ for \alpha \neq 0.
  • If the pseudoinverse of A^* A is already known, it may be used to compute A^+:
A^+ = (A^* A)^+ A^*.
  • Likewise, if (A A^*)^+ is already known:
A^+ = A^* (A A^*)^+.
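
The identities in the last two bullets, and the involution property, can be checked numerically; a short sketch, assuming NumPy and a random complex example matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
    Ah = A.conj().T                      # A*, the conjugate transpose

    # (A+)+ = A
    assert np.allclose(np.linalg.pinv(np.linalg.pinv(A)), A)
    # A+ = (A* A)+ A*
    assert np.allclose(np.linalg.pinv(A), np.linalg.pinv(Ah @ A) @ Ah)
    # A+ = A* (A A*)+
    assert np.allclose(np.linalg.pinv(A), Ah @ np.linalg.pinv(A @ Ah))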

Special cases

If the columns of A are linearly independent, then A^* A is invertible. In this case, an explicit formula is:[1]

A^+ = (A^* A)^{-1} A^*.

In the first limit in the "Definition" section above, the limiting expression is continuous at δ = 0; that is, we can simply substitute δ = 0 in the limiting expression. It follows that A^+ is a left inverse of A:   A^+ A = I.

If the rows of A are linearly independent, then A A^* is invertible. In this case, an explicit formula is:

A^+ = A^* (A A^*)^{-1}.

In the second limit in the "Definition" section above, the limiting expression is continuous at δ = 0. It follows that A^+ is a right inverse of A:   A A^+ = I.

If both columns and rows are linearly independent (that is, for square nonsingular matrices), the pseudoinverse is just the inverse:

A^+ = A^{-1}.
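
A sketch of all three special cases, assuming NumPy; the random example matrices are full rank with probability one:

    import numpy as np

    rng = np.random.default_rng(1)

    A = rng.standard_normal((5, 3))                 # full column rank
    left = np.linalg.inv(A.T @ A) @ A.T             # (A* A)^{-1} A*
    assert np.allclose(left, np.linalg.pinv(A))
    assert np.allclose(left @ A, np.eye(3))         # a left inverse of A

    B = rng.standard_normal((3, 5))                 # full row rank
    right = B.T @ np.linalg.inv(B @ B.T)            # B* (B B*)^{-1}
    assert np.allclose(right, np.linalg.pinv(B))
    assert np.allclose(B @ right, np.eye(3))        # a right inverse of B

    C = rng.standard_normal((4, 4))                 # square, nonsingular
    assert np.allclose(np.linalg.pinv(C), np.linalg.inv(C))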

If A and B are such that the product AB is defined and either A or B is unitary, then (AB)^+ = B^+ A^+. If A and B are such that the product AB is defined, A has full column rank, and B has full row rank, then (AB)^+ = B^+ A^+. The second case does not subsume the first: a unitary matrix is always of full rank, but the first case makes no rank assumption about the matrix that the unitary factor multiplies.
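
A quick numerical check of the second (full-rank) case of this product rule; a sketch assuming NumPy, where the random factors are full rank with probability one:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((5, 3))      # full column rank
    B = rng.standard_normal((3, 4))      # full row rank
    assert np.allclose(np.linalg.pinv(A @ B),
                       np.linalg.pinv(B) @ np.linalg.pinv(A))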

It is also possible to define a pseudoinverse for scalars and vectors by treating them as matrices. The pseudoinverse of a scalar x is zero if x is zero and the reciprocal of x otherwise:

x^+ = \left\{\begin{matrix} 0, & \mbox{if }x=0;  \\ x^{-1}, & \mbox{otherwise}. \end{matrix}\right.

The pseudoinverse of the null vector is the transposed null vector. The pseudoinverse of other vectors is the conjugate transposed vector divided by its squared magnitude:

x^+ = \left\{\begin{matrix} 0^T, & \mbox{if }x = 0;  \\ {x^* \over x^* x}, & \mbox{otherwise}. \end{matrix}\right.

For a proof, simply check that these definitions meet the defining criteria for the pseudoinverse.
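
Both cases can also be checked against NumPy's pinv by viewing a scalar as a 1 x 1 matrix and a vector as an n x 1 matrix; a minimal sketch with arbitrary example values:

    import numpy as np

    x = 2.5
    assert np.allclose(np.linalg.pinv([[x]]), [[1.0 / x]])   # reciprocal
    assert np.allclose(np.linalg.pinv([[0.0]]), [[0.0]])     # zero maps to zero

    v = np.array([[1.0], [2.0], [2.0]])          # column vector, x* x = 9
    v_pinv = v.conj().T / (v.conj().T @ v)       # x* / (x* x), a row vector
    assert np.allclose(v_pinv, np.linalg.pinv(v))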

Finding the pseudoinverse of a matrix

Let k be the rank of an m \times n matrix A. Then A can be decomposed as A = BC, where B is an m \times k matrix and C is a k \times n matrix. In that case,

A^+ = C^* (C C^*)^{-1} (B^* B)^{-1} B^*.

If A has full row rank, so that k = m, then B can be chosen to be the identity matrix and the formula reduces to A^+ = A^* (A A^*)^{-1}. Similarly, if A has full column rank (that is, k = n), then C can be chosen to be the identity matrix and we have A^+ = (A^* A)^{-1} A^*.
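
A sketch of the rank-decomposition formula, assuming NumPy and constructing a rank-2 example directly from its factors B and C:

    import numpy as np

    rng = np.random.default_rng(3)
    B = rng.standard_normal((5, 2))      # m x k, full column rank
    C = rng.standard_normal((2, 4))      # k x n, full row rank
    A = B @ C                            # a 5 x 4 matrix of rank 2

    A_pinv = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
    assert np.allclose(A_pinv, np.linalg.pinv(A))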

A computationally simpler way to obtain the pseudoinverse is to use the singular value decomposition.[1][4][5]

If A = U\Sigma V^* is the singular value decomposition of A, then A^+ = V\Sigma^+ U^*. For a rectangular diagonal matrix such as Σ, we get the pseudoinverse by taking the reciprocal of each non-zero element on the diagonal, leaving the zeros in place, and transposing the resulting matrix.
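
The SVD recipe can be spelled out directly; a sketch assuming NumPy, where the tolerance for deciding which singular values count as zero is a common but arbitrary choice:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])                      # rank 1
    U, s, Vh = np.linalg.svd(A)                     # A = U diag(s) Vh

    # invert only the singular values that are clearly nonzero
    tol = max(A.shape) * np.finfo(float).eps * s.max()
    s_inv = np.array([1.0 / x if x > tol else 0.0 for x in s])

    Sigma_pinv = np.zeros((A.shape[1], A.shape[0]))  # note the transposed shape
    Sigma_pinv[:len(s), :len(s)] = np.diag(s_inv)
    assert np.allclose(Vh.conj().T @ Sigma_pinv @ U.conj().T, np.linalg.pinv(A))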

Optimized approaches exist for calculating the pseudoinverse of block structured matrices.

If a pseudoinverse is already known for a given matrix, and the pseudoinverse is desired for a related matrix, the pseudoinverse for the related matrix can be computed using specialized algorithms that may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms exist that exploit the relationship.

Applications

The pseudoinverse provides a least squares solution to a system of linear equations.[6]

Given an overdetermined system Ax = b, we look for a vector x that minimizes \|A x - b\|^2, where \|\,\cdot\,\| denotes the Euclidean norm.
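
A short sketch of this, assuming NumPy; the pseudoinverse solution agrees with NumPy's dedicated least-squares solver on an arbitrary inconsistent example:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])           # 3 equations, 2 unknowns
    b = np.array([1.0, 2.0, 2.0])        # no exact solution exists

    x = np.linalg.pinv(A) @ b
    x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
    assert np.allclose(x, x_lstsq)       # same least-squares solution
    print(x, np.linalg.norm(A @ x - b))  # minimizer and its residual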

The general solution to an inhomogeneous system Ax = b is the sum of a particular solution of the inhomogeneous system and the general solution of the corresponding homogeneous system Ax = 0.

Lemma: If (A A^*)^{-1} exists, then the solution x can always be written as the sum of the pseudoinverse solution of the inhomogeneous system and a solution of the homogeneous system:

x = A^* (A A^*)^{-1} b + (1 - A^* (A A^*)^{-1} A) y.

Proof:

A x = A A^* (A A^*)^{-1} b + A y - A A^* (A A^*)^{-1} A y
    = b + A y - A y
    = b.

Here, the vector y is arbitrary (apart from having the right dimension). In both summands, the pseudoinverse A^* (A A^*)^{-1} appears. Writing it as A^+, the equation becomes:

x = A^+ b + (1 - A^+ A) y.

The first summand is the pseudoinverse solution. In the sense of the least squares error, it is the best linear approximation to the actual solution, which means that the correction summand has minimal Euclidean norm. The second summand represents a solution of the homogeneous system Ax = 0, because (1 - A^+ A) is the projection onto the kernel (null space) of A, while A^+ A = A^* (A A^*)^{-1} A is the projection onto the range of A^* (the row space of A, i.e. the space spanned by the columns of A^*).
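
A sketch of this decomposition for an underdetermined system (full row rank, so exact solutions exist), assuming NumPy; every choice of y yields a solution, and y = 0 gives the minimum-norm one:

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((2, 4))      # 2 equations, 4 unknowns
    b = rng.standard_normal(2)
    A_pinv = np.linalg.pinv(A)

    x0 = A_pinv @ b                      # the pseudoinverse solution
    for _ in range(3):
        y = rng.standard_normal(4)
        x = x0 + (np.eye(4) - A_pinv @ A) @ y
        assert np.allclose(A @ x, b)                             # still an exact solution
        assert np.linalg.norm(x0) <= np.linalg.norm(x) + 1e-12   # x0 has minimal norm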

References

  1. Ben-Israel, Adi; Greville, Thomas N. E. (2003). Generalized Inverses. Springer-Verlag. ISBN 0-387-00293-6.
  2. Moore, E. H. (1920). "On the reciprocal of the general algebraic matrix". Bulletin of the American Mathematical Society 26: 394–395.
  3. Penrose, Roger (1955). "A generalized inverse for matrices". Proceedings of the Cambridge Philosophical Society 51: 406–413.
  4. Golub, Gene H.; Van Loan, Charles F. (1996). Matrix Computations, 3rd edition. Baltimore: Johns Hopkins University Press. ISBN 0-8018-5414-8.
  5. Linear Systems & Pseudo-Inverse
  6. Penrose, Roger (1956). "On best approximate solution of linear matrix equations". Proceedings of the Cambridge Philosophical Society 52: 17–19.
