Schur complement
From Wikipedia, the free encyclopedia
In linear algebra and the theory of matrices, the Schur complement of a block of a matrix within a larger matrix is defined as follows. Suppose A, B, C, D are respectively p×p, p×q, q×p and q×q matrices, and D is invertible. Let
- $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$
so that M is a (p+q)×(p+q) matrix.
Then the Schur complement of the block D of the matrix M is the p×p matrix
- $A - BD^{-1}C$.
It is named after Issai Schur who used it to prove Schur's lemma, although it had been used previously[1].
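As a minimal numerical sketch of this definition, assuming NumPy is available, the following computes the Schur complement of the trailing block D. The function name `schur_complement` and the example matrix are illustrative assumptions.

```python
import numpy as np

def schur_complement(M, p):
    """Schur complement A - B D^{-1} C of the trailing block D of M.

    Hypothetical helper: M is (p+q) x (p+q) and p is the size of the
    leading block A. D must be invertible.
    """
    A, B = M[:p, :p], M[:p, p:]
    C, D = M[p:, :p], M[p:, p:]
    # Solve D X = C rather than forming D^{-1} explicitly.
    return A - B @ np.linalg.solve(D, C)

# Illustrative matrix with p = 2 and q = 1.
M = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 2.0]])
S = schur_complement(M, p=2)
print(S)
```

Here B D^{-1} C equals [[2, 0], [0, 0]], so the result is the 2×2 matrix [[2, 1], [1, 3]].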
Background
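The Schur complement arises as the result of performing a block Gaussian elimination by multiplying the matrix M from the right with the "lower triangular" block matrix
- $L = \begin{bmatrix} I_p & 0 \\ -D^{-1}C & D^{-1} \end{bmatrix}$.
Here $I_p$ denotes the p×p identity matrix. After multiplication with the matrix L, the Schur complement appears in the upper p×p block. The product matrix is
- $ML = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I_p & 0 \\ -D^{-1}C & D^{-1} \end{bmatrix} = \begin{bmatrix} A - BD^{-1}C & BD^{-1} \\ 0 & I_q \end{bmatrix}$.
The inverse of M can thus be expressed in terms of only $D^{-1}$ and the inverse of the Schur complement (if it exists):
- $M^{-1} = \begin{bmatrix} I_p & 0 \\ -D^{-1}C & D^{-1} \end{bmatrix} \begin{bmatrix} (A - BD^{-1}C)^{-1} & -(A - BD^{-1}C)^{-1}BD^{-1} \\ 0 & I_q \end{bmatrix} = \begin{bmatrix} (A - BD^{-1}C)^{-1} & -(A - BD^{-1}C)^{-1}BD^{-1} \\ -D^{-1}C(A - BD^{-1}C)^{-1} & D^{-1} + D^{-1}C(A - BD^{-1}C)^{-1}BD^{-1} \end{bmatrix}$.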
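The block elimination and the resulting inverse can be checked numerically. The sketch below assumes NumPy and assumes that L has the blocks I_p, 0, −D^{-1}C, D^{-1}; the 2×2 blocks are arbitrary illustrative values chosen so that D and the Schur complement are invertible.

```python
import numpy as np

# Illustrative blocks (p = q = 2); D is diagonal, hence easy to invert.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0]])
C = np.array([[2.0, 0.0], [1.0, 1.0]])
D = np.array([[2.0, 0.0], [0.0, 4.0]])
p, q = 2, 2

M = np.block([[A, B], [C, D]])
D_inv = np.linalg.inv(D)
S = A - B @ D_inv @ C            # Schur complement of D in M
S_inv = np.linalg.inv(S)

# Assumed form of the "lower triangular" block matrix L.
L = np.block([[np.eye(p), np.zeros((p, q))],
              [-D_inv @ C, D_inv]])

# Expected product M L: Schur complement in the upper-left block.
ML_expected = np.block([[S, B @ D_inv],
                        [np.zeros((q, p)), np.eye(q)]])

# Block formula for the inverse of M.
M_inv_blocks = np.block([
    [S_inv, -S_inv @ B @ D_inv],
    [-D_inv @ C @ S_inv, D_inv + D_inv @ C @ S_inv @ B @ D_inv],
])

print(np.allclose(M @ L, ML_expected))
print(np.allclose(M_inv_blocks, np.linalg.inv(M)))
```

Explicit inverses are used here only to mirror the formulas; in numerical work one would normally prefer linear solves.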
If M is a positive-definite symmetric matrix, then so is the Schur complement of D in M.
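A small numerical illustration of this property, assuming NumPy: the example matrix is an arbitrary symmetric positive-definite choice (its leading principal minors are 4, 11 and 10), and positive-definiteness is checked through the smallest eigenvalue.

```python
import numpy as np

# Illustrative symmetric positive-definite matrix with p = 2, q = 1.
M = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 2.0]])
p = 2
A, B = M[:p, :p], M[:p, p:]
C, D = M[p:, :p], M[p:, p:]
S = A - B @ np.linalg.solve(D, C)   # Schur complement of D in M

# eigvalsh assumes a symmetric input and returns real eigenvalues
# in ascending order.
eig_M = np.linalg.eigvalsh(M)
eig_S = np.linalg.eigvalsh(S)
print(eig_M.min() > 0, eig_S.min() > 0)
```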
If p and q are both 1 (i.e. A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2 by 2 matrix:
- $M^{-1} = \frac{1}{AD - BC} \begin{bmatrix} D & -B \\ -C & A \end{bmatrix}$
provided that the determinant AD − BC is non-zero.
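The scalar case can be worked out by substituting into the block form of the inverse, whose entries are S^{-1}, −S^{-1}BD^{-1}, −D^{-1}CS^{-1} and D^{-1} + D^{-1}CS^{-1}BD^{-1}, with S = A − BD^{-1}C. The derivation below is a sketch that additionally assumes D ≠ 0, as the definition of the Schur complement requires.

```latex
\begin{align*}
S &= A - BD^{-1}C = \frac{AD - BC}{D},
  \qquad S^{-1} = \frac{D}{AD - BC}, \\
-S^{-1}BD^{-1} &= -\frac{B}{AD - BC},
  \qquad -D^{-1}CS^{-1} = -\frac{C}{AD - BC}, \\
D^{-1} + D^{-1}CS^{-1}BD^{-1}
  &= \frac{1}{D} + \frac{BC}{D(AD - BC)}
   = \frac{AD - BC + BC}{D(AD - BC)}
   = \frac{A}{AD - BC}, \\
M^{-1} &= \frac{1}{AD - BC}
  \begin{bmatrix} D & -B \\ -C & A \end{bmatrix}.
\end{align*}
```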
Application to solving linear equations
The Schur complement arises naturally in solving a system of linear equations such as
- Ax + By = a
- Cx + Dy = b
where x, a are p-dimensional column vectors, y, b are q-dimensional column vectors, and A, B, C, D are as above. Multiplying the bottom equation from the left by $BD^{-1}$ and then subtracting it from the top equation, one obtains
- $(A - BD^{-1}C)x = a - BD^{-1}b$.
Thus if one can invert D as well as the Schur complement of D, one can solve for x, and then by using the equation Cx + Dy = b one can solve for y. This reduces the problem of inverting a (p+q)×(p+q) matrix to that of inverting a p×p matrix and a q×q matrix. In practice one needs D to be well-conditioned in order for this algorithm to be numerically accurate.
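The elimination procedure described above can be sketched as follows, assuming NumPy. The function name `solve_via_schur` and the test data are illustrative assumptions; the sketch uses linear solves in place of explicit inverses.

```python
import numpy as np

def solve_via_schur(A, B, C, D, a, b):
    """Solve Ax + By = a, Cx + Dy = b by eliminating y (hypothetical helper).

    Requires D and the Schur complement A - B D^{-1} C to be invertible.
    """
    Dinv_C = np.linalg.solve(D, C)          # D^{-1} C
    Dinv_b = np.linalg.solve(D, b)          # D^{-1} b
    S = A - B @ Dinv_C                      # Schur complement of D
    x = np.linalg.solve(S, a - B @ Dinv_b)  # (A - B D^{-1} C) x = a - B D^{-1} b
    y = np.linalg.solve(D, b - C @ x)       # back-substitute into Cx + Dy = b
    return x, y

# Illustrative data with p = q = 2.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0]])
C = np.array([[2.0, 0.0], [1.0, 1.0]])
D = np.array([[2.0, 0.0], [0.0, 4.0]])
a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

x, y = solve_via_schur(A, B, C, D, a, b)

# Cross-check against a direct solve of the full (p+q) x (p+q) system.
M = np.block([[A, B], [C, D]])
direct = np.linalg.solve(M, np.concatenate([a, b]))
print(np.allclose(np.concatenate([x, y]), direct))
```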
Applications to probability theory and statistics
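Suppose the random column vectors X, Y live in $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively, and the vector (X, Y) in $\mathbb{R}^{n+m}$ has a multivariate normal distribution whose variance is the symmetric positive-definite matrix
- $V = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}$,
where A is n×n, B is n×m and C is m×m.
Then the conditional variance of X given Y is the Schur complement of C in V:
- $\operatorname{var}(X \mid Y) = A - BC^{-1}B^T$.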
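A minimal sketch of this computation, assuming NumPy and assuming V is partitioned into an n×n block A, an n×m block B and an m×m block C as above. The helper name `conditional_covariance` and the example matrix are illustrative. As a cross-check, the sketch uses the block inverse formula: the leading n×n block of V^{-1} is the inverse of the Schur complement.

```python
import numpy as np

def conditional_covariance(V, n):
    """Schur complement A - B C^{-1} B^T of the trailing block C of V.

    Hypothetical helper: V is the (n+m) x (n+m) variance matrix of (X, Y)
    and n is the dimension of X.
    """
    A, B, C = V[:n, :n], V[:n, n:], V[n:, n:]
    return A - B @ np.linalg.solve(C, B.T)

# Illustrative symmetric positive-definite variance matrix, n = 2, m = 1.
V = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 2.0]])
cond = conditional_covariance(V, n=2)

# Leading block of the inverse of V; its inverse should match cond.
precision_block = np.linalg.inv(V)[:2, :2]
print(cond)
print(np.allclose(np.linalg.inv(precision_block), cond))
```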
If we take the matrix V above to be, not a variance of a random vector, but a sample variance, then it may have a Wishart distribution. In that case, the Schur complement of C in V also has a Wishart distribution.
References
- [1] Zhang, Fuzhen (2005). The Schur Complement and Its Applications. Springer. ISBN 0387242716.