Schur product theorem

In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur^[1] (Schur 1911, p. 14, Theorem VII) (note that Schur signed as J. Schur in Journal für die reine und angewandte Mathematik.^[2]^[3])

Proof

Proof using the trace formula

It is easy to show that for matrices $M$ and $N$ , the Hadamard product $M \circ N$ considered as a bilinear form acts on vectors $a, b$ as

a^T (M \circ N) b = \operatorname{Tr}(M \operatorname{diag}(a) N \operatorname{diag}(b))

where $\operatorname{Tr}$ is the matrix trace and $\operatorname{diag}(a)$ is the diagonal matrix having as diagonal entries the elements of $a$ .

Since $M$ and $N$ are positive definite, we can consider their square-roots $M^{1/2}$ and $N^{1/2}$ and write

\operatorname{Tr}(M \operatorname{diag}(a) N \operatorname{diag}(b)) = \operatorname{Tr}(M^{1/2} M^{1/2} \operatorname{diag}(a) N^{1/2} N^{1/2} \operatorname{diag}(b)) = \operatorname{Tr}(M^{1/2} \operatorname{diag}(a) N^{1/2} N^{1/2} \operatorname{diag}(b) M^{1/2})

Then, for $a=b$ , this is written as $\operatorname{Tr}(A^T A)$ for $A = N^{1/2} \operatorname{diag}(a) M^{1/2}$ and thus is positive. This shows that $(M \circ N)$ is a positive definite matrix.

Proof using Gaussian integration

Case of M = N

Let $X$ be an $n$ -dimensional centered Gaussian random variable with covariance $\langle X_i X_j \rangle = M_{ij}$ . Then the covariance matrix of $X_i^2$ and $X_j^2$ is

\operatorname{Cov}(X_i^2, X_j^2) = \langle X_i^2 X_j^2 \rangle - \langle X_i^2 \rangle \langle X_j^2 \rangle

Using Wick's theorem to develop $\langle X_i^2 X_j^2 \rangle = 2 \langle X_i X_j \rangle^2 + \langle X_i^2 \rangle \langle X_j^2 \rangle$ we have

\operatorname{Cov}(X_i^2, X_j^2) = 2 \langle X_i X_j \rangle^2 = 2 M_{ij}^2

Since a covariance matrix is positive definite, this proves that the matrix with elements $M_{ij}^2$ is a positive definite matrix.

General case

Let $X$ and $Y$ be $n$ -dimensional centered Gaussian random variables with covariances $\langle X_i X_j \rangle = M_{ij}$ , $\langle Y_i Y_j \rangle = N_{ij}$ and independt from each other so that we have

\langle X_i Y_j \rangle = 0

for any

i, j

Then the covariance matrix of $X_i Y_i$ and $X_j Y_j$ is

\operatorname{Cov}(X_i Y_i, X_j Y_j) = \langle X_i Y_i X_j Y_j \rangle - \langle X_i Y_i \rangle \langle X_j Y_j \rangle

Using Wick's theorem to develop

\langle X_i Y_i X_j Y_j \rangle = \langle X_i X_j \rangle \langle Y_i Y_j \rangle + \langle X_i Y_i \rangle \langle X_j Y_j \rangle + \langle X_i Y_j \rangle \langle X_j Y_i \rangle

and also using the independence of $X$ and $Y$ , we have

\operatorname{Cov}(X_i Y_i, X_j Y_j) = \langle X_i X_j \rangle \langle Y_i Y_j \rangle = M_{ij} N_{ij}

Since a covariance matrix is positive definite, this proves that the matrix with elements $M_{ij} N_{ij}$ is a positive definite matrix.

Proof using eigendecomposition

Proof of positivity

Let $M = \sum \mu_i m_i m_i^T$ and $N = \sum \nu_i n_i n_i^T$ . Then

M \circ N = \sum_{ij} \mu_i \nu_j (m_i m_i^T) \circ (n_j n_j^T) = \sum_{ij} \mu_i \nu_j (m_i \circ n_j) (m_i \circ n_j)^T

Each $(m_i \circ n_j) (m_i \circ n_j)^T$ is positive (but, except in the 1-dimensional case, not positive definite, since they are rank 1 matrices) and $\mu_i \nu_j > 0$ , thus the sum giving $M \circ N$ is also positive.

Complete proof

To show that the result is positive definite requires further proof. We shall show that for any vector $a \neq 0$ , we have $a^T (M \circ N) a > 0$ . Continuing as above, each $a^T (m_i \circ n_j) (m_i \circ n_j)^T a \ge 0$ , so it remains to show that there exist $i$ and $j$ for which the inequality is strict. For this we observe that

a^T (m_i \circ n_j) (m_i \circ n_j)^T a = \left(\sum_k m_{i,k} n_{j,k} a_k\right)^2

Since $N$ is positive definite, there is a $j$ for which $n_{j,k} a_k$ is not 0 for all $k$ , and then, since $M$ is positive definite, there is an $i$ for which $m_{i,k} n_{j,k} a_k$ is not 0 for all $k$ . Then for this $i$ and $j$ we have $\left(\sum_k m_{i,k} n_{j,k} a_k\right)^2 > 0$ . This completes the proof.

References

↑ "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik (Crelle's Journal) 1911 (140): 1–00. 1911. doi:10.1515/crll.1911.140.1.
↑ Zhang, Fuzhen, ed. (2005). "The Schur Complement and Its Applications". Numerical Methods and Algorithms 4. doi:10.1007/b105056. ISBN 0-387-24271-6. , page 9, Ch. 0.6 Publication under J. Schur
↑ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society 15 (2): 97–106. doi:10.1112/blms/15.2.97.

External links

Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen at EUDML