Weyr canonical form

The image shows an example of a general Weyr matrix consisting of two blocks each of which is a basic Weyr matrix. The basic Weyr matrix in the top-left corner has the structure (4,2,1) and the other one has the structure (2,2,1,1).

In linear algebra, a Weyr canonical form (also called a Weyr form or Weyr matrix) is a square matrix satisfying the conditions given in the definitions below. The Weyr form was discovered by the Czech mathematician Eduard Weyr in 1885.[1][2][3] It did not become popular among mathematicians and was overshadowed by the closely related, but distinct, Jordan canonical form.[3] The Weyr form has been rediscovered several times since Weyr’s original discovery in 1885.[4] It has been variously called the modified Jordan form, reordered Jordan form, second Jordan form, and H-form.[4] The current terminology is credited to Shapiro, who introduced it in a paper published in the American Mathematical Monthly in 1999.[4][5]

Recently several applications have been found for the Weyr matrix. Of particular interest is an application of the Weyr matrix in the study of phylogenetic invariants in biomathematics.

Definitions

Basic Weyr matrix


Definition

A basic Weyr matrix with eigenvalue \lambda is an n\times n matrix W of the following form: There is a partition

n_1 + n_2+ \cdots +n_r=n of n with n_1\ge n_2\ge \cdots \ge  n_r\ge 1

such that, when W is viewed as an  r \times r blocked matrix (W_{ij}), where the  (i, j) block  W_{ij} is an n_i \times n_j matrix, the following three features are present:

  1. The main diagonal blocks  W_{ii} are the n_i\times  n_i scalar matrices \lambda I for i = 1, \ldots  , r.
  2. The first superdiagonal blocks W_{i,i+1} are full column rank n_i \times n_{i+1} matrices in reduced row-echelon form (that is, an identity matrix followed by zero rows) for  i=1, \ldots, r-1 .
  3. All other blocks of W are zero (that is,  W_{ij} = 0 when j \ne  i, i + 1).

In this case, we say that W has Weyr structure (n_1, n_2, \ldots  , n_r).
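The definition above can be turned into a small constructor. The following is a minimal sketch in plain Python; the function name basic_weyr and its arguments are illustrative assumptions, not part of the cited sources. It places \lambda on the diagonal and an identity-over-zeros block in each first superdiagonal position.

```python
def basic_weyr(structure, lam):
    """Build a basic Weyr matrix (list of rows) with the given Weyr structure.

    `structure` is the partition (n_1, ..., n_r); `lam` is the eigenvalue.
    Both names are hypothetical helpers for illustration only.
    """
    if (not structure or structure[-1] < 1
            or any(a < b for a, b in zip(structure, structure[1:]))):
        raise ValueError("structure must be a non-increasing list of positive integers")
    n = sum(structure)
    w = [[0] * n for _ in range(n)]
    # Feature 1: the diagonal blocks are scalar matrices lam * I.
    for i in range(n):
        w[i][i] = lam
    # Feature 2: block W_{i,i+1} is an identity matrix followed by zero rows.
    offset = 0
    for n_i, n_next in zip(structure, structure[1:]):
        for k in range(n_next):
            w[offset + k][offset + n_i + k] = 1
        offset += n_i
    # Feature 3: every other entry is left at zero.
    return w


small = basic_weyr([2, 1], 5)
```

Here small is the 3 \times 3 basic Weyr matrix with structure (2, 1) and eigenvalue 5.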

Example

The following is an example of a basic Weyr matrix.

W =
\begin{bmatrix}
W_{11} & W_{12} &  &    \\
       & W_{22} & W_{23} &    \\
       &        & W_{33} & W_{34}   \\
       &        &        & W_{44}  \\
\end{bmatrix}

In this matrix,  n=9 and  n_1=4, n_2=2, n_3=2, n_4=1. So  W has the Weyr structure (4,2,2,1). Also,


W_{11} =
\begin{bmatrix}
\lambda &      0 &       0 &       0 \\
   0     &\lambda &       0 &       0 \\
   0     &    0    & \lambda &       0 \\
   0     &    0    &     0    & \lambda \\
\end{bmatrix} = \lambda I_4, \quad
W_{22} =
\begin{bmatrix}
\lambda &      0 \\
    0    &\lambda \\
 \end{bmatrix} = \lambda I_2, \quad
W_{33} =
\begin{bmatrix}
\lambda &      0 \\
    0    &\lambda \\
 \end{bmatrix} =\lambda I_2, \quad
W_{44} =
\begin{bmatrix}
\lambda \\
 \end{bmatrix} = \lambda I_1

and


W_{12}=
\begin{bmatrix}
1 & 0 \\
0 & 1\\
0 & 0\\
0 & 0\\
\end{bmatrix}, \quad
W_{23}=
\begin{bmatrix}
1 & 0 \\
0& 1\\
\end{bmatrix},\quad
W_{34} =
\begin{bmatrix}
1 \\
0 \\
\end{bmatrix}.

General Weyr matrix


Definition

Let  W be a square matrix and let \lambda_1, \ldots, \lambda_k  be the distinct eigenvalues of W  . We say that  W is in Weyr form (or is a Weyr matrix) if  W has the following form:


W =
\begin{bmatrix}
W_1 &     &        &    \\
    & W_2 &        &    \\
    &     & \ddots &    \\
    &     &        & W_k \\
\end{bmatrix}

where  W_i is a basic Weyr matrix with eigenvalue  \lambda_i for  i = 1, \ldots , k.
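As an illustration of this block-diagonal shape, the sketch below assembles a general Weyr matrix from basic Weyr blocks written out by hand. The helper name block_diag and the two sample blocks are assumptions chosen for the example; they do not reproduce the matrix shown in the article's image.

```python
def block_diag(blocks):
    """Place square blocks along the diagonal of a zero matrix (hypothetical helper)."""
    n = sum(len(b) for b in blocks)
    w = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                w[offset + i][offset + j] = x
        offset += len(b)
    return w


# Two basic Weyr matrices with distinct eigenvalues.
w1 = [[4, 0, 1], [0, 4, 0], [0, 0, 4]]   # structure (2, 1), eigenvalue 4
w2 = [[-3, 1], [0, -3]]                  # structure (1, 1), eigenvalue -3
w = block_diag([w1, w2])
```

The resulting 5 \times 5 matrix w is in Weyr form because each diagonal block is a basic Weyr matrix and the eigenvalues 4 and -3 are distinct.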

Example

The following image shows an example of a general Weyr matrix consisting of three basic Weyr matrix blocks. The basic Weyr matrix in the top-left corner has the structure (4,2,1) with eigenvalue 4, the middle block has structure (2,2,1,1) with eigenvalue -3 and the one in the lower-right corner has the structure (3, 2) with eigenvalue 0.

The Weyr form is canonical

That the Weyr form is a canonical form of a matrix is a consequence of the following result:[3] Up to permutation of basic Weyr blocks, each square matrix A over an algebraically closed field is similar to a unique Weyr matrix W. The matrix W is called the Weyr (canonical) form of A.

Computation of the Weyr canonical form

Reduction to the nilpotent case

Let A be a square matrix of order n over an algebraically closed field and let the distinct eigenvalues of A be \lambda_1, \lambda_2, \ldots, \lambda_k. As a consequence of the generalized eigenspace decomposition theorem, one can show that A is similar to a block diagonal matrix of the form


A=
\begin{bmatrix}
\lambda_1I + N_1&   &   &    \\
    & \lambda_2I + N_2 &  &  \\
    &      & \ddots & \\
    &      &        & \lambda_kI + N_k \\
\end{bmatrix}
=
\begin{bmatrix}
\lambda_1I &   &   &    \\
    & \lambda_2I  &  &  \\
    &      & \ddots & \\
    &      &        & \lambda_kI  \\
\end{bmatrix}
+
\begin{bmatrix}
 N_1&   &   &    \\
    &  N_2 &  &  \\
    &      & \ddots & \\
    &      &        &  N_k \\
\end{bmatrix}
=
D+N

where D is a diagonal matrix and N is a nilpotent matrix. So the problem of reducing A to the Weyr form reduces to the problem of reducing the nilpotent matrices N_i to the Weyr form.
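A small numerical check may make the splitting concrete. The sketch below assumes the matrix is already in the block-diagonal form above, with each block upper triangular and constant on its diagonal, so that D is just the diagonal part. The function names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_diagonal_nilpotent(m):
    """Return (D, N) with D the diagonal part of m and N = m - D."""
    n = len(m)
    d = [[m[i][j] if i == j else 0 for j in range(n)] for i in range(n)]
    nil = [[m[i][j] - d[i][j] for j in range(n)] for i in range(n)]
    return d, nil


def is_nilpotent(m):
    """An n x n matrix is nilpotent exactly when its n-th power is zero."""
    power = m
    for _ in range(len(m) - 1):
        power = matmul(power, m)
    return all(x == 0 for row in power for x in row)


# Blocks: 2*I + N_1 (size 2) and 5*I + N_2 (size 1, N_2 = 0).
m = [[2, 1, 0], [0, 2, 0], [0, 0, 5]]
d, nil = split_diagonal_nilpotent(m)
```

For this sample matrix, nil is nilpotent and commutes with d, which is why only the nilpotent parts N_i need further reduction.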

Reduction of a nilpotent matrix to the Weyr form


Given a nilpotent square matrix A of order  n over an algebraically closed field  F, the following algorithm produces an invertible matrix  C and a Weyr matrix  W such that W=C^{-1}AC.

Step 1

Let A_1=A

Step 2

  1. Compute a basis for the null space of A_1.
  2. Extend the basis for the null space of A_1 to a basis for the n-dimensional vector space F^n.
  3. Form the matrix P_1 consisting of these basis vectors.
  4. Compute  P_1^{-1}A_1P_1=\begin{bmatrix}0 & B_2 \\ 0 & A_2 \end{bmatrix}. A_2 is a square matrix of size n - nullity(A_1).

Step 3

If A_2 is nonzero, repeat Step 2 on A_2.

  1. Compute a basis for the null space of A_2.
  2. Extend the basis for the null space of A_2 to a basis for the vector space having dimension n - nullity(A_1).
  3. Form the matrix P_2 consisting of these basis vectors.
  4. Compute  P_2^{-1}A_2P_2=\begin{bmatrix}0 & B_3 \\ 0 & A_3 \end{bmatrix}. A_3 is a square matrix of size n - nullity(A_1) - nullity(A_2).

Step 4

Continue the processes of Steps 2 and 3 to obtain increasingly smaller square matrices A_1, A_2, A_3, \ldots and associated invertible matrices P_1, P_2, P_3, \ldots until the first zero matrix A_r is obtained.

Step 5

The Weyr structure of A is (n_1,n_2, \ldots, n_r) where n_i = nullity(A_i).
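Steps 1 to 5 can be sketched in plain Python with exact rational arithmetic. This is an illustrative sketch, not a reference implementation: the helper names (rref, null_basis, matmul, weyr_structure) are assumptions, and the null-space basis is extended with standard basis vectors, which is one valid choice among many.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row-echelon form, pivot columns) using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def null_basis(a):
    """Basis of the null space of a, one vector per free column."""
    reduced, pivots = rref(a)
    n = len(a[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for i, c in enumerate(pivots):
            v[c] = -reduced[i][free]
        basis.append(v)
    return basis


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def weyr_structure(a):
    """Weyr structure [n_1, ..., n_r] of a nilpotent matrix, following Steps 1 to 5."""
    structure = []
    while True:
        n = len(a)
        null = null_basis(a)
        k = len(null)                      # k = nullity(A_i)
        if k == 0:
            raise ValueError("input matrix is not nilpotent")
        structure.append(k)
        if k == n:                         # A_i is the zero matrix: stop
            return structure
        # Extend the null-space basis with standard basis vectors.
        identity = [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]
        candidates = null + identity
        _, pivots = rref([[v[i] for v in candidates] for i in range(n)])
        p = [[candidates[j][i] for j in pivots] for i in range(n)]
        # Invert P by row-reducing [P | I], then conjugate.
        aug, _ = rref([p[i] + identity[i] for i in range(n)])
        p_inv = [row[n:] for row in aug]
        x = matmul(matmul(p_inv, a), p)
        a = [row[k:] for row in x[k:]]     # lower-right block A_{i+1}


# Nilpotent matrix with A e2 = e1, A e3 = e2, A e1 = A e4 = 0.
example = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
structure = weyr_structure(example)
```

For this sample matrix the successive nullities are 2, 1 and 1, so the sketch returns the structure (2, 1, 1).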

Step 6

  1. Compute the matrix  P = P_1 \begin{bmatrix} I & 0 \\ 0 & P_2 \end{bmatrix}\begin{bmatrix} I & 0 \\ 0 & P_3 \end{bmatrix}\cdots \begin{bmatrix} I & 0 \\ 0 & P_r \end{bmatrix} (here the I's are appropriately sized identity matrices).
  2. Compute X=P^{-1}AP. X is a matrix of the following form:
 X = \begin{bmatrix}0 & X_{12} & X_{13} & \cdots & X_{1,r-1} &X_{1r}\\  & 0 & X_{23} & \cdots & X_{2,r-1} & X_{2r}\\  &  &  & \ddots & \\ & & & \cdots & 0& X_{r-1,r} \\ & & & & & 0 \end{bmatrix}.

Step 7

Use elementary row operations to find an invertible matrix  Y_{r-1} of appropriate size such that the product Y_{r-1}X_{r-1,r} is a matrix of the form I_{r,r-1}= \begin{bmatrix} I \\ O \end{bmatrix}.

Step 8

Set Q_1= \text{diag} (I,I, \ldots, Y_{r-1}^{-1}, I) and compute  Q_1^{-1}XQ_1. In this matrix, the (r-1,r)-block is I_{r,r-1}.

Step 9

Find a matrix R_1 formed as a product of elementary matrices such that  R_1^{-1} Q_1^{-1}XQ_1R_1 is a matrix in which all the blocks above the block I_{r,r-1} contain only 0's.

Step 10

Repeat Steps 7 to 9 on column  r-1, converting the (r-2, r-1)-block to I_{r-1,r-2} via conjugation by some invertible matrix Q_2. Use this block to clear out the blocks above it, via conjugation by a product R_2 of elementary matrices.

Step 11

Repeat these processes on columns r-2,r-3,\ldots , 3, 2, using conjugations by  Q_3, R_3,\ldots , Q_{r-2}, R_{r-2}, Q_{r-1} . The resulting matrix W is now in Weyr form.

Step 12

Let  C = P_1 \text{diag} (I, P_2) \cdots \text{diag}(I, P_{r-1})Q_1R_1Q_2\cdots  R_{r-2}Q_{r-1}. Then  W = C^{-1}AC.
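The final identity W = C^{-1}AC can be checked without inverting C by testing AC = CW instead. The sketch below does this for a hand-worked 3 \times 3 case. The matrices a, c and w and the helper name is_conjugation are assumptions for illustration, and the check does not itself verify that C is invertible.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def is_conjugation(a, c, w):
    """True when A C = C W, which is W = C^{-1} A C whenever C is invertible."""
    return matmul(a, c) == matmul(c, w)


# A is nilpotent with null space spanned by e1 and e3.
a = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
# C lists the basis e1, e3, e2 as columns (a permutation matrix, hence invertible).
c = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
# W is the basic Weyr matrix with eigenvalue 0 and structure (2, 1).
w = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

ok = is_conjugation(a, c, w)
```

In this small case the null-space step alone already yields the Weyr form, so no further conjugations by Q_i or R_i are needed.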

Applications of the Weyr form

Some well-known applications of the Weyr form are listed below:[3]

  1. The Weyr form can be used to simplify the proof of Gerstenhaber’s Theorem which asserts that the subalgebra generated by two commuting n \times n matrices has dimension at most n.
  2. A finite set of matrices is said to be approximately simultaneously diagonalizable if the matrices can be perturbed to simultaneously diagonalizable matrices. The Weyr form is used to prove approximate simultaneous diagonalizability of various classes of matrices. The approximate simultaneous diagonalizability property has applications in the study of phylogenetic invariants in biomathematics.
  3. The Weyr form can be used to simplify the proofs of the irreducibility of the variety of all k-tuples of commuting complex matrices.

References

  1. Eduard Weyr (1885). "Répartition des matrices en espèces et formation de toutes les espèces". Comptes Rendus, Paris 100: 966–969.
  2. Eduard Weyr (1890). "Zur Theorie der bilinearen Formen". Monatsh. Math. Physik 1: 163–236.
  3. Kevin C. O'Meara, John Clark, Charles I. Vinsonhaler (2011). Advanced Topics in Linear Algebra: Weaving Matrix Problems through the Weyr Form. Oxford University Press.
  4. Kevin C. O'Meara, John Clark, Charles I. Vinsonhaler (2011). Advanced Topics in Linear Algebra: Weaving Matrix Problems through the Weyr Form. Oxford University Press. pp. 44, 81–82.
  5. Shapiro, H. (1999). "The Weyr characteristic". The American Mathematical Monthly 106: 919–929. doi:10.2307/2589746.