Jordan Canonical Form

Section 5.6 Jordan Canonical Form

The results of Theorem 5.5.1 and Theorem 5.5.3 tell us that for an eigenvalue \(\lambda\) of \(T:V\to V\) with multiplicity \(m\text{,}\) we have a sequence of subspace inclusions

\begin{equation*} E_\lambda(T) = \ker (T-\lambda I)\subseteq \ker (T-\lambda I)^2 \subseteq \cdots \subseteq \ker (T-\lambda I)^m = G_\lambda(T)\text{.} \end{equation*}

Not all subspaces in this sequence are necessarily distinct. Indeed, it is entirely possible that \(\dim E_\lambda(T)=m\text{,}\) in which case \(E_\lambda(T)=G_\lambda(T)\text{.}\) In geeral there will be some \(l\leq m\) such that \(\ker (T-\lambda I)^l=G_\lambda(T)\text{.}\)

Our goal in this section is to determine a basis for \(G_\lambda(T)\) in a standard way. We begin with a couple of important results, which we state without proof. The first can be found in Axler’s book; the second in Nicholson’s.

Theorem 5.6.1.

Suppose \(V\) is a complex vector space, and \(T:V\to V\) is a linear operator. Let \(\lambda_1,\ldots, \lambda_k\) denote the distinct eigenvalues of \(T\text{.}\) (We can assume \(V\) is real if we also assume that all eigenvalues of \(V\) are real.) Then:

Generalized eigenvectors corresponding to distinct eigenvalues are linearly independent.
\(\displaystyle V = G_{\lambda_1}(T)\oplus G_{\lambda_2}(T)\oplus \cdots \oplus G_{\lambda_k}(T)\)
Each generalize eigenspace \(G_{\lambda_j}(T)\) is \(T\)-invariant.
Each restriction \((T-\lambda_j)|_{G_{\lambda_j}(T)}\) is nilpotent.

Theorem 5.6.2.

Let \(T:V\to V\) be a linear operator. If the characteristic polynomial of \(T\) is given by

\begin{equation*} c_T(x) = (x-\lambda_1)^{m_1}(x-\lambda_2)^{m_2}\cdots (x-\lambda_k)^{m_k}\text{,} \end{equation*}

then \(\dim G_{\lambda_j}(T)=m_j\) for each \(j=1,\ldots, k\text{.}\)

Moreover, if we let \(B=B_1\cup B_2\cup\cdots \cup B_k\text{,}\) where \(B_j\) is any basis for \(G_{\lambda_j}(T)\) for \(j=1,\ldots, k\text{,}\) then \(B\) is a basis for \(V\) (this follows immediately from Theorem 5.6.1) and the matrix of \(T\) with respect to this basis has the block-diagonal form

\begin{equation*} M_B(T) = \bbm A_1 \amp 0 \amp \cdots \amp 0\\ 0 \amp A_2 \amp \cdots \amp 0\\ \vdots\amp\vdots\amp\ddots\amp\vdots\\ 0 \amp 0 \amp \cdots \amp A_k\ebm\text{,} \end{equation*}

where each \(A_j\) has size \(m_j\times m_j\text{.}\)

A few remarks are called for here.

One of the ways to see that \(\dim G_{\lambda_j}(T)=m_j\) is to consider \((M_B(T)-\lambda_j I_n)^{m_j}\text{.}\) This will have the form \(\diag(U_1^{m_j}, U_2^{m_j},\ldots, U_k^{m_j})\text{,}\) where \(U_i\) is the matrix of \((T-\lambda_j)^{m_j}\text{,}\) restricted to \(G_{\lambda_i}(T)\text{.}\) If \(i\neq j\text{,}\) \(T-\lambda_j I\) restricts to an invertible operator on \(G_{\lambda_i}(T)\text{,}\) but its restriction to \(G_{\lambda_j}(T)\) is nilpotent, by Theorem 5.6.1. So \(U_j\) is nilpotent (with \(U_j^{m_j}=0\)), and has size \(m_j\times m_j\text{,}\) while \(U_i\) is invertible if \(i\neq j\text{.}\) The matrix \((M_B(T)-\lambda_j I)^{m_j}\) thus ends up with a \(m_j\times m_j\) block of zeros, so \(\dim \ker (T-\lambda_j I)^{m_j}=m_j\text{.}\)
If the previous point wasn’t clear, note that with an appropriate choice of basis, the block \(A_i\) in Theorem 5.6.2 has the form

\begin{equation*} A_i = \bbm \lambda_i \amp \ast \amp \cdots \amp \ast\\ 0 \amp \lambda_i \amp \cdots \amp \ast\\ \vdots \amp \vdots \amp \ddots\amp \vdots\\ 0 \amp 0 \amp \cdots \amp \lambda_i\ebm\text{.} \end{equation*}

Thus, \(M_B(T)-\lambda_j I\) will have blocks that are upper triangular, with diagonal entries \(\lambda_i-\lambda_j\neq 0\) when \(i\neq j\text{,}\) but when \(i=j\) we get a matrix that is strictly upper triangular, and therefore nilpotent, since its diagonal entries will be \(\lambda_j-\lambda_j=0\text{.}\)
if \(l_j\) is the least integer such that \(\ker (A-\lambda_j I)^{l_j}=G_{\lambda_j}(T)\text{,}\) then it is possible to choose the basis of \(G_{\lambda_j}(T)\) so that \(A_j\) is itself block-diagonal, with the largest block having size \(l_j\times l_j\text{.}\) The remainder of this section is devoted to determining how to choose such a basis.

The basic principle for choosing a basis for each generalized eigenspace is as follows. We know that \(E_{\lambda}(T)\subseteq G_\lambda(T)\) for each eigenvalue \(\lambda\text{.}\) So we start with a basis for \(E_\lambda(T)\text{,}\) by finding eigenvectors as usual. If \(\ker (T-\lambda I)^2 = \ker (T-\lambda I)\text{,}\) then we’re done: \(E_\lambda(T)=G_\lambda(T)\text{.}\) Otherwise, we enlarge the basis for \(E_\lambda(T)\) to a basis of \(\ker T(-\lambda I)^2\text{.}\) If \(\ker T(-\lambda I)^3=\ker (T-\lambda I)^2\text{,}\) then we’re done, and \(G_\lambda(T) = \ker (T-\lambda I)^2\text{.}\) If not, we enlarge our existing basis to a basis of \(\ker (T-\lambda I)^3\text{.}\) We continue this process until we reach some power \(l\) such that \(\ker (T-\lambda I)^l = \ker (T-\lambda I)^{l+1}\text{.}\) (This is guaranteed to happen by Theorem 5.5.1.) We then conclude that \(G_\lambda(T) = \ker (T-\lambda I)^l\text{.}\)

The above produces a basis for \(G_\lambda(T)\text{,}\) but we want what is, in some sense, the “best” basis. For our purposes, the best basis is the one in which the matrix of \(T\) restricted to each generalized eigenspace is block diagonal, where each block is a Jordan block.

Definition 5.6.3.

Let \(\lambda\) be a scalar. A Jordan block is an \(m\times m\) matrix of the form

\begin{equation*} J(m,\lambda) = \bbm \lambda \amp 1 \amp 0 \amp \cdots \amp 0\\ 0 \amp \lambda \amp 1 \amp \cdots \amp 0\\ \vdots \amp \vdots \amp \ddots \amp \ddots \amp \vdots\\ 0 \amp 0 \amp \cdots \amp \lambda \amp 1\\ 0 \amp 0 \amp 0 \amp \cdots \amp \lambda\ebm\text{.} \end{equation*}

That is \(J(m,\lambda)\) has each diagonal entry equal to \(\lambda\text{,}\) and each “superdiagonal” entry (those just above the diagonal) equal to 1, with all other entries equal to zero.

Example 5.6.4.

The following are examples of Jordan blocks:

\begin{equation*} J(2,4)=\bbm 4 \amp 1\\ 0\amp 4\ebm, J(3,\sqrt{2})=\bbm \sqrt{2} \amp 1\amp 0\\0\amp \sqrt{2}\amp 1\\0\amp 0\amp \sqrt{2}\ebm, J(4,2i)=\bbm 2i \amp 1\amp 0\amp 0\\0\amp 2i\amp 1\amp 0\\0\amp 0\amp 2i\amp 1\\0\amp 0\amp 0\amp 2i\ebm\text{.} \end{equation*}

Insight 5.6.5. Finding a chain basis.

A Jordan block corresponds to basis vectors \(\vv_1,\vv_2,\ldots, \vv_m\) with the following properties:

\begin{align*} T(\vv_1) \amp = \lambda \vv_1\\ T(\vv_2) \amp = \vv_1+\lambda \vv_2\\ T(\vv_3) \amp = \vv_2+\lambda \vv_3\text{,} \end{align*}

and so on. Notice that \(\vv_1\) is an eigenvector, and for each \(j=2,\ldots, m\text{,}\)

\begin{equation*} (T-\lambda I)\vv_{j} = \vv_{j-1}\text{.} \end{equation*}

Notice also that if we set \(N=T-\lambda I\text{,}\) then

\begin{equation*} \vv_1 = N\vv_2, \vv_2 = N\vv_3, \ldots, \vv_{m-1} = N\vv_m \end{equation*}

so our basis for \(G_\lambda(T)\) is of the form

\begin{equation*} \vv, N\vv, N^2\vv, \ldots, N^{m-1}\vv\text{,} \end{equation*}

where \(\vv = \vv_m\text{,}\) and \(\vv_1=N^{m-1}\vv\) is an eigenvector. (Note that \(N^m\vv = (T-\lambda I)\vv_1=\zer\text{,}\) and indeed \(N^m\vv_j=\zer\) for each \(j=1,\ldots, m\text{.}\)) Such a basis is known as a chain basis.

If \(\dim E_\lambda(T)\gt 1\) we might have to repeat this process for each eigenvector in a basis for the eigenspace. The full matrix of \(T\) might have several Jordan blocks of possibly different sizes for each eigenvalue.

Example 5.6.6.

Determine a Jordan basis for the operator \(T:\R^5\to \R^5\) whose matrix with respect to the standard basis is given by

\begin{equation*} A = \bbm 7\amp 1 \amp -3\amp 2\amp 1\\ -6\amp 2\amp 4\amp -2\amp -2\\ 0 \amp 1\amp 3 \amp 1\amp -1\\ -8\amp -1\amp 6\amp 0 \amp-3\\ -4\amp 0\amp 3\amp -1\amp 1\ebm \end{equation*}

Solution.

First, we need the characteristic polynomial.

The characteristic polynomial of \(A\) is given by

\begin{equation*} c_A(x)=(x-2)^2(x-3)^3\text{.} \end{equation*}

We thus have two eigenvalues: \(2\text{,}\) of multiplicity \(2\text{,}\) and \(3\text{,}\) of multiplicity \(3\text{.}\) We next find the \(E_2(A)\) eigenspace.

The computer gives us

\begin{equation*} E_2(A)=\nll(A-2I) = \spn\{\xx_1\}, \text{ where } \xx_1=\bbm -1\\0\\-1\\1\\0\ebm\text{,} \end{equation*}

so we have only one independent eigenvector, which means that \(G_2(A)=\nll(A-2I)^2\text{.}\)

Following Insight 5.6.5, we extend \(\{\xx_1\}\) to a basis of \(G_2(A)\) by solving the system

\begin{equation*} (A-2I)\xx = \xx_1\text{.} \end{equation*}

Using the results above from the computer (or Gaussian elimination), we find a general solution

\begin{equation*} \xx = \bbm -t\\-1\\-t\\t\\0\ebm = t\bbm -1\\0\\-1\\1\\0\ebm + \bbm 0\\-1\\0\\0\\0\ebm\text{.} \end{equation*}

Note that our solution is of the form \(\xx = t\xx_1+\xx_2\text{.}\) We set \(t=0\text{,}\) and get \(\xx_2 = \bbm 0\amp -1\amp 0\amp 0\amp 0\ebm^T\text{.}\)

Next, we consider the eigenvalue \(\lambda=3\text{.}\) The computer gives us the following:

Rescaling to remove fractions, we find

\begin{equation*} E_3(A) = \nll(A-3I) = \spn\{\yy_1,\yy_2\}, \text{ where } \yy_1 = \bbm 1\\-2\\2\\2\\0\ebm, \yy_2 = \bbm -1\\2\\0\\0\\2\ebm\text{.} \end{equation*}

Again, we’re one eigenvector short of the multiplicity, so we need to consider \(G_3(A)=\nll(A-3I)^3\text{.}\)

In the next cell, note that we doubled the eigenvectors in E3 to avoid fractions. To follow the solution in our example, we append 2*E3[0], and reduce the resulting matrix. You should find that using the eigenvector \(\yy_1\) corresponding to E3[0] leads to an inconsistent system. Once you confirm this, replace E3[0] with E3[1] and re-run the cell to see that we get an inconsistent system using \(\yy_2\) as well!

The systems \((A-3I)\yy = \yy_1\) and \((A-3I)\yy=\yy_2\) are both inconsistent, but we can salvage the situation by replacing the eigenvector \(\yy_2\) by some linear combination \(\zz_2 = a\yy_1+b\yy_2\text{.}\) We row-reduce, and look for values of \(a\) and \(b\) that give a consistent system.

The rref command takes things a bit farther than we’d like, so we use the command echelon_form() instead. Note that if \(a=b\text{,}\) the system is inconsistent.

We find that \(a=b\) does the job, so we set

\begin{equation*} \zz_2 = \yy_1+\yy_2 = \bbm 0\\0\\2\\2\\2\ebm\text{.} \end{equation*}

Solving the system \((A-3I)\zz = \yy_1+\yy_2\text{,}\) using the code above, we find

\begin{align*} \zz \amp = \bbm \frac12 +\frac12 s-\frac12 t\\1-s+t\\1+s\\s\\t\ebm\\ \amp = \bbm \frac12\\1\\1\\0\\0\ebm + s\bbm\frac12\\-1\\1\\1\\0\ebm+t\bbm -\frac12\\1\\0\\0\\1\ebm\\ \amp = \bbm \frac12\\1\\1\\0\\0\ebm \frac{s}{2}\yy_1+\frac{t}{2}\yy_2\text{.} \end{align*}

We let \(\zz_3 = \bbm 1 \\ 2 \\ 2 \\ 0 \\ 0\ebm\text{,}\) and check that

\begin{equation*} A\zz_3 = 3\zz_3+\zz_2\text{,} \end{equation*}

as required:

This gives us the basis \(B = \{\xx_1,\xx_2,\yy_1,\zz_2,\zz_3\}\) for \(\R^5\text{,}\) and with respect to this basis, we have the Jordan canonical form

\begin{equation*} M_B(T) = \bbm 2 \amp 1\amp 0\amp 0 \amp 0\\ 0 \amp 2\amp 0\amp 0 \amp 0\\ 0 \amp 0\amp 3\amp 0 \amp 0\\ 0 \amp 0\amp 0\amp 3 \amp 1\\ 0 \amp 0\amp 0\amp 0 \amp 3\ebm\text{.} \end{equation*}

Now that we’ve done all the work required for Example 5.6.6, we should confess that there was an easier way all along:

The jordan_form() command returns a pair \(P,J\text{,}\) where \(J\) is the Jordan canonical form of \(A\text{,}\) and \(P\) is an invertible matrix such that \(P^{-1}AP=J\text{.}\) You might find that the computer’s answer is not quite the same as ours. This is because the Jordan canonical form is only unique up to permutation of the Jordan blocks. Changing the order of the blocks amounts to changing the order of the columns of \(P\text{,}\) which are given by a basis of (generalized eigenvectors).

Exercise 5.6.7.

Determine a Jordan basis for the linear operator \(T:\R^4\to\R^4\) given by

\begin{equation*} T(w,x,y,z)=(w+x,x,-x+2y,w-x+y+z)\text{.} \end{equation*}

A code cell is given below in case you want to try performing the operations demonstrated in Example 5.6.6.

One final note: we mentioned above that the minimal polynomial of an operator has the form

\begin{equation*} m_T(x)=(x-\lambda_1)^{l_1}(x-\lambda_2)^{l_2}\cdots (x-\lambda_k)^{l_k}\text{,} \end{equation*}

where for each \(j=1,2,\ldots, k\text{,}\) \(l_j\) is the size of the largest Jordan block corresponding to \(\lambda_j\text{.}\) Knowing the minimal polynomial therefore tells as a lot about the Jordan canonical form, but not everything. Of course, if \(l_j=1\) for all \(j\text{,}\) then our operator can be diagaonalized. If \(\dim V\leq 4\text{,}\) the minimal polynomial tells us everything, except for the order of the Jordan blocks.

In Exercise 5.6.7, the minimal polynomial is \(m_T(x)=(x-1)^3(x-2)\text{,}\) the same as the characteristic polynomial. If we knew this in advance, then the only possible Jordan canonical forms would be

\begin{equation*} \bbm 1\amp 1\amp 0\amp 0\\ 0\amp 1\amp 1\amp 0\\ 0\amp 0\amp 1\amp 0\\ 0\amp 0\amp 0\amp 2\ebm \text{ or } \bbm 2\amp 0\amp 0\amp 0\\ 0\amp 1\amp 1\amp 0\\ 0\amp 0\amp 1\amp 1\\ 0\amp 0\amp 0\amp 1\ebm\text{.} \end{equation*}

If instead the minimal polynomial had turned out to be \((x-1)^2(x-2)\) (with the same characteristic polynomial), then, up to permutation of the Jordan blocks, our Jordan canonical form would be

\begin{equation*} \bbm 1\amp 0\amp 0\amp 0\\0\amp 1\amp 1\amp 0\\0\amp 0\amp 1\amp 0\\0\amp 0\amp 0\amp 2\ebm\text{.} \end{equation*}

However, once we hit matrices of size \(5\times 5\) or larger, some ambiguity creeps in. For example, suppose \(c_A(x) = (x-2)^5\) with \(m_A(x)=(x-2)^2\text{.}\) Then the largest Jordan block is \(2\times 2\text{,}\) but we could have two \(2\times 2\) blocks and a \(1\times 1\text{,}\) or three \(1\times 1\) blocks and one \(2\times 2\text{.}\)

Exercises Exercises

1.

Find the minimal polynomial \(m(x)\) of \({\left[\begin{array}{cccc} 2 \amp -2 \amp 5 \amp 2\cr 0 \amp -4 \amp 0 \amp 1\cr 0 \amp -3 \amp -3 \amp 3\cr 0 \amp -1 \amp 0 \amp -2 \end{array}\right]}\text{.}\)

2.

Let

\begin{equation*} A={\left[\begin{array}{cccc} 28 \amp -34 \amp -65 \amp 24\cr 3 \amp -4 \amp -10 \amp 3\cr 3 \amp -2 \amp -3 \amp 3\cr -15 \amp 22 \amp 41 \amp -11 \end{array}\right]}. \end{equation*}

Find a matrix \(P\) such that \(D=P^{-1} A P\) is the Jordan canonical form of \(A\text{.}\)

3.

Let

\begin{equation*} A={\left[\begin{array}{cccc} -6 \amp 2 \amp 6 \amp -8\cr 0 \amp -2 \amp 4 \amp -4\cr -8 \amp 6 \amp 16 \amp -28\cr -5 \amp 4 \amp 13 \amp -22 \end{array}\right]}. \end{equation*}