Section 5.1 The matrix of a linear transformation

Recall from Example 2.1.4 in Chapter 2 that given any \(m\times n\) matrix \(A\text{,}\) we can define the matrix transformation \(T_A:\R^n\to \R^m\) by \(T_A(\xx)=A\xx\text{,}\) where we view \(\xx\in\R^n\) as an \(n\times 1\) column vector.

Conversely, given any linear map \(T:\R^n\to \R^m\text{,}\) if we let \(\basis{e}{n}\) denote the standard basis of \(\R^n\text{,}\) then the matrix

\begin{equation*} A = \bbm T(\mathbf{e}_1) \amp T(\mathbf{e}_2) \amp \cdots \amp T(\mathbf{e}_n)\ebm \end{equation*}

is such that \(T=T_A\text{.}\)
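As a small illustration of this construction, the following sketch builds \(A\) column by column from the images of the standard basis vectors. The map `T` below is a hypothetical example chosen only for illustration, and the use of SymPy is an assumption about tooling.

```python
from sympy import Matrix, eye

def T(v):
    """A hypothetical linear map R^3 -> R^2, chosen only for illustration."""
    x, y, z = v
    return Matrix([x + 2*y, 3*z - y])

# The columns of A are T(e_1), T(e_2), T(e_3).
I3 = eye(3)
A = Matrix.hstack(*[T(I3.col(j)) for j in range(3)])
print(A)              # Matrix([[1, 2, 0], [0, -1, 3]])

# T and T_A agree on a sample vector.
v = Matrix([1, -2, 5])
print(A * v == T(v))  # True
```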

We have already discussed the fact that this idea generalizes: given a linear transformation \(T:V\to W\text{,}\) where \(V\) and \(W\) are finite-dimensional vector spaces, it is possible to represent \(T\) as a matrix transformation.

The representation depends on choices of bases for both \(V\) and \(W\text{.}\) Recall the definition of the coefficient isomorphism, from Definition 2.3.5 in Section 2.3. If \(\dim V=n\) and \(\dim W=m\text{,}\) this gives us isomorphisms \(C_B:V\to \R^n\) and \(C_D:W\to \R^m\) depending on the choice of a basis \(B\) for \(V\) and a basis \(D\) for \(W\text{.}\) These isomorphisms define a matrix transformation \(T_A:\R^n\to \R^m\) according to the diagram we gave in Figure 2.3.6.

Exercise 5.1.1.

What is the size of the matrix \(A\) used for the matrix transformation \(T_A:\R^n\to \R^m\text{?}\)

\(m\times n\)
Correct! We need to be able to multiply on the right by an \(n\times 1\) column vector, and get an \(m\times 1\) column vector as output.
\(n\times m\)
The domain of \(T_A\) is \(\R^n\text{,}\) and the product \(A\xx\) is only defined if the number of columns (\(m\)) is equal to the dimension of the domain.
\(m\times m\)
The domain of \(T_A\) is \(\R^n\text{,}\) and the product \(A\xx\) is only defined if the number of columns (\(m\)) is equal to the dimension of the domain.
\(n\times n\)
Although the product \(A\xx\) would be defined in this case, the result would be a vector in \(\R^n\text{,}\) and we want a vector in \(\R^m\text{.}\)

We should stress one important point about the coefficient isomorphism, however. It depends on the choice of basis, but also on the order of the basis elements. Thus, we generally will work with an ordered basis in this chapter. That is, rather than simply thinking of our basis as a set, we will think of it as an ordered list. Order matters, since given a basis \(B=\basis{e}{n}\text{,}\) we rely on the fact that we can write any vector \(\vv\) uniquely as

\begin{equation*} \vv = c_1\mathbf{e}_1+\cdots +c_n\mathbf{e}_n \end{equation*}

in order to make the assignment \(C_B(\vv) = \bbm c_1\\\vdots \\c_n\ebm\text{.}\)
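To see why the order matters, here is a minimal sketch in \(\R^2\text{.}\) The helper `coords` and the basis vectors `b1`, `b2` are hypothetical names introduced for this example, and SymPy is assumed. Listing the same two basis vectors in the opposite order permutes the entries of the coefficient vector.

```python
from sympy import Matrix

def coords(basis, v):
    """Coefficient vector of v with respect to an ordered basis of R^n."""
    P = Matrix.hstack(*basis)   # columns are the basis vectors, in order
    return P.inv() * v

b1, b2 = Matrix([1, 1]), Matrix([1, -1])
v = Matrix([3, 1])              # v = 2*b1 + 1*b2

print(coords([b1, b2], v).T)    # Matrix([[2, 1]])
print(coords([b2, b1], v).T)    # Matrix([[1, 2]])
```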

Exercise 5.1.2.

Show that the coefficient isomorphism is, indeed, a linear isomorphism from \(V\) to \(\R^n\text{.}\)

Given \(T:V\to W\) and coefficient isomorphisms \(C_B:V\to \R^n, C_D:W\to \R^m\text{,}\) the map \(C_DTC_B^{-1}:\R^n\to \R^m\) is a linear transformation, and the matrix of this transformation gives a representation of \(T\text{.}\) Explicitly, let \(B = \basis{v}{n}\) be an ordered basis for \(V\text{,}\) and let \(D=\basis{w}{m}\) be an ordered basis for \(W\text{.}\) Since \(T(\vv_j)\in W\) for each \(\vv_j\in B\text{,}\) there exist unique scalars \(a_{ij}\text{,}\) with \(1\leq i\leq m\) and \(1\leq j\leq n\text{,}\) such that

\begin{equation*} T(\vv_j) = a_{1j}\ww_1+a_{2j}\ww_2+\cdots + a_{mj}\ww_m \end{equation*}

for \(j=1,\ldots, n\text{.}\) This gives us the \(m\times n\) matrix \(A = [a_{ij}]\text{.}\) Notice that the first column of \(A\) is \(C_D(T(\vv_1))\text{,}\) the second column is \(C_D(T(\vv_2))\text{,}\) and so on.

Given \(\xx\in V\text{,}\) write \(\xx = c_1\vv_1+\cdots + c_n\vv_n\text{,}\) so that \(C_B(\xx) = \bbm c_1\\\vdots \\c_n\ebm\text{.}\) Then

\begin{equation*} T_A(C_B(\xx)) = \bbm a_{11}\amp a_{12} \amp \cdots \amp a_{1n}\\ a_{21}\amp a_{22} \amp \cdots \amp a_{2n}\\ \vdots \amp \vdots \amp \ddots \amp \vdots\\ a_{m1}\amp a_{m2} \amp \cdots \amp a_{mn}\ebm\bbm c_1\\c_2\\ \vdots \\c_n\ebm = \bbm a_{11}c_1+a_{12}c_2+\cdots +a_{1n}c_n\\ a_{21}c_1+a_{22}c_2+\cdots +a_{2n}c_n\\ \vdots\\ a_{m1}c_1+a_{m2}c_2+\cdots +a_{mn}c_n\ebm\text{.} \end{equation*}

On the other hand,

\begin{align*} T(\xx) \amp = T(c_1\vv_1+\cdots + c_n\vv_n) \\ \amp = c_1T(\vv_1)+\cdots + c_nT(\vv_n)\\ \amp = c_1(a_{11}\ww_1+\cdots + a_{m1}\ww_m)+\cdots + c_n(a_{1n}\ww_1+\cdots + a_{mn}\ww_m)\\ \amp = (c_1a_{11}+\cdots + c_na_{1n})\ww_1 + \cdots + (c_1a_{m1}+\cdots + c_na_{mn})\ww_m\text{.} \end{align*}

Therefore,

\begin{equation*} C_D(T(\xx)) = \bbm c_1a_{11}+\cdots + c_na_{1n}\\ \vdots \\ c_1a_{m1}+\cdots + c_na_{mn}\ebm = T_A(C_B(\xx))\text{.} \end{equation*}

Thus, we see that \(C_DT = T_AC_B\text{,}\) or \(T_A = C_DTC_B^{-1}\text{,}\) as expected.

Definition 5.1.3. The matrix \(M_{DB}(T)\) of a linear map.

Let \(V\) and \(W\) be finite-dimensional vector spaces, and let \(T:V\to W\) be a linear map. Let \(B=\basis{v}{n}\) and \(D=\basis{w}{m}\) be ordered bases for \(V\) and \(W\text{,}\) respectively. Then the matrix \(M_{DB}(T)\) of \(T\) with respect to the bases \(B\) and \(D\) is defined by

\begin{equation*} M_{DB}(T) = \bbm C_D(T(\vv_1)) \amp C_D(T(\vv_2)) \amp \cdots \amp C_D(T(\vv_n))\ebm\text{.} \end{equation*}

In other words, \(A=M_{DB}(T)\) is the unique \(m\times n\) matrix such that \(C_DT = T_AC_B\text{.}\) This gives the defining property

\begin{equation*} C_D(T(\vv)) = M_{DB}(T)C_B(\vv) \text{ for all } \vv\in V\text{,} \end{equation*}

as was demonstrated above.

Exercise 5.1.4.

Suppose \(T:P_2(\R)\to \R^2\) is given by

\begin{equation*} T(a+bx+cx^2) = (a+c,2b)\text{.} \end{equation*}

Compute the matrix of \(T\) with respect to the bases \(B = \{1,1-x,(1-x)^2\}\) of \(P_2(\R)\) and \(D = \{(1,0),(1,-1)\}\) of \(\R^2\text{.}\)

When we compute the matrix of a transformation with respect to a non-standard basis, we don’t have to worry about how to write vectors in the domain in terms of that basis. Instead, we simply plug the basis vectors into the transformation, and then determine how to write the output in terms of the basis of the codomain. However, if we want to use this matrix to compute values of \(T:V\to W\text{,}\) then we need a systematic way of writing elements of \(V\) in terms of the given basis.
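The procedure just described can be sketched in code. The example below uses the data of Exercise 5.1.4 and is meant as one possible way to check a hand computation, not as the intended solution. It assumes SymPy and stores polynomials as coefficient vectors with respect to \(\{1,x,x^2\}\text{;}\) the names `T`, `B`, `Q`, `M` are choices made for this sketch.

```python
from sympy import Matrix

def T(p):
    """T(a + b x + c x^2) = (a + c, 2b), acting on the coefficient vector (a, b, c)."""
    a, b, c = p
    return Matrix([a + c, 2*b])

# Ordered basis B = {1, 1 - x, (1 - x)^2}, as coefficient vectors
# with respect to {1, x, x^2}; note that (1 - x)^2 = 1 - 2x + x^2.
B = [Matrix([1, 0, 0]), Matrix([1, -1, 0]), Matrix([1, -2, 1])]

# Ordered basis D = {(1, 0), (1, -1)} of R^2, stored as the columns of Q.
Q = Matrix([[1, 1], [0, -1]])

# Column j of M_DB(T) is C_D(T(v_j)), and C_D is multiplication by Q^{-1}.
M = Matrix.hstack(*[Q.inv() * T(v) for v in B])
print(M)
```

Only the basis vectors of the domain are fed into `T`; no vector of the domain had to be rewritten in terms of \(B\text{.}\)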

Example 5.1.5. Working with the matrix of a transformation.

Let \(T:P_2(\R)\to \R^2\) be a linear transformation whose matrix is given by

\begin{equation*} M(T) = \bbm 3\amp 0 \amp 3\\-1\amp -2\amp 2\ebm \end{equation*}

with respect to the ordered bases \(B = \{1+x, 2-x, 2x+x^2\}\) of \(P_2(\R)\) and \(D = \{(0,1),(-1,1)\}\) of \(\R^2\text{.}\) Find the value of \(T(2+3x-4x^2)\text{.}\)

Solution.

We need to write the input \(2+3x-4x^2\) in terms of the basis \(B\text{.}\) This amounts to solving the system of equations given by

\begin{equation*} a(1+x)+b(2-x)+c(2x+x^2)=2+3x-4x^2\text{.} \end{equation*}

Of course, we can easily set up and solve this system, but let’s try to be systematic, and obtain a more useful result for future problems. Since we can easily determine how to write any polynomial in terms of the standard basis \(\{1,x,x^2\}\text{,}\) it suffices to know how to write these three polynomials in terms of our basis.

At first, this seems like more work. After all, we now have three systems to solve:

\begin{align*} a_1(x+1)+b_1(2-x)+c_1(2x+x^2) \amp =1\\ a_2(x+1)+b_2(2-x)+c_2(2x+x^2) \amp =x\\ a_3(x+1)+b_3(2-x)+c_3(2x+x^2) \amp =x^2\text{.} \end{align*}

However, all three systems have the same coefficient matrix, so we can solve them simultaneously, by adding three “constants” columns to our augmented matrix.

We get the matrix

\begin{equation*} \left[\begin{matrix}1\amp 2\amp 0\\1\amp -1\amp 2\\0\amp 0\amp 1\end{matrix} \right\rvert\left.\begin{matrix}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\end{matrix}\right]\text{.} \end{equation*}

But this is exactly the augmented matrix we’d write down if we were trying to find the inverse of the matrix

\begin{equation*} P=\bbm 1\amp 2\amp 0\\1\amp -1\amp 2\\0\amp 0\amp 1\ebm \end{equation*}

whose columns are the coefficient representations of our given basis vectors in terms of the standard basis.

To compute \(P^{-1}\text{,}\) we use the computer:
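The computation itself is not reproduced in this text. A minimal sketch, assuming SymPy is the computer algebra tool in use (the variable names are illustrative):

```python
from sympy import Matrix

# Columns are the coefficient vectors of 1 + x, 2 - x, 2x + x^2
# with respect to the standard basis {1, x, x^2}.
P = Matrix([[1, 2, 0],
            [1, -1, 2],
            [0, 0, 1]])
P_inv = P.inv()
print(P_inv)  # Matrix([[1/3, 2/3, -4/3], [1/3, -1/3, 2/3], [0, 0, 1]])
```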

Next, we find \(M(T)P^{-1}\text{:}\)
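Again as a sketch under the same SymPy assumption, with \(M(T)\) and \(P\) entered from the statement of the example:

```python
from sympy import Matrix

M = Matrix([[3, 0, 3], [-1, -2, 2]])             # M(T) with respect to B and D
P = Matrix([[1, 2, 0], [1, -1, 2], [0, 0, 1]])   # basis B in standard coordinates

MP_inv = M * P.inv()
print(MP_inv)  # Matrix([[1, 2, -1], [-1, 0, 2]])
```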

This matrix first converts the coefficient vector for a polynomial \(p(x)\) with respect to the standard basis into the coefficient vector for our given basis \(B\text{,}\) and then multiplies by the matrix representing our transformation. The result will be the coefficient vector for \(T(p(x))\) with respect to the basis \(D\text{.}\)

The polynomial \(p(x) = 2+3x-4x^2\) has coefficient vector \(\bbm 2\\3\\-4\ebm\) with respect to the standard basis. We find that \(M(T)P^{-1}\bbm 2\\3\\-4\ebm = \bbm 12\\-10\ebm\text{:}\)
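One way to carry out this product, sketched with SymPy (an assumed tool; the name `p` for the coefficient vector is illustrative):

```python
from sympy import Matrix

M = Matrix([[3, 0, 3], [-1, -2, 2]])
P = Matrix([[1, 2, 0], [1, -1, 2], [0, 0, 1]])
p = Matrix([2, 3, -4])        # 2 + 3x - 4x^2 in standard coordinates

coeffs_D = M * P.inv() * p    # coefficients of T(p) with respect to D
print(coeffs_D.T)             # Matrix([[12, -10]])
```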

The coefficients \(12\) and \(-10\) are the coefficients of \(T(p(x))\) with respect to the basis \(D\text{.}\) Thus,

\begin{equation*} T(2+3x-4x^2) = 12(0,1)-10(-1,1) = (10,2)\text{.} \end{equation*}

Note that in the last step we gave the “simplified” answer \((10,2)\text{,}\) which is simplified primarily in that it is expressed with respect to the standard basis.

Note that we can also introduce the matrix \(Q = \bbm 0\amp -1\\1\amp 1\ebm\) whose columns are the coefficient vectors of the vectors in the basis \(D\) with respect to the standard basis. The effect of multiplying by \(Q\) is to convert from coefficients with respect to \(D\) into a coefficient vector with respect to the standard basis. We can then write a new matrix \(\hat{M}(T) = QM(T)P^{-1}\text{;}\) this new matrix is now the matrix representation of \(T\) with respect to the standard bases of \(P_2(\R)\) and \(\R^2\text{.}\)

We check that

\begin{equation*} \hat{M}(T)\bbm 2\\3\\-4\ebm = \bbm 10\\2\ebm\text{,} \end{equation*}

as before.

We find that \(\hat{M}(T) = \bbm 1\amp 0\amp -2\\0\amp 2\amp 1\ebm\text{.}\) This lets us determine that for a general polynomial \(p(x) = a+bx+cx^2\text{,}\)

\begin{equation*} \hat{M}(T)\bbm a\\b\\c\ebm = \bbm a-2c\\2b+c\ebm\text{,} \end{equation*}

and therefore, our original transformation must have been

\begin{equation*} T(a+bx+cx^2)=(a-2c,2b+c)\text{.} \end{equation*}
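The matrix \(\hat{M}(T)\) and the resulting formula can be checked with a short computation. This is a sketch assuming SymPy; the symbols `a`, `b`, `c` stand for the coefficients of a general polynomial.

```python
from sympy import Matrix, symbols

M = Matrix([[3, 0, 3], [-1, -2, 2]])             # M(T) with respect to B and D
P = Matrix([[1, 2, 0], [1, -1, 2], [0, 0, 1]])   # basis B in standard coordinates
Q = Matrix([[0, -1], [1, 1]])                    # basis D in standard coordinates

M_hat = Q * M * P.inv()
print(M_hat)                            # Matrix([[1, 0, -2], [0, 2, 1]])
print((M_hat * Matrix([2, 3, -4])).T)   # Matrix([[10, 2]])

# Apply M_hat to a general coefficient vector to read off a formula for T.
a, b, c = symbols('a b c')
print((M_hat * Matrix([a, b, c])).T)
```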

The previous example illustrated some important observations that are true in general. We won’t give the general proof, but we sum up the results in a theorem.

Theorem 5.1.6.

Suppose \(T:V\to W\) is a linear transformation, and suppose \(M_0 = M_{D_0B_0}(T)\) is the matrix of \(T\) with respect to bases \(B_0\) of \(V\) and \(D_0\) of \(W\text{.}\) Let \(B_1=\basis{v}{n}\) and \(D_1=\basis{w}{m}\) be any other choice of basis for \(V\) and \(W\text{,}\) respectively. Let

\begin{align*} P \amp =\bbm C_{B_0}(\vv_1) \amp C_{B_0}(\vv_2) \amp \cdots \amp C_{B_0}(\vv_n)\ebm\\ Q \amp =\bbm C_{D_0}(\ww_1) \amp C_{D_0}(\ww_2) \amp \cdots \amp C_{D_0}(\ww_m)\ebm \end{align*}

be matrices whose columns are the coefficient vectors of the vectors in \(B_1,D_1\) with respect to \(B_0,D_0\text{.}\) Then the matrix \(M_{D_1B_1}(T)\) of \(T\) with respect to the bases \(B_1\) and \(D_1\) is related to \(M_0\) by

\begin{equation*} M_{D_0B_0}(T) = QM_{D_1B_1}(T)P^{-1}\text{.} \end{equation*}
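The general proof is not given here, but the identity can be traced through the coefficient isomorphisms. The following is only a sketch. It uses the fact, which follows from the definitions of \(P\) and \(Q\) by linearity, that \(PC_{B_1}(\vv)=C_{B_0}(\vv)\) for all \(\vv\in V\) and \(QC_{D_1}(\ww)=C_{D_0}(\ww)\) for all \(\ww\in W\text{.}\) For any \(\vv\in V\text{,}\)

\begin{align*} M_{D_0B_0}(T)C_{B_0}(\vv) \amp = C_{D_0}(T(\vv))\\ \amp = QC_{D_1}(T(\vv))\\ \amp = QM_{D_1B_1}(T)C_{B_1}(\vv)\\ \amp = QM_{D_1B_1}(T)P^{-1}C_{B_0}(\vv)\text{.} \end{align*}

Since \(C_{B_0}(\vv)\) ranges over all of \(\R^n\) as \(\vv\) ranges over \(V\text{,}\) the two matrices agree. Equivalently, \(M_{D_1B_1}(T) = Q^{-1}M_{D_0B_0}(T)P\text{.}\)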

The relationship between the different maps is illustrated in Figure 5.1.7 below. In this figure, the maps \(V\to V\) and \(W\to W\) are the identity maps, corresponding to representing the same vector with respect to two different bases. The vertical arrows are the coefficient isomorphisms \(C_{B_0},C_{B_1},C_{D_0},C_{D_1}\text{.}\)

In the HTML version of the book, you can click and drag to rotate the figure below.

Figure 5.1.7. Diagramming matrix of a transformation with respect to two different choices of basis

We generally apply Theorem 5.1.6 in the case that \(B_0,D_0\) are the standard bases for \(V,W\text{,}\) since in this case, the matrices \(M_0, P, Q\) are easy to determine. Solving the relation in the theorem for \(M_{D_1B_1}(T)\) gives \(M_{D_1B_1}(T) = Q^{-1}M_0P\text{,}\) and we can use a computer to calculate \(Q^{-1}\) and this product.
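As a sketch of this use of the theorem, take the transformation \(T(a+bx+cx^2)=(a-2c,2b+c)\) found in Example 5.1.5, with \(B_0,D_0\) the standard bases and \(B_1,D_1\) the bases \(B,D\) of that example. Solving the relation in Theorem 5.1.6 for \(M_{D_1B_1}(T)\) gives \(Q^{-1}M_0P\text{,}\) which should recover the matrix \(M(T)\) that the example started from. SymPy is assumed, and the names `M0`, `M1` are illustrative.

```python
from sympy import Matrix

# Standard matrix M_0 of T(a + bx + cx^2) = (a - 2c, 2b + c).
M0 = Matrix([[1, 0, -2], [0, 2, 1]])
# Columns: 1 + x, 2 - x, 2x + x^2 with respect to {1, x, x^2}.
P = Matrix([[1, 2, 0], [1, -1, 2], [0, 0, 1]])
# Columns: (0, 1), (-1, 1) with respect to the standard basis of R^2.
Q = Matrix([[0, -1], [1, 1]])

M1 = Q.inv() * M0 * P
print(M1)  # Matrix([[3, 0, 3], [-1, -2, 2]])
```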

Exercise 5.1.8.

Suppose \(T:M_{22}(\R)\to P_2(\R)\) has the matrix

\begin{equation*} M_{DB}(T) = \bbm 2\amp -1\amp 0\amp 3\\0\amp 4\amp -5\amp 1\\-1\amp 0\amp 3\amp -2\ebm \end{equation*}

with respect to the bases

\begin{equation*} B = \left\{\bbm 1\amp 0\\0\amp 0\ebm, \bbm 0\amp 1\\0\amp 1\ebm, \bbm 0\amp 1\\1\amp 0\ebm, \bbm 1\amp 0\\0\amp 1\ebm\right\} \end{equation*}

of \(M_{22}(\R)\) and \(D=\{1,x,x^2\}\) of \(P_2(\R)\text{.}\) Determine a formula for \(T\) in terms of a general input \(X=\bbm a\amp b\\c\amp d\ebm\text{.}\)

In textbooks such as Sheldon Axler’s Linear Algebra Done Right that focus primarily on linear transformations, the above construction of the matrix of a transformation with respect to choices of bases can be used as a primary motivation for introducing matrices, and determining their algebraic properties. In particular, the rule for matrix multiplication, which can seem peculiar at first, can be seen as a consequence of the composition of linear maps.

Theorem 5.1.9.

Let \(U,V,W\) be finite-dimensional vector spaces, with ordered bases \(B_1,B_2,B_3\text{,}\) respectively. Let \(T:U\to V\) and \(S:V\to W\) be linear maps. Then

\begin{equation*} M_{B_3B_1}(ST) = M_{B_3B_2}(S)M_{B_2B_1}(T)\text{.} \end{equation*}

Proof.

Let \(\xx\in U\text{.}\) Then \(C_{B_3}(ST(\xx)) = M_{B_3B_1}(ST)C_{B_1}(\xx)\text{.}\) On the other hand,

\begin{align*} M_{B_3B_2}(S)M_{B_2B_1}(T)C_{B_1}(\xx) \amp = M_{B_3B_2}(S)(C_{B_2}(T(\xx)))\\ \amp = C_{B_3}(S(T(\xx))) = C_{B_3}(ST(\xx))\text{.} \end{align*}

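Since \(C_{B_1}\) is an isomorphism, every column vector in the domain of these matrices has the form \(C_{B_1}(\xx)\) for some \(\xx\in U\text{.}\) The two matrices therefore agree on every vector, and the result follows.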
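The theorem can be checked numerically on a small example. In the sketch below, the maps \(T\) and \(S\text{,}\) their standard matrices `A` and `C`, and the three ordered bases are all hypothetical, and the helper `matrix_of` is a name introduced here. Each matrix is built directly from the definition, column by column, rather than from a change of basis formula. SymPy is assumed.

```python
from sympy import Matrix

def matrix_of(f, B, D):
    """Return M_DB(f): column j is C_D(f(v_j)) for ordered bases B, D of R^k."""
    D_inv = Matrix.hstack(*D).inv()   # C_D is multiplication by this inverse
    return Matrix.hstack(*[D_inv * f(v) for v in B])

# Hypothetical maps T: R^2 -> R^3 and S: R^3 -> R^2, given by standard matrices.
A = Matrix([[1, 0], [2, 1], [0, -1]])
C = Matrix([[1, 1, 0], [0, 2, 1]])
T = lambda u: A * u
S = lambda v: C * v
ST = lambda u: S(T(u))

# Hypothetical ordered bases B1 of R^2, B2 of R^3, B3 of R^2.
B1 = [Matrix([1, 1]), Matrix([0, 1])]
B2 = [Matrix([1, 0, 0]), Matrix([1, 1, 0]), Matrix([1, 1, 1])]
B3 = [Matrix([2, 1]), Matrix([1, 1])]

lhs = matrix_of(ST, B1, B3)
rhs = matrix_of(S, B2, B3) * matrix_of(T, B1, B2)
print(lhs == rhs)  # True
```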

Being able to express a general linear transformation in terms of a matrix is useful, since questions about linear transformations can be converted into questions about matrices that we already know how to solve. In particular,

\(T:V\to W\) is an isomorphism if and only if \(M_{DB}(T)\) is invertible for some (and hence, all) choice of bases \(B\) of \(V\) and \(D\) of \(W\text{.}\)
The rank of \(T\) is equal to the rank of \(M_{DB}(T)\) (and this does not depend on the choice of basis).
The kernel of \(T\) is isomorphic to the nullspace of \(M_{DB}(T)\text{.}\)
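These facts can be illustrated with the two matrix representations of the transformation from Example 5.1.5. This is a sketch assuming SymPy; `M` is the matrix with respect to \(B\) and \(D\text{,}\) and `M_hat` is the matrix with respect to the standard bases.

```python
from sympy import Matrix

M = Matrix([[3, 0, 3], [-1, -2, 2]])
P = Matrix([[1, 2, 0], [1, -1, 2], [0, 0, 1]])
Q = Matrix([[0, -1], [1, 1]])
M_hat = Q * M * P.inv()

# The rank does not depend on the choice of bases.
print(M.rank(), M_hat.rank())                       # 2 2

# The null spaces have the same dimension (here, 1).
print(len(M.nullspace()), len(M_hat.nullspace()))   # 1 1

# Multiplication by P carries the null space of M into that of M_hat.
n = M.nullspace()[0]
print((M_hat * (P * n)).T)                          # Matrix([[0, 0]])
```

Since neither matrix is square, neither is invertible, which matches the fact that \(T:P_2(\R)\to\R^2\) cannot be an isomorphism.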

Next, we will want to look at two topics in particular. First, if \(T:V\to V\) is a linear operator, then it makes sense to consider the matrix \(M_B(T)=M_{BB}(T)\) obtained by using the same basis for both domain and codomain. Second, we will want to know how this matrix changes if we change the choice of basis.

Exercises

1.

Let \(\mathcal{P}_{n}\) be the vector space of all polynomials of degree \(n\) or less in the variable \(x\text{.}\)

Let \(D : \mathcal{P}_{3} \to \mathcal{P}_{2}\) be the linear transformation defined by \(D(p(x)) = p'(x)\text{.}\) That is, \(D\) is the derivative transformation. Let

\begin{equation*} \begin{array}{lcl} \mathcal{B} \amp = \amp \lbrace 1,6x,9x^2,5x^3 \rbrace, \\ \mathcal{C} \amp = \amp \lbrace 8,6x,7x^2 \rbrace, \end{array} \end{equation*}

be ordered bases for \(\mathcal{P}_{3}\) and \(\mathcal{P}_{2}\text{,}\) respectively. Find the matrix \(M_{\mathcal{B}\mathcal{C}}(D)\) for \(D\) relative to the basis \(\mathcal{B}\) in the domain and \(\mathcal{C}\) in the codomain.

2.

Let \(\mathcal{P}_{n}\) be the vector space of all polynomials of degree \(n\) or less in the variable \(x\text{.}\) Let \(D : \mathcal{P}_{3} \to \mathcal{P}_{2}\) be the linear transformation defined by \(D(p(x)) = p'(x)\text{.}\) That is, \(D\) is the derivative transformation.

Let

\begin{equation*} \begin{array}{lcl} \mathcal{B} \amp = \amp \lbrace {1}, {x}, {x^{2}}, {x^{3}} \rbrace, \\ \mathcal{C} \amp = \amp \lbrace {-2-x-x^{2}}, {2+2x+x^{2}}, {3+3x+2x^{2}} \rbrace, \end{array} \end{equation*}

be ordered bases for \(\mathcal{P}_{3}\) and \(\mathcal{P}_{2}\text{,}\) respectively. Find the matrix \(M_{\mathcal{B}\mathcal{C}}(D)\) for \(D\) relative to the bases \(\mathcal{B}\) in the domain and \(\mathcal{C}\) in the codomain.

3.

Let

\begin{equation*} \begin{array}{lcl} \mathcal{B} \amp = \amp \lbrace {-1+x+x^{2}+x^{3}}, {-1+2x+x^{2}+x^{3}}, {-3+3x+2x^{2}+2x^{3}}, {-4+4x+3x^{2}+2x^{3}} \rbrace, \\ \mathcal{C} \amp = \amp \lbrace {1}, {x}, {x^{2}} \rbrace, \end{array} \end{equation*}

4.

Let \(f : \mathbb{R}^{3} \to \mathbb{R}^{2}\) be the linear transformation defined by

\begin{equation*} f(\vec{x}) = {\left[\begin{array}{ccc} -1 \amp 1 \amp -3\cr 1 \amp 3 \amp -4 \end{array}\right]} \vec{x}. \end{equation*}

Let

\begin{equation*} \begin{array}{lcl} \mathcal{B} \amp = \amp \lbrace {\left\lt 2,-1,1\right>}, {\left\lt 2,-2,1\right>}, {\left\lt 1,-2,1\right>} \rbrace, \\ \mathcal{C} \amp = \amp \lbrace {\left\lt -1,2\right>}, {\left\lt -3,7\right>} \rbrace, \end{array} \end{equation*}

be bases for \(\mathbb{R}^{3}\) and \(\mathbb{R}^{2}\text{,}\) respectively. Find the matrix \(M_{\mathcal{B}\mathcal{C}}(f)\) for \(f\) relative to the bases \(\mathcal{B}\) in the domain and \(\mathcal{C}\) in the codomain.

5.

Let \(f : \mathbb{R}^{2} \to \mathbb{R}^{3}\) be the linear transformation defined by

\begin{equation*} f(\vec{x}) = {\left[\begin{array}{cc} 0 \amp -3\cr 1 \amp -2\cr 3 \amp 2 \end{array}\right]} \vec{x}. \end{equation*}

Let

\begin{equation*} \begin{array}{lcl} \mathcal{B} \amp = \amp \lbrace {\left\lt -1,-2\right>}, {\left\lt -1,-3\right>} \rbrace, \\ \mathcal{C} \amp = \amp \lbrace {\left\lt 2,-1,1\right>}, {\left\lt -2,2,-1\right>}, {\left\lt 3,-3,2\right>} \rbrace, \end{array} \end{equation*}

be bases for \(\mathbb{R}^{2}\) and \(\mathbb{R}^{3}\text{,}\) respectively. Find the matrix \(M_{\mathcal{B}\mathcal{C}}(f)\) for \(f\) relative to the bases \(\mathcal{B}\) in the domain and \(\mathcal{C}\) in the codomain.

Linear Algebra: A second course, featuring proofs and Python