Chapter 3 Change of Basis
3.1 Invertible Matrices
Definition 3.1 A square matrix \(A \in \mathrm{M}_n(\mathbf{K})\) is invertible if there exists a matrix \(B \in \mathrm{M}_n(\mathbf{K})\) such that \(AB = BA = I_n\).
In this case, \(B\) is called the inverse of \(A\) and is denoted \(A^{-1}\).
Example 3.1 Here are some simple examples.
- The matrix \(I_n\) is invertible, with itself as its inverse because \(I_n^2 = I_n\).
- The matrix \(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\) is invertible with inverse \(\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\).
Proposition 3.1 A square matrix \(A \in \mathrm{M}_n(\mathbf{K})\) is invertible if and only if the associated linear transformation is bijective.
Proof. Let \(f\) be the linear transformation associated with the matrix \(A\). Suppose first that \(f\) is bijective. Then there exists an inverse map \(g: \mathbf{K}^n \to \mathbf{K}^n\) such that \(f \circ g = g \circ f = \mathrm{id}\). We first show that \(g\) is linear:
Let \(v_1, v_2 \in \mathbf{K}^n\), and let \(u_1 = g(v_1)\) and \(u_2 = g(v_2)\). Then \(f(u_i) = f \circ g(v_i) = v_i\). Thus,
\[\begin{align*} g(v_1 + v_2) &= g(f(u_1) + f(u_2)) \\ &= g(f(u_1 + u_2)) \\ &= u_1 + u_2 \\ &= g(v_1) + g(v_2). \end{align*}\]
Similarly, for \(\lambda \in \mathbf{K}\),
\[\begin{align*} g(\lambda v_1) &= g(\lambda f(u_1)) \\ &= g(f(\lambda u_1)) \\ &= \lambda u_1 \\ &= \lambda g(v_1). \end{align*}\]
This shows that \(g\) is indeed linear.
Let \(B = \mathrm{Mat}_{\mathcal{B}}(g)\) where \(\mathcal{B}\) is the canonical basis of \(\mathbf{K}^n\). Then
\[\begin{align*} I_n &= \mathrm{Mat}_{\mathcal{B}}(\mathrm{id}) \\ &= \mathrm{Mat}_{\mathcal{B}}(f \circ g) \\ &= \mathrm{Mat}_{\mathcal{B}}(f) \mathrm{Mat}_{\mathcal{B}}(g) \\ &= AB \end{align*}\]
Similarly, using \(g \circ f\), we show that \(BA = I_n\).
Conversely, if there exists a matrix \(B \in \mathrm{M}_n(\mathbf{K})\) such that \(AB = BA = I_n\), then, writing \(g\) for the linear transformation associated with \(B\), the same calculation as above shows that \(f \circ g = \mathrm{id}\) and \(g \circ f = \mathrm{id}\), so \(f\) is bijective.
Proposition 3.2 Let \(A \in \mathrm{M}_n(\mathbf{K})\) be a square matrix. The matrix \(A\) is invertible if and only if the image under \(A\) of a basis of \(\mathbf{K}^n\) is again a basis.
Proof. Assume \(A\) is invertible. Let \((e_1, \dots, e_n)\) be a basis of \(\mathbf{K}^n\). Since the map associated with \(A\) is surjective, for any vector \(Y\) there exists \(X\) such that \(Y = AX\). Writing \(X = \sum_{i=1}^n x_i e_i\) in the basis \((e_1, \dots, e_n)\), this gives
\[Y = \sum_{i=1}^n x_i A e_i\]
Thus, the family \((Ae_i)_{i=1}^n\) is a generating set. Since its cardinality equals the dimension, it forms a basis. Therefore, the image of a basis by an invertible matrix is a basis.
Conversely, assume that the image of a basis is a basis; we treat the case of the canonical basis, whose image is the set of column vectors of \(A\). Since this image forms a basis, any vector \(Y\) can be uniquely expressed as \(Y = \sum_{i=1}^n x_i A_i\), where the \(x_i\) are the coordinates of \(Y\) in the basis of column vectors \((A_1, \dots, A_n)\) of the matrix \(A\). Setting \(X = [x_i]_{i=1}^n\), this translates to \(Y = AX\). Thus the equation \(Y = AX\) with unknown \(X\) has a unique solution for every \(Y\), so the associated linear map is bijective and \(A\) is invertible.
Corollary 3.1 Let \(A \in \mathrm{M}_n(\mathbf{K})\) be a square matrix. The matrix \(A\) is invertible if and only if the set of column vectors of \(A\) forms a basis of \(\mathbf{K}^n\).
Proof. Indeed, the set of columns of \(A\) is the image of the canonical basis under the linear transformation associated with \(A\).
Proposition 3.3 Let \(A\) be a square matrix of size \(n\). The equation \(AX = B\) with \(X \in \mathbf{K}^n\) has a unique solution for all \(B \in \mathbf{K}^n\) if and only if \(A\) is invertible.
In this case, the solution is given by \(X = A^{-1}B\).
Proof. Saying that the equation \(AX = B\) with \(X \in \mathbf{K}^n\) has a unique solution for all \(B \in \mathbf{K}^n\) means exactly that the associated linear transformation is bijective, which we have seen is equivalent to \(A\) being invertible.
In this case, we have
\[X = I_n X = A^{-1} A X = A^{-1} B.\]
Calculating the inverse will thus allow us to solve linear systems in cases where \(A\) is invertible.
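To make this concrete, here is a minimal NumPy sketch (our own illustration; the matrix is the one from Example 3.1 and the right-hand side \(B\) is arbitrary) solving \(AX = B\) both via the inverse and via a direct solver:

```python
import numpy as np

# The invertible matrix of Example 3.1 and an arbitrary right-hand side.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([2.0, 3.0])

# X = A^{-1} B, as in Proposition 3.3.
X = np.linalg.inv(A) @ B

# In numerical practice one rarely forms the inverse explicitly:
# np.linalg.solve solves AX = B directly, which is faster and more stable.
X_direct = np.linalg.solve(A, B)

print(X, X_direct)  # both give [-1.  3.]
```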
Definition 3.2 A square matrix \(T = [t_{i,j}] \in \mathrm{M}_n(\mathbf{K})\) is upper triangular if \(t_{i,j} = 0\) for all \(i > j\).
Visually, this means that the terms below the diagonal are zero.
Example 3.2 The matrix \[A = \begin{bmatrix} 1 & 5 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{bmatrix}\] is upper triangular.
Proposition 3.4 An upper triangular matrix \(T = [t_{i,j}] \in \mathrm{M}_n(\mathbf{K})\) is invertible if and only if all the diagonal elements \(t_{i,i}\) are non-zero.
Proof. We have seen that \(T\) is invertible if and only if for every \(Y \in \mathbf{K}^n\), the system \(TX = Y\) has a unique solution. We will prove the result by induction on \(n\).
Case \(n=1\): The system \(TX = Y\) is simply \(t_{1,1} x_1 = y_1\), which has a unique solution for every \(y_1\) if and only if \(t_{1,1} \neq 0\). In this case, the solution is \(x_1 = \frac{y_1}{t_{1,1}}\).
Case \(n > 1\): Assume the result holds for \(n-1\). The system \(TX = Y\) can be written as
\[\left\{\begin{matrix} t_{1,1} x_1 &+& t_{1,2} x_2 &+& \dots &+& t_{1,n} x_n &=& y_1 \\ && t_{2,2} x_2 &+& \dots &+& t_{2,n} x_n &=& y_2 \\ &&&& \ddots && \vdots &=& \ \vdots \\ &&&&&& t_{n,n} x_n &=& y_n \\ \end{matrix}\right.\]
For this system to have a solution for every \(Y\), it is necessary that \(t_{n,n} \neq 0\), and in this case \(x_n = \frac{y_n}{t_{n,n}}\). Let
\[T' = \begin{bmatrix} t_{1,1} & \dots & \dots & t_{1,n-1} \\ 0 & \ddots && \vdots \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & t_{n-1,n-1} \end{bmatrix},\]
\[X' = \begin{bmatrix} x_1 \\ \vdots \\ x_{n-1} \end{bmatrix},\]
and
\[Y' = \begin{bmatrix} y_1 - t_{1,n} x_n \\ \vdots \\ y_{n-1} - t_{n-1,n} x_n \end{bmatrix} = \begin{bmatrix} y_1 - t_{1,n} \frac{y_n}{t_{n,n}} \\ \vdots \\ y_{n-1} - t_{n-1,n} \frac{y_n}{t_{n,n}} \end{bmatrix}.\]
The system \(TX = Y\) is then equivalent to
\[\left\{\begin{matrix} x_n &= \frac{y_n}{t_{n,n}} \\ T' X' &= Y' \end{matrix}\right.\]
Thus, \(TX = Y\) has a unique solution for every \(Y\) if and only if \(t_{n,n} \neq 0\) and \(T' X' = Y'\) has a unique solution for every \(Y' \in \mathbf{K}^{n-1}\). By the induction hypothesis, the latter holds if and only if all the diagonal elements of \(T'\) are non-zero. Since the diagonal elements of \(T'\) are \(t_{1,1}, \dots, t_{n-1,n-1}\), it follows that \(TX = Y\) has a unique solution for every \(Y\) if and only if all the diagonal elements of \(T\) are non-zero.
Remark. In the proof of Proposition 3.4, we solved the system \(TX = Y\) by substitution. Starting from the last row, we find the value of \(x_n\), then substitute this value into the second-to-last row to find \(x_{n-1}\). We then continue up the system until we find the value of \(x_1\).
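This substitution procedure is straightforward to turn into code. Here is a sketch in Python with NumPy (the function name `back_substitution` is ours); it assumes `T` is upper triangular with a non-zero diagonal:

```python
import numpy as np

def back_substitution(T, Y):
    """Solve TX = Y, with T upper triangular and non-zero diagonal."""
    n = len(Y)
    X = np.zeros(n)
    # Start from the last row and work upwards, as in the proof of
    # Proposition 3.4.
    for i in range(n - 1, -1, -1):
        # Subtract the contribution of the already-computed unknowns,
        # then divide by the diagonal coefficient.
        X[i] = (Y[i] - T[i, i + 1:] @ X[i + 1:]) / T[i, i]
    return X

# The upper triangular matrix of Example 3.2, with an arbitrary Y.
T = np.array([[1.0, 5.0, 4.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
Y = np.array([1.0, 0.0, 2.0])
print(back_substitution(T, Y))  # [ 3. -2.  2.], matches np.linalg.solve(T, Y)
```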
Example 3.3 Consider the matrix
\[A = \begin{bmatrix} 1 & 5 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{bmatrix}.\]
This matrix has all non-zero diagonal elements, so it is invertible. The system \(AX = Y\) is written as
\[\left\{\begin{matrix} x_1 &+& 5 x_2 &+& 4 x_3 &=& y_1 \\ && 2 x_2 &+& 2 x_3 &=& y_2 \\ &&&& x_3 &=& y_3 \end{matrix}\right.\]
We solve the system starting from the last row and work our way up:
\[\left\{\begin{matrix} x_1 + 5 x_2 &=& y_1 - 4 y_3 \\ x_2 &=& \frac{1}{2} y_2 - y_3 \\ x_3 &=& y_3 \end{matrix}\right.\]
\[\left\{\begin{matrix} x_1 &=& y_1 - \frac{5}{2} y_2 + y_3 \\ x_2 &=& \frac{1}{2} y_2 - y_3 \\ x_3 &=& y_3 \end{matrix}\right.\]
Thus, the inverse of the matrix \(A\) is
\[A^{-1} = \begin{bmatrix} 1 & -\frac{5}{2} & 1 \\ 0 & \frac{1}{2} & -1 \\ 0 & 0 & 1 \end{bmatrix}.\]
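This hand computation can be checked numerically; a quick sketch:

```python
import numpy as np

A = np.array([[1.0, 5.0, 4.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
A_inv = np.array([[1.0, -2.5,  1.0],
                  [0.0,  0.5, -1.0],
                  [0.0,  0.0,  1.0]])

# Both products must equal the identity matrix.
print(np.allclose(A @ A_inv, np.eye(3)))  # True
print(np.allclose(A_inv @ A, np.eye(3)))  # True
```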
Remark. From a practical standpoint, calculating the inverse of a matrix is very important. For triangular matrices, this calculation is quite straightforward using the substitution method, as demonstrated in the proof of Proposition 3.4.
For a non-triangular matrix, one approach is to reduce it to a triangular one. This is the Gaussian elimination method (which you have already seen): it first produces a triangular system, which is then solved by substitution.
The determinant of a matrix is a computational tool that helps, among other things, to determine if a square matrix is invertible.
3.2 Determinant
For a square matrix (which corresponds to a map from \(\mathbf{K}^n\) to itself), the determinant lets us check whether the associated transformation is injective and surjective, i.e., bijective.
Definition 3.3 (Determinant of a Square Matrix) Let \(A = [a_{i,j}] \in \mathrm{M}_n(\mathbf{K})\) be a square matrix. The number \(\det(A) \in \mathbf{K}\), called the determinant of \(A\), is defined recursively on \(n\).
Case \(n = 1\) \[\det(A) = a_{1,1}.\]
Case \(n \geq 2\)
The cofactor of indices \((i,j)\), denoted \(C_{i,j}\), is defined as \((-1)^{i+j} \det(\overline{A}_{i,j})\), where \(\overline{A}_{i,j}\) is the submatrix obtained by removing the \(i\)-th row and the \(j\)-th column of \(A\) (its determinant is called the minor of indices \((i,j)\)). The determinant of \(A\), for a fixed \(j\), is given by
\[\det(A) = \sum_{i=1}^n a_{i,j} C_{i,j}.\]
Remark. The result of the determinant formula does not depend on \(j\). The formula for a fixed \(j\) is called the expansion along the \(j\)-th column.
Remark. For \(n=2\), the formula is given by
\[A = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}\]
\[\det(A) = a_{1,1} a_{2,2} - a_{1,2} a_{2,1}.\]
Example 3.4 If \(A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}\), then
\[\det(A) = 1 \times 6 - 2 \times 3 = 0.\]
If
\[A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 2 & 1 \\ 1 & 0 & 2 \end{bmatrix},\]
expanding along the first column, we get
\[\begin{align*} \det(A) &= 1 \times \det\left(\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}\right) - 0 \times \det\left(\begin{bmatrix} 2 & 4 \\ 0 & 2 \end{bmatrix}\right) + 1 \times \det\left(\begin{bmatrix} 2 & 4 \\ 2 & 1 \end{bmatrix}\right) \\ &= 1 \times (2 \times 2 - 0 \times 1) - 0 + 1 \times (2 \times 1 - 2 \times 4) \\ &= 4 - 6 \\ &= -2 \end{align*}\]
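The recursive definition translates directly into code. Below is a sketch (the function name `det_cofactor` is ours) that expands along the first column; as discussed after Theorem 3.1, this method is far too slow for anything but small matrices:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first column."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0
    for i in range(n):
        # Submatrix obtained by removing row i and the first column.
        sub = np.delete(A[:, 1:], i, axis=0)
        # (-1)**i is the sign (-1)**(i+j) for j = 1, with 0-based i.
        total += (-1) ** i * A[i, 0] * det_cofactor(sub)
    return total

A = np.array([[1, 2, 4],
              [0, 2, 1],
              [1, 0, 2]])
print(det_cofactor(A))  # -2, as computed by hand above
```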
Proposition 3.5 For an upper triangular matrix \(T = [t_{i,j}]\), the determinant of \(T\) is the product of the diagonal elements, i.e.,
\[\det(T) = t_{1,1} \times \cdots \times t_{n,n}.\]
Proof. We prove the result by induction on \(n\).
For \(n=1\), the result is obvious since there is only one diagonal element.
Assume the result holds for \(n-1 \geq 1\). We expand the determinant of \(T\) along the first column. The only non-zero term is \(t_{1,1} \det(T')\), where \(T' = [t_{i,j}]_{2 \leq i,j \leq n}\). Since \(T'\) is a triangular matrix of size \(n-1\), by the induction hypothesis, \(\det(T') = t_{2,2} \times \cdots \times t_{n,n}\) and thus \(\det(T) = t_{1,1} \times t_{2,2} \times \cdots \times t_{n,n}\).
By the principle of induction, the result holds for all \(n \geq 1\).
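In code, this reduces to a product over the diagonal; a quick NumPy check on the matrix of Example 3.2:

```python
import numpy as np

T = np.array([[1.0, 5.0, 4.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
# Product of the diagonal elements: 1 * 2 * 1.
print(np.prod(np.diag(T)))  # 2.0
print(np.linalg.det(T))     # 2.0 (up to rounding)
```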
Theorem 3.1 Let \(A\) be a square matrix. The matrix \(A\) is invertible if and only if \(\det(A) \neq 0\).
We will not prove this theorem, as it is a fairly technical result.
Remark. The definition of the determinant is recursive: to calculate a determinant of size \(n\), we reduce it to \(n\) determinants of size \(n-1\). A determinant of size 1 is computed in constant time, so by recursion the cost of computing a determinant from the definition is on the order of \(n \times (n-1) \times \cdots \times 2 \times 1 = n!\) operations.
The inverse of a matrix of size \(n\) can be computed in \(O(n^3)\) using Gaussian elimination.
Thus, it is much more efficient to calculate the inverse of a matrix than to compute its determinant from the definition! In practice, we do not compute determinants from the definition but by other methods. For example, in the Gaussian elimination method, we can stop as soon as we obtain an upper triangular matrix: at this point the determinant is simply the product of the diagonal terms (up to the sign changes caused by row swaps), giving a method with complexity \(O(n^3)\).
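Here is a sketch of this \(O(n^3)\) approach (our own illustration, using partial pivoting; each row swap flips the sign of the determinant, while adding a multiple of one row to another leaves it unchanged):

```python
import numpy as np

def det_gauss(A):
    """Determinant via reduction to upper triangular form, in O(n^3)."""
    U = A.astype(float)
    n = U.shape[0]
    sign = 1.0
    for k in range(n):
        # Partial pivoting: bring up the row with the largest pivot.
        p = k + np.argmax(np.abs(U[k:, k]))
        if U[p, k] == 0.0:
            return 0.0  # the whole sub-column is zero: det(A) = 0
        if p != k:
            U[[k, p]] = U[[p, k]]
            sign = -sign  # a row swap changes the sign of the determinant
        # Eliminate the entries below the pivot.
        U[k + 1:, k:] -= np.outer(U[k + 1:, k] / U[k, k], U[k, k:])
    # Determinant of a triangular matrix: product of the diagonal
    # (Proposition 3.5), corrected by the sign of the row swaps.
    return sign * np.prod(np.diag(U))

A = np.array([[1, 2, 4],
              [0, 2, 1],
              [1, 0, 2]])
print(det_gauss(A))  # -2.0, as in Example 3.4
```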
Remark. For an upper triangular matrix, the determinant is the product of the diagonal terms. The above theorem allows us to recover the previous result: an upper triangular matrix is invertible if and only if its diagonal elements are all non-zero.
3.3 Change of Basis Matrix
Definition 3.4 (change of basis matrix) Let \(\mathcal{B} = (e_i)_{1 \leq i \leq n}\) and \(\mathcal{B}' = (e'_j)_{1 \leq j \leq n}\) be two bases of \(\mathbf{K}^n\). For each \(j\), let \((p_{i,j})_{1 \leq i \leq n}\) be the coordinates of \(e'_j\) in the basis \(\mathcal{B}\). The change of basis matrix from the basis \(\mathcal{B}\) to the basis \(\mathcal{B}'\) is the matrix
\[P = [p_{i,j}]_{1 \leq i,j \leq n}.\]
Remark. This change of basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\) is sometimes denoted \(P_{\mathcal{B}}^{\mathcal{B}'}\).
Remark. In Definition 3.4, we have
\[e'_j = p_{1,j} e_1 + \dots + p_{n,j} e_n.\]
This means that the columns of \(P\) are the vectors \(e'_1, \dots, e'_n\) expressed in the initial basis \(\mathcal{B}\).
If \(X'\) is a vector containing the coordinates \((X'_j)_{1 \leq j \leq n}\) of a vector \(u\) in the basis \(\mathcal{B}'\), then the column vector \(X = PX'\) contains the coordinates of \(u\) in the basis \(\mathcal{B}\).
Indeed, we have \(u = \sum_{j=1}^n X'_j e'_j\), which gives
\[\begin{align*} u &= \sum_{j=1}^n X'_j \left(\sum_{i=1}^n p_{i,j} e_i\right) \\ &= \sum_{i=1}^n \left(\sum_{j=1}^n p_{i,j} X'_j\right) e_i \\ &= \sum_{i=1}^n X_i e_i. \end{align*}\]
Thus, \((X_i)\) are the coordinates of \(u\) in the basis \(\mathcal{B}\).
Another way to express the same idea is to say that \(P = \mathrm{Mat}_{\mathcal{B}',\mathcal{B}}(\mathrm{id})\). Be careful with the order of the bases!
Remark. Any invertible matrix \(A \in \mathrm{M}_n(\mathbf{K})\) can be interpreted as a change of basis matrix, specifically as the change of basis matrix from the canonical basis \(\mathcal{B}\) of \(\mathbf{K}^n\) to the image of \(\mathcal{B}\) under \(A\). Indeed, the columns of \(A\) are the images of the canonical basis vectors under \(A\). We know that this is also a basis by Proposition 3.2.
Example 3.5 If \(\mathcal{B}\) is the canonical basis of \(\mathbf{R}^2\) and \(\mathcal{B}'\) is the basis consisting of the vectors \(e'_1 = \begin{bmatrix}1 \\ 1\end{bmatrix}\) and \(e'_2 = \begin{bmatrix}1 \\ -1\end{bmatrix}\), then the change of basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\) is
\[P = \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix}.\]
A vector with coordinates \(X' = (x', y')\) in the basis \(\mathcal{B}'\) has coordinates \(X = (x, y) = PX'\) in the basis \(\mathcal{B}\), meaning the coordinates are related by the equations
\[\left\{\begin{matrix} x &= x' + y' \\ y &= x' - y' \end{matrix}\right.\]
For example, the vector with coordinates \((5,1)\) in the canonical basis has coordinates \((3,2)\) in the basis \(\mathcal{B}'\).
Proposition 3.6 If \(P\) is the change of basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\), then \(P\) is invertible, and the change of basis matrix from \(\mathcal{B}'\) to \(\mathcal{B}\) is the matrix \(P^{-1}\).
Proof. We have seen that \(P = \mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{id})\), and since the identity map is a bijection, \(P\) is indeed invertible.
Since the inverse of the identity map is the identity itself, we have \[P^{-1} = \mathrm{Mat}_{\mathcal{B}, \mathcal{B}'}(\mathrm{id})\] and thus the change of basis matrix from \(\mathcal{B}'\) to \(\mathcal{B}\) is \(P^{-1}\).
Example 3.6 Let us return to the bases from Example 3.5. Using the inverse formula for \(2 \times 2\) matrices (see Exercise 3.1), the change of basis matrix from \(\mathcal{B}'\) to \(\mathcal{B}\) is \[\begin{align*} P^{-1} &= \frac{-1}{2} \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\ &= \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{bmatrix} \end{align*}\]
Example 3.7 Consider the vector \(u\) with coordinates \((2,1)\) in the basis \(\mathcal{B}'\). What are its coordinates in the basis \(\mathcal{B}\)?
To find this, we calculate \[P \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}.\] Thus, the coordinates of \(u\) in the basis \(\mathcal{B}\) are \((3,1)\).
Conversely, consider \(v\) with coordinates \((2,4)\) in the basis \(\mathcal{B}\). What are its coordinates in the basis \(\mathcal{B}'\)?
To find this, we calculate \[P^{-1} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \end{bmatrix}.\] Thus, the coordinates of \(v\) in the basis \(\mathcal{B}'\) are \((3,-1)\).
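Both conversions are easy to check with NumPy; a quick sketch using the matrices of Examples 3.5 and 3.6:

```python
import numpy as np

# Change of basis matrix from Example 3.5: its columns are e'_1 and
# e'_2 expressed in the canonical basis.
P = np.array([[1.0,  1.0],
              [1.0, -1.0]])
P_inv = np.linalg.inv(P)

# Coordinates in B' -> coordinates in B (Example 3.7, first part).
print(P @ np.array([2.0, 1.0]))      # [3. 1.]
# Coordinates in B -> coordinates in B' (Example 3.7, second part).
print(P_inv @ np.array([2.0, 4.0]))  # [ 3. -1.]
```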
3.4 Change of Basis Formula
Let \(f\) be a linear map from \(\mathbf{K}^n\) to itself. If we know the matrix \(A\) of \(f\) in one basis \(\mathcal{B}\), what is the matrix of \(f\) in another basis \(\mathcal{B}'\)? The following change of basis formula provides the answer.
Proposition 3.7 Let \(f\) be a linear map from \(\mathbf{K}^n\) to itself and let \(\mathcal{B}\) and \(\mathcal{B}'\) be two bases of \(\mathbf{K}^n\). If \(A' = \mathrm{Mat}_{\mathcal{B}'}(f)\), \(A = \mathrm{Mat}_\mathcal{B}(f)\), and \(P\) is the change of basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\), then
\[A = PA'P^{-1},\]
or equivalently,
\[A' = P^{-1}AP.\]
Remark. You can memorize (and forget) this formula, but it is more useful to understand how to derive it.
- The matrix \(A'\) processes vectors expressed in the basis \(\mathcal{B}'\) and returns vectors expressed in the basis \(\mathcal{B}'\).
- The matrix \(A\) processes vectors expressed in the basis \(\mathcal{B}\) and returns vectors expressed in the basis \(\mathcal{B}\).
- The matrix \(P\) processes vectors expressed in the basis \(\mathcal{B}'\) and returns vectors expressed in the basis \(\mathcal{B}\).
- The matrix \(P^{-1}\) processes vectors expressed in the basis \(\mathcal{B}\) and returns vectors expressed in the basis \(\mathcal{B}'\).
Thus, reading the change of basis formula from right to left, we get the following diagram:
\[\mathcal{B} \overset{A}{\longleftarrow} \mathcal{B} = \mathcal{B} \overset{P}{\longleftarrow} \mathcal{B}' \overset{A'}{\longleftarrow} \mathcal{B}' \overset{P^{-1}}{\longleftarrow} \mathcal{B}.\]
Proof. Let \(u\) be a vector in \(\mathbf{K}^n\), and let \(X'\) be the column vector of its coordinates in the basis \(\mathcal{B}'\). Then \(PX'\) is the column vector of the coordinates of \(u\) in the basis \(\mathcal{B}\), and \(APX'\) is the column vector of the coordinates of \(f(u)\) in the basis \(\mathcal{B}\). Finally, \(P^{-1}APX'\) is the column vector of the coordinates of \(f(u)\) in the basis \(\mathcal{B}'\). Thus, \(P^{-1}AP\) is indeed the matrix of the linear map \(f\) in the basis \(\mathcal{B}'\), that is, \(A' = P^{-1}AP\).
Example 3.8 Let’s revisit the bases \(\mathcal{B}\) and \(\mathcal{B}'\) of \(\mathbf{R}^2\), where \(\mathcal{B}\) is the canonical basis and \(\mathcal{B}'\) is the basis consisting of the vectors \(e'_1 = (1,1)\) and \(e'_2 = (1,-1)\).
Let \(A' = \begin{bmatrix}1 & 0 \\ 0 & 2\end{bmatrix}\) be the matrix of a linear transformation \(f \colon \mathbf{R}^2 \to \mathbf{R}^2\) in the basis \(\mathcal{B}'\). The matrix \(A\) of \(f\) in the basis \(\mathcal{B}\) is therefore
\[\begin{align*} A &= PA'P^{-1} \\ &= \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix} \begin{bmatrix}1 & 0 \\ 0 & 2\end{bmatrix} P^{-1} \\ &= \begin{bmatrix}1 & 2 \\ 1 & -2\end{bmatrix} P^{-1} \\ &= \begin{bmatrix}1 & 2 \\ 1 & -2\end{bmatrix} \times \frac{1}{2} \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix} \\ &= \begin{bmatrix}3/2 & -1/2 \\ -1/2 & 3/2\end{bmatrix} \end{align*}\]
In the basis \(\mathcal{B}'\), the matrix of \(f\) is much simpler than in the canonical basis. If we perform calculations with \(f\), it is advantageous to do them in the basis \(\mathcal{B}'\).
Example 3.9 Continuing with the same change of bases, now if \(A = \begin{bmatrix}3/2 & -1/2 \\ 1/2 & 1/2\end{bmatrix}\), we calculate \(A'\) as follows.
\[\begin{align*} A' &= P^{-1}AP \\ &= \frac{1}{2} \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix} \begin{bmatrix}3/2 & -1/2 \\ 1/2 & 1/2\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix} \\ &= \frac{1}{2} \begin{bmatrix}2 & 0 \\ 1 & -1\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix} \\ &= \frac{1}{2} \begin{bmatrix}2 & 2 \\ 0 & 2\end{bmatrix} \\ &= \begin{bmatrix}1 & 1 \\ 0 & 1\end{bmatrix} \end{align*}\]
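Once more, a quick NumPy check of both computations (Examples 3.8 and 3.9):

```python
import numpy as np

P = np.array([[1.0,  1.0],
              [1.0, -1.0]])
P_inv = np.linalg.inv(P)

# Example 3.8: from A' (basis B') to A (canonical basis).
A_prime = np.array([[1.0, 0.0],
                    [0.0, 2.0]])
print(P @ A_prime @ P_inv)   # [[ 1.5 -0.5]
                             #  [-0.5  1.5]]

# Example 3.9: from A (canonical basis) to A' (basis B').
A = np.array([[1.5, -0.5],
              [0.5,  0.5]])
print(P_inv @ A @ P)         # [[1. 1.]
                             #  [0. 1.]]
```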
3.5 Exercises
Exercise 3.1 Let \(A = \begin{bmatrix}a & b \\ c & d\end{bmatrix} \in \mathrm{M}_2(\mathbf{K})\).
- Show that if \(ad - bc \neq 0\), then \(A\) is invertible with inverse \[A^{-1} = \frac{1}{ad - bc} \begin{bmatrix}d & -b \\ -c & a\end{bmatrix}.\]
- Calculate the inverse of the matrix \[\begin{bmatrix}1 & 3 \\ 5 & 7\end{bmatrix}.\]
- Bonus: Revisit the puzzle from the introduction of the course and solve the problem by inverting a certain matrix.
Exercise 3.2 Calculate the determinants of the following matrices and state whether they are invertible or not.
- \(\begin{bmatrix}1 & 2 & 3 \\ 0 & 1 & -1 \\ 1 & 0 & 2\end{bmatrix}\)
- \(\begin{bmatrix}1 & 2 & 1 & 0 \\ 0 & 1 & -1 & 1 \\ 1 & 0 & 2 & 1 \\ 0 & 1 & 2 & 0\end{bmatrix}\)
- \(\begin{bmatrix}1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1\end{bmatrix}\) (distinguish between the cases \(\mathbf{K} = \mathbf{R}\) and \(\mathbf{K} = \mathbf{F}_2\)).
Exercise 3.3 Consider the following three vectors in \(\mathbf{R}^3\): \[u_{1} = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}, \quad u_{2} = \begin{bmatrix} -1 \\ 3 \\ 1 \end{bmatrix}, \quad u_{3} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\]
- Show that \(\mathcal{B}' = (u_{1}, u_{2}, u_{3})\) is a basis of \(\mathbf{R}^3\).
- Determine \(P_{\mathcal{B}}^{\mathcal{B}'}\), the change of basis matrix from the canonical basis \(\mathcal{B}\) to the basis \(\mathcal{B}'\).
- Calculate the change of basis matrix from \(\mathcal{B}'\) to \(\mathcal{B}\).
- Let \(v = u_{1} + 2u_{2} + 3u_{3}\). What are the coordinates of \(v\) in the canonical basis?
- Let \(w = e_{1} + 2e_{2} + 3e_{3}\) in the canonical basis. What are the coordinates of \(w\) in the basis \(\mathcal{B}'\)?
Exercise 3.4 Consider the space \(\mathbf{R}^3\) with the canonical basis \(\mathcal{B} = (e_1, e_2, e_3)\).
- Show that the vectors \(e'_1 = (2, -1, -2)\), \(e'_2 = (1, 0, -1)\), and \(e'_3 = (-2, 1, 3)\) form a basis \(\mathcal{B}'\) of \(\mathbf{R}^3\).
- Provide the change of basis matrices between the bases.
- Let \(f\) be the linear map associated with the matrix \(A = \begin{bmatrix}9 & -6 & 10 \\ -5 & 2 & -5 \\ -12 & 6 & -13\end{bmatrix}\). Calculate the matrix of \(f\) in the basis \(\mathcal{B}'\).
Exercise 3.5 (Lights Out) Lights Out is a puzzle game consisting of a square grid of size \(n \times n\) where the cells are either on or off. Pressing a cell changes the state (on or off) of the cell and its vertical or horizontal neighbors. The goal is to turn off all the lights from an initial configuration. You can play online at https://daattali.com/shiny/lightsout/ or https://www.artbylogic.com/puzzles/gridLights/gridLights.htm. There are also versions available on the iOS and Android stores.
We represent a configuration by a matrix in \(\mathrm{M}_n(\mathbf{F}_2)\) where a 1 means the corresponding cell is on and a 0 means it is off. We start by studying the game in any dimension \(n \in \mathbf{N}\), then focus on the case \(n = 2\). The first two questions help to recall how to represent a matrix as a column vector.
- Provide the column vectors in the canonical basis of \(\mathrm{M}_2(\mathbf{F}_2)\) corresponding to the following matrices. \[\begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix}, \quad \begin{bmatrix}1 & 0 \\ 1 & 0\end{bmatrix}, \quad \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}.\]
- Conversely, provide the matrices corresponding to the following column vectors. \[\begin{bmatrix}0 \\ 1 \\ 0 \\ 1\end{bmatrix}, \quad \begin{bmatrix}1 \\ 1 \\ 1 \\ 0\end{bmatrix}.\]
Let \(C_{i,j}\) be the matrix in \(\mathrm{M}_n(\mathbf{F}_2)\) with zeros everywhere except in positions \((i,j)\), \((i \pm 1, j)\), and \((i, j \pm 1)\) as long as \(i \pm 1, j \pm 1 \in \{1, \dots, n\}\).
- Justify that if the game is in the configuration \(A \in \mathrm{M}_n(\mathbf{F}_2)\), then after pressing the cell at coordinates \((i, j)\), the new configuration is \(A + C_{i,j}\).
- Justify that the order in which we press the cells does not matter. What property did you use?
- Justify that to solve the puzzle, it is sufficient to press each cell at most once.
Let \(A \in \mathrm{M}_n(\mathbf{F}_2)\) be an initial configuration. A solution for \(A\) is a matrix \(S = [s_{i,j}] \in \mathrm{M}_n(\mathbf{F}_2)\) such that pressing the cells \((i, j)\) where \(s_{i,j} = 1\) turns off all the lights.
- Verify that if \(A = \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix}\), then \(S = \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix}\) is a solution for \(A\). Similarly, if \(A = \begin{bmatrix}0 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1\end{bmatrix}\), then \(S = \begin{bmatrix}0 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 1\end{bmatrix}\) is a solution for \(A\).
- Show that if \(S\) is a solution for \(A\), then \(A = \sum_{1 \leq i, j \leq n} s_{i,j} C_{i,j}\).
- Justify that the puzzle can be solved for any initial configuration \(A \in \mathrm{M}_n(\mathbf{F}_2)\) if and only if the vector space spanned by \(\{C_{i,j}\}_{1 \leq i, j \leq n}\) is all of \(\mathrm{M}_n(\mathbf{F}_2)\).
- Provide the matrix \(P \in \mathrm{M}_4(\mathbf{F}_2)\) whose columns are the column vectors \(C_{1,1}, C_{2,1}, C_{1,2}, C_{2,2}\) expressed in the canonical basis of \(\mathrm{M}_2(\mathbf{F}_2)\).
- Calculate \(P^2\).
- Justify that the matrix \(P\) is invertible and provide its inverse.
- How can \(P\) be interpreted as a change of basis matrix?
- Show that for any initial configuration \(A \in \mathrm{M}_2(\mathbf{F}_2)\), there exists a unique solution. How can it be obtained using \(P\)?
- Provide the solutions for the following initial configurations:
\[\begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix}, \quad \begin{bmatrix}1 & 0 \\ 1 & 0\end{bmatrix}, \quad \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}.\]
Exercise 3.6 Let \(A\) be an invertible square matrix of size \(n\). Starting from the relation \(AA^{-1} = I_n\), prove that \(^tA\) is also invertible and that \(^t(A^{-1}) = (^tA)^{-1}\).
Deduce that a square matrix \(A\) is invertible if and only if \(^tA\) is invertible, if and only if the rows of \(A\) form a basis of \(\mathbf{K}^n\).
Exercise 3.7 Let \(e_1, e_2, e_3\) be three vectors forming a basis of \(\mathbf{R}^3\). Define the linear map \(\phi\) by \(\phi(e_1) = e_3\), \(\phi(e_2) = -e_1 + e_2 + e_3\), and \(\phi(e_3) = e_3\).
- Write the matrix \(A\) of \(\phi\) in the basis \((e_1, e_2, e_3)\).
- Let \(f_1 = e_1 - e_3\), \(f_2 = e_1 - e_2\), and \(f_3 = -e_1 + e_2 + e_3\). Calculate \(e_1, e_2, e_3\) in terms of \(f_1, f_2, f_3\). Do the vectors \(f_1, f_2, f_3\) form a basis of \(\mathbf{R}^3\)?
- Calculate \(\phi(f_1), \phi(f_2), \phi(f_3)\) in terms of \(f_1, f_2, f_3\). Write the matrix \(B\) of \(\phi\) in the basis \((f_1, f_2, f_3)\) and determine the nature of the map \(\phi\).
- Let \(P = \begin{bmatrix}1 & 1 & -1 \\ 0 & -1 & 1 \\ -1 & 0 & 1\end{bmatrix}\). Why is \(P\) invertible? What matrix do you recognize?
- Calculate \(P^{-1}\). What relation links \(A\), \(B\), \(P\), and \(P^{-1}\)?