Triangularization: My Last Memento of Linear Algebra
ABSTRACT. Schur triangularization is a powerful tool in linear algebra: it implies the spectral decomposition, Cayley-Hamilton, the removal rule, and the Jordan canonical form over the complex number field. With the help of algebraic closure, ordinary triangularization serves to generalize these results. The spirit of simultaneous triangularization is explored through problem-solving, and several relevant theorems from Lie algebra are recalled. Finally, we consider the triangularization of matrices over a PID. A review of complexification, algebraic closure, Gramians and finitely generated modules over a PID is given in the appendices. As the goal is to go over linear algebra through one topic, many preliminary results are recalled and some interesting digressions are inserted.
Schur Triangularization
Generalized Schur’s Theorem: Statement
THEOREM. Let \(V\) be a nonzero finite-dimensional real inner product space and \(T\) a linear operator on \(V\).
I) There exists an orthonormal basis \(\beta\) for \(V\) such that
\[[T]_{\beta}=\begin{pmatrix}A_1&&&&&*\\&\ddots&&&&\\&&A_q&&&\\&&&c_1&&\\&&&&\ddots&\\&&&&&c_p\end{pmatrix}\]
is a block upper triangular matrix, where \(A_j\in M_{2\times 2}(\mathbb{R})\ (j=1,\cdots,q)\) and \(c_j\in \mathbb{R}\ (j=1,\cdots,p)\).
II) Moreover, if \(T\) is normal, then there exists an orthonormal basis \(\beta\) such that
\[[T]_{\beta}=\begin{pmatrix}A_1&&&&&*\\&\ddots&&&&\\&&A_q&&&\\&&&c_1&&\\&&&&\ddots&\\&&&&&c_p\end{pmatrix},\qquad A_j=\begin{pmatrix}a_j&-b_j\\b_j&a_j\end{pmatrix},\]
is a block upper triangular matrix, where the multiset \(\{c_1,\cdots,c_p,a_1\pm ib_1,\cdots,a_q\pm ib_q\}\) with \(b_j\neq 0\ (j=1,\cdots,q)\) is the spectrum of the complexification of \(T\).
III) In particular, if \(T\) is orthogonal, then there exists an orthonormal basis \(\beta\) for \(V\) such that
\[[T]_{\beta}=\begin{pmatrix}A_1&&&&&\\&\ddots&&&&\\&&A_q&&&\\&&&\varepsilon_1&&\\&&&&\ddots&\\&&&&&\varepsilon_p\end{pmatrix},\qquad A_j=\begin{pmatrix}\cos\theta_j&-\sin\theta_j\\ \sin\theta_j&\cos\theta_j\end{pmatrix},\]
is a block diagonal matrix, where \(\theta_j\in \mathbb{R}\setminus\{k\pi:k\in \mathbb{Z}\}\ (j=1,\cdots,q)\) and \(\varepsilon_j=\pm 1\ (j=1,\cdots,p)\). That is to say, any orthogonal operator on a nonzero finite-dimensional real inner product space is the composite of rotations and reflections (the order does not matter), and the space can be decomposed as a direct sum of pairwise orthogonal \(1\)- or \(2\)-dimensional subspaces that are invariant under the operator.
In each case above, the \(q\)-tuple of second-order blocks on the diagonal can be arranged into any of its permutations, e.g., \((A_1,\cdots,A_q)\) can be arranged as \((A_{\sigma(1)},\cdots,A_{\sigma(q)})\), where \(\sigma\) is any given permutation. A similar statement holds for the \(p\)-tuple of first-order blocks on the diagonal. However, if we change the order of the basis \(\beta\) in I) so that the \(q\)-tuple and the \(p\)-tuple are mixed, e.g., \(\text{diag}([T]_{\beta})=(A_1,c_1,A_2,c_2,\cdots)\), then \([T]_{\beta}\) is no longer guaranteed to be a block upper triangular matrix. The same goes for II).
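Numerically, part I) is realized by the real Schur decomposition. Here is a quick sanity check with SciPy (a sketch; the random test matrix, seed, and tolerances are our own choices, and LAPACK returns the \(1\times 1\) and \(2\times 2\) diagonal blocks in the order it encounters them rather than grouped, which is harmless by the permutation remark above):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))   # a generic real matrix

# Real Schur form: A = Z T Z^T with Z orthogonal and T quasi-upper-triangular,
# i.e. 1x1 blocks (real eigenvalues c_j) and 2x2 blocks (the A_j) on the diagonal.
T, Z = schur(A, output='real')

assert np.allclose(Z @ T @ Z.T, A)         # orthogonal change of basis
assert np.allclose(Z.T @ Z, np.eye(6))     # Z is in O(6)
assert np.allclose(np.tril(T, -2), 0)      # zero below the first subdiagonal
```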
Generalized Schur’s Theorem: Proof
We focus on proving I) and add in the proof the key observations of II), after which III) follows immediately.
Proof of I). Induction on \(n:=\dim V\). The theorem is trivial when \(n=1\), so we may assume that \(n\ge 2\). Assume that the theorem is true for any real inner product space of dimension less than \(n\).
Case 1: \(T\) has an eigenvalue \(\lambda\).
Since \(\ker(T^*-\lambda I)=\text{im}(T-\lambda I)^{\perp}\), we have
\[\dim\ker(T^*-\lambda I)=n-\text{rank}(T-\lambda I)=\dim\ker(T-\lambda I)\ge 1,\]
and thus there exists a unit vector \(z\) such that \(T^*z=\lambda z\). Define \(W:=\text{span}(\{z\})\). Then \(W\) is \(T^*\)-invariant, and so \(W^{\perp}\) is \(T\)-invariant, with \(\dim W^{\perp}=n-1\). By induction hypothesis, there exists an orthonormal basis \(\gamma\) for \(W^{\perp}\) such that \([T_{W^{\perp}}]_{\gamma}\) is of the stated form. Define \(\beta:=\gamma\cup \{z\}\). Then \(\beta\) is an orthonormal basis for \(V\) such that
is of the stated form, with \([T]_{\beta}(n,n)=\langle T(z),z\rangle=\langle z,T^*(z)\rangle=\langle z,\lambda z\rangle=\lambda\).
Case 2: \(T\) has no eigenvalue.
In this case, \(T^*\) has no eigenvalue either. Let \(x=x_1\otimes 1+x_2\otimes i\ (x_1,x_2\in V)\) be an eigenvector of \((T^*)_{\mathbb{C}}\) with the corresponding eigenvalue \(\overline{\lambda}=\lambda_1+i\lambda_2\ (\lambda_1,\lambda_2\in \mathbb{R})\). Clearly, \(\lambda_2\neq 0\) and \(x_1,x_2\) are linearly independent in \(V\). Also, we have
\[T^*(x_1)=\lambda_1x_1-\lambda_2x_2,\qquad T^*(x_2)=\lambda_2x_1+\lambda_1x_2.\]
Define \(W:=\text{span}(\{x_1,x_2\})\). Then \(W\) is a \(2\)-dimensional \(T^*\)-invariant subspace, and so \(W^{\perp}\) is an \((n-2)\)-dimensional \(T\)-invariant subspace. By the induction hypothesis, there exists an orthonormal basis \(\gamma\) for \(W^{\perp}\) such that \([T_{W^{\perp}}]_{\gamma}\) is of the stated form. Let \(\{x'_1,x'_2\}\) be an orthonormal basis for \(W\), and define \(\beta':=\gamma\cup\{x'_1,x'_2\}\). Then \(\beta'\) is an orthonormal basis for \(V\) and
is of the stated form. Therefore I) is proved.
Proof of II). Now we assume that \(T\) is normal. Thanks to the identity \((T_{\mathbb{C}})^*=(T^*)_{\mathbb{C}}\), \(T_{\mathbb{C}}\) is normal as well, so \(\ker((T_{\mathbb{C}})^*-\overline{\lambda} I)=\ker(T_{\mathbb{C}}-\lambda I)\). Therefore, we have
\[T_{\mathbb{C}}(x)=\lambda x,\quad \text{i.e.,}\quad T(x_1)=\lambda_1x_1+\lambda_2x_2,\ \ T(x_2)=-\lambda_2x_1+\lambda_1x_2.\]
Consequently, \(W\) is \(T\)-invariant as well and \([T_{W}]_{\{x_1,x_2\}}=\begin{pmatrix}\lambda_1 & -\lambda_2\\ \lambda_2 & \lambda_1 \end{pmatrix}\). Note that \(x_1\otimes 1-x_2\otimes i\) is an eigenvector of \(T_{\mathbb{C}}\) corresponding to \(\overline{\lambda}\neq\lambda\), and eigenvectors of a normal operator corresponding to distinct eigenvalues are orthogonal; expanding
\[\langle x_1\otimes 1+x_2\otimes i,\ x_1\otimes 1-x_2\otimes i\rangle_{\mathbb{C}}=0\]
yields \(\|x_1\|=\|x_2\|\) and \(\langle x_1,x_2\rangle=0\).
Without loss of generality, we may assume that \(\|x_1\|=\|x_2\|=1\). Define \(\beta:=\gamma\cup \{x_1,x_2\}\); then \(\beta\) is an orthonormal basis for \(V\) such that
as desired.
Proof of III). Note that if a block upper triangular matrix \(\begin{pmatrix}A & C\\O & B\end{pmatrix}\) is unitary, then \(A^*A=I,B^*B+C^*C=I\) and \(BB^*=I\), implying that \(A,B\) are unitary and \(C=O\). Now III) follows from II) and the fact above, recalling that the eigenvalues of a unitary matrix are all of modulus one. \(\blacksquare\)
Remark. In fact, if the block upper triangular matrix \(\begin{pmatrix}A & C\\O & B\end{pmatrix}\) is normal, then we have \(A^*A=AA^*+CC^*\), and so \(\text{tr}(CC^*)=0\), implying that \(C=O\), and consequently both \(A\) and \(B\) are normal.
Corollary 1: The Spectral Theorem
From the proof of I), we see that any linear operator on a nonzero finite-dimensional complex inner product space is unitarily triangularizable. This is Schur’s theorem. Using the fact in III), it follows that a linear operator on a nonzero finite-dimensional complex inner product space is unitarily diagonalizable iff it is normal. Moreover, if a linear operator on a finite-dimensional real inner product space is self-adjoint, then it is normal and hence has the matrix representation
with respect to some orthonormal basis. But the matrix is self-adjoint, and thus it is actually diagonal. Thus we obtain the spectral theorem from the generalized Schur’s theorem. \(\blacksquare\)
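As a numerical illustration of the normal case (a sketch; the Hermitian test matrix and the seed are our own choices), the complex Schur triangle of a normal matrix collapses to a diagonal, which is exactly the spectral theorem:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                        # Hermitian, hence normal

# Complex Schur form: A = Z T Z* with Z unitary and T upper triangular.
T, Z = schur(A, output='complex')

assert np.allclose(Z @ T @ Z.conj().T, A)
# For a normal matrix the triangular factor is in fact diagonal.
assert np.allclose(T, np.diag(np.diag(T)), atol=1e-8)
```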
Remark 1 (Schur’s Inequality). Let \(A\in M_{n\times n}(\mathbb{C})\) and \(\lambda_i\ (i=1,\cdots,n)\) be the eigenvalues of \(A\). Denote by \(\|\cdot\|_F\) the Frobenius norm. By Schur’s theorem, we have
\[\sum_{i=1}^{n}|\lambda_i|^2\le \|A\|_F^2,\]
with equality iff \(A\) is normal. This is Schur’s inequality. In fact, we can derive the following equality:
\[\|A\|_F^2=\sum_{i=1}^{n}|\lambda_i|^2+\sum_{i<j}|t_{ij}|^2,\]
where \((t_{ij})\) is any upper triangular matrix unitarily equivalent to \(A\). Therefore, every normal operator has the minimal Frobenius norm in its similarity class. (Note that two similar normal operators are automatically unitarily equivalent and hence have the same Frobenius norm.) Conversely, if a matrix minimizes the Frobenius norm in its similarity class, then it must be normal.
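The inequality is easy to probe numerically (a sketch; the random matrix and the Hermitian example of a normal matrix are our own choices):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

sum_eigs = np.sum(np.abs(np.linalg.eigvals(A)) ** 2)   # sum of |lambda_i|^2
fro_sq = np.linalg.norm(A, 'fro') ** 2                 # ||A||_F^2

assert sum_eigs <= fro_sq + 1e-9                       # Schur's inequality

# Equality for a normal matrix, e.g. the Hermitian A + A*.
N = A + A.conj().T
assert np.isclose(np.sum(np.abs(np.linalg.eigvals(N)) ** 2),
                  np.linalg.norm(N, 'fro') ** 2)
```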
Remark 2 (Digression: Low-Rank Approximation). Let \(A\in M_{m\times n}(\mathbb{C})\) and \(\sigma_1\ge \cdots \ge \sigma_k\ge \cdots\ge \sigma_r> 0\) be the nonzero singular values of \(A\), where \(r=\text{rank}(A)\) and \(1\le k\le r\). Then we have
\[\inf_{\text{rank}(B)\le k}\|A-B\|_F=\Big(\sum_{i=k+1}^{r}\sigma_i^2\Big)^{1/2}.\]
(Also, note that \(\|A\|_F^2=\sum_{i=1}^{r}\sigma_i^2\).) If \(A=U\Sigma V^*\) is an SVD such that
\[\Sigma=\begin{pmatrix}\text{diag}(\sigma_1,\cdots,\sigma_r) & O\\ O & O\end{pmatrix}\in M_{m\times n}(\mathbb{R}),\]
then \(\widehat{A}=U\widehat{\Sigma}V^*\) achieves the infimum, where
\[\widehat{\Sigma}=\begin{pmatrix}\text{diag}(\sigma_1,\cdots,\sigma_k) & O\\ O & O\end{pmatrix}\in M_{m\times n}(\mathbb{R}).\]
Moreover, if \(\sigma_k\neq \sigma_{k+1}\), then the minimizer is unique. This is the Eckart–Young–Mirsky theorem for the Frobenius norm. In fact, \(\widehat{A}\) is also the best rank-\(k\) approximation to \(A\) in the spectral norm, and
\[\|A-\widehat{A}\|_2=\sigma_{k+1}.\]
The proof can be found on Wikipedia.
Corollary 2: Cayley-Hamilton and Removal Rule
Proof of Cayley-Hamilton. By Schur’s theorem, it suffices to prove Cayley-Hamilton for any complex upper triangular matrix \(A=(a_{ij})_{n\times n}\).
Note that the characteristic polynomial of \(A\) is \(f(t)=\prod_{k=1}^{n}(t-a_{kk})\) and hence \(f(A)=\prod_{k=1}^{n}(A-a_{kk}I)\). We prove by induction that the first \(l\) columns of the matrix \(B_{l}=\prod_{k=1}^{l}(A-a_{kk}I)\) are all \(0\), for all \(1\le l\le n\), and then conclude that \(f(A)=B_n=O\).
When \(l=1\), this is obvious since \(A\) is upper triangular. Assume that the result is true for \(l-1\), i.e., the first \(l-1\) columns of \(B_{l-1}=\prod_{k=1}^{l-1}(A-a_{kk}I)\) are all \(0\). Then \(\forall 1\le i\le n\) and \(\forall 1\le j\le l\), we have
\[B_{l}(i,j)=\sum_{k=1}^{n}B_{l-1}(i,k)\,(A-a_{ll}I)(k,j)=\underbrace{\sum_{k=1}^{l-1}B_{l-1}(i,k)(A-a_{ll}I)(k,j)}_{(1)}+\underbrace{\sum_{k=l}^{n}B_{l-1}(i,k)(A-a_{ll}I)(k,j)}_{(2)}.\]
Since \(\forall 1\le k\le l-1,\ B_{l-1}(i,k)=0\) (induction hypothesis), and \(\forall l\le k\le n,\ (A-a_{ll}I)(k,j)=0\) (upper triangularity, as \(j\le l\le k\) and the \((l,l)\) entry of \(A-a_{ll}I\) vanishes), both \((1)\) and \((2)\) are zero, and so \(B_{l}(i,j)=0\). Therefore, the first \(l\) columns of \(B_l\) are all \(0\). \(\blacksquare\)
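The identity \(f(A)=O\) can be verified numerically by evaluating the characteristic polynomial at \(A\) with Horner's scheme (a sketch; `np.poly` returns the characteristic coefficients with the leading term first, and the random matrix is our own choice):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))

# Coefficients of the characteristic polynomial det(tI - A), leading term first.
coeffs = np.poly(A)

# Evaluate f(A) with Horner's scheme on matrices.
f_A = np.zeros_like(A)
for c in coeffs:
    f_A = f_A @ A + c * np.eye(5)

assert np.allclose(f_A, np.zeros((5, 5)), atol=1e-8)   # Cayley-Hamilton: f(A) = O
```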
Thanks to Schur’s theorem, we can prove the following lemma without using the Jordan canonical form.
Lemma. Let \(A\in M_{n\times n}(\mathbb{C})\) and \(\text{Spec}(A)=\{\lambda_1,\cdots,\lambda_n\}\) (multiset). Then for any polynomial \(f\) over \(\mathbb{C}\), we have \(\text{Spec}(f(A))=\{f(\lambda_1),\cdots,f(\lambda_n)\}\). \(\blacksquare\)
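A numerical check of the lemma (a sketch; we take a symmetric matrix so that the spectra are real and can be compared after sorting, and the polynomial \(f(t)=t^3-2t+1\) is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4))
A = B + B.T                              # symmetric: real eigenvalues, easy to sort

f_A = A @ A @ A - 2 * A + np.eye(4)      # f(t) = t^3 - 2t + 1 applied to A

lam = np.linalg.eigvals(A).real
spec_fA = np.sort(np.linalg.eigvals(f_A).real)

# Spec(f(A)) = f(Spec(A)) as multisets.
assert np.allclose(spec_fA, np.sort(lam ** 3 - 2 * lam + 1))
```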
The next proposition serves as a preparation for Corollary 3: Jordan Canonical Form. It is our removal rule.
Proposition. Let \(F\) be any subfield of \(\mathbb{C}\). Let \(A\in M_{m\times m}(F),B\in M_{n\times n}(F)\) be two square matrices. Let \(p_A,p_B\) be the characteristic polynomials of \(A,B\). If \(\text{gcd}(p_A,p_B)=1\) over \(F\), then for any \(M\in M_{m\times n}(F)\), the matrix \(\begin{pmatrix}A & M\\O & B\end{pmatrix}\) is similar to \(\begin{pmatrix}A & O\\O & B\end{pmatrix}\) as matrices in \(M_{(m+n)\times (m+n)}(F)\). (Note that if \(F=\mathbb{C}\), then the condition is equivalent to \(\text{Spec}(A)\cap \text{Spec}(B)=\varnothing\).)
Proof. If the Sylvester equation \(AX-XB=M\) has a solution \(X\), then
\[\begin{pmatrix}I & -X\\O & I\end{pmatrix}^{-1}\begin{pmatrix}A & M\\O & B\end{pmatrix}\begin{pmatrix}I & -X\\O & I\end{pmatrix}=\begin{pmatrix}A & O\\O & B\end{pmatrix},\]
and thus the two matrices are similar. (In fact, the converse is also true, but much more difficult. It's called Roth's removal rule.) Consider the linear operator
\[\varphi:M_{m\times n}(F)\longrightarrow M_{m\times n}(F),\qquad X\mapsto AX-XB.\]
We need to show that \(\varphi\) is surjective. Since \(\varphi\) is an endomorphism of the finite-dimensional space \(M_{m\times n}(F)\), it suffices to show that \(\varphi\) is injective, i.e., that \(AX=XB\) implies \(X=O\). Note that \(A^2X=A(AX)=A(XB)=(AX)B=(XB)B=XB^2\), and \(A^3X=A(A^2X)=A(XB^2)=(AX)B^2=(XB)B^2=XB^3\), etc. Thus, for any polynomial \(f\) over \(F\), we have \(f(A)X=Xf(B)\). Let \(m_A,m_B\) be the minimal polynomials of \(A,B\) over \(F\). Then \(\text{gcd}(m_A,m_B)=1\) and \(m_B(A)X=Xm_B(B)=O\). We show that \(m_B(A)\) is invertible, and therefore \(X=O\). Assume to the contrary that \(0\) is an eigenvalue of \(m_B(A)\). Since the minimal polynomial of \(A\) over \(\mathbb{C}\) equals \(m_{A}\), by the lemma above there exists \(\lambda\in \mathbb{C}\) such that \(m_A(\lambda)=0\) and \(m_B(\lambda)=0\); clearly \(\lambda\notin F\). Let \(h\) be the minimal polynomial of \(\lambda\) over \(F\); then \(h\mid m_A\) and \(h\mid m_B\), contradicting \(\text{gcd}(m_A,m_B)=1\). \(\blacksquare\)
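The proof is constructive enough to run: SciPy's `solve_sylvester` solves \(AX+XB=Q\), so passing \(-B\) solves our equation, and conjugating by \(\begin{pmatrix}I&-X\\O&I\end{pmatrix}\) removes the corner block (a sketch; the diagonal shifts below are our own way of forcing disjoint spectra):

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(6)
# Shift the spectra apart so Spec(A) and Spec(B) are disjoint (the gcd condition).
A = rng.standard_normal((3, 3)) + 10 * np.eye(3)
B = rng.standard_normal((2, 2)) - 10 * np.eye(2)
M = rng.standard_normal((3, 2))

# solve_sylvester solves AX + XB = Q, so pass -B to solve AX - XB = M.
X = solve_sylvester(A, -B, M)
assert np.allclose(A @ X - X @ B, M)

# Conjugating by S = [[I, -X], [O, I]] removes the corner block M.
S = np.block([[np.eye(3), -X], [np.zeros((2, 3)), np.eye(2)]])
big = np.block([[A, M], [np.zeros((2, 3)), B]])
block_diag = np.block([[A, np.zeros((3, 2))], [np.zeros((2, 3)), B]])
assert np.allclose(np.linalg.solve(S, big @ S), block_diag)
```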
Remark 1 (Minimal Polynomial). Let \(E/F\) be a field extension and \(A\in M_{n\times n}(F)\). Then the minimal polynomial of \(A\) over \(E\) equals the minimal polynomial of \(A\) over \(F\). Here is an elegant proof: In the same fashion as complexification, \(F^n\otimes_{F} E\) is naturally a vector space over \(E\), with a canonical isomorphism
\[F^n\otimes_F E\cong E^n.\]
Consider the commutative diagram
Remark 2 (Alternative Proof). When \(F=\mathbb{C}\), there is an alternative proof without invoking Cayley-Hamilton. (Note that the existence of minimal polynomials of square matrices is guaranteed by Cayley-Hamilton.) By Schur’s theorem, there exist two unitary matrices \(U_1,U_2\) such that \(T_1=U_1AU_1^*,T_2=U_2BU_2^*\) are upper triangular, with the eigenvalues of \(A,B\) on their diagonals. Define \(U=\begin{pmatrix}U_1&O\\O&U_2\end{pmatrix}\). Then \(U\) is unitary and
Therefore, we may assume without loss of generality that \(A=(a_{ij})_{m\times m},B=(b_{ij})_{n\times n}\) are upper triangular with no common diagonal entries. As shown in the previous proof, it suffices to show that if \(AX=XB\), then \(X=(x_{ij})_{m\times n}\) is zero. Indeed,
Hence we are done. \(\blacksquare\)
Corollary 3: Jordan Canonical Form
Schur triangularization implies Jordan canonical form over \(\mathbb{C}\).
Proof. Given any complex square matrix \(A\), by Schur’s theorem \(A\) is unitarily equivalent to an upper triangular matrix of the form
where \(\lambda_1,\lambda_2,\cdots,\lambda_s\) are all the distinct eigenvalues of \(A\). By applying the removal rule inductively, we derive that the matrix above is similar to the block diagonal matrix
Therefore it suffices to show that each \(\small\begin{pmatrix}\lambda_i&&\hspace{-1em}\large{*}\\ &\ddots&\\ &&\lambda_i\end{pmatrix}\) has a Jordan canonical form.
We only need to show that if \(\Lambda\in M_{n\times n}(\mathbb{C})\) is strictly upper triangular, then it has a Jordan canonical form. Denote by \(T\) the operator \(L_{\Lambda}:\mathbb{C}^n\to \mathbb{C}^n\). Clearly, \(T\) is nilpotent. Denote by \(k\) the nilpotency index of \(T\). Since \(k=1\) implies \(T=O\), we may assume that \(k\ge 2\). Let \(\gamma_j\) be any basis for \(\ker(T^j)\ (j=1,\cdots,k-1)\). Note that
\[\{0\}\subsetneq\ker(T^1)\subsetneq\ker(T^2)\subsetneq\cdots\subsetneq\ker(T^k)=\mathbb{C}^n.\]
We construct a Jordan canonical basis for \(T\):
Step 1 Extend \(\gamma_{k-1}\) to a basis for \(\ker(T^k)\): \(\gamma_{k-1}\cup \beta_k\). Then \(\gamma_{k-2}\cup T^1\beta_{k}\) is linearly independent. Indeed, let \(\gamma_{k-1}=\{w_1,\cdots,w_p\},\gamma_{k-2}=\{w'_1,\cdots,w'_q\}\) and \(\beta_k=\{v_1,\cdots,v_m\}\), then
This argument also works in the following steps.
Step 2 Extend \(\gamma_{k-2}\cup T^1\beta_{k}\) to a basis for \(\ker(T^{k-1})\): \(\gamma_{k-2}\cup T^1\beta_k\cup \beta_{k-1}\). Then \(\gamma_{k-3}\cup T^2\beta_k\cup T^1\beta_{k-1}\) is linearly independent.
Step 3 Extend \(\gamma_{k-3}\cup T^2\beta_k\cup T^1\beta_{k-1}\) to a basis for \(\ker(T^{k-2})\): \(\gamma_{k-3}\cup T^2\beta_k\cup T^1\beta_{k-1}\cup\beta_{k-2}\). Then \(\gamma_{k-4}\cup T^3\beta_k\cup T^2\beta_{k-1}\cup T^1\beta_{k-2}\) is linearly independent.
\(\cdots\)
Step k-1 Extend \(\gamma_1\cup T^{k-2}\beta_{k}\cup T^{k-3}\beta_{k-1}\cup\cdots\cup T^1\beta_3\) to a basis for \(\ker(T^2)\): \(\gamma_1\cup T^{k-2}\beta_{k}\cup T^{k-3}\beta_{k-1}\cup\cdots\cup T^1\beta_3\cup \beta_2\). Then \(T^{k-1}\beta_k\cup T^{k-2}\beta_{k-1}\cup\cdots\cup T^1\beta_2\) is linearly independent.
Step k Extend \(T^{k-1}\beta_k\cup T^{k-2}\beta_{k-1}\cup\cdots\cup T^1\beta_2\) to a basis for \(\ker(T^1)\): \(T^{k-1}\beta_k\cup T^{k-2}\beta_{k-1}\cup\cdots\cup T^1\beta_2\cup \beta_1\).
Since \(\gamma_1\) is an arbitrary basis for \(\ker(T^1)\), by substituting \(\gamma_1=T^{k-1}\beta_k\cup T^{k-2}\beta_{k-1}\cup\cdots\cup T^1\beta_2\cup \beta_1\) into Step k-1, we conclude that the union of
is a basis for \(\ker(T^2)\). Repeating this procedure inductively, we see that the union of
is a basis for \(\ker(T^k)=\mathbb{C}^n\). Moreover,
for all \(i\).
Let \(\beta_i=\{v_{i,1},\cdots,v_{i,n_i}\}\ (i=1,\cdots,k)\). Then
is an ordered basis for \(\mathbb{C}^n\) such that
where
for each \(i\). This completes the proof. \(\blacksquare\)
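The construction above pins down the Jordan structure of a nilpotent operator by kernel dimensions alone: the number of Jordan blocks of size at least \(j\) equals \(\text{rank}(T^{j-1})-\text{rank}(T^{j})\). A quick numerical sanity check (a sketch; the block layout and helper name are our own):

```python
import numpy as np

# A nilpotent matrix assembled from Jordan blocks of sizes 3, 2, 1 (index k = 3).
J = np.zeros((6, 6))
J[:3, :3] = np.eye(3, k=1)    # one 3x3 nilpotent Jordan block
J[3:5, 3:5] = np.eye(2, k=1)  # one 2x2 block; the remaining 1x1 block is zero

def blocks_of_size_at_least(T, j):
    # rank(T^{j-1}) - rank(T^j) counts Jordan blocks of size >= j.
    rank = lambda p: np.linalg.matrix_rank(np.linalg.matrix_power(T, p))
    return rank(j - 1) - rank(j)

assert [blocks_of_size_at_least(J, j) for j in (1, 2, 3, 4)] == [3, 2, 1, 0]
```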
Ordinary Triangularization
Statement & Proof
Many results above are proved over \(\mathbb{C}\) using Schur triangularization. Their generalizations are made possible by ordinary triangularization.
Lemma. Let \(F\) be any field. Let \(V\) be a vector space over \(F\), and \(T\) a linear operator on \(V\).
Suppose that \(W\) is a nonzero \(T\)-invariant subspace of \(V\). Then it is easy to check that
\[\overline{T}:V/W\to V/W,\qquad v+W\mapsto T(v)+W\]
is a well-defined linear operator, and it is the unique linear operator such that \(\eta T=\overline{T}\eta\), where \(\eta:V\to V/W\) is the linear transformation defined by \(\eta(v):=v+W\).
To go further, assume that \(V\) is finite-dimensional and nonzero. Let \(\gamma=\{v_1,\cdots,v_k\}\) be an ordered basis for \(W\) and extend \(\gamma\) to an ordered basis \(\beta=\{v_1,\cdots,v_k,v_{k+1},\cdots,v_n\}\) for \(V\). Then \(\alpha=\{v_{k+1}+W,\cdots,v_n+W\}\) is an ordered basis for \(V/W\). Using the equation \([\eta]_{\beta}^{\alpha}[T]_{\beta}=[\overline{T}]_{\alpha}[\eta]_{\beta}^{\alpha}\), it is easy to show that
\[[T]_{\beta}=\begin{pmatrix}[T_W]_{\gamma} & *\\ O & [\overline{T}]_{\alpha}\end{pmatrix}.\]
Based on this fact, we have:
(1) The characteristic polynomials satisfy
\[p_{T}=p_{T_W}\cdot p_{\overline{T}}.\]
(2) If \(T\) is diagonalizable, then its minimal polynomial \(m_T\) is a product of distinct monic linear factors. Since \(m_{T_W}\) divides \(m_T\) (as \(W\) is \(T\)-invariant), and \(m_{\overline{T}}\) divides \(m_T\) (because \(\eta T=\overline{T}\eta\) with \(\eta\) surjective gives \(m_T(\overline{T})\eta=\eta\, m_T(T)=O\), hence \(m_T(\overline{T})=O\)), both \(T_W\) and \(\overline{T}\) are diagonalizable.
(3) If both \(T_W\) and \(\overline{T}\) are diagonalizable and \(\text{gcd}(p_{T_W},p_{\overline{T}})=1\) over \(F\), then by the removal rule \(T\) is diagonalizable as well. \(\blacksquare\)
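Point (1) is easy to test numerically: for a block upper triangular matrix the characteristic polynomial is the product of those of the diagonal blocks, and multiplying monic polynomials amounts to convolving their coefficient lists (a sketch; `np.poly` returns the monic characteristic coefficients, and the random blocks are our own choice):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3))            # plays the role of [T_W]_gamma
B = rng.standard_normal((2, 2))            # plays the role of the quotient operator
C = rng.standard_normal((3, 2))            # the irrelevant corner block

M = np.block([[A, C], [np.zeros((2, 3)), B]])

# p_T = p_{T_W} * p_{T-bar}: product of monic polynomials = coefficient convolution.
assert np.allclose(np.poly(M), np.convolve(np.poly(A), np.poly(B)))
```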
THEOREM. Let \(F\) be any field. Let \(V\) be a nonzero finite-dimensional vector space over \(F\), and \(T\) a linear operator on \(V\). If the characteristic polynomial of \(T\) splits, then there exists a basis \(\beta\) for \(V\) such that \([T]_{\beta}\) is an upper triangular matrix.
Proof. Induction on \(n:=\dim V\). The theorem is trivial when \(n=1\), so we may assume that \(n\ge 2\). Assume the theorem is true whenever the dimension of the space is less than \(n\). Since the characteristic polynomial of \(T\) splits, \(T\) has an eigenvector \(z\). Define \(W:=\text{span}(\{z\})\); then \(W\) is \(T\)-invariant and \(\dim(V/W)=n-1\). By the lemma above, \(p_{\overline{T}}\) divides \(p_T\) and hence splits as well. Applying the induction hypothesis to \(\overline{T}:V/W\to V/W\) finishes the proof. \(\blacksquare\)
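Over \(\mathbb{C}\) the characteristic polynomial always splits, and the inductive proof becomes an algorithm: pick an eigenvector, make it the first basis vector, and recurse on the lower-right block (which represents the quotient operator). A sketch (the function name is ours; since we complete the basis orthonormally via QR, the output is in effect a Schur decomposition):

```python
import numpy as np

def triangularize(A):
    """Return invertible P such that P^{-1} A P is upper triangular over C."""
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    _, vecs = np.linalg.eig(A.astype(complex))
    z = vecs[:, 0]                                   # an eigenvector of A
    # Complete z to a basis of C^n (orthonormally, via a full QR).
    Q, _ = np.linalg.qr(z.reshape(-1, 1), mode='complete')
    B = Q.conj().T @ A @ Q                           # first column is (lambda,0,...,0)^T
    P_quot = triangularize(B[1:, 1:])                # recurse on the quotient block
    P2 = np.eye(n, dtype=complex)
    P2[1:, 1:] = P_quot
    return Q @ P2

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 5))
P = triangularize(A)
T = np.linalg.solve(P, A.astype(complex)) @ P        # P^{-1} A P
assert np.allclose(np.tril(T, -1), 0, atol=1e-8)     # upper triangular
```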
Corollaries
TODO:
Simultaneous Triangularization
Problem 1. Let \(A,B\in M_{n\times n}(\mathbb{C})\).
(1) Show that if \(AB=BA\), then \(A,B\) are simultaneously triangularizable.
(2) Show that if \(\text{rank}(AB-BA)=1\), then \(A,B\) are simultaneously triangularizable.
TODO:
Problem 2. Let \(A,B\) and \(C\) be matrices in \(M_n(\mathbb{C})\) such that \(C=AB-BA, AC=CA,BC=CB\).
(1) Show that the eigenvalues of \(C\) are all zero.
(2) Let \(m_A(\lambda)\) and \(m_B(\lambda)\) be the minimal polynomials of \(A\) and \(B\), respectively, and \(k:=\min \{\deg m_A(\lambda), \deg m_B(\lambda), n-1\}\). Show that \(C^k=0\).
(3) Show that if \(n=2\), then \(C=O\).
(4) Show that there exists a common eigenvector of \(A,B\) and \(C\).
(5) Show that \(A,B\) and \(C\) are simultaneously triangularizable.
TODO:
Triangularization over PID
TODO:
Appendices
Appendix A: Complexification
“The complexification of the real vector space \(\mathbb{R}^n\) is the complex vector space \(\mathbb{C}^n\). The complexification of the real inner product space \((L^2(\Omega;\mathbb{R}),\langle u,v \rangle:=\int_{\Omega} u(x)v(x)\,\mathrm{d}x)\) is the complex inner product space \((L^2(\Omega;\mathbb{C}),\langle f,g \rangle:=\int_{\Omega} f(x)\overline{g(x)}\,\mathrm{d}x)\).”
I. Let \(V\) be a real vector space and \(T\) a linear operator on \(V\). Define the complexification of \(V\) to be the complex vector space
\[V_{\mathbb{C}}:=V\otimes_{\mathbb{R}}\mathbb{C}.\]
The scalar multiplication is made possible by defining
\[c(v\otimes z):=v\otimes (cz)\qquad (c,z\in \mathbb{C},\ v\in V),\]
where \(\otimes\) replaces \(\otimes_{\mathbb{R}}\) for brevity. Define the complexification of \(T\) to be the linear operator
\[T_{\mathbb{C}}:=T\otimes \mathrm{id}_{\mathbb{C}}:V_{\mathbb{C}}\to V_{\mathbb{C}},\qquad v\otimes z\mapsto T(v)\otimes z.\]
Complexifications of linear transformations are defined in the same fashion.
II. Every vector in \(V_{\mathbb{C}}\) is uniquely of the form
\[v\otimes 1+w\otimes i\qquad (v,w\in V),\]
often abbreviated as \(v+iw\).
If \(\beta\) is a basis for the real vector space \(V\), then \(\beta\otimes_{\mathbb{R}} 1:=\{v\otimes 1:v\in \beta\}\) is automatically a basis for the complex vector space \(V_{\mathbb{C}}\). In particular, \(\dim_{\mathbb{C}}(V_{\mathbb{C}})=\dim_{\mathbb{R}}(V)\). It is obvious that if \(T\) is invertible, then so is \(T_{\mathbb{C}}\), and
\[(T_{\mathbb{C}})^{-1}=(T^{-1})_{\mathbb{C}}.\]
Moreover, if \(V\) is nonzero and finite-dimensional, then i) thanks to the fact that \(T_{\mathbb{C}}\) has an eigenvector, there exists a \(T\)-invariant subspace of dimension \(1\) or \(2\); ii) The characteristic/minimal polynomial of \(T_{\mathbb{C}}\) is exactly the same as that of \(T\) (and hence has real coefficients).
III. If \(\langle\cdot,\cdot\rangle:V\times V\to \mathbb{R}\) is an inner product, then \(\langle\cdot,\cdot\rangle_{\mathbb{C}}:V_{\mathbb{C}}\times V_{\mathbb{C}}\to \mathbb{C}\) defined by
\[\langle v+iw,\ v'+iw'\rangle_{\mathbb{C}}:=\langle v,v'\rangle+\langle w,w'\rangle+i\left(\langle w,v'\rangle-\langle v,w'\rangle\right)\]
is the unique inner product on \(V_{\mathbb{C}}\) that restricts back to \(\langle \cdot,\cdot \rangle\). We claim that, with respect to this pair of inner products, if the adjoint of \(T\) exists, then so does the adjoint of \(T_{\mathbb{C}}\), and
\[(T_{\mathbb{C}})^*=(T^*)_{\mathbb{C}}.\]
Indeed, for any \(v+iw,v'+iw'\in V_{\mathbb{C}}\), we have
\[\langle T_{\mathbb{C}}(v+iw),v'+iw'\rangle_{\mathbb{C}}=\langle Tv+iTw,v'+iw'\rangle_{\mathbb{C}}=\langle v+iw,T^*v'+iT^*w'\rangle_{\mathbb{C}}=\langle v+iw,(T^*)_{\mathbb{C}}(v'+iw')\rangle_{\mathbb{C}},\]
where the middle equality follows by expanding both sides with the definition above.
IV. Let \(W\) be a complex vector space. Let \(\gamma\) be any basis for \(W\), then \(i\gamma\) is also a basis for \(W\). Clearly, \(\gamma\cap i\gamma=\varnothing\), and \(\gamma\cup i\gamma\) is linearly independent over \(\mathbb{R}\). Define \(W_{\mathbb{R}}\) to be the real vector space formed by all the linear combinations with real coefficients of the vectors in \(\gamma\cup i\gamma\). Since \(W_{\mathbb{R}}=\text{span}_{\mathbb{R}}(\gamma)\oplus \text{span}_{\mathbb{R}}(i\gamma)\), it is independent of the choice of \(\gamma\), and is called the realification of \(W\). They have the same underlying set. Let \(S\) be a linear operator on \(W\), then it is automatically a linear operator on \(W_{\mathbb{R}}\), denoted by \(S_{\mathbb{R}}\). If \(W\) is nonzero and finite-dimensional, then \(\dim_{\mathbb{R}}(W_{\mathbb{R}})=2\dim_{\mathbb{C}}(W)\), and \([S_{\mathbb{R}}]_{\gamma\cup i\gamma}=\begin{pmatrix}\text{Re} [S]_{\gamma} & -\text{Im}[S]_{\gamma} \\ \text{Im}[S]_{\gamma} & \text{Re} [S]_{\gamma}\end{pmatrix}\). If \(\langle \cdot,\cdot \rangle:W\times W\to \mathbb{C}\) is an inner product, then \(\langle \cdot,\cdot \rangle_{\mathbb{R}}:W_{\mathbb{R}}\times W_{\mathbb{R}}\to \mathbb{R}\) defined by \(\langle w_1,w_2 \rangle_{\mathbb{R}}:=\text{Re}\langle w_1,w_2\rangle\) is the unique inner product such that \(\langle w,w \rangle_{\mathbb{R}}=\langle w,w \rangle\) and \(\langle w,iw\rangle_{\mathbb{R}}=0\) for all \(w\in W_{\mathbb{R}}\).
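The block matrix formula for \([S_{\mathbb{R}}]_{\gamma\cup i\gamma}\) can be checked numerically: realification should respect sums and composition, since \(S_{\mathbb{R}}\) is just \(S\) viewed over \(\mathbb{R}\) (a sketch; `realify` is our own helper name):

```python
import numpy as np

def realify(S):
    """Matrix of S_R on the basis gamma ∪ i·gamma of the realification."""
    return np.block([[S.real, -S.imag], [S.imag, S.real]])

rng = np.random.default_rng(9)
S1 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
S2 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# Realification is functorial: it respects composition and sums.
assert np.allclose(realify(S1 @ S2), realify(S1) @ realify(S2))
assert np.allclose(realify(S1 + S2), realify(S1) + realify(S2))
assert realify(S1).shape == (6, 6)   # the real dimension doubles
```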
V. Complexification is obviously an additive functor from \(\text{Vect}_{\mathbb{R}}\) to \(\text{Vect}_{\mathbb{C}}\). As is known from homological algebra, it is the left adjoint of the forgetful functor from \(\text{Vect}_{\mathbb{C}}\) to \(\text{Vect}_{\mathbb{R}}\) (see, for example, Proposition 2.6.3 in Weibel’s An Introduction to Homological Algebra). We now present some natural isomorphisms. The first is
where the isomorphisms are given by
Given another real vector space \(U\), we have
And there is a natural isomorphism
\[\text{Hom}_{\mathbb{R}}(U,V)_{\mathbb{C}}\cong \text{Hom}_{\mathbb{C}}(U_{\mathbb{C}},V_{\mathbb{C}}),\qquad f\otimes 1+g\otimes i\mapsto f_{\mathbb{C}}+ig_{\mathbb{C}}.\]
(For any \(h\in \text{Hom}_{\mathbb{C}}(U_{\mathbb{C}},V_{\mathbb{C}})\), there exists a unique pair of maps \(f,g:U\to V\) such that \(h(u\otimes 1)=f(u)\otimes 1+g(u)\otimes i\) for all \(u\in U\). It is easy to check that \(f,g\) are linear and \(h=f_{\mathbb{C}}+ig_{\mathbb{C}}\).)
Appendix B: Algebraic Closure
An algebraic closure of a field \(F\) is an algebraic extension of \(F\) that is algebraically closed. We will prove that every field has an algebraic closure. The proof invokes the Axiom of Choice when referring to Krull’s theorem: if \(R\) is a ring and \(I\subset R\) is a proper ideal, then there exists a maximal ideal of \(R\) containing \(I\). The basic idea of the proof goes back to Emil Artin. (Do NOT hesitate to consult the excellent notes by Keith Conrad: 1. [https://kconrad.math.uconn.edu/blurbs/galoistheory/algclosureshorter.pdf]; 2. [http://math.stanford.edu/~conrad/121Page/handouts/algclosure.pdf].)
Proof. Let \(F\) be any field. Denote by \(\{f_j:j\in J\}\) the set of all monic irreducible polynomials over \(F\). Introduce indeterminates \(u_{j,1},\cdots,u_{j,d_j}\) for each \(j\), where \(d_j:=\deg(f_j)\). Let \(R\) be the polynomial ring over \(F\) generated by these indeterminates. For each \(j\), consider the coefficients \(r_{j,i}\in R\) of the polynomial
\[f_j(x)-\prod_{i=1}^{d_j}(x-u_{j,i})\in R[x].\]
Denote by \(I\) the ideal in \(R\) generated by these coefficients. Let \(E_j\) be the splitting field of \(f_j\) over \(F\). Substituting the roots of \(f_j\) in \(E_j\) for \(u_{j,1},\cdots,u_{j,d_j}\), we see that every \(r_{j,i}\) vanishes. Consequently, \(1\notin I\) (an expression of \(1\) as a finite \(R\)-combination of the \(r_{j,i}\) would become \(1=0\) after such substitutions), and so \(I\) is a proper ideal in \(R\). By Krull’s theorem, there exists a maximal ideal \(\mathfrak{m}\) in \(R\) such that \(I\subset \mathfrak{m}\). Then the field \(K_1:=R/\mathfrak{m}\) is an algebraic extension of \(F\) in which every nonconstant polynomial over \(F\) has a root. Repeating this construction inductively, we may construct a sequence of fields \(\{K_n\}_{n=1}^{\infty}\) such that each \(K_{n+1}\) is an algebraic extension of \(K_n\) and every nonconstant polynomial over \(K_n\) has a root in \(K_{n+1}\). Define \(K:=\bigcup_{n=1}^{\infty}K_n\). Then \(K\) is an algebraic extension of \(F\); moreover, any nonconstant polynomial over \(K\) has all its coefficients in some \(K_n\) and hence has a root in \(K_{n+1}\subset K\), i.e., \(K\) is algebraically closed. Hence, \(K\) is an algebraic closure of \(F\). \(\blacksquare\)
In fact, the algebraic closure is essentially unique: any two algebraic closures of \(F\) are isomorphic over \(F\).
TODO:
Appendix C: Gramian Determines Shape
This appendix is a digression.
Proposition. Let \(\{v_1,\cdots,v_s\}\) and \(\{w_1,\cdots,w_s\}\) be two subsets of \(\mathbb{R}^n\). Then there exists \(A\in O(n)\) such that \(Av_i=w_i\ (1\le i\le s)\) iff \(\langle v_{i}, v_{j}\rangle=\langle w_{i}, w_{j}\rangle\ (1\le i,j\le s)\), i.e., iff the two Gramians are equal.
Remark 1. Similarly, if \(\{v_1,\cdots,v_s\}\) and \(\{w_1,\cdots,w_s\}\) are two subsets of \(\mathbb{C}^n\), then there exists \(A\in U(n)\) such that \(Av_i=w_i\ (1\le i\le s)\) iff \(\langle v_{i}, v_{j}\rangle=\langle w_{i}, w_{j}\rangle\ (1\le i,j\le s)\), i.e., the two Gramians are equal.
Proof. We summarize the idea of proving \((\Leftarrow)\) as follows. Given the data \(\langle v_i,v_j \rangle\ (1\le i,j\le s)\), we may focus on a maximal linearly independent subset of \(\{v_1,\cdots,v_s\}\) to study the shape formed by these \(s\) vectors in \(\mathbb{R}^n\). If \(\{v_{k_1},\cdots,v_{k_r}\}\) is a maximal linearly independent subset of \(\{v_1,\cdots,v_s\}\), then \(\{w_{k_1},\cdots,w_{k_r}\}\) is automatically a maximal linearly independent subset of \(\{w_1,\cdots,w_s\}\). Perform Gram-Schmidt on them simultaneously and extend the resulting orthonormal sets to two orthonormal bases for \(\mathbb{R}^n\). The associated change-of-coordinates matrix then completes the proof. (The Gram-Schmidt process can be embodied by QR decomposition.)
\((\Rightarrow)\): Obvious.
\((\Leftarrow)\): Note that
\[\text{rank}(v_{1}\ \cdots\ v_{s})=\text{rank}(G(v_1,\cdots,v_s))=\text{rank}(G(w_1,\cdots,w_s))=\text{rank}(w_{1}\ \cdots\ w_{s}).\]
Denote \(r:=\text{rank}(v_{1}\ \cdots\ v_{s})\). Without loss of generality, assume that \(v_1,\cdots,v_r\) are linearly independent. We claim that \(w_1,\cdots,w_r\) are linearly independent as well. Indeed, for any \(a_1,\cdots,a_r\in\mathbb{R}\),
\[\Big\|\sum_{i=1}^{r}a_iw_i\Big\|^2=\sum_{i,j}a_ia_j\langle w_i,w_j\rangle=\sum_{i,j}a_ia_j\langle v_i,v_j\rangle=\Big\|\sum_{i=1}^{r}a_iv_i\Big\|^2.\]
Therefore, there exist \(B,C\in M_{r\times (s-r)}(\mathbb{R})\) such that
\[(v_{r+1}\ \cdots\ v_{s})=(v_{1}\ \cdots\ v_{r})B,\qquad (w_{r+1}\ \cdots\ w_{s})=(w_{1}\ \cdots\ w_{r})C.\]
Denote \(G:=G(v_1,\cdots,v_r)=G(w_1,\cdots,w_r)\). Since \(\text{rank}(G)=\text{rank}(v_1\ \cdots\ v_r)=r\), the matrix \(G\) is invertible. Thus we have
\[GB=(v_{1}\ \cdots\ v_{r})^T(v_{r+1}\ \cdots\ v_{s})=(w_{1}\ \cdots\ w_{r})^T(w_{r+1}\ \cdots\ w_{s})=GC,\]
and hence \(B=C\).
Now, it suffices to find some \(A\in O(n)\) such that
\[A(v_{1}\ \cdots\ v_{r})=(w_{1}\ \cdots\ w_{r}).\]
By QR decomposition, we have
\[(v_{1}\ \cdots\ v_{r})=Q_1R_1,\qquad (w_{1}\ \cdots\ w_{r})=Q_2R_2,\]
where \(R_i\in M_{r\times r}(\mathbb{R})\) is an upper triangular matrix with positive diagonal entries and \(Q_i\in M_{n\times r}(\mathbb{R})\) satisfies \(Q_i^TQ_i=I_r\) (semi-orthogonal). Thus we have
\[R_1^TR_1=(v_{1}\ \cdots\ v_{r})^T(v_{1}\ \cdots\ v_{r})=G=(w_{1}\ \cdots\ w_{r})^T(w_{1}\ \cdots\ w_{r})=R_2^TR_2.\]
By the uniqueness of the Cholesky decomposition for positive-definite matrices, we have \(R_1=R_2\). Indeed, \(R_1R_2^{-1}=(R_1^{T})^{-1}R_2^T\) is both upper triangular and lower triangular, and hence a diagonal matrix, denoted by \(D\). Thus we have
\[R_1=DR_2,\qquad R_2^TD^2R_2=R_1^TR_1=R_2^TR_2\ \Rightarrow\ D^2=I,\]
and since \(R_1,R_2\) have positive diagonal entries, \(D=I\), i.e., \(R_1=R_2\).
Denote \(R:=R_1=R_2\). Now it suffices to find some \(A\in O(n)\) such that
\[AQ_1=Q_2.\]
Since the columns of \(Q_i\) form an orthonormal subset of \(\mathbb{R}^n\) and thus extend to an orthonormal basis for \(\mathbb{R}^n\), there exists \(\widehat{Q}_i\in O(n)\) such that \(\widehat{Q}_i=(Q_i\ |\ X_i)\) for some \(X_i\in M_{n\times (n-r)}(\mathbb{R})\). Now define \(A:=\widehat{Q}_2\widehat{Q}_1^{-1}\). Then \(A\in O(n)\) and \(AQ_1=Q_2\), as desired. \(\blacksquare\)
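The construction in this proof can be carried out numerically: take QR factorizations normalized to positive diagonals, check \(R_1=R_2\), and set \(A:=\widehat{Q}_2\widehat{Q}_1^{-1}\). A sketch with NumPy (`signed_qr` is our own helper enforcing the positive-diagonal normalization; we generate \(W\) from a known orthogonal map so the Gramians agree):

```python
import numpy as np

def signed_qr(M):
    # Full QR with the diagonal of R made positive (Cholesky-style uniqueness).
    Q, R = np.linalg.qr(M, mode='complete')
    for i in range(min(M.shape)):
        if R[i, i] < 0:
            R[i, :] *= -1
            Q[:, i] *= -1
    return Q, R

rng = np.random.default_rng(10)
n, s = 5, 3
V = rng.standard_normal((n, s))                  # columns v_1,...,v_s (full rank a.s.)
A0, _ = np.linalg.qr(rng.standard_normal((n, n)))
W = A0 @ V                                       # then W^T W = V^T V (equal Gramians)

Q1, R1 = signed_qr(V)
Q2, R2 = signed_qr(W)
assert np.allclose(R1, R2)                       # equal Gramians force R_1 = R_2

A = Q2 @ Q1.T                                    # the orthogonal map of the proof
assert np.allclose(A @ V, W)
assert np.allclose(A.T @ A, np.eye(n))
```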
Appendix D: Finitely Generated Modules over PID
TODO:
Original post: http://www.cnblogs.com/chaliceseven/p/16853190.html