5 | Structure Theorems

5.1 Eigenvalues and Eigenvectors

We first recall some numerical examples.

Example 5.1.1. Diagonalize $A = \begin{pmatrix} 1 & 0 \\ -1 & 2 \end{pmatrix}$. That is, find an invertible matrix $P$ and a diagonal matrix $D$ (if any) such that $A = PDP^{-1}$.

Example 5.1.2. Let $A = \begin{pmatrix} 3 & 1 & -1 \\ 2 & 2 & -1 \\ 2 & 2 & 0 \end{pmatrix}$ and let $T(\vec x) = A\vec x$ be the corresponding matrix transformation on $\mathbb{R}^3$. Find a basis $B$ (if any) such that $[T]_B$ is a diagonal matrix. (Given: $\det(A - \lambda I) = -(\lambda-1)(\lambda-2)^2$.)

Definition. Let $V$ be a vector space over a field $F$ and $T \in L(V,V)$. A scalar $\lambda \in F$ is called an eigenvalue (or characteristic value) of $T$ if there exists a nonzero vector $\vec v \in V$ such that $T(\vec v) = \lambda\vec v$. If $\lambda$ is an eigenvalue of $T$, then any $\vec v \in V$ such that $T(\vec v) = \lambda\vec v$ is called an eigenvector (or characteristic vector) of $T$ associated with the eigenvalue $\lambda$. We have that
\[ E_\lambda(T) = \{\vec v \in V : T(\vec v) = \lambda\vec v\} = \{\vec v \in V : (T - \lambda I)(\vec v) = \vec 0_V\} = \ker(T - \lambda I) \]
is a subspace of $V$, called the eigenspace (or characteristic space) of $T$ associated with $\lambda$.

Remark. $\lambda$ is an eigenvalue of $T$ $\iff$ $\ker(T - \lambda I) \neq \{\vec 0_V\}$ $\iff$ $T - \lambda I$ is not 1-1.

For matrix theory, we restrict ourselves to the case where $V$ is $n$-dimensional. Then $L(V,V) \cong M_n(F)$ via $T \mapsto [T]_B$ for a fixed basis $B$ of $V$, so we may work entirely in $M_n(F)$.

Definition. Let $A \in M_n(F)$. The matrix transformation $T_A : F^n \to F^n$ is given by $T_A(\vec x) = A\vec x$ for all $\vec x \in F^n$. An eigenvalue of $T_A$ is called an eigenvalue of $A$, and an eigenspace of $T_A$ is called an eigenspace of $A$. In other words,
\[ E_\lambda(A) = \{\vec x \in F^n : A\vec x = \lambda\vec x\} = \{\vec x \in F^n : (A - \lambda I_n)\vec x = \vec 0_n\} = \operatorname{Nul}(A - \lambda I_n). \]
Then $\lambda$ is an eigenvalue of $A$ $\iff$ $\ker(T_A - \lambda I) \neq \{\vec 0_n\}$ $\iff$ $\operatorname{Nul}(A - \lambda I_n) \neq \{\vec 0_n\}$ $\iff$ $A - \lambda I_n$ is not invertible $\iff$ $\det(A - \lambda I_n) = 0$.

Definition. The polynomial $c_A(x) = \det(xI_n - A)$ is called the characteristic polynomial of $A$.

Thus we have proved:

Theorem 5.1.1. For $A \in M_n(F)$, $\lambda$ is an eigenvalue of $A$ $\iff$ $\det(A - \lambda I_n) = 0$, i.e., $\lambda$ is a root of the characteristic polynomial of $A$.

Since an eigenvalue of an $n \times n$ matrix $A$ is a root of $c_A(x) = \det(xI_n - A)$, which has degree $n$, and a polynomial of degree $n$ over a field $F$ has at most $n$ roots in $F$, we obtain:

Theorem 5.1.2. An $n \times n$ matrix has at most $n$ eigenvalues.

Remark. If $A$ is similar to $B$, say $B = P^{-1}AP$, then $\det A = \det B$ and
\[ \det(B - \lambda I_n) = \det(P^{-1}AP - \lambda P^{-1}I_nP) = \det(P^{-1}(A - \lambda I_n)P) = \det(A - \lambda I_n). \]
Therefore, we have the following result.

Theorem 5.1.3. If $A$ and $B$ are similar $n \times n$ matrices, then $A$ and $B$ have the same characteristic polynomial and the same eigenvalues (with the same multiplicities).

Example 5.1.3. The matrices
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
have the same determinant, trace, characteristic polynomial and eigenvalue, but they are not similar because $P^{-1}IP = I$ for every invertible matrix $P$.
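So the converse of Theorem 5.1.3 fails. As a quick numerical check of Example 5.1.3 (an added SymPy sketch, not part of the original notes), the following verifies that $A$ and $I$ share the characteristic polynomial $(x-1)^2$ while $A$ has only a one-dimensional eigenspace:

```python
from sympy import Matrix, eye, symbols

x = symbols('x')
A = Matrix([[1, 1],
            [0, 1]])
I2 = eye(2)

# Same characteristic polynomial (x - 1)**2 ...
print(A.charpoly(x).as_expr().factor())   # (x - 1)**2
print(I2.charpoly(x).as_expr().factor())  # (x - 1)**2

# ... but the eigenspace of A for lambda = 1 is only 1-dimensional,
# while that of I is 2-dimensional, so A cannot be similar to I.
print(len((A - eye(2)).nullspace()))   # 1
print(len((I2 - eye(2)).nullspace()))  # 2
```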
Definition. A diagonal matrix $D$ is a square matrix whose entries off the main diagonal are all zero, that is, $D$ is of the form
\[ D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n), \]
where $\lambda_1, \lambda_2, \ldots, \lambda_n \in F$ (not necessarily distinct).

Definition. An $n \times n$ matrix $A$ over $F$ is said to be diagonalizable if $A$ is similar to a diagonal matrix, that is, there are an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP = D$. In this case, we say that $P$ diagonalizes $A$.

Definition. Let $V$ be a finite dimensional vector space and $T \in L(V,V)$ a linear operator. We say that $T$ is diagonalizable if there exists a basis $B$ for $V$ such that $[T]_B$ is a diagonal matrix.

Theorem 5.1.4. Let $A$ be an $n \times n$ matrix.
1. $A$ is diagonalizable $\iff$ $A$ has eigenvectors $\vec v_1, \ldots, \vec v_n$ such that $P = (\vec v_1 \ \cdots \ \vec v_n)$ is invertible.
2. When this is the case, $P^{-1}AP = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, where for each $i$, $\lambda_i$ is the eigenvalue of $A$ corresponding to $\vec v_i$.

Proof. Let $P = (\vec v_1 \ \vec v_2 \ \cdots \ \vec v_n)$ and $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Then $AP = PD$ becomes
\[ A(\vec v_1 \ \vec v_2 \ \cdots \ \vec v_n) = (\vec v_1 \ \vec v_2 \ \cdots \ \vec v_n)\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}, \quad\text{i.e.,}\quad (A\vec v_1 \ \cdots \ A\vec v_n) = (\lambda_1\vec v_1 \ \cdots \ \lambda_n\vec v_n). \]
Comparing columns shows that $A\vec v_i = \lambda_i\vec v_i$ for each $i$, so $P^{-1}AP = D$ $\iff$ $P$ is invertible and $A\vec v_i = \lambda_i\vec v_i$ for all $i \in \{1, \ldots, n\}$. The results follow.

Theorem 5.1.5. Let $\vec v_1, \ldots, \vec v_m$ be eigenvectors corresponding to distinct eigenvalues $\lambda_1, \ldots, \lambda_m$ of an $n \times n$ matrix $A$. Then $\{\vec v_1, \ldots, \vec v_m\}$ is linearly independent.

Proof. We use induction on $m$. If $m = 1$, then $\{\vec v_1\}$ is linearly independent because $\vec v_1 \neq \vec 0$. Let $k \geq 1$ and assume the theorem is true for any $k$ eigenvectors; we prove it for $k+1$. Let $\vec v_1, \ldots, \vec v_{k+1}$ be eigenvectors corresponding to distinct eigenvalues $\lambda_1, \ldots, \lambda_{k+1}$ of $A$. Let $c_1, \ldots, c_{k+1} \in F$ be such that
\[ c_1\vec v_1 + c_2\vec v_2 + \cdots + c_{k+1}\vec v_{k+1} = \vec 0. \tag{5.1.1} \]
Since $A\vec v_i = \lambda_i\vec v_i$ for all $i$, multiplying both sides by $A$ gives
\[ c_1\lambda_1\vec v_1 + c_2\lambda_2\vec v_2 + \cdots + c_{k+1}\lambda_{k+1}\vec v_{k+1} = \vec 0. \tag{5.1.2} \]
Subtracting $\lambda_1$ times (5.1.1) from (5.1.2), we have
\[ c_2(\lambda_2 - \lambda_1)\vec v_2 + \cdots + c_{k+1}(\lambda_{k+1} - \lambda_1)\vec v_{k+1} = \vec 0. \]
Since $\vec v_2, \ldots, \vec v_{k+1}$ are $k$ eigenvectors corresponding to distinct eigenvalues, they are linearly independent by the induction hypothesis, so $c_2(\lambda_2 - \lambda_1) = \cdots = c_{k+1}(\lambda_{k+1} - \lambda_1) = 0$. However, $\lambda_1, \ldots, \lambda_{k+1}$ are distinct, hence $c_2 = \cdots = c_{k+1} = 0$. This implies $c_1\vec v_1 = \vec 0$, so $c_1 = 0$ because $\vec v_1 \neq \vec 0$. Therefore, $\{\vec v_1, \ldots, \vec v_{k+1}\}$ is linearly independent.

Corollary 5.1.6. If $A$ is an $n \times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable.

Proof. Let $\vec v_1, \ldots, \vec v_n$ be eigenvectors corresponding to the distinct eigenvalues $\lambda_1, \ldots, \lambda_n$ of $A$. Then they are linearly independent, so $P = (\vec v_1 \ \cdots \ \vec v_n)$ is invertible and $P^{-1}AP = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.

Lemma 5.1.7. Let $\{\vec v_1, \ldots, \vec v_k\}$ be a linearly independent set of eigenvectors of an $n \times n$ matrix $A$, extend it to a basis $\{\vec v_1, \ldots, \vec v_k, \vec v_{k+1}, \ldots, \vec v_n\}$ of $F^n$, and let $P = (\vec v_1 \ \cdots \ \vec v_k \ \vec v_{k+1} \ \cdots \ \vec v_n)$, which is invertible. If $\lambda_1, \ldots, \lambda_k$ are the (not necessarily distinct) eigenvalues of $A$ corresponding to $\vec v_1, \ldots, \vec v_k$, respectively, then $P^{-1}AP$ has block form
\[ P^{-1}AP = \begin{pmatrix} \operatorname{diag}(\lambda_1, \ldots, \lambda_k) & B \\ 0 & C \end{pmatrix}, \]
where $B$ has size $k \times (n-k)$ and $C$ has size $(n-k) \times (n-k)$.

Definition. An eigenvalue $\lambda$ of a square matrix $A$ is said to have multiplicity $m$ if it occurs $m$ times as a root of the characteristic polynomial $c_A(x)$; in other words, $c_A(x) = (x - \lambda)^m g(x)$ for some polynomial $g(x)$ with $g(\lambda) \neq 0$.

Lemma 5.1.8. Let $\lambda$ be an eigenvalue of multiplicity $m$ of a square matrix $A$. Then $\operatorname{nullity}(A - \lambda I) = \dim E_\lambda(A) \leq m$.

Proof. Assume that $\dim E_\lambda(A) = d$ with basis $\{\vec v_1, \ldots, \vec v_d\}$. By Lemma 5.1.7, there exists an invertible $n \times n$ matrix $P$ such that
\[ P^{-1}AP = \begin{pmatrix} \lambda I_d & B \\ 0 & C \end{pmatrix} = M, \]
where $I_d$ is the $d \times d$ identity matrix. Since $M$ and $A$ are similar,
\[ c_A(x) = c_M(x) = \det(xI_n - M) = \det\begin{pmatrix} (x-\lambda)I_d & -B \\ 0 & xI_{n-d} - C \end{pmatrix} = \det((x-\lambda)I_d)\,\det(xI_{n-d} - C) = (x-\lambda)^d\, c_C(x). \]
Hence $d \leq m$, because $m$ is the highest power of $(x - \lambda)$ in $c_A(x)$.

Theorem 5.1.9. Let $\lambda_1, \ldots, \lambda_k$ be all the distinct eigenvalues of an $n \times n$ matrix $A$ (assuming $c_A(x)$ factors completely into linear factors over $F$). For each $i \in \{1, \ldots, k\}$, let $m_i$ denote the multiplicity of $\lambda_i$ and write $d_i = \operatorname{nullity}(A - \lambda_i I_n)$. Then $1 \leq d_i \leq m_i$ for all $i$, $n = m_1 + \cdots + m_k$ and $c_A(x) = (x-\lambda_1)^{m_1}\cdots(x-\lambda_k)^{m_k}$. Moreover, the following statements are equivalent.
(i) $A$ is diagonalizable.
(ii) $d_i = \operatorname{nullity}(A - \lambda_i I_n) = \dim E_{\lambda_i}(A) = m_i$ for all $i$.
(iii) $n = d_1 + \cdots + d_k$.
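Theorem 5.1.9 settles Example 5.1.2 numerically. In the following added SymPy sketch (not part of the original notes), the eigenvalue $2$ has multiplicity $2$ but $\operatorname{nullity}(A - 2I) = 1$, so that matrix is not diagonalizable and no basis $B$ exists:

```python
from sympy import Matrix, eye, symbols

x = symbols('x')
A = Matrix([[3, 1, -1],
            [2, 2, -1],
            [2, 2,  0]])

print(A.charpoly(x).as_expr().factor())  # (x - 1)*(x - 2)**2

# Theorem 5.1.9: A is diagonalizable iff nullity(A - lambda*I) equals
# the multiplicity of lambda for every eigenvalue lambda.
for lam, mult in A.eigenvals().items():
    d = len((A - lam * eye(3)).nullspace())
    print(lam, mult, d)   # lambda = 2 has multiplicity 2 but nullity 1

print(A.is_diagonalizable())  # False
```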
5.2 Annihilating Polynomials

Let $A$ be an $n \times n$ matrix over a field $F$. Since $\dim M_n(F) = n^2$, the set $\{I_n, A, A^2, \ldots, A^{n^2}\}$ is linearly dependent. Then there exist $c_0, c_1, \ldots, c_{n^2}$ in $F$, not all zero, such that
\[ c_0 I_n + c_1 A + c_2 A^2 + \cdots + c_{n^2} A^{n^2} = 0. \]
Let $f(x)$ be the polynomial over $F$ defined by $f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{n^2} x^{n^2}$. Then $f(x) \neq 0$ and $f(A) = 0$. Let $g(x) = \alpha^{-1} f(x)$, where $\alpha$ is the leading coefficient of $f(x)$. Then $g(x)$ is monic (leading coefficient $= 1$) and $g(A) = 0$. Thus there exists a polynomial $p(x)$ over $F$ such that (a) $p(A) = 0$, (b) $p(x)$ is monic, and (c) for every nonzero polynomial $q(x)$, $q(A) = 0 \Rightarrow \deg p(x) \leq \deg q(x)$. Such a $p(x)$ is unique (Proof!) and is called the minimal polynomial of $A$. Note that if $k(x) \in F[x]$ and $k(A) = 0$, then $p(x) \mid k(x)$.

Remark. If $A$ and $B$ in $M_n(F)$ are similar, then they have the same minimal polynomial.

Recall that the characteristic polynomial of $A$ is given by $c_A(x) = \det(xI_n - A)$.

Theorem 5.2.1. The characteristic polynomial and the minimal polynomial of $A$ have the same roots.

Remark. Although the minimal polynomial and the characteristic polynomial have the same roots, they need not be equal.

Example 5.2.1. The characteristic polynomial of $A = \begin{pmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{pmatrix}$ is $(x-1)(x-2)^2$, while $(A - I)(A - 2I) = 0$, so the minimal polynomial of $A$ is $(x-1)(x-2)$. Notice that $A$ is diagonalizable.

In general, we have:

Theorem 5.2.2. If an $n \times n$ matrix $A$ is diagonalizable with distinct eigenvalues $\lambda_1, \ldots, \lambda_k$, then $(x - \lambda_1)\cdots(x - \lambda_k)$ is the minimal polynomial of $A$.

Theorem 5.2.3 (Cayley–Hamilton). If $f(x)$ is the characteristic polynomial of a matrix $A$, then $f(A) = 0$.

Proof. Write $f(x) = \det(xI_n - A) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$. Let $B = xI_n - A$. Since each entry of $\operatorname{adj} B$ is, up to sign, the determinant of an $(n-1) \times (n-1)$ submatrix of $B$, each cofactor is a polynomial in $x$ of degree at most $n-1$:
\[ C_{ij}(B) = b_{ij}^{(n-1)}x^{n-1} + b_{ij}^{(n-2)}x^{n-2} + \cdots + b_{ij}^{(1)}x + b_{ij}^{(0)} \quad\text{for all } i, j \in \{1, \ldots, n\}. \]
Thus
\[ \operatorname{adj} B = [C_{ij}(B)]^T_{n \times n} = B_{n-1}x^{n-1} + B_{n-2}x^{n-2} + \cdots + B_1 x + B_0, \]
where each $B_i \in M_n(F)$. Recall that
\[ (\det B)I_n = B(\operatorname{adj} B) = B(B_{n-1}x^{n-1} + B_{n-2}x^{n-2} + \cdots + B_1 x + B_0). \]
Then
\[ (x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0)I_n = (xI - A)(B_{n-1}x^{n-1} + \cdots + B_1 x + B_0) \]
\[ = B_{n-1}x^n + B_{n-2}x^{n-1} + \cdots + B_1 x^2 + B_0 x - AB_{n-1}x^{n-1} - AB_{n-2}x^{n-2} - \cdots - AB_1 x - AB_0. \]
Comparing coefficients gives
\[ I = B_{n-1}, \quad a_{n-1}I = B_{n-2} - AB_{n-1}, \quad a_{n-2}I = B_{n-3} - AB_{n-2}, \quad \ldots, \quad a_1 I = B_0 - AB_1, \quad a_0 I = -AB_0. \]
Multiplying these equations on the left by $A^n, A^{n-1}, A^{n-2}, \ldots, A, I$, respectively, and adding, the right-hand side telescopes:
\[ A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = A^n B_{n-1} + A^{n-1}(B_{n-2} - AB_{n-1}) + A^{n-2}(B_{n-3} - AB_{n-2}) + \cdots + A(B_0 - AB_1) - AB_0 = 0, \]
as desired.

Example 5.2.2. Determine the minimal polynomial of $A = \begin{pmatrix} 3 & 1 & -1 \\ 2 & 2 & -1 \\ 2 & 2 & 0 \end{pmatrix}$.

Some consequences of the Cayley–Hamilton theorem are as follows.

Corollary 5.2.4. The minimal polynomial of $A$ divides its characteristic polynomial.

Recall that $0$ is an eigenvalue of $A$ $\iff$ $0 = \det(A - 0I) = \det A$ $\iff$ $A$ is not invertible.

Corollary 5.2.5. If $f(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + x^n$ is the characteristic polynomial of an invertible matrix $A$, then $a_0 \neq 0$ and
\[ A^{-1} = -\frac{1}{a_0}\left(a_1 I + a_2 A + \cdots + A^{n-1}\right). \]
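The following added SymPy sketch (not part of the original notes) verifies the Cayley–Hamilton theorem and Corollary 5.2.5 on the matrix of Example 5.2.1, and checks that $(A - I)(A - 2I) = 0$:

```python
from sympy import Matrix, eye, zeros, symbols

x = symbols('x')
A = Matrix([[ 5, -6, -6],
            [-1,  4,  2],
            [ 3, -6, -4]])

p = A.charpoly(x)                      # x**3 - 5*x**2 + 8*x - 4
a0, a1, a2 = p.all_coeffs()[::-1][:3]  # constant, linear, quadratic coeffs

# Cayley-Hamilton: substituting A into its characteristic polynomial gives 0.
assert A**3 - 5*A**2 + 8*A - 4*eye(3) == zeros(3, 3)

# Corollary 5.2.5: A^{-1} = -(1/a0) * (a1*I + a2*A + A**2).
assert A.inv() == -(a1*eye(3) + a2*A + A**2) / a0

# Example 5.2.1: the minimal polynomial is (x - 1)(x - 2).
assert (A - eye(3)) * (A - 2*eye(3)) == zeros(3, 3)
```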
5.3 Symmetric and Hermitian Matrices

Definition. Let $F = \mathbb{R}$ or $\mathbb{C}$ and let $A = [a_{ij}]$ be a matrix over $F$. The matrix $A$ is said to be symmetric if $A = A^T$. We define $A^H = [\bar a_{ij}]^T$, the conjugate transpose of $A$. We say that $A$ is Hermitian (or self-adjoint) if $A = A^H$. Notice that symmetric and Hermitian matrices are square, and the two notions coincide when $F = \mathbb{R}$.

Example 5.3.1. Let $A = \begin{pmatrix} 3 & 1 \\ 1 & -2 \end{pmatrix}$ and $B = \begin{pmatrix} -1 & 2+3i \\ 2-3i & 2 \end{pmatrix}$. Then $A$ is symmetric and both of them are Hermitian.

Theorem 5.3.1. If $A$ is a Hermitian matrix, then (1) $\vec x^H A \vec x$ is real for all $\vec x \in \mathbb{C}^n$, and (2) the eigenvalues of $A$ are real. That is, if $A$ is Hermitian, then all roots of $c_A(x)$ are real.

Example 5.3.2. For vectors $\vec x$ and $\vec y$ in $\mathbb{C}^n$, we define $(\vec x, \vec y) = \vec x^H \vec y$. Then $(\cdot,\cdot)$ is an inner product on $\mathbb{C}^n$, so that $\|\vec x\|^2 = \vec x^H \vec x = |x_1|^2 + \cdots + |x_n|^2$ for all $\vec x = (x_1, \ldots, x_n) \in \mathbb{C}^n$.

Theorem 5.3.2. Two eigenvectors corresponding to different eigenvalues of a Hermitian matrix are orthogonal to one another.

Definition. For $F = \mathbb{R}$ or $\mathbb{C}$ and $U \in M_n(F)$, $U$ is called unitary if $U^H U = I_n = U U^H$. If $F = \mathbb{R}$, a unitary matrix satisfies $U^T U = I_n = U U^T$ and may be called an orthonormal matrix.

Theorem 5.3.3. Let $U \in M_n(\mathbb{C})$ be a unitary matrix. For the inner product defined in Example 5.3.2, we have $(U\vec x, U\vec y) = (\vec x, \vec y)$ for all $\vec x, \vec y \in \mathbb{C}^n$; in particular, $\|U\vec x\| = \|\vec x\|$ for all $\vec x \in \mathbb{C}^n$.

Corollary 5.3.4. If $U = (\vec u_1 \ \vec u_2 \ \cdots \ \vec u_n) \in M_n(\mathbb{C})$ is a unitary matrix, then for all $j, k \in \{1, 2, \ldots, n\}$ we have
\[ (\vec u_j, \vec u_k) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases} \]

Remark. The converse of Corollary 5.3.4 is also true; its proof is left as an exercise.

Example 5.3.3. $U_1 = \dfrac{1}{\sqrt 2}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$ and $U_2 = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}$ are unitary matrices.

Theorem 5.3.5. Every eigenvalue of a unitary matrix $U$ has absolute value one, i.e., $|\lambda| = 1$. Moreover, eigenvectors corresponding to different eigenvalues are orthogonal to each other.

We are going to explore some very remarkable facts about Hermitian and real symmetric matrices. These matrices are diagonalizable, and moreover the diagonalization can be accomplished by a unitary matrix $P$. This means that $P^{-1}AP = P^H A P$ is diagonal. In this situation, we say that the matrix $A$ is unitarily (or, in the real case, orthogonally) diagonalizable. Unitary and orthogonal diagonalizations are particularly attractive since inverting $P$ is essentially free and error-free: $P^{-1} = P^H$.

Theorem 5.3.6. If a real matrix $A$ is orthogonally diagonalizable with an orthonormal matrix $P$, that is, $P^T A P$ is a diagonal matrix, then $A$ is symmetric.

Remark. The converse of Theorem 5.3.6 is also true. In fact, we prove a stronger result.

Theorem 5.3.7 (Principal Axes Theorem). Every Hermitian matrix is unitarily diagonalizable. In addition, every real symmetric matrix is orthogonally diagonalizable.

Proof. We show the first statement by induction on $n$. It is clear for $n = 1$. Assume that $n > 1$ and that every $(n-1) \times (n-1)$ Hermitian matrix is unitarily diagonalizable. Consider an $n \times n$ Hermitian matrix $A$. Let $\lambda_1$ be a (real) eigenvalue of $A$ with unit eigenvector $\vec v$, so $A\vec v = \lambda_1\vec v$ and $\|\vec v\| = 1$. Let $W = \{\vec v\}^\perp$ with orthonormal basis $\{\vec z_1, \ldots, \vec z_{n-1}\}$. Then $R = (\vec v \ \vec z_1 \ \cdots \ \vec z_{n-1})$ is an $n \times n$ unitary matrix. The first column of $B = R^H A R$ is $R^H A \vec v = \lambda_1 R^H \vec v = (\lambda_1, 0, \ldots, 0)^T$, and $B^H = (R^H A R)^H = B$, so the first row of $B$ is $(\lambda_1, 0, \ldots, 0)$ as well. Hence
\[ B = R^H A R = \begin{pmatrix} \lambda_1 & 0 \\ 0 & C \end{pmatrix}, \]
where $C$ is $(n-1) \times (n-1)$, and $B$ is Hermitian and so is $C$. By the induction hypothesis, there exists an $(n-1) \times (n-1)$ unitary matrix $Q$ such that $Q^H C Q = \operatorname{diag}(\lambda_2, \ldots, \lambda_n)$. Let
\[ P = \begin{pmatrix} 1 & 0 \\ 0 & Q \end{pmatrix}_{n \times n}. \]
Then $P$ is an $n \times n$ unitary matrix and $P^H B P = P^H R^H A R P = (RP)^H A (RP)$. Choose $U = RP$. Then $U^H = (RP)^H = P^H R^H = P^{-1}R^{-1} = (RP)^{-1} = U^{-1}$ and
\[ U^H A U = P^H B P = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n). \]
Hence, $A$ is unitarily diagonalizable. For a real symmetric $A$, the same argument can be carried out with real vectors throughout, yielding a real orthogonal $U$.

Example 5.3.4. Diagonalize the Hermitian matrix $A = \begin{pmatrix} 1 & 1-i \\ 1+i & 0 \end{pmatrix}$.

Example 5.3.5. Orthogonally diagonalize the symmetric matrix $A = \begin{pmatrix} 1 & 2 & 0 \\ 2 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}$. (Given $\lambda = 0, 5, 5$.)
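For numerical work, orthogonal diagonalization of a real symmetric (or Hermitian) matrix is exactly what `numpy.linalg.eigh` computes. The sketch below (an added illustration, not part of the original notes) checks Example 5.3.5:

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [2., 4., 0.],
              [0., 0., 5.]])

# eigh is specialized to symmetric/Hermitian input: it returns real
# eigenvalues in ascending order and orthonormal eigenvectors as columns.
w, P = np.linalg.eigh(A)
print(w)  # approximately [0., 5., 5.]

# P is orthogonal (P^T P = I) and P^T A P is diagonal -- Theorem 5.3.7.
assert np.allclose(P.T @ P, np.eye(3))
assert np.allclose(P.T @ A @ P, np.diag(w))
```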
Definition. A square matrix $A$ is normal if $A^H A = A A^H$. Clearly, every Hermitian matrix is normal.

Theorem 5.3.8. A matrix is unitarily diagonalizable if and only if it is normal.

Proof. It is a consequence of the Schur Triangularization Theorem, which is beyond the scope of this course.

Real versus Complex:
- $(x_1, \ldots, x_n) \in \mathbb{R}^n$  |  $(x_1, \ldots, x_n) \in \mathbb{C}^n$
- length: $\|\vec x\|^2 = x_1^2 + \cdots + x_n^2$  |  $\|\vec x\|^2 = |x_1|^2 + \cdots + |x_n|^2$
- transpose: $(A^T)_{ij} = A_{ji}$  |  conjugate transpose: $(A^H)_{ij} = \overline{A_{ji}}$
- $(AB)^T = B^T A^T$  |  $(AB)^H = B^H A^H$
- $\vec x \cdot \vec y = \vec x^T \vec y = x_1y_1 + \cdots + x_ny_n$  |  $\vec x \cdot \vec y = \vec x^H \vec y = \bar x_1 y_1 + \cdots + \bar x_n y_n$
- orthogonality: $\vec x^T \vec y = 0$  |  $\vec x^H \vec y = 0$
- orthonormal: $P^T P = I_n = P P^T$  |  unitary: $U^H U = I_n = U U^H$
- symmetric matrix: $A^T = A$  |  Hermitian matrix: $A^H = A$
- $A = PDP^{-1} = PDP^T$ (real $D$): orthogonally diagonalizable  |  $A = UDU^{-1} = UDU^H$ (real $D$): unitarily diagonalizable

5.4 Jordan Forms

Theorem 5.1.9 gives necessary and sufficient conditions for an $n \times n$ matrix to be diagonalizable, namely that it should have $n$ independent eigenvectors. We have also seen square matrices which are not diagonalizable. In this section, we discuss the so-called Jordan canonical form, a form of matrix to which every square matrix is similar.

Definition. Let $A$ be an $n \times n$ matrix and let $\lambda$ be an eigenvalue of $A$ with $\operatorname{nullity}(A - \lambda I_n) = \ell$. Assume that $\lambda$ is of multiplicity $m$. Then $1 \leq \ell \leq m$; in particular, if $m = 1$, then $\ell = m = 1$. If $m > 1$ and $\ell < m$, then $\lambda$ is said to be defective, and the number $m - \ell > 0$ of missing eigenvector(s) is called the defect of $\lambda$. Note that if $A$ has a defective eigenvalue, then $A$ is not diagonalizable.

Definition. The generalized eigenspace $G_\lambda$ corresponding to an eigenvalue $\lambda$ of $A$ consists of all vectors $\vec v$ such that $(A - \lambda I)^k\vec v = \vec 0$ for some $k \in \mathbb{N}$, that is,
\[ G_\lambda(A) = \{\vec v \in F^n : (A - \lambda I)^k\vec v = \vec 0 \text{ for some } k \in \mathbb{N}\} = \bigcup_{k \in \mathbb{N}} \operatorname{Nul}(A - \lambda I)^k. \]

Definition. A length $r$ chain of generalized eigenvectors based on the eigenvector $\vec v$ for $\lambda$ is a set $\{\vec v = \vec v_1, \vec v_2, \ldots, \vec v_r\}$ of $r$ linearly independent generalized eigenvectors such that
\[ (A - \lambda I)\vec v_r = \vec v_{r-1}, \quad (A - \lambda I)\vec v_{r-1} = \vec v_{r-2}, \quad \ldots, \quad (A - \lambda I)\vec v_2 = \vec v_1. \]
Since $\vec v_1$ is an eigenvector, $(A - \lambda I)\vec v_1 = \vec 0$. It follows that $(A - \lambda I)^r\vec v_r = \vec 0$. We may denote the action of the matrix $A - \lambda I$ on the string of vectors by
\[ \vec v_r \longrightarrow \vec v_{r-1} \longrightarrow \cdots \longrightarrow \vec v_2 \longrightarrow \vec v_1 \longrightarrow \vec 0. \]

Now let $W$ be the subspace of $G_\lambda$ spanned by $\{\vec v_1, \ldots, \vec v_r\}$. Any vector $\vec x$ in $W$ has a representation of the form $\vec x = c_1\vec v_1 + c_2\vec v_2 + \cdots + c_r\vec v_r$, and
\[ A\vec x = c_1(A\vec v_1) + \cdots + c_r(A\vec v_r) = c_1(\lambda\vec v_1) + c_2(\lambda\vec v_2 + \vec v_1) + \cdots + c_r(\lambda\vec v_r + \vec v_{r-1}) = (\lambda c_1 + c_2)\vec v_1 + \cdots + (\lambda c_{r-1} + c_r)\vec v_{r-1} + \lambda c_r\vec v_r. \]
Thus $A\vec x$ is also in $W$. If $B = \{\vec v_1, \ldots, \vec v_r\}$ is taken as a basis for $W$, then
\[ [A\vec x]_B = \begin{pmatrix} \lambda c_1 + c_2 \\ \lambda c_2 + c_3 \\ \vdots \\ \lambda c_{r-1} + c_r \\ \lambda c_r \end{pmatrix} = J[\vec x]_B, \quad\text{where}\quad J = J(\lambda; r) = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}_{r \times r} \]
is called the Jordan block of size $r$ corresponding to $\lambda$.

Example 5.4.1. Let $A = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix}$. Find the generalized eigenspaces of $A$.

Example 5.4.2. Let $A_1 = \begin{pmatrix} 0 & 1 & 2 \\ -5 & -3 & -7 \\ 1 & 0 & 0 \end{pmatrix}$ and $A_2 = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}$. Then $A_1$ and $A_2$ have the same characteristic polynomial $(x+1)^3$. Find (1) the minimal polynomials of $A_1$ and $A_2$, and (2) the generalized eigenspaces of $A_1$ and $A_2$.
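A short SymPy check of Example 5.4.1 (an added sketch, not part of the original notes): the lone eigenvalue is $\lambda = 2$ with a one-dimensional eigenspace, and $(A - 2I)^2 = 0$, so the generalized eigenspace $G_2$ is all of $\mathbb{R}^2$:

```python
from sympy import Matrix, eye, zeros

A = Matrix([[ 1, 1],
            [-1, 3]])
lam = 2                      # characteristic polynomial is (x - 2)**2

N = A - lam * eye(2)
print(N.nullspace())         # one vector: E_2 is 1-dimensional
assert N**2 == zeros(2, 2)   # (A - 2I)^2 = 0, so G_2 = Nul (A - 2I)^2 = R^2
```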
Theorem 5.4.1. If an $n \times n$ matrix $A$ has $t$ linearly independent eigenvectors, then it is similar to a matrix $J$ in Jordan form, with $t$ square blocks on the diagonal:
\[ J = M^{-1}AM = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_t \end{pmatrix} \qquad\text{(Jordan form)}. \]
Each block has one eigenvector, one eigenvalue, and 1's just above the diagonal:
\[ J_i = J(\lambda_i; r_i) = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix}_{r_i \times r_i} \qquad\text{(Jordan block)}. \]
The same $\lambda_i$ appears in several blocks if it has several independent eigenvectors. Moreover, the columns of $M$ are $n$ linearly independent generalized eigenvectors.

Remark. Theorem 5.4.1 says that every $n \times n$ matrix $A$ has $n$ linearly independent generalized eigenvectors. These $n$ generalized eigenvectors may be arranged in chains, with the sum of the lengths of the chains associated with a given eigenvalue $\lambda$ equal to the multiplicity of $\lambda$. But the structure of these chains depends on the defect of $\lambda$, and can be quite complicated. For instance, an eigenvalue of multiplicity four can correspond to:
- four length 1 chains (defect 0);
- two length 1 chains and one length 2 chain (defect 1);
- two length 2 chains (defect 2);
- one length 1 chain and one length 3 chain (defect 2);
- one length 4 chain (defect 3).

Observe that, in each of these cases, the length of the longest chain is at most $d + 1$, where $d$ is the defect of the eigenvalue. Consequently, once we have found all the ordinary eigenvectors corresponding to a multiple eigenvalue $\lambda$, and therefore know the defect $d$ of $\lambda$, we can begin with the equation
\[ (A - \lambda I)^{d+1}\vec u = \vec 0 \tag{5.4.1} \]
to start building the chains of generalized eigenvectors corresponding to $\lambda$.

Algorithm. Begin with a nonzero solution $\vec u_1$ of Eq. (5.4.1) and successively multiply by the matrix $A - \lambda I$ until the zero vector is obtained. If
\[ (A - \lambda I)\vec u_1 = \vec u_2 \neq \vec 0, \quad (A - \lambda I)\vec u_2 = \vec u_3 \neq \vec 0, \quad \ldots, \quad (A - \lambda I)\vec u_{k-1} = \vec u_k \neq \vec 0, \]
but $(A - \lambda I)\vec u_k = \vec 0$, then we get the string of $k$ generalized eigenvectors $\vec u_1 \to \vec u_2 \to \cdots \to \vec u_k$.

Example 5.4.3. Let $A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -2 & 2 & -3 & 1 \\ 2 & -2 & 1 & -3 \end{pmatrix}$ with characteristic polynomial $x(x+2)^3$. Find the chains of generalized eigenvectors corresponding to each eigenvalue and the Jordan form of $A$.

Example 5.4.4. Let $A = \begin{pmatrix} 8 & 0 & 0 & 0 \\ 0 & 8 & 0 & 3 \\ 4 & 0 & 8 & 0 \\ 0 & 0 & 0 & 8 \end{pmatrix}$. Find the minimal polynomial of $A$, chain(s) of generalized eigenvectors, and the Jordan form of $A$.

Example 5.4.5. Write down the Jordan form of the following matrices.
\[ (1)\ \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad (2)\ \begin{pmatrix} 3 & 5 & 0 & 0 \\ 0 & 3 & 6 & 0 \\ 0 & 0 & 4 & 7 \\ 0 & 0 & 0 & 4 \end{pmatrix} \qquad (3)\ \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 3 & 5 & 0 \\ 0 & 0 & 4 & 6 \\ 0 & 0 & 0 & 4 \end{pmatrix} \]

Let $N(r) = J(0; r)$ denote the $r \times r$ matrix that has 1's immediately above the diagonal and zeros elsewhere. For example,
\[ N(2) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad N(3) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad N(4) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad\text{etc.} \]
Then $J(\lambda; r) = \lambda I + N(r)$, or in abbreviated form $J = \lambda I + N$.

Suppose that $f(x)$ is a polynomial of degree $s$. Then the Taylor expansion around a point $c$ from calculus gives us
\[ f(c + x) = f(c) + f'(c)x + \frac{f''(c)}{2!}x^2 + \cdots + \frac{f^{(s)}(c)}{s!}x^s, \]
where $f', f'', \ldots, f^{(s)}$ represent the successive derivatives of $f$. In terms of the matrices $I$ and $N$, since the entries of $N^k$ that are $k$ steps above the diagonal are 1's and all the other entries are zeros (so in particular $N^r = 0$), we have
\[ f(J) = f(\lambda I + N) = f(\lambda)I + f'(\lambda)N + \frac{f''(\lambda)}{2!}N^2 + \cdots = \begin{pmatrix} f(\lambda) & f'(\lambda) & \frac{f''(\lambda)}{2!} & \cdots \\ & f(\lambda) & f'(\lambda) & \ddots \\ & & \ddots & \frac{f''(\lambda)}{2!} \\ & & & f'(\lambda) \\ & & & f(\lambda) \end{pmatrix}_{r \times r}. \]
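SymPy can produce the Jordan form directly, which makes a convenient check for Example 5.4.3 (an added sketch, not part of the original notes). The eigenvalue $-2$ has multiplicity 3 but only a 2-dimensional eigenspace (defect 1), so its blocks are $J(-2; 2)$ and $J(-2; 1)$:

```python
from sympy import Matrix, eye

A = Matrix([[ 0,  0,  1,  0],
            [ 0,  0,  0,  1],
            [-2,  2, -3,  1],
            [ 2, -2,  1, -3]])

# defect of lambda = -2: multiplicity 3, nullity 2
print(len((A + 2*eye(4)).nullspace()))  # 2

M, J = A.jordan_form()   # A = M * J * M**(-1)
print(J)                 # blocks J(0;1), J(-2;1), J(-2;2)
assert A == M * J * M.inv()
```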
Example 5.4.6. Compute $J(\lambda; 4)^2$, $J(\lambda; 3)^{10}$ and $J(\lambda; 2)^s$.

Remark. If $J = \operatorname{diag}(J_1, \ldots, J_t)$ is in Jordan form, then $J^s = \operatorname{diag}(J_1^s, \ldots, J_t^s)$.

Example 5.4.7. Compute $J^s$ for $J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$.

Example 5.4.8. Given a square matrix $A$, use the Jordan form of $A$ to determine its minimal polynomial.

Solution. Let $J$ be the Jordan form of $A$, say $A = MJM^{-1}$. Since $f(A) = Mf(J)M^{-1}$, we have $f(A) = 0$ if and only if $f(J) = 0$. Also, if $J(\lambda; r)$ is a Jordan block of $J$, then $f(J(\lambda; r))$ is the corresponding diagonal block of $f(J)$. We must thus find a polynomial $f$ such that, for every Jordan block $J(\lambda; r)$ of $J$, $f(J(\lambda; r)) = 0$ holds. But we derived a formula for $f(J(\lambda; r))$, and it equals the zero matrix if and only if $f(\lambda), f'(\lambda), \ldots, f^{(r-1)}(\lambda)$ are all zero. Thus $f(x)$ and its first $r-1$ derivatives must vanish at $x = \lambda$; in other words, $(x - \lambda)^r$ must be a factor of $f(x)$. Let $\lambda_1, \ldots, \lambda_k$ be the distinct eigenvalues of $A$ and let $m_i$ be the maximum size of the Jordan blocks corresponding to the eigenvalue $\lambda_i$. Hence we obtain that
\[ f(x) = (x - \lambda_1)^{m_1} \cdots (x - \lambda_k)^{m_k} \]
is the minimal polynomial of $A$.

Example 5.4.9. Find the minimal polynomial of the following matrices. [The five matrices (1)–(5) are garbled in the transcription and could not be recovered.]
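As a check on the formula for $f(J)$ and on Example 5.4.6 (an added SymPy sketch, not part of the original notes), the following compares $J(\lambda; 3)^{10}$ computed by direct matrix multiplication with the Taylor-expansion formula $f(\lambda)I + f'(\lambda)N + \frac{f''(\lambda)}{2!}N^2$ for $f(x) = x^{10}$:

```python
from sympy import Matrix, eye, zeros, symbols, diff, factorial, simplify

lam, x = symbols('lambda x')

r = 3
N = Matrix(r, r, lambda i, j: 1 if j == i + 1 else 0)  # nilpotent N(3)
J = lam * eye(r) + N                                   # Jordan block J(lambda; 3)

f = x**10
direct = J**10                     # straightforward matrix power
taylor = zeros(r, r)
for k in range(r):                 # f(l)*I + f'(l)*N + f''(l)/2! * N**2
    taylor += (diff(f, x, k).subs(x, lam) / factorial(k)) * N**k

assert simplify(direct - taylor) == zeros(r, r)
print(taylor)  # superdiagonal 10*lambda**9, corner entry 45*lambda**8
```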
Exercises for Chapter 5.

1. Let $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & b & c \end{pmatrix}$. Find $a, b, c$ so that $\det(A - \lambda I_3) = 9\lambda - \lambda^3$.
2. Let $T : V \to V$ be a linear operator. A subspace $U$ of $V$ is $T$-invariant if $T(U) \subseteq U$, i.e., $\forall \vec u \in U$, $T(\vec u) \in U$.
(a) Show that $\ker T$ and $\operatorname{im} T$ are $T$-invariant.
(b) If $U$ and $W$ are $T$-invariant, prove that $U \cap W$ and $U + W$ are also $T$-invariant.
(c) Show that the eigenspace $E_\lambda(T)$ is $T$-invariant.
3. Show that $A$ and $A^T$ have the same eigenvalues.
4. Show that if $\lambda_1, \ldots, \lambda_k$ are eigenvalues of $A$, then $\lambda_1^m, \ldots, \lambda_k^m$ are eigenvalues of $A^m$ for all $m \geq 1$. Moreover, each eigenvector of $A$ is an eigenvector of $A^m$.
5. Let $A$ and $B$ be $n \times n$ matrices over a field $F$. If $I - AB$ is invertible, prove that $I - BA$ is invertible and $(I - BA)^{-1} = I + B(I - AB)^{-1}A$.
6. Show that if $A$ and $B$ are the same size, then $AB$ and $BA$ have the same eigenvalues.
7. Determine all $2 \times 2$ diagonalizable matrices $A$ with nonzero repeated eigenvalue $a, a$.
8. Let $V$ be the space of all real-valued continuous functions. Define $T : V \to V$ by $(Tf)(x) = \int_0^x f(t)\,dt$. Show that $T$ has no eigenvalues.
9. Prove that if $A$ is invertible and diagonalizable, then $A^{-1}$ is diagonalizable.
10. Let $V = \operatorname{Span}\{1, \sin 2t, \sin^2 t\}$ and let $T : V \to V$ be defined by $T(f) = f''$. Find all eigenvalues and eigenspaces of $T$. Is $T$ diagonalizable? Explain.
11. Let $A = [a_{ij}]$ be an $n \times n$ matrix such that for each $i = 1, 2, \ldots, n$, we have $\sum_{j=1}^n a_{ij} = 0$. Show that $0$ is an eigenvalue of $A$.
12. Let $A$ be an $n \times n$ matrix with characteristic polynomial $(x - \lambda_1)^{d_1}\cdots(x - \lambda_k)^{d_k}$. Show that $\operatorname{tr} A = d_1\lambda_1 + \cdots + d_k\lambda_k$.
13. Let $A$ be a $2 \times 2$ matrix. Prove that the characteristic polynomial of $A$ is given by $x^2 - (\operatorname{tr} A)x + \det A$.
14. If $A$ and $B$ are $2 \times 2$ matrices with determinant one, prove that $\operatorname{tr} AB - (\operatorname{tr} A)(\operatorname{tr} B) + \operatorname{tr} AB^{-1} = 0$.
15. Find the $2 \times 2$ matrices with real entries that satisfy the equation $X^3 - 3X^2 = \begin{pmatrix} -2 & -2 \\ -2 & -2 \end{pmatrix}$. (Hint: apply the Cayley–Hamilton theorem.)
16. Let $A = \begin{pmatrix} 0 & 0 & c \\ 1 & 0 & b \\ 0 & 1 & a \end{pmatrix}$. Prove that the minimal polynomial of $A$ and the characteristic polynomial of $A$ are the same.
17. A $3 \times 3$ matrix $A$ has the characteristic polynomial $x(x-1)(x+2)$. What is the characteristic polynomial of $A^2$?
18. Let $V = M_n(F)$ be the vector space of $n \times n$ matrices over a field $F$, let $A$ be an $n \times n$ matrix, and let $T_A$ be the linear operator on $V$ defined by $T_A(B) = AB$. Show that the minimal polynomial of $T_A$ is the minimal polynomial of $A$.
19. Let $U$ be an $n \times n$ real orthonormal matrix. Prove that (a) $|\operatorname{tr}(U)| \leq n$, and (b) $\det(U - I_n) = 0$ if $n$ is odd and $\det U = 1$.
20. If $U = (\vec u_1 \ \vec u_2 \ \cdots \ \vec u_n)$ with $(\vec u_j, \vec u_k) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k, \end{cases}$ prove that $U$ is unitary.
21. Let $A$ be an $n \times n$ symmetric matrix with distinct eigenvalues $\lambda_1, \ldots, \lambda_k$. Prove that $(A - \lambda_1 I_n)\cdots(A - \lambda_k I_n) = 0$.
22. Unitarily diagonalize the following matrices. [The five matrices (a)–(e) are garbled in the transcription and could not be recovered.]
23. Show that every unitarily diagonalizable matrix is normal.
24. Suppose that $A$ is real symmetric and orthonormal. Prove that the only possible eigenvalues of $A$ are $\pm 1$.
25. Show that if a real matrix $A$ is skew-symmetric (i.e., $A^T = -A$), then $iA$ is Hermitian.
26. Prove that if $A$ is unitarily diagonalizable, then so is $A^H$.
27. Let $A$ be any square real matrix. Show that the eigenvalues of $A^T A$ are all non-negative.
28. Show that the generalized eigenspace $G_\lambda$ corresponding to an eigenvalue $\lambda$ of an $n \times n$ matrix $A$ is a subspace of $F^n$.
29. Suppose the characteristic polynomial of a $4 \times 4$ matrix $A$ is $(x-1)^2(x+1)^2$.
(a) Prove that $A^{-1} = 2A - A^3$.
(b) Write down all possible Jordan form(s) of $A$.
30. Let $J = J(\lambda; r)$ be an $r \times r$ Jordan block with $\lambda$ on its diagonal. Show that $J$ has only one linearly independent eigenvector corresponding to $\lambda$.
31. If $J$ is in Jordan form with $k$ Jordan blocks on the diagonal, prove that $J$ has exactly $k$ linearly independent eigenvectors.
32. These Jordan matrices have eigenvalues $0, 0, 0, 0$:
\[ J = \begin{pmatrix} J(0;2) & 0 \\ 0 & J(0;2) \end{pmatrix} \quad\text{and}\quad K = \begin{pmatrix} J(0;3) & 0 \\ 0 & J(0;1) \end{pmatrix}. \]
For any matrix $M$, compare $JM$ with $MK$. If they are equal, show that $M$ is not invertible. Conclude that $J$ and $K$ are not similar.
33. Suppose that a square matrix $A$ has two eigenvalues $\lambda = 2, 5$, and the numbers $n_p(\lambda) = \operatorname{nullity}(A - \lambda I)^p$, $p \in \mathbb{N}$, are as follows: $n_1(2) = 2$, $n_2(2) = 4$, $n_p(2) = 5$ for $p \geq 3$, and $n_1(5) = 1$, $n_p(5) = 2$ for $p \geq 2$. Write down the Jordan form of $A$.
34. If $J = J(0; 5)$ is the $5 \times 5$ Jordan block with $\lambda = 0$, find $J^2$, count its eigenvectors, and write down its Jordan form.
35. How many possible Jordan forms are there for a $6 \times 6$ matrix with characteristic polynomial $(x-1)^2(x+2)^4$?
36. Let $A = \begin{pmatrix} 2 & a & b \\ 0 & 2 & c \\ 0 & 0 & 1 \end{pmatrix} \in M_3(\mathbb{R})$.
(a) Prove that $A$ is diagonalizable if and only if $a = 0$.
(b) Find the minimal polynomial of $A$ when (i) $a = 0$, (ii) $a \neq 0$.
37. Let $V = \{h(x,y) = ax^2 + bxy + cy^2 + dx + ey + f : a, b, c, d, e, f \in \mathbb{R}\}$ be a subspace of the space of polynomials in two variables $x$ and $y$ over $\mathbb{R}$. Then $B = \{x^2, xy, y^2, x, y, 1\}$ is a basis for $V$. Define $T : V \to V$ by
\[ (T(h))(x,y) = \frac{\partial}{\partial y}\left(\int h(x,y)\,dx\right). \]
(a) Prove that $T$ is a linear transformation and find $A = [T]_B$.
(b) Compute the characteristic polynomial and the minimal polynomial of $A$.
(c) Find the Jordan form of $A$.
38. True or False:
(a) $\begin{pmatrix} 3 & 1 \\ 0 & 4 \end{pmatrix}$ and $\begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix}$ are similar.
(b) $\begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}$ and $\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$ are similar.
39. Show that $\begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$ and $\begin{pmatrix} b & 0 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix}$ are similar.
40. Write down the Jordan form of each of the following matrices and find its minimal polynomial.
[The twelve matrices (a)–(l) are garbled in the transcription and could not be recovered.]
Eigenvalues: (b) $-1, -1, -1$; (c) $1, 1, 1$; (d) $2, 2, 9$; (e) $1, 2, 2$; (f) $2, 2, 2$; (g) $2, 2, 2$; (h) $3, 3, 3$; (i) $-1, -1, 1, 1$; (k) $1, 1, 1, 1$; (l) $1, 1, 1, 1$.