Chapter 5

Structure Theorems

5.1 Eigenvalues and Eigenvectors
We first recall some numerical examples.
Example 5.1.1. Diagonalize A = [1 0; −1 2].
That is, find an invertible matrix P and a diagonal matrix D (if any) such that A = PDP⁻¹.
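As a quick numerical illustration (a minimal NumPy sketch, not part of the original notes; the numpy package and its eig routine are the assumptions here):

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [-1.0, 2.0]])
    # np.linalg.eig returns the eigenvalues and a matrix whose columns
    # are corresponding eigenvectors.
    eigenvalues, P = np.linalg.eig(A)
    D = np.diag(eigenvalues)
    print(eigenvalues)                               # eigenvalues 1 and 2
    # Check A = P D P^{-1} up to floating-point round-off.
    print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True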


Example 5.1.2. Let A = [3 1 −1; 2 2 −1; 2 2 0] and T(~x) = A~x a matrix transformation on R³.
Find a basis B (if any) such that [T]B is a diagonal matrix. It is given that det(A − λI) = −(λ − 1)(λ − 2)².
Definition. Let V be a vector space over a field F and T ∈ L(V, V ).
A scalar λ ∈ F is called an eigenvalue or characteristic value of T if there exists a nonzero
vector ~v ∈ V such that T(~v) = λ~v. If λ is an eigenvalue of T, then a nonzero vector ~v ∈ V
such that T(~v) = λ~v is called an eigenvector or characteristic vector of T associated with
the characteristic value λ. We have that
Eλ (T ) = {~v ∈ V : T (~v ) = λ~v } = {~v ∈ V : (T − λI)(~v ) = ~0V } = ker(T − λI)
is a subspace of V , called the eigenspace or characteristic space of T associated with λ.
Remark. λ is an eigenvalue of T ⇔ ker(T − λI) ≠ {~0V} ⇔ T − λI is not 1-1.
For matrix theory, we restrict ourselves to the case where V is n-dimensional. Then L(V, V) ≅ Mn(F) via T ↦ [T]B for a fixed basis B of V. Hence, we may work entirely in Mn(F).
Definition. Let A ∈ Mn(F). The matrix transformation TA : Fⁿ → Fⁿ is given by
TA(~x) = A~x
for all ~x ∈ Fⁿ. An eigenvalue of TA is called an eigenvalue of A and the eigenspace of TA is
called an eigenspace of A. In other words,
Eλ(A) = {~x ∈ Fⁿ : A~x = λ~x} = {~x ∈ Fⁿ : (A − λIn)~x = ~0n} = Nul(A − λIn).
Then
λ is an eigenvalue of A ⇔ ker(TA − λI) ≠ {~0n}
⇔ Nul(A − λIn) ≠ {~0n}
⇔ A − λIn is not invertible
⇔ det(A − λIn) = 0.
Definition. The polynomial cA (x) = det(xIn − A) is called the characteristic polynomial of A.
Thus we have proved
Theorem 5.1.1. For A ∈ Mn (F ), λ is an eigenvalue of A ⇔ det(A − λIn ) = 0, i.e., λ is a root of
the characteristic polynomial of A.
Since an eigenvalue of an n × n matrix A is a root of cA(x) = det(xIn − A), which has degree n,
and a polynomial of degree n over a field F has at most n roots in F, A has at most n eigenvalues.
Theorem 5.1.2. An n × n matrix has at most n eigenvalues.
Remark. If A is similar to B, say B = P⁻¹AP, then det A = det B and
det(B − λIn) = det(P⁻¹AP − λP⁻¹InP) = det(P⁻¹(A − λIn)P) = det(A − λIn).
Therefore, we have the following result.
Theorem 5.1.3. If A and B are similar n × n matrices, then A and B have the same characteristic
polynomial and eigenvalues (with same multiplicities).
Example 5.1.3. The matrices
A = [1 1; 0 1] and I = [1 0; 0 1]
have the same determinant, trace, characteristic polynomial and eigenvalues, but they are not
similar because PIP⁻¹ = I ≠ A for any invertible matrix P.
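Numerically, the failure of similarity shows up as a shortage of eigenvectors: A − I has rank 1, so dim E1(A) = 1 < 2 (a NumPy sketch, under the same assumption as above):

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    # dim E_1(A) = 2 - rank(A - I) = 1 < 2, so A has no basis of eigenvectors.
    print(np.linalg.matrix_rank(A - np.eye(2)))   # 1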
Definition. A diagonal matrix D is a square matrix in which all the entries off the main
diagonal are zero, that is, D is of the form
D = [λ1 0 . . . 0; 0 λ2 . . . 0; . . .; 0 0 . . . λn] = diag(λ1, λ2, . . . , λn),
where λ1, λ2, . . . , λn ∈ F (not necessarily distinct).
Definition. An n × n matrix A over F is said to be diagonalizable if A is similar to a diagonal
matrix, that is, there exist an invertible matrix P and a diagonal matrix D such that P⁻¹AP = D.
In this case, we say that P diagonalizes A.
Definition. Let V be a finite dimensional vector space and T ∈ L(V, V ) a linear operator. We
say that T is diagonalizable if there exists a basis B for V such that [T ]B is a diagonal matrix.
Theorem 5.1.4. Let A be an n × n matrix.
1. A is diagonalizable ⇔ A has eigenvectors ~v1, . . . , ~vn such that P = [~v1 · · · ~vn] is invertible.
2. When this is the case, P⁻¹AP = diag(λ1, λ2, . . . , λn), where for each i, λi is the eigenvalue of A corresponding to ~vi.
Proof. Let P = [~v1 ~v2 · · · ~vn] and D = diag(λ1, λ2, . . . , λn). Then AP = PD becomes
[A~v1 A~v2 · · · A~vn] = [λ1~v1 λ2~v2 · · · λn~vn].
Comparing columns shows that AP = PD holds if and only if A~vi = λi~vi for each i, so
P⁻¹AP = D ⇔ P is invertible and A~vi = λi~vi for all i ∈ {1, . . . , n}.
The results follow.
Theorem 5.1.5. Let ~v1 , . . . , ~vm be eigenvectors corresponding to distinct eigenvalues λ1 , . . . , λm of
an n × n matrix A. Then {~v1 , . . . , ~vm } is linearly independent.
Proof. We use induction on m.
If m = 1, then {~v1} is linearly independent because ~v1 ≠ ~0.
Let k ≥ 1 and assume the theorem is true for any k eigenvectors corresponding to distinct eigenvalues.
Let ~v1, . . . , ~vk+1 be eigenvectors corresponding to distinct eigenvalues λ1, . . . , λk+1 of A.
Let c1, . . . , ck+1 ∈ F be such that
c1~v1 + c2~v2 + · · · + ck+1~vk+1 = ~0.    (5.1.1)
Since A~vi = λi~vi for all i, multiplying both sides by A gives
c1λ1~v1 + c2λ2~v2 + · · · + ck+1λk+1~vk+1 = ~0.    (5.1.2)
Subtracting λ1 × (5.1.1) from (5.1.2), we have
c2(λ2 − λ1)~v2 + · · · + ck+1(λk+1 − λ1)~vk+1 = ~0.
Since ~v2, . . . , ~vk+1 are k eigenvectors corresponding to distinct eigenvalues, they are linearly
independent by the induction hypothesis, so
c2(λ2 − λ1) = · · · = ck+1(λk+1 − λ1) = 0.
However, λ1, . . . , λk+1 are distinct, hence we get c2 = · · · = ck+1 = 0.
This implies c1~v1 = ~0, so c1 = 0 because ~v1 ≠ ~0.
Therefore, {~v1, . . . , ~vk+1} is linearly independent.
Corollary 5.1.6. If A is an n × n matrix with n distinct eigenvalues, then A is diagonalizable.
Proof. Let ~v1, . . . , ~vn be eigenvectors corresponding to the distinct eigenvalues λ1, . . . , λn of A.
Then they are linearly independent, and so P = [~v1 . . . ~vn] is invertible
and P⁻¹AP = diag(λ1, λ2, . . . , λn).
Lemma 5.1.7. Let {~v1, . . . , ~vk} be a linearly independent set of eigenvectors of an n × n matrix A,
extend it to a basis {~v1, . . . , ~vk, ~vk+1, . . . , ~vn} of Fⁿ, and let
P = [~v1 . . . ~vk ~vk+1 . . . ~vn],
which is invertible. If λ1, . . . , λk are the (not necessarily distinct) eigenvalues of A corresponding
to ~v1, . . . , ~vk, respectively, then P⁻¹AP has block form
P⁻¹AP = [diag(λ1, . . . , λk) B; 0 C],
where B has size k × (n − k) and C has size (n − k) × (n − k).
Definition. An eigenvalue λ of a square matrix A is said to have multiplicity m if it occurs m
times as a root of the characteristic polynomial cA (x).
In other words,
cA(x) = (x − λ)^m g(x)
for some polynomial g(x) such that g(λ) ≠ 0.
Lemma 5.1.8. Let λ be an eigenvalue of multiplicity m of a square matrix A.
Then nullity(A − λI) = dim Eλ (A) ≤ m.
Proof. Assume that dim Eλ(A) = d with basis {~v1, . . . , ~vd}. By Lemma 5.1.7, there exists an
invertible n × n matrix P such that
P⁻¹AP = [λId B; 0 C] = M,
where Id is the d × d identity matrix. Since M and A are similar,
cA(x) = cM(x) = det(xIn − M) = det[(x − λ)Id −B; 0 xIn−d − C] = det((x − λ)Id) det(xIn−d − C) = (x − λ)^d cC(x).
Hence, d ≤ m because m is the highest power of (x − λ) in cA(x).
Theorem 5.1.9. Let λ1, . . . , λk be all distinct eigenvalues of an n × n matrix A.
For each i ∈ {1, . . . , k}, let mi denote the multiplicity of λi and write di = nullity(A − λi In).
Then 1 ≤ di ≤ mi for all i, n = m1 + · · · + mk and
cA(x) = (x − λ1)^{m1} · · · (x − λk)^{mk}.
Moreover, the following statements are equivalent.
(i) A is diagonalizable.
(ii) di = nullity(A − λi In ) = dim Eλi (A) = mi for all i.
(iii) n = d1 + · · · + dk .
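These equivalences can be checked mechanically. A minimal sketch, assuming the sympy library (not part of the notes), applied to the matrix of Example 5.1.2:

    from sympy import Matrix, eye

    A = Matrix([[3, 1, -1],
                [2, 2, -1],
                [2, 2, 0]])          # Example 5.1.2: cA(x) = (x - 1)(x - 2)^2
    # eigenvals() returns each eigenvalue with its algebraic multiplicity m_i.
    for lam, m in A.eigenvals().items():
        d = len((A - lam * eye(3)).nullspace())   # d_i = nullity(A - lam*I)
        print(lam, m, d)             # prints 1, 1, 1 and 2, 2, 1
    # Since d = 1 < 2 = m for lam = 2, A is not diagonalizable.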
5.2 Annihilating Polynomials
Let A be an n × n matrix over a field F. Since dim Mn(F) = n², the set {In, A, A², . . . , A^{n²}} is
linearly dependent. Then there exist c0, c1, . . . , c_{n²} in F, not all zero, such that
c0 In + c1 A + c2 A² + · · · + c_{n²} A^{n²} = 0.
Let f(x) be the polynomial over F defined by f(x) = c0 + c1 x + c2 x² + · · · + c_{n²} x^{n²}. Then f(x) ≠ 0
and f(A) = 0.
Let g(x) = α⁻¹f(x) where α is the leading coefficient of f(x). Then g(x) is monic (leading
coefficient = 1) and g(A) = 0. Thus there exists a polynomial p(x) over F such that
(a) p(A) = 0,
(b) p(x) is monic, and
(c) for every nonzero polynomial q(x), q(A) = 0 ⇒ deg p(x) ≤ deg q(x).
Such p(x) is unique (Proof!) and it is called the minimal polynomial of A. Note that if
k(x) ∈ F[x] and k(A) = 0, then p(x) | k(x).
Remark. If A and B in Mn (F ) are similar, then they have the same minimal polynomial.
Recall that the characteristic polynomial of A is given by
cA (x) = det(xIn − A).
Theorem 5.2.1. The characteristic polynomial and minimal polynomial for A have the same roots.
Remark. Although the minimal polynomial and the characteristic polynomial have the same
roots, they may not be the same.


Example 5.2.1. The characteristic polynomial for A = [5 −6 −6; −1 4 2; 3 −6 −4] is (x − 1)(x − 2)² while
(A − I)(A − 2I) = 0,
so the minimal polynomial of A is (x − 1)(x − 2). Notice that A is diagonalizable. In general, we
have:
Theorem 5.2.2. If an n × n matrix A is diagonalizable with distinct eigenvalues λ1 , . . . , λk , then
(x − λ1 ) . . . (x − λk ) is the minimal polynomial for A.
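For Example 5.2.1 the claim can be verified directly. A minimal sketch (sympy assumed) testing the only proper candidate divisor with the same roots:

    from sympy import Matrix, eye, zeros

    A = Matrix([[5, -6, -6],
                [-1, 4, 2],
                [3, -6, -4]])
    I3 = eye(3)
    # The minimal polynomial divides (x - 1)(x - 2)^2 and has roots 1 and 2,
    # so it is either (x - 1)(x - 2) or (x - 1)(x - 2)^2.
    print((A - I3) * (A - 2 * I3) == zeros(3, 3))   # True, so p(x) = (x - 1)(x - 2)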
Theorem 5.2.3. [Cayley-Hamilton] If f (x) is the characteristic polynomial of a matrix A, then
f (A) = 0.
Proof. Write f(x) = det(xIn − A) = xⁿ + an−1 x^{n−1} + · · · + a1 x + a0. Let B = xIn − A. Since
each entry of adj B is, up to sign, the determinant of an (n − 1) × (n − 1) submatrix of B, it is a
polynomial in x of degree at most n − 1, say
Cij(B) = b_ij^{(n−1)} x^{n−1} + b_ij^{(n−2)} x^{n−2} + · · · + b_ij^{(1)} x + b_ij^{(0)}
for all i, j ∈ {1, . . . , n}. Thus
adj B = [Cij(B)]ᵀ = Bn−1 x^{n−1} + Bn−2 x^{n−2} + · · · + B1 x + B0,
where each Bi ∈ Mn(F). Recall that
(det B)In = B(adj B) = B(Bn−1 x^{n−1} + Bn−2 x^{n−2} + · · · + B1 x + B0).
Then
(xⁿ + an−1 x^{n−1} + · · · + a1 x + a0)In
= (xIn − A)(Bn−1 x^{n−1} + Bn−2 x^{n−2} + · · · + B1 x + B0)
= Bn−1 xⁿ + (Bn−2 − ABn−1)x^{n−1} + · · · + (B0 − AB1)x − AB0.
Comparing coefficients gives
In = Bn−1
an−1 In = Bn−2 − ABn−1
an−2 In = Bn−3 − ABn−2
...
a1 In = B0 − AB1
a0 In = −AB0.
Multiplying these equations on the left by Aⁿ, A^{n−1}, . . . , A, In, respectively, and adding, the
right-hand side telescopes:
Aⁿ + an−1 A^{n−1} + · · · + a1 A + a0 In
= Aⁿ Bn−1 + A^{n−1}(Bn−2 − ABn−1) + A^{n−2}(Bn−3 − ABn−2) + · · · + A(B0 − AB1) − AB0
= 0,
as desired.
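A numerical check of the theorem on the matrix of Example 5.2.2 below (a NumPy sketch; np.poly computes the characteristic polynomial coefficients of a square array):

    import numpy as np

    A = np.array([[3.0, 1.0, -1.0],
                  [2.0, 2.0, -1.0],
                  [2.0, 2.0, 0.0]])
    coeffs = np.poly(A)          # [1, a_{n-1}, ..., a_1, a_0], highest power first
    n = A.shape[0]
    # Evaluate f(A) = A^n + a_{n-1} A^{n-1} + ... + a_1 A + a_0 I.
    f_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
    print(np.allclose(f_of_A, 0))   # True: f(A) = 0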

Example 5.2.2. Determine the minimal polynomial of A = [3 1 −1; 2 2 −1; 2 2 0].

Some consequences of the Cayley-Hamilton Theorem are as follows.
Corollary 5.2.4. The minimal polynomial of A divides its characteristic polynomial.
Recall that
0 is an eigenvalue of A ⇔ 0 = det(A − 0I) = det A ⇔ A is not invertible.
Corollary 5.2.5. If f(x) = a0 + a1 x + · · · + an−1 x^{n−1} + xⁿ is the characteristic polynomial of an
invertible matrix A, then a0 ≠ 0 and
A⁻¹ = −(1/a0)(a1 I + a2 A + · · · + A^{n−1}).
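A sketch of the resulting inversion formula (NumPy assumed), applied to the matrix of Example 5.1.1:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [-1.0, 2.0]])
    n = A.shape[0]
    a = np.poly(A)[::-1]         # a[k] = coefficient of x^k, with a[n] = 1
    # A^{-1} = -(1/a_0)(a_1 I + a_2 A + ... + A^{n-1})
    A_inv = -(1.0 / a[0]) * sum(a[k] * np.linalg.matrix_power(A, k - 1)
                                for k in range(1, n + 1))
    print(np.allclose(A_inv, np.linalg.inv(A)))   # True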
5.3 Symmetric and Hermitian Matrices
Definition. Let F = R or C and A = [aij] a matrix over F.
The matrix A is said to be symmetric if A = Aᵀ. We define A^H = [āij]ᵀ, the conjugate
transpose (Hermitian adjoint) of A. We say that A is Hermitian or self-adjoint if A = A^H.
Notice that symmetric and Hermitian matrices are square matrices, and the two notions coincide if F = R.
Example 5.3.1. Let A = [3 1; 1 −2] and B = [−1 2+3i; 2−3i 2].
Then A is symmetric and both of them are Hermitian.
Theorem 5.3.1. If A is a Hermitian matrix, then
(1) ~x^H A~x is real for all ~x ∈ Cⁿ, and
(2) the eigenvalues of A are real.
That is, if A is Hermitian, then all roots of cA(x) are real.
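For instance (a NumPy sketch; eigvalsh is NumPy's eigenvalue routine for Hermitian matrices), using the matrix of Example 5.3.4 below:

    import numpy as np

    A = np.array([[1.0, 1 - 1j],
                  [1 + 1j, 0.0]])
    print(np.allclose(A, A.conj().T))    # True: A is Hermitian
    print(np.linalg.eigvalsh(A))         # [-1.  2.]: the eigenvalues are real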
Example 5.3.2. For vectors ~x and ~y in Cⁿ, we define (~x, ~y) = ~x^H ~y.
Then (·, ·) is an inner product on Cⁿ so that
‖~x‖² = ~x^H ~x = |x1|² + · · · + |xn|²
for all ~x = (x1, . . . , xn) ∈ Cⁿ.
Theorem 5.3.2. Two eigenvectors corresponding to different eigenvalues of a Hermitian matrix
are orthogonal to one another.
Definition. For F = R or C and U ∈ Mn(F), U is called unitary if U^H U = In = UU^H. If
F = R, a unitary matrix satisfies Uᵀ U = In = UUᵀ and may be called an orthonormal matrix.
Theorem 5.3.3. Let U ∈ Mn(C) be a unitary matrix.
For the inner product defined in Example 5.3.2, we have
(U~x, U~y) = (~x, ~y) for all ~x, ~y ∈ Cⁿ, so ‖U~x‖ = ‖~x‖ for all ~x ∈ Cⁿ.
Corollary 5.3.4. If U = [~u1 ~u2 . . . ~un] ∈ Mn(C) is a unitary matrix, then for all j, k ∈
{1, 2, . . . , n} we have
(~uj, ~uk) = 1 if j = k, and 0 if j ≠ k.
Remark. The converse of Corollary 5.3.4 is also true and its proof is left as an exercise.
Example 5.3.3. U1 = (1/√2)[1 i; i 1] and U2 = [cos t −sin t; sin t cos t] are unitary matrices.
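A quick check that U1 is unitary (NumPy sketch):

    import numpy as np

    U1 = np.array([[1, 1j],
                   [1j, 1]]) / np.sqrt(2)
    # U^H U = I characterizes unitarity.
    print(np.allclose(U1.conj().T @ U1, np.eye(2)))   # True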
Theorem 5.3.5. Every eigenvalue of a unitary matrix U has absolute value one, i.e., |λ| = 1.
Moreover, eigenvectors corresponding to different eigenvalues are orthogonal to each other.
We are going to explore some very remarkable facts about Hermitian and real symmetric
matrices. These matrices are diagonalizable, and moreover the diagonalization can be accomplished
by a unitary matrix P. This means that P⁻¹AP = P^H AP is diagonal. In this situation, we say
that the matrix A is unitarily or orthogonally diagonalizable. Unitary and orthogonal
diagonalizations are particularly attractive since computing the inverse is essentially free and
error-free as well: P^H = P⁻¹.
Theorem 5.3.6. If a real matrix A is orthogonally diagonalizable with an orthonormal matrix P,
that is, Pᵀ AP is a diagonal matrix, then A is symmetric.
Remark. The converse of Theorem 5.3.6 is also true. In addition, we prove a stronger result.
Theorem 5.3.7. [Principal Axes Theorem] Every Hermitian matrix is unitarily diagonalizable.
In addition, every real symmetric matrix is orthogonally diagonalizable.
Proof. We shall show this statement by induction on n. It is clear for n = 1.
Assume that n > 1 and every (n−1) × (n−1) Hermitian matrix is unitarily diagonalizable. Consider
an n × n Hermitian matrix A.
Let λ1 be a real eigenvalue of A with unit eigenvector ~v, so that A~v = λ1~v and ‖~v‖ = 1.
Let W = {~v}⊥ with orthonormal basis {~z1, . . . , ~zn−1}.
Thus R = [~v ~z1 . . . ~zn−1] is an n × n unitary matrix. Observe that the first column of
B = R^H AR is R^H A~v = λ1 R^H ~v = (λ1, 0, . . . , 0)ᵀ, and B^H = (R^H AR)^H = B, so B is Hermitian
and its first row is (λ1, 0, . . . , 0) as well. Hence, in block form,
B = R^H AR = [λ1 0; 0 C],
where C is an (n − 1) × (n − 1) Hermitian matrix.
Since C is an (n − 1) × (n − 1) Hermitian matrix, by the induction hypothesis,
there exists an (n − 1) × (n − 1) unitary matrix Q such that Q^H CQ = diag(λ2, . . . , λn).
Let P = [1 0; 0 Q] (an n × n unitary matrix). Then
P^H BP = P^H R^H ARP = (RP)^H A(RP).
Choose U = RP. Then U^H = (RP)^H = P^H R^H = P⁻¹R⁻¹ = (RP)⁻¹ = U⁻¹ and
U^H AU = P^H BP = diag(λ1, λ2, . . . , λn).
Hence, A is unitarily diagonalizable.
Example 5.3.4. Diagonalize the Hermitian matrix A = [1 1−i; 1+i 0].


Example 5.3.5. Orthogonally diagonalize the symmetric matrix A = [1 2 0; 2 4 0; 0 0 5].
(Given λ = 0, 5, 5.)
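A sketch of the computation (NumPy assumed; np.linalg.eigh returns an orthonormal eigenbasis for a real symmetric matrix):

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [2.0, 4.0, 0.0],
                  [0.0, 0.0, 5.0]])
    lam, P = np.linalg.eigh(A)
    print(lam)                                       # [0. 5. 5.]
    print(np.allclose(P.T @ P, np.eye(3)))           # True: P is orthonormal
    print(np.allclose(P.T @ A @ P, np.diag(lam)))    # True: P^T A P = D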
Definition. A square matrix A is normal if A^H A = AA^H.
Clearly, every Hermitian matrix is normal.
Theorem 5.3.8. A matrix is unitarily diagonalizable if and only if it is normal.
Proof. It is a consequence of the Schur Triangularization Theorem, which is beyond the scope of
this course.
Real versus Complex

(x1, . . . , xn) ∈ Rⁿ                              (x1, . . . , xn) ∈ Cⁿ
length: ‖~x‖² = x1² + · · · + xn²                  ‖~x‖² = |x1|² + · · · + |xn|²
transpose: (Aᵀ)ij = Aji                            Hermitian: (A^H)ij = Āji
(AB)ᵀ = Bᵀ Aᵀ                                      (AB)^H = B^H A^H
~x · ~y = ~xᵀ ~y = x1 y1 + · · · + xn yn           ~x · ~y = ~x^H ~y = x̄1 y1 + · · · + x̄n yn
orthogonality: ~xᵀ ~y = 0                          ~x^H ~y = 0
orthonormal: Pᵀ P = In = P Pᵀ                      unitary: U^H U = In = U U^H
symmetric matrix: Aᵀ = A                           Hermitian matrix: A^H = A
A = P DP⁻¹ = P DPᵀ (real D)                        A = U DU⁻¹ = U DU^H (real D)
orthogonally diagonalizable                        unitarily diagonalizable

5.4 Jordan Forms
Theorem 5.1.9 gives necessary and sufficient conditions for an n × n matrix to be diagonalizable,
namely that it should have n independent eigenvectors. We have also seen square matrices which
are not diagonalizable. In this section, we discuss the so-called Jordan canonical form, a form
of matrix to which every square matrix is similar.
Definition. Let A be an n × n matrix. Let λ be an eigenvalue of A with nullity(A − λIn ) = ℓ.
Assume that λ is of multiplicity m. Then 1 ≤ ℓ ≤ m.
If m = 1, then ℓ = m = 1.
If m > 1 and ℓ < m, then λ is said to be defective and the number m − ℓ > 0 of missing
eigenvector(s) is called the defect of λ.
Note that if A has a defective eigenvalue, then A is not diagonalizable.
Definition. The generalized eigenspace Gλ corresponding to an eigenvalue λ of A consists
of all vectors ~v such that (A − λI)ᵏ~v = ~0 for some k ∈ N, that is,
Gλ(A) = {~v ∈ Fⁿ : (A − λI)ᵏ~v = ~0 for some k ∈ N} = ⋃_{k∈N} Nul(A − λI)ᵏ.
Definition. A length r chain of generalized eigenvectors based on the eigenvector ~v for λ
is a set {~v = ~v1 , ~v2 , . . . , ~vr } of r linearly independent generalized eigenvectors such that
(A − λI)~vr = ~vr−1 ,
(A − λI)~vr−1 = ~vr−2 ,
..
.
(A − λI)~v2 = ~v1 .
Since ~v1 is an eigenvector, (A − λI)~v1 = ~0. It follows that
(A − λI)^r ~vr = ~0.
We may denote the action of the matrix A − λI on the string of vectors by
~vr −→ ~vr−1 −→ · · · −→ ~v2 −→ ~v1 −→ ~0.
Now let W be the subspace of Gλ spanned by {~v1 , . . . , ~vr }. Any vector ~x in W has a representation of the form
~x = c1~v1 + c2~v2 + · · · + cr~vr
and
A~x = c1 (A~v1 ) + c2 (A~v2 ) + · · · + cr (A~vr )
= c1 (λ~v1 ) + c2 (λ~v2 + ~v1 ) + · · · + cr (λ~vr + ~vr−1 )
= (λc1 + c2 )~v1 + · · · + (λcr−1 + cr )~vr−1 + λcr~vr .
Thus A~x is also in W. If B = {~v1, . . . , ~vr} is a basis for W, then
[A~x]B = (λc1 + c2, λc2 + c3, . . . , λcr−1 + cr, λcr)ᵀ = J[~x]B,
where
J = J(λ; r) = [λ 1 0 . . . 0; 0 λ 1 . . . 0; . . .; 0 0 . . . λ 1; 0 0 . . . 0 λ]  (r × r)
is called the Jordan block of size r corresponding to λ.
Example 5.4.1. Let A = [1 1; −1 3]. Find the generalized eigenspaces of A.




Example 5.4.2. Let A1 = [0 1 2; −5 −3 −7; 1 0 0] and A2 = [−1 1 0; 0 −1 0; 0 1 −1].
Then A1 and A2 have the same characteristic polynomial (x + 1)³. Find
(1) the minimal polynomials of A1 and A2, and
(2) the generalized eigenspaces of A1 and A2.
Theorem 5.4.1. If an n × n matrix A has t linearly independent eigenvectors, then it is similar to
a matrix J that is in Jordan form, with t square blocks on the diagonal:
J = M⁻¹AM = diag(J1, . . . , Jt).    (Jordan form)
Each block has one eigenvector, one eigenvalue, and 1's just above the diagonal:
Ji = J(λi; ri) = [λi 1; λi 1; . . .; λi 1; λi]  (ri × ri).    (Jordan block)
The same λi will appear in several blocks if it has several independent eigenvectors. Moreover, M
consists of n generalized eigenvectors which are linearly independent.
Remark. Theorem 5.4.1 says that every n × n matrix A has n linearly independent generalized
eigenvectors. These n generalized eigenvectors may be arranged in chains, with the sum of the
lengths of the chains associated with a given eigenvalue λ equal to the multiplicity of λ. But the
structure of these chains depends on the defect of λ, and can be quite complicated. For instance,
an eigenvalue of multiplicity four can correspond to
• four length-1 chains (defect 0);
• two length-1 chains and one length-2 chain (defect 1);
• two length-2 chains (defect 2);
• one length-1 chain and one length-3 chain (defect 2);
• one length-4 chain (defect 3).
Observe that, in each of these cases, the length of the longest chain is at most d + 1, where d
is the defect of the eigenvalue. Consequently, once we have found all the ordinary eigenvectors
corresponding to a multiple eigenvalue λ, and therefore know the defect d of λ, we can begin
with the equation
(A − λI)^{d+1} ~u = ~0    (5.4.1)
to start building the chains of generalized eigenvectors corresponding to λ.
Algorithm: Begin with a nonzero solution ~u1 of Eq. (5.4.1) and successively multiply by the
matrix A − λI until the zero vector is obtained. If
(A − λI)~u1 = ~u2 ≠ ~0,
(A − λI)~u2 = ~u3 ≠ ~0,
...
(A − λI)~uk−1 = ~uk ≠ ~0,
but (A − λI)~uk = ~0, then we get the string of k generalized eigenvectors
~u1 −→ ~u2 −→ · · · −→ ~uk.
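A sketch of this algorithm on the matrix of Example 5.4.1 (NumPy assumed), where λ = 2 has multiplicity 2 and defect 1:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [-1.0, 3.0]])       # cA(x) = (x - 2)^2, nullity(A - 2I) = 1
    N = A - 2 * np.eye(2)
    # (A - 2I)^{d+1} = (A - 2I)^2 = 0, so any u1 not already in Nul(A - 2I)
    # starts a chain of length 2.
    u1 = np.array([1.0, 0.0])
    u2 = N @ u1                        # an ordinary eigenvector for lambda = 2
    print(u2)                          # [-1. -1.]
    print(N @ u2)                      # [0. 0.]: the chain is u1 -> u2 -> 0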

Example 5.4.3. Let A = [0 0 1 0; 0 0 0 1; −2 2 −3 1; 2 −2 1 −3] with the characteristic
polynomial x(x + 2)³. Find the chains of generalized eigenvectors corresponding to each eigenvalue
and the Jordan form of A.

Example 5.4.4. Let A = [8 0 0 0; 0 8 0 3; 4 0 8 0; 0 0 0 8]. Find the minimal polynomial of A,
the chain(s) of generalized eigenvectors, and the Jordan form of A.

Example 5.4.5. Write down the Jordan form of the following matrices.
(1) [0 0 1 1; 0 0 1 1; 0 0 1 1; 0 0 0 0]
(2) [3 5 0 0; 0 3 6 0; 0 0 4 7; 0 0 0 4]
(3) [3 0 0 0; 0 3 5 0; 0 0 4 6; 0 0 0 4]
Let N(r) = J(0; r) denote the r × r matrix that has 1's immediately above the diagonal and
zeros elsewhere. For example,
N(2) = [0 1; 0 0], N(3) = [0 1 0; 0 0 1; 0 0 0], N(4) = [0 1 0 0; 0 0 1 0; 0 0 0 1; 0 0 0 0], etc.
Then J(λ; r) = λI + N(r), or abbreviated J = λI + N.
Suppose that f(x) is a polynomial of degree s. Then the Taylor expansion around a point c from
calculus gives us
f(c + x) = f(c) + f′(c)x + (f′′(c)/2!)x² + · · · + (f⁽ˢ⁾(c)/s!)xˢ,
where f′, f′′, . . . , f⁽ˢ⁾ represent successive derivatives of f. In terms of the matrices I and N, we
have
f(J) = f(λI + N) = f(λI) + f′(λI)N + (f′′(λI)/2!)N² + · · · + (f⁽ˢ⁾(λI)/s!)Nˢ
= f(λ)I + f′(λ)N + (f′′(λ)/2!)N² + · · · + (f⁽ˢ⁾(λ)/s!)Nˢ,
which is the r × r upper triangular matrix
[f(λ) f′(λ) f′′(λ)/2! . . .; 0 f(λ) f′(λ) . . .; . . .; 0 . . . f(λ) f′(λ); 0 . . . 0 f(λ)],
with f⁽ᵏ⁾(λ)/k! on the k-th diagonal above the main one, because the entries of Nᵏ that are
k steps above the diagonal are 1's and all the other entries are zeros.
Example 5.4.6. Compute J(λ; 4)2 , J(λ; 3)10 and J(λ; 2)s .
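The second of these can be checked numerically; a minimal sketch (NumPy assumed) comparing J(2; 3)¹⁰ against the formula above:

    import numpy as np

    lam, r, s = 2.0, 3, 10
    N = np.diag(np.ones(r - 1), k=1)      # N(3): 1's just above the diagonal
    J = lam * np.eye(r) + N
    # For f(x) = x^10: diagonal f(lam), then f'(lam), then f''(lam)/2!.
    f, fp, fpp = lam**s, s * lam**(s - 1), s * (s - 1) * lam**(s - 2)
    predicted = np.array([[f, fp, fpp / 2],
                          [0, f, fp],
                          [0, 0, f]])
    print(np.allclose(np.linalg.matrix_power(J, s), predicted))   # True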


Remark. If J = diag(J1, . . . , Jt) is in a Jordan form, then Jˢ = diag(J1ˢ, . . . , Jtˢ).

Example 5.4.7. Compute Jˢ for J = [2 1 0; 0 2 0; 0 0 3].
Example 5.4.8. Given a square matrix A, use the Jordan form of A to determine its minimal
polynomial.
Solution. Let J be the Jordan form of A, say A = MJM⁻¹. Since f(A) = Mf(J)M⁻¹, f(A) = 0 if and only if
f(J) = 0. Also, if J(λ; r) is a Jordan block of J, then f(J(λ; r)) is the corresponding block of f(J). We must
thus find a polynomial f such that, for every Jordan block J(λ; r) of J, f(J(λ; r)) = 0 holds.
But we derived a formula for f(J(λ; r)), and it equals the zero matrix if and only if f(λ), f′(λ),
. . . , f⁽ʳ⁻¹⁾(λ) are all zero. Thus, f(x) and its first r − 1 derivatives must vanish at x = λ; in other
words, (x − λ)^r must be a factor of f(x).
Let λ1, . . . , λk be the distinct eigenvalues of A and mi the maximum size of the Jordan blocks
corresponding to the eigenvalue λi. Hence, we obtain that
f(x) = (x − λ1)^{m1} . . . (x − λk)^{mk}
is the minimal polynomial of A.
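A small check of this recipe (NumPy assumed): for J = diag(J(3; 2), J(3; 1)) the largest block has size 2, and indeed (x − 3)² annihilates J while (x − 3) does not:

    import numpy as np

    J = np.array([[3.0, 1.0, 0.0],
                  [0.0, 3.0, 0.0],
                  [0.0, 0.0, 3.0]])    # blocks J(3; 2) and J(3; 1)
    N = J - 3 * np.eye(3)
    print(np.allclose(N, 0))           # False: (x - 3) is not enough
    print(np.allclose(N @ N, 0))       # True: minimal polynomial (x - 3)^2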
Example 5.4.9. Find the minimal polynomial of the following matrices.
(1) [2 0 0; 0 2 0; 0 0 −1]
(2) [2 1 0; 0 2 0; 0 0 −1]
(3) [2 1 0; 0 1 0; 0 0 −1]
(4) diag(J(3; 3), J(3; 1), J(8; 2), J(5; 1))
(5) diag(J(3; 2), J(3; 2), J(8; 2), J(5; 1))


Exercises for Chapter 5.
1. Let A = [0 1 0; 0 0 1; a b c]. Find a, b, c so that det(A − λI3) = 9λ − λ³.
2. Let T : V → V be a linear operator.
A subspace U of V is T -invariant if T (U ) ⊆ U , i.e., ∀~u ∈ U, T (~u) ∈ U .
(a) Show that ker T and im T are T -invariant.
(b) If U and W are T -invariant, prove that U ∩ W and U + W are also T -invariant.
(c) Show that the eigenspace Eλ (T ) is T -invariant.
3. Show that A and Aᵀ have the same eigenvalues.
4. Show that if λ1, . . . , λk are eigenvalues of A, then λ1^m, . . . , λk^m are eigenvalues of A^m for all m ≥ 1.
Moreover, each eigenvector of A is an eigenvector of A^m.
5. Let A and B be n × n matrices over a field F. If I − AB is invertible, prove that I − BA is invertible
and (I − BA)⁻¹ = I + B(I − AB)⁻¹A.
6. Show that if A and B are square matrices of the same size, then AB and BA have the same eigenvalues.
7. Determine all 2 × 2 diagonalizable matrices A with nonzero repeated eigenvalue a, a.
8. Let V be the space of all real-valued continuous functions. Define T : V → V by
(Tf)(x) = ∫₀ˣ f(t) dt.
Show that T has no eigenvalues.
9. Prove that if A is invertible and diagonalizable, then A⁻¹ is diagonalizable.
10. Let V = Span{1, sin 2t, sin² t}. Let T : V → V be defined by T(f) = f′′.
Find all eigenvalues and eigenspaces of T. Is T diagonalizable? Explain.
11. Let A = [aij] be an n × n matrix such that for each i = 1, 2, . . . , n, we have ∑_{j=1}^{n} aij = 0.
Show that 0 is an eigenvalue of A.
12. Let A be an n × n matrix with characteristic polynomial (x − λ1)^{d1} . . . (x − λk)^{dk}.
Show that tr A = d1λ1 + · · · + dkλk.
13. Let A be a 2 × 2 matrix. Prove that the characteristic polynomial of A is given by
x² − (tr A)x + det A.
14. If A and B are 2 × 2 matrices with determinant one, prove that
tr(AB) − (tr A)(tr B) + tr(AB⁻¹) = 0.
15. Find the 2 × 2 matrices with real entries that satisfy the equation
X³ − 3X² = [−2 −2; −2 −2].
(Hint. Apply the Cayley-Hamilton Theorem.)
16. Let A = [0 0 c; 1 0 b; 0 1 a].
Prove that the minimal polynomial of A and the characteristic polynomial of A are the same.
17. A 3 × 3 matrix A has the characteristic polynomial x(x − 1)(x + 2).
What is the characteristic polynomial of A²?
18. Let V = Mn(F) be the vector space of n × n matrices over a field F. Let A be an n × n matrix.
Let TA be the linear operator on V defined by TA(B) = AB.
Show that the minimal polynomial for TA is the minimal polynomial for A.
19. Let U be an n × n real orthonormal matrix. Prove that
(a) |tr(U)| ≤ n, and
(b) det(U² − In) = 0 if n is odd.
20. If U = [~u1 ~u2 . . . ~un] with (~uj, ~uk) = 1 if j = k and 0 if j ≠ k, prove that U is unitary.
21. Let A be an n × n symmetric matrix with distinct eigenvalues λ1, . . . , λk. Prove that
(A − λ1In) . . . (A − λkIn) = 0.
22. Unitarily diagonalize the following matrices.
(a) [2 1; −1 2]
(b) [3 i; −i 0]
(c) [0 1 0; −1 0 0; 0 0 −1]
(d) [0 1 0; 1 0 0; 0 0 1]
(e) [2 i i; −i 1 0; −i 0 1]
23. Show that every unitarily diagonalizable matrix is normal.
24. Suppose that A is real symmetric and orthonormal. Prove that the only possible eigenvalues of A are
±1.
25. Show that if a real matrix A is skew-symmetric (i.e., Aᵀ = −A), then iA is Hermitian.
26. Prove that if A is unitarily diagonalizable, then so is A^H.
27. Let A be any square real matrix. Show that the eigenvalues of Aᵀ A are all non-negative.
28. Show that the generalized eigenspace Gλ corresponding to an eigenvalue λ of an n × n matrix A is
a subspace of Fⁿ.
29. Suppose the characteristic polynomial of a 4 × 4 matrix A is (x − 1)²(x + 1)².
(a) Prove that A⁻¹ = 2A − A³.
(b) Write down all possible Jordan form(s) of A.
30. Let J = J(λ; r) be an r × r Jordan block with λ on its diagonal. Show that J has only one linearly
independent eigenvector corresponding to λ.
31. If J is in Jordan form with k Jordan blocks on the diagonal, prove that J has exactly k linearly
independent eigenvectors.
32. These Jordan matrices have eigenvalues 0, 0, 0, 0:
J = [0 1 0 0; 0 0 0 0; 0 0 0 1; 0 0 0 0] and K = [0 1 0 0; 0 0 1 0; 0 0 0 0; 0 0 0 0].
For any matrix M, compare JM with MK. If they are equal, show that M is not invertible. Then J
and K are not similar.
33. Suppose that a square matrix A has two eigenvalues λ = 2, 5, and np(λ) = nullity((A − λI)^p), p ∈ N, are
as follows:
n1(2) = 2, n2(2) = 4, np(2) = 5 for p ≥ 3, and n1(5) = 1, np(5) = 2 for p ≥ 2.
Write down the Jordan form of A.
34. Let J = J(0; 5) be the 5 × 5 Jordan block with λ = 0. Find J², count its eigenvectors, and write its
Jordan form.
35. How many possible Jordan forms are there for a 6 × 6 matrix with characteristic
polynomial (x − 1)²(x + 2)⁴?

36. Let A = [2 a b; 0 2 c; 0 0 1] ∈ M3(R).
(a) Prove that A is diagonalizable if and only if a = 0.
(b) Find the minimal polynomial of A when (i) a = 0, (ii) a ≠ 0.
37. Let V = {h(x, y) = ax² + bxy + cy² + dx + ey + f : a, b, c, d, e, f ∈ R} be a subspace of the space
of polynomials in two variables x and y over R. Then B = {x², xy, y², x, y, 1} is a basis for V. Define
T : V → V by
(T(h))(x, y) = ∂/∂y ( ∫ h(x, y) dx ).
(a) Prove that T is a linear transformation and find A = [T]B.
(b) Compute the characteristic polynomial and the minimal polynomial of A.
(c) Find the Jordan form of A.
38. True or False:
(a) [3 1; 0 3] and [3 0; 0 3] are similar.
(b) [3 1; 0 4] and [3 0; 0 4] are similar.
39. Show that [a 1 0; 0 a 0; 0 0 b] and [b 0 0; 0 a 1; 0 0 a] are similar.
40. Write down the Jordan form for each of the following matrices and find its minimal polynomial.
(a) [−2 1; −1 −4]
(b) [−1 0 1; 0 −1 1; 1 −1 −1]
(c) [1 0 0; −2 −2 −3; 2 3 4]
(d) [2 0 ∗; −7 9 ∗; 0 0 2]
(e) [3 1 −1; 2 2 −1; 2 2 0]
(f) [−2 17 4; −1 6 1; 0 1 2]
(g) [−3 5 −5; 3 −1 3; 8 −8 10]
(h) [5 −1 1; 1 3 0; −3 2 1]
(i) [−1 −4 0 0; 1 3 0 0; 6 −12 −1 −6; 0 −4 0 −1]
(j) [2 1 0 1; 0 2 1 0; 0 0 2 1; 0 0 0 2]
(k) [1 −4 0 −2; 0 −1 −4 1; ∗ ∗ ∗ ∗; ∗ ∗ ∗ ∗]
(l) [1 3 7 ∗; ∗ ∗ ∗ ∗; 0 −6 −14 ∗; ∗ ∗ ∗ ∗]
Eigenvalues: (b) −1, −1, −1; (c) 1, 1, 1; (d) 2, 2, 9; (e) 1, 2, 2; (f) 2, 2, 2; (g) 2, 2, 2; (h) 3, 3, 3;
(i) −1, −1, 1, 1; (k) 1, 1, 1, 1; (l) 1, 1, 1, 1.
