Bare Bones Algebra, Eh!?

Transcription

Barebones Algebra, EH ?! :
Ancient Abstract Algebra
© Peter Hoffman 1995
This book contains the bare bones of the theory, covering basic abstract
algebra in a fairly solid way at the undergraduate level, but not attempting
the comprehensiveness needed for PhD students. That is, it’s a textbook,
not an encyclopedia. On the other hand, various tedious calculations and
mechanical illustrations probably occur less frequently here than they do in
the ‘average’ text. The book is aimed at mathematicians teaching talented
and industrious students, and also at especially talented students for self-study. I’ve had the pleasure of teaching many very good Honours Pure
Math classes over the years. Some students in our Mathematics Faculty
at Waterloo get their honours degrees without having much demanded of
them in the way of basic mathematics, so these classes have often been quite
elite. What follows is the endpoint in the evolution of sets of notes for these
courses. As a result, the pace here would possibly be considered rather brisk
by many, but not by anyone with some interest and talent in mathematics,
who is willing to work hard. Math is certainly fun; but not fun and easy for
most of us!
The other side of the coin is that being succinct should help the reader
to see the forest as a whole without being distracted by too much scrub and
deadwood; and should also help the lecturer, if existing, to concentrate on
examples, applications and motivation. Not appearing to need 449 pages (or
even 149) to arrive at Abel/Galois/Ruffini’s famous discoveries also helps to
maintain motivation. As we proceed, more of the theory is left for the reader
to fill in, especially routine checking which hardly merits the name ‘proof’.
This approximates day-to-day research in mathematics just as much as work
on solving problems. Such problems of course cannot be overemphasized
as tools for mastering the material in any mathematics course. Here, an
exercise will often be inserted, as an essential step, into the middle of a
proof. Involvement as a player, rather than spectator, makes the learning
more intense. As students proceed through the theoretical material, they will
develop the ability to see when some checking which ought to be done has
been left out. There are some other exercises as well, but not as many routine
computations as some texts contain. Another ‘novelty’ is the inclusion of
a large collection of exercises ‘in random order’ at the end of the book.
Researchers (or students writing an exam) cannot argue as follows: “This
problem is in Section xyz; therefore, the tools needed are from that section.”
So these should be useful.
The reader will find it helpful (but essential only for a few examples and
for the sections from 46 onwards) to have a working knowledge of linear
algebra. You’ll need to be familiar with the elementary facts about the
standard number systems, Cartesian coordinates, induction, and with simple
counting arguments using binomial coefficients.
Sections 3 to 13 are on groups, 14 to 22 on rings, 23 to 34 (excluding
25) on basic field theory, 35 to 41 on Galois theory (of finite extensions
and mostly in characteristic zero), 42 to 46 on modules and canonical forms
for similarity, 47 to 50 on linear representations of finite groups, and 51–52
on division algebras. The numbering of results and exercises is such as
requires no explanation. The extensive index should be helpful to readers who
already have some background in abstract algebra. Much of the later material
depends on earlier definitions and theorems. The mathematician referred to
above will easily see how to omit and permute sections as needed; and will
not need gratuitous advice from me concerning how much material can be
covered per unit of teaching time—suffice to say that there are about two
or three solid one-term courses in what follows, perhaps two for a graduate
student filling in gaps in his or her background.
Risking condemnation for nattering nabobery¹, we note that if it’s not
in boldface, then it’s not a joke, but not conversely.
I wish to thank the students in my algebra classes over the years for
helping me to learn algebra better. In particular Ian Goldberg, David Kerr
and Mike Mosca, who found some errors in an earlier version of this book,
deserve special thanks. Of course, all remaining errors are ones for which I
claim priority. I am very grateful also to the following good friends: Debbie
Brown, Lois Graham, Carolyn Jackson, Linda Kelly and Annemarie Nittel, for
their excellent help with the ‘keyboarding’.
¹ Spiro T. Agnew, circa 1971
Basic Notation.
Suppose that B is a set, and that b is a member of B. Such a statement
will frequently be shortened to “b ∈ B”. As an aid to short-term memory,
we’ll often correlate notation in this way, with members denoted by the lowercase letter whose corresponding uppercase letter denotes the set, sometimes
using subscripts etc. Also we’ll try to denote different species of objects in
a given discussion using different species of letters: lowercase, uppercase,
Greek, etc. But, of course, sets themselves can be members of other sets, so
this has some limitations.
A third species in a given discussion might be maps, or functions, between
sets, with notation such as γ : A → B. The domain of γ is A, and its
codomain is B. If B were a proper subset of C—this is written B ⊂ C, B ≠ C:
here the symbol ⊂ includes the possibility that the two sets are equal—and
if we enlarged the codomain of the function to C, then the new function,
from A to C, must be regarded as being different from γ, even though it
has the same domain and the same ‘formula’. This is usually not done in
calculus. But it is necessary for the adjective ‘surjective’ to be meaningful:
the previous map γ is surjective if and only if
Imγ := { b ∈ B : ∃a ∈ A with γ(a) = b } = B .
The display asserts that every element of B has the form γ(a) for at least one
a ∈ A. We have used “ := ” to indicate that the notation to its left is being
defined. So we’ve defined the image of γ, denoted Imγ. The real assertion
in the display is made by the last “ = ” sign without the “ : ” sign. (The
latter symbol is called a ‘colon’, in case you didn’t know! A proliferation of
symbols “ ; ” is known as cancer of the semi-colon.)
We have also used the quantifier “ ∃ ” ; this should be read: “there
exists”. Occasionally we’ll also use “ ∃! ” ; this should be read: “there exists
a unique”. The other quantifier is “ ∀ ” ; it is read: “for all”.
Our map γ is injective if and only if
∀a1 , a2 ∈ A ,  [ γ(a1 ) = γ(a2 ) =⇒ a1 = a2 ] .
This says that distinct members of A have distinct images in B under the
function γ. The symbol “ =⇒ ” is read: “implies”. We say that the function
γ is bijective if and only if it is both injective and surjective. In this case, γ
has an inverse, γ −1 : B → A. Furthermore, the sets A and B then have the
same ‘size’ or cardinal number. When the sets are infinite, this is a definition
of great significance; when they are finite, whether it is a definition or an
assertion is a question of how you set up your foundations of mathematics.
When dealing with (potentially) infinite sets, the phrase “almost all” means
“all but finitely many”.
The standard number systems will be denoted in the usual way:
Z := the set of all integers (positive and negative) ;
Q := the set of all rational numbers ;
R := the set of all real numbers ;
C := the set of all complex numbers .
The first two of these have countable cardinality; that is, there is a
bijective function in each case from the set to the set of positive integers.
However, neither R nor C is countable. Each of these is a subset of the
next. Each can be constructed using the previous set, so that all of standard
mathematics can in principle be based on logic and a few assumptions about
sets. Detailed knowledge of this will not be needed here, but it is assumed
that you know a certain amount about these number systems.
In particular, the notorious real number e, which arises basically from
seeking functions which equal their own derivatives, satisfies the famous identity of Euler :
eiθ = cos(θ) + i sin(θ) .
It follows that e2πi/n is a complex number whose nth power equals 1. The
only other complex numbers with this property are the powers of e2πi/n , of
which there are only “n” in total. Notice that we use the ordinary Roman
‘e’ for this number, leaving the italicized e to be used in many other contexts
without causing ambiguity.
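For readers who like to experiment, this can be checked numerically with a few lines of Python (the function name is my own); it builds the powers of e2πi/n via Euler’s identity and verifies that each has nth power 1, up to rounding error:

```python
import cmath

def nth_roots_of_unity(n):
    """All complex numbers whose n-th power is 1: the powers of e^(2*pi*i/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)  # e^(2*pi*i/n), by Euler's identity
    return [zeta ** k for k in range(n)]

roots = nth_roots_of_unity(6)
# There are exactly "n" of them, and each one's n-th power is 1 (up to rounding).
assert len(set(round(z.real, 9) + 1j * round(z.imag, 9) for z in roots)) == 6
assert all(abs(z ** 6 - 1) < 1e-9 for z in roots)
```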
Very occasionally in the book, the word(s) “(finite) multiset” occur. Informally, this just means a ‘set where some elements can appear more than
once’—so it’s not really a set. If the reader were to insist, this could be made
more formal in at least two ways: as a finite set plus a function from that
set to the positive integers (the ‘frequency of occurrence function’); or as
an equivalence class of finite sequences (where two are equivalent if they are
re-arrangements of each other).
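The first formalization—a set together with a ‘frequency of occurrence’ function—is essentially what Python’s standard collections.Counter provides, which gives a two-line illustration of both points (equality of re-arrangements, sensitivity to multiplicity):

```python
from collections import Counter

# A finite multiset encoded by its frequency-of-occurrence function.
# Two Counters are equal exactly when every element occurs equally often in each.
assert Counter([1, 2, 2, 3]) == Counter([2, 1, 3, 2])  # re-arrangements agree
assert Counter([1, 2, 2, 3]) != Counter([1, 2, 3])     # multiplicity matters
```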
We shall often deal with an equivalence relation on a set S. Recall that
this is any relation, say ∼, between some pairs of elements of S which has
the following three properties:
(i) ∀a ∈ S, a ∼ a (reflexivity) ;
(ii) ∀a, b ∈ S, a ∼ b =⇒ b ∼ a (symmetry) ;
(iii) ∀a, b, c ∈ S, (a ∼ b & b ∼ c) =⇒ a ∼ c (transitivity) .
The equivalence class of an element a of S is denoted [a] or [a]∼ ; it consists
of all elements in S which are related to a with respect to the relation ∼ .
Each equivalence class is non-empty; the union of all classes is S; and no two
distinct classes have any element in common, i.e. they are disjoint (although,
in all but the extreme case where ∼ means equality, we’ll have elements a ≠ b
for which [a] = [b] ). Sometimes the set whose elements are the equivalence
classes is denoted as S / ∼ .
An important instance is where S = Z and ∼ is congruence (mod k), for
some fixed positive integer k. Two integers a and b are related under this
relation precisely when their difference is an integer multiple of k. This is
denoted a ≡ b (mod k)—and when a and b are not related with respect to
congruence (mod k), this is of course denoted a ≢ b (mod k). In this example,
the equivalence classes are also called congruence classes. The infinite set Z
is thus partitioned as the union of finitely many congruence classes, each of
which is an infinite set. In fact, the number of congruence classes is “k”.
In this example, the set of all congruence classes is denoted Zk , rather than
using the cumbersome notation Z /(≡ mod k) . Another special fact about
this example is that there is a well-defined method for adding, and another
for multiplying, two congruence classes. The formulae are
[a] + [b] := [a + b] ;
[a] · [b] := [a · b] .
This looks tautological at first, or circular. But it isn’t—the operations on the
left-hand sides are being defined in terms of the familiar operations applied
to ordinary integers on the right-hand sides. The (hopefully temporary)
confusion is caused by the abuse of notation—we’ve used the same notation,
“+”, for two different mathematical objects (and similarly for the notation
“·”).
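The well-definedness claim is exactly the statement that the right-hand sides do not change when a or b is replaced by another member of its class, and that can be checked mechanically. A Python sketch (function names are mine), using the representative in 0, . . . , k − 1 for each congruence class:

```python
def cong_class(a, k):
    """The class [a] (mod k), encoded by its canonical representative in 0..k-1."""
    return a % k

def add_classes(a, b, k):
    return cong_class(a + b, k)   # [a] + [b] := [a + b]

def mul_classes(a, b, k):
    return cong_class(a * b, k)   # [a] . [b] := [a . b]

# Well-definedness: shifting a by s*k and b by t*k changes nothing.
k = 7
for s in range(-3, 4):
    for t in range(-3, 4):
        assert add_classes(3 + s * k, 5 + t * k, k) == add_classes(3, 5, k)
        assert mul_classes(3 + s * k, 5 + t * k, k) == mul_classes(3, 5, k)
```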
Of course, there are examples of equivalence relations where equivalence
classes have cardinalities different from each other, even with some classes
finite and some infinite.
CONTENTS

1. Permutations.
2. Binary operations.

I. Elementary Group Theory

3. Groups.
4. Elementary properties of groups.
5. Isomorphisms and multiplication tables.
6. Subgroups and cyclic groups.
7. Cosets and Lagrange’s theorem.
8. Direct product; groups of small order.
9. Morphisms.
10. Normal subgroups and the first isomorphism theorem.
11. Solubility and a Sylow theorem.
12. Sn is not soluble.
13. Finitely generated abelian groups.
Appendix ZZZ. The generalized associative law.

II. Commutative Rings

14. Rings.
15. Structure of Zn×.
16. Polynomials and ring extensions.
17. Principal ideals & unique factorization.
18. The field of fractions.
19. Prime & maximal ideals.
20. Gauss’ theorem.
21. Polynomials in several variables and ring extensions.
22. Symmetric polynomials.

III. Basic Field Theory

23. Prime fields and characteristic.
24. Simple extensions.
25. Review of vector spaces and linear maps.
26. The degree of an extension.
27. Straight-edge & compass constructions.
28. Roots and splitting fields.
Appendix A. Algebraic closure and the fundamental theorem of (19th century) algebra.
29. Repeated roots and the formal derivative.
30. Finite subgroups of F ∗.
31. The structure of finite fields.
32. Moebius inversion.
33. Cyclotomic polynomials.
34. Primitive elements exist in characteristic zero.

IV. Galois Theory

35. The Galois group.
36. The general equation of degree n.
37. Radically unsolvable over Q.
38. The Galois correspondence.
39. (Fermat prime)-gons are constructible.
40. Automorphisms of finite fields.
41. Galois theoretic proof of the fundamental theorem of (19th century) algebra.
Appendix B. Separability and the Galois correspondence in arbitrary characteristic.
Appendix C. Solubility implies solvability.

V. Modules over PIDs and Similarity of Matrices

42. Basics on modules.
43. Structure of finitely generated modules over euclidean domains.
44. Generators, relations and elementary operations.
45. Finitely generated abelian groups revisited.
46. Similarity of matrices.

VI. Group Representations

47. G-modules & representations.
48. Characters of representations.
49. Algebraic integers.
50. Divisibility & the Burnside (p, q) theorem.
Appendix ⊗. Tensor products.

VII. A Dearth of Division Rings

51. Finite division rings are fields.
52. Uniqueness of the quaternions.

References
Index of Notation
Index
1. Permutations.
A permutation of the set { 1, 2, · · · , n } is a bijective map from that
set to itself. The set of all “ n! ” such permutations is denoted Sn . If α and
β are in Sn , define αβ (or sometimes α · β) in Sn to be their composition,
α ◦ β. (Caution: The opposite convention is sometimes used, αβ meaning
“ do α before doing β ”.) In this book, αβ means “ do β before doing α ”.
Let e ∈ Sn be the identity map. (There is, strictly speaking, a different e for
each n, but using the same name for all of them won’t lead to any confusion.)
Then αe = α = eα for all α ∈ Sn . If σ ∈ Sn , let σ −1 be the map inverse to
σ. Then σσ −1 = e = σ −1 σ.
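The composition convention can be made concrete in a few lines of Python (the encoding of a permutation as a dict on { 1, 2, · · · , n } is my own choice, not anything from the text):

```python
def compose(alpha, beta):
    """The product alpha*beta in S_n: do beta first, then alpha (this book's
    convention).  Permutations are dicts mapping {1,...,n} to itself."""
    return {x: alpha[beta[x]] for x in beta}

def inverse(sigma):
    """The map inverse to sigma."""
    return {sigma[x]: x for x in sigma}

n = 4
e = {x: x for x in range(1, n + 1)}          # the identity map in S_4
sigma = {1: 2, 2: 1, 3: 4, 4: 3}
assert compose(sigma, inverse(sigma)) == e   # sigma . sigma^-1 = e
assert compose(inverse(sigma), sigma) == e   # sigma^-1 . sigma = e
assert compose(sigma, e) == sigma == compose(e, sigma)
```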
A given permutation σ is often denoted

(   1      2      3    · · ·    n   )
( σ(1)  σ(2)  σ(3)  · · ·  σ(n) )

For example,

( 1  2  3  4  5  6  7 )
( 2  5  1  4  3  7  6 )

is the permutation σ ∈ S7 defined by σ(1) = 2, σ(2) = 5, σ(3) = 1, σ(4) = 4, σ(5) = 3, σ(6) = 7
and σ(7) = 6. In this notation, the lower line is a re-arrangement of the
sequence 1, 2, 3, · · · , n . Some people prefer to think of permutations as rearrangements.
Given “k” distinct numbers a1 , a2 , · · · , ak between 1 and n for k ≥ 2,
the symbol (a1 a2 · · · ak ) denotes the permutation σ defined by specifying its
values : σ(a1 ) = a2 , σ(a2 ) = a3 , · · · , σ(ak−1 ) = ak , σ(ak ) = a1 , and σ(x) = x
for any other x between 1 and n . Such permutations are called cycles. The
length of the cycle is k . The identity permutation, e, is sometimes considered
to be a cycle of length 1, but not here. A transposition is a cycle of length 2;
it interchanges two numbers, leaving everything else fixed. A set of cycles is
disjoint if no number is moved by more than one of them; i.e. if (a1 a2 · · · ak )
is in the set, then no ai occurs in any other cycle in the set.
Theorem 1.1. Any permutation is a product of a disjoint set of cycles.
Proof. Prove it for all elements of Sn by induction on n. By convention,
e is the product of the empty set of cycles. For n = 1, 2 or 3, every other
element of Sn is a cycle. (Check this !) Suppose that the theorem holds for
all elements of Sn−1 , where n > 3. Let σ ∈ Sn .
Case 1. If σ(n) = n , then σ maps { 1, 2, · · · , n − 1 } to itself, so it
may be thought of as an element of Sn−1 . By the inductive hypothesis, it is
a product of disjoint cycles. (In this case, none of the cycles will involve n
itself).
Case 2. If σ(n) ≠ n , let a1 = n, a2 = σ(a1 ), a3 = σ(a2 ), · · · . This
sequence must eventually have repeats (why?), and the first repeat is a1
since σ is injective. So let k be the unique integer larger than 1 such that
ak+1 = a1 , but a1 , a2 , · · · , ak are all distinct. Define σ̄ by σ̄(ai ) = ai for all i,
and σ̄(x) = σ(x) if 1 ≤ x ≤ n but x ≠ ai for any i . Then σ̄(n) = n, so
σ̄ is a product of disjoint cycles, by Case 1. Also σ = σ̄ · (a1 a2 · · · ak ) since
they map elements of { 1, 2, · · · , n } in the same way as each other. No
ai occurs in any cycle in the decomposition of σ̄ since σ̄(ai ) = ai for all i .
After decomposing σ̄, we see that σ is a product of disjoint cycles.
Exercise 1A. Verify the equation σ = σ̄ · (a1 a2 · · · ak ) from a few lines
above.
Exercise 1B. Show that the order of appearance of the cycles in a decomposition as in 1.1 is irrelevant, and furthermore that such a decomposition
is unique (i.e. no other disjoint set of cycles will do).
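The procedure in the proof of 1.1—follow each element around until it returns to its start, and split off the resulting cycle—can be sketched directly in Python (continuing the dict encoding of a permutation; the function name is mine):

```python
def cycle_decomposition(sigma):
    """The disjoint cycles of sigma (a dict on {1,...,n}), as in Theorem 1.1.
    Fixed points are omitted, matching the convention that e is the product
    of the empty set of cycles."""
    seen, cycles = set(), []
    for start in sorted(sigma):
        if start in seen or sigma[start] == start:
            continue
        cycle, x = [], start
        while x not in seen:        # trace start -> sigma(start) -> ... around
            seen.add(x)
            cycle.append(x)
            x = sigma[x]
        cycles.append(tuple(cycle))
    return cycles

# The example sigma in S_7 from the start of this section:
sigma = {1: 2, 2: 5, 3: 1, 4: 4, 5: 3, 6: 7, 7: 6}
assert cycle_decomposition(sigma) == [(1, 2, 5, 3), (6, 7)]
```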
Corollary 1.2. Any permutation σ is a product of transpositions.
(Intuitively, any re-arrangement can be achieved by a sequence of interchanges of pairs of elements.)
Proof. Since σ is a product of cycles, it is enough to show that any cycle
is a product of transpositions. But
(a1 a2 · · · ak ) = (a1 ak )(a1 ak−1 ) · · · (a1 a2 ) .
Exercise 1C. Show that, in the corollary, there is no uniqueness, the
order is relevant, and the transpositions won’t always be disjoint.
Theorem 1.3. Given σ, the number of transpositions in products of
transpositions which equal σ is either always even or always odd.
Proof. Let X = { 1, 2, · · · , n }. Define

sign(σ) :=  ∏_{ {k,ℓ}⊂X }  ( σ(k) − σ(ℓ) ) / ( k − ℓ ) ,

where the product is over all subsets { k, ℓ } with two elements. [The number
of such subsets is “n(n − 1)/2”.] Since (σ(k) − σ(ℓ))/(k − ℓ) is the same
as (σ(ℓ) − σ(k))/(ℓ − k) , the order chosen for k and ℓ doesn’t matter.
Now, for all σ, we have sign(σ) = ± 1 :

| sign(σ) |  =  ∏_{ {k,ℓ}⊂X }  | σ(k) − σ(ℓ) | / | k − ℓ |
             =  ( ∏_{ {i,j}⊂X } | i − j | ) / ( ∏_{ {k,ℓ}⊂X } | k − ℓ | )  =  1

(making the change of variable i = σ(k), j = σ(ℓ) in the numerator).
Exercise 1D. Why can’t we use the same argument to show (the false
statement) that sign(σ) itself is always +1??
We are simply saying that each difference, in one or the other order, of
two distinct elements of X occurs once in the numerator and once in the
denominator. It follows that sign(σ) = (−1)N , where N is the number of
two-element subsets {k, ℓ} for which σ(k) − σ(ℓ) differs in sign from k − ℓ.
Thus, for any transposition τ , we have sign(τ ) = −1 :
for if τ = (ab) where a < b, the number of subsets which have either the
form {a, x} with a < x ≤ b, or the form {x, b} with a ≤ x < b, is exactly
“2b − 2a − 1”, an odd integer.
Finally, sign(αβ) = sign(α)sign(β) : for

sign(αβ) / sign(β)  =  ∏_{ {k,ℓ}⊂X }  ( αβ(k) − αβ(ℓ) ) / ( β(k) − β(ℓ) )
                    =  ∏_{ {i,j}⊂X }  ( α(i) − α(j) ) / ( i − j )  =  sign(α)

[making the change of variable i = β(k), j = β(ℓ)].
It follows easily by induction on m that
sign(σ1 σ2 · · · σm ) = sign(σ1 )sign(σ2 ) · · · sign(σm ).
Now suppose that σ can be written in two ways as a product of transpositions,
σ = τ1 τ2 · · · τm    and    σ = τ1′ τ2′ · · · τs′ .
Then
sign(σ) = sign(τ1 )sign(τ2 ) · · · sign(τm ) = (−1)(−1) · · · (−1) = (−1)m .
But similarly sign(σ) = (−1)s . Thus (−1)m = (−1)s , so either m and s are
both even, or they are both odd, as required.
Corollary to the proof 1.4. We have sign(σ) = (−1)m , where m is the
number of transpositions in some chosen decomposition of σ as a product of
transpositions.
Definition. We say that σ is even or odd according as σ is a product of
an even or an odd number of transpositions. Thus
σ is even ⇐⇒ sign(σ) = +1 ;
σ is odd  ⇐⇒ sign(σ) = −1 .
A product of two even, or two odd, permutations is even. A product of an
odd times an even, or an even times an odd, is always odd. Note that a cycle
of even length is odd, and one of odd length is even.
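Both descriptions of sign(σ)—the product over two-element subsets, and (−1) to the number of transpositions as in 1.4—are easy to compute, and comparing them is a good sanity check. A Python sketch (names mine; a cycle of length k contributes k − 1 transpositions, by the formula in the proof of 1.2):

```python
from itertools import combinations

def sign_by_formula(sigma):
    """sign(sigma) via the product over two-element subsets {k, l}:
    count the subsets where sigma(k) - sigma(l) differs in sign from k - l."""
    s = 1
    for k, l in combinations(sorted(sigma), 2):
        if (sigma[k] - sigma[l]) * (k - l) < 0:
            s = -s
    return s

def sign_by_transpositions(cycles):
    """(-1)**m, where m counts transpositions: a length-k cycle gives k - 1."""
    return (-1) ** sum(len(c) - 1 for c in cycles)

# The running example sigma = (1 2 5 3)(6 7): m = 3 + 1 = 4, so sigma is even.
sigma = {1: 2, 2: 5, 3: 1, 4: 4, 5: 3, 6: 7, 7: 6}
assert sign_by_formula(sigma) == +1
assert sign_by_formula(sigma) == sign_by_transpositions([(1, 2, 5, 3), (6, 7)])
```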
2. Binary operations.
Definition. A binary operation on a set S is a map S × S → S.
Notation. Denote the image of (a, b) as a ∗ b, or a · b, or a + b etc.;
and then the operation is denoted ∗ or · or +, respectively. The additive
notation, +, is seldom used if the operation is not commutative, as defined
below.
Definition. The operation ∗ is
(i) associative ⇐⇒ (a ∗ b) ∗ c = a ∗ (b ∗ c) ∀a, b, c in S ; and
(ii) commutative ⇐⇒ a ∗ b = b ∗ a ∀a, b ∈ S.
It is not difficult to give examples of operations which have neither, either
or both of these properties.
Exercise 2A. Give such examples.
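On a finite set, both properties can be checked by brute force, which is a useful way to test candidate answers to Exercise 2A. A Python sketch (names mine) with two of the four kinds of example, leaving the other two for the exercise:

```python
from itertools import product

def is_associative(elements, op):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(elements, repeat=3))

def is_commutative(elements, op):
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

S = range(5)
add_mod5 = lambda a, b: (a + b) % 5   # has both properties
sub_mod5 = lambda a, b: (a - b) % 5   # has neither

assert is_associative(S, add_mod5) and is_commutative(S, add_mod5)
assert not is_associative(S, sub_mod5) and not is_commutative(S, sub_mod5)
```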
Given more than two elements of S, but a finite number, one can multiply
them in many different ways using a given operation, either by altering the
brackets (which are necessary since, to begin with, one can only multiply
pairs of elements), or by altering the order. When ∗ is associative, brackets
are unnecessary:
Proposition 2.1. When ∗ is associative, all products of a1 , a2 , · · · , an in
that order with any bracketing are equal.
See Appendix ZZZ after Section 13 for a more precise statement, and
proof. As an example when n = 4, to show that
(a1 ∗ a2 ) ∗ (a3 ∗ a4 ) = a1 ∗ [(a2 ∗ a3 ) ∗ a4 ] ,
notice that the left hand side is a1 ∗ [a2 ∗ (a3 ∗ a4 )] by applying associativity
to a1 , a2 , a3 ∗ a4 . But the right hand side also is equal to a1 ∗ [a2 ∗ (a3 ∗ a4 )]
by applying associativity to a2 , a3 , a4 , and leaving a1 alone.
Notation. When ∗ is associative, denote by a1 ∗ a2 ∗ · · · ∗ an the element
obtained by multiplying a1 , a2 , · · · , an in that order. (We have already used
this in Section 1 but there is no logical circularity.)
When ∗ is also commutative, the order doesn’t matter:
Proposition 2.2. If ∗ is both associative and commutative, then
aσ(1) ∗ aσ(2) ∗ · · · ∗ aσ(n) = a1 ∗ a2 ∗ · · · ∗ an
for all σ ∈ Sn .
Example. When n = 3,
a1 ∗ a2 ∗ a3 = a1 ∗ a3 ∗ a2 = a2 ∗ a1 ∗ a3 = a2 ∗ a3 ∗ a1 = a3 ∗ a2 ∗ a1 = a3 ∗ a1 ∗ a2 .
Proof. Suppose first that τ = (ij) where i < j. Then
aτ (1) ∗ aτ (2) ∗ · · · ∗ aτ (n)
= a1 ∗ a2 ∗ · · · ∗ ai−1 ∗ aj ∗ ai+1 ∗ ai+2 ∗ · · · ∗ aj−1 ∗ ai ∗ aj+1 ∗ · · · ∗ an
= a1 ∗ a2 ∗ · · · ∗ ai−1 ∗ ai+1 ∗ ai+2 ∗ · · · ∗ aj−1 ∗ ai ∗ aj ∗ aj+1 ∗ · · · ∗ an
= a1 ∗ a2 ∗ · · · ∗ ai−1 ∗ ai ∗ ai+1 ∗ · · · ∗ aj−1 ∗ aj ∗ aj+1 ∗ · · · ∗ an ,
applying commutativity first to the pair [ aj , ai+1 ∗ ai+2 ∗ · · · ∗ aj−1 ∗ ai ] and
then to the pair [ ai+1 ∗ ai+2 ∗ · · · ∗ aj−1 , ai ].
Now prove the theorem by induction on the minimum number of transpositions needed to decompose σ as a product of transpositions. If that number
is zero, then σ = e for which the theorem statement is clearly true. If it is
one, then σ is a transposition and we’ve just proved it. For the inductive
step, we can write σ = σ̄τ , where τ is a transposition and the number for σ̄
is one less than for σ. Then
aσ(1) ∗ aσ(2) ∗ · · · ∗ aσ(n) = aσ̄τ (1) ∗ aσ̄τ (2) ∗ · · · ∗ aσ̄τ (n)
    = aσ̄(1) ∗ aσ̄(2) ∗ · · · ∗ aσ̄(n)   (by the case just done : let bi = aσ̄(i) )
    = a1 ∗ a2 ∗ · · · ∗ an             (by the inductive hypothesis).
I. Elementary Group Theory
Sections 3 to 13 study a type of algebraic object, called a group, which
is the most important object within non-commutative algebra. Its historical
origin was the material which is studied here in the sections on Galois theory (beginning with 35), and this remains one of the main motivations for
studying groups. There are many other motivations within algebra and also,
for example, from geometry, topology, physics, chemistry and differential
equations, not necessarily entirely separate from each other.
3. Groups.
Definition. A group is a set G together with a binary operation ∗ on G,
such that the following three axioms hold:
(1) The operation ∗ is associative.
(2) There exists an element 1 ∈ G (called the identity element) such
that 1 ∗ g = g and g ∗ 1 = g for all g ∈ G.
(3) For all g ∈ G, there exists an element g −1 ∈ G (called the inverse
of g) such that g −1 ∗ g = 1 and g ∗ g −1 = 1.
We do not need to assume uniqueness of 1, nor of g −1 for fixed g. These
facts are proved in the next section. When considering several groups G, H, · · ·
simultaneously, we often use the notations 1G , 1H , · · · for their identity elements.
Examples.
A. GL(Rn ), the set of all linear isomorphisms (that is, bijective linear
transformations ) from Rn to itself, with composition as operation.
B. GL(n, R), the set of all real n × n non-singular (i.e. invertible)
matrices , with matrix multiplication as operation. The set R× of non-zero
real numbers under ordinary multiplication is the case n = 1 of this.
C. The set, C× , of non-zero complex numbers under multiplication.
D. Ck = { 1, ρ, ρ2 , · · · , ρk−1 }, where ρ = e2πi/k , under multiplication.
E. Sn , with the usual product of permutations, namely composition.
F. O(Rn ) , the group of orthogonal transformations of Rn .
G. O(n), the group of n × n real orthogonal matrices.
H. D4 , the group of symmetries of the square; i.e. those eight elements of
O(R2 ) which map the square below to itself.
[Figure: the square with vertices (1,1), (–1,1), (–1,–1) and (1,–1).]
I. The set of eight matrices which represent the elements of D4 with
respect to the standard basis for R2 .
J. D3 , the group of symmetries of an equilateral triangle; to be precise,
those six elements of O(R2 ) which map the triangle to itself.
[Figure: the equilateral triangle inscribed in the unit circle with vertices ρ, ρ5 and ρ9 .]
Here ρ = e2πi/12 . (Try to describe the elements of D3 and D4 —see examples (iii) and (iv) after 4E.)
K. Rn is a group under vector addition. The case n = 1 is the reals
under addition. The case n = 2 is the complex numbers under addition. To
generalize, any vector space is a group with respect to its addition.
L. The set Z of integers under addition.
M. The set Zk of all “k” of the congruence classes (mod k) under addition
(mod k). Another group, this time using instead the operation of multiplication (mod k), is formed by taking the subset, Zk× , of only those congruence
classes containing integers prime to k. Except when there is no chance of
ambiguity, it is important to be quite explicit about what the operation is,
when specifying a group.
N. If A is any set, the set G of all bijective maps from A to itself under
composition. (When A = { 1, 2, · · · , n }, this G is Sn .)
Exercise 3A. Check all the axioms carefully for each of these examples;
also, where relevant, check that the operation is well defined.
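For a finite example, the check asked for in Exercise 3A can also be done by brute force. A Python sketch (names mine), applied to example M.—Zk under addition, and Zk× under multiplication, with each class encoded by its representative in 0, . . . , k − 1:

```python
from itertools import product

def is_group(G, op):
    """Brute-force check of axioms (1)-(3) for a finite set G with operation op."""
    G = list(G)
    # op must be a binary operation on G, i.e. G must be closed under op:
    if any(op(a, b) not in G for a, b in product(G, repeat=2)):
        return False
    # (1) associativity:
    if any(op(op(a, b), c) != op(a, op(b, c)) for a, b, c in product(G, repeat=3)):
        return False
    # (2) an identity element:
    ids = [e for e in G if all(op(e, g) == g == op(g, e) for g in G)]
    if not ids:
        return False
    e = ids[0]
    # (3) every element has an inverse:
    return all(any(op(g, h) == e == op(h, g) for h in G) for g in G)

assert is_group(range(6), lambda a, b: (a + b) % 6)   # Z_6 under addition
assert is_group({1, 5}, lambda a, b: (a * b) % 6)     # Z_6^x under multiplication
# All of Z_6 under multiplication is NOT a group: [0] has no inverse.
assert not is_group(range(6), lambda a, b: (a * b) % 6)
```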
Definition. A group G is commutative if and only if its operation is
commutative.
Examples C., D., K., L., M. are commutative.
Non-commutative examples are A. for n > 1; B. for n > 1; E. for n > 2;
F. and G. for n > 1; H.; I.; J.; and N. if the set A has three or more elements.
Notation. The operation in a group will from now on be denoted by
juxtaposition (i.e. ab is the product of a and b), or by +. The latter notation
is used here only when G is commutative, and the word abelian is often used
instead of ‘commutative’, especially when this additive notation is used. In
the additive notation, the identity element in G is called instead the zero of
G and denoted 0, and the inverse of g is called instead the negative of g and
denoted −g (instead of g −1 ).
Definition. A group G is finite if and only if it has finitely many elements. The number of elements in G is called the order of G and is denoted
|G| here. In the examples, D., E., H., I., J., and M. are finite groups of orders
k, n!, 8, 8, 6, and k, respectively. The second one in M. has order equal to
the number of integers which are between 0 and k and are prime to k. This
number is denoted Φ(k), and Φ is called Euler’s function. (Euler existed
before THE Edmonton, which juxtaposition should suggest to those with
good taste in team sports the pronunciation of his name by Anglo mathematicians.) The other groups are infinite, except for N., which is finite if and
only if A is a finite set, and both F. and G. for n = 1.
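Euler’s function is easy to sketch by direct count, expressing ‘prime to k’ via the greatest common divisor (function name Φ rendered as phi; the implementation is mine):

```python
from math import gcd

def phi(k):
    """Euler's function: the number of integers between 1 and k prime to k."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

# |Z_k| = k, while |Z_k^x| = phi(k); a few sample values:
assert [phi(k) for k in (1, 2, 3, 4, 8, 12)] == [1, 1, 2, 2, 4, 4]
```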
4. Elementary properties of groups.
Proposition 4.1. The identity element in a group is unique.
Proof. If 1′ also denotes an element with the properties of the identity,
then 11′ = 1′ since 1 is an identity, but 11′ = 1 since 1′ is an identity. Hence
1 = 1′ , i.e. they denote the same element, as required.
Proposition 4.2. The inverse of any group element is unique.
Proof. If ĝ also denotes an inverse for g, then
ĝ = ĝ1 = ĝgg −1 = 1g −1 = g −1 ,
as required.
Note. (g −1 )−1 = g.
Proposition 4.3. Let a, b ∈ G, a group. Then the equations ax = b
and ya = b each have exactly one solution in G.
Proof. Since a(a−1 b) = 1b = b, the first equation certainly has one
solution, namely x = a−1 b. If s is any solution, then as = b. Multiply both
sides on the left by a−1 , yielding s = a−1 b. Thus there is only one solution.
The proof for ya = b is left as an exercise.
Exercise 4A. Do the rest of this proof. Also show that the statement of
4.3 may be used as an axiom to replace (2) and (3) in the definition of the
term ‘group’, as long as we assume that the set G is non-empty. Finally, for
finite G ≠ ∅, so may the ‘cancellation’ law: ∀a, b, c, (ab = ac or ba = ca) ⇒
b = c ; but not for infinite G.
Proposition 4.4. (a₁a₂ · · · aₙ)⁻¹ = aₙ⁻¹ aₙ₋₁⁻¹ · · · a₁⁻¹ .
Proof. Proceed by induction.
For n = 2 : a₂⁻¹a₁⁻¹a₁a₂ = a₂⁻¹ 1 a₂ = 1 , so a₂⁻¹a₁⁻¹ is the inverse of a₁a₂ .
Inductive step :

(a₁a₂ · · · aₙ)⁻¹ = [(a₁)(a₂ · · · aₙ)]⁻¹
                 = (a₂ · · · aₙ)⁻¹ (a₁)⁻¹       (by the case just proved)
                 = aₙ⁻¹ aₙ₋₁⁻¹ · · · a₁⁻¹       (by the inductive hypothesis).
Definition. For n ≥ 1 and g ∈ G, define
g n := gg · · · g (“n” copies of g) , or, more succinctly :
g 1 := g and g n := g n−1 g (inductive definition).
Define g 0 := 1 .
Define g −n := (g −1 )n .
Corollary 4.5. For all integers n, we have (g −1 )n = (g n )−1 .
So we could have defined g −n to be (g n )−1 for n > 0.
Proof. Let ai = g for all i in 4.4. This does it for n > 0. Do it as
Exercise 4B, for n ≤ 0.
Proposition 4.6. For all i, j ∈ Z ,
(i) g i g j = g i+j and
(ii) (g i )j = g ij .
Proof. Prove (ii) as Exercise 4C. As for (i), to prove it for j ≥ 1, fix
i and use induction on j. (Exercise 4D. Write this out.) For j = 0, it is
trivial, and the case j < 0 follows from the case j > 0 as follows:
g i g j = (g −1 )−i (g −1 )−j = (g −1 )(−i)+(−j) = (g −1 )−(i+j) = g i+j .
Definition. An element g has finite order if and only if g k = 1 for some
k ≥ 1. In this case, the order of g , written ||g||, is the least such k ≥ 1.
Otherwise, g is said to have ‘infinite order’.
Theorem 4.7. If ||g|| = k < ∞, then the elements 1, g, g 2 , · · · , g k−1 are
all distinct from each other. Every other power of g is equal to one of these.
In fact
g i = g j ⇐⇒ i ≡ j (mod k) .
Proof. First we prove the last statement. If i ≡ j(mod k), then j = i + sk
for some s ∈ Z. Thus
g j = g i+sk = g i g sk = g i 1s = g i ,
proving ⇐. Conversely, suppose that g i = g j . Then g i−j = g i g −j = 1.
Divide k into i − j to give quotient m and remainder r with 0 ≤ r ≤ k − 1.
Then i − j = mk + r, so
1 = g i−j = (g k )m g r = 1m g r = g r .
But g s ≠ 1 for 1 ≤ s ≤ k − 1, so r = 0. Thus i ≡ j (mod k), proving
⇒. The first statement follows from the last, since no two of the integers
0, 1, 2, · · · , k − 1 are congruent (mod k). So does the second statement, since
every integer is congruent (mod k) to exactly one of the integers 0, 1, · · · , k−1.
Exercise 4E. Show that if g has infinite order, then g i ≠ g j when i ≠ j.
Deduce that every element of a finite group has finite order.
Examples.
(i) e2πi/k has order k in Ck . It and its powers, for all k, give all the
elements of finite order in C× , that is, all the complex roots of unity.
(ii) In Sn , every transposition has order 2.
(iii) In D4 , rotations of 90◦ and 270◦ have order 4. Rotation of 180◦ and
the four reflections have order 2.
(iv) In D3 , rotations of 120◦ and 240◦ have order 3. The three reflections
have order 2.
(v) In R× , every element has infinite order except 1 (which has order 1)
and −1 (which has order 2). That is, ±1 are the only roots of unity in R× .
(vi) In Z, every element has infinite order except 0.
(vii) In Zk , the element [1] has order k.
(viii) Let G be the set of all complex k th roots of unity for all k ≥ 1, under
complex multiplication. Then G is an infinite group, but all of its elements
have finite order.
(ix) In any group, there is exactly one element of order 1.
(x) If an invertible square matrix has finite order, then all of its eigenvalues
must be roots of unity. Is the converse true?
Exercise 4F. Check all the details concerning the above examples.
(xi) Orders of permutations.
Theorem 4.8. If σ = γ1 γ2 · · · γs is a product of disjoint cycles of lengths
ℓ1 , ℓ2 , · · · , ℓs , then the order of σ is the least common multiple of ℓ1 , ℓ2 , · · · , ℓs .
Proof. We must show that

σ j (x) = x ∀x ⇐⇒ ℓi | j ∀i .

Suppose that σ j (x) = x for all x ≤ n . Fix i, and let γi be the cycle
(x1 x2 · · · xℓi ) . Define k by requiring that 1 ≤ k ≤ ℓi and that k ≡ j +
1 (mod ℓi ). Then (γi )j (x1 ) = xk . Since no xr is moved by any cycle other
than γi , we have σ j (x1 ) = xk . Since all the xr ’s are distinct, and σ j (x1 ) = x1 ,
we have k = 1, so 1 ≡ j + 1 (mod ℓi ). Thus j ≡ 0 (mod ℓi ), i.e. ℓi | j, proving
⇒.
Conversely, suppose that ℓi | j ∀i ≤ s . Fix x. If x doesn’t occur in
any of the cycles, then σ(x) = x, so σ j (x) = x. Otherwise, suppose that
x occurs in γi = (x1 x2 · · · xℓi ), say x = xm . Then σ j (x) = xk , where k is
defined by 1 ≤ k ≤ ℓi and k ≡ j + m (mod ℓi ). But j ≡ 0 (mod ℓi ), so
k ≡ m (mod ℓi ). Since k and m are both between 1 and ℓi , we have k = m.
Thus σ j (x) = xk = xm = x, as required, proving ⇐ .
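The formula in 4.8 is easy to test by machine. Below is a Python sketch (all names are mine, not the text's): it reads off the cycle lengths of a permutation, takes their least common multiple, and compares the result with a brute-force computation of the order.

```python
from math import gcd
from functools import reduce

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a dict {x: perm(x)}."""
    seen, lengths = set(), []
    for start in perm:
        if start in seen:
            continue
        n, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            n += 1
        lengths.append(n)
    return lengths

def lcm(nums):
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)

def brute_order(perm):
    """Least j >= 1 with perm^j equal to the identity."""
    j, power = 1, dict(perm)
    while any(power[x] != x for x in power):
        power = {x: perm[power[x]] for x in power}
        j += 1
    return j

# sigma = (123)(45) in S5: cycle lengths 3 and 2, so the order is LCM{3,2} = 6
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}
assert sorted(cycle_lengths(sigma)) == [2, 3]
assert lcm(cycle_lengths(sigma)) == 6 == brute_order(sigma)
```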
5. Isomorphisms and multiplication tables.
Definition. Let G and H be groups. We say that G is isomorphic to
H, and write G ≅ H, if and only if there is a bijective map φ : G → H such
that

φ(g1 g2 ) = φ(g1 )φ(g2 ) for all g1 , g2 ∈ G ,

where the product g1 g2 is taken in G, and the product φ(g1 )φ(g2 ) in H.
Such a map φ is called an isomorphism . This means that we have a
1-1 correspondence between elements in the two groups which ‘respects’ the
multiplications.
Exercise 5A. Check details for the following examples :
(i) GL(Rn ) ≅ GL(n, R), an isomorphism being given by assigning to
each linear map its matrix relative to some fixed basis.
(ii) O(Rn ) ≅ O(n), the isomorphism given as in (i), making sure this
time that the basis is orthonormal.
(iii) Ck ≅ Zk by mapping (e2πi/k )j to [j].
[ When H is additive, the condition on φ reads:
φ(g1 g2 ) = φ(g1 ) + φ(g2 ) . ]
(iv) Z2 ≅ S2 via that φ for which φ([0]) = e and φ([1]) = (12).
(v) S3 ≅ D3 , although this is not yet obvious.
(vi) S4 ≇ D4 , since |S4 | ≠ |D4 |.
(vii) R ≅ R+ (:= the positive reals under multiplication) by letting
φ(x) = ex .
(viii) R ≇ Z, since Z is countable and R is not.
Clearly, isomorphic finite groups must have the same order. In addition,
they must have the same ‘group theoretic’ properties. For example:
Proposition 5.1. If G is commutative and G ≅ H, then H is also
commutative.
Proof. Let h1 , h2 ∈ H. Choose an isomorphism φ : G → H and denote
by gi ∈ G the elements for which φ(gi ) = hi . Then
h1 h2 = φ(g1 )φ(g2 ) = φ(g1 g2 ) = φ(g2 g1 ) = φ(g2 )φ(g1 ) = h2 h1 .
For example, D3 ≇ C6 ; D4 ≇ C8 ; and Sn ≇ Cn! for n > 2 ; although
in each case the two groups have the same order.
Proposition 5.2. If φ : G → G′ is an isomorphism, then

φ(1) = 1′ , and φ(g −1 ) = φ(g)−1 for all g.
Proof. φ(1) = φ(1 · 1) = φ(1)φ(1). Multiply by φ(1)−1 to get the first
identity. Now φ(g −1 )φ(g) = φ(g −1 g) = φ(1) = 1′ , so φ(g −1 ) is the inverse of
φ(g), as required, using the following.
Exercise 5B. In any group, if ab = 1, then ba = 1.
Proposition 5.3. If φ : G → G′ is an isomorphism, then

||φ(g)|| = ||g|| for all g ∈ G.
Proof. Firstly g j = 1 ⇒ [φ(g)]j = φ(g j ) = φ(1) = 1′ . (Prove the second
equality as an exercise.) On the other hand,

[φ(g)]j = 1′ =⇒ φ(1) = 1′ = φ(g j ) =⇒ g j = 1 ,

since φ is injective. Thus g j = 1 ⇔ [φ(g)]j = 1′ , and so g and φ(g) have the
same order, the smallest positive j for which both statements hold.
Examples. (i) Z ≇ ∪k≥1 Ck , although both of them are countable
abelian groups.
(ii) Let D2 consist of the four matrices

[ 1 0 ]   [ −1 0 ]   [ 1  0 ]   [ −1  0 ]
[ 0 1 ] , [  0 1 ] , [ 0 −1 ] , [  0 −1 ] .
The set D2 is a group under matrix multiplication (and is isomorphic to
the group of symmetries of a suitable two-sided figure). If A and B are
any two non-identity elements in D2 , then AB is the third, and we have
A2 = B 2 = (AB)2 = I, so D2 has no element of order 4. Thus D2 ≇ C4 ,
though both are commutative of order 4. The group D2 is called the Klein
4-group .
For a finite group one can make a multiplication table :

 C4 |  1    i    i2   i3            D2 |  I    A    B    AB
----+--------------------         -----+--------------------
  1 |  1    i    i2   i3             I |  I    A    B    AB
  i |  i    i2   i3   1              A |  A    I    AB   B
 i2 |  i2   i3   1    i              B |  B    AB   I    A
 i3 |  i3   1    i    i2            AB |  AB   B    A    I
Pseudotheorem 5.3.5: Two finite groups are isomorphic if and only
if you can name the elements of both in such a way that the corresponding
multiplication tables are the same.
Example. In D3 , let a = rotation of 120◦ , and let b = reflection in the
y-axis. In S3 , let a = (123) and let b = (12). Then both have the following
multiplication table :

     |  1     a     a2    b     ab    a2b
-----+-----------------------------------
  1  |  1     a     a2    b     ab    a2b
  a  |  a     a2    1     ab    a2b   b
  a2 |  a2    1     a     a2b   b     ab
  b  |  b     a2b   ab    1     a2    a
  ab |  ab    b     a2b   a     1     a2
 a2b |  a2b   ab    b     a2    a     1
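One can let a machine check that the relations a3 = 1, b2 = 1 and ba = a2 b really do hold in S3 under the naming above, and that the result is a group table. A Python sketch (encoding permutations as tuples; the encoding and names are mine, not the text's):

```python
def comp(p, q):
    """Compose tuple-permutations: (p after q)(x) = p(q(x)), with p[x-1] = p(x)."""
    return tuple(p[i - 1] for i in q)

e = (1, 2, 3)
a = (2, 3, 1)                       # the 3-cycle (123)
b = (2, 1, 3)                       # the transposition (12)

names = {e: '1', a: 'a', comp(a, a): 'a2',
         b: 'b', comp(a, b): 'ab', comp(comp(a, a), b): 'a2b'}
assert len(names) == 6              # the six products exhaust S3

# the relations underlying the table: a^3 = 1, b^2 = 1, ba = a^2 b
assert comp(a, comp(a, a)) == e
assert comp(b, b) == e
assert comp(b, a) == comp(comp(a, a), b)

# each row and column of the table is a permutation of the six names,
# as must happen in any group's multiplication table
elems = list(names)
for g in elems:
    row = sorted(names[comp(g, h)] for h in elems)
    col = sorted(names[comp(h, g)] for h in elems)
    assert row == col == sorted(names.values())
```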
This gives an algorithm (but a very impractical one) for determining up
to isomorphism all possible groups of any given finite order. However, it is
not even known how many there are, for example, of order 256. (See also
Sections 8, 12 and 13).
Proposition 5.4. Isomorphism behaves like an equivalence relation.
That is, we have the following.
(i) G ≅ G for all G ; in fact, the identity map id : G → G is an
isomorphism.
(ii) G ≅ H ⇒ H ≅ G ; in fact, if φ : G → H is an isomorphism, then
φ−1 : H → G is also an isomorphism.
(iii) [G ≅ H & H ≅ K] ⇒ G ≅ K ; in fact, if φ : G → H and ψ : H → K
are isomorphisms, then so is ψφ : G → K.
Proof. The maps id, φ−1 and ψφ are certainly bijective. We must show
that they map products to products:
id(g1 g2 ) = g1 g2 = id(g1 )id(g2 ) .
(ψφ)(g1 g2 ) = ψ[φ(g1 g2 )] = ψ[φ(g1 )φ(g2 )]
= ψ[φ(g1 )]ψ[φ(g2 )] = [(ψφ)(g1 )][(ψφ)(g2 )] .
Given h1 , h2 ∈ H, choose gi ∈ G with φ(gi ) = hi . Then
φ−1 (h1 h2 ) = φ−1 [φ(g1 )φ(g2 )] = φ−1 [φ(g1 g2 )]
= g1 g2 = φ−1 (h1 )φ−1 (h2 ) .
6. Subgroups and cyclic groups.
Definition. A subset H of a group G is a subgroup of G exactly when
the following three conditions hold:
(I) a ∈ H and b ∈ H =⇒ ab ∈ H ;
(II) 1 ∈ H ; and
(III) a ∈ H =⇒ a−1 ∈ H.
Note. By (I), H inherits a multiplication from G, one which is associative. By (II) and (III), H is a group with this multiplication.
Examples. Ck ⊂ C∗ ; O(Rn ) ⊂ GL(Rn ) ; O(n) ⊂ GL(n, R) ;
D3 ⊂ O(R2 ) ; D4 ⊂ O(R2 ) ; 3Z = {0, ±3, ±6, ±9, · · ·} ⊂ Z ⊂ R ;
{[0], [3]} ⊂ Z6 ; {1, − 1} ⊂ C6 ; An := {σ : sign(σ) = 1} ⊂ Sn .
Exercise 6A. Prove: i) In each case above, the smaller is a subgroup
of the larger. ii) For any group G, the extreme subsets {1} and G itself are
both subgroups of G.
Proposition 6.1. Let H be a non-empty subset of a group G.
Then (H is a subgroup of G ) ⇐⇒ ([a ∈ H and b ∈ H] ⇒ a−1 b ∈ H ).
Proof. To prove ⇒, note that a−1 ∈ H by (III), so that a−1 b ∈ H by (I).
To prove ⇐, choose b ∈ H, since H is non-empty. Then b−1 b = 1 ∈ H, so
(II) holds. If a ∈ H, then a−1 1 = a−1 ∈ H , so (III) holds. Now if a ∈ H
and b ∈ H, then a−1 ∈ H, so (a−1 )−1 b = ab ∈ H, so (I) holds.
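For a finite subset, the criterion of 6.1 is directly checkable by machine. A Python sketch (in additive notation, inside Zn , where a−1 b becomes −a + b; the function name is mine):

```python
def is_subgroup(H, n):
    """Proposition 6.1 in Z_n (additive): non-empty and closed under -a + b."""
    return bool(H) and all((-a + b) % n in H for a in H for b in H)

assert is_subgroup({0, 3, 6, 9}, 12)             # the subgroup generated by [3]
assert is_subgroup({0, 2, 4, 6, 8, 10}, 12)
assert not is_subgroup({0, 3, 4}, 12)            # e.g. -3 + 0 = 9 is missing
assert not is_subgroup(set(), 12)
```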
Proposition 6.2. If {Hα } is an indexed non-empty collection of subgroups of G, then ∩α Hα is a subgroup of G.
Proof. Since 1 ∈ Hα ∀α, we have 1 ∈ ∩α Hα , so ∩α Hα ≠ ∅. Now apply
6.1. If a ∈ ∩α Hα and b ∈ ∩α Hα , then ∀α, a ∈ Hα and b ∈ Hα . Thus
a−1 b ∈ Hα ∀α. Thus a−1 b ∈ ∩α Hα , and so ∩α Hα is a subgroup.
Exercise 6B. In general, a union of subgroups is not a subgroup. What
set theoretic condition on a family of subgroups is equivalent to its union
being a subgroup?
Definition. Let A be any subset of a group G. The subgroup of G generated by A is the smallest subgroup of G which contains A. This definition
is somewhat glib (greatest lower informative bound), since there is no
guarantee beforehand that such a subgroup exists. The following explicit
descriptions of this subgroup take care of that problem.
Two descriptions:

H1 := ∩ { H : A ⊂ H and H is a subgroup of G } ;

i.e. H1 is the intersection of all subgroups containing A.

H2 := { a1 a2 · · · ak : k ≥ 0 ; and ∀i, either ai ∈ A or ai −1 ∈ A } ;

i.e. H2 is the set of all products, a1 a2 · · · ak , each of whose factors is either
an element from A or the inverse of an element from A.
Theorem 6.3. The sets H1 and H2 are both subgroups of G and are in
fact equal.
Proof. The set H1 is a subgroup by 6.2. The set H2 is a subgroup using
either 6.1 or the definition of the term subgroup. Now H1 ⊂ H2 , since H2
qualifies as an H in the intersection defining H1 , i.e. A ⊂ H2 and H2 is a
subgroup. On the other hand, H2 ⊂ H1 since each ai ∈ H1 and H1 is closed
under multiplication.
Examples.
(i) If A is a subgroup, then the subgroup generated by A is A itself.
(ii) The symmetric group Sn is generated by its subset of transpositions.
Exercise 6C. Prove that Sn is generated by { (12) , (123 · · · n) }, and
also by { (12) , (23) , · · · , (n − 1 n) }. Prove that An (which is called the
alternating group ) is generated by { (123) , (124) , · · · , (12n) }.
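Description H2 also suggests an algorithm: starting from the generators, multiply repeatedly until nothing new appears. A brute-force Python sketch (the function names are mine) confirming that (12) and (1234) generate S4 :

```python
def comp(p, q):
    """Compose tuple-permutations: (p after q)(x) = p(q(x)), with p[x-1] = p(x)."""
    return tuple(p[i - 1] for i in q)

def generated(gens, compose):
    """Close a finite set under products; in a finite group, the identity
    and all inverses then appear automatically as powers."""
    elems = set(gens)
    while True:
        new = {compose(a, b) for a in elems for b in elems} - elems
        if not new:
            return elems
        elems |= new

identity = (1, 2, 3, 4)
t = (2, 1, 3, 4)                    # the transposition (12)
c = (2, 3, 4, 1)                    # the 4-cycle (1234)

G = generated({t, c}, comp)
assert len(G) == 24 and identity in G    # (12) and (1234) generate all of S4
```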
(iii) If A = {g}, description H2 shows that the group generated by A is the
set, { g j : j ∈ Z }, of all powers of g.
Definition. A group G is a cyclic group if and only if G can be generated
by one element (more precisely, · · · by a singleton set).
Theorem 6.4. If G is finite cyclic, then G ≅ Z|G| . If G is infinite cyclic,
then G ≅ Z.
Proof. Suppose that G is cyclic and choose a generator g. Define a map
φ : Z → G by φ(j) = g j . Then
φ(j + k) = g j+k = g j g k = φ(j)φ(k) .
Also φ is surjective since G consists of all the powers of g.
Now suppose further that G is infinite. Then all powers of g are distinct
by 4E, so φ is injective. Thus φ is an isomorphism and G ≅ Z.
On the other hand, suppose that G is finite of order k. By 4.7,
G = { 1, g, g 2 , · · · , g k−1 } ,
and g has order k. Define ψ : Zk → G by ψ([j]) = g j . By 4.7, ψ is well
defined and bijective. Also
ψ([j] + [k]) = ψ([j + k]) = g j+k = g j g k = ψ([j])ψ([k]) .
Hence ψ is an isomorphism, and G ≅ Zk , as required.
Corollary 6.5. Up to isomorphism, there is only one cyclic group of
each finite order and one infinite cyclic group.
Note. Any element in a group generates a unique cyclic subgroup of the
group. The order of that subgroup is the order of the element.
6+ Cayley’s Theorem.
If G = { g1 , g2 , · · · , gn } is a finite group of order n, and x ∈ G, then the
sequence xg1 , xg2 , · · · , xgn is just gσ(1) , gσ(2) , · · · , gσ(n) for some unique
σ ∈ Sn . If x corresponds to σ in this way, and y to γ, then xy corresponds
to σγ, and 1 to e, and x−1 to σ −1 . The subset of such σ corresponding to
elements of G is a subgroup, H, of Sn , and G ≅ H. This is Cayley’s
theorem :
Any group of order n is isomorphic to a subgroup of Sn .
Exercise 6D. Verify all these statements to prove the theorem (or prove
it in some other way, if you prefer).
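For a concrete instance of the construction, the sketch below (in Python; names are mine) takes G = Z6 under addition and records, for each x, the permutation given by left translation, then checks that products correspond to composites:

```python
# Cayley's theorem for G = Z6 under addition mod 6: each x acts on G by
# left translation g -> x + g, which permutes the elements of G.
n = 6
G = list(range(n))

def perm_of(x):
    """The permutation of (0, 1, ..., 5) given by left translation by x."""
    return tuple((x + g) % n for g in G)

reps = {x: perm_of(x) for x in G}
assert len(set(reps.values())) == n          # x -> sigma is injective

def comp(p, q):
    """Composition of 0-indexed tuple-permutations: (p after q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(n))

# the product x + y in G corresponds to the composite permutation
for x in G:
    for y in G:
        assert reps[(x + y) % n] == comp(reps[x], reps[y])
```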
7. Cosets and Lagrange’s theorem.
One can generalize the number theoretic notion of congruence (mod k) as
follows.
Definition. Let H be a subgroup of G and let a, b ∈ G. Then we say
that a is congruent to b (mod H), and write a ≡ b(mod H) , if and only if
a−1 b ∈ H . [In an additive group, · · · ⇔ b − a ∈ H.]
Examples.
(i) G = Z, H = nZ = { 0, ± n, ± 2n, ± 3n, · · · }. Then
a ≡ b(mod nZ) ⇐⇒ a ≡ b(mod n).
(ii) G = C× , H = { z ∈ C : |z| = 1 }, the circle group.
Then a ≡ b(mod H) ⇐⇒ |a| = |b|. (N.B. Of course, here | | means
‘complex modulus’, not ‘order’.)
(iii) G = R, H = Z. Then a ≡ b(mod Z) ⇐⇒ a and b have the same
decimal expansion to the right of the decimal point.
(iv) G = Sn , H = An = the group of even permutations. Then
a ≡ b(mod An ) ⇐⇒ a and b are either both even or both odd.
(v) If H = G, then a ≡ b(mod H) for all a and b.
(vi) If H = {1}, then a ≡ b(mod H) ⇐⇒ a = b.
(vii) If G = S3 and H = {e, (12)}, then we have three congruences:
e ≡ (12)(mod H) , (123) ≡ (13)(mod H) , and (132) ≡ (23)(mod H) ,
but no other pair of distinct elements is congruent (mod H) (except by
reversing the order of a pair above).
Exercise 7A. Check the details for each example.
Proposition 7.1. Congruence (mod H) is an equivalence relation on the
set G.
Proof. (i) For all a, we have a ≡ a(mod H), since a−1 a = 1 ∈ H.
(ii) If a ≡ b(mod H) then b ≡ a(mod H), since a−1 b ∈ H implies that
b−1 a = (a−1 b)−1 ∈ H.
(iii) If a ≡ b(mod H) and b ≡ c(mod H), then a ≡ c(mod H) :
assuming that both a−1 b ∈ H and b−1 c ∈ H results in a−1 c ∈ H, since
a−1 c = (a−1 b)(b−1 c).
Proposition 7.2. The equivalence class, [g0 ], of g0 under congruence
(mod H) is { g0 h : h ∈ H } , the set of ‘right multiples’ of g0 by elements
of H.
Proof. x ∈ [g0 ] ⇐⇒ g0 ≡ x(mod H) ⇐⇒ g0−1 x ∈ H ⇐⇒ ∃h ∈ H such
that x = g0 h ⇐⇒ x ∈ { g0 h : h ∈ H }.
Definition. The class in 7.2 is called the left coset of g0 (mod H), and is
denoted g0 H (in additive notation, g0 + H), rather than [g0 ].
Corollary 7.3. If g1 H ≠ g2 H, then g1 H ∩ g2 H is empty. Also

G = ∪g0 ∈G g0 H .
Proof. This is just the partitioning effect of any equivalence relation. (It
may also be proved directly rather easily.)
Proposition 7.4. If H is finite, then g0 H has the same number of
elements as H, for any g0 ∈ G.
Proof. Define µ : H → g0 H by µ(h) = g0 h. Then µ is bijective (whether
H is finite or not), proving the assertion (and more).
Theorem 7.5. (Lagrange) The order of any subgroup of a finite group
divides the order of the group. In fact,
|G| = |H| × number of left cosets of G(mod H) ,
for any subgroup, H, of G.
Proof. By 7.3, G is the disjoint union of its left cosets (mod H). By
7.4, all left cosets have the same number of elements, namely |H|.
Definition. The number of left cosets of G(mod H) is called the index of
H in G, and denoted [G : H]. Thus 7.5 may be written [G : H] = |G| / |H|.
Corollary 7.6. The order of any element of a finite group divides the
order of the group .
Proof. The order of an element g is the order of the cyclic subgroup H
generated by g, so this follows from 7.5.
Exercise 7B. Define right coset and a relation analogous to congruence
whose equivalence classes are the right cosets. Deduce from an analogue of
7.5 that the number of right cosets is also the index.
Examples.
(i) C12 has the following proper subgroups :
{1} ; {1, ρ6 } ; {1, ρ4 , ρ8 } ; {1, ρ3 , ρ6 , ρ9 } and {1, ρ2 , ρ4 , ρ6 , ρ8 , ρ10 } ,
of orders 1, 2, 3, 4 and 6 respectively, all dividing 12. For example, {1, ρ6 }
has index 6 in C12 , the six cosets being:
{1, ρ6 } ; {ρ, ρ7 } ; {ρ2 , ρ8 } ; {ρ3 , ρ9 } ; {ρ4 , ρ10 } and {ρ5 , ρ11 } .
(ii) Z12 has proper subgroups
{[0]} ; {[0], [6]} ; {[0], [4], [8]} ; {[0], [3], [6], [9]} and {[0], [2], [4], [6], [8], [10]}.
This example is similar to (i) because C12 ≅ Z12 .
(iii) S3 has proper subgroups
{e} ; {e, (12)} ; {e, (13)} ; {e, (23)} ; and {e, (123), (132)}
of orders 1,2,2,2 and 3, all dividing 6. For example {e, (12)} has index 3 in
S3 , the left cosets being
{e, (12)} ; {(123), (13)} and {(132), (23)} .
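The partition of S3 into these three cosets can be verified mechanically. A Python sketch (not from the text; permutations are encoded as tuples):

```python
from itertools import permutations

def comp(p, q):
    """Compose tuple-permutations: (p after q)(x) = p(q(x)), with p[x-1] = p(x)."""
    return tuple(p[i - 1] for i in q)

S3 = set(permutations((1, 2, 3)))
H = {(1, 2, 3), (2, 1, 3)}          # the subgroup {e, (12)}

# left cosets gH = { gh : h in H }
cosets = {frozenset(comp(g, h) for h in H) for g in S3}

assert len(cosets) == 3                      # the index [S3 : H] = 6/2
assert all(len(c) == 2 for c in cosets)      # each coset has |H| elements
assert set().union(*cosets) == S3            # together they partition S3
```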
(iv) D4 has subgroups : {I}, of order 1 ;
{I, A2 } , {I, B} , {I, AB} , {I, A2 B} , {I, A3 B} ,
of order 2, all isomorphic to C2 ;
{I, A, A2 , A3 } isomorphic to C4 and
{I, A2 , B, A2 B} and {I, A2 , AB, A3 B} isomorphic to D2
all of order 4; and itself of order 8.
Corollary 7.7. Assuming that G is a finite group, for all g ∈ G, we have
g |G| = 1 .
Proof. By 7.6, there exists r ∈ Z with |G| = r||g||. Then
g |G| = g ||g||r = 1r = 1 .
Corollary 7.8. (Euler) aΦ(k) ≡ 1(mod k) for all integers a prime to k,
where Φ(k) is defined to be the number of integers prime to k between 1 and
k.
Proof. Apply 7.7 with g = [a] and G = Z×k , the group of congruence
classes prime to k under multiplication (mod k), noting that |Z×k | = Φ(k).
We’ve come down from the ethereal abstractions of general group theory
to the terra firma of number theory. And you’ve just pronounced the next
mathematician’s name more or less correctly, except perhaps for emphasizing
the second syllable. In particular, Fermat doesn’t rhyme with doormat!
Corollary 7.9. (Fermat) If p is prime, then ap ≡ a(mod p) for all
a ∈ Z.
Proof. This is trivial if p divides a. If not, let k = p in 7.8, giving
ap−1 ≡ 1 (mod p), so ap ≡ a (mod p).
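Both corollaries are pleasant to confirm numerically. A Python sketch (with a brute-force Φ; the function name is mine):

```python
from math import gcd

def phi(k):
    """Brute-force Euler Phi: count 1 <= a <= k with gcd(a, k) = 1."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

# Euler (7.8): a^Phi(k) = 1 (mod k) whenever a is prime to k
for k in range(2, 30):
    for a in range(1, k):
        if gcd(a, k) == 1:
            assert pow(a, phi(k), k) == 1

# Fermat (7.9): a^p = a (mod p) for every prime p and every integer a
for p in (2, 3, 5, 7, 11, 13):
    for a in range(50):
        assert pow(a, p, p) == a % p
```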
Exercise 7C. This last step seems to use some general fact. Assume
that a ≡ b(mod H). Given a third element c, which, if either, of the following
then hold :

ca ≡ cb(mod H) ; ac ≡ bc(mod H) ?
8. Direct product ; groups of small order.
We’ve already been using the Cartesian product of two sets A and B,
which is defined to be
A × B := { (a, b) : a ∈ A and b ∈ B } .
When both sets have a group structure, so does their product, as follows:
Definition. If G and H are groups, define their direct product to be the
set G × H with the multiplication :

(g1 , h1 ) (g2 , h2 ) := (g1 g2 , h1 h2 ) ,

where the left-hand side is the multiplication being defined, and g1 g2 and
h1 h2 are the already known multiplications in G and H.
Proposition 8.1. The set G × H is a group with this multiplication.
The identity is (1G , 1H ), where 1G and 1H are the identities in G and H
respectively. The inverse of (g, h) is (g −1 , h−1 ).
Proof. (1) To verify associativity, we have
[(g1 , h1 ) (g2 , h2 )] (g3 , h3 ) = (g1 g2 , h1 h2 ) (g3 , h3 )
= ((g1 g2 )g3 , (h1 h2 )h3 ) = (g1 (g2 g3 ) , h1 (h2 h3 ))
= (g1 , h1 ) (g2 g3 , h2 h3 ) = (g1 , h1 ) [(g2 , h2 ) (g3 , h3 )] .
(2) Next,
(1G , 1H )(g, h) = (1G · g, 1H · h) = (g, h) = (g · 1G , h · 1H ) = (g, h)(1G , 1H ) .
(3) Finally,
(g −1 , h−1 )(g, h) = (g −1 g, h−1 h) = (1G , 1H ) .
Similarly, (g, h)(g −1 , h−1 ) = (1G , 1H ) .
Exercise 8A. Some lines above are unnecessary. Show that, to check
that a given ‘set with associative operation’—called a ‘semigroup’— is actually a group, it is only necessary to check that there is a (say) left identity
element, and then that each element has a left inverse. On the other hand,
replacing “left” by “right” only once above is not a sufficient condition for
‘groupiness’.
Notes. (a) If G1 , G2 and G3 are groups, then one might wish to define
G1 × G2 × G3 to be the set of ordered triples (g1 , g2 , g3 ), with the obvious
coordinate-wise multiplication. It is almost equal to, and certainly isomorphic to, both (G1 × G2 ) × G3 and G1 × (G2 × G3 ). Similarly for any finite
sequence, G1 , G2 , · · · , Gn , of groups.
(b) Clearly |G × H| = |G| · |H| , so inductively
|G1 × G2 × · · · × Gn | = |G1 | · |G2 | · . . . · |Gn | .
(c) If G and H are both commutative, then so is G × H.
Examples. (i) C2 × C2 ≅ D2 ≇ C4 .
(ii) C2 × C3 ≅ C6 ≇ S3 ≅ D3 .
(iii) C2 × C4 ≇ either C8 or D4 ; C2 × C2 × C2 ≇ any of the previous
three.
(iv) R×R×· · ·×R = Rn ; Rn and R are isomorphic as groups (HINT:
even as vector spaces over Q), but not isomorphic as real vector spaces if n
is larger than one.
(v) R× × R× × · · · × R× ≅ the subgroup consisting of all the diagonal
matrices in GL(n, R).
(vi) S 1 × S 1 looks like a (hollow) torus, where S 1 is the circle group.
(vii) For any G, we have {1} × G ≅ G ; for any G and H, we have
G × H ≅ H × G.
Exercise 8B. Check the details above.
Theorem 8.2. If |G| is a prime p, then G ≅ Cp .
Thus, for k = 2, 3, 5, 7, 11, · · · , there is only one group, up to isomorphism,
of order k.
Proof. Since p > 1, we can choose g ∈ G with g ≠ 1. Then ||g|| > 1, but
||g|| divides p by 7.6, so ||g|| = p, since p is prime. Thus there are “p” distinct
powers of g. Thus G = {1, g, g 2 , · · · , g p−1 } is the cyclic group generated by g.
By 5.4, G ≅ Cp .
Proposition 8.3. Any group of order 4 is isomorphic to exactly one of
C4 or C2 × C2 .
Proof. If G has an element g of order 4, then G = {1, g, g 2 , g 3 } is cyclic
of order 4, so G ≅ C4 by 5.4. If not, then every non-identity element has
order 2, by 7.6. Let a and b denote distinct elements of order 2. Then ab ≠ a
since b ≠ 1 ; ab ≠ b since a ≠ 1 ; and ab ≠ 1 since a−1 = a ≠ b. Hence
G = {1, a, b, ab} and a2 = b2 = (ab)2 = 1. Also ba = a2 (ba)b2 = a(ab)2 b = ab.
This determines the multiplication table of G; and C2 × C2 has the same
multiplication table if we let a and b be (−1, 1) and (1, −1). (This is the
same as the table given for D2 previously, if we change A to a and B to b.)
Thus G ≅ C2 × C2 (using 5.3.5).
Exercise 8C. Show that a group whose elements satisfy g 2 = 1 for all g
is necessarily commutative.
The classification of the remaining cases for orders less than 12 is given
below in 8.4 to 8.8.
Proposition 8.4. Any group of order 6 is isomorphic to exactly one of
C6 or D3 .
Proposition 8.5. Any group of order 10 is isomorphic to exactly one of
C10 or D5 .
Proposition 8.6. Any group of order 9 is isomorphic to exactly one of
C9 or C3 × C3 .
The proofs of these are more enlightening using general theorems. They
are done in sections 11 and 13. In 8.4 and 8.5, the two given groups are not
isomorphic because one is commutative and the other isn’t. Alternatively
we can look at the orders of elements. In 8.6, the group C9 has elements of
order 9, but C3 × C3 doesn’t, using:
Proposition 8.7. If group elements g and h have finite order, then
||(g, h)|| = LCM{ ||g|| , ||h|| } .
Proof. By induction and manipulation, (g, h)n = (g n , hn ) . So :

(g, h)n = (1G , 1H ) ⇐⇒ (g n = 1G and hn = 1H ) ⇐⇒
(||g|| divides n , and ||h|| divides n) ⇐⇒ LCM{ ||g|| , ||h|| } divides n .
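For a concrete check of 8.7, the sketch below (Python; names mine) works in Zm × Zn under componentwise addition, where the order of an element g of Zm is easy to compute directly:

```python
from math import gcd

def order_mod(g, m):
    """Order of the element g in the additive group Z_m."""
    j = 1
    while (j * g) % m != 0:
        j += 1
    return j

# In Z_m x Z_n (componentwise addition), ||(g, h)|| = LCM{ ||g||, ||h|| }
for m, n in [(4, 6), (3, 3), (8, 5)]:
    for g in range(m):
        for h in range(n):
            a, b = order_mod(g, m), order_mod(h, n)
            lcm = a * b // gcd(a, b)
            j = 1                                # brute-force order of (g, h)
            while ((j * g) % m, (j * h) % n) != (0, 0):
                j += 1
            assert j == lcm
```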
Finally, consider groups of order 8. We have three commutative groups
C8 , C4 ×C2 , and C2 ×C2 ×C2 (no pair isomorphic since the maximum orders
of elements are 8, 4, and 2 respectively). There is also the non-commutative
group D4 . There is a fifth group of order 8, namely Qrn, the quaternion
group :
Qrn = { ± 1 , ± i , ± j , ± k } ,
often pictured as a subset of R4 , where
1 = (1, 0, 0, 0) ; i = (0, 1, 0, 0) ; j = (0, 0, 1, 0) ; k = (0, 0, 0, 1) .
The multiplication table is determined by
i2 = j 2 = k 2 = −1 ; (−1)(i) = −i ; (−1)(j) = −j ; (−1)(k) = −k; and
ij = k = − ji ; jk = i = − kj ; ki = j = − ik .
One can verify associativity by computing; Qrn is clearly not commutative;
and Qrn ≇ D4 , since Qrn has only one element, −1, of order 2, whereas D4
has five such elements (four reflections and rotation of 180◦ ).
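The claims about Qrn — associativity, and the uniqueness of the element of order 2 — can be verified by brute force. A Python sketch, encoding ±1, ±i, ±j, ±k as (sign, unit) pairs (this encoding is mine, not the text's):

```python
# The table below encodes i^2 = j^2 = k^2 = -1, ij = k = -ji,
# jk = i = -kj, ki = j = -ik, with elements stored as (sign, unit).
UNIT = {('1', u): (1, u) for u in '1ijk'}
UNIT.update({(u, '1'): (1, u) for u in 'ijk'})
UNIT.update({('i', 'i'): (-1, '1'), ('j', 'j'): (-1, '1'), ('k', 'k'): (-1, '1'),
             ('i', 'j'): (1, 'k'), ('j', 'i'): (-1, 'k'),
             ('j', 'k'): (1, 'i'), ('k', 'j'): (-1, 'i'),
             ('k', 'i'): (1, 'j'), ('i', 'k'): (-1, 'j')})

def mul(x, y):
    s, u = UNIT[(x[1], y[1])]
    return (x[0] * y[0] * s, u)

Q = [(s, u) for s in (1, -1) for u in '1ijk']
one = (1, '1')

# associativity, by brute force over all 8^3 triples
assert all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in Q for y in Q for z in Q)

def order(g):
    j, x = 1, g
    while x != one:
        x, j = mul(x, g), j + 1
    return j

# exactly one element of order 2, namely -1; the rest have order 1 or 4
assert [g for g in Q if order(g) == 2] == [(-1, '1')]
assert sorted(order(g) for g in Q) == [1, 2, 4, 4, 4, 4, 4, 4]
```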
Proposition 8.8. Any group of order 8 is isomorphic to exactly one of
C8 , C4 × C2 , C2 × C2 × C2 , D4 or Qrn.
(See sections 11 and 13).
The problem of classifying finite groups has attracted mathematicians for
many years, with much success, but no final solution yet in sight. Summary
of our statements:
order                    |  1  2  3  4  5  6  7  8  9  10  11
# of groups              |  1  1  1  2  1  2  1  5  2   2   1
# of commutative groups  |  1  1  1  2  1  1  1  3  2   1   1
Definition. Let H and L be subgroups of a group G. We say that G
splits as the internal direct product of H and L if and only if the following
map is an isomorphism:
φ : H × L −→ G , (h, ℓ) ↦ hℓ ; i.e. φ(h, ℓ) = hℓ .
Theorem 8.9. A group G splits as the internal direct product of H and
L if and only if the following three conditions hold:
(i) H ∪ L generates G;
(ii) H ∩ L = {1}; and
(iii) if h ∈ H, and ℓ ∈ L, then ℓh = hℓ.
Note. (i), (ii), and (iii) are often used as the definition of splitting.
Proof. ⇒:
(i) If g ∈ G, then g ∈ Imφ, so g = hℓ for some h ∈ H and ℓ ∈ L. Thus g
is in the subgroup generated by H ∪ L, as required.
(ii) If g ∈ H ∩ L, then (g, g −1 ) ∈ H × L. But φ(g, g −1 ) = 1, and φ is
injective, so (g, g −1 ) = (1, 1), i.e. g = 1, as required.
(iii) We have

hℓ = φ((h, ℓ)) = φ((1, ℓ)(h, 1)) = φ((1, ℓ))φ((h, 1)) = ℓh .
⇐: We prove this using (iii), (ii) and (i) in that order. The function φ is a
morphism of groups (see Section 9), since

φ[(h1 , ℓ1 )(h2 , ℓ2 )] = φ(h1 h2 , ℓ1 ℓ2 ) = h1 h2 ℓ1 ℓ2
= h1 ℓ1 h2 ℓ2 = φ(h1 , ℓ1 )φ(h2 , ℓ2 ) .

It is injective, since

φ[(h, ℓ)] = 1 =⇒ hℓ = 1 =⇒ h = ℓ−1 ∈ H ∩ L
=⇒ h = ℓ−1 = 1 =⇒ (h, ℓ) = (1, 1) ,

as required.
It is surjective since Imφ is a subgroup of G containing H ∪ L and so it must
be all of G, as required. It contains H ∪ L because

φ[(h, 1)] = h and φ[(1, ℓ)] = ℓ .
Example. C12 is not the internal direct product of C2 and C3 , since
C2 ∪ C3 generates only C6 . It is not the internal direct product of C4 and C6
since C4 ∩ C6 = C2 ≠ {1}. It is the internal direct product of C4 and C3 . The
first two statements follow more easily by noting that 12 is equal to neither
2×3 nor 4×6.
9. Morphisms.
Definition. A function φ : G → H between groups is a homomorphism
or a morphism of groups if and only if, for all g1 and g2 in G, we have
φ(g1 g2 ) = φ(g1 )φ(g2 ) .
An epimorphism is a surjective morphism. A monomorphism is an injective
morphism. An isomorphism is a bijective morphism.
Thus ‘iso.’ is the same as ‘both mono. and epi.’, and agrees with the
previous definition.
Exercise 9A. Check all the details in the examples below.
Examples. (i) φ : Z → Zk , φ(n) := [n], is an epimorphism.
(ii) φ : Ck → C∗ , φ(x) := x, is a monomorphism.
(iii) φ : H → G, φ(x) := x, where H is a subgroup of G, is a monomorphism.
(iv) φ : GL(n, R) → R∗ , φ(A) := detA, is an epimorphism.
(v) φ : R∗ → GL(n, R) , φ(x) := xI , the diagonal matrix with all x’s in
the diagonal, is a monomorphism.
(vi) φ : Sn → C2 , φ(σ) := sign(σ) , is an epimorphism for n ≥ 2, an
isomorphism for n = 2, and a monomorphism for n = 1 and 2.
(vii) If φ : G → H and ψ : H → J are both morphisms, then so is
ψ ◦ φ : G → J.
From here until the end of Section 10, ASSUME THAT
φ : G → G′ is a morphism, and that 1 and 1′ are the identity elements in
groups G and G′ , respectively.
Proposition 9.1. We have φ(1) = 1′ , and φ(g −1 ) = φ(g)−1 for all
g ∈ G.
Proof. This is the same as that for 5.2; bijectivity was not used in that
proof.
Proposition 9.2. Imφ := { φ(g) : g ∈ G } is a subgroup of G′ .
Proof. Now G is not empty, so Imφ is not empty. Also
φ(a)−1 φ(b) = φ(a−1 )φ(b) = φ(a−1 b) ∈ Imφ ,
so Imφ is a subgroup by 6.1. (Imφ is called the image of φ.)
Definition. The kernel of φ is the subset of G defined by

Kerφ := { g ∈ G : φ(g) = 1′ } .

Proposition 9.3. Kerφ is a subgroup of G.
Proof. We know that φ(1) = 1′ , so Kerφ isn’t empty. Also

a, b ∈ Kerφ =⇒ φ(a−1 b) = φ(a)−1 φ(b) = 1′−1 1′ = 1′ ,

so 6.1 proves it.
Examples. (i) φ : Z → Zk , φ(n) := [n], has kernel equal to the subgroup
kZ = { 0, ± k, ± 2k, · · · }. (Note that kZ is not coset notation here.)
(ii) φ : GL(n, R) → R∗ , φ(A) := detA, has kernel equal to the group,
SL(n, R), of all matrices of determinant +1. These groups are called the
general linear group and the special linear group, respectively.
(iii) φ : Sn → C2 , φ(σ) := sign(σ) , has kernel An , the alternating group
(of all even permutations).
(iv) If φ is a monomorphism, then Kerφ = {1} and Imφ ≅ G.
(v) The trivial morphism φ : G → G′ , φ(g) := 1′ for all g, has kernel
equal to G.
(vi) φ : Ck → Cℓ , φ(x) := xk/d , where d = GCD{k, ℓ}, has kernel equal
to Ck/d , and image Cd .
Exercise 9B. Show that this last φ is a well-defined function and a
morphism. Also check the other details in these examples.
Proposition 9.4. φ is a monomorphism if and only if Kerφ = {1}.
Proof. If φ is a monomorphism, then φ−1 (g ′ ) contains at most one element, for all g ′ ∈ G′ . But then Kerφ = φ−1 (1′ ) = {1}. Conversely suppose
that Kerφ = {1}. Then

φ(a) = φ(b) =⇒ φ(a−1 b) = φ(a)−1 φ(b) = φ(b)−1 φ(b) = 1′
=⇒ a−1 b ∈ Kerφ =⇒ a−1 b = 1 =⇒ a = b .

Thus φ is injective, as required.
10. Normal subgroups and the first isomorphism theorem.
Proposition 10.1. φ(a) = φ(b) ⇐⇒ a ≡ b(mod Kerφ).
Proof. We have :

φ(a) = φ(b) ⇐⇒ φ(a−1 b) = 1′
⇐⇒ a−1 b ∈ Kerφ ⇐⇒ a ≡ b(mod Kerφ) .
Corollary 10.2. φ(a) = φ(b) ⇐⇒ aK = bK , where K = Kerφ.
Proof. Each set gK is an equivalence class under the equivalence relation
which is congruence (mod K).
Thus the equivalence relation ∼ defined by [a ∼ b ⇐⇒ φ(a) = φ(b)] is
just congruence (mod Kerφ). But it’s an elementary fact of set theory that
the equivalence classes under ∼ are in 1-1 correspondence with the image of
φ, for any function φ between sets, no group structure being needed. Let’s
reprove it in this case:
Definition. If H is any subgroup of a group G, denote by G/H the set
of all left cosets of G(mod H). (The number of elements of G/H is then
[G : H].)
Proposition 10.3. If K = Kerφ, then ψ : G/K → Imφ, defined by
ψ(gK) = φ(g) is a well-defined bijective set function.
Proof. To see that ψ is well defined, we must show that the formula is
independent of the choice of g within a given coset; i.e. that
aK = bK ⇒ φ(a) = φ(b). This is half of 10.2. To show that ψ is injective, we
must show that ψ(aK) = ψ(bK) ⇒ aK = bK ; i.e. φ(a) = φ(b) ⇒ aK = bK.
This is just the other half of 10.2. Finally, ψ is surjective, since
(g ′ ∈ Imφ) ⇒ (∃g ∈ G with φ(g) = g ′ ) ⇒ (∃gK ∈ G/K with ψ(gK) = g ′ ) .
Now Imφ is a group, by 9.2. The natural thing to do is to make explicit
the unique binary operation on G/K which converts ψ from being merely a
bijective set function into being a group isomorphism. Now
ψ(aK)ψ(bK) = φ(a)φ(b) = φ(ab) = ψ(abK) .
We want the latter to be equal to ψ[(aK)(bK)] for the (as yet undefined)
product of aK and bK. But ψ is injective, so we are forced to define:
(aK)(bK) := (ab)K .
Now we could try to define a binary operation in this way on G/H for any
subgroup H, but it would not be well-defined in general. The subgroup
must be normal in G:
Definition. A subgroup N of a group G is normal in G if and only if,
for all g in G and n in N , we have g −1 ng ∈ N .
Definition. When N is a normal subgroup of a group G, define (modulo
checking that it is well defined) a multiplication on G/N as follows :

(aN )(bN ) := (ab)N ,

where the left-hand side is the multiplication to be defined, and ab uses the
multiplication in G already given.
Theorem 10.4. This multiplication is well defined, and G/N becomes
a group using it. The identity element is N . The inverse of gN is g −1 N .
Definition. G/N is then called the quotient group ‘G modulo N ’.
Proof. For well definition, we must show that the definition is independent of choice of a and b within their respective cosets, i.e. that
aN = ãN and bN = b̃N =⇒ abN = ãb̃N .
The assumptions yield a−1 ã ∈ N and b−1 b̃ ∈ N , and therefore
(ab)−1 ãb̃ = [b−1 (a−1 ã)b][b−1 b̃] ∈ N ,
since the last element is a product of two elements of N (the left-hand factor
being in N because N is normal). But (ab)−1 ãb̃ ∈ N yields the desired
conclusion.
Associativity follows easily:
aN {(bN )(cN )} = (aN )(bcN ) = {a(bc)}N
= {(ab)c}N = (abN )(cN ) = {(aN )(bN )}cN .
As for the identity element, (1N )(gN ) = (1 · g)N = gN = (gN )(1N ), as
required. For inverses,
(g −1 N )(gN ) = (g −1 g)N = 1N = N ,
as required.
Proposition 10.5. For any morphism φ : G → G′ , the subgroup Kerφ
is normal in G.
Proof. If g ∈ G and n ∈ Kerφ, then

φ(g −1 ng) = φ(g −1 )φ(n)φ(g) = φ(g)−1 1′ φ(g) = 1′ ,
so g −1 ng ∈ Kerφ.
Theorem 10.6. (1st isomorphism theorem) If φ : G → G′ is
any morphism, then the map ψ : G/Kerφ → Imφ , given by setting
ψ(gKerφ) := φ(g) , is an isomorphism.
Proof. The group structure on G/Kerφ comes from 10.5 and 10.4. Also
ψ is a well defined bijective set function by 10.3. It remains only to show
that ψ is a morphism. But this was essentially done just after the proof of
10.3:
ψ(aK)ψ(bK) = φ(a)φ(b) = φ(ab) = ψ(abK) = ψ{(aK)(bK)} .
This last computation had to work because we chose the multiplication
in G/Kerφ to make it work—see the discussion after 10.3.
There is one important case where every subgroup is normal—when G is
abelian:
Proposition 10.7. If G is commutative, then every subgroup H of G is
normal in G.
Proof. If g ∈ G and n ∈ H, then g −1 ng = g −1 gn = n ∈ H.
Examples. (i) kZ is the kernel of φ : Z → Zk , φ(m) = [m], and
Imφ = Zk , so Z/kZ ≅ Zk by 10.6. In fact, the groups Z/kZ and Zk are
equal, and the map ψ corresponding to this particular φ is the identity map.
(ii) The subgroup {e, (12)} is not normal in S3 , nor are {e, (13)} nor
{e, (23)}. But A3 = {e, (123), (132)} is normal—in general An is the kernel
of φ : Sn → C2 , φ(σ) = sign(σ), so An is normal in Sn . As long as n > 1,
Imφ = C2 , so, by 10.6, Sn /An ≅ C2 . The map ψ, corresponding to this φ,
maps An , the set of even permutations, to +1, and the other coset, the set
of odd permutations, to −1.
Exercise 10A. Show that a subgroup of index 2 is necessarily normal.
(iii) Let Dn be the subgroup of all the orthogonal transformations which
preserve some chosen regular n-gon centred at the origin in R2 . It consists
of “n” reflections, and “n” rotations (including the identity element). Define
φ : Dn → R∗ by φ(T ) = det T . Then Imφ = C2 and Kerφ = Dn+ , the cyclic
group of rotations. By 10.6, Dn /Dn+ ≅ C2 . The corresponding ψ maps Dn+
to +1, and the other coset, consisting of reflections, to −1.
(iv) The trivial morphism φ : G → G′ , φ(g) = 1′ ∀g, has image {1′ } and
has kernel G, so G/G ≅ {1′ }, which is rather obvious directly.
(v) The identity map from G to G has image G and has kernel {1}, so
G/{1} ≅ G, another unsurprising consequence of 10.6.
Exercise 10B. Check, check, check!
Exercise 10C. For both of the following asserted isomorphisms, state the
needed hypotheses concerning normality, and then find isomorphisms. Note
that when one is looking for an isomorphism whose domain is a quotient
group, it’s usually easiest to define an epimorphism on the numerator, and
prove that its kernel is the denominator. In the second one, A and B are
both subgroups of a larger containing group, and ab = ba for all a ∈ A and
b ∈ B.
(G/K)/(H/K) ≅ G/H ;    (A × B)/D ≅ the subgroup generated by A ∪ B ,

where D = { (x, x) : x ∈ A ∩ B }.
(One could also take D = { (x−1 , x) : x ∈ A ∩ B }.)
11. Solubility and a Sylow theorem.
To begin, here is a (perhaps nasty)
Exercise 11A. Find an example of a group G, and k, a positive integer
dividing |G|, such that G has no subgroup of order k.
There is, however, an important partial converse to Lagrange’s theorem.
In this section, p and q always denote primes.
Theorem 11.1. (Sylow) If p is a prime and pt divides |G| , where G is
a finite group, then G has a subgroup of order pt .
Proof. For any non-empty subset A of G, define a subset
HA := { g ∈ G : gA = A }, where gA := { ga : a ∈ A }. Then HA is a
subgroup of G. Furthermore |HA | ≤ |A|, since multiplying distinct elements
g into some fixed element of A yields distinct answers. Let nA be the number
of distinct subsets gA as g ranges over G. Then g1 A = g2 A ⇐⇒ g1 ≡
g"2 (mod HA#), so nA = |G/HA | = |G|/|HA |. Since the binomial coefficient
|G| − 1
isn’t divisible by p, and
pt − 1
p
t
"
|G|
pt
#
the largest powers of p dividing
"
#
= |G|
"
|G|
pt
"
#
|G| − 1
pt − 1
#
,
and |G|/pt are the same. But
|G|
is the number of subsets with pt elements, which is a sum of numbers
pt
nA for various A with pt elements. Hence there exists A0 with pt elements
such that nA0 is not divisible by any larger power of p than the largest such
power which divides |G|/pt . Then |HA0 | = |G|/nA0 is divisible by pt , but
|HA0 | ≤ |A0 | = pt , so |HA0 | = pt . Hence we let the required subgroup be
HA0 .
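The arithmetic facts underlying this counting argument can be checked numerically. The sketch below (our own Python; the sample group orders are arbitrary) verifies the identity p^t·C(|G|, p^t) = |G|·C(|G|−1, p^t−1), that C(|G|−1, p^t−1) is prime to p, and hence that the largest power of p dividing C(|G|, p^t) equals that dividing |G|/p^t.

```python
from math import comb

def p_adic_val(n, p):
    # the exponent of the prime p in n
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

# (|G|, p, t) samples with p^t dividing |G|
for n, p, t in [(24, 2, 3), (60, 2, 2), (60, 5, 1), (360, 3, 2)]:
    pt = p ** t
    assert n % pt == 0
    # the identity  p^t * C(n, p^t) = n * C(n - 1, p^t - 1)
    assert pt * comb(n, pt) == n * comb(n - 1, pt - 1)
    # C(n - 1, p^t - 1) is prime to p ...
    assert comb(n - 1, pt - 1) % p != 0
    # ... hence v_p(C(n, p^t)) = v_p(n / p^t)
    assert p_adic_val(comb(n, pt), p) == p_adic_val(n // pt, p)
print("counting identities verified")
```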
Exercise 11B. Look up the definition of the action of a group on a set
at the start of Section 47. See also 47S. Now check the first two of the
following comments about actions.
Above we have G acting on the set of all its subsets with pt elements, and
we construct a subgroup fixing a subset. On the other hand, in the proof of
11.4 below, the group acts on itself by conjugation.
In sections 47 to 50, we study the linear actions of a group on vector
spaces. In the sections on Galois theory, the groups which arise do so by
virtue of their actions on sets of roots of polynomials, and also via operation-preserving actions on algebraic objects called fields.
Corollary 11.2. (Cauchy) If a prime p divides the order of the finite
group G, then G has an element of order p (a non-identity element of smallest
order in a subgroup of order pt ).
Theorem 11.3. If p is an odd prime, then any group of order 2p is
isomorphic to exactly one of C2p or Dp (cf. 8.4 and 8.5).
Proof. Using 11.2, choose a, b in G with ||a|| = p and ||b|| = 2. Then
G = { ai b j : 0 ≤ i ≤ p − 1 , 0 ≤ j ≤ 1 } ,
since all these products are distinct. Now ba ≠ a^i for any i, since b ≠ a^{i−1} .
Hence ba = a^i b for some i. Then

a = b^2 ab^{−2} = ba^i b^{−1} = (bab^{−1})^i = a^{i^2} .

Hence i^2 ≡ 1 (mod p), so i ≡ ±1. When i ≡ +1 , we get G ≅ C2p .
When i ≡ p − 1 , we get G ≅ Dp . (Use 5.3.5.)
Hard Exercise 11C. Classify groups of order pq , where p and q are
distinct primes.
Solubility and a Sylow theorem
37
Theorem 11.4. If G is a non-trivial group whose order is a prime power,
then G contains a non-identity element x for which g −1 xg = x for all g ∈ G.
Remark and Definitions. The above condition, namely xg = gx for
all g ∈ G, says that x lies in the centre of G. So the theorem says that
a p-group has a non-trivial centre. That the centre of a group is always a
(clearly normal) subgroup is a routine verification.
Proof. For all x ∈ G, let nx be the number of elements in the set
{ g −1 xg : g ∈ G }. Let Hx = { g ∈ G : g −1 xg = x }, a subgroup. Now
g1−1 xg1 = g2−1 xg2 if and only if g1−1 ≡ g2−1 (mod Hx ). Thus nx = |G|/|Hx | is
a power of p. But |G| = p^s is a sum of numbers nx for various x. Thus the
number of x with nx = 1 is a multiple of p. It isn’t 0 because n1 = 1; hence
there are at least “p − 1” elements as claimed in the theorem.
Definition. The set containing nx elements in the above proof is called
the conjugacy class of x in G .
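Both the counting step nx = |G|/|Hx | and the non-trivial centre can be watched in action on a small 2-group. The sketch below (our own Python, with D4 realized as permutations of the four vertices of a square; the names are ours) computes the conjugacy classes and the centre of a group of order 8.

```python
def compose(p, q):
    # (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    # closure of the generators under composition
    G = set(gens)
    frontier = set(gens)
    while frontier:
        new = {compose(a, b) for a in frontier for b in G} | \
              {compose(a, b) for a in G for b in frontier}
        frontier = new - G
        G |= frontier
    return G

r = (1, 2, 3, 0)           # rotation of the square by a quarter turn
s = (0, 3, 2, 1)           # a reflection (i -> -i mod 4)
D4 = generate([r, s])
assert len(D4) == 8        # a 2-group

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

# conjugacy classes { g^-1 x g : g in G } and the centre
classes = {frozenset(compose(inverse(g), compose(x, g)) for g in D4) for x in D4}
sizes = sorted(len(c) for c in classes)
center = [x for x in D4 if all(compose(g, x) == compose(x, g) for g in D4)]

print(sizes)          # -> [1, 1, 2, 2, 2], a partition of |G| = 8
print(len(center))    # -> 2 : the centre {e, r^2} is indeed non-trivial
assert sum(sizes) == 8 and len(center) > 1
```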
Exercises 11D. Deduce that a non-trivial group G of prime power order
p^s has a subgroup H1 of order p which is normal in G.
By considering a normal subgroup of order p in G/H1 , show that G has a
normal subgroup H2 of order p2 if s ≥ 2.
Continue by induction to construct subgroups Ht normal in G of order pt for
0 ≤ t ≤ s.
Deduce that there is a sequence of subgroups

{1} = H0 ⊂ H1 ⊂ H2 ⊂ · · · ⊂ Hs = G
where |Hi | = pi . Furthermore, each Hi is normal in Hi+1 , and the corresponding quotient group is cyclic (since its order is p, a prime).
Remark and Definition. Hence a p-group is soluble.
In our situation just above, each Hi is actually normal in all of G, but that
is not required of the tower of subgroups in general for solubility. We only
require the existence of a tower where successive group extensions are normal
with cyclic quotient group.
Any tower in which successive group extensions are normal will be referred
to as a subnormal series for the top group. The notation
{1} = H0 ⊲ H1 ⊲ H2 ⊲ · · · ⊲ Hs = G
will often be used for such a series.
ON WORD USAGE. Wishing to annoy our friends in both Great
Britain and the United States, we have adopted solubility for groups, and
solvability for equations (in Section 35, where the two are mathematically
related).
Theorem 11.5. Any group of order p2 is isomorphic to exactly one of
Cp2 or Cp × Cp . (cf. 8.3 and 8.6).
Proof. If G ≇ Cp2 , then every non-identity element has order p. Choose
an element a ≠ 1 with g −1 ag = a for all g, by 11.4. Choose an element
b ∉ {1, a, a2 , · · · , ap−1 }. Then G = {ai bj : 0 ≤ i , j ≤ p − 1}. Since
a^p = b^p = 1 and ba = ab, we have G ≅ Cp × Cp . (Use 5.3.5.)
Exercises 11E. Let G be a non-abelian group of order 8. Show that G
has an element a of order 4. Choosing an element b ∉ {1, a, a2 , a3 }, show that
G = {ai bj : 0 ≤ i ≤ 3 , 0 ≤ j ≤ 1}. Show that ba = a3 b. Show that either
b2 = 1 and G ≅ D4 ; or else b2 = a2 and G ≅ Q, the quaternion group. Combined
with the theory of abelian groups in Section 13, this proves 8.8. Try to classify
groups of orders p3 and p2 q. First look up the other Sylow theorems in Artin, p. 206.
Exercises 11F. Much later we shall need the fact that the image of a
soluble group under a morphism is also soluble. (Equivalently, a quotient of
a soluble group is soluble.) Here is a sequence of exercises to establish that.
Let H be a normal subgroup of a group G, and φ a morphism with domain
G.
(A) Prove that φ(H) is normal in φ(G).
(B) Show how φ induces an epimorphism
θ : G/H −→ φ(G)/φ(H)
gH ↦ φ(g)φ(H) .
(C) Show that the image of a cyclic group under a morphism is itself
cyclic. (More generally, the image of a generating set is a generating set for
the image.)
(D) By applying φ to each term of a subnormal series,
{1} = H0 ⊲ H1 ⊲ H2 ⊲ · · · ⊲ Hs = G ,
deduce that if G is soluble, then so is φ(G).
Exercise 11G. Prove that if N , a normal subgroup of G, and G/N are
both soluble, then so is G.
Exercise 11H. Prove that any subgroup of a soluble group is also soluble.
REMARKS CONCERNING GENERATORS AND RELATIONS.
We have had several examples of specific groups in which we found generators and relations. In such circumstances, one knows when ‘all’ relations
have been found : a list of all the group elements (with no redundancies)
presumably exists in some form, and to say that enough relations have been
found is substantiated by giving a method to take an arbitrary ‘word’ (that
is, an iterated product) in the generators and their inverses, and reduce it to
one of the group elements in the given list. As we’ve seen above, this way of
thinking about a group is useful in studying classification results.
But there is a more subtle question which we’ll leave to your later studies.
(It is much better studied in connection with the fundamental group in algebraic topology and with actions of groups on graphs). Suppose given not a
group, but just some symbols to be used as generators, and a set of relations
in ‘group words’ involving these symbols. Is there then a group with exactly
these generators and relations? The answer is yes. But even when the set of
relations is empty, it’s a bit of work to give a correct exposition constructing
the group. Such a group is called the free group on the set of generators.
(It is infinite cyclic if there is only one generator, but highly non-abelian if
more than one.) But once it has been constructed, the answer to the general
case comes easily : one constructs the free group on new, different symbols
which are in 1-1 correspondence with the given generators, and then factors
this free group by the smallest normal subgroup containing the words in the
new symbols which correspond to the desired relations. The group defined
by a given specification of generators and relations is unique up to a unique
isomorphism, in a suitable sense.
Challenging Exercise 11I. In the symmetric group Sn , denote the
transposition (i i + 1) as τi for 0 < i < n. Prove that Sn is given by
these “n − 1” generators and the relations
τi^2 = e ;    (τi τi+1 )^3 = e ;    τj τi = τi τj if j > i + 1 .
Above one takes all values of i for which the left-hand sides are meaningful.
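A computation cannot prove the presentation claim of 11I, but it can at least confirm that the stated relations hold among adjacent transpositions. The sketch below (our own Python; the encoding of τi is ours) checks all three families of relations in S4.

```python
n = 4
e = tuple(range(n))

def compose(p, q):
    # (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def tau(i):
    # the transposition (i, i+1), 1-based as in the text, acting on {0,...,n-1}
    p = list(range(n))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def power(p, k):
    out = e
    for _ in range(k):
        out = compose(out, p)
    return out

for i in range(1, n):
    assert power(tau(i), 2) == e                       # tau_i^2 = e
for i in range(1, n - 1):
    assert power(compose(tau(i), tau(i + 1)), 3) == e  # (tau_i tau_{i+1})^3 = e
for i in range(1, n):
    for j in range(i + 2, n):
        assert compose(tau(j), tau(i)) == compose(tau(i), tau(j))
print("the stated relations hold in S4")
```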
12. Sn is not soluble.
More precisely, this is true for all n > 4; we have towers
S2 " {e} ;
S3 " A3 " {e} ;
S4 " A4 " {e, (12)(34), (13)(24), (14)(23)} " {e, (12)(34)} " {e} .
Exercise 12A. Show that, in every case, the smaller group is normal in the
larger (as implicitly claimed by using the symbols ⊳), and the corresponding
quotient group is cyclic.
Theorem 12.1. The group Sn is not soluble for n ≥ 5.
This theorem will be deduced easily from the next one.
Theorem 12.2. For n ≥ 5, the group An is simple, i.e. has no normal
subgroups other than the two extreme ones.
Remarks. The smallest non-abelian simple group is A5 , of order 60. It
follows easily, using the next section, that every group of smaller order is
soluble. Several other finite and infinite families of finite simple groups were
discovered a long time ago. Between 1960 and 1975, a few new, individual,
such groups were added to earlier ones (and collectively called sporadic simple groups). By 1980, many mathematicians were convinced that a proof had
been constructed that this was the complete list of finite simple groups (up to
isomorphism), although apparently no individual mathematician had completely checked the (several thousand page) proof by 1993. This rumoured
theorem is important because all finite groups are built up in towers where
the successive quotients are finite simple groups. About half of the proof of
classification involves representation theory, whose rudiments are given here
in sections 47 to 50. (We’ll go a small distance to demonstrate the flavour of
this by proving the famous (p, q)–theorem of Burnside : a finite non-abelian
simple group has order divisible by at least three distinct primes. This comes
close to precluding any non-abelian group of order less than 60 from being
simple.) ‘Simple’ refers not to the innards of such a group, but to its external relation to other groups: it cannot be regarded in any non-trivial way
as being ‘built up’ out of a normal subgroup and the corresponding quotient
group.
Proof of 12.2. Let H be a non-trivial normal subgroup of An . We’ll
show that H = An by a sequence of reductions.
Firstly, it suffices to show that each 3-cycle is in H , since the set of
3-cycles (12k), for 3 ≤ k ≤ n, generates An (Ex. 6C).
Since, for three distinct positive integers a, b and c not exceeding n, there
is an even permutation
γ = ( 1 2 3 · · ·
      a b c · · · ) ,
(write such a permutation down and then, if necessary to make it even,
interchange the two entries at the right hand end of the lower line—recall
that n ≥ 5), and since
γ −1 (abc)γ = (123) ,
it suffices to show that H contains at least one 3-cycle, that is, some element
which fixes exactly “n − 3” integers.
Define
m = mH := max{ ℓ < n : ∃β ∈ H with β fixing exactly “ℓ” integers }.
It remains only to show that m = n − 3; that is, m ≥ n − 3, since m = n − 1
or n − 2 are clearly impossible. Pick some β ∈ H which fixes “m” integers,
and assume, for a contradiction, that m < n − 3, so that β moves at least
4 integers. We’ll divide into three cases, in each instance producing an even
permutation, α, and thence a non-identity element, β −1 α−1 βα (clearly in H),
which fixes more than “m” integers, contradicting the definition of m.
(I) Suppose that β is a cycle, (abcde · · ·), of odd length at least five. Let
α = (abc). Then β −1 α−1 βα fixes a as well as all numbers which β fixes, but
moves b (to c), as required.
(II) Suppose that β is a product of a disjoint set of an even number of
transpositions. Let (ab) and (cd) be two of them. Let α = (cde) for an
integer e differing from a, b, c and d. Then β −1 α−1 βα fixes all integers which
β fixes except possibly e, but also fixes a and b. Furthermore it moves c to
β(e) ≠ c, producing the required contradiction.
(III) Alternatively, the set of cycles in β must contain two cycles, δ and
ε, of lengths at least three and at least two, respectively. Write δ = (uvw · · ·)
and ε = (xy · · ·). Let α = (yxw). Then β −1 α−1 βα again fixes everything that
β fixes and fixes u as well, but moves v (to ε−1 (x)), producing the required
contradiction.
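For n = 5 the theorem can also be confirmed by a finite computation, using the fact that a normal subgroup is a union of conjugacy classes, contains e, and has order dividing |A5 | = 60. The sketch below (our own Python; the encoding is ours) computes the class sizes of A5 and checks that no union of classes gives a proper non-trivial order.

```python
from itertools import combinations, permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return 1 if inv % 2 == 0 else -1

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

A5 = [p for p in permutations(range(5)) if sign(p) == 1]

# conjugacy classes of A5 (conjugating only by elements of A5)
classes = {frozenset(compose(inverse(g), compose(x, g)) for g in A5) for x in A5}
sizes = sorted(len(c) for c in classes)
assert sizes == [1, 12, 12, 15, 20]

# a normal subgroup is a union of classes, contains e, and its order divides 60
other = [s for s in sizes if s != 1]       # the classes other than {e}
possible = set()
for k in range(len(other) + 1):
    for combo in combinations(other, k):
        order = 1 + sum(combo)
        if 60 % order == 0:
            possible.add(order)
assert possible == {1, 60}                  # only the two extreme subgroups survive
print("A5 has no proper non-trivial normal subgroup")
```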
Proof of Theorem 12.1. If Sn were soluble, a tower showing this would
need to have at least one intermediate group, since Sn is not cyclic. We’ll
show that Sn has only three normal subgroups, the obvious two, and An .
Since An is simple and not cyclic, this leaves no possibilities for such a tower.
So let H be a proper non-trivial normal subgroup of Sn . By 12.2, H ∩ An
is either An or {e}.
Exercise 12B. Show that if B is normal in C, then B ∩ A is normal in
A, for any subgroup A of C (as just used).
In the first case, since no group fits between An and Sn , we have H = An ,
as required. In the second case, since the product of two odd permutations is
even, H must have order 2. But none of the elements of order 2 in Sn is fixed
by all conjugations, such elements being products of disjoint transpositions,
by 4.8. So the second case cannot occur.
This theorem will be used in Section 35 (as the input from group theory)
in the proof that there can be no formula, involving only the operations
of arithmetic and nth roots, for solving the general polynomial equation of
degree 5 or greater.
13. Finitely generated abelian groups.
The theorem below gives a complete, non-redundant list, up to isomorphism, of all those abelian groups which have finite sets of generators. The
list consists of certain direct products of cyclic groups, so that the structure
of these groups is especially simple. In Section 43 ahead, a more general
theorem is presented. It is proved in slightly more detail than we give here,
and depends on only a few facts about rings from sections 14 and 17.
Definitions. When groups G and H are abelian and additive notation
is used, the group G × H is denoted G ⊕ H and is called the direct sum of G
and H. Internal direct sums are defined as before for internal direct products.
A group G is finitely generated if and only if at least one of the finite subsets
of G generates G. Obviously any finite group is finitely generated.
Lemma 13.1. An abelian group is the internal direct sum of subgroups
G1 , · · · , Gr if and only if each element can be written uniquely in the form
g1 + g2 + · · · + gr with each gi ∈ Gi .
Exercise 13A. Prove this.
Theorem 13.2. If G is a finitely generated abelian group, then there
is exactly one sequence { k1 , k2 , · · · , kr }, where for each i, the integer ki
divides ki+1 and either ki = 0 or ki > 1, such that
G ≅ Zk1 ⊕ Zk2 ⊕ · · · ⊕ Zkr =: ⊕_{i=1}^{r} Zki .
(Here Z0 means Z.)
Note. The sequence of integers { k1 , · · · , kr } is called the sequence of
invariant factors of G. Any zeros must come at the end, and they will occur
if and only if G is infinite. If |G| = 8 for example, the possible sequences are
{8}, {2, 4}, and {2, 2, 2} (cf. 8.8). By its very definition, a cyclic group is
finitely generated. As an easy exercise, show that a direct product of finitely
many finitely generated groups is also finitely generated. Thus 13.2 does
what is claimed in the first sentence of this section.
Proof. We prove the existence of { k1 , · · · , kr } with the required
properties by induction on r := the minimum number of elements which can
generate G. For r = 0,

G = {0} = ⊕_{i=1}^{0} Zki .
(The reader who dislikes starting with r = 0, and/or taking a direct sum
indexed by the empty set, should exclude the zero group from the statement
of the theorem, and modify/simplify the inductive step just ahead by setting
r = 1, in order to obtain a proof of an initial case with r = 1.)
For the inductive step: Let
P := { m ∈ Z : ∃{g1 , · · · , gr } generating G, and
integers m2 , · · · , mr with mg1 + Σ_{i=2}^{r} mi gi = 0 } .
First assume that P = {0}. Choose {g1 , · · · , gr } generating G. Then each
element can be written as m1 g1 + · · · + mr gr (uniquely, since P = {0} ), so G
is the direct sum of the cyclic groups generated by the gi (use 13.1). Since
mgi ≠ 0 for all m ≠ 0 (because P = {0}), each of these cyclic groups is
isomorphic to Z. So let k1 = k2 = · · · = kr = 0, and we’re done.
If P ≠ {0}, then P has positive elements. Let
k1 := min{ m ∈ P : m > 0 } .
Note that k1 ≠ 1, since G cannot be generated by {g2 , · · · , gr }.
Now consider all possible choices of integers {ℓ2 , ℓ3 , · · · , ℓr } and
of generators {g1 , · · · , gr } such that k1 g1 + Σ_{i=2}^{r} ℓi gi = 0 .
(a) For all such choices, k1 | ℓj for all j > 1:
Divide k1 into ℓj , giving ℓj = sk1 + R with 0 ≤ R < k1 . Then we have a
relation

Rgj + k1 (g1 + sgj ) + Σ_{i≠1,j} ℓi gi = 0,

which by minimality of k1 implies that R = 0, since the set
{ gj , g1 + sgj , g2 , · · · , gj−1 , gj+1 , · · · , gr }
also generates G. Thus ℓj = sk1 is a multiple of k1 , as required.
(b) For all such choices, if Σ_{i=1}^{r} mi gi = 0, then k1 | m1 :
Write m1 = qk1 + R with 0 ≤ R < k1 . Then we have a relation

Rg1 + Σ_{i=2}^{r} (mi − qℓi )gi = 0,

so again R = 0.
Now make a fixed such choice, and define

ḡ1 := g1 + Σ_{i=2}^{r} (ℓi /k1 )gi ,

using (a). Then k1 ḡ1 = 0. Let H be the cyclic group generated by ḡ1 . Then
H ≅ Zk1 , since, by choice of k1 , the element ḡ1 cannot have order less than k1 .
Let J be the group generated by {g2 , · · · , gr }. By the inductive hypothesis, J
is the internal direct sum of cyclic groups generated by (say) {ḡ2 , ḡ3 , · · · , ḡr },
and isomorphic to (say) Zk2 ⊕ · · · ⊕ Zkr , where k2 | k3 | k4 · · · .
It remains to show that G is the internal direct sum of H and J, and
that k1 | k2 . The set H ∪ J generates G, since {ḡ1 , g2 , · · · , gr } generates G. If
x ∈ H ∩ J, let

x = m1 ḡ1 = Σ_{i=2}^{r} (−mi )gi .

Then Σ_{i=1}^{r} mi gi = 0, so k1 | m1 by (b). Hence x = m1 ḡ1 = 0. Thus
H ∩ J = {0}, so G ≅ H ⊕ J. Finally we have
Σ_{i=1}^{r} ki ḡi = Σ_{i=1}^{r} 0 = 0,

so k1 | k2 by (a).
To prove uniqueness, suppose that

G = ⊕_{i=1}^{r} Zki ≅ ⊕_{i=1}^{s} Zℓi = H ,

where ki | ki+1 and ℓi | ℓi+1 , all these integers being non-negative. By adding
1’s at the beginning of the shorter sequence, we can assume that r = s.
We shall show that ki = ℓi for all i. If not, for a contradiction, let j =
min{ i : ki ≠ ℓi }. Let p be a prime and α an integer such that p^{α+1} | kj ,
but p^{α+1} does not divide ℓj (interchanging the k’s and ℓ’s if necessary). Let
p^β G := { p^β x : x ∈ G }. Since G ≅ H, we have

p^α G/p^{α+1} G ≅ p^α H/p^{α+1} H .
But the left hand side has at least p^{r−j+1} elements, since

p^α G/p^{α+1} G ≅ ⊕_{i=1}^{r} p^α Zki /p^{α+1} Zki

and p^α Zki /p^{α+1} Zki has p elements as long as p^{α+1} | ki , which holds for
j ≤ i ≤ r. However, the right hand side has at most p^{r−j} elements, because
p^{α+1} does not divide ℓi for 1 ≤ i ≤ j, and so p^α Zℓi /p^{α+1} Zℓi is a trivial group
for these i. From this contradiction, we conclude that ki = ℓi for all i, as
required.
Definitions. The torsion subgroup of G is its set of elements of finite
order. G is torsion-free if and only if its torsion subgroup is {0}. A basis for
G is an indexed set {gα } ⊂ G such that each element of G can be expressed
uniquely as a finite integral combination Σα kα gα , where the kα are integers,
zero for all but finitely many α. Then G is a free abelian group if and only if
it has a basis.
If G has a finite basis {g1 , g2 , · · · , gr }, then clearly

G ≅ Z^r := Z ⊕ Z ⊕ · · · ⊕ Z (“r” copies of Z) .
A free abelian group is evidently torsion-free. The group of rational numbers,
Q , is torsion free, but is not free abelian. (CAUTION. There is a definition
of an important idea in general, not-necessarily-abelian group theory, that
of free group. See the remarks after 11H. The only [free abelian] group
which is an abelian [free group] is the case of rank r = 1 [defined below], i.e.
isomorphic to Z.) The group Q/Z is a torsion abelian group (i.e. it coincides
with its torsion subgroup) but is not finitely generated, nor is it a direct
product or direct sum of cyclic groups, even in any sense in which one uses
infinitely many factors (an idea which the reader may wish to investigate!)
Note. The above definitions make sense for any abelian group G. In the
following corollaries, G is assumed also to be finitely generated. Some are
true more generally, and these and others can be proved independently of
13.2.
Corollary 13.3. G is torsion-free if and only if G is free.
Corollary 13.4. The torsion subgroup is finite.
Corollary 13.5. If G is free, then any two bases have the same number
of elements (called the rank of G).
Corollary 13.6. G is the internal direct sum of its torsion subgroup and
a free abelian group whose rank is unique.
Definition. The p-component of G is the subgroup
{ g ∈ G : ∃α with pα g = 0 } .
Corollary 13.7. The p-component of G is isomorphic to

Zp^{t1} ⊕ Zp^{t2} ⊕ · · · ⊕ Zp^{tℓ}

for a unique sequence 1 ≤ t1 ≤ t2 ≤ · · · ≤ tℓ .
Theorem 13.8. The torsion subgroup of G is the internal direct sum
of the p-components of G for those primes p such that the p-component is
non-zero.
(By 13.4 there are finitely many such primes.)
Proof. Given g of order k = ∏_{i=1}^{t} pi^{αi} , where p1 , · · · , pt are the primes
referred to, define kj to be ∏_{i≠j} pi^{αi} . Since GCD{k1 , · · · , kt } = 1, we have
Σ si ki = 1 for some integers si . Then g = Σ_{i=1}^{t} si ki g is a decomposition, as
required, since ki g has order pi^{αi} .
Uniqueness of the decomposition follows easily from the fact that 0 is the
only element whose order is both a power of pi and prime to pi .
The p-primary decomposition is often more convenient than the invariant
factor decomposition, but passing from either to the other, and listing all
possible abelian groups of some given finite order, are easy tasks. See 44D
and the paragraphs following it.
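The passage between the two decompositions is mechanical enough to code directly. The sketch below (our own Python; all helper names are ours) converts invariant factors to p-component exponents and back, using the rule that the largest invariant factor absorbs the top power of each prime.

```python
from collections import defaultdict

def prime_factor(n):
    # n = product of p^a  ->  {p: a}, by trial division
    out, p = {}, 2
    while n > 1:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    return out

def invariant_to_primary(ks):
    # invariant factors [k1 | k2 | ...]  ->  {p: exponents t1 <= t2 <= ...}
    primary = defaultdict(list)
    for k in ks:
        for p, a in prime_factor(k).items():
            primary[p].append(a)
    return {p: sorted(ts) for p, ts in primary.items()}

def primary_to_invariant(primary):
    # the largest invariant factor takes the top power of every prime, and so on
    r = max(len(ts) for ts in primary.values())
    ks = []
    for i in range(r):
        k = 1
        for p, ts in primary.items():
            padded = [0] * (r - len(ts)) + sorted(ts)
            k *= p ** padded[i]
        ks.append(k)
    return [k for k in ks if k > 1]

# Z6 (+) Z12 has invariant factors 6 | 12
primary = invariant_to_primary([6, 12])
print(primary)                        # -> {2: [1, 2], 3: [1, 1]}
print(primary_to_invariant(primary))  # -> [6, 12]
```

For |G| = 8 the three possible invariant-factor sequences {8}, {2, 4}, {2, 2, 2} all round-trip the same way.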
We now give a precise definition of ‘the abelian group with generators
x1 , · · · , xn and relations rj := Σ_{i=1}^{n} αji xi = 0 for 1 ≤ j ≤ k ’ , and show
how to decompose such a group.
Let

F := Z^n := Z ⊕ Z ⊕ · · · ⊕ Z (“n” copies of Z) ,

and let x̄i := (0, 0, · · · , 0, 1, 0, · · · , 0) ∈ F (with 1 in the ith place). Let
R be the subgroup generated by {r̄1 , · · · , r̄k } where r̄j := Σ_{i=1}^{n} αji x̄i . Define
G := F/R and define xi := x̄i + R.
To decompose, let A be the (k × n)-matrix (αji ). Using integer row and
column operations—again we refer to Section 44 for more details—reduce A
to a diagonal matrix

D = diag( d1 , · · · , dℓ , 0, · · · , 0 ) ,
where di | di+1 for all i. Then D = P AQ where P and Q are integer matrices
with integer matrix inverses. They are found by applying the row and column
operations respectively to identity matrices of the correct sizes. For
x = (x1 , · · · , xn )^T , define y = (y1 , · · · , yn )^T := Q−1 x . If we write the relations Ax = 0,
an equivalent system of relations is (P AQ)(Q−1 x) = 0, i.e. Dy = 0, i.e.
di yi = 0 for 1 ≤ i ≤ min(k, n). Define dk+1 = dk+2 = · · · = dn = 0 if k < n.
Since x = Qy, the set {y1 , · · · , yn } also generates G. Let t = min{ i : di ≠ 1 }
(except if di = 1 for all i, where we’d have G = {0} and could forget about
this calculation!) Then y1 = y2 = · · · = yt−1 = 0, and G is the direct sum
of the cyclic groups generated by yt , yt+1 , · · · , yn and has invariant factors
dt , dt+1 , · · · , dn .
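The reduction just described can be carried out mechanically. The following is a minimal integer diagonalization sketch (our own Python, with no claim to match Section 44’s algorithm in detail or to be efficient); it returns the diagonal entries with d1 | d2 | · · · .

```python
def smith_diagonal(M):
    """Diagonalize an integer relation matrix by row and column operations,
    so that d1 | d2 | ... on the diagonal (a bare-hands sketch)."""
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    t = 0
    while t < min(m, n):
        # find a nonzero entry of least absolute value in the submatrix
        best = None
        for i in range(t, m):
            for j in range(t, n):
                if A[i][j] != 0 and (best is None
                                     or abs(A[i][j]) < abs(A[best[0]][best[1]])):
                    best = (i, j)
        if best is None:
            break                                  # the rest is zero
        bi, bj = best
        A[t], A[bi] = A[bi], A[t]                  # row swap
        for row in A:
            row[t], row[bj] = row[bj], row[t]      # column swap
        # clear column t by row operations and row t by column operations;
        # any nonzero remainder leaves a smaller pivot for the next pass
        reduced = True
        for i in range(t + 1, m):
            if A[i][t] % A[t][t] != 0:
                reduced = False
            q = A[i][t] // A[t][t]
            for j in range(n):
                A[i][j] -= q * A[t][j]
        for j in range(t + 1, n):
            if A[t][j] % A[t][t] != 0:
                reduced = False
            q = A[t][j] // A[t][t]
            for i in range(m):
                A[i][j] -= q * A[i][t]
        if not reduced:
            continue
        # enforce d_t | (every later entry): mix an offending row into row t
        ok = True
        for i in range(t + 1, m):
            if any(A[i][j] % A[t][t] != 0 for j in range(t + 1, n)):
                for j in range(n):
                    A[t][j] += A[i][j]
                ok = False
                break
        if ok:
            if A[t][t] < 0:                        # normalize the sign
                for j in range(n):
                    A[t][j] = -A[t][j]
            t += 1
    return [A[i][i] for i in range(min(m, n))]

# hypothetical relations 4*x1 = 0 and 6*x2 = 0, i.e. the group Z4 (+) Z6;
# the invariant factors should be 2 | 12
print(smith_diagonal([[4, 0], [0, 6]]))    # -> [2, 12]
```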
Exercise 13B. Decompose in both ways the abelian group generated
by {x1 , x2 , x3 , x4 } with the following relations. Also find generators for the
direct summands.
(a)
8x1 + 8x4 = 10x3
10x1 = 350x4
16x1 + 90x3 + 16x4 = 20x2
8x1 = 10x2 + 352x4
(b)
2x1 − 5x2 + 37x3 + 39x4 = 0
x1 + x3 + 2x4 = 0
x1 − 5x2 − 4x3 − 3x4 = 0
(c)
4x1 + 3x2 = 6x3
x1 + 2x2 = 2x3
5x1 + 5x2 + x4 = 7x3
2x3 = 0
x1 + x2 = 2x3
Exercise 13C. Prove that a non-zero finite abelian group cannot be
decomposed as a direct sum of non-trivial groups if and only if it is cyclic of
prime power order. (We then refer to it as being indecomposable. ) Deduce
the Krull-Schmidt Theorem for finite abelian groups: The decomposition
of such a group into a direct product of indecomposable factors is unique up to
isomorphism of the factors and the order in which they appear. (This theorem
holds for arbitrary finite groups.) The existence of such a decomposition for
any finite group is quite easy to see.
Exercise 13D. Given distinct finite abelian groups H ⊂ G, prove that
there is no intermediate group between G and H if and only if G/H is cyclic
of prime order (i.e. is an abelian simple group). Deduce the Jordan–Hölder
Theorem for finite abelian groups : a composition series
{1} = H0 ⊲ H1 ⊲ H2 ⊲ · · · ⊲ Hs = G
for G (i.e. each inclusion is the inclusion of a proper normal subgroup, and
no new groups can be ‘inserted’ in between, preserving the normality of each
extension) is unique in the following sense: the sequence of quotient groups,
Hi+1 /Hi , is unique up to isomorphism and re-ordering. (This theorem also
is true for arbitrary finite groups—that’s why we bothered to write the word
“normal” above.) The existence of such a composition series for any finite
group is quite easy to see.
For example, the composition factors for Z6 are { Z2 , Z3 } , and can be
made to appear in either order. For S4 they are { Z2 , Z2 , Z3 , Z2 }; see the
composition series given just before 12.1. Although they aren’t forbidden by
Jordan and Hölder from appearing in a different order, they actually cannot
in this case, because S4 has no normal subgroup of order 8, and A4 has no
subgroup at all of order 6 (cf. 11A). A finite soluble group (see 11D) is
one whose composition quotients are cyclic of prime order. This wasn’t quite
our definition, but can be deduced. In the other direction, for solubility , it
suffices to have a tower in which the successive quotients are abelian.
Exercise 13E. Prove this. (Use the structure theorem— first show that
if a normal subgroup N can be ‘inserted’ into a group Q, and if H ⊲ G with
G/H ≅ Q , then a group K can be ‘inserted’ as follows: H ⊲ K ⊲ G with
K/H ≅ N and G/K ≅ Q/N .)
A non-soluble group is one whose composition quotients include at least
one non-abelian simple group.
If you reacted as intended, your proofs in 13C and 13D used (at least
parts of) the structure theory for finite abelian groups. On the one hand,
this illustrates the power of having such a strong grip on what all the finite
abelian groups look like. On the other hand, the proofs of these results at
their correct level of generality (no assumption of ‘abelianity’) are much more
elegant.
Exercise 13F. Prove that any subgroup of a finitely generated abelian
group is itself finitely generated. More delicately, for any subset S of a finitely
generated abelian group, there is a finite subset T of S, such that T and S
generate the same subgroup.
APPENDIX ZZZ. The generalized associative law.
This boring section gives a rigorous formulation of 2.1, the generalized
associative law. The word ‘bracketing’ needs to be eliminated. Since 2.1 says
that we can multiply sequences of more than two elements unambiguously,
the correct formulation should involve the notion of an n-fold product map
Pn from sequences of length n yielding elements in S.
Proposition. Suppose that ∗ is an associative binary operation on a set
S. Then there is exactly one sequence of maps Pn : S n → S for n ≥ 1 having
the following properties.
(i) P1 (s) = s for all s ;
(ii) Pn (s1 , s2 , · · · , sn ) = Pi (s1 , s2 , · · · , si ) ∗ Pn−i (si+1 , si+2 , · · · , sn ) for all
n ≥ 1 and 1 ≤ i ≤ n − 1 .
Proof. (Existence): Define inductively P1 (s) = s, and, for n ≥ 2,
Pn (s1 , s2 , · · · , sn ) = s1 ∗ Pn−1 (s2 , · · · , sn ) .
So (i) holds by definition. We prove (ii) by induction on n. For n = 1, there
are no values of i with 1 ≤ i ≤ n − 1, so the condition holds. Suppose that
n ≥ 2 and the condition (ii) holds for n − 1. Prove it for n in two cases:
If i = 1, the needed identity is immediate from the definition.
If 2 ≤ i ≤ n − 1, then
Pi (s1 , s2 , · · · , si ) ∗ Pn−i (si+1 , si+2 , · · · , sn )
= {s1 ∗ Pi−1 (s2 , · · · , si )} ∗ Pn−i (si+1 , si+2 , · · · , sn ) [by defn., since i ≥ 2]
= s1 ∗ {Pi−1 (s2 , · · · , si ) ∗ Pn−i (si+1 , si+2 , · · · , sn )} [by associativity]
= s1 ∗ Pn−1 (s2 , · · · , sn )
[by the inductive hypothesis]
= Pn (s1 , s2 , · · · , sn )
[by definition] .
(Uniqueness): Suppose that Qn : S^n → S is another such sequence.
We prove that Qn = Pn by induction on n. For n = 1, we have that
Q1 (s) = s = P1 (s) for all s, so Q1 = P1 .
Suppose that n ≥ 2 and Qn−1 = Pn−1 . Then
Qn (s1 , · · · , sn ) = Q1 (s1 ) ∗ Qn−1 (s2 , · · · , sn )   [by (ii) with i = 1]
= s1 ∗ Qn−1 (s2 , · · · , sn )   [by (i)]
= s1 ∗ Pn−1 (s2 , · · · , sn )   [by the inductive hypothesis]
= Pn (s1 , · · · , sn )   [by definition] .
Since this holds for all (s1 , · · · , sn ), we get Qn = Pn .
Remark. Of course we normally use the notation s1 ∗ s2 ∗ · · · ∗ sn ,
or even s1 s2 · · · sn , rather than Pn (s1 , · · · , sn ) .
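For readers who like to experiment, the inductive definition of Pn and property (ii) can be animated on a computer. The Python sketch below (function names are illustrative, not from the text; string concatenation stands in for an arbitrary associative ∗) defines Pn by the rule of the existence proof and brute-force checks that every splitting point i gives the same answer:

```python
# P_n defined as in the existence proof: P_1(s) = s and
# P_n(s_1, ..., s_n) = s_1 * P_{n-1}(s_2, ..., s_n).
def P(star, seq):
    if len(seq) == 1:
        return seq[0]
    return star(seq[0], P(star, seq[1:]))

# Property (ii): splitting the sequence at any i gives the same answer.
def check_split_property(star, seq):
    full = P(star, seq)
    return all(star(P(star, seq[:i]), P(star, seq[i:])) == full
               for i in range(1, len(seq)))

concat = lambda a, b: a + b            # one associative operation, on strings
assert P(concat, list("abcd")) == "abcd"
assert check_split_property(concat, list("abcdef"))
```

Running the same check with a non-associative operation (say subtraction on integers) fails, which is exactly the content of the proposition.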
II. Commutative Rings.
Sections 14 to 22 study the most basic objects of commutative algebra.
This will probably look somewhat more familiar than did groups, since the
idea of ring is based on the standard number systems and polynomials, and,
in the non-commutative case, on square matrices.
14. Rings.
Definitions. A Z-algebra (or associative algebra over Z) is an ordered
triple (R , + , ·), where R is a set, and both + and · are binary operations
on R, with the following properties.
(1) (R , +) is an abelian group.
(2) a(bc) = (ab)c for all a, b, c in R. (This, of course, is associativity.)
(3) a(b + c) = (ab) + (ac) and
(b + c)a = (ba) + (ca) for all a, b, c in R. (This is distributivity.)
Above, and henceforth, we write the multiplication using juxtaposition. Also
we continue with the conventions from kindergarten that multiplication is done
before addition etc., so that use of brackets can be minimized. For example, the
brackets on the right hand sides of the distributive laws are unnecessary.
An element 1 is an identity element if and only if a1 = a = 1a for all a ∈ R.
If R has a 1, then we call R a ring. If R is a ring, then an element u is
invertible (or is a unit ) if and only if there is an element v (in R, of course)
with uv = 1 = vu. Then v is the inverse of u. A division ring (or skewfield)
is a ring in which every non-zero element is invertible. (Sometimes 1 ≠ 0 is
also assumed in a division ring, as we do in Sections 51 and 52. This merely
excludes the one example R = {0}.) We say that R is commutative if and
only if ab = ba for all a, b ∈ R. A field is a commutative division ring in
which 1 ≠ 0.
Remarks. It has become customary to require a ring to have an identity
element. The reader should be aware, when reading books on the subject,
that some do not make this requirement. See Section 52 for the general
definition of an (associative) R-algebra. It is convenient that this, with R =
Z, is a ready-made term for what used to be called a ring. We shall try
to stick to the words identity element and invertible element, since unit is
sometimes used for each of them.
Exercises 14A. In any ring: 0x = 0 = x0 for all x ; 1 = 0 =⇒ R = {0} ;
−(ab) = (−a)(b) = (a)(−b) ; (−a)(−b) = ab ; and so on, ad infinitum.
Examples. The standard number systems Z, Q, R, and C are commutative rings, all but Z being fields. Given a commutative ring R, one has:
(1) R[x], the ring of polynomials with coefficients in R—see Section 16;
(2) Mn (R) or Rn×n , the ring of n × n matrices with entries from R. This ring
is almost never commutative. Its invertibles are the matrices with inverses,
sometimes called ‘non-singular’.
An example of a non-commutative division ring is not so easy to find.
(There aren’t any finite ones; see Section 51.) The classic example is the
quaternions. This is R4 with vector addition and the following multiplication. Let {1, i, j, k} be the standard ordered basis for R4 , and multiply
them using the multiplication table for the quaternion group in Section
8. Then the distributive law, together with the extra requirement, involving
scalar multiplication,

(!!)   α·(vw) = (v)(α·w) = (α·v)(w)
for all α ∈ R and v, w ∈ R4 , determines the product of any two vectors.
This is the best way to remember the multiplication, though one can write
a horrible formula
(a1 , a2 , a3 , a4 )(b1 , b2 , b3 , b4 ) = (f1 , f2 , f3 , f4 )
where each fi is a function of 8 variables a1 , a2 , · · · , b4 . Proving associativity
and distributivity may be done by a lengthy computation.
Associativity may also be verified by checking it for products of the basic
elements i, j and k, and then using distributivity to verify it in general. The
following is easier. Regard a quaternion (a1 , a2 , a3 , a4 ) as a pair of pairs
((a1 , a2 ), (a3 , a4 )), i.e. as a pair of complex numbers, (c1 , c2 ). Letting c̄ =
complex conjugate of c ∈ C, the multiplication becomes
(c1 , c2 )(d1 , d2 ) = (c1 d1 − c2 d̄2 , c1 d2 + c2 d̄1 ) .

Now to verify associativity:

[(c1 , c2 )(d1 , d2 )](e1 , e2 ) = (c1 d1 − c2 d̄2 , c1 d2 + c2 d̄1 )(e1 , e2 )
= (c1 d1 e1 − c2 d̄2 e1 − c1 d2 ē2 − c2 d̄1 ē2 , c1 d1 e2 − c2 d̄2 e2 + c1 d2 ē1 + c2 d̄1 ē1 ).

On the other hand,

(c1 , c2 )[(d1 , d2 )(e1 , e2 )] = (c1 , c2 )(d1 e1 − d2 ē2 , d1 e2 + d2 ē1 )
= (c1 d1 e1 − c1 d2 ē2 − c2 d̄1 ē2 − c2 d̄2 e1 , c1 d1 e2 + c1 d2 ē1 + c2 d̄1 ē1 − c2 d̄2 e2 ).

Comparing answers gives the result.

As for distributivity:

(c1 , c2 )[(d1 , d2 ) + (e1 , e2 )] = (c1 , c2 )(d1 + e1 , d2 + e2 )
= (c1 d1 + c1 e1 − c2 d̄2 − c2 ē2 , c1 d2 + c1 e2 + c2 d̄1 + c2 ē1 ) ,

whereas

(c1 , c2 )(d1 , d2 ) + (c1 , c2 )(e1 , e2 )
= (c1 d1 − c2 d̄2 , c1 d2 + c2 d̄1 ) + (c1 e1 − c2 ē2 , c1 e2 + c2 ē1 )
= (c1 d1 − c2 d̄2 + c1 e1 − c2 ē2 , c1 d2 + c2 d̄1 + c1 e2 + c2 ē1 ),

as required.
The other distributive law may be verified similarly, or else by the following trick, using the quaternionic conjugate (a, b, c and d are reals):

h̄ := a − bi − cj − dk   for h = a + bi + cj + dk .

Exercise 14B. Prove the identities

(h̄)‾ = h ;   (h1 + h2 )‾ = h̄1 + h̄2 ;   and (h1 h2 )‾ = h̄2 h̄1 .

(Note the order reversal.)
Then, using the distributive law already verified: the conjugate of (h1 + h2 )h3 equals h̄3 (h̄1 + h̄2 ) = h̄3 h̄1 + h̄3 h̄2 , and taking conjugates again gives h1 h3 + h2 h3 ; since conjugation is an involution, (h1 + h2 )h3 = h1 h3 + h2 h3 .
The formula hh̄ = ||h||2 , where ||h|| here means the length of h as a vector
in R4 , immediately leads to the existence of inverses:
h−1 = h̄ / ||h||2   for h ≠ 0 .
Here we are identifying each real number with the corresponding multiple
of 1, the quaternionic identity element. The quaternions are unique in the
sense that no other Rn has a multiplication making it into a non-commutative
division ring satisfying the condition above labeled as (!!). See Section 52.
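The pair-of-complex-numbers model above translates directly into code. The Python sketch below (the function names qmul etc. are mine, for illustration; it uses the built-in complex type) implements the multiplication, the conjugate, and the inverse formula h⁻¹ = h̄/||h||², and spot-checks the quaternion relations and associativity:

```python
# A quaternion as a pair (c1, c2) of complex numbers, with
# (c1, c2)(d1, d2) = (c1*d1 - c2*conj(d2), c1*d2 + c2*conj(d1)).
def qmul(h, k):
    c1, c2 = h
    d1, d2 = k
    return (c1*d1 - c2*d2.conjugate(), c1*d2 + c2*d1.conjugate())

def qconj(h):                # a + bi + cj + dk  ->  a - bi - cj - dk
    c1, c2 = h
    return (c1.conjugate(), -c2)

def qnorm2(h):               # ||h||^2, computed exactly as h times conj(h)
    c1, c2 = h
    return (c1*c1.conjugate() + c2*c2.conjugate()).real

def qinv(h):                 # h^{-1} = conj(h) / ||h||^2  for h != 0
    n = qnorm2(h)
    c1, c2 = qconj(h)
    return (c1/n, c2/n)

one, i, j, k = (1+0j, 0j), (1j, 0j), (0j, 1+0j), (0j, 1j)
assert qmul(i, j) == k and qmul(j, i) == (0j, -1j)        # ij = k = -ji
assert qmul(i, i) == (-1+0j, 0j)                          # i^2 = -1
h1, h2, h3 = (1+2j, 3+4j), (2-1j, 1j), (1j, 1+1j)
assert qmul(qmul(h1, h2), h3) == qmul(h1, qmul(h2, h3))   # associativity
p = qmul(h1, qinv(h1))                                    # h * h^{-1} is 1,
assert abs(p[0] - 1) < 1e-12 and abs(p[1]) < 1e-12        # up to rounding
```

The integer test quaternions keep the associativity check exact in floating point; the inverse check allows for rounding in the division by ||h||².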
Let’s discuss the analogues, for rings, of group morphisms, etc.
Definitions. Let R and R′ be rings, whose identity elements are 1 and
1′ respectively. A map φ : R → R′ between rings is a morphism of rings if
and only if, for all x and y in R,

φ(x + y) = φ(x) + φ(y) ;   φ(xy) = φ(x)φ(y) ;   and φ(1) = 1′ .
Exercise 14C. None of these can be deduced from the other two.
‘Epi-’, ‘mono-’ and ‘iso-’ are used as prefixes for ‘morphism’ just as in group
theory. An additive subgroup I ⊂ R is a left (resp. right) (resp. two-sided)
ideal if and only if [r ∈ R and x ∈ I] implies that [rx ∈ I (resp. xr ∈ I)
(resp. both rx ∈ I and xr ∈ I)].
An additive subgroup T ⊂ R is a subring of R if and only if it is also closed
under multiplication and contains the identity element of R.
Proposition 14.1. Let φ be a ring morphism from R to R′ . Then
(i) Imφ is a subring of R′ ;
(ii) Kerφ is a two-sided ideal in R .
Proof. (i) The map φ is, in particular, a morphism of additive groups, so
Imφ is an additive subgroup of R′ . Also φ(1) = 1′ ∈ Imφ. Finally, if x′ and
y′ are in Imφ, let x′ = φ(x) and y′ = φ(y). Then x′ y′ = φ(x)φ(y) = φ(xy) ∈
Imφ, as required.
(ii) Kerφ is certainly an additive subgroup of R, since φ is a morphism
of abelian groups. Also, if both r ∈ R and x ∈ Kerφ then φ(rx) = φ(r)φ(x) =
φ(r)0 = 0, so rx ∈ Kerφ ; and similarly for xr.
Definition. If I is a two-sided ideal in R, define a multiplication on
the additive quotient group R/I as follows : (x + I)(y + I) := xy + I.
Theorem 14.2. The abelian group R/I is a ring with this multiplication,
which is, in particular, well defined.
Proof. To show that the multiplication is well defined, we must show
that if x + I = x̃ + I and y + I = ỹ + I, then xy + I = x̃ỹ + I . But x̃ = x + r
for some r ∈ I, and ỹ = y + s for some s ∈ I. Thus x̃ỹ = xy + xs + ry + rs.
But xs ∈ I since I is a right ideal, and ry ∈ I since I is a left ideal, and
rs ∈ I for both reasons. Thus the multiplication is well defined.
To prove associativity,
[(x + I)(y + I)](z + I) = (xy + I)(z + I) = (xy)z + I
= x(yz) + I = (x + I)(yz + I) = (x + I)[(y + I)(z + I)] .
Distributivity is just as easy. The identity element is 1 + I.
Example. The subgroup nZ is a two-sided ideal in Z, so we have that
Zn = Z/nZ is now ‘officially known’ to be a ring. The ring Z is unusual in
that all of its additive subgroups happen to be ideals.
Theorem 14.3. (First isomorphism theorem for rings) If φ :
R → R′ is a ring morphism, then the isomorphism of additive groups ψ :
R/Kerφ → Imφ (given by the first isomorphism theorem for groups) is
actually a ring isomorphism.
Proof. The only extra assertions, beyond the theorem for groups, are
that ψ behaves with respect to multiplication,
ψ[(x + Kerφ)(y + Kerφ)] = ψ(xy + Kerφ)
= φ(xy) = φ(x)φ(y) = ψ(x + Kerφ)ψ(y + Kerφ) ,
and that
ψ(1 + Kerφ) = φ(1) = 1′ .
Definition. An integral domain is a commutative ring (of course, with
1) such that:
(i) if x ≠ 0 and y ≠ 0, then xy ≠ 0 (i.e. the only zero divisor is 0);
(ii) 1 ≠ 0 (equivalently R ≠ {0}).
Caution. Some authors don’t require commutativity, or don’t require
the ring to have a 1, or allow the zero ring to be an integral domain.
Proposition 14.4. If F is a field, and R is a subring of F , then R is an
integral domain.
Proof. Certainly R is a commutative ring, since it inherits commutativity
from F . Furthermore 1 ≠ 0 in a field F . It also inherits the property of having
no non-zero zero divisors: if x ≠ 0 and xy = 0, then
y = (x−1 x)y = x−1 (xy) = x−1 0 = 0 .
Corollary. Any field is an integral domain; Z, Q, R and C are integral
domains (facts which you presumably don’t find new or surprising).
Note. This is a phoney proof for Z, since, in the construction of the
number systems, one needs to verify the integral domain property of Z before
constructing Q and verifying that Q is a field. A converse of 14.4 will be
proved in 18.1, mimicking the construction of Q, and showing that, up to
isomorphism, all integral domains arise as in 14.4.
15. Structure of Z_n^× .
The set R× of invertibles in a ring R is easily seen to be a group under
multiplication. Thus Z_k^× is a commutative group of order

Φ(k) := the number of integers between 1 and k which are prime to k ,

since [ℓ] mod k has an inverse in Z_k if and only if ℓ is prime to k.
Exercise 15A. Check these assertions.
Exercise 15B. If R and S are rings, make the abelian group R × S into
a ring by using the multiplication
(r1 , s1 )(r2 , s2 ) := (r1 r2 , s1 s2 ) .
We have 1R×S = (1R , 1S ). Furthermore, (R × S)× = R× × S × since
(r, s)−1 = (r−1 , s−1 ) . Verify these statements.
Proposition 15.1. If k and ℓ are relatively prime, then, as rings,

Z_{kℓ} ≅ Z_k × Z_ℓ .

Exercise 15C. The map Z_{kℓ} → Z_k × Z_ℓ sending [x]_{kℓ} to ([x]_k , [x]_ℓ ) is
a ring isomorphism. Complete the proof as an exercise.
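Proposition 15.1 can also be confirmed by brute force for small moduli; the Python check below (a sanity check, not a proof, and not a solution to 15C) verifies that the map is a bijection respecting + and ·:

```python
# Brute-force check that x mod kl |-> (x mod k, x mod l) is a ring
# isomorphism Z_kl -> Z_k x Z_l whenever GCD(k, l) = 1.
from math import gcd

def check_crt(k, l):
    assert gcd(k, l) == 1
    image = {(x % k, x % l) for x in range(k * l)}
    if len(image) != k * l:                    # must be a bijection
        return False
    for x in range(k * l):
        for y in range(k * l):
            s, p = (x + y) % (k * l), (x * y) % (k * l)
            # morphism conditions, checked componentwise mod k and mod l:
            if (s % k, s % l) != ((x % k + y % k) % k, (x % l + y % l) % l):
                return False
            if (p % k, p % l) != ((x % k) * (y % k) % k, (x % l) * (y % l) % l):
                return False
    return True

assert check_crt(4, 9) and check_crt(5, 7)
```

For k and ℓ not coprime the map fails to be injective (try k = ℓ = 2), which is why the hypothesis is needed.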
Corollary. If GCD{k, ℓ} = 1, then Z_{kℓ}^× ≅ Z_k^× × Z_ℓ^× , so that
Φ(kℓ) = Φ(k)Φ(ℓ) . Extending to any number of factors, for distinct primes
p_1 , · · · , p_t , we have

Z_{p_1^{α_1} · · · p_t^{α_t}}^× ≅ Z_{p_1^{α_1}}^× × · · · × Z_{p_t^{α_t}}^×

and

Φ(p_1^{α_1} · · · p_t^{α_t}) = Φ(p_1^{α_1}) · · · Φ(p_t^{α_t}) = ∏_i ( p_i^{α_i} − p_i^{α_i − 1} ) .

We can write the last formula as

Φ(k) = k ∏_{p|k} ( 1 − 1/p ) ,

product over all primes p dividing k.
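The counting definition of Φ and the product formula can be cross-checked mechanically. A Python sketch (the factorization loop is naive trial division, chosen for brevity; divisions are exact at each step since the prime just found still divides the running result):

```python
from math import gcd

def phi_count(k):       # the definition: integers in 1..k prime to k
    return sum(1 for x in range(1, k + 1) if gcd(x, k) == 1)

def phi_formula(k):     # Phi(k) = k * prod over primes p | k of (1 - 1/p)
    result, n, p = k, k, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p - 1)   # exact: p divides result here
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:                                # one prime factor > sqrt(n) left
        result = result // n * (n - 1)
    return result

assert all(phi_count(k) == phi_formula(k) for k in range(1, 300))
assert phi_count(7 * 11) == phi_count(7) * phi_count(11)   # multiplicativity
```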
Lemma 15.2. For all integers y, all non-negative i and all positive α, we
have :

(i) y^{2^i} ≡ ±1 (mod 2^α ) ⇐⇒ y^{2^{i+1}} ≡ 1 (mod 2^{α+1} ) ;

and

(ii) if p is an odd prime,

y^{p^i} ≡ 1 (mod p^α ) ⇐⇒ y^{p^{i+1}} ≡ 1 (mod p^{α+1} ) .

Exercise 15D. Give the proof. Hint. For ⇐ in (ii), show that : z^p ≡
1 (mod p^{α+1} ) ⇒ z ≡ 1 (mod p^j ) for 1 ≤ j ≤ α, by induction on j.
The theorem below, when combined with the observations above, shows
how to write the finite commutative group Z_n^× as a direct product of cyclic
groups. We already know from Theorem 13.2 (written in multiplicative
notation) that such a decomposition must exist.
Theorem 15.3. (i) For α > 1, we have

Z_{2^α}^× ≅ C_2 × C_{2^{α−2}} .

(ii) If p is an odd prime, then

Z_{p^α}^× ≅ C_{p^{α−1}(p−1)} .
Proof. (ii) For α = 1, the group Z_p^× is cyclic, since if r is its largest
invariant factor, then the polynomial x^r − 1 (over Z_p ) has all p − 1 elements
of the group at issue as roots. This is impossible for r < p − 1, so r = p − 1,
as required. (See Sections 16 and 30 for a generalization of this idea, and
more details.)
For α > 1, it suffices to find group elements [x] of order p − 1, and
[y] of order p^{α−1}, since then, by coprimeness, [xy] will have order p^{α−1}(p − 1).
So let x = z^{p^{α−1}}, where [z]_p is a generator for Z_p^×, which exists by the previous
paragraph. For α = 2, the group Z_{p²}^× certainly has an element [y]_{p²} of order
p, since p divides the order of the group. (Use either 13.2 or 11.2.) By (ii)
of the previous lemma, [y]_{p^α} will have order p^{α−1} in Z_{p^α}^×, using induction on
α.
(i) For α = 2, this is clear, since the group at issue has order 2. For α = 3,
the group has four elements, all of order dividing 2, as required, since

1² ≡ 3² ≡ 5² ≡ 7² ≡ 1 (mod 8) .

Using (i) of the previous lemma and induction on α, we see that the element
of maximum order in Z_{2^α}^× has order 2^{α−2} for α ≥ 3. Thus Z_{2^α}^× has invariant
factors {2, 2^{α−2}} for α ≥ 3, as required.
Note. We have used several well known facts about Z. These are proved
in a more general context in Section 17.
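Readers wanting numerical evidence for Theorem 15.3 can compute element orders directly; a small brute-force Python sketch for small p and α (illustrative only):

```python
from math import gcd

def mult_order(y, n):        # order of y in the group Z_n^x (gcd(y,n) = 1)
    k, acc = 1, y % n
    while acc != 1:
        acc = acc * y % n
        k += 1
    return k

# (ii): Z_{p^a}^x is cyclic of order p^(a-1) * (p-1) for an odd prime p.
p, a = 3, 4
n = p**a
orders = [mult_order(y, n) for y in range(1, n) if gcd(y, n) == 1]
assert max(orders) == p**(a - 1) * (p - 1)       # a generator exists

# (i): for a >= 3 the maximum order in Z_{2^a}^x is only 2^(a-2),
# matching the invariant factors {2, 2^(a-2)}.
a = 5
n = 2**a
orders = [mult_order(y, n) for y in range(1, n, 2)]
assert max(orders) == 2**(a - 2)
assert len(orders) == 2 * 2**(a - 2)             # the group has order 2^(a-1)
```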
Exercise 15E. Write the group of invertibles in the ring of integers mod
2160 as a product of cyclic groups, in both invariant factor form and p-primary form.
Exercise 15F. Fix positive integers m > n. Find a formula for the exact
power of p which divides
GCD{ k m − k n : k ≥ 2 } ,
for each prime p.
EXCEPT WHERE NOTED EXPLICITLY, ALL RINGS
IN THE REMAINING SECTIONS ARE COMMUTATIVE.
For a good treatment of the ‘non-commutative theory’, see Lam(1991). See
also the last two sections, 51 and 52, of this book.
16. Polynomials and ring extensions.
An extension of a ring R is any ring S which contains R as a subring.
Particularly later in field theory, the extension will be referred to with the
notation S ⊃ R (a noun), since R is almost as important to keep in mind as
is S. The notation R ⊂ S should be read as a phrase: ‘R is a subring of S’,
or ‘R, which is a subring of S’, or ‘S is an extension of R’.
Definitions. Given a ring extension S ⊃ R and s ∈ S, we say that s is
algebraic over R if and only if, for some m > 0 and some r0 , r1 , . . . , rm in
R with rm ≠ 0, we have
r0 + r1 s + r 2 s2 + · · · + rm sm = 0 .
Write the left-hand side as Σri si . (It can be thought of as an infinite sum, but
with ri = 0 for all i > m .) If Σri si = 0 with ri ∈ R implies that all ri = 0,
then we say that s is transcendental over R. So ‘transcendental’ means the
same as ‘not algebraic’. In either case, define a subset R[s] of S by
R[s] := { Σri si : ri ∈ R, ri = 0 for almost all i }.
Theorem 16.1. The set R[s] is a subring of S, and is an extension of R.
It coincides with the intersection of the collection of all subrings of S which
contain R ∪ {s}.
Definition. The ring R[s] is called the subring generated by R ∪ {s} or
the subring generated by s over R.
Exercise 16A. Show that the intersection of any collection of subrings
of a given ring S is a subring of S.
Proof of 16.1. Applying inductively proved general associative, commutative and distributive laws for any (finite) number of ring elements, we
find

Σ ri s^i ± Σ r′i s^i = Σ (ri ± r′i ) s^i

and

(Σ ri s^i )(Σ r′i s^i ) = Σ ( Σ_{j=0}^{i} rj r′_{i−j} ) s^i ,
so that R[s] is closed under +, − and · . Taking m = 0 (that is, ri = 0 for
all i > 0) shows that R ⊂ R[s]. In particular, 1 ∈ R[s]. The first assertion
is thus established. The intersection T in the second assertion is a subring
of S by 16A. It clearly contains R ∪ {s}, so by closure, R[s] ⊂ T. But that
T ⊂ R[s] is obvious, since R[s] is one of the rings in the intersection defining
T.
Remark. You may have noted the analogy with the two descriptions
given in 6.3 of the subgroup generated by a subset of a group.
Exercise 16B. Write out all the details concerning the first sentence of
the above proof.
Proposition 16.2. Continuing with the same notation, elements of R[s]
are uniquely expressible in the form Σri si if and only if s is transcendental
over R.
Examples of non-uniqueness. Let R = R, S = C and s = i, where
i² = −1. Then

0 + 0i + 0i² = 1 + 0i + 1i² .

Here R[s] = S. Alternatively, let R = Q, S = R and s = √2. Then

0 + 0√2 + 0(√2)² = 2 + 0√2 + (−1)(√2)² .

Here R[s] ≠ S. In both of these examples, elements of R[s] are uniquely
expressible in the form r0 + r1 s ; whereas, if ∛2 had been used instead of
√2 for s, we’d get unique expressions r0 + r1 s + r2 s² . In 24.2, this is explained
in general, at least when s is algebraic and both R and S are fields.
Proof of 16.2. Assume that s is transcendental over R and that Σ ri s^i =
Σ r′i s^i . Then Σ (ri − r′i ) s^i = 0, so ri − r′i = 0 for all i, by transcendence, as
required. Conversely, if Σ ri s^i = 0 = Σ 0 s^i , then, by uniqueness, ri = 0 for all
i, proving transcendence of s.
Remarks. The reader will be familiar with polynomials Σ ri x^i , at least
when R = R. The object x is often called a ‘variable’ or ‘indeterminate’,
without any attempt being made to define the words. Actually, in kindergarten algebra, the above polynomial would usually be thought of as the
function from R to R which maps each real number t to the real number Σ ri t^i . Below we’ll see that it is important to think of a polynomial as an expression,
rather than as a function, since, for certain R such as Zp , the two rings (of
polynomials and of polynomial functions) are not in 1-1 correspondence. We
want two polynomials to differ if any coefficients differ, so 16.2 motivates
the following.
Definition. ‘The’ polynomial ring in one variable over R is R[x] for any
x which is transcendental over R.
Notation. To avoid having to say explicitly that we are considering the
polynomial ring, we shall never place [x] to the right of R except when x
is transcendental over R. This applies to the letter x as well as to x′ , x1 , x2
etc..., but not to other letters. This depends on having a ring R none of
whose extensions involved in the discussion has an element algebraic over R
and already named x. For example, x certainly isn’t transcendental either
over R[x], or over R[x3 ]. There is nothing to stop the reader from placing
[x] to the right of either of these, but I shall refrain from doing so. If x
is transcendental over R, then a shorter notation for the object denoted by
doing so is just R[x].
It is fairly obvious that R[x] is unique up to isomorphism, and not quite
so obvious that it always exists:
Theorem 16.3. i) (Uniqueness) If s and t are both transcendental
over R, then R[s] ≅ R[t].
ii) (Existence) For any R, there exists a polynomial ring R[x]. That is,
there exists an extension ring containing an element which is transcendental
over R.
Proof i) It is a routine verification to check that the function Σ ri s^i ↦ Σ ri t^i
is an isomorphism of rings, as required.
Exercise 16C. Do it!
ii) Preliminary Remarks. Many mathematicians, including perhaps
the reader, will feel that this next proof is hardly necessary. They say: “Just
take an abstract symbol x and operate with it according to the usual rules
of algebra, including the rule that distinct polynomial expressions are to
be regarded by definition as denoting different objects.” The author would
disagree with this only in pointing out that, although following these rules of
algebra has not yet led to a contradiction, this fact alone is no guarantee that
such a contradiction won’t occur tomorrow. The following proof gives this
guarantee: such a contradiction would be caused by something other than
‘rules for polynomials’. This is analogous to ‘rules for complex numbers’,
where i is regarded as an abstract symbol satisfying only laws which follow
from commutativity and the law i2 = −1. A construction of C (say, as
R × R) is similarly needed simply to guarantee relative consistency.
To proceed to the proof, define
S := { (r0 , r1 , r2 , · · · ) : ri ∈ R, ri = 0 for almost all i } .
Thus, S is a subset of the set of infinite sequences from R, the latter, strictly
speaking, being the set of functions from {0, 1, 2, . . .} to R. Define operations
on S as follows:
(r0 , r1 , · · · ) + (r′0 , r′1 , · · · ) := (r0 + r′0 , r1 + r′1 , · · · ) ;

(r0 , r1 , · · · ) (r′0 , r′1 , · · · ) := (r0 r′0 , r0 r′1 + r1 r′0 , · · · , Σ_{j=0}^{i} rj r′_{i−j} , · · · ) .
(The last summation gives the entry in the (i + 1)th slot, which is indexed by
i.) Note that both right-hand sides are in S : the terms are in R, and all but
finitely many are zero. Verification of the ring axioms is now routine; note
that the identity element of S is (1, 0, 0, 0, · · · ).
Exercise 16D. Do this verification.
We do have a problem: strictly speaking, R is not a subring of S. Define
φ : R → S by φ(r) = (r, 0, 0, · · · ). Another routine verification shows that
φ is an injective morphism of rings. Thus S has a subring φ(R) isomorphic
to R. If we ‘identify’ R with φ(R), then S becomes an extension of R, as
required.
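The construction just given is, incidentally, exactly how computer algebra systems represent polynomials internally. A minimal Python sketch over R = Z (the function names are mine, for illustration): addition is slot-wise, and multiplication is the convolution formula from the definition of S.

```python
# A polynomial over a ring R as its finite coefficient sequence [r0, r1, ...];
# here R = Z, with Python ints. Multiplication uses (fg)_i = sum f_j * g_{i-j}.
def poly_add(f, g):
    m = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(m)]

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

x = [0, 1]                                       # the element (0, 1, 0, 0, ...)
assert poly_mul(x, x) == [0, 0, 1]               # x^2 has its 1 in slot 2
assert poly_mul([1, 1], [1, -1]) == [1, 0, -1]   # (1+x)(1-x) = 1 - x^2
assert poly_add([1, 2], [0, 0, 5]) == [1, 2, 5]
```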
Aside: Remarks on ‘Identification’. The last assertion perhaps
smells a bit fishy. But most readers will already have seen something similar: for example, after constructing Q from Z, one identifies a ∈ Z with
a/1 ∈ Q ; after constructing C as R × R, one identifies R with the subfield
R × {0} of C . These relatively innocuous sleights-of-hand can be accomplished in at least three different ways: Suppose that φ : R → S is an
injective ring morphism. The arrow-theorist’s approach is to generalize the
meaning of the term ring extension to mean any ordered triple (R, S, φ) as
above. The conspicuous consumer’s approach is to throw away the old copy
of R (it’s last year’s model !!). Let the symbol R now stand for φ(R).
Finally, wanting R itself to be the subring, the careful plodder’s approach is
to remove φ(R) from the set S and to replace it by R. That is, define
S′ := R ∪ T

where T is disjoint from R and has the same cardinality as S \ φ(R). Then
pick any bijection ψ : S′ → S which agrees on R with φ. Define operations
on S′ by

a +S′ b := ψ⁻¹ [ψ(a) +S ψ(b)] ;
a ·S′ b := ψ⁻¹ [ψ(a) ·S ψ(b)] .
Then S′ is a ring, R is a subring of S′ , and the diagram

             ψ
      S′ ──────→ S
        ↖       ↗
 inclusion     φ
           R

commutes, with ψ being an isomorphism of rings. Any ring-theoretic property
of φ : R → S will carry over to the inclusion R → S′ . Henceforth we shall not dwell at all on
this business of identification. There are no algebraic pitfalls associated with
it, and the set-theoretic pitfalls are few and obvious. End of Aside.
To complete the proof, let x := (0, 1, 0, 0, · · · ) ∈ S. By induction,
x^n = (0, 0, · · · , 0, 1, 0, 0, · · · ), with 1 in the (n + 1)th slot. Now, using
the identification of a ∈ R with (a, 0, 0, · · · ) ∈ S, we get

Σ ai x^i = (a0 , a1 , a2 , · · · ) .

Thus, if Σ ai x^i is zero in S, then ai = 0 in R for all i, as required.
Exercise 16E. Write out the details of the above paragraph.
Definition. Let Map(R, R) be the set of all functions φ : R → R .
(φ needn’t be a morphism.) Define operations by

(φ1 + φ2 )(b) := φ1 (b) + φ2 (b) ;   (φ1 φ2 )(b) := φ1 (b)φ2 (b) .

Exercise 16F. Verify that Map(R, R) becomes a commutative ring.
Check that your proof generalizes to Map(X, R) for any set X.
Caution. The multiplication in Map(R, R) is not composition, and the
multiplicative identity is not the identity function: it is the constant function
sending a to 1 for all a.
Definition. Define µ : R[x] → Map(R, R) by ‘substitution’:

[ µ( Σ_{i=0}^{n} ai x^i ) ](b) := Σ_{i=0}^{n} ai b^i .

Then µ is a morphism of rings, and Imµ is the ring of polynomial functions
on R.
Example. Let R = Z3 . Then 1 + x + x² ≠ 1 + x² + x³ in Z3 [x]. But
µ(1 + x + x²) = µ(1 + x² + x³). It is the function from Z3 to itself which sends
0 to 1, 1 to 0, and 2 to 1. This example will be explained more generally in
16G.
Conclusion. The morphism µ is not a monomorphism in general.
This is why we don’t define a polynomial to be a polynomial function.
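The Z₃ example can be reproduced in a couple of lines; a Python sketch (coefficient lists, lowest degree first):

```python
# Evaluate a polynomial (list of coefficients, lowest degree first)
# at every point of Z_p, returning the induced function as a tuple.
def poly_fn(coeffs, p):
    return tuple(sum(c * b**i for i, c in enumerate(coeffs)) % p
                 for b in range(p))

p = 3
f = [1, 1, 1]        # 1 + x + x^2
g = [1, 0, 1, 1]     # 1 + x^2 + x^3
assert f != g                                        # distinct polynomials...
assert poly_fn(f, p) == poly_fn(g, p) == (1, 0, 1)   # ...but the same function
```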
Theorem 16.4. If R is an infinite field, then µ is a monomorphism.
So in this important and familiar case, there is a 1-1 correspondence between
polynomials and polynomial functions. The proof is at the end of this section.
The theorem is true more generally for any infinite integral domain R, as we
shall see later.
Definition. If f = Σ_{i=0}^{n} ai x^i is a non-zero polynomial, i.e. if
ai ≠ 0 for at least one i, define the degree of f by

deg( Σ ai x^i ) = max{ i : ai ≠ 0 } .
Proposition 16.5. Whenever all three degrees are defined, we have

deg(f + g) ≤ max{deg(f ), deg(g)} .

The proof is trivial.
Theorem 16.6. Assume that R is an integral domain. Then
(i) R[x] is an integral domain;
(ii) if f ≠ 0 ≠ g , then deg(f g) = deg(f ) + deg(g) ;
(iii) f is invertible in R[x] if and only if f is a constant which is invertible in
R.
Proof. If f = Σ_{i=0}^{n} ai x^i with an ≠ 0, and g = Σ_{i=0}^{m} bi x^i with bm ≠ 0,
then an bm is the coefficient of x^{n+m} in f g. But an bm ≠ 0 since R is an
integral domain. Thus f g ≠ 0, proving (i). Also if i + j > m + n, then
either i > n or j > m, so ai bj = 0. Thus the coefficient of x^k in f g is
zero if k > m + n. Hence deg(f g) = m + n . Finally, if f g = 1, then
0 = deg 1 = deg(f g) = deg f + deg g = m + n. But m ≥ 0 and n ≥ 0. Hence
n = 0 and m = 0, and so f is a constant, whose inverse in R is the constant
g. Hence f is a constant which is invertible in R. Conversely, if ab = 1 in R,
the same is clearly true in R[x], as required.
Assume for the rest of this section that F is a field.
Theorem 16.7. (The Division Algorithm) Suppose given polynomials f ∈ F [x] and d ∈ F [x], with d ≠ 0. Then there is exactly one pair
(q, r) ∈ F [x] × F [x] with the properties:
(i) f = qd + r (q is the quotient and r the remainder); and
(ii) either r = 0 or deg(r) < deg(d) .
Proof. (Uniqueness) Suppose that (q1 , r1 ) and (q2 , r2 ) both satisfy (i)
and (ii). Then q1 d + r1 = f = q2 d + r2 , so r1 − r2 = (q2 − q1 )d. If r1 ≠ r2 ,
then neither is 0 (check degrees in the last equation). Also

max(deg r1 , deg r2 ) ≥ deg(r1 − r2 ) = deg d + deg(q2 − q1 ) ≥ deg d .

But deg r1 < deg d and deg r2 < deg d by (ii), giving a contradiction. Thus
r1 = r2 , so (q2 − q1 )d = 0. But d ≠ 0, so q1 = q2 , proving uniqueness.
(Existence) For fixed d, proceed by induction on deg f . If f = 0 or
if deg f < deg d, let q = 0 and let r = f . Then (i) and (ii) are satisfied. If
deg f = deg d, suppose that f = Σ_{i=0}^{n} ai x^i and d = Σ_{i=0}^{n} bi x^i with bn ≠ 0 ≠ an .
Let q = bn^{−1} an . If n = 0, let r = 0. If n > 0, let r = Σ_{i=0}^{n−1} (ai − bn^{−1} an bi ) x^i .
Having proved existence for all f with deg f < m, where m > deg d, suppose
that f = Σ_{i=0}^{m} ai x^i where am ≠ 0. Then f = a0 + xf1 where f1 = Σ_{i=0}^{m−1} ai+1 x^i .
Since deg f1 = m − 1, we have f1 = q1 d + r1 , where r1 = 0 or deg r1 < deg d
by the inductive hypothesis. Also a0 = q0 d + r0 , where r0 = 0 or deg r0 <
deg d. Thus

f = a0 + xf1 = (q0 + xq1 )d + (r0 + xr1 ) .

Also r0 + xr1 = q2 d + r2 by the first part of the proof. So let

q = q0 + xq1 + q2   and   r = r2 .
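The existence proof amounts to the familiar long-division algorithm. The Python sketch below implements it over F = Z_p (coefficients listed from degree 0 up; it uses the usual cancel-the-leading-term loop rather than the proof's induction on a₀ + xf₁, and the names are illustrative):

```python
# Divide f by d in Z_p[x]: return (q, r) with f = q*d + r and
# r = 0 (the empty list) or deg r < deg d.
def poly_divmod(f, d, p):
    f = [c % p for c in f]
    d = [c % p for c in d]
    while d and d[-1] == 0:
        d.pop()
    assert d, "division by the zero polynomial"
    q = [0] * max(len(f) - len(d) + 1, 1)
    r = f[:]
    inv_lead = pow(d[-1], p - 2, p)      # inverse of leading coeff (p prime)
    while len(r) >= len(d) and any(r):
        shift = len(r) - len(d)
        c = r[-1] * inv_lead % p         # cancel the leading term of r
        q[shift] = c
        for i, di in enumerate(d):
            r[shift + i] = (r[shift + i] - c * di) % p
        while r and r[-1] == 0:
            r.pop()
    return q, r

# (x^3 + 2x + 1) divided by (x + 1) in Z_5[x]:
q, r = poly_divmod([1, 2, 0, 1], [1, 1], 5)
assert q == [3, 4, 1] and r == [3]       # q = x^2 + 4x + 3, r = 3
```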
Definition. If f = Σ_{i=0}^{n} ai x^i ∈ F [x] and b ∈ F , define

f (b) := Σ_{i=0}^{n} ai b^i = [µ(f )](b) .

We say that b is a root of f (or a zero of f ) if and
only if f (b) = 0. Since µ is a morphism,

(f + g)(b) = f (b) + g(b)   and   (f g)(b) = f (b)g(b) .
Remainder Theorem 16.8. If f ∈ F [x] and a ∈ F , then there exists
q with f = (x − a)q + f (a) ; i.e. the remainder on dividing f by (x − a) is
f (a).
Proof. We have f = (x − a)q + r, where either r = 0 or else
deg r < deg(x − a) = 1. Thus r is a constant. But
f (a) = (a − a)q(a) + r(a) = r(a) ,
so r is the constant f (a), as required.
Corollary 16.9. An element a ∈ F is a root of f if and only if there
exists a polynomial q ∈ F [x] with f = (x − a)q , (that is, · · ·if and only if
x − a is a divisor of f in F [x].)
Theorem 16.10. If deg(f ) = k, then f has at most “k” roots.
Proof. When k = 1, f = a0 + a1 x, and f has exactly one root, namely
−a1^{−1} a0 . Proceeding inductively, suppose that this theorem is true for all f
with deg f < n. For a contradiction, let deg f = n, and suppose that f has at
least “n + 1” distinct roots, namely b0 , b1 , · · · , bn . Then f = (x − b0 )g by
the Remainder theorem, and deg g = n − 1. But 0 = f (bi ) = (bi − b0 )g(bi ).
Since bi ≠ b0 for i > 0, g(bi ) = 0 for 1 ≤ i ≤ n. Thus g has “n” distinct roots
b1 , · · · , bn , contradicting the inductive hypothesis.
Proof of Theorem 16.4. [µ(f ) = 0] =⇒ [f (a) = 0 ∀a ∈ R] =⇒
[f has infinitely many roots ] =⇒ [f = 0] by 16.10. Thus Kerµ = {0}, as
required.
Exercise 16G. i) Prove that the kernel of µ : Zp [x] → Map(Zp , Zp ) is
the ideal I consisting of all multiples of x^p − x , (a principal ideal—see the
next section).
ii) Thus the ring of polynomial functions over Zp is isomorphic to the ring
Zp [x]/I .
iii) Prove that the latter ring has exactly “p^p” elements.
iv) Deduce that every function Zp → Zp is a polynomial function.
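Parts (i) and (iii)-(iv) of Exercise 16G can be explored numerically before being proved; a Python sanity check for p = 3 (a check, not a solution):

```python
from itertools import product

p = 3

def as_function(coeffs):     # the polynomial function Z_p -> Z_p
    return tuple(sum(c * b**i for i, c in enumerate(coeffs)) % p
                 for b in range(p))

# (i): x^p - x induces the zero function on Z_p (Fermat's little theorem),
# so every multiple of it lies in the kernel of the substitution map.
x_p_minus_x = [0, -1] + [0] * (p - 2) + [1]     # coefficients of -x + x^p
assert as_function(x_p_minus_x) == (0,) * p

# (iii)-(iv): the p^p remainders of degree < p give p^p distinct functions,
# i.e. ALL of the p^p functions Z_p -> Z_p are polynomial functions.
functions = {as_function(c) for c in product(range(p), repeat=p)}
assert len(functions) == p**p
```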
Exercise 16H. If R is a non-zero finite commutative ring, then the substitution map µ : R[x] → Map(R, R) is not injective—give a simple cardinality
argument to prove this, rather than an example.
17. Principal ideals & unique factorization.
Let R denote a commutative ring.
Definition. For a ∈ R, the principal ideal generated by a is the set
aR := { ar : r ∈ R } .
Exercise 17A. Prove that this set aR is an ideal.
Definition. A principal ideal domain (“PID”) is an integral domain in
which every ideal is a principal ideal.
Theorem 17.1. The rings Z, and F [x], when F is a field, are PID’s.
Proof. We already know that they are integral domains, so let I be an
ideal in F [x] (resp. in Z). If I = {0}, then I is the principal ideal generated
by 0. If I ≠ {0}, choose a non-zero a ∈ I such that no non-zero element of
I has degree (resp. absolute value) smaller than that of a. Then we’ll prove
that I is the principal ideal generated by a, as required. Since a ∈ I, we
have ar ∈ I for all r ∈ F [x] (resp. all r ∈ Z), so aF [x] ⊂ I (resp. aZ ⊂ I).
To reverse the inclusion, let b ∈ I, and divide a into b to obtain b = qa + r
where either r = 0 or degr <dega (resp. |r| < |a| ). Since b ∈ I and qa ∈ I,
we have r = b − qa ∈ I. But a was a non-zero element of I with smallest
possible degree (resp. absolute value), so we must have r = 0. Thus b = qa,
and so b ∈ aF [x] (resp. b ∈ aZ), as required.
Definition. In our commutative ring, we say that a divides b, and write
a | b, if and only if there exists c ∈ R with b = ac. We say that a and b are
associates, and write a ∼ b, if and only if there exists an invertible u ∈ R
with b = au.
Proposition 17.2. Assume that R is an integral domain. Then
(i) for all a, we have a | a ;
(ii) a | b and b | c implies that a | c ;
(iii) a | b ⇐⇒ bR ⊂ aR ;
(iv) a ∼ b ⇐⇒ (both a | b and b | a) ⇐⇒ aR = bR ;
(v) ∼ is an equivalence relation ;
(vi) 0 is an associate of only itself ;
(vii) the invertibles are the associates of 1 .
Proof. (i) a = a1 .
(ii) b = ar and c = bs implies that c = ars.
(iii) (a | b) ⇐⇒ (∃c, b = ac) ⇐⇒ (b ∈ aR) ⇐⇒ (bR ⊂ aR) .
(iv) (a ∼ b ) =⇒ (∃ invertible u with b = au) =⇒
(b = au & a = bu−1 ) =⇒ (a | b & b | a) =⇒
(bR ⊂ aR & aR ⊂ bR) , by (iii) =⇒ (aR = bR) =⇒
(∃ c, d with a = bc and b = ad) =⇒ (a = b = 0 or else cd = 1), since
b = bcd, and cancellation of non-zero elements is valid in an integral domain,
=⇒ a ∼ b , since in the first case, 0 ∼ 0, and in the second case c is a unit
whose inverse is d.
(v) Clearly “a ∼ b ⇐⇒ aR = bR ” defines an equivalence relation.
(vi) 0 ∼ a ⇐⇒ 0R = aR ⇐⇒ a ∈ {0} ⇐⇒ a = 0 .
(vii) 1 ∼ a ⇐⇒ 1R = aR ⇐⇒ 1 ∈ aR
⇐⇒ ∃b with ab = 1 ⇐⇒ a is invertible.
Note. We only used the integral domain property once.
Definition. An element d is a greatest common divisor for a and b,
written d = GCD{a, b}, if and only if
(i) d | a and d | b, and
(ii) if c | a and c | b, then c | d .
Definition. An element m is a least common multiple for a and b,
written m = LCM{a, b}, if and only if
(i) a | m and b | m, and
(ii) if a | n and b | n, then m | n .
Note. GCD’s and LCM’s don’t always exist.
Proposition 17.3. Assume that R is an integral domain. Then any two
GCD’s for {a, b} are associates. Similarly for LCM’s.
Proof. If d1 and d2 are both GCD’s for a and b , then d1 | d2 , since d1
divides any common divisor by (ii) of the definition. Symmetrically, d2 | d1
. So by 17.2(iv), d1 ∼ d2 . The proof for LCM’s is similar.
Exercise 17B. Show that if GCD{a, b} and LCM{a, b} both exist, then
their product is ab, up to associates. Use this to investigate under what
conditions the existence of one implies the existence of the other.
Theorem 17.4. Assume that R is a principal ideal domain. Then, if
a, b ∈ R ,
(i) the set I = { ra + sb : r ∈ R, s ∈ R } is an ideal in R, and is the
principal ideal generated by some GCD for {a, b}. In particular, GCD{a, b}
exists and has the form ra + sb for some r, s ∈ R. Also,
(ii) aR∩bR is an ideal in R and is the principal ideal generated by LCM{a, b}.
Proof. (i) An easy computation shows that I is an ideal. Since R is a
PID, we can choose d such that I = dR. Now a ∈ I, so d | a. Similarly, d | b.
Thus d is a common divisor. Suppose that c | a and c | b. Then c | (ra + sb)
for any r and s, i.e. c divides any element of I. In particular c | d. Thus d
is a greatest common divisor.
(ii) Any intersection of ideals is an ideal.
Exercise 17C. Prove this, including the case of an infinite collection of
ideals.
Thus we may choose m such that aR ∩ bR = mR. Then m ∈ mR ⊂ aR,
so a | m. Symmetrically, b | m. Thus m is a common multiple of a and
b. Now suppose that a | n and b | n. Then n ∈ aR and n ∈ bR. Hence
n ∈ aR ∩ bR = mR. So m | n, and m is a least common multiple.
Note. In Z and F [x], use the Euclidean algorithm to compute GCD’s,
LCM’s, and {r, s} as in the theorem. That algorithm is the same in F [x] as
in Z.
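This Note can be made concrete. Here is a sketch of the extended Euclidean algorithm over Z, producing both GCD{a, b} and the pair {r, s} of Theorem 17.4(i); for F [x], the identical loop works once divmod is replaced by polynomial division. The function name is illustrative.

```python
def extended_gcd(a, b):
    """Return (g, r, s) with g = GCD{a, b} and g = r*a + s*b,
    as guaranteed by Theorem 17.4(i) for the PID Z."""
    r0, r1 = a, b
    s0, s1 = 1, 0          # running coefficients of a
    t0, t1 = 0, 1          # running coefficients of b
    while r1 != 0:
        quo, rem = divmod(r0, r1)
        r0, r1 = r1, rem
        s0, s1 = s1, s0 - quo * s1
        t0, t1 = t1, t0 - quo * t1
    return r0, s0, t0

g, r, s = extended_gcd(240, 46)
assert g == 2 and r * 240 + s * 46 == 2
```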
Definition. A factorization a = bc is trivial if and only if one of b, c is
invertible and the other is an associate of a. An element a ∈ R is irreducible
if and only if it is not zero, not invertible, and it admits only trivial factorizations. Thus in an integral domain, a non-zero non-invertible is irreducible
if and only if all of its divisors are either invertibles or associates of itself.
Example. In F [x] the invertibles are the non-zero constants, so f ∼ g if
and only if either is a non-zero constant multiple of the other. A polynomial
is irreducible if and only if it has positive degree and it is not the product of
two polynomials of strictly smaller degree. In particular, every polynomial
of degree 1 is irreducible.
Definition. A unique factorization domain (sometimes called Gaussian
domain or UFD—not to be confused with unidentified flying domain) is
an integral domain R such that the following hold:
(1) Any a ∈ R which is non-zero and not invertible is a product of irreducibles; i.e. there exists s ≥ 1 and irreducibles p1 , p2 , · · · , ps (not necessarily distinct) such that a = p1 p2 · · · ps .
(2) Furthermore, any other such factorization of a may be obtained by replacing each pi by an associate and re-indexing (or re-ordering). That is, if p1 p2 · · · ps = q1 q2 · · · qr , where each pi and qj is irreducible, then r = s and there exists a permutation σ of { 1, 2, · · · , s } such that pi ∼ qσ(i) for 1 ≤ i ≤ s.
Note. See Exercises 17D for important non-UFD’s.
Theorem 17.5. For any field F , the ring F [x] is a UFD. In other words:
Any element of F [x] with positive degree is a product of irreducible polynomials.
Furthermore, any two such factorizations of a polynomial may be obtained from
one another by multiplying the factors by non-zero constants and re-ordering.
Proof. Existence. We prove by induction on deg f that f = p1 p2 · · · ps for some s ≥ 1 and irreducibles pi . If deg f = 1, then f is already irreducible so let s = 1 and p1 = f . Now suppose that deg f = n > 1, and that any
polynomial of degree less than n can be so factored. If f is irreducible, again
let s = 1 and p1 = f . If not, then f = gh, where g and h both have degree
less than n. By the inductive hypothesis, g = p1 p2 · · · pt and h = pt+1 · · · ps
where each pi is irreducible. Multiplying these together completes the proof
of existence.
Uniqueness. To do this we first need a lemma :
Lemma 17.6. In any PID, if p is irreducible and p | g1 g2 , then either
p | g1 or p | g2 .
Proof. Let d = GCD{p, g1 }. Then d | p and p is irreducible, so either
d ∼ p or d is invertible. If d ∼ p, then p | g1 since d | g1 . If d is invertible, we
may take d = 1. Then there exist elements r and s with 1 = pr + g1 s. Hence
g2 = (pr + g1 s)g2 = (rg2 )(p) + (s)(g1 g2 ). But p | (rg2 )(p), and p | (s)(g1 g2 )
since p | g1 g2 , so p | g2 as required.
Proof of Uniqueness. It follows easily from 17.6, by induction on s,
that if p is irreducible and p | g1 g2 · · · gs , then p | gi for some i.
We prove by induction on deg f that, if f = p1 p2 · · · ps ∼ q1 q2 · · · qr for irreducibles pi and qj , then r = s and qσ(i) ∼ pi for some permutation σ of
{ 1, · · · , s }. If deg f = 1, then because deg(p1 p2 · · · ps ) ≥ s, we must have s = 1 and f = p1 . Similarly r = 1 and f ∼ q1 . So we have p1 ∼ q1 , as required.
Now suppose that uniqueness has been proved for polynomials of degree less than n, that deg f = n, and, as above, that f = p1 p2 · · · ps ∼ q1 q2 · · · qr . Then p1 | q1 q2 · · · qr , so p1 | qj1 for some j1 . Since qj1 is also irreducible we have p1 ∼ qj1 . Thus p2 p3 · · · ps ∼ q1 · · · qj1 −1 qj1 +1 · · · qr = g , say. Since deg g < n, we may apply our inductive hypothesis, and conclude that s − 1 = r − 1 and that pi ∼ qji for 2 ≤ i ≤ s and some rearrangement {j2 , · · · , js } of {1, · · · , s} \ {j1 }. This completes the proof.
Comments. (1) When p1 p2 · · · ps = q1 q2 · · · qs (not merely ∼), let ui be the units such that pi = ui qσ(i) . Then we must have u1 u2 · · · us = 1 .
(2) Just as in 17.1, we could go through the last proof, replacing F [x] by
Z, polynomial by integer, and degree by absolute value. This would give a
proof of unique factorization in Z—the ‘fundamental theorem of arithmetic’.
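The 'fundamental theorem of arithmetic' can be watched in action. The following trial-division sketch is an illustration only (the proof indicated above is by induction on absolute value, not by any algorithm); the function name is invented here.

```python
def factor(n):
    """Return the list of primes, in increasing order and with multiplicity,
    whose product is n (for n >= 2): the existence half of unique
    factorization in Z."""
    primes, d = [], 2
    while d * d <= n:
        while n % d == 0:
            primes.append(d)
            n //= d
        d += 1
    if n > 1:
        primes.append(n)   # the remaining cofactor is prime
    return primes

assert factor(360) == [2, 2, 2, 3, 3, 5]
```

Uniqueness is invisible to such a routine, of course; it is the theorem that any other factorization of 360 into irreducibles differs only by order and signs (i.e. associates).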
(3) In fact, any PID is a UFD. This of course implies the result for F [x]
and for Z. But there are plenty of important UFD’s which are not PID’s.
For example, Z[x], and F [x1 , x2 , · · ·] (polynomials in several variables) for
any field F , are UFD’s. But the proof requires more work than above. See
Section 20.
(4) In 17.1 and 17.5, we may replace F [x] by any Euclideanizable domain, as defined below. Then replacing deg by δ, these results and their
proofs generalise immediately. (Scholars may object to the word used above,
but presumably not to the distinction between the adjectives Euclidean and Euclideanizable, any more than they would object to teaching students the distinction between metric and metrizable spaces). Thus any such domain which
admits at least one δ as below is a PID. And in the generalisation of 17.5
we would have partly proved (3); but not wholly, since one can find PID’s
which are not Euclideanizable.
Definition. A Euclidean domain is a pair (R, δ), where R is an integral
domain, and δ : R \ {0} → N is a function satisfying :
(i) δ(ab) ≥ δ(a) for all non-zero a and b ;
(ii) for all a and non-zero b in R, there is a pair (q, r) from R such that
a = qb + r and either r = 0 or δ(r) < δ(b)—(Uniqueness of (q, r) isn’t
assumed.)
An integral domain R is Euclideanizable iff ∃δ with (R, δ) being a Euclidean domain.
The standard examples are then the ones we’ve been concentrating on:
Z with δ being the absolute value, and F [x] (for a field F ) with δ being the
degree.
Exercises 17D. (A) Write out in detail the proofs of the first claim in paragraph (4) just above.
(B) Now try to generalize to arbitrary PID’s, i.e. prove the first statement in (3). You will probably be led, for example, to showing that a PID
cannot have an infinite, strictly increasing sequence of ideals.
(C) Show that (R, δ) is a Euclidean domain (called the Gaussian integers), where R = { a + b√−1 : a, b ∈ Z }, and δ is the restriction of the complex modulus. Show also that the remainder is not always unique in this example.
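Part (C) can be sketched numerically, with Python's built-in complex numbers standing in for Z[√−1]: rounding the exact quotient a/b to a nearest lattice point gives a remainder with |r| ≤ |b|/√2 < |b|, and when a/b lies exactly halfway between lattice points there are several valid roundings, which is the non-uniqueness asked about. The function name is an invention for this sketch.

```python
def gauss_divmod(a, b):
    """Division with remainder in the Gaussian integers Z[i]: round the
    complex quotient to a nearest Gaussian integer q; then r = a - q*b
    satisfies |r| <= |b|/sqrt(2) < |b|."""
    z = a / b
    q = complex(round(z.real), round(z.imag))
    return q, a - q * b

q, r = gauss_divmod(7 + 2j, 2 - 1j)
assert abs(r) < abs(2 - 1j)
```

The bound holds because the rounding error is at most 1/2 in each coordinate, so |a/b − q| ≤ √2/2.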
(D) Prove that, changing √−1 to √−5 in the domain above, the complex modulus continues to satisfy (i) in the definition above, and can be used to show that:
a) ±1 are the only invertibles;
b) elements such as 2, 3 and 1 ± √−5 are irreducible in this domain; and
c) the existence half of unique factorization holds.
But show that 2·3 = (1 + √−5)(1 − √−5) is an instance where the uniqueness half fails [and so (ii) certainly fails for the complex modulus restricted to this domain, and for any other function satisfying (i)].
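A numerical companion to part (D), using the norm (the squared modulus) rather than the modulus itself, purely for the convenience of working in integers; the names below are illustrative.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5): the squared complex modulus, a^2 + 5*b^2.
    It is multiplicative, so a factorization of an element factors its norm."""
    return a * a + 5 * b * b

# The norms of 2, 3, 1+sqrt(-5), 1-sqrt(-5) are 4, 9, 6, 6.  No element has
# norm 2 or 3 (an element of norm <= 3 would need |a| <= 1 and b = 0), so none
# of the four can factor non-trivially; yet 2*3 = (1+sqrt(-5))(1-sqrt(-5)) = 6.
attained = {norm(a, b) for a in range(-3, 4) for b in range(-2, 3)}
assert 2 not in attained and 3 not in attained
assert norm(2, 0) == 4 and norm(3, 0) == 9 and norm(1, 1) == 6
```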
Exercise 17E. Make a careful construction of a ring which is exactly
like F [x], except that arbitrary non-negative rational powers of x are used,
only finitely many in a given element. Show that you’ve produced an integral
domain in which the existence half of unique factorization fails—note that
x = x^(1/2) x^(1/2) = x^(1/4) x^(1/4) x^(1/4) x^(1/4) = · · · .
(5) There is a nice treatment of a theorem of Fermat, that any integer
prime which is congruent to 1 (mod 4) is a sum of two integer squares, using
the Euclidean domain in (C) just above. This is an example in which the
Euclidean function δ allows for more than one (quotient, remainder) pair in
the ‘division algorithm’, (ii). See Herstein, p. 113.
The following facts, concerning R and C, are probably already known to the reader.
(i) Every non-constant polynomial in C[x] has a root in C.
(ii) The only irreducibles in C[x] are the linear polynomials ax + b.
(iii) The irreducibles in R[x] are the linears and the quadratics
ax^2 + bx + c for which b^2 < 4ac.
Statement (i) is often called the fundamental theorem of algebra; see
Appendix A and Section 41 for two proofs. Statements (i), (ii) and (iii) are
easily shown to be logically equivalent. The unique factorization theorem
takes a more specific form for R[x] and C[x], using (ii) and (iii) respectively.
Later (in 20A and 31.4), we’ll see that Q[x], as well as Zp [x] for primes p, has irreducibles of every positive degree.
18. The field of fractions.
Definition. Let R be an integral domain. Define QR , the field of fractions of R as follows: as a set, it is the set of equivalence classes
QR := ( R × [R \ {0}] ) / ∼ ,
where (a1 , b1 ) ∼ (a2 , b2 ) if and only if a1 b2 = a2 b1 .
Exercise 18A. Before reading further, check that this is an equivalence
relation.
Denote the equivalence class of (a, b) as a/b. The operations are defined by:
(a/b) + (c/d) := (ad + bc)/(bd)
and
(a/b)(c/d) := (ac)/(bd) .
You may remember these formulae from kindergarten.
Note. If b and d are both non-zero, then so is bd, so denominators remain
in R \ {0}.
Theorem 18.1. The triple (QR , +, ·) is well defined and is a field. The
zero is 0/b for any b ≠ 0. The negative of a/b is (−a)/b. The identity is
a/a for any a ≠ 0. If a ≠ 0 ≠ b, the inverse of a/b is b/a. Finally, the map
φ : R → QR given by φ(a) = a/1 is a monomorphism of rings.
Proof. It is not hard to verify that ∼ is an equivalence relation. For
example, the relations (a1 , b1 ) ∼ (a2 , b2 ) and (a2 , b2 ) ∼ (a3 , b3 ) imply that
a1 b2 = a2 b1 and a2 b3 = a3 b2 . Therefore (a1 b2 )(a2 b3 ) = (a2 b1 )(a3 b2 ). Now if
a2 ≠ 0, then a2 b2 ≠ 0, so we can cancel a2 b2 and obtain that a1 b3 = a3 b1 . If
a2 = 0, then a1 b2 = 0 = a3 b2 , so a1 = 0 = a3 , and again a1 b3 = a3 b1 . Thus,
in both cases, (a1 , b1 ) ∼ (a3 , b3 ). This proves the transitivity of ∼. The other
two properties are much easier to prove.
To show that the operations are well defined amounts to showing that if
(a1 , b1 ) ∼ (a2 , b2 )
and
(c1 , d1 ) ∼ (c2 , d2 ) ,
then
(a1 d1 + b1 c1 , b1 d1 ) ∼ (a2 d2 + b2 c2 , b2 d2 )
and
(a1 c1 , b1 d1 ) ∼ (a2 c2 , b2 d2 ) .
This is an easy computation. The verifications of the associative, commutative and distributive laws are straightforward calculations, using the corresponding laws in R. For example, to prove distributivity:
(a/b + c/d)(e/f ) = [(ad+bc)/bd][e/f ] = (ad+bc)e/bdf = (ade+bce)/bdf.
But
(a/b)(e/f ) + (c/d)(e/f ) = ae/bf + ce/df = (aedf + cebf )/bf df.
The two answers are the same, since
(aedf + cebf, bf df ) ∼ (ade + bce, bdf ) .
We have 0/b + c/d = (0d + bc)/bd = bc/bd = c/d, so 0/b is the zero. Also,
(−a)/b + a/b = [(−a)b + ba]/(bb) = 0/(bb), so (−a)/b is the negative of a/b.
Furthermore, (a/a)(c/d) = ac/ad = c/d, so a/a is the identity for any a ≠ 0.
Also, (a/b)(b/a) = ab/ba = 1/1, the identity, so a/b and b/a are inverse, for
non-zero a and b.
Finally, the zero and identity in QR are distinct, since 0/1 = 1/1 implies
that 0 = 1 in R, which is false. Thus QR is a field.
The map φ is a morphism by a trivial calculation. To show that it is
injective, suppose that φ(a) is zero. Then a/1 = 0/1 in QR , so (a, 1) ∼ (0, 1),
or a · 1 = 1 · 0. Hence a = 0. This shows that Kerφ = {0}, as required.
Exercise 18B. Fill in all the details in the proof above.
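For R = Z, the construction of this section can be mirrored in a few lines. A minimal sketch (the class name and the GCD-reduction are illustrative choices; reduction is not part of the construction, it merely picks a canonical representative of each ∼-class):

```python
from math import gcd

class Frac:
    """Field of fractions of Z, as in Section 18: a pair (a, b) with b != 0,
    where (a1, b1) ~ (a2, b2) iff a1*b2 == a2*b1."""
    def __init__(self, a, b):
        assert b != 0
        g = gcd(a, b) * (-1 if b < 0 else 1)   # canonical representative
        self.a, self.b = a // g, b // g
    def __add__(self, o):                      # (a/b)+(c/d) = (ad+bc)/bd
        return Frac(self.a * o.b + self.b * o.a, self.b * o.b)
    def __mul__(self, o):                      # (a/b)(c/d) = ac/bd
        return Frac(self.a * o.a, self.b * o.b)
    def __eq__(self, o):                       # the relation ~ itself
        return self.a * o.b == o.a * self.b

assert Frac(1, 2) + Frac(1, 3) == Frac(5, 6)
assert Frac(2, 4) == Frac(1, 2)
```

Well-definedness (Theorem 18.1) is exactly what guarantees that the arithmetic does not depend on which representative pair happens to be stored.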
Note. This theorem shows that any integral domain is isomorphic to
a subring of some field. Conversely, any subring R of a field is an integral
domain. See 14.4.
Examples. (1) QZ = Q, and 18.1 reduces to the usual construction of
the field of rational numbers, starting from the integers.
(2) If F is a field, QF [x] is denoted F (x) (curly brackets!!), and is called
the field of rational functions with coefficients in F . We shall identify a
polynomial f with the rational function f /1 (just as we identify integers
with rationals which can be written in the form a/1).
19. Prime & maximal ideals.
Let I be an ideal in R, a commutative ring (with 1 , of course).
Definition. We say that I is prime in R if and only if
(1) I ≠ R ; and
(2) ab ∈ I implies that either a ∈ I or b ∈ I .
We say that I is maximal in R if and only if
(1) I ≠ R ; and
(2) if J is an ideal with I ⊂ J ⊂ R, then either J = I or J = R.
The following statements may easily be verified directly, but they are also
trivial corollaries of the next theorem:
{0} is prime ⇐⇒ R is an integral domain.
{0} is maximal ⇐⇒ R is a field.
Any maximal ideal is a prime ideal.
Exercise 19A. Prove the statements above.
Theorem 19.1. (a) I is prime ⇐⇒ R/I is an integral domain.
(b) I is maximal ⇐⇒ R/I is a field.
Proof. (a)⇒: Assume that I is prime. Then :
(a + I)(b + I) = 0 + I ⇒ ab + I = I ⇒ ab ∈ I ⇒ (a ∈ I or b ∈ I) ⇒ (a + I =
0 + I or b + I = 0 + I) . Also, I ≠ R, so R/I is not the zero ring.
(a)⇐: Assume that R/I is an integral domain. Then :
ab ∈ I ⇒ (a + I)(b + I) = 0 + I ⇒ (a + I = 0 + I or b + I = 0 + I) ⇒ (a ∈ I
or b ∈ I). Also, since R/I is not the zero ring, I ≠ R.
(b)⇒: Assume that I is maximal. Then R/I is not the zero ring, since I ≠ R. If a + I ≠ 0 + I, then a ∉ I. It remains to construct an inverse for a + I. Let J = { x + ay : x ∈ I, y ∈ R }. Then J is an ideal, and J ≠ I since a ∈ J, and I ⊂ J ⊂ R. Thus J = R. Choose x0 ∈ I and y0 with
x0 + ay0 = 1. Then
(a + I)(y0 + I) = (1 − x0 ) + I = 1 + I ,
so y0 + I is an inverse for a + I.
(b)⇐: Assume that R/I is a field. Then R/I is not the zero ring, so I ≠ R. Suppose that J is an ideal, I ⊂ J ⊂ R, and J ≠ I. Choose x ∈ J \ I.
Then there exists y ∈ R with (x + I)(y + I) = 1 + I. Thus 1 − xy ∈ I, so
1 − xy ∈ J. But xy ∈ J since x ∈ J. Thus 1 ∈ J, and J = R, as required.
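For R = Z, Theorem 19.1(b) combines with the next theorem's part (c) to say: Z/nZ is a field exactly when n is irreducible in Z, i.e. a prime. A brute-force sketch checking invertibility of every non-zero class (the function name is invented for this illustration):

```python
def is_field_mod(n):
    """True iff every non-zero class in the quotient ring Z/nZ has a
    multiplicative inverse, i.e. iff the ideal nZ is maximal (n >= 2)."""
    return all(any((a * b) % n == 1 for b in range(1, n))
               for a in range(1, n))

# nZ is maximal exactly when n is prime
assert is_field_mod(7) and not is_field_mod(6)
```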
The following can be generalized quite a bit, but is sufficient for our
purposes.
Theorem 19.2. Let R be a principal ideal domain with d ∈ R. Then
the following hold.
(a) The ideal dR is prime if and only if either d = 0 or d is irreducible.
(b) If R is a field, then dR is maximal if and only if d = 0.
(c) If R is not a field, then dR is maximal if and only if d is irreducible. Thus
prime and maximal ideals are nearly the same in a PID.
Proof. (a) Assume that dR is prime and d ≠ 0. Then d is not invertible since dR ≠ R. Suppose that d = ab. Then ab ∈ dR so either a ∈ dR
or b ∈ dR. If a ∈ dR, let a = dc. Then d = dcb, so cb = 1, and so b is
invertible. Symmetrically, if b ∈ dR, then a is invertible. Thus d admits only
trivial factorizations, i.e. d is irreducible. Conversely, if d is irreducible and
ab ∈ dR, then d divides ab, so either d divides a or d divides b (by 17.6).
Thus either a ∈ dR or b ∈ dR, as required. If d = 0, then dR = {0} is prime,
since R is an integral domain.
(b) If R is a field, then {0} and R are the only ideals, {0} is maximal,
and dR = {0} if and only if d = 0.
(c) Assume that R is not a field and dR is maximal. Then dR is prime,
so by (a), either d = 0 or d is irreducible. But the first possibility is ruled
out since {0} is not maximal. Conversely, suppose that d is irreducible, J is
an ideal and dR ⊂ J ⊂ R. Then J = aR for some a, so a divides d. Thus,
either a ∼ d, which implies that J = dR, or a is invertible, which implies
that J = R. Finally, dR ≠ R since d is not invertible. Thus dR is maximal,
as required.
Theorem 19.3. Given a ring morphism φ : F [x] → K, where F and K
are fields, there are two possibilities:
(1) φ is a monomorphism; or
(2) Imφ is a subfield of K isomorphic to F [x]/dF [x] for some irreducible
d ∈ F [x].
Proof. Suppose that φ is not injective. Now F [x] is a PID, so Kerφ = dF [x] for some d ∈ F [x]. By the first isomorphism theorem, Imφ ≅ F [x]/dF [x]. Thus F [x]/dF [x] is an integral domain, since Imφ is a subring of the field K. Thus dF [x] is a prime ideal in F [x]. Hence d is irreducible, since d ≠ 0 (for d = 0 ⇔ φ is a monomorphism). Therefore F [x]/dF [x] is a field and so Imφ is a field.
Remarks. a) It is immediate from Zorn’s Lemma (see Artin, p. 588)
that every non-zero ring has at least one maximal ideal. We won’t need this
fact later except in one alternative proof.
b) There is an exceptionally important example connecting algebra and
topology. Let X be a compact Hausdorff space. The continuous functions
form a subring, C(X), of the ring of all functions from X to C, where the
operations are defined by just adding and multiplying values of the functions.
For each x ∈ X, the set of continuous functions f for which f (x) = 0 is an
ideal, Ix , in C(X). This ideal is maximal, and it can be proved that every
maximal ideal has this form. Thus there is a 1-1 correspondence between the
maximal ideals in C(X) and the points of the space X. One can go further
and even define, purely algebraically, the topology on the set of maximal
ideals which makes the above correspondence into a homeomorphism. This
theme of ‘algebraizing topology and geometry’ has become very important in
recent years, especially the generalization to the non-commutative situation.
A purely algebraic analogue of the above correspondence is the evident
fact that the maximal ideals, (x − z)C[x], in the ring C[x] are in 1-1 correspondence with the ‘points’, z, in C. It is a non-trivial theorem (Hilbert’s
Nullstellensatz) that this generalizes to a 1-1 correspondence between the
set of maximal ideals in the ring of polynomials in n variables and the set of
points in complex n-space. This is a fundamental connection between algebra and geometry in a subject called algebraic geometry, discussed briefly
at the end of Section 21.
Exercise 19B. Prove that Ix above is a maximal ideal. (Take X to be
the closed interval [0, 1] if you haven’t studied topology.)
20. Gauss’ theorem.
Let R be a UFD. We’ll prove :
Theorem 20.1. (Gauss) R[x] is also a UFD.
It follows that polynomials in several variables over a UFD also form a
UFD; see the next section, especially 21.3. Coefficients in Z or any field are
especially important examples.
Definition. The non-zero polynomial a0 + a1 x + · · · + an x^n ∈ R[x] is primitive in R[x] if and only if GCD{a0 , · · · , an } = 1.
Gauss’ Lemma 20.2. If f and g are primitive in R[x], then so is the
polynomial f g.
Proof. For a contradiction, suppose that p divides all coefficients of f g, where p is an irreducible in R. Let
f = ∑ ai x^i ; g = ∑ bj x^j ;
k = min{ i : p does not divide ai } ; ℓ = min{ j : p does not divide bj }.
But p divides each of : ∑_{i+j=k+ℓ} ai bj (the coefficient of x^{k+ℓ} in f g) ; ai for all i < k ; and bj for all j < ℓ. Thus p | ak bℓ , so p divides either ak or bℓ , contradicting the definitions of k and ℓ.
Theorem 20.3. Let f ∈ R[x].
(i) If f is constant, then (f is irreducible in R[x] ⇐⇒ f is irreducible in R).
(ii) If deg f > 0, then (f is irreducible in R[x] ⇐⇒ f is irreducible in QR [x] and primitive in R[x]).
Proof. (i) Clearly f = gh is a non-trivial factorization in R[x] if and
only if it is a non-trivial factorization in R.
(ii)⇐: If f = gh in R[x], then g and h cannot both have positive degree
since f is irreducible in QR [x]. We may assume that g is a non-zero constant.
Then g must be invertible in R, since f is primitive in R[x]. Hence f = gh
is a trivial factorization in R[x].
(ii)⇒: If f were not primitive in R[x], it would have a non-trivial factorization f = gh, where g is the GCD of the coefficients of f . Supposing, for a contradiction, that f is not irreducible in QR [x], we would have a non-trivial factorization f = gh. But for all non-zero g ∈ QR [x], there exists a λ ∈ QR with λg primitive in R[x]. Hence there exists a µ ∈ QR such that µf = ḡ h̄, where ḡ and h̄ are primitive in R[x]. Since, by Gauss’ lemma, ḡ h̄ is primitive in R[x], the element µ must be invertible in R, so f = µ−1 ḡ h̄ is a non-trivial factorization in R[x], contradicting the hypothesis.
Proof of Gauss’ Theorem 20.1.
Existence. Factor f as p1 p2 · · · pt in QR [x]. Multiplying each pi by an element in QR , we obtain f = µp̄1 p̄2 · · · p̄t , where each p̄i is primitive and irreducible in R[x], and µ ∈ QR . Since p̄1 p̄2 · · · p̄t is primitive, actually µ ∈ R. Now factorize µ = µ1 µ2 · · · µs in R. Then f = µ1 µ2 · · · µs p̄1 p̄2 · · · p̄t is a factorization in R[x].
Uniqueness. Suppose that
µ1 µ2 · · · µs p̄1 p̄2 · · · p̄t = ν1 ν2 · · · νr q̄1 q̄2 · · · q̄v ,
where : µi and νj are irreducibles in R; p̄i and q̄j are primitive irreducibles
in R[x]. We must have µ1 µ2 · · · µs ∼ ν1 ν2 · · · νr since these are the GCD’s of
the coefficients. Hence r = s and µi ∼ νi for all i after re-arranging, since R
is a UFD. Also t = v, and p̄i ∼ q̄i for all i (after re-arranging) in QR [x]—and
therefore in R[x]—, since QR [x] is a UFD.
Exercise 20A. (Eisenstein’s Criterion). Let p be a prime. Show that a primitive polynomial with integer coefficients which has its top coefficient not divisible by p, all its other coefficients divisible by p, and its bottom one not divisible by p^2 , is irreducible in Z[x]—and therefore in Q[x]. (Argue as in the proof of Gauss’ Lemma 20.2.)
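The hypotheses of Eisenstein's criterion can be checked mechanically. A sketch (the lowest-degree-first coefficient order and the function name are choices made here; the check verifies the hypotheses only, irreducibility being what the exercise asks you to deduce):

```python
def eisenstein(coeffs, p):
    """Check Eisenstein's hypotheses at the prime p for the polynomial with
    coefficients [a0, a1, ..., an], lowest degree first: p divides every
    coefficient except the top one, and p^2 does not divide a0."""
    *lower, top = coeffs
    return (all(a % p == 0 for a in lower)
            and top % p != 0
            and lower[0] % (p * p) != 0)

assert eisenstein([2, 6, 0, 1], 2)      # x^3 + 6x + 2: irreducible over Q
assert not eisenstein([4, 6, 0, 1], 2)  # fails: p^2 divides a0
```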
21. Polynomials in several variables
and ring extensions.
Let k be a positive integer. The case k = 1 of what follows has already
been given in the first part of Section 16. Let S ⊃ R be a ring extension,
and let (s1 , · · · , sk ) be a sequence from S. Initially we don’t preclude the
possibility that si = sj when i ≠ j.
Proposition 21.1. The ring R[s1 ][s2 ] · · · [sk ] is
{ ∑ ri1 ,···,ik s1^i1 s2^i2 · · · sk^ik : ri1 ,···,ik is in R and is almost always zero } .
It coincides with the intersection, T , of all subrings of S which contain R ∪ {s1 , · · · , sk }. It is independent of the order of (s1 , · · · , sk ).
Definition. Denote this ring as R[s1 , · · · , sk ], and call it the ring generated by R ∪ {s1 , · · · , sk }, or the ring generated by {s1 , . . . , sk } over R.
Sketch Proof of 21.1. The first assertion follows by induction on k
from the definition of R[s] before 16.1. The third assertion clearly follows
from the second, which is proved by the same argument as in 16.1: Because
T is a ring containing R ∪ {s1 , . . . , sk }, it contains all elements as in the first
assertion. Because R[s1 , · · · , sk ] is one of the rings in the intersection defining
T, it contains T.
Definition. The sequence (s1 , . . . , sk ) is algebraically independent over R if and only if ∑ rI s1^i1 · · · sk^ik = 0 implies that all rI are zero, where the rI ∈ R. Otherwise it is algebraically dependent over R.
Remarks. Clearly (s1 ) is algebraically dependent over R if and only if
s1 is algebraic over R. Any sequence with si = sj for some i ≠ j is evidently
algebraically dependent. The algebraic dependence of a sequence of distinct
elements depends only on the set of those elements. Thus we shall refer
to the algebraic dependence and independence of sets. An infinite set will,
by definition, be algebraically independent when all of its finite subsets are.
That case will seldom occur here.
Exercise 21A. Prove that (s1 , . . . , sk ) is algebraically independent over
R if and only if, for all i, the element si is transcendental over the subring
R[s1 , · · · , si−1 ].
Definition. ‘The’ polynomial ring over R in “k” variables is the ring
R[x1 , · · · , xk ] for any (x1 , · · · , xk ) which is algebraically independent over R.
As a convention which saves words, we shall never place the string [x1 , · · · , xk ]
after R unless (x1 , · · · , xk ) is algebraically independent over R (but this applies only to the letter x).
Remarks. By obvious inductions on k from the case k = 1 done in 16.3,
we get the following.
1) R[x1 , · · · , xk ] is unique up to isomorphism. More precisely, if (s1 , · · · , sk )
and (t1 , · · · , tk ) are both algebraically independent over R, then there
is a unique isomorphism
R[s1 , · · · , sk ] −→ R[t1 , · · · , tk ]
which fixes each r ∈ R and maps each si to ti . See also the extension
principle in k variables just ahead.
2) For any R, there does exist a polynomial ring R[x1 , · · · , xk ]. For the
inductive step, use 16.3ii) to choose an xk which is transcendental over
R[x1 , · · · , xk−1 ]. The construction in the proof of 16.3 would then give a
ring R[x1 , x2 ] consisting of sequences whose terms are sequences whose
terms are in R ! Evidently this is not usually a good way to think of a
polynomial in two variables; and it gets worse for larger k.
Also, by induction on k, 16.5 and 20.1 give
Proposition 21.2. If R is an ID, then so is R[x1 , . . . , xk ]. In this case,
the invertibles in the latter ring are simply the invertibles in its subring R of
‘constants’.
Theorem 21.3. If R is a UFD, then so is R[x1 , . . . , xk ].
Examples. Thus Z[x1 , . . . , xk ] and (when F is a field) F [x1 , . . . , xk ] are
UFD’s.
In the above two results, we cannot change ID/UFD to PID. In fact
R[x1 , . . . , xk ] is never a PID if k > 1. For example, the ideal
{ s1 x1 + s2 x2 : si ∈ R[x1 , · · · , xk ] } ,
generated by x1 and x2 , is not principal. In Z[x], the ideal
{ ∑ ai x^i : a0 is even } ,
generated by 2 and x, is not principal.
Exercise 21B. Take this to its evident conclusion: Prove in detail that
if k > 0, and R is an ID such that R[x1 , · · · , xk ] is a PID, then k = 1 and R
is a field.
Let S ⊃ R be a ring extension.
Morphism Extension Principle (1 variable). Given a ∈ S, there is
a unique ring morphism φ : R[x] −→ S which
i) sends elements of R to themselves in S, and
ii) sends x to a.
In fact φ(f ) = f (a). Certainly this φ has properties i) and ii). Furthermore, any ring morphism satisfying i) and ii) must send ∑ bi x^i to ∑ bi a^i , i.e. send f to f (a). Thus it coincides with our given φ. This extension property is the
basic property which is not always shared by the ring of polynomial functions.
The image of φ is clearly equal to R[a]. The map φ is injective if and only
if f (a) %= 0 whenever f %= 0, i.e. if and only if a is transcendental over R.
Thus a is transcendental over R if and only if φ determines an isomorphism
R[x] → R[a].
Morphism Extension Principle (k variables). Given a sequence (a1 , · · · , ak ) from S, there is a unique ring morphism
φ : R[x1 , · · · , xk ] −→ S which
i) sends elements of R to themselves in S, and
ii) sends each xi to ai .
In fact φ(f ) = f (a1 , · · · , ak ).
This follows directly as in the case k = 1 just above, and also by induction
on k using the case k = 1. The image of φ is R[a1 , · · · , ak ]. The map φ is
injective if and only if (a1 , · · · , ak ) is algebraically independent over R, and in this case, R[a1 , · · · , ak ] ≅ R[x1 , · · · , xk ].
The glories of algebraic geometry.
Not much will be done in this book with polynomials in more than one
variable. The subject which studies the solution sets in ‘n-space’ of systems
consisting of one or more polynomial equations in n variables is known as
algebraic geometry. It has probably produced and motivated the most important body of pure mathematics in this century. Its influence pervades
algebra, topology, and parts of analysis. Its applications range from number
theory to particle physics to computer graphics to mathematical logic. See
Artin, pp. 373–379, for a brief introduction to algebraic geometry.
22. Symmetric polynomials.
Besides the uniqueness of the polynomial ring, here’s another application
of the extension principle in several variables. It gives the ring morphism
from R[x1 , · · · , xk ] to itself which permutes the variables according to some
given permutation, γ, in the symmetric group Sk . Simply take S to be
R[x1 , · · · , xk ] and take each ai to be xγ(i) . Call the resulting map φγ . It
sends f (x1 , · · · , xk ) to f (xγ(1) , · · · , xγ(k) ). Then φγ is an isomorphism, whose
inverse, (φγ )−1 , is φγ −1 , since first re-arranging the variables, then putting
them back in place gives the identity map. More precisely, the identity map
of the polynomial ring, and φγ −1 ◦φγ , and φγ ◦φγ −1 , are all morphisms sending
constants to themselves and each xi to itself. By the uniqueness part of the
extension principle, they all must be the same, as required. The fact that
re-arranging the variables gives an isomorphism is a more detailed expression
of the fact that the order of the variables is immaterial.
Definition. Define SymmR[x1 , · · · , xk ], the set of symmetric polynomials, by: f ∈ SymmR[x1 , · · · , xk ] if and only if φγ (f ) = f ∀γ ∈ Sk .
Proposition 22.1. The set SymmR[x1 , · · · , xk ] is a subring of R[x1 , · · · , xk ].
Proof. We have φγ (1) = 1. If φγ (f ) = f and φγ (g) = g, then φγ (f ± g) =
f ± g, and φγ (f g) = f g.
Definition. Define the ith elementary symmetric polynomial, ei ∈ R[x1 , · · · , xk ], to be the coefficient of t^i in the element
(1 + x1 t)(1 + x2 t) · · · (1 + xk t) ∈ R[x1 , · · · , xk ][t] ,
where t is transcendental over R[x1 , · · · , xk ].
Proposition 22.2. We have ei ∈ SymmR[x1 , · · · , xk ] ; e0 = 1 ; ei = 0 for i > k ; and, for 1 ≤ i ≤ k,
ei = ∑_{1≤j1 <j2 <···<ji ≤k} xj1 xj2 · · · xji .
Note. We’re assuming that k is ‘fixed’. The polynomial ei , of course,
depends on k.
Proof. To show that ei is symmetric, note that for all γ ∈ Sk , we have
(1 + xγ(1) t)(1 + xγ(2) t) · · · (1 + xγ(k) t) = (1 + x1 t)(1 + x2 t) · · · (1 + xk t) .
Note also that the right-hand side has 1 as coefficient of t^0 , and has degree k as a polynomial in t, so e0 = 1 and ei = 0 for i > k.
Exercise 22A. Prove the last formula in 22.2 by induction on k, or
otherwise.
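The defining product also gives an algorithm: multiplying in the factors (1 + xj t) one at a time updates the coefficient list by the rule 'new ei = ei + xj · ei−1'. A sketch evaluating the elementary symmetric polynomials at concrete values (names are illustrative):

```python
def elementary_symmetric(xs):
    """Coefficients e_0, ..., e_k of t^i in the product (1+x_1 t)...(1+x_k t),
    computed by multiplying out one factor at a time."""
    e = [1]                                   # e_0 = 1 before any factors
    for x in xs:
        e = ([e[i] + x * (e[i - 1] if i else 0) for i in range(len(e))]
             + [x * e[-1]])                   # top coefficient gains a degree
    return e

# for (x1, x2, x3) = (1, 2, 3): e1 = 1+2+3, e2 = 1*2+1*3+2*3, e3 = 1*2*3
assert elementary_symmetric([1, 2, 3]) == [1, 6, 11, 6]
```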
The following is a fundamental fact about symmetric polynomials, and
is often phrased: Any symmetric polynomial can be expressed uniquely as a
polynomial in the elementary symmetric polynomials.
Theorem 22.3. The set { e1 , e2 , · · · , ek } is algebraically independent
over R, and generates SymmR[x1 , · · · , xk ]. Thus
SymmR[x1 , · · · , xk ] = R[e1 , · · · , ek ] .
(So the subring, SymmR[x1 , · · · , xk ], is actually isomorphic to its extension
ring, R[x1 , · · · , xk ].)
Proof. Proceed by induction on k. When k = 1, it is clear, since e1 (x1 ) =
x1 is transcendental over R, and SymmR[x1 ] = R[x1 ]. For the inductive step,
let ē1 , · · · , ēk−1 denote the ESF’s in x1 , · · · , xk−1 .
Algebraic independence: Suppose, for a contradiction, that f = ∑_{i=0}^{n} fi t^i is a non-zero polynomial of least degree n (in the last variable t), where each fi is a polynomial in “k − 1” variables, such that ∑_{i=0}^{n} fi (e1 , · · · , ek−1 )(ek )^i = 0. Now f0 ≠ 0, since otherwise we could factor out a copy of ek from the relation, contradicting the minimality of n.
Exercise 22B. Justify this, without assuming that R is an ID.
On the other hand, let φ : R[x1 , · · · , xk ] → R[x1 , · · · , xk−1 ] be the ring morphism obtained by ‘setting xk equal to zero’. Then φ(ei ) = ēi for 1 ≤ i ≤ k − 1, and φ(ek ) = 0. Applying φ to the relation ∑_{i=0}^{n} fi (e1 , · · · , ek−1 )(ek )^i = 0 yields f0 (ē1 , · · · , ēk−1 ) = 0. By the inductive hypothesis, { ē1 , · · · , ēk−1 } is algebraically independent, so f0 = 0, giving the required contradiction.
Generation: We must show that SymmR[x1 , · · · , xk ] ⊂ R[e1 , · · · , ek ]. Define the total degree of a non-zero polynomial in R[x1 , · · · , xk ] to be the
maximum sum of exponents of xi ’s which occurs in a monomial with nonzero coefficient in the polynomial. Suppose, for a contradiction, that g ∈
SymmR[x1 , · · · , xk ] is a non-constant polynomial of least total degree not
lying in R[e1 , · · · , ek ]. Then φ(g) ∈ SymmR[x1 , · · · , xk−1 ], so by the inductive hypothesis, φ(g) = f0 (ē1 , · · · , ēk−1 ) for some f0 . Regarding f0 as
a polynomial in k variables (constant with respect to the last variable),
let g1 = g − f0 (e1 , · · · , ek ) ∈ SymmR[x1 , · · · , xk ]. For 1 ≤ j ≤ k, let
ψj : R[x1 , · · · , xk ] → R[x1 , · · · , xk ] be the map obtained by setting xj equal
to zero. Then ψk (g1 ) = 0 since φ(g1 ) = 0. Since g1 is symmetric, we have
ψj (g1 ) = 0 for all j. But, if

g1 = ∑ a_{i1 ,···,ik} x1^{i1} · · · xk^{ik} ,

then

ψj (g1 ) = ∑_{ij =0} a_{i1 ,···,ik} x1^{i1} · · · xk^{ik} .

Thus a_{i1 ,···,ik} = 0 if ij = 0 for some j, i.e. every non-zero term in g1 involves
each xj to a positive power. Thus g1 = x1 x2 · · · xk g2 = ek g2 for some g2 . Now
g2 is symmetric. Also
tot.deg.f0 (e1 , · · · , ek ) = tot.deg.f0 (ē1 , · · · , ēk−1 ) = tot.deg.φ(g) .
But tot.deg.φ(g) ≤ tot.deg.(g). Thus
tot.deg.g1 ≤ max{tot.deg.g, tot.deg.f0 (e1 , · · · , ek )} ≤ tot.deg.g .
So tot.deg.g2 = tot.deg.(g1 ) − k < tot.deg.g. By the minimality of tot.deg.g,
we get g2 = f2 (e1 , · · · , ek ) for some f2 . Thus
g = g1 + f0 (e1 , · · · , ek ) = ek g2 + f0 (e1 , · · · , ek )
= ek f2 (e1 , · · · , ek ) + f0 (e1 , · · · , ek ) .
So g ∈ R[e1 , · · · , ek ], completing the proof.
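Theorem 22.3 promises such an expression for every symmetric polynomial. For instance, the power sum x1^2 + · · · + xk^2 is symmetric, and its expression in the ei ’s is the classical e1^2 − 2e2 . The sketch below spot-checks this identity at integer points (a check, not a proof; math.prod does the products):

```python
import math
from itertools import combinations

def e(values, i):
    # i-th elementary symmetric polynomial, evaluated at the given numbers
    return sum(math.prod(c) for c in combinations(values, i))

# p_2 = x_1^2 + ... + x_k^2 is symmetric; by 22.3 it is a polynomial
# in e_1, ..., e_k -- here p_2 = e_1^2 - 2*e_2.
for values in [(1, 2, 3), (4, -1, 7, 2), (5,)]:
    p2 = sum(x * x for x in values)
    assert p2 == e(values, 1) ** 2 - 2 * e(values, 2)
```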
Theorem 22.4. If F is a field, and f ∈ F [t] is a monic polynomial, in the transcendental t, of degree n with roots a1 , a2 , · · · , an (repeats possible) in F , then the coefficient of t^i in f is (−1)^{n−i} en−i (a1 , · · · , an ).
Proof. Let φ : F [x1 , · · · , xn ][t] → F [t] be the ring morphism defined by φ( ∑ fi t^i ) = ∑ fi (−a1 , −a2 , · · · , −an ) t^i . Then, working in the field of fractions of F [x1 , · · · , xn ][t], we get

f = ∏_{i=1}^{n} (t − ai ) = φ[ ∏_{i=1}^{n} (t + xi ) ] = φ[ t^n ∏_{i=1}^{n} (1 + t^{−1} xi ) ]

= φ[ t^n ∑_{i=0}^{n} en−i (x1 , · · · , xn )(t^{−1})^{n−i} ] = ∑_{i=0}^{n} en−i (−a1 , · · · , −an ) t^i

= ∑_{i=0}^{n} (−1)^{n−i} en−i (a1 , · · · , an ) t^i ,
as required.
Note. This gives a formula for getting the coefficients, given the roots.
Going the other way is a different kettle of fish, as the next 60 or so pages
will demonstrate.
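The easy direction of the Note, roots to coefficients, is mechanical enough to check computationally with integer roots. A sketch (helper names invented):

```python
import math
from itertools import combinations

def monic_from_roots(roots):
    """Coefficients [c_0, ..., c_n] of prod_i (t - a_i), with c_n = 1."""
    coeffs = [1]
    for a in roots:
        new = [0] + coeffs            # multiply by t ...
        for i in range(len(coeffs)):
            new[i] -= a * coeffs[i]   # ... and subtract a * (old polynomial)
        coeffs = new
    return coeffs

def e_sym(values, i):
    return sum(math.prod(c) for c in combinations(values, i))

roots = [1, 2, 3]
c = monic_from_roots(roots)           # t^3 - 6t^2 + 11t - 6
assert c == [-6, 11, -6, 1]
n = len(roots)
for i in range(n + 1):
    # Theorem 22.4: coefficient of t^i is (-1)^(n-i) * e_{n-i}(roots)
    assert c[i] == (-1) ** (n - i) * e_sym(roots, n - i)
```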
Exercise 22C. Prove that 22.4 holds more generally with coefficients in
any commutative ring.
Remark. There is a method for making sense of a ‘limit’ of the rings
SymmR[x1 , · · · , xk ], as the number of variables, k, tends to infinity. This
produces a very important ring, which is often called the ring of symmetric
functions, although its elements resemble functions even less than in the case
of finitely many variables. See the first chapter of Macdonald for a veritable
feast of information on this ring and its connections to classical algebra and
combinatorics.
Exercise 22D. In R[x1 , · · · , xk ], define ∆ := ∏_{i<j} (xi − xj ) . Let γ ∈ Sk . Prove that

φγ (∆) = sign(γ)∆ .
In many books, the proof that the sign function is well defined uses this
‘partially symmetric’ polynomial ∆ (also called an alternating polynomial,
or an alternating function, because the subgroup which fixes it is Ak ), instead
of using the closely related direct formula for the sign which we used in 1.3.
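Exercise 22D can be probed numerically: evaluate ∆ at distinct sample values, permute them, and compare with the sign computed by counting inversions (the direct formula referred to above). A sketch with invented helper names:

```python
from itertools import permutations

def sign(gamma):
    # sign via counting inversions -- the direct formula of 1.3
    k = len(gamma)
    inv = sum(1 for i in range(k) for j in range(i + 1, k) if gamma[i] > gamma[j])
    return -1 if inv % 2 else 1

def delta(values):
    # Delta = prod_{i<j} (x_i - x_j), evaluated at the given numbers
    d = 1
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            d *= values[i] - values[j]
    return d

xs = (2, 5, 11, 17)                   # distinct sample values for x_1, ..., x_4
for gamma in permutations(range(4)):
    permuted = tuple(xs[gamma[j]] for j in range(4))
    # phi_gamma(Delta) evaluated at xs is Delta evaluated at the permuted values
    assert delta(permuted) == sign(gamma) * delta(xs)
```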
III. Basic Field Theory.
Sections 23 to 34 give the material which is crucial to the Galois theory in the following group of sections, where we prove the famous results
about non-solvability of polynomial equations. The sections here are essentially that portion of elementary field theory which doesn’t use anything at
all substantial about groups. We give a review without proofs of the linear
algebra needed, as well as the application of field theory to the classical problems concerning construction of figures using only straight-edge and compass
(Section 27). One exception here is part of Gauss’ theorem concerning which
regular n-gons can be constructed. He did it without Galois theory, but for
us it’s easier to delay the last step and use the Galois correspondence. The
second notable result here is the complete classification of finite fields, up to
isomorphism, done in Section 31. This is much in contrast to the case of
finite groups, where classification still seems to lie far in the future. (Perhaps
one of you will do it?)
23. Prime fields and characteristic.
Definitions. Let b ∈ F , a field. If n ∈ Z, define n · b as follows:
n · b := b + b + · · · + b (“n” times) if n > 0 ;
0 · b := 0 ;
n · b := − [(−n) · (b)] = (−n) · (−b) if n < 0 .
The characteristic of F is zero, ch(F ) := 0, if and only if n · 1 ≠ 0 for all n > 0. If ch(F ) ≠ 0, define

ch(F ) := min{ n : n · 1 = 0 , n > 0 } .
The prime subfield of F , denoted P , is the intersection of all subfields of F .
Clearly P is a subfield of every subfield of F .
Theorem 23.1. Assume that ch(F ) = 0. Then n · b ≠ 0 for any b ≠ 0 and n ≠ 0. In this case, P ≅ Q.
Theorem 23.2. Assume that ch(F ) = p > 0. Then p is a prime integer. If b ≠ 0, then n · b = 0 if and only if p divides n. In this case, P ≅ Zp .
Proof of both. Define φ : Z → F by φ(n) := n · 1 . Then φ is a ring
morphism, so Imφ is an integral domain, by 14.1 and 14.4. Thus Kerφ is a
prime ideal in Z by 19.1a). Therefore, by 19.2a):
either Ker φ = {0}, i.e. n · 1 = 0 implies that n = 0, and so ch(F ) = 0;
or else Kerφ = pZ for some prime p.
In the latter case, n · 1 = 0 ⇐⇒ p divides n.
In both cases, if b ≠ 0, then n · b = 0 ⇐⇒ n · 1 = (n · b)(b^{−1}) = 0.
So if ch(F ) = 0, then n · b = 0 ⇐⇒ n = 0.
And if ch(F ) = p, then n · b = 0 ⇐⇒ p divides n. So by the first isomorphism theorem, Imφ ≅ Zp . Thus Imφ is a subfield of F , so P is contained in Imφ. But Zp has no proper subfields, so P = Imφ, and P ≅ Zp , as required.
Finally, if ch(F ) = 0, define ψ : Q → F by ψ(m/n) := φ(n)^{−1} φ(m). Then ψ is well defined, partly because φ(n) ≠ 0 if n ≠ 0. It is a ring morphism between fields, so is injective. Thus Imψ ≅ Q, so Imψ is a subfield of F and therefore contains P . The reverse inclusion follows because Q contains no proper subfields, and therefore neither does any field isomorphic to Q such as Imψ. More directly we can check this last point by noting that, for integers m and n, we have 1 ∈ P , and therefore m · 1 = φ(m) ∈ P . Thus φ(n)^{−1} ∈ P , and so φ(n)^{−1} φ(m) ∈ P . Hence P = Imψ, and so P ≅ Q, as required.
Note. Since any field has only the two extreme ideals, and a morphism between fields, F → F ′, cannot have F as kernel (since it maps 1 to 1), it has {0} as kernel, so is injective. We’ll call it a field map. It provides an isomorphism of F with a subfield of F ′.
Exercises 23A. Show that ch(F ) = 0 implies that F is infinite. Is the converse true? Prove that, if ch(F ) ≠ ch(F ′), then there are no field maps F → F ′.
Tedious(?) Exercise 23B. Rewrite the proof of 23.1/23.2 avoiding
the use of the results on prime and maximal ideals in Section 19.
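For F = Zp the objects in this section are small enough to compute with directly. A sketch for p = 7, with an invented helper n_dot for the n · b operation:

```python
def n_dot(n, b, p):
    """n . b = b + b + ... + b (n times) in the field Z_p."""
    return (n * b) % p

p = 7
# ch(Z_7): the least n > 0 with n . 1 = 0 is 7 itself
ch = next(n for n in range(1, p + 1) if n_dot(n, 1, p) == 0)
assert ch == p

# For b != 0: n . b = 0 exactly when p divides n, as in Theorem 23.2
for b in range(1, p):
    for n in range(1, 3 * p):
        assert (n_dot(n, b, p) == 0) == (n % p == 0)
```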
24. Simple extensions.
Definition. Let a ∈ K, an extension field of F . The field generated by
F and a, denoted F (a), is the intersection of all those subfields of K which
contain F ∪ {a}.
Note that F [a] ⊂ F (a), but possibly F [a] is not a field, in which case it
is a proper subring of F (a).
If K = F (a) for some a ∈ K, then K will be called a simple extension of
F.
Theorem 24.1. Suppose that a is transcendental over F . Then there is a unique isomorphism ψ : F (x) → F (a) for which both ψ(b) = b for all b ∈ F and ψ(x) = a. In this case, F [a] ≠ F (a).
(Recall that F (x) is the field of rational functions over F , i.e. the field of
fractions of the polynomial ring—so there is no conflict of notation between
this earlier use of F (x) and the use introduced in this section.)
Definition. In this case, F (a) is called a simple transcendental extension
of F .
Theorem 24.2. Suppose that a is algebraic over F . Then there is a
unique monic irreducible f0 ∈ F [x] such that there exists an isomorphism
ψ : F [x]/f0 F [x] −→ F (a)
for which both ψ(b + f0 F [x]) = b for all b ∈ F and ψ(x + f0 F [x]) = a. In this
case, ψ is also unique, and F [a] = F (a).
Definition. In this case, F (a) is called a simple algebraic extension of
F . Also f0 is called the minimal polynomial of a over F . It is the monic
polynomial of least degree such that f0 (a) = 0, as we shall see in the proof.
Its degree is called also the degree of a over F .
Proof of both. Uniqueness of ψ: In 24.1, the conditions on ψ imply that

ψ( ∑ bi x^i / ∑ ci x^i ) = ψ( ∑ bi x^i ) ψ( ∑ ci x^i )^{−1}
= ( ∑ ψ(bi )ψ(x)^i )( ∑ ψ(ci )ψ(x)^i )^{−1} = ( ∑ bi a^i )( ∑ ci a^i )^{−1} ,

so ψ is unique. In 24.2 we have

ψ( ∑ bi x^i + f0 F [x] ) = ∑ ψ(bi + f0 F [x]) ψ(x + f0 F [x])^i = ∑ bi a^i ,

so ψ is unique. [Using these formulae, we could now verify the properties, but we’ll proceed differently.]
Existence. (Notice the analogies between this proof and the proof of 23.1/23.2.)
Let φ : F [x] → K be the morphism defined by setting φ(f ) := f (a). Then
F [a] = Imφ, which is an integral domain, by 14.1 and 14.2. Thus Kerφ is a
prime ideal in F [x], by 19.1a). Hence Kerφ is either {0} or f0 F [x] for some
irreducible f0 in F [x], by 19.2a). Now
φ is injective ⇐⇒ [f (a) = 0 ⇒ f = 0] ⇐⇒ a is transcendental .
If a is transcendental, let ψ : F (x) → F (a) be ψ(f /g) = φ(g)−1 φ(f ). Then
ψ is well-defined and is a non-zero morphism between fields, so is injective.
Because Imψ contains F ∪ {a} and is a subfield of F (a), it is clear that
Imψ = F (a). Thus ψ is also surjective, as required. Since F [a] = ψ(F [x])
and F [x] ≠ F (x), we have F [a] ≠ F (a).
Now assume that a is algebraic, and so Kerφ is a prime ideal generated
by a monic irreducible f0 . By the first isomorphism theorem, there exists
an isomorphism ψ from F [x]/f0 F [x] to Imφ with the given properties. Now
Imφ is a subfield of K containing F ∪ {a}, so F (a) ⊂ Imφ. But Imφ =
F [a] ⊂ F (a). Thus Imφ = F (a) = F [a]. Finally any monic irreducible f
such that f (a) = 0 must have GCD{f, f0 } of positive degree (since a is a
root of GCD{f, f0 } = rf + sf0 for some r, s), so f = f0 . Thus f0 is unique.
Tedious(?) Exercise 24A. Rewrite the proof of 24.1/24.2 avoiding
the use of the results on prime and maximal ideals in Section 19.
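The equality F [a] = F (a) in 24.2 is, concretely, a statement about inverting: since GCD{u, f0 } is a nonzero constant for any u ≠ 0 of degree less than deg f0 , the extended Euclidean algorithm expresses 1/u as a polynomial in a. A Python sketch for F = Q and a = 2^{1/3} (all helper names are ad hoc, not from the text):

```python
from fractions import Fraction as Fr

# Sketch of 24.2 with F = Q, a = 2**(1/3), f0 = x^3 - 2: every nonzero
# element of Q[a] is invertible, so Q[a] = Q(a).  Polynomials over Q are
# coefficient lists [c0, c1, ...]; all helper names here are ad hoc.

def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def mul(p, q):
    r = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def sub(p, q):
    r = [Fr(0)] * max(len(p), len(q))
    for i, a in enumerate(p):
        r[i] += a
    for i, b in enumerate(q):
        r[i] -= b
    return trim(r)

def divmod_poly(p, q):
    p = trim(p[:])
    quo = [Fr(0)] * max(1, len(p) - len(q) + 1)
    while len(p) >= len(q) and any(p):
        c, d = p[-1] / q[-1], len(p) - len(q)
        quo[d] = c
        for i, b in enumerate(q):
            p[d + i] -= c * b
        trim(p)
    return trim(quo), p

def egcd(a, b):
    """Extended Euclid in Q[x]: returns (g, s, t) with s*a + t*b = g."""
    if b == [Fr(0)]:
        return a, [Fr(1)], [Fr(0)]
    q, r = divmod_poly(a, b)
    g, s, t = egcd(b, r)
    return g, t, sub(s, mul(t, q))

f0 = [Fr(-2), Fr(0), Fr(0), Fr(1)]   # x^3 - 2, the minimal polynomial of a
u = [Fr(1), Fr(1)]                   # the element 1 + a

g, s, t = egcd(u, f0)                # s*u + t*f0 = g, a nonzero constant
inv = [c / g[0] for c in s]          # hence inv * u = 1 in Q[x]/(f0)
_, rem = divmod_poly(mul(inv, u), f0)
assert rem == [Fr(1)]
assert inv == [Fr(1, 3), Fr(-1, 3), Fr(1, 3)]   # (1+a)^(-1) = (1 - a + a^2)/3
```

The final assertion matches the hand computation (1 + a)(1 − a + a^2) = 1 + a^3 = 3.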
25. Review of vector spaces and linear maps.
Let F be any field. An F-vector space is a set V together with
i) an addition + on V such that (V, +) is an abelian group;
ii) a scalar multiplication F × V → V , (α, v) ↦ α · v ;
such that for all α, β in F and v, v1 , v2 in V , we have
(α + β) · v = α · v + β · v ;
α · (v1 + v2 ) = α · v1 + α · v2 ;
α · (β · v) = (αβ) · v ;
1 · v = v .
A linear combination from a set S ⊂ V is any element of the form

∑_{s∈S} αs · s ,

where { s ∈ S | αs ≠ 0 } is finite.
S generates V ⇐⇒ every element is a linear combination from S ;
S is linearly independent ⇐⇒ [ ∑_{s∈S} αs · s = 0 ⇒ αs = 0 ∀s ] ;
S is a basis for V ⇐⇒ S is linearly independent and generates V .
Theorem 25.1. (i) Any generating set contains a basis.
(ii) Any linearly independent set is contained in some basis.
(iii) Any two bases for V ‘are in 1-1 correspondence’.
Note. Using either (i), since V generates V , or (ii), since the empty set
is linearly independent, we see that V has a basis. The number of elements
in any basis, well defined by (iii), is the dimension of V , denoted dimV , and
is either a non-negative integer, or ∞ (or rather an infinite cardinal, for the
sophisticated).
Definition. A map φ : V → W between F -vector spaces is linear if and
only if
φ(v1 + v2 ) = φ(v1 ) + φ(v2 )
and
φ(α · v) = α · φ(v)
for all α ∈ F and v1 , v2 , v ∈ V .
Proposition 25.2. Assume that φ is linear. Then

φ( ∑_{s∈S} αs · s ) = ∑_{s∈S} αs · φ(s) .
If φ is surjective and S generates V , then φ(S) generates W . If φ is injective
and S is linearly independent, then φ(S) is linearly independent. If φ is
bijective and S is a basis for V , then φ(S) is a basis for W .
Definition. V ≅ W as F -vector spaces if and only if there is a bijective linear φ : V → W .
Corollary. V ≅ W =⇒ dim V = dim W as (possibly infinite) cardinals.
Theorem 25.3. Conversely, dim V = dim W < ∞ =⇒ V ≅ W .
Note. We won’t need it here, but 25.3 holds without requiring finite
dimensionality, as long as we interpret dimension as a cardinal. A countable dimensional vector space will not be isomorphic to a vector space whose
dimension is some uncountable cardinal. As a Q-vector space, R has uncountable dimension.
26. The degree of an extension.
Let F be any field.
Theorem 26.1. If K is an extension of F , then the addition in K,
together with the scalar multiplication µ : F × K → K, µ(b, a) := ba, gives
K the structure of a vector space with scalars in F .
Proof. Clearly K is an abelian group under addition. Furthermore, the
laws given in the second part of the definition of vector space all hold, since
K is a ring, and those laws are part of the definition of ring.
Definitions. The degree of K over F , which is denoted as [K : F ], is the
dimension of K as a vector space over F . Possibly [K : F ] = ∞. Here we
won’t need to distinguish between different infinite cardinals. In this case,
K is an infinite extension of F (which is saying much more than just that
K is infinite as a set). Otherwise, i.e. if [K : F ] is finite, we say that K is
a finite extension of F (even though K is not finite as a set unless F is, in
this case).
Remark. It is clear now what the additive group structure of a field is; i.e. as a group under +, we have K ≅ P^{[K:P ]} ≅ Q^n or Zp^n for some (possibly infinite) cardinal n. Once the characteristic and the degree over the prime field are known, any additional information about a field necessarily involves its multiplication.
Theorem 26.2. Let a ∈ K, an extension of F . Then the following hold.
(i) [F (a) : F ] = ∞ ⇐⇒ a is transcendental over F .
(ii) [F (a) : F ] is finite ⇐⇒ a is algebraic over F . In this case, the
degree [F (a) : F ] is also the degree of the minimal polynomial of a, i.e. the
degree of a over F , and { 1, a, a2 , · · · , an−1 } is a basis for F (a) as an
F -vector space, where n := [F (a) : F ].
Proof. We prove the implications ⇐= , since, for example, (i)⇒ is the
same as (ii)⇐ .
(i) {a is transcendental over F } =⇒ {F (a) ≅ F (x) by a ring isomorphism mapping all elements of F to themselves }
=⇒ {F (a) ≅ F (x) by an isomorphism of F -vector spaces }
=⇒ F (a) and F (x) have the same dimension over F ,
i.e. [F (a) : F ] = [F (x) : F ] = ∞. The last equality follows from the fact that
F (x) contains a subspace F [x] which is easily seen to be infinite dimensional.
Challenge. Can you actually write down a basis over F for the vector
space F (x)?? At least, can you show that its dimension is the smallest infinite
cardinal, i.e. ‘countable’ ? Think about partial fractions! While you’re at it,
why not formulate and prove a theorem on partial fraction decomposition in
F (x), drawing on your experience learning methods of integration?
(ii) {a is algebraic over F } =⇒ {F (a) ≅ F [x]/f0 F [x] by an isomorphism of F -vector spaces, where f0 is the minimal polynomial of a} =⇒

[F (a) : F ] = [(F [x]/f0 F [x]) : F ] = deg f0 .
If n := degf0 , then
{ 1 + f0 F [x], x + f0 F [x], x2 + f0 F [x], · · · · · · , xn−1 + f0 F [x] }
is a basis for F [x]/f0 F [x]. Thus the image of that set under the isomorphism,
namely { 1, a, · · · , an−1 }, is a basis for F (a).
Theorem 26.3. Given an iterated field extension F ⊂ E ⊂ K, we have
[K : F ] = [K : E] [E : F ] .
In fact, if {aλ }λ∈L is an indexed basis for E as an F -vector space, and
{bµ }µ∈M is an indexed basis for K as an E-vector space, then {aλ bµ }(λ,µ)∈L×M
is an indexed basis for K as an F -vector space.
Proof. The first sentence follows from the second, since [E : F ] is the
cardinality of L, and [K : E] = cardM , so [K : F ] would be, as required,
card(L × M ) = (cardL)(cardM ).
(i) Every element of K is a linear combination of {aλ bµ } with coefficients
in F : Let c ∈ K. Since {bµ } is a basis for K over E, there exist eµ ∈ E such
that c = ∑_{µ∈M} eµ bµ . Since {aλ } is a basis for E over F , for all µ there exist fµλ ∈ F such that eµ = ∑_{λ∈L} fµλ aλ . Then

c = ∑_{(λ,µ)∈L×M} fµλ aλ bµ ,

as required.
A field full of eµ ’s will excite the naturalist even more than the
algebraist—to say nothing of ν’s.
(ii) The set {aλ bµ } is linearly independent over F : Suppose that

∑_{µ∈M} ( ∑_{λ∈L} fµλ aλ ) bµ = 0 .

Since ∑_{λ∈L} fµλ aλ ∈ E, and {bµ } is linearly independent over E, it follows that ∑_{λ∈L} fµλ aλ = 0 for all µ ∈ M . Since {aλ } is linearly independent over F , it now follows that fµλ = 0 for all µ and λ, as required.
Corollary 26.4. If F ⊂ E ⊂ K are field extensions, then (K is a finite
extension of F ) ⇐⇒ (both K is a finite extension of E, and E is a finite
extension of F ) . In this case, both [E : F ] and [K : E] are divisors of
[K : F ].
Corollary 26.5. If a ∈ K, a finite extension of F , then a is algebraic
over F , and the degree of a over F divides [K : F ].
Proof. Let E = F (a) in 26.4 and apply 26.2.
Definition. If {a1 , · · · , an } ⊂ K ⊃ F , define F (a1 , · · · , an ) to be the
intersection of all subfields of K which contain F ∪ {a1 , · · · , an }. It is called
the field generated by {a1 , · · · , an } over F , and is in fact the set
{ g(a1 , · · · , an )^{−1} f (a1 , · · · , an ) : f, g ∈ F [x1 , · · · , xn ] ; g(a1 , · · · , an ) ≠ 0 } .
It is easily seen that
F (a1 , · · · , ai )(ai+1 , · · · , an ) = F (a1 , · · · · · · , an )
for 1 ≤ i ≤ n.
Theorem 26.6. Let K be a finite extension of F . Then there exists a
positive integer n and a set { a1 , · · · , an } ⊂ K such that
F ⊂ F (a1 ) ⊂ F (a1 , a2 ) ⊂ · · · ⊂ F (a1 , · · · , an ) = K .
Only the last equality is at issue, but we want to emphasize the tower of fields.
Note. In a sense, this determines the structure of finite extensions, since
at each stage F (a1 , · · · , ai ) = F (a1 , · · · , ai−1 )(ai ) is a simple algebraic extension of the previous stage F (a1 , · · · , ai−1 ); and the structure of simple
algebraic extensions was analysed in 24.2 and 26.2. The element ai is even
algebraic over F , by 26.5.
Proof. Proceed by induction on [K : F ], simultaneously for all K and
F . If [K : F ] = 1, any n and ai will do, since K = F , and each inclusion is
actually equality. For the inductive step, let [K : F ] > 1, so that K ≠ F .
Choose a1 ∈ K \ F . Then

[K : F ] = [K : F (a1 )][F (a1 ) : F ] ,

and a1 ∉ F , so F (a1 ) ≠ F . Thus [F (a1 ) : F ] > 1. It follows that
[K : F (a1 )] < [K : F ], and so we may apply the inductive hypothesis to the extension K of F (a1 ). Thus there exist a2 , · · · , an such that
K = F (a1 )(a2 , · · · , an ) = F (a1 , · · · , an ), as required.
Exercise 26A. Suppose that a ∈ K ⊃ E ⊃ F . Show that
[E(a) : E] ≤ [F (a) : F ] .
Give an example where the inequality is strict.
The following calculations will probably need at least 26A and the fact
that xn − k is irreducible in Q[x] for n > 0 when the integer k is divisible
by some prime but not by its square. This fact follows immediately from
Eisenstein’s criterion given in 20A.
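The Eisenstein condition itself is mechanical to check on integer coefficient lists. The sketch below (helper names invented) confirms it for x^4 − 10, and shows it failing, as it must, for the reducible x^4 − 4:

```python
def eisenstein_applies(coeffs, p):
    """Eisenstein at the prime p, for integer coefficients [c_0, ..., c_n]:
    p does not divide c_n, p divides every other c_i, p^2 does not divide c_0."""
    return (coeffs[-1] % p != 0
            and all(c % p == 0 for c in coeffs[:-1])
            and coeffs[0] % (p * p) != 0)

def primes_up_to(n):
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

# x^4 - 10: 10 = 2 * 5 is divisible by 2 but not by 4, so Eisenstein
# applies at p = 2, and x^4 - 10 is irreducible in Q[x]
assert any(eisenstein_applies([-10, 0, 0, 0, 1], p) for p in primes_up_to(10))
# x^4 - 4 = (x^2 - 2)(x^2 + 2) is reducible; Eisenstein finds no prime
assert not any(eisenstein_applies([-4, 0, 0, 0, 1], p) for p in primes_up_to(100))
```

Failure of the criterion at every prime does not by itself prove reducibility; here we know x^4 − 4 factors for other reasons.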
√
√
Exercise 26B. Calculate [Q( 2, 3 5) : Q].
√ √
Exercise 26C. Calculate [Q( 2, 5) : Q].
√
√
Exercise 26D. Calculate [Q(4 2, 3 2) : Q].
√
√
Exercise 26E. Calculate [Q(4 2, 6 2) : Q].
√
√
Exercise 26F. Calculate [Q(4 5, 6 7) : Q].
√
Exercise 26G. Calculate [Q(3 2, e2πi/3 ) : Q]. Show that the bigger
field is also the field generated over Q by all the complex roots of x3 − 2.
√
Exercise 26H. Calculate [Q( 2, e2πi/3 ) : Q].
Exercise 26I. Calculate [Q(e2πi/8 ) : Q], and [K : Q] where K is the
field generated over Q by all the complex roots of x8 − 1.
Exercise 26J. Calculate [Q(e2πi/12 ) : Q], and [L : Q] where L is the
field generated over Q by all the complex roots of x12 − 1.
27. Straight-edge & compass constructions.
The reader may prefer to skip this section, and return to it only after
Section 33, or even 39, since we have left the proofs of a couple of later
results here to those sections. We have included this section here to make
it clear that the proofs of such things as the impossibility of angle trisection
and of construction of certain other figures depend only on the elementary
field theory that we’ve developed so far.
The main work will be to consider the set of all points which can be
produced in finitely many steps starting from two initial points, where each
Straight-edge & compass constructions
100
step is the use of a straight-edge to draw a line or of a compass to draw a
circle. New points are produced as the intersections of these curves. We want
to give an algebraic characterization of the coordinates of such ‘constructible
points’.
We’ll assume that the reader has already seen in kindergarten geometry
how to make certain basic constructions such as perpendicular bisectors,
angle bisection, and getting the line parallel to a given line through a given
point.
All points, lines and circles below are in the plane R2 . Starting from two
points, define inductively sets of constructible points P, lines L, and circles
C, as follows.
P := ∪k≥0 Pk ⊂ R2 ; L := ∪k≥0 Lk ; C := ∪k≥0 Ck ; where :
L0 := empty set,
Lk := Lk−1 ∪ [ set of lines joining pairs of points in Pk−1 ] ;
C0 := empty set,
Ck := Ck−1 ∪ [ circles centred in Pk−1 and radius a distance between
a pair of points in Pk−1 ] ;
P0 := { (0, 0) , (1, 0) },
Pk := Pk−1 ∪ [intersection points of lines/circles in Ck ∪ Lk ].
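Each step Pk−1 → Pk only ever solves linear and quadratic equations; in particular, intersecting two circles reduces, by subtracting their equations, to a line and a circle. A numerical sketch of that case, applied to the two unit circles available at the very first step (the function name is invented):

```python
import math

def circle_circle(c1, r1, c2, r2):
    """Intersection points of two (intersecting) circles.  Subtracting the two
    circle equations leaves one linear equation (the common chord's line), so
    the 'pair of quadratics' case reduces to a linear and a quadratic."""
    (x1, y1), (x2, y2) = c1, c2
    dx, dy = x2 - x1, y2 - y1
    d = math.hypot(dx, dy)
    a = (d * d + r1 * r1 - r2 * r2) / (2 * d)   # distance from c1 to the chord
    h = math.sqrt(r1 * r1 - a * a)              # half-length of the common chord
    mx, my = x1 + a * dx / d, y1 + a * dy / d
    return [(mx - h * dy / d, my + h * dx / d),
            (mx + h * dy / d, my - h * dx / d)]

# First step from P_0 = {(0,0), (1,0)}: the two unit circles meet at
# (1/2, +-sqrt(3)/2), the new points of P_1 off the x-axis
pts = circle_circle((0.0, 0.0), 1.0, (1.0, 0.0), 1.0)
for (x, y) in pts:
    assert abs(x - 0.5) < 1e-12 and abs(abs(y) - math.sqrt(3) / 2) < 1e-12
```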
Let Quad ⊂ R be the set of all the coordinates of all the points in P. It
is easily seen that α ∈ Quad if and only if |α| is a distance between points
in P.
Proposition 27.1. Quad is a subfield of R.
Proof. Closure under the four field operations follows in general quite
easily from the case of positive reals. Given such positive a and b in Quad,
ponder the following diagram to see closure under multiplication and inversion (addition and subtraction are very easy) :
[Figure: two similar-triangle diagrams drawn with parallel lines over a common baseline; the first marks segments of lengths 1, a and 1/a, the second marks segments of lengths 1, a, b and ab, giving straight-edge and compass constructions of 1/a and ab.]
Definition. Let Fk be the field generated by all the coordinates of all
the points in Pk .
Proposition 27.2. [Fk : Q] is a power of 2.
Proof. Since
[Fk : Q] = [Fk : Fk−1 ][Fk−1 : Fk−2 ] · · · [F2 : F1 ][F1 : Q] ,
it suffices to show that each [Fi+1 : Fi ] is a power of 2 (where F0 = Q when
i = 0). If a1 , a2 , · · · , ar are the coordinates of all the points in Pi+1 , then
Fi+1 = Fi (a1 , a2 , · · · , ar ), so we think of getting to Fi+1 from Fi by a tower
as in 26.6. But, by 26A,
[Fi (a1 , a2 , · · · , aj ) : Fi (a1 , a2 , · · · , aj−1 )] ≤ [Fi (aj ) : Fi ] ,
so it suffices to show that [Fi (aj ) : Fi ]= 1 or 2 for all i and j. But this is
clear, since either aj ∈ Fi , or else aj is a solution to one of : a pair of linear,
a linear and a quadratic, or a pair of quadratic equations with coefficients
in Fi , the latter case being reducible to a linear and quadratic. This follows
from the kindergarten procedures for finding intersection points of lines and
circles with one another.
Main Theorem 27.3. If α ∈ Quad, then the degree, [Q(α) : Q], is a
power of 2.
Proof. If α ∈ Quad = ∪k≥0 Fk , then α ∈ Fk for some k. This gives the
existence of a k with Q(α) ⊂ Fk . Hence [Q(α) : Q] divides [Fk : Q] , which
is a power of 2, as required.
Applications. 1. General angle trisection is impossible, since cos(π/9) has degree 3 over Q; its minimal polynomial is x^3 − (3/4)x − (1/8). [Use trigonometric identities, e.g. for cos(3θ).]
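Both claims in Application 1 can be checked mechanically: the cubic relation numerically, and the absence of rational roots (hence irreducibility of a cubic over Q) by the rational root test. A sketch:

```python
import math
from fractions import Fraction

c = math.cos(math.pi / 9)
# cos(3t) = 4cos^3 t - 3cos t with t = pi/9 gives 4c^3 - 3c = 1/2,
# i.e. c is a root of x^3 - (3/4)x - 1/8
assert abs(c ** 3 - 0.75 * c - 0.125) < 1e-12

# Rational root test on 8x^3 - 6x - 1: any rational root has the form
# +-1/d with d dividing 8; none works, so the cubic has no rational root
# and, being a cubic, is irreducible over Q
candidates = [Fraction(s, d) for s in (1, -1) for d in (1, 2, 4, 8)]
assert all(8 * r ** 3 - 6 * r - 1 != 0 for r in candidates)
# hence [Q(cos(pi/9)) : Q] = 3, which is not a power of 2
```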
2. Hence a regular 9-gon cannot be constructed. Gauss completely determined for which n the regular n-gon is constructible; see sections 33 and
39 later in this book.
3. ‘Doubling the cube’ is impossible, since [Q(∛2) : Q] = 3.
4. See the comment after A7 below concerning the impossibility of ‘squaring the circle’.
Better Theorem 27.4. A real number α is in Quad if and only if there
exist fields
K0 = Q ⊂ K1 ⊂ K2 ⊂ · · · ⊂ Kr = Q(α)
such that [Ki+1 : Ki ] = 2 for all i. But there exist α with [Q(α) : Q] equal to a power of 2 and α ∉ Quad; for example, when α is equal to either real root of x^4 − 2x − 2.
Proof of the first statement. In the proof of 27.2, we saw how to fit
a tower of fields between Fi+1 and Fi , in which each extension has degree 2.
Given α ∈ Quad, choose k with α ∈ Fk and juxtapose all these towers for
0 ≤ i < k. Now intersect Q(α) with each field in the last tower.
Exercise 27A. Show that each extension in this new tower has
degree at most 2.
After removing any repeats, this produces a tower as required.
Conversely, given a tower as in the statement, we can see that α is in
Quad by showing that if K is a subfield of Quad, then so is any degree 2
extension L of K which consists of real numbers.
Exercise 27B. i) Show that, except in characteristic 2, any field extension of degree 2 has the form F (√β) ⊃ F , i.e. the form F (α) ⊃ F where α^2 ∈ F .
ii) Show that the exclusion of characteristic 2 is needed.
iii) Show that the analogue for degree 3 and ∛β is false.
To proceed, 27Bi) shows that L has the form K(√β), where β ∈ K is positive. We need only show that √β ∈ Quad, which is clear for any positive β ∈ Quad by drawing a circle of diameter β + 1 (since the chord perpendicular to a diameter, through a point on the diameter of distance 1 from the circular boundary, has length 2√β).
Exercise 27C. Prove this last statement.
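The chord computation amounts to the geometric mean of 1 and β, which the following sketch verifies numerically (the helper name is invented):

```python
import math

def half_chord(beta):
    """Circle of diameter beta + 1; chord perpendicular to the diameter through
    the point at distance 1 from the circular boundary.  Half the chord is the
    geometric mean sqrt(1 * beta)."""
    r = (beta + 1) / 2                # radius
    d = abs(r - 1)                    # distance from the centre to the chord
    return math.sqrt(r * r - d * d)

for beta in [0.25, 2.0, 3.5, 10.0]:
    assert abs(half_chord(beta) - math.sqrt(beta)) < 1e-12
```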
The proof of the second statement now amounts to showing that there
are no fields at all which fit between Q and Q(α), since the latter has degree 4 over the former, the polynomial in the statement being irreducible by
Eisenstein’s criterion (given at the end of Section 20). This proof is done in
detail at the end of Section 39.
28. Roots and splitting fields.
Let F be any field.
Theorem 28.1. If non-constant g ∈ F [x], then there exists a finite
extension K of F , with [K : F ] ≤ deg(g), such that g has a root in K.
Proof. Let p be an irreducible factor of g. Let K = F [x]/pF [x], where
F is identified with the subfield {b + pF [x] : b ∈ F } ⊂ K. Then we have
[K : F ] = deg(p) ≤ deg(g). Let a = x + pF [x]. Then
p(a) = p(x) + pF [x] = 0 + pF [x] ,
so p(a) is zero in K. Thus g(a) = 0 in K, and a is a root of g.
Note. Because of the choice of p, the field containing a root of g is not unique. However, if g is irreducible, deg(g) divides [K : F ] by 26.5, so [K : F ] = deg(g) for any K as in 28.1. Even stronger:
Theorem 28.2. If g ∈ F [x] is irreducible, and â, ã are roots of g in
extensions K̂, K̃ of F , then there exists a unique isomorphism ψ from F (â)
to F (ã) such that ψ maps all elements of F to themselves, and maps â to ã.
(See also 28.5.)
Proof. The polynomial g is (up to associates) the minimal polynomial
of â over F , so by 24.2 there is a unique isomorphism
ψ̂ : F [x]/gF [x] −→ F (â)
with the properties ψ̂(x + gF [x]) = â and ψ̂(b) = b for all b ∈ F . Similarly,
there exists a unique
ψ̃ : F [x]/gF [x] −→ F (ã)
(using 24.2). Let ψ = ψ̃ ◦ (ψ̂)−1 . If ψ1 and ψ2 both had the properties of ψ,
then ψ1 ◦ ψ̂ and ψ2 ◦ ψ̂ would both have the properties of ψ̃, so they agree;
and therefore ψ1 = ψ2 .
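For g = x^2 − 2 over Q, the isomorphism of 28.2 is the familiar conjugation of Q(√2), taking the root â = √2 to the root ã = −√2. A sketch representing p + q√2 by the pair (p, q) and testing the morphism properties at sample points:

```python
import random

# Q(sqrt 2): represent p + q*sqrt(2) by the integer pair (p, q).
def mul2(u, v):
    (p, q), (r, s) = u, v
    return (p * r + 2 * q * s, p * s + q * r)

def psi(u):
    # the map of 28.2 for g = x^2 - 2: psi fixes Q and sends sqrt(2) to -sqrt(2)
    p, q = u
    return (p, -q)

assert mul2((0, 1), (0, 1)) == (2, 0)     # sqrt(2) * sqrt(2) = 2
assert psi((1, 0)) == (1, 0)              # psi fixes F = Q

random.seed(0)
for _ in range(200):
    u = (random.randint(-9, 9), random.randint(-9, 9))
    v = (random.randint(-9, 9), random.randint(-9, 9))
    assert psi(mul2(u, v)) == mul2(psi(u), psi(v))            # multiplicative
    assert psi((u[0] + v[0], u[1] + v[1])) == \
           (psi(u)[0] + psi(v)[0], psi(u)[1] + psi(v)[1])     # additive
```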
Theorem 28.3. Let non-constant g ∈ F [x]. Then there is a finite
extension K of F such that g is a product of linear polynomials in K[x], i.e.
‘K contains all the roots of g’.
Proof. Proceed by induction on deg(g). If deg(g) = 1, let K = F . For
the inductive step, suppose that deg (g) > 1. By 28.1 , there is a finite
extension E of F in which g has a root a. Then g = (x − a)g1 for some
g1 ∈ E[x]. By the inductive hypothesis, since deg(g1 ) is less than deg(g),
there exists a finite extension K of E such that g1 is a product of linear
polynomials in K[x]. Then K is a finite extension of F by 26.4 and g ‘splits’
in K[x].
Definition. If K is an extension of F and g ∈ F [x] is non-constant, say g splits over K if and only if g is a product of linears in K[x]. The field
K is a splitting field for g over F if and only if
(1) g splits over K, and
(2) g doesn’t split over E if F ⊂ E ⊂ K and E %= K.
Theorem 28.4. Any non-constant g ∈ F [x] has a splitting field over F
which is a finite extension of F .
Proof. By 28.3, construct a finite extension K ′ of F such that g splits over K ′. Then g = b(x − a1 )(x − a2 ) · · · (x − an ), where b and each ai are in K ′. Now b ∈ F , since it is the leading coefficient of g. Thus, if F ⊂ K ′′ ⊂ K ′, the polynomial g splits over K ′′ if and only if K ′′ contains ai for all i, using unique factorization in K ′[x]. Now let K either be the intersection of all such K ′′ or else be F (a1 , · · · , an ); the two coincide by the usual arguments (see 6.3 and 16.1) about generating an algebraic object either by combining its elements or by intersecting certain subobjects.
Remarks. This argument shows that inside any extension K ′ of F such that g(x) splits in K ′[x], there is a unique splitting field K over F for g(x). This uniqueness is strong in that K is not just unique up to isomorphism. But it is weak in that we must remain within a fixed extension K ′. Below in 28.9 we shall prove the very important fact that K is unique up to isomorphism, independently of any containing field K ′. The principles 28.5 to 28.8 just below will play important roles later in Galois theory, as well as being useful here in proving the uniqueness of splitting fields.
Before giving the ‘official uniqueness of splitting fields proof’, here is a sketch of a different approach. Given two splitting fields K1 ⊃ F and K2 ⊃ F for g(x) ∈ F [x], one can find a field K ′ and two field maps φ1 : K1 → K ′ and φ2 : K2 → K ′ which agree on F (that is, the diagram formed by the inclusions F ⊂ K1 , F ⊂ K2 and the maps φ1 , φ2 “commutes”). [For the cognoscenti who know tensor products, take K ′ = (K1 ⊗F K2 )/I for a maximal ideal I. See the exercise below.]
Then both φ1 (K1 ) and φ2 (K2 ) are essentially splitting fields for g(x) over F [modulo the fact that F → K ′ isn’t, strictly speaking, an extension in the sense of our definition]. Both are subfields of K ′, so φ1 (K1 ) = φ2 (K2 ) by the ‘strong’ uniqueness discussed above. Thus φ2^{−1} φ1 defines an isomorphism, as required, from K1 to K2 .
Extended Exercise 28A. Learn about the tensor product, ⊗, e.g. in Greub or Atiyah-Macdonald—the basics are in Appendix ⊗ after Section 50—and how R = K1 ⊗F K2 can be made into a commutative ring (even an F -algebra—see Section 52) into which K1 and K2 embed. Alternatively, check that the following messy version works: Pick F -bases B ′ = {b′1 , b′2 , · · · , b′u } and B ′′ = {b′′1 , b′′2 , · · · , b′′t } for K1 and K2 respectively. Write

b′p b′q = Σ_{i=1}^{u} a′i (p, q) b′i   and   b′′r b′′s = Σ_{j=1}^{t} a′′j (r, s) b′′j ,

with a′i (p, q) , a′′j (r, s) in F . Define R to be the F -vector space of dimension ut with basis B ′ × B ′′. Define the multiplication, say ∗, on R by

[ Σ_{(p,r)} cp,r (b′p , b′′r ) ] ∗ [ Σ_{(q,s)} dq,s (b′q , b′′s ) ] := Σ_{(i,j)} [ Σ_{(p,q,r,s)} a′i (p, q) a′′j (r, s) cp,r dq,s ] (b′i , b′′j ) .

The embeddings from K1 and K2 into R will be determined by F -linearity and by specifying b′i ↦ (b′i , 1) and b′′j ↦ (1 , b′′j ).
The first of the following extension principles is essentially a rehash of
28.2.
Definition. Given a field map φ : F → F ∗ and g(x) = Σ ai xi in F [x],
the polynomial g ∗ (x) ∈ F ∗ [x] which corresponds to g(x) is defined to be
g ∗ (x) := Σφ(ai )xi . We are actually dealing with a ring morphism F [x] → F ∗ [x],
but wish to avoid excessive notation.
28.5. Simple Extension Principle. Given φ : F → F ∗ , a field map,
assume that g(x) is irreducible in F [x], and that g ∗ (x) ∈ F ∗ [x] corresponds
to g(x), using φ. Let α and α∗ be roots of g(x) and g ∗ (x) respectively, in
extensions of F and F ∗ . Then there is a unique field map θ : F (α) → F ∗ (α∗ )
which agrees with φ on F , and such that θ(α) = α∗ :

    F ────→ F (α)
    │          │
    φ          θ
    ↓          ↓
    F ∗ ───→ F ∗ (α∗ )

Proof. Let g(x) have degree m, and define

    θ(β0 + β1 α + · · · + β_{m−1} α^{m−1}) := φ(β0) + φ(β1) α∗ + · · · + φ(β_{m−1}) (α∗)^{m−1}

with βi ∈ F ; i.e. θ(f (α)) = f ∗ (α∗ ). The conditions on θ force this formula,
proving uniqueness. Essentially the proof of 28.2 shows that θ is a morphism
of rings: multiplication in F (α) [resp. F ∗ (α∗ )] is given by reducing modulo
g(α) [resp. modulo h(α∗ ), for some irreducible factor h(x) of g ∗ (x)]. (Possibly
h(x) = g ∗ (x).)
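As a concrete instance of 28.5, here is a sketch in Python with F = F ∗ = Q, φ the identity, g(x) = x² − 2, α = √2 and α∗ = −√2; the representation by pairs and the names `add`, `mul`, `theta` are mine, not the book’s. The unique θ sends b0 + b1√2 to b0 − b1√2.

```python
from fractions import Fraction

# Elements of Q(alpha), alpha^2 = 2, stored as pairs (b0, b1) = b0 + b1*alpha.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    (a0, a1), (b0, b1) = u, v
    # (a0 + a1*alpha)(b0 + b1*alpha), reduced using alpha^2 = 2
    return (a0 * b0 + 2 * a1 * b1, a0 * b1 + a1 * b0)

# The theta of 28.5, with phi = identity on Q and theta(alpha) = -alpha:
def theta(u):
    return (u[0], -u[1])
```

Checking θ(uv) = θ(u)θ(v) on a few elements is exactly the sort of routine verification the text leaves to the reader.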
28.6. General Extension Principle. Given a field map
φ : F → F ∗ , an extension K ∗ ⊃ F ∗ , and a finite extension K ⊃ F , there
exists a finite extension L ⊃ K ∗ and a field map ψ : K → L agreeing with φ
on F :

    F ────→ K
    │         │
    φ         ψ
    ↓         ↓
    F ∗ ───→ K ∗ ───→ L

Remark. This principle, rather than ⊗, could have been used in the above
sketched alternative proof of the uniqueness of splitting fields.
Proof. If [K : F ] = 1, then K = F, so let L = K ∗ and ψ(β) = φ(β)
for all β. Proceed by induction on [K : F ]. Assume that [K : F ] > 1.
Choose any α ∈ K \ F and let g(x) be its minimal polynomial over F. Let
g(x) correspond using φ to g ∗ (x) ∈ F ∗ [x]. Let α∗ be a root of g ∗ (x) in some
extension of K ∗ . Consider the diagram

    F ────→ F (α) ──────→ K
    │          │             │
    φ          θ             ψ
    ↓          ↓             ↓
    F ∗ ───→ F ∗ (α∗ ) ───→ K ∗ (α∗ ) ───→ L

(the ‘diamond’ F ∗ ⊂ F ∗ (α∗ ) ⊂ K ∗ (α∗ ) and F ∗ ⊂ K ∗ ⊂ K ∗ (α∗ ) sits under
the middle column). First define θ using the simple extension principle. Now

    [K : F (α)] = [K : F ] / [F (α) : F ] < [K : F ] ,

so we may apply the inductive hypothesis to the upper right-hand part of
the diagram to obtain L and ψ. Note that L ⊃ K ∗ is finite, since both
L ⊃ K ∗ (α∗ ) and K ∗ (α∗ ) ⊃ K ∗ are. That ψ agrees with φ on F follows
easily, since the ‘diamond’ portion of the diagram commutes.
28.7. Splitting Extension Principle. Given a field map
φ : F → F ∗ , g(x) ∈ F [x], a splitting field K ⊃ F for g(x), and a finite
extension K ∗ ⊃ F ∗ such that the corresponding g ∗ (x) splits in K ∗ [x], there
exists a field map η : K → K ∗ agreeing with φ on F :

    F ────→ K
    │         │
    φ         η
    ↓         ↓
    F ∗ ───→ K ∗

Proof. First apply the General E. P. using exactly its notation and diagram.
If δ ∈ K and g(δ) = 0, then g ∗ (ψ(δ)) = 0 by an easy calculation. Thus
ψ(K) ⊂ K ∗ , and we may take η(ω) = ψ(ω) for all ω ∈ K (so η and ψ differ
only with respect to their codomains).
Exercise 28B. Let φ : F → F ∗ be an isomorphism of fields. It is
intuitively clear that, since φ could be used to ‘identify’ F with F ∗ , results
relating vector spaces over a single field have corresponding results relating
vector spaces over the two fields. For example: We could call a function, say
η : V → V ∗ from an F –vector space to an F ∗ –vector space, ‘φ–linear’ if and
only if it is a group morphism satisfying η(f v) = φ(f )η(v). Show that the
existence of an injective such map implies that dim_F (V ) ≤ dim_{F ∗} (V ∗ ). What
would the condition be on η for the opposite inequality?
28.8. Splitting Field Uniqueness Principle. Given an isomorphism
φ : F → F ∗ of fields, and splitting fields K ⊃ F and K ∗ ⊃ F ∗ respectively
for g(x) ∈ F [x] and g ∗ (x) ∈ F ∗ [x] which correspond using φ, there exists an
isomorphism η : K → K ∗ agreeing with φ on F .
(Use the same diagram as in 28.7.)
Proof. Pick any η using 28.7. Since η is a field map, it is injective,
yielding [K : F ] ≤ [K ∗ : F ∗ ] by 28B. But our hypotheses are symmetric, so
the reverse inequality holds, and so [K : F ] = [K ∗ : F ∗ ]. Thus η is surjective,
as required, by the ‘linear pigeon-hole principle’—a linear map between vector spaces of the same finite dimension is injective ⇔ it is surjective—and
similarly for a φ–linear map as in 28B.
Corollary 28.9. Taking F = F ∗ and φ to be the identity map, it follows
that a splitting field over F for g(x) ∈ F [x] is unique up to an isomorphism
fixing each element of F.
Remarks. i) Surjectivity of η also follows from the observation that
η(K) is a splitting field for g ∗ (x) over φ(F ) in 28.7.
ii) The next 50 pages or so of this book depend for their significance largely
upon the fact that the isomorphism in 28.9 is almost never unique—when
K = K ∗ , the set of all such automorphisms of K is called the Galois group
(under composition).
Where to find your roots?
Many readers will already be aware that any polynomial in C[x] has all
of its roots in C ; see the following appendix. Most of classical algebra is
concerned with polynomial equations over C and its subfields. Therefore
it is not an unreasonable tactic for the reader to imagine that most of the
upcoming work (except for finite fields in Section 31) takes place within C,
or even within subfields of C which are finite extensions of Q. The following
appendix also indicates how, replacing Q by an arbitrary ‘base field’ F ,
one may find a (usually non-finite) extension of F (namely, its algebraic
closure) in which all the action can take place. However, the material of the
present section—the existence and uniqueness of splitting fields—is all that
is really needed. It really is needed, even when only working over Q ; and
it is definitely more elementary than any proofs of either the ‘fundamental
theorem of algebra’ or the existence of algebraic closures.
Appendix A. Algebraic closure and the
fundamental theorem of (19th century) algebra.
As indicated in the last paragraph, study of the material in this appendix
is not essential for the understanding of subsequent sections, which only use
it for a few examples. The student should, however, read at least the statements of A1 and A2.
Definition. A field F is algebraically closed if and only if any one, and
therefore all, of the following hold.
Theorem A1. Given a field F, the following are equivalent.
(i) Every irreducible in F [x] is linear.
(ii) Every non-constant in F [x] has a root in F.
(iii) Every non-constant in F [x] splits in F [x].
(iv) If K ⊃ F is an algebraic extension, then K = F.
(v) If K ⊃ F is a finite extension, then K = F .
(vi) For any simple extension F (α) ⊃ F , either α ∈ F or α is transcendental
over F.
The proof, which is quite easy, will be left as an exercise.
Theorem A2. The field C is algebraically closed.
This is commonly known as the fundamental theorem of algebra,
but that name should be qualified as in the appendix title. It was first
proved (in several different ways) by Gauss. All proofs depend on some
ideas of a topological nature. (Note that Q(√−1), for example, is definitely
not algebraically closed.) Most proofs also depend on one or another piece
of mathematical machinery deeper than what we have so far given. For
example, courses in one variable complex analysis often give a proof using
the theory of complex line integrals. In Section 41 ahead is a proof in which
the machinery is algebraic (Galois theory), and the ‘topology’ is reduced to
the familiar facts that
1) an odd degree real polynomial has a real root, and
2) a positive real has a positive real square root.
At the end of this appendix is sketched the ‘best’ proof, in the sense that it
uses a single, intuitively appealing topological idea which directly explains
why the theorem is true.
Proposition A3. If E ⊃ F and K ⊃ E are both algebraic extensions,
then so is K ⊃ F .
Proof. Let α ∈ K have minimal polynomial Σ_{i=0}^n a_i x^i over E. Since each
a_i is algebraic over F , the extension L := F (a0 , a1 , . . . , a_{n−1}) ⊃ F is finite.
Also L(α) ⊃ L is finite. Thus L(α) ⊃ F is finite, and so α is algebraic over
F.
Theorem A4. Given an extension K ⊃ F, let
L = { α ∈ K : α is algebraic over F } .
Then
(i) L is a subfield of K; and
(ii) if K is algebraically closed, then so is L.
Proof. i) If α and β are in L, then F (α, β) ⊃ F is finite, so algebraic.
But F (α, β) contains the elements α − β, αβ, and α/β if β ≠ 0. Thus each
of them is in L.
ii) Let g(x) be a non-constant in L[x]. It has a root α in K. Then L ⊃ F
and L(α) ⊃ L are algebraic. By A3, so is L(α) ⊃ F . Since α is algebraic
over F, the definition of L yields that α ∈ L.
Corollary A5. The set A of all algebraic numbers (all complex numbers
which are algebraic over Q) is an algebraically closed field.
Remark. Thus A ⊃ Q is a non-finite algebraic extension. The straightedge and compass constructible numbers Quad ⊃ Q gave us one earlier.
Theorem A6. Let K ⊃ F be algebraic. Then |K| = |F | unless F is a
finite field, in which case K is either finite or countable.
Sketch Proof. Write F [x] = ∪_{r=0}^∞ P_r , where P_r consists of zero and all
polynomials of degree at most r. Thus

    |P_r| = |F^{r+1}| = |F|^{r+1}    ( = |F| if F is infinite) .

Each non-zero element of P_r has at most r roots in K. We can write K =
∪_{r=0}^∞ K_r , where K_r consists of those elements in K which are algebraic over
F of degree at most r. By the above facts about |P_r|, we see that |K_r| = |F| if
F is infinite, and K_r is finite if F is finite. The proof is completed by noting
that a countable union of sets with a fixed infinite cardinality has that same
cardinality, and a countable union of finite sets is at most countable.
Corollary A7. (Cantor) The field A is countable, and so the set,
C \ A, of transcendental numbers is uncountable (in particular, non-empty!).
Proving that special numbers (such as π and e) are transcendental is much
harder than Cantor’s indirect proof above of the existence of transcendental
numbers. The fact that √π is not algebraic immediately implies that ‘the
circle cannot be squared’ by straight-edge and compass. This was one of the
famous geometry problems from antiquity: Construct a square of the same
area as a given disc.
Exercise AA. (i) Prove that if F is countable, then so is F [x].
(ii) Deduce that for any n > 0, there exists a subset {α1 , . . . , αn } of C which
is algebraically independent over Q.
ASIDE. Here is a simple method, due to Liouville, for proving that certain specially designed numbers are transcendental. This preceded Cantor’s
method by only a few decades. Gauss apparently had no proof of the existence of transcendentals—but one never knows!
Theorem A8. If z is a real algebraic number of degree d > 1, then there
is a number M > 0 such that |z − p/q| ≥ M/q^d for all integers p and q with
q > 0. The number M depends on z but not on p and q.
(Hence, in the sense just described, algebraic numbers are harder to approximate by rationals than are transcendental numbers!)
Proof. Multiplying to rid the minimal polynomial of denominators, we
get a polynomial f (x) = Σ_{i=0}^d a_i x^i with each a_i ∈ Z, satisfying f (z) = 0 and
f (p/q) ≠ 0 for all p/q ∈ Q. By factoring (x − y) out of each (x^i − y^i) we can
write

    f (x) − f (y) = (x − y) h(x, y) ,

where h(x, y) is a polynomial of two variables. Since h(z, y) is a continuous
function of y, it is bounded on the interval z − 1 ≤ y ≤ z + 1; i.e. ∃ B > 0
with |h(z, y)| ≤ B for all y such that |z − y| ≤ 1. Now

    |f (p/q)| = |q^{−d} (Σ a_i q^{d−i} p^i)| = |q^{−d} · (a nonzero integer)| ≥ q^{−d} .

Thus, if |z − p/q| ≤ 1, we have

    q^{−d} ≤ |f (p/q)| = |f (z) − f (p/q)| = |z − p/q| |h(z, p/q)| ≤ |z − p/q| · B .

Hence |z − p/q| ≥ 1/(B q^d) whenever |z − p/q| ≤ 1. Taking

    M = min{ 1 , 1/B }

gives the required result.
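A quick numerical sketch of A8 for z = √2 (so d = 2, f(x) = x² − 2, h(x, y) = x + y, and B = 2√2 + 1 bounds |h(z, y)| on the interval): every rational p/q tested respects the bound M/q². The helper name is mine.

```python
import math

z = math.sqrt(2)            # algebraic of degree d = 2, with f(x) = x^2 - 2
B = 2 * math.sqrt(2) + 1    # |h(z, y)| = |z + y| <= B whenever |z - y| <= 1
M = min(1.0, 1.0 / B)       # the M produced by the proof

def liouville_gap_ok(q):
    # the best rational with denominator q is p/q with p = round(z*q)
    p = round(z * q)
    return abs(z - p / q) >= M / q ** 2
```

Even the convergents 3/2, 7/5, 17/12, 41/29, … of √2 keep a gap above M/q², illustrating the theorem’s claim that algebraic numbers resist rational approximation.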
The result isn’t interesting if z isn’t real, but does the proof depend on it being
real? Is there a generalization to approximating by numbers with both real and
imaginary parts being rational?
Application. Let

    z = Σ_{n=1}^∞ 10^{−n!} = .1100010 · · · 010 · · · 010 · · · · · · ,

where the 1’s occur exactly in the 1st, 2nd, 6th, 24th, 120th, · · · decimal
places. For a contradiction, assume that z is algebraic of degree d. Then d > 1
because z ∉ Q (since the decimal expansion isn’t periodic). Choose M as in
the theorem. But now for a given s, consider the rational

    Σ_{n=1}^s 10^{−n!} = p/10^{s!} = p/q .

We get

    M/(10^{s!})^d ≤ |z − p/q| < 1/10^{(s+1)!−1} .

The last inequality is clear since z − p/q = .00 · · · 01 · · · (whatever) · · ·, with
the first 1 in the (s + 1)!-th place.
Thus 0 < M < 10^{−[(s+1)!−s!d−1]}. Letting s → ∞, we get 0 < M ≤ 0, a contradiction. Hence z is a transcendental number.
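The key inequality |z − p/q| < 10^{−((s+1)!−1)} can be checked exactly with rational arithmetic; in this sketch z is replaced by the partial sum through n = 5 (the discarded tail is below 10^{−700}, far smaller than the gaps measured), and the function names are mine.

```python
from fractions import Fraction
from math import factorial

def partial(s):
    # the rational p/q with q = 10^(s!): the truncation of z after n = s
    return sum(Fraction(1, 10 ** factorial(n)) for n in range(1, s + 1))

Z = partial(5)   # proxy for z: the discarded tail z - Z is below 10^(-700)

def truncation_close(s):
    gap = Z - partial(s)
    return 0 < gap < Fraction(1, 10 ** (factorial(s + 1) - 1))
```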
Exercise AB. Write down other transcendental numbers using this method.
Write a ‘formula’ for uncountably many of them.
END OF ASIDE.
Theorem A9. Let F be a field and let S be any subset of non-constants
in F [x]. Then there is an extension M ⊃ F such that all members of S split
in M [x].
The proof is trivially reduced to the largest case, S = F [x] \ F, for which
we sketch a proof at the end of this appendix.
Definition. Let K ⊃ F be an extension, and let S be a subset of
F [x] \ F. We say that K is a splitting field for (F, S) if and only if A9 holds
for M = K, but fails for M = E if F ⊂ E ⊂ K and E ≠ K.
Exercise AC. Show that K is algebraically closed if and only if it is the
splitting field for (K, K[x] \ K).
Theorem A10. Every pair (F, S) has a splitting field K, which is unique
up to an isomorphism fixing each element of F.
Remark. This improves upon the last section only when S is infinite:
if S is finite, a splitting field for (F, S) is easily seen to be the same as a
splitting field over F for the polynomial which is the product of all members
of S.
Sketch Proof. Using A9, let M ⊃ F be an extension such that all members of S split in M [x]. Let R be the set of all roots in M of all members of
S. Clearly F (R), the field generated by F ∪ R, is a splitting field for (F, S)
and is the only one inside M. The latter “bang-on uniqueness”, and the fact
that any diagram

         K1
       ↗      ↘ Φ1
     F           L
       ↘      ↗ Φ2
         K2

can be completed, with a field L and field
maps Φi , to be a commutative diagram, gives uniqueness:
we have Ki ≅ Φi (Ki ), and Φ1 (K1 ) = Φ2 (K2 ) by “bang-on uniqueness.”
Remark. The last sentence is the same as the alternative proof of uniqueness for splitting fields in the previous section. A proof by extending maps is
more awkward here, since it needs some sort of transfinite induction. Note
however that the existence of L in the proof depends in general on the axiom
of choice. For example, if L is constructed as (K1 ⊗F K2 )/I, that dependence
is on the existence of the maximal ideal I.
Definition. Given an extension K ⊃ F , we say that K is ‘the’ algebraic
closure of F if and only if any one, and so all four, of the following hold.
Theorem A11. Given K ⊃ F , the following are equivalent.
(i) K is algebraically closed and K ⊃ F is algebraic.
(ii) K is algebraically closed, but L isn’t for F ⊂ L ⊂ K and L ≠ K.
(iii) K is a splitting field for all of F [x] over F (i.e. for (F, F [x] \ F ) in
previous notation).
(iv) K ⊃ F is algebraic, and, for any algebraic extension E ⊃ F , there exists
a field morphism φ : E → K which is the ‘identity’ on F .
Remarks and examples. (I) By iii) and A10, each F has an algebraic
closure which is unique up to an isomorphism fixing elements of F.
(II) By i) and A5, the algebraic closure of Q is A (not C !!).
(III) For each prime p, all the finite fields Fpn in Section 31 ahead have the
same algebraic closure: Any algebraic closure Kp for Fpn is also an algebraic
closure for its prime subfield Fp , by i), since Kp is an algebraic extension of Fp
by A3. In fact Kp is simply the union of all the finite fields of characteristic
p, in the following sense. Firstly, for each n, the field Kp contains exactly
one subfield of order p^n , the splitting field of x^{p^n} − x over Fp . The union of
all these finite subfields is easily seen to be a subfield E of Kp . To see that
E = Kp , note that any root in Kp of a polynomial in Fp [x] lies in some finite
extension of Fp , so in E.
Exercise AD. Find a proper infinite subfield of Kp .
(IV) It is mildly surprising that building a minimal K over F so that all
of F [x] splits in K[x] results in all of K[x] splitting in K[x].
Sketch Proof of A11. i) ⇒ ii): Choose α ∈ K \ L with minimal
polynomial g(x) over F. Then g(x) does not split in L[x].
ii) ⇒ i): By A4, the set L, of those elements of K which are algebraic
over F, is an algebraically closed field, so L = K.
i) ⇒ iii): Any g(x) in F [x] splits in K[x] since any g(x) in K[x] does.
Suppose that F ⊂ E ⊂ K and E %= K. Let α ∈ K \ E. Since K ⊃ F is
algebraic, α is algebraic over F. Let g(x) be its minimal polynomial over F.
Then g(x) does not split in E[x].
iii) ⇒ i): Certainly K ⊃ F is algebraic. Suppose given a simple algebraic
extension K(α) ⊃ K. By A3, K(α) ⊃ F is algebraic. Thus α is algebraic
over F, so α ∈ K, as required.
Exercise AE. Prove the equivalence of (iv) with the remaining conditions
in the theorem.
Outline of a deduction of A9 from Zorn’s lemma. Choose any set
T containing F such that T is infinite and |T | > |F |. Consider the collection
C of all fields K such that K is an algebraic extension of F and, as a set, K
is a subset of T. Then C is a set. By the cardinality result A6 and by A3, if
K ∈ C and L ⊃ K is an algebraic extension, one can construct L′ ⊃ K with
L′ ∈ C such that L′ ≅ L by an isomorphism fixing elements of K. Given K
and K′ in C, define K ≤ K′ to mean that K′ is an extension of K (more
than just K ⊂ K′ as sets). Suppose that L is a non-empty subset of C which
is linearly ordered by the relation ≤ . Then ∪L is easily seen to be in C. By
Zorn’s lemma (see Artin, p.588), C has a maximal member M. Suppose for
a contradiction that non-constant g(x) ∈ F [x] does not split in M [x]. Choose
an algebraic extension L ⊃ M such that g(x) splits in L[x]. We may assume
that L ∈ C by the earlier remark. This contradicts the maximality of M.
Outline of the ‘homotopy’ proof of A2. Dividing by the leading
coefficient, and noting that zero is a root if the bottom coefficient is zero, it
remains to show that any polynomial

    g(x) = x^n + a_{n−1} x^{n−1} + · · · + a0 ∈ C[x]

has a root in C when a0 ≠ 0 and n > 0. For a contradiction, suppose that it
doesn’t. Then for each real r > 0, we get a continuous function

    ω_r : S^1 −→ C \ {0}

by ω_r (z) := g(rz), where S^1 is the unit circle {z : |z| = 1}. Values are
taken in the punctured plane C \ {0} by our assumption—this is crucial here!
Picture each ωr as a parameterized closed loop; that is, as a particle moving
continuously in C \ {0}, as the parameter z starts from 1 and moves around
the circle S 1 . By varying r continuously on the positive line from any r0 to
any r1 , it is intuitively clear that ωr0 and ωr1 are homotopic : i.e. they can
be continuously deformed from one to the other (without passing through
the origin in C !) For r0 very close to zero, it is clear from continuity that
ωr0 (z) stays close to a0 for all z, and so the loop ωr0 does not ‘wrap around’
the origin at all; in fact, it can be continuously deformed to the constant loop
which just stays at a0 . But for r1 sufficiently large, we shall show that ωr1
wraps around the origin “n” times in a counterclockwise direction; it could
be deformed to the loop z ↦ (r1 z)^n which travels n times around the circle
of radius r1^n at constant speed. This is the contradiction to the fact that ωr0
is homotopic to ωr1 which proves the theorem. (It is also the point which
requires the most work when setting up the machinery to make this into a
complete proof.)
To prove the claim about ω_{r1} , it suffices to show that, for all z ∈ S^1 ,

    |ω_{r1}(z) − (r1 z)^n| ≤ r1^n / 2

by choosing r1 suitably: For then, ω_{r1}(z) is always within the ‘moving disc’ of
radius r1^n /2 centered at (r1 z)^n . So the ‘ω_{r1}-particle’ gets dragged around the
origin “n” times, being trapped in the moving disc, however crazily it moves
around within that disc. In fact, a homotopy converting ω_{r1} into z ↦ (r1 z)^n
is definable by continuously shrinking the moving disc to its center. You
should think of the particle as a dog attached to a leash of length r1^n /2, whose
master walks “n” times around the origin, along the circle of radius r1^n centred
at the origin.
To prove the required inequality, if r1 ≥ 1, then

    |ω_{r1}(z) − (r1 z)^n|  =  |a_{n−1}(r1 z)^{n−1} + a_{n−2}(r1 z)^{n−2} + · · · + a0|
                           ≤  |a_{n−1}(r1 z)^{n−1}| + · · · · · · · · · + |a0|
                           =  |a_{n−1}| r1^{n−1} + |a_{n−2}| r1^{n−2} + · · · + |a0|
                           ≤  n M r1^{n−1}

for M = max { |ai| : 0 ≤ i < n }. But if we choose r1 ≥ 2M n, then

    M n r1^{n−1} ≤ r1^n / 2 ,

as required. Thus the job is done by choosing

    r1 := max { 1 , 2n|a_{n−1}| , · · · , 2n|a0| } .
29. Repeated roots and the formal derivative.
Definition. For any field F , define D : F [x] → F [x] by

    D( Σ_{i=0}^n a_i x^i ) = Σ_{i=0}^{n−1} (i + 1) a_{i+1} x^i .

The operator D is called the formal derivative for obvious reasons.
Exercise 29A. Prove the identities
i) D(f + g) = D(f ) + D(g) ;
ii) D(f g) = f D(g) + D(f ) g ;
iii) D(f^n) = n f^{n−1} D(f ) .
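The definition of D is easy to mechanize. This sketch works with polynomials over Z_p as lists of coefficients [a0, a1, . . .] (p = 5 here; the helper names `D`, `pmul`, `padd` are mine) and lets one spot-check the identities of 29A.

```python
p = 5  # any prime modulus will do

def D(f):
    # formal derivative: coefficient i of D(f) is (i+1) * a_{i+1}
    return [(i + 1) * f[i + 1] % p for i in range(len(f) - 1)]

def pmul(f, g):
    # product of two coefficient lists over Z_p
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % p
    return h

def padd(f, g):
    # sum of two coefficient lists over Z_p, padding to equal length
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return [(a + b) % p for a, b in zip(f, g)]
```

Note that D(x^5) = 0 over Z_5, the characteristic-p phenomenon behind 29.3.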
Proposition 29.1. In F [x], we have that (x − a)^2 divides f if and only
if (x − a) divides both f and D(f ).
Proof. ⇒: If f = (x − a)^2 g, then

    D(f ) = (x − a)[2g + (x − a)D(g)] .

⇐: If f = (x − a)h, then D(f ) = (x − a)D(h) + h. Since (x − a) divides
D(f ), we have (x − a) dividing h. Thus (x − a)^2 divides f .
Definition. a is a repeated root of f if and only if (x − a)^2 divides f .
Theorem 29.2. Let f ∈ F [x]. Then f has no repeated root in any
extension of F if and only if GCD{f, D(f )} = 1 in F [x].
(Note that one of the conditions involves arbitrary extensions of F , whereas the
other takes place entirely within F [x].)
Corollary 29.3. When the characteristic is 0, an irreducible in F [x] can
have no repeated root in any extension of F .
More generally, an irreducible g could have a repeated root only if Dg
were 0, since otherwise, because deg(Dg) < deg(g), we get that the required
GCD is 1. In the case of characteristic zero, Dg is certainly non-zero (and
has degree one less than that of g) since
D(x^n) = n x^{n−1} ≠ 0    for all n > 0 .
Proof of 29.2: ⇐: There exist s, t in F [x] with
sf + tD(f ) = 1 .
Since (x − a) does not divide 1, it cannot divide both f and D(f ) in any
extension K[x] ⊃ F [x] for any a ∈ K. Thus a is not a repeated root, i.e. not
a root of order greater than 1.
⇒: If GCD{f, D(f )} = g, and g is not a non-zero constant, then F has
an extension K in which g has a root a. Then (x − a) divides both f and
Df in K[x], so a is a repeated root of f .
Example. x^t − 1 has no repeated roots in K if t is prime to ch(K), or if
ch(K) = 0.
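Theorem 29.2 is computable. Over Z_3 (a sketch; the helper names are mine, not the book’s) the gcd of x^t − 1 with its formal derivative is 1 exactly when 3 does not divide t, matching the example above.

```python
p = 3

def D(f):
    # formal derivative over Z_p, coefficients listed low degree first
    return [(i + 1) * f[i + 1] % p for i in range(len(f) - 1)]

def pmod(f, g):
    # remainder of f on division by the nonzero polynomial g, over Z_p
    f = f[:]
    inv = pow(g[-1], p - 2, p)           # inverse of the leading coefficient
    while len(f) >= len(g):
        c = f[-1] * inv % p
        s = len(f) - len(g)
        for i, b in enumerate(g):
            f[s + i] = (f[s + i] - c * b) % p
        while f and f[-1] == 0:
            f.pop()
    return f

def pgcd(f, g):
    # Euclidean algorithm; result normalized to be monic
    while any(g):
        f, g = g, pmod(f, g)
    inv = pow(f[-1], p - 2, p)
    return [a * inv % p for a in f]
```

For x^6 − 1 the derivative is 6x^5 = 0 over Z_3, so the gcd is the whole polynomial, signalling repeated roots (indeed x^6 − 1 = (x^2 − 1)^3 there).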
30. Finite subgroups of F × .
Recall that F^× is the group F \ {0} under multiplication.
Theorem 30.1. If F is any field, then F^× has exactly one subgroup of
order t for each t for which x^t − 1 splits into distinct linear factors in F [x],
and no other finite subgroups. For each such t, this subgroup is cyclic.
Proof. If G is a subgroup of order t, clearly each of its elements is a
root of x^t − 1. Thus G consists of the “ t ” distinct roots, so G is the only
subgroup of order t, and x^t − 1 = Π_{a∈G} (x − a). To show that G is cyclic,
suppose that

    G ≅ C_{t1} × C_{t2} × · · · × C_{tr}

(using 13.2), where 1 < t1 | t2 | · · · | tr , and t = Π_{i=1}^r ti . Then a^{tr} = 1 for
every a ∈ G, so x^{tr} − 1 has “ t ” roots in G. Hence tr ≥ t, which implies that
tr = t, r = 1, and G is cyclic.
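For F = Z_13 the theorem can be watched directly (a sketch; the names are mine): the element orders are exactly the divisors of 12, there are Φ(t) elements of each order t, and a^t = 1 has exactly t solutions for each t dividing 12, so each subgroup of order t is unique.

```python
p = 13  # so F = Z_13 and F^x has order 12

def order(a):
    # multiplicative order of a in Z_13^x
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

orders = [order(a) for a in range(1, p)]
```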
Definition. The generators of such a group G are called primitive t^th
roots of unity. They are those t^th roots of unity which are not s^th roots of
unity for any s < t. For each such t, there are exactly “ Φ(t) ”
primitive t^th roots of unity, where Φ is Euler’s function.
Remark. A special case is that, for a finite field F , the group F^× is
cyclic. For Zp , we already used this to start the inductive calculation of the
group Z^×_{p^n} in Section 15. On the other hand, this observation is the crucial
one which we use in the next section to obtain the classification of finite
fields.
31. The structure of finite fields.
Proposition 31.1. A finite field F has “ p^n ” elements for some prime
p and some n ≥ 1.
Proof. Let P denote the prime subfield of F . Then [F : P ] = n, for
some n < ∞. Thus F ≅ P^n as a P -vector space. But P has “ p ” elements
where p = ch(F ), so |F | = |P^n| = p^n .
Lemma 31.2. Let F have characteristic p and prime subfield P . Then
F has “ p^n ” elements if and only if F is the splitting field of x^{p^n} − x over P .
Proof. ⇒: If a ∈ F^× , then a^{p^n − 1} = 1, since F^× is a group of order p^n − 1.
Hence a^{p^n} = a for all a ∈ F , so F contains “ p^n ” roots of x^{p^n} − x, and is the
splitting field.
⇐: Let R be the set of roots in F of x^{p^n} − x. Since

    (a ± b)^{p^n} = a^{p^n} ± b^{p^n}

in any field of characteristic p, the set R is closed under addition and subtraction. Closure under multiplication and division is trivial, so R is a subfield.
Hence R = F . Since

    GCD{ x^{p^n} − x , D(x^{p^n} − x) } = GCD{ x^{p^n} − x , −1 } = 1 ,

the polynomial x^{p^n} − x has no repeated roots, so |F | = p^n .
Theorem 31.3. Given (p, n), there is exactly one field of order p^n up
to isomorphism, namely the splitting field, F_{p^n} , of x^{p^n} − x over Zp .
Proof. The field F_{p^n} has “ p^n ” elements by the lemma. If K is any
field with “ p^n ” elements and prime field P , then any isomorphism P → Zp
extends to an isomorphism K → F_{p^n} by 28.8, the uniqueness of splitting
fields, since K is the splitting field of x^{p^n} − x over P by the lemma.
Note. The group F^×_{p^n} is cyclic of order p^n − 1 by the previous section,
and has primitive t^th roots of unity for the divisors t of p^n − 1, and only for
these t. We have

    x^{p^n} − x = Π_{a ∈ F_{p^n}} (x − a) .
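A sketch of the smallest non-prime case, F_4 = Z_2[x]/(x^2 + x + 1) (the pair representation and the names are mine, not the book’s): every element satisfies a^4 = a, so F_4 really consists of the roots of x^4 − x over Z_2, and F_4^× is cyclic of order 3.

```python
p = 2  # F_4 = Z_2[x]/(x^2 + x + 1); the pair (a, b) means a + b*x

def mul(u, v):
    (a, b), (c, d) = u, v
    e = b * d                     # coefficient of x^2 in the raw product
    # reduce using x^2 = x + 1 (since x^2 + x + 1 = 0 over Z_2)
    return ((a * c + e) % p, (a * d + b * c + e) % p)

def power(u, k):
    r = (1, 0)
    for _ in range(k):
        r = mul(r, u)
    return r

elements = [(a, b) for a in range(p) for b in range(p)]
```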
Theorem 31.4. For all n ≥ 1, the ring Zp [x] contains irreducible polynomials f of degree n, and for any such f ,

    ( Zp [x] / f Zp [x] ) ≅ F_{p^n} .

Any extension in the realm of finite fields is simple.
Proof. Let θ be any generator of the cyclic group F^×_{p^n} . Then we have
F_{p^n} = Zp (θ) since the powers of θ fill out F^×_{p^n} . So θ has degree n over Zp ,
and its minimal polynomial is irreducible of degree n. Now Zp [x]/f Zp [x] is a
field with “ p^n ” elements, if f is irreducible of degree n, so it is isomorphic
to F_{p^n} . We have observed that F_{p^n} is a simple extension of Fp . Thus any
finite field is a simple extension of its prime subfield, and therefore of each
of its subfields.
Note. This shows that Z[x] and Q[x] have plenty of irreducibles of every
degree (as does Eisenstein’s criterion in 20A more directly).
Exercise 31A. Show that F_{p^n} has a subfield of order p^r if and only if r
divides n.
Exercise 31B. Show that F_{p^n} does not have two subfields of the same
order.
Exercise 31C. Show that

    x^{p^n} − x = Π_{r|n} Π g(x) ,

where the inside product is over the set of all monic irreducibles g(x) of
degree r in Zp [x]. After reading the next section, deduce that the cardinality
of the latter set is

    (1/r) Σ_{s|r} µ(s) p^{r/s} .
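The count in 31C can be confirmed for small cases by brute force (a sketch; valid here because a polynomial of degree 2 or 3 is irreducible exactly when it has no root in the base field; all names are mine).

```python
from itertools import product

def mu(k):
    # Moebius function via its inductive definition (Section 32)
    return 1 if k == 1 else -sum(mu(r) for r in range(1, k) if k % r == 0)

def count_brute(p, r):
    # monic degree-r polynomials over Z_p with no root in Z_p;
    # this equals the number of irreducibles only for r = 2 or 3
    total = 0
    for coeffs in product(range(p), repeat=r):
        f = list(coeffs) + [1]
        if all(sum(a * x ** i for i, a in enumerate(f)) % p for x in range(p)):
            total += 1
    return total

def count_formula(p, r):
    # (1/r) * sum over s | r of mu(s) * p^(r/s)
    return sum(mu(s) * p ** (r // s) for s in range(1, r + 1) if r % s == 0) // r
```

For instance the formula gives (4 − 2)/2 = 1 irreducible quadratic over Z_2, namely x^2 + x + 1, and (8 − 2)/3 = 2 cubics.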
Exercise 31D. Is every function from F_{p^n} to itself necessarily a polynomial function (cf. 16G)?
32. Moebius inversion.
Summations in this section are over all positive divisors of some given
positive integer. For positive integers k, define µ(k) inductively by
    µ(1) = 1 ;    Σ_{r|k} µ(r) = 0    if k > 1 .
Exercise 32A. Show that µ is given explicitly by

    µ(p1^{α1} p2^{α2} · · · pt^{αt}) =  (−1)^t   if all αi = 1 ;
                                         0        otherwise ;

where the pi ’s are distinct primes and each αi > 0.
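The recursion and the closed form of 32A agree, as a quick mechanical check shows (the function names are mine).

```python
def mu_rec(k):
    # the inductive definition: mu(1) = 1, and sum over r | k of mu(r) = 0 for k > 1
    return 1 if k == 1 else -sum(mu_rec(r) for r in range(1, k) if k % r == 0)

def mu_explicit(k):
    # (-1)^t for a product of t distinct primes, 0 if a square divides k
    t, d = 0, 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:
                return 0          # square factor found
            t += 1
        else:
            d += 1
    if k > 1:
        t += 1                    # one leftover prime factor
    return (-1) ** t
```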
Theorem 32.1. (Moebius Inversion Formula) If {a_n}_{n≥1} is a sequence from an abelian group, then

    b_k = Σ_{r|k} a_r    =⇒    a_k = Σ_{r|k} µ(k/r) b_r .

Proof. We have

    Σ_{r|k} µ(k/r) b_r = Σ_{r|k} [ µ(k/r) Σ_{s|r} a_s ] = Σ_{s|k} [ a_s Σ_{s|r|k} µ(k/r) ]

(letting ℓ = k/r)

    = Σ_{s|k} [ a_s Σ_{ℓ | (k/s)} µ(ℓ) ] = a_k · 1 + Σ_{s|k , s<k} a_s · 0 = a_k .
Example. The identity, k = Σ_{r|k} Φ(r), can be obtained by counting the
elements of Ck : each element generates the subgroup Cr for some unique r
dividing k, and Cr has “ Φ(r) ” distinct choices of generator. By Moebius
inversion,

    Φ(k) = Σ_{r|k} µ(k/r) r .
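Both identities in the example can be verified mechanically (a sketch; the names are mine): Φ computed by counting residues prime to k agrees with the Moebius-inversion formula, and the Φ-values over divisors sum back to k.

```python
from math import gcd

def mu(k):
    # Moebius function via its inductive definition
    return 1 if k == 1 else -sum(mu(r) for r in range(1, k) if k % r == 0)

def phi_count(k):
    # Euler's function by its definition
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

def phi_moebius(k):
    # Phi(k) = sum over r | k of mu(k/r) * r
    return sum(mu(k // r) * r for r in range(1, k + 1) if k % r == 0)
```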
Exercise 32B. Use this to reprove that

    Φ( Π_i pi^{αi} ) = Π_i ( pi^{αi} − pi^{αi − 1} ) .
33. Cyclotomic polynomials.
Definition. Suppose that F is an extension of Q which contains primitive
k^th roots of unity. Define the k^th cyclotomic polynomial by the formula
c_k(x) := Π (x − ζ), product over the “ Φ(k) ” distinct primitive k^th roots of
unity ζ. The polynomial c_k is monic of degree Φ(k). Since the roots of x^k − 1
are the primitive r^th roots of unity for all r dividing k, we get

    x^k − 1 = Π_{r|k} c_r(x) .      (∗)_k

Using (∗)_k and induction on k, we see that:
(i) (∗)_k determines c_k ;
(ii) c_k(x) ∈ Z[x] for all k, since we have {f primitive in Z[x] and
f g ∈ Z[x]} ⇒ g ∈ Z[x] ;
(iii) c_k(x) does not depend on choice of F . (We could take F to be C,
or to be any splitting field for x^k − 1, since x^k − 1 has no repeated
roots.)
Applying Moebius inversion to (∗)_k in the commutative group [Z(x)]^× under
multiplication, we get

    c_k(x) = Π_{r|k} (x^r − 1)^{µ(k/r)}    in Z(x).
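(∗)_k really does determine c_k: dividing x^k − 1 by the c_r for the proper divisors r computes the cyclotomic polynomials over Z, as this sketch shows (coefficients are listed low degree first; the names are mine).

```python
def polydiv(f, g):
    # exact division of integer polynomials, g monic, low-degree-first lists
    f, q = f[:], [0] * (len(f) - len(g) + 1)
    while len(f) >= len(g):
        c = f[-1]
        q[len(f) - len(g)] = c
        for i, b in enumerate(g):
            f[len(f) - len(g) + i] -= c * b
        f.pop()                         # leading term now cancelled
    assert all(a == 0 for a in f)       # the division demanded by (*)_k is exact
    return q

def cyclotomic(k):
    f = [-1] + [0] * (k - 1) + [1]              # x^k - 1
    for r in range(1, k):
        if k % r == 0:
            f = polydiv(f, cyclotomic(r))       # strip off c_r for proper r | k
    return f
```

For example c_6 = x^2 − x + 1 and c_12 = x^4 − x^2 + 1, each monic of degree Φ(k) as the definition promises.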
Theorem 33.1. (Gauss) The polynomial c_k(x) is irreducible in Q[x].
For primitive k^th roots of unity ζ, the extension Q(ζ) of Q has degree Φ(k).
The minimal polynomial over Q of any primitive k^th root of unity is c_k(x).
Proof. The three statements are clearly equivalent.
To prove that c_k(x) is irreducible, it suffices to show that all primitive
k^th roots of unity have the same minimal polynomial over Q, since this
polynomial divides c_k(x), and would have degree at least Φ(k), so it would
coincide with c_k(x). If ζ is any primitive k^th root of unity, then any other
such root can be written as ζ^{p1 p2 ··· pr} for (not necessarily distinct) primes pi
which don’t divide k. Hence we need only show that ζ and ζ^p have the same
minimal polynomial over Q, if p is a prime which does not divide k. Choose
multiples f (x) and g(x) of the minimal polynomials of ζ and ζ^p respectively,
such that f and g are primitive in Z[x]. Since ζ is a root of g(x^p), it is
clear that f (x) divides g(x^p) in Q[x]. Now f (x) is primitive in Z[x], so f (x)
divides g(x^p) in Z[x].
For a contradiction, suppose that f (x) and g(x) are not associates in Q[x].
Then f (x) and g(x) are relatively prime in Q[x] and both divide (x^k − 1),
so f (x)g(x) divides (x^k − 1) in Z[x]. Let f̄ (x) and ḡ(x) in Zp [x] be f and g
with coefficients reduced mod p. Then ḡ(x^p) = ḡ(x)^p , so f̄ (x) divides ḡ(x)^p
in Zp [x]. Thus f̄ (x) and ḡ(x) have a common root θ in some extension of
Zp . But f̄ (x)ḡ(x) divides (x^k − 1) in Zp [x], and so θ is a repeated root of
(x^k − 1). This is a contradiction, since GCD{ x^k − 1 , k x^{k−1} } = 1 in Zp [x]
when p doesn’t divide k.
Hence f (x) and g(x) are associates in Q[x], so that ζ and ζ^p have the
same minimal polynomial.
Now we can prove most of Gauss’ theorem, alluded to earlier in Section
27, determining which regular n-gons are constructible by straight-edge and
compass. The proof below uses only the weaker version of 33.1 in which we
assume that k is a prime. See Exercise 33B below for an easier proof of that
weaker version.
Theorem 33.2. (Gauss) The regular n-gon is constructible if and only
if n factors as 2^α p1 p2 · · · pt , where the pi are distinct primes of the form 2^{2^β} + 1.
First half of the Proof. (See Section 39 for the second half.) It is easily
seen that the set P of constructible points, thought of as a subset of C, is in
fact the field Quad(√−1). The main theorem on Quad extends immediately
to imply that elements of P are necessarily algebraic of degree equal to a
power of 2. It is also clear that the regular n-gon is constructible if and only
if e^(2πi/n) ∈ P.
But if p is an odd prime dividing n, then p − 1 divides Φ(n); and if p²
divides n, then p divides Φ(n). Therefore, if Φ(n) is a power of 2, then
the only odd primes which can divide n must be of the form 2^γ + 1; and
the squares of these primes cannot divide n. Primes of this form are called
Fermat primes. It is elementary to show that they must have the form 2^(2^β) + 1.
(Use the factorization of x^odd + 1.) This proves the theorem in one direction.
In the second half of the proof, we'll show that e^(2πi/p) ∈ P for each Fermat
prime p. By angle bisection (or by a simple, purely algebraic argument), we
see that e^(2πi/2^α) ∈ P for all α > 0. Thus it remains only to show that
e^(2πi/ab) ∈ P whenever e^(2πi/a) ∈ P and e^(2πi/b) ∈ P with GCD{a, b} = 1. But
this is immediate from the fact that P is a field, and that sa + tb = 1 for
some integers s and t—write

e^(2πi/ab) = (e^(2πi/a))^t (e^(2πi/b))^s .
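The displayed identity is easy to check numerically. Below is a quick floating-point sanity check (no part of the proof); a = 3, b = 5 with s = 2, t = −1 are just one illustrative choice satisfying sa + tb = 1:

```python
import cmath

# Illustrative values: s*a + t*b = 2*3 + (-1)*5 = 1, as in the argument above.
a, b, s, t = 3, 5, 2, -1
assert s * a + t * b == 1

lhs = cmath.exp(2j * cmath.pi / (a * b))
rhs = cmath.exp(2j * cmath.pi / a) ** t * cmath.exp(2j * cmath.pi / b) ** s
print(abs(lhs - rhs) < 1e-12)  # True
```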
Exercise 33A. Show that 0 ≤ β ≤ 4 do give primes above, but β = 5
doesn’t. (No other values of β are known which give primes, and many are
known which don’t.)
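A brute-force computation (a sketch, using nothing beyond trial division) confirms the first claim of Exercise 33A:

```python
# Check which of the numbers 2^(2^beta) + 1, 0 <= beta <= 5, are prime.
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

fermat = {beta: 2 ** (2 ** beta) + 1 for beta in range(6)}
print({beta: is_prime(n) for beta, n in fermat.items()})
# {0: True, 1: True, 2: True, 3: True, 4: True, 5: False}
```

(The case β = 5 terminates quickly because 641 divides 2³² + 1.)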
Exercise 33B. Show that for primes p, the cyclotomic polynomial cp (x)
is equal to

Σ_{i=0}^{p−1} x^i = (x^p − 1)/(x − 1) .
Deduce that cp (x + 1) has all but the top coefficient divisible by p, and
the bottom one not divisible by p2 . Prove that an integer polynomial with
these divisibility properties is irreducible in Z[x]—Eisenstein’s Criterion,
given also at the end of 20. (Argue as in the proof of Gauss’ Lemma 20.1.)
Conclude that cp (x) is irreducible in Q[x]. (This is somewhat simpler than the
given proof of 33.1, don’t you think?)
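For a single prime, the divisibility pattern in Exercise 33B can be seen directly from binomial coefficients, since cp (x + 1) = ((x + 1)^p − 1)/x has x^k-coefficient C(p, k + 1). A small check, using p = 7 as an illustrative choice:

```python
from math import comb

# Coefficients of c_p(x+1) = ((x+1)^p - 1)/x, constant term first:
# the x^k coefficient is the binomial coefficient C(p, k+1).
p = 7
coeffs = [comb(p, k + 1) for k in range(p)]
print(coeffs)                                # [7, 21, 35, 35, 21, 7, 1]
print(all(c % p == 0 for c in coeffs[:-1]))  # True: all but top divisible by p
print(coeffs[0] % p**2 != 0)                 # True: bottom not divisible by p^2
```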
34. Primitive elements exist in characteristic zero.
A simple extension may be written F (γ) ⊃ F for various choices of γ,
each of which is called a primitive element for the extension. For example,
√2 + √3 is a primitive element for Q(√2, √3) ⊃ Q : We have

(√2 + √3)³ − 9(√2 + √3) = 2√2 ,

yielding √2 ∈ Q(√2 + √3). Thus

√3 = (√2 + √3) − √2 ∈ Q(√2 + √3),

giving Q(√2, √3) ⊂ Q(√2 + √3) , the opposite inclusion being obvious.
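A one-line numerical check of the identity above (purely illustrative, not a proof):

```python
import math

# With gamma = sqrt(2) + sqrt(3), check gamma^3 - 9*gamma = 2*sqrt(2).
gamma = math.sqrt(2) + math.sqrt(3)
print(abs(gamma**3 - 9 * gamma - 2 * math.sqrt(2)) < 1e-9)  # True
```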
Theorem 34.1. In characteristic zero, any finite extension K ⊃ F is
simple.
Remarks. At first glance the theorem seems both surprising and perhaps
also not especially useful. [For most purposes, the notation Q(√2, √3)
says more about what it denotes than does Q(√2 + √3).] But later we'll use
the theorem several times in crucial situations. It becomes less surprising
several sections hence when, as a byproduct of the fundamental theorem of
Galois theory, we see the (perhaps even more surprising) fact that any finite
extension K ⊃ F in characteristic zero admits only finitely many intermediate fields E; that is, fields with F ⊂ E ⊂ K. (When [K : F ] > 2, there
are certainly infinitely many F -subspaces V with F ⊂ V ⊂ K.) Thus with
infinitely many γ ∈ K \ F (when K ≠ F ), it seems reasonable that each of
the intermediate fields, including K itself, should be F (γ) for infinitely many
γ.
First, here is an example to show that the assumption of characteristic
zero is not made merely because we can’t think of a proof without it. The
example necessarily involves infinite fields of characteristic p since we already
know that any extension involving finite fields is simple. Let K = Fp (α, β)
where {α, β} is algebraically independent over Fp , and let F = Fp (αp , β p ).
Then [K : F ] = p², an F -basis for K being

{ α^i β^j : 0 ≤ i, j < p } .
(Check this!) All γ ∈ K have the form

γ = Σ uij α^i β^j / Σ vij α^i β^j ;  uij , vij ∈ Fp .

Since (s + t)^p = s^p + t^p in characteristic p,

γ^p = Σ uij α^(pi) β^(pj) / Σ vij α^(pi) β^(pj) ∈ F .

(We used that u^p = u in Fp , but this is irrelevant; any field of characteristic
p could replace Fp in this example.) Thus γ is algebraic of degree at most p
over F (in fact, of degree p or 1), and so K ≠ F (γ) for any γ.
Exercise 34A. In this example, show that there are infinitely many E
with F ⊂ E ⊂ K . HINT: Consider F (α + θβ) for θ ∈ F.
Challenge. Is there a logical connection between simplicity and finiteness of the set of intermediate fields? See Appendix B.
Proof of 34.1. By 26.6, K = F (α1 , α2 , . . . , αn ) for some αi ∈ K.
It therefore suffices to prove the theorem for finite extensions of the form
F (α, β) ⊃ F. [Then taking (α, β) = (αn−1 , αn ) and changing F to F (α1 , · · · , αn−2 )
yields K = F (α1 , . . . , αn−2 , γ) for some γ; now take (α, β) = (αn−2 , γ) to decrease the number of generators to n − 2; etc.]
It suffices to find λ ∈ F such that, if γ = α + λβ, then β ∈ F (γ). [For
then α = γ − λβ ∈ F (γ), so F (α, β) ⊂ F (γ); and the reverse inclusion is
obvious.]
Let α and β have minimal polynomials a(x) and b(x), respectively, over
F. Choose any λ ∈ F which disagrees with (β − β̃)^(−1) (α̃ − α), for all roots α̃
of a(x), and all roots β̃ ≠ β of b(x) [in some splitting field for a(x)b(x) over
F ]. The choice is possible, since F is infinite. Let
h(x) = a(γ − λx) ∈ F (γ)[x] .
Then
h(β) = a(γ − λβ) = a(α) = 0 ,
and, for all β̃ as above,

h(β̃) = a(γ − λβ̃) ≠ 0 ,

since

γ − λβ̃ = α + λ(β − β̃) ≠ α̃
for any root α̃ of a(x). Thus h(x) and b(x) have β as a common root, but
no other common roots in any extension of F (γ). The minimal polynomial,
m(x), of β over F (γ) therefore divides both h(x) and b(x). Being irreducible,
m(x) has distinct roots, since the characteristic is zero. These roots will
all be common to h(x) and b(x), and so m(x) has only one root [in fact
m(x) = x − β], and β ∈ F (γ), as required.
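The proof's construction can be followed numerically in the opening example. The sketch below takes α = √2, β = √3 and the (assumed suitable) choice λ = 1, and checks that β is the only common root of h(x) = a(γ − λx) and b(x):

```python
import math

# F = Q, alpha = sqrt(2), beta = sqrt(3); minimal polynomials are
# a(x) = x^2 - 2 and b(x) = x^2 - 3.  lambda = 1 is an illustrative
# choice that happens to satisfy the condition on lambda in the proof.
alpha, beta = math.sqrt(2), math.sqrt(3)
lam = 1
gamma = alpha + lam * beta

a = lambda x: x * x - 2             # minimal polynomial of alpha
b_roots = [beta, -beta]             # roots of b(x) = x^2 - 3
h = lambda x: a(gamma - lam * x)    # h(x) = a(gamma - lambda*x)

# beta itself is the only common root of h(x) and b(x):
common = [r for r in b_roots if abs(h(r)) < 1e-9]
print(common)
```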
IV. Galois Theory
The main theoretical content here is in sections 35 and 38. In the former,
we introduce the Galois group, and derive enough theory to give a group theoretic condition which is necessary for a polynomial equation in one variable
to be solvable by radicals and field operations: namely, the condition that
this Galois group of the splitting extension for the polynomial is soluble. This
is used in the following two sections: firstly to prove that, when dealing with
polynomials of degree greater than 4, no formula of that form can exist for
any base field; and then to exhibit a specific rational polynomial whose roots
can't be expressed in terms of its coefficients using only + , − , × , ÷ and
ⁿ√ . Then in 38 we give the ‘complete story’, a 1-1 correspondence between
intermediate fields of a given extension and subgroups of the corresponding
Galois group. Mostly we stick to the case of characteristic zero, but in such a
way that tacking on the extra subtleties for general characteristic goes quite
smoothly and efficiently. (See Appendix B.) Other applications of the Galois
correspondence include completing the proof concerning the constructibility
of regular n-gons, and giving another proof of the fundamental theorem of
(19th century) algebra. In Appendix C, it’s shown that solubility is also
sufficient for solvability.
35. The Galois group.
Galois’ big idea was, in essence, to introduce groups into field theory, as
follows.
Definition. For each field extension K ⊃ F , let AutF (K) denote the set
of those field isomorphisms θ : K → K for which θ(a) = a for all a ∈ F .
Such a θ is called an automorphism of K fixing elements of F .
Note. When F is the prime subfield, the condition θ(a) = a is easily
deducible, since θ(1) = 1.
Exercise 35A. Prove that the condition θ(a) = a is equivalent to F -linearity of θ.
Proposition 35.1. Using composition as operation, the set AutF K becomes a group.
Proof. This is a routine verification. Check that AutF K contains all
three of : the inverse of such a θ, the composition of two such θ, and the
identity map of K.
Proposition 35.2. If h(x) ∈ F [x] and θ ∈ AutF K, then θ maps to itself
the set of those roots of h(x) which happen to be in K.
Proof. If h(x) = Σ ai x^i , then for any b ∈ K,

θ(h(b)) = Σ θ(ai b^i ) = Σ θ(ai )θ(b)^i = h(θ(b)) ,

since θ(ai ) = ai . Thus h(b) = 0 implies that h(θ(b)) = 0.
Proposition 35.3. Each element of AutF (F (b1 , . . . , bk )) is uniquely determined by its values on b1 , . . . , bk ; i.e. if θ(bi ) = θ̃(bi ) for all i, then the
elements θ and θ̃ are equal.
Proof. For any ‘scalars’ aI = a_(i1 ,···,ik ) ∈ F , almost all zero, let

c = Σ_I aI b1^(i1) · · · bk^(ik) .

Then

θ(c) = Σ_I aI θ(b1 )^(i1) · · · θ(bk )^(ik) = Σ_I aI θ̃(b1 )^(i1) · · · θ̃(bk )^(ik) = θ̃(c) .

Now every element of F (b1 , . . . , bk ) can be written as c^(−1) d for elements c and
d as above. Then

θ(c^(−1) d) = θ(c)^(−1) θ(d) = θ̃(c)^(−1) θ̃(d) = θ̃(c^(−1) d) .

Thus θ = θ̃.
Remark. It is immediate from the last two results that AutF (K) is a
finite group when the extension K ⊃ F is finite—for K is then generated
over F by finitely many algebraic elements, and there are only finitely many
possible elements to be images of each of them, by 35.2, so 35.3 allows only
finitely many possibilities for automorphisms.
Definition. For each g(x) ∈ F [x], the Galois group of g(x) over F is
defined to be the group AutF (K) for the splitting field K of g(x) over F .
This is a finite group.
Exercise 35B. Show that the Galois group of g(x) over F is independent
of the choice of splitting field. More precisely, any choice of an isomorphism
between two splitting extensions induces in a natural way an isomorphism
between the corresponding two automorphism groups.
The following is now immediate from 35.2 and 35.3, since the splitting
field is generated over F by the roots of g(x).
Theorem 35.4. Each element of the Galois group of g(x) restricts to a
self-bijection of the set of roots of g(x), and is completely determined by this
permutation of the roots.
Thus the Galois group becomes identified with a subgroup of Sk , the symmetric group, once a list, b1 , . . . , bk , of all distinct roots of g(x) is given. (So
k ≤ deg g, possibly strictly when g(x) is reducible or when the characteristic
is not zero—recall that an irreducible polynomial in characteristic zero has
no repeated roots).
Example. Let F = Q , g(x) = x³ − 2 and ω = e^(2πi/3). Then the splitting
field of x³ − 2 over Q is K = Q(ω , ³√2). Furthermore, [K : Q] = 6, a
Q-basis for K being

{ 1 , ω , ³√2 , ³√2 ω , ³√4 , ³√4 ω } .

Complex conjugation fixes 1 , ³√2 and ³√4, and, in the other three basis
elements, replaces ω by ω² = −1 − ω. So it maps K to itself. This gives an
element of AutQ (K) which acts as a 2-cycle

{ ³√2 ω ↔ ³√2 ω² and ³√2 fixed }

on the set of roots of x³ − 2. To show that AutQ (K) is as large as 35.4
allows (i.e. is isomorphic to S3 ), it remains only to see that the 2-cycle
{ ³√2 ↔ ³√2 ω and ³√2 ω² fixed } can be realized as the restriction of an automorphism θ of K (because S3 is generated, for example, by { (12), (23) }).
Such a θ would have to fix 1 (of course) and to interchange ω and ω², since

θ(ω) = θ( (³√2)^(−1) (³√2 ω) ) = θ(³√2)^(−1) θ(³√2 ω) = (³√2 ω)^(−1) (³√2) = ω² .

Since ω² = −ω − 1 and ³√4 maps to ³√4 ω², it follows that θ is the Q-linear
bijection

a0 + a1 ω + (b0 + b1 ω)(³√2) + (c0 + c1 ω)(³√4) ↦
[(a0 − a1 ) − a1 ω] + [b1 + b0 ω](³√2) + [−c0 + (c1 − c0 )ω](³√4) .

A slightly tedious calculation shows that this map is indeed ‘homomorphic’
with respect to multiplication. Alternatively S3 is also generated by any set
{ 2-cycle, 3-cycle }. So instead of worrying about θ, one may realize a 3-cycle as the restriction of an automorphism φ as follows. Note that x³ − 2
is irreducible also in Q(ω)[x]. Now take φ to be the unique Q(ω)-linear
isomorphism

Q(ω)(³√2) −→ Q(ω)(³√2 ω)

mapping ³√2 to ³√2 ω, as given by the simple extension principle.
Exercise 35C. Carry out all of the details for the existence of θ and φ
immediately above.
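The purely permutation-theoretic part of the example—that a 2-cycle and a 3-cycle generate S3—can be sketched by brute force; the indexing of the roots below is an illustrative convention, not from the text:

```python
# Index the roots of x^3 - 2 as 0: cbrt(2), 1: cbrt(2)*w, 2: cbrt(2)*w^2.
# Complex conjugation restricts to the 2-cycle fixing root 0, and phi to
# a 3-cycle.  Closing under composition confirms they generate all of S3.
conj = (0, 2, 1)                   # swaps roots 1 and 2
phi = (1, 2, 0)                    # root i -> root i+1 (mod 3)

def compose(p, q):                 # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

group = {(0, 1, 2)}                # start from the identity
while True:
    bigger = group | {compose(p, g) for p in group for g in (conj, phi)}
    if bigger == group:
        break
    group = bigger
print(len(group))  # 6 = |S3|
```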
This example shows that the Galois group is in general not abelian. A
famous problem, still unsolved in 1994, asks whether every finite group occurs
(up to isomorphism) as the Galois group over Q of at least one g(x) ∈ Q[x].
Every finite abelian group does occur (see 38DIII). Quite a few of them
occur as follows:
The Galois group of xⁿ − 1 over Q is isomorphic to Zn× , the group, of order
Φ(n), of invertibles in the ring Zn (see 35D below).
The structure of Zn× as a product of finite cyclic groups was determined earlier
in Section 15.
To see how both (abelian) groups of order 4 occur, here are details for
the cases n = 5 and n = 12. We have Φ(5) = 4 = Φ(12).
Let n = 5 and ω = e^(2πi/5), so that K = Q(ω) has basis { 1, ω, ω², ω³ }
over Q. By 35.3, each element of AutQ (Q(ω)) is determined by its value on
ω. This value lies in { ω, ω², ω³, ω⁴ }, which is the set of roots of 1 + x +
x² + x³ + x⁴ = c5 (x), of which K is also the splitting field. Thus the Galois
group here has at most four elements. If there were a field automorphism θ
with θ(ω) = ω², then

θ(ω²) = (θ(ω))² = (ω²)² = ω⁴ = −1 − ω − ω² − ω³ ;
θ(ω³) = (ω²)³ = ω⁶ = ω ;
θ(ω⁴) = (ω²)⁴ = ω⁸ = ω³ .

As in the last example, the values of θ(ω^i ) for 0 ≤ i ≤ 3 determine a Q-linear
map which may be checked directly to be in AutQ (K). In this instance, a
much simpler argument for the existence of θ is that, by the simple extension
principle, it is the unique isomorphism from Q(ω) to Q(ω²) [which happens
to equal Q(ω)], mapping ω to ω², using the fact that, over Q, the numbers
ω and ω² have the same minimal polynomial, c5 (x). The above calculations
show that, on the set of roots of c5 (x), the restriction of θ is the 4-cycle

( ω ↦ ω² ↦ ω⁴ ↦ ω³ ↦ ω ) .

And so the Galois group is { e, θ, θ², θ³ }, which is cyclic of order four. It is
therefore isomorphic to Z5× , as was claimed. It follows that θ³ acts as another
4-cycle, whereas θ² acts as a product of two 2-cycles {ω ↔ ω⁴ , ω² ↔ ω³ }.
Now let n = 12 and ω = e^(2πi/12), so that K = Q(ω) has Q-basis { 1, ω, ω², ω³ }.
This time the Galois group (of both c12 (x) = x⁴ − x² + 1 and x¹² − 1 ) should
be thought of as a permutation group on the set { ω, ω⁵, ω⁷, ω¹¹ } of all
the roots of c12 (x). It has order four again, with elements determined by

φ(ω) = ω⁵ ;  ψ(ω) = ω⁷ ;  (φ ◦ ψ)(ω) = φ(ω⁷) = (ω⁵)⁷ = ω¹¹ .

This time each of φ, ψ and φ ◦ ψ acts as a product of two disjoint transpositions: for example, φ is {ω ↔ ω⁵ ; ω⁷ ↔ ω¹¹ }. Thus we have

φ² = ψ² = e  and  φ ◦ ψ = ψ ◦ φ ;

AutQ (K) = { e, φ, ψ, φ ◦ ψ } ≅ C2 × C2 ≅ Z12× .
Exercise 35D. Let F be any field and let ω be a primitive nth root of
unity in some extension field. Define a map

Γ : AutF (F (ω)) −→ Zn×

by Γ(θ) = [k]n when θ(ω) = ω^k . Using the routine established in the above
examples, prove that Γ is well-defined; a morphism of groups; and injective.
When F = Q, use Gauss' theorem 33.1 on the irreducibility of cn (x) in Q[x]
to prove that Γ is also surjective.
Doing this simple exercise establishes the previous claim, and also the
following, whose proof we include since the result is crucial later.
Proposition 35.5. For any F and any root, ω, of unity in an extension
field, the group AutF (F (ω)) is abelian; i.e. the Galois groups over all fields
of all the polynomials xⁿ − 1 are abelian.
Proof. For two elements φ and ψ, let φ(ω) = ω^k and ψ(ω) = ω^ℓ . Then

(φ ◦ ψ)(ω) = φ(ω^ℓ ) = φ(ω)^ℓ = (ω^k )^ℓ = (ω^ℓ )^k = (ψ ◦ φ)(ω) .
Since group elements here are determined by their effects on ω, we have
φ ◦ ψ = ψ ◦ φ, as required.
The reader may have noticed that, in all of the examples so far, we have
|AutF (K)| = [K : F ]. This is always true when K is a splitting field over
F . It can be helpful when trying to calculate a Galois group (usually not an
easy task), but we'll delay its proof to the section after next. It is not true
for general K. For example |AutQ (Q(³√2))| = 1, since any group element
must map ³√2 to itself [neither of the other roots of x³ − 2 being in Q(³√2)],
and since group elements here are determined by their effects on ³√2.
Another family of examples where abelian Galois groups occur is for xⁿ − λ
over E, where E contains both λ and a primitive nth root, ω, of unity. The
splitting field is E(µ) for any µ with µⁿ = λ, since the list of roots of xⁿ − λ
is then µ, ωµ, ω²µ, · · · , ω^(n−1) µ.
Proposition 35.6. With the above data, the group AutE (E(µ)) is
abelian.
Proof. Any element φ of the Galois group permutes the above roots,
and is determined by its effect on µ since φ(ω^k µ) = ω^k φ(µ) (or, more simply,
since µ generates E(µ) over E). For elements φ and ψ, let φ(µ) = ω^i µ and
ψ(µ) = ω^j µ. Then

(φ ◦ ψ)(µ) = φ(ω^j µ) = ω^j φ(µ) = ω^j ω^i µ = ω^i ω^j µ = (ψ ◦ φ)(µ) ,

so φ ◦ ψ = ψ ◦ φ, as required.
Exercise 35E. By considering the least i > 0 for which there exists φ
with φ(µ) = ω^i µ, show that the Galois group in 35.6 is actually cyclic.
Note how the first example (the splitting field of x³ − 2 over Q) is built
up in two steps corresponding to the two previous propositions:

F = Q ; ω = e^(2πi/3) ; E = Q(ω) ; µ = ³√2 ;

Q ⊂ Q(ω) ⊂ Q(ω, µ) = K .

This tower yields a subnormal series with abelian quotients,

S3 ⊳ A3 ⊳ {e},

when L ↦ AutL (K) is applied.
Let us return to the theory to generalize this, giving a minimal route to
understanding the famous assertion of Ruffini, first proved in all detail by
Abel: equations of degree greater than four cannot in general be solved using
only ‘radicals’ (nth roots) and field operations.
Proposition 35.7. A tower
F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fn
of field extensions yields a decreasing sequence
AutF0 (Fn ) ⊃ AutF1 (Fn ) ⊃ · · · ⊃ AutFn (Fn ) = {e}
of groups.
Proof. This has been stated in the form in which it usually arises, but
the content is just the case n = 2: if F0 ⊂ F1 ⊂ F2 , then AutF1 (F2 ) is a subgroup of AutF0 (F2 ), and AutF2 (F2 ) is the trivial group. These are obvious
from the definitions.
N.B. From here onwards, except where the contrary is mentioned explicitly, all fields will have CHARACTERISTIC ZERO and all EXTENSIONS ARE FINITE.
The proof of the next theorem shows that the existence of primitive elements can be useful. This theorem would definitely need to be modified if
we were allowing infinite extensions and/or arbitrary characteristic. It singles out a class of extensions named because of the connection with normal
subgroups given in 35.9ii)a) ahead.
Definition. A normal extension is one for which any one (and therefore
all) of the conditions in the following theorem hold.
Theorem 35.8. Given K ⊃ F , a finite extension in characteristic zero,
the following conditions are equivalent.
i) For all α ∈ K \ F , there exists θ ∈ AutF (K) with θ(α) ≠ α.
ii) For each irreducible g(x) ∈ F [x], either all the roots of g(x) are in K,
or none of them are.
iii) For all α such that K = F (α), the splitting field over F of the minimal
polynomial of α over F is K.
iv) The field K is the splitting field over F of at least one polynomial
g(x) ∈ F [x].
v) If F ⊂ K ⊂ L, then all F -linear field maps φ : K → L satisfy φ(K) =
K.
Remarks. a) It follows from v) that for all ψ ∈ AutF (L), we have
ψ(K) = K. Does this imply v)?
b) In v), it is not important whether L ⊃ K is restricted to finite extensions or not: the unrestricted v) will be deduced, and the restricted v) will
be assumed, in the following proof.
Proof. i) ⇒ ii): Assume that g(x) has a root in K and let α1 , · · · , αr be
all such (distinct) roots which are in K. Let
h(x) = (x − α1 )(x − α2 ) · · · (x − αr ),
so that h(x) divides g(x) in K[x]. It suffices to prove that h(x) ∈ F [x],
since irreducibility there of g(x) then implies that h(x) and g(x) are associates, as required. (Recall that g(x) = h(x)k(x) with g(x) and h(x) in F [x]
implies that k(x) ∈ F [x].) But the coefficients, ±ei (α1 , · · · , αr ), of h(x),
where ei is the ith elementary symmetric polynomial (see 22), are fixed by
all θ ∈ AutF (K), since θ permutes {α1 , · · · , αr }. By i), these coefficients
must be in F , as required.
ii) ⇒iii): If g(x) denotes the minimal polynomial of α over F , then by
ii), all of its roots are in K, which is generated over F by one root, and so
K is the splitting field of g(x).
iii) ⇒iv): This is immediate from the existence of primitive elements,
proved in the previous section.
iv) ⇒v): The set of roots in K of any h(x) ∈ F [x] is mapped by φ to
itself, by the argument proving 35.2. Now choose any h(x) such that K is
its splitting field. Then φ maps a set of generators over F for K to itself,
and therefore maps all of K to itself.
v) ⇒ ii) ⇒i): The proof of 35.8 is concluded by two arguments using the
isomorphism extension properties.
Assuming v), suppose given an irreducible g(x) in F [x] with a root, α, in
K. Let β ∈ M be any root of g(x) in any extension M of K. We must prove
that β ∈ K. Construct a commutative diagram where maps φ1 and φ send
α to β :

F −−→ F (α) −−→ K
id ↓      φ1 ↓      ↓ φ
F −−→ F (β) −−→ M −−→ L
The simple extension principle, 28.5, gives φ1 uniquely. The general extension principle, 28.6, gives a finite extension L ⊃ M and a field map φ. Since
K ⊂ M , we have K ⊂ L, and thus φ(K) = K by v). Since α ∈ K and
φ(α) = β, we get β ∈ K, as required.
Assuming ii) and given α ∈ K \ F , let g(x) ∈ F [x] be the minimal
polynomial of α over F . Choose another root β ≠ α. This can be done since
α ∉ F , so the degree of g(x) is not 1 , and its roots are distinct by 29.4
because the characteristic is zero. By ii), the element β is also in K. Now
construct a commutative ladder of isomorphisms mapping α to β :

F −−→ F (α) −−→ K
id ↓      θ1 ↓      ↓ θ
F −−→ F (β) −−→ K

This clearly suffices to prove i), since θ(α) = β ≠ α. The existence of θ1
is clear from the simple extension principle, 28.5. As for θ, note that since ii)
implies iv), K is a splitting field for some h(x) ∈ F [x] over F . Therefore it is
a splitting field for the same polynomial h(x) over F (α) and also over F (β).
So the existence of θ is immediate from 28.8, the splitting field uniqueness
principle.
Exercise 35F. Prove that K ⊃ F is a normal extension if and only if
there exists an extension L ⊃ K for which v) holds with L ⊃ F normal.
Remark. This completes the proof of 35.8. But here is a proof that
K in the last paragraph is a splitting field [i.e. a proof that ii) implies
iv) ] which doesn't use the existence of primitive elements. Write K as
F (α1 , · · · , αℓ ), and let hi (x) be the minimal polynomial of αi over F . Let
h(x) = h1 (x) · · · hℓ (x). By ii), K contains all roots of each hi (x), and so,
of h(x). It is generated over F by (a subset of) these roots, completing the
proof.
This, and examination of the proof of 35.8, shows that, without the
assumption of characteristic zero, the conditions in 35.8 satisfy
i) =⇒ ii) ⇔ iv) ⇔ v) =⇒ iii) .
In arbitrary characteristic, the words normal extension usually mean ii), iv),
v), whereas Galois extension refers to the stronger condition i). See the
remarks beginning Section 38. Now let’s return to the world of characteristic
zero.
Remark. It is an unfortunate fact that when E ⊃ F and K ⊃ E are
both normal, it may happen that K ⊃ F is not normal.
Exercise 35G. Show that Q ⊂ Q(√2) ⊂ Q(⁴√2) is an example illustrating the remark above, first showing that any extension of degree 2 is
normal.
Exercise 35H. The above example shows that there must be something
wrong with the following ‘proof’ that
{E ⊃ F and K ⊃ E both normal implies that K ⊃ F is normal}
Find the error :
Let α ∈ K \ F . By 35.8i), we need only find θ ∈ AutF (K) with θ(α) ≠ α. If
α ∈ K \ E, then 35.8i) yields θ ∈ AutE (K) with θ(α) ≠ α, as required, since
K ⊃ E is normal. If α ∈ E \ F , choose φ ∈ AutF (E) with φ(α) ≠ α. Since
K is a splitting field over E [by 35.8iv)], use the uniqueness of splitting fields
to extend φ to an automorphism θ from K to itself, as required !!?
Proposition 35.9. Given a ‘short’ tower F ⊂ E ⊂ K of finite extensions
in characteristic zero, we have the following.
i) If K ⊃ F is normal, then so is K ⊃ E .
ii) If K ⊃ F and E ⊃ F are normal, then
a) AutE (K) is a normal subgroup of AutF (K), and
b) restricting defines an isomorphism of groups
AutF (K)/AutE (K) −→ AutF (E) .
Proof. i) This is immediate from either iv) or v) of 35.8.
ii) If θ ∈ AutF (K), then θ maps E to E by 35.8v), and so restriction
defines a function
AutF (K) −→ AutF (E) ,
which is evidently a morphism of groups. Its kernel is by definition AutE (K),
proving a). The function is surjective, as required for b), since 28.8 allows
any φ ∈ AutF (E) to be extended to an automorphism of K [which is a
splitting field over E by part i) ].
Recall from Section 11 that a finite group G is soluble if and only if there
exists a subnormal series
G = G0 ⊳ G1 ⊳ G2 ⊳ · · · ⊳ Gℓ−1 ⊳ Gℓ = {e}
such that each Gi /Gi+1 is cyclic. But it suffices to replace cyclic by abelian
as noted in 13F, using the structure theorem for finite abelian groups.
It is now easy to see that, for all F of characteristic zero, all λ ∈ F and
all n ≥ 1, the Galois group over F of xⁿ − λ is soluble : its splitting field K
tops a tower

F ⊂ F (ω) ⊂ F (ω, µ) = K ,

where µⁿ = λ and ω is a primitive nth root of unity. Now F (ω) is the splitting
field over F of xⁿ − 1, so 35.9ii) applies, giving

AutF (K) ⊳ AutF (ω) (K) ⊳ {e}

and

AutF (K)/AutF (ω) (K) ≅ AutF (F (ω)) .

The latter group and AutF (ω) (K) are both abelian, by 35.5 and 35.6 respectively, and so AutF (K) is soluble (of ‘length’ ≤ 2), as required.
We wish to generalize this to equations which can be solved using only
various nth roots and the field operations. Before proceeding, here is a small
technicality needed later.
Definition. Given an (algebraic, but we’re assuming finite, remember?)
extension K ⊃ F and elements α, β ∈ K, we say that α and β are conjugate
over F if and only if they have the same minimal polynomial over F .
Lemma 35.10. i) With notation as above, if α and β are conjugates,
then so are α^n and β^n for each n ≥ 1.
ii) Given F ⊂ E ⊂ K, with α and β in K being conjugate over F , suppose
that E ⊃ F is normal and α^n ∈ E for some n ≥ 1. Then β^n ∈ E.
Proof. i) Let f (x) be the minimal polynomial of α (or β) over F . Let
g(x) be the minimal polynomial of α^n over F , and let h(x) = g(x^n ). Then
h(α) = g(α^n ) = 0, so f (x) divides h(x) in F [x]. Thus

g(β^n ) = h(β) = 0 ,

and so g(x) is also the minimal polynomial of β^n , as required.
ii) Since E ⊃ F is normal, and g(x) has root α^n ∈ E, we see that β^n is
also in E by 35.8ii).
Definition. Let F0 be a field of characteristic zero. A polynomial g(x) in
F0 [x] is said to be solvable by radicals over F0 if and only if there is a tower
of field extensions
F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fm ,
for some m, such that
i) g(x) splits in Fm —equivalently, the splitting field of g(x) over F0 is a
subfield of Fm ;
ii) for each i > 0, we have Fi = Fi−1 (µi ) for some µi ∈ Fi such that
µi^ni ∈ Fi−1 for some ni > 0—equivalently, each Fi is generated over Fi−1 by
a root of xⁿ − λ, where n > 0 and λ ∈ Fi−1 both depend on i.
Another formulation of this definition : F0 (µ1 , · · · , µm ) contains a splitting
field over F0 for g(x), for some elements µi , where µi has a positive power
in F0 (µ1 , · · · , µi−1 ) for each i.
This definition is the mathematically usable version of the statement that
all the roots of g(x) can be ‘expressed using only nth roots for various n, field
operations, and elements of F0 ’. Subsuming the example after the reminder
of the definition of soluble group, next comes the main theorem used to show
that some equations g(x) = 0 of degree greater than four are not susceptible to such ‘radical solutions’. The following two sections will go into detail
on this. The converse of the next theorem is also true, its proof using the
fundamental theorem three sections ahead. The converse has more ‘positive’ applications (neutralize nattering nabobery!). For example, Galois
theory can be used to give a systematic derivation of the ‘radical formulae’ for solving cubics and quartics. See Stewart, pp.161–164; Goldstein,
pp.316–322; Artin, pp.560–565; or Appendix C of this book.
Note that the phrase “over F0 ” in the definition above is very important.
If F0 is C or R, it follows from the fundamental theorem of (19th century)
algebra that every g(x) ∈ F0 [x] is solvable by radicals over F0 . But if g(x) ∈
Q[x], it may very well not be solvable by radicals over Q, as we shall see in
Section 37.
Theorem 35.11. If g(x) ∈ F0 [x] is solvable by radicals over F0 , then the
Galois group of g(x) over F0 is a soluble group.
Remark. Consistent, but not mutually consistent, usage of ‘soluble’ and
‘solvable’ will be found in British and U.S.ish (American?) texts, respectively.
Proof. Let K be the splitting field of g(x) over F0 . To prove, as required,
that AutF0 (K) is soluble, let µi , ni and
F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fm
be as in the last definition. We’ll produce a new tower
F0 ⊂ L0 ⊂ L1 ⊂ · · · ⊂ Lr = L .
ASIDE. The first tower has two possible defects for the purposes of the
proof:
i) Possibly Fi−1 contains no primitive ni-th root of unity. If it does, then
by 35.6, the extension Fi ⊃ Fi−1 is normal and AutFi−1 (Fi ) is abelian.
ii) Possibly Fm ⊃ F0 is not normal. If it is, then

AutF0 (K) ≅ AutF0 (Fm )/AutK (Fm )

by 35.9ii).
If neither defect is present, then the desired solubility of AutF0 (K) follows
from that of AutF0 (Fm ) by ii). The latter follows from i) since, by 35.9ii),

AutFi−1 (Fm )/AutFi (Fm ) ≅ AutFi−1 (Fi ) .
These arguments are used at the end of the proof below.
There are a number of texts in use where the present material seems to
be done in a quite efficient manner, even relative to what is here, just above
and below. Since this book prides itself on being a “bare bones”, succinct
presentation, the following explanation must be given. There are at least two
such texts where one or both of points i) and ii) above are completely missed;
a third, better known, text in which both i) and ii) are explicitly built into
the definition of solvability by radicals; and a very famous text in which the
author has decided to assume that every field under consideration contains
all needed roots of unity—this gets around i), but leaves one with insufficient
theory to see that some quintics in Q[x] are not solvable by radicals. As for
ii), adjusting the tower to make the top field normal over the bottom is left
as an exercise.
Definition. Any extension Fm ⊃ F0 as in ii) of the last definition is
called a radical extension. (This bears some analogy to the iterated quadratic
extensions which occurred in the study of geometrical constructions. The
existence of numbers relevant to such constructions and not in any iterated
quadratic extension proves non-constructibility results. The existence of algebraic numbers not in any radical extension will prove the non-existence of
‘radical formulae’ for solving polynomial equations.)
Lemma 35.12. If Fm ⊃ F0 is any radical extension, then there is an
extension L ⊃ Fm such that L ⊃ F0 is both radical and normal. Furthermore, this last extension can be chosen so that it has a radical tower whose
successive extensions all have abelian Galois groups.
Proof. Let gi (x) be the minimal polynomial of µi over F0 . Let n be
n1 · · · nm (or any common multiple of { n1 , · · · , nm }). Let L be a splitting
field of (xn − 1)g1 (x) · · · gm (x) over F0 which contains Fm . In other words,
L is the field generated over F0 by ω, a primitive nth root of unity, together
all the µi and all their conjugates over F0 . List the roots ν1 , · · · , νr of the
product g1 (x) · · · gm (x), in such a way that µ1 and all its conjugates occur
first, then µ2 and all of its conjugates, etc.
Now define L0 := F0 (ω), and, inductively,
Lj := Lj−1 (νj ) = F0 (ω, ν1 , ν2 , · · · , νj ) .
This gives the promised tower, with L ⊃ F0 certainly normal.
Now L0 is the splitting field of xn − 1 over F0 , so L0 ⊃ F0 is normal and,
by 35.5, the group AutF0 (L0 ) is abelian.
Fix j > 0, and let i be such that νj is µi or one of its conjugates over F0 .
For some k < j, the field Lk is generated over F0 by ω plus all of the µt and
their conjugates over F0 for t < i, i.e. it is the splitting field over F0 for the
product (xn − 1)g1 (x) · · · gi−1 (x). Since µi^(ni) ∈ Fi−1 ⊂ Lk , we have by 35.10 that νj^(ni) ∈ Lk ⊂ Lj−1 . Thus, since Lj−1 contains a primitive ni th root of unity (namely, some power of ω), it follows that Lj is the splitting field over Lj−1 for x^(ni) − νj^(ni) . By 35.6, the group AutLj−1 (Lj ) is abelian.
Continuation of the proof of 35.11. The proof is now completed (as
in the aside) by using 35.9 several times, as follows. The g(x)-splitting field
K is a subfield of L, since Fm ⊂ L. The short tower F0 ⊂ L0 ⊂ L of normal
extensions gives an abelian quotient
AutF0 (L)/AutL0 (L) ≅ AutF0 (L0 ) .
Each short tower Lj−1 ⊂ Lj ⊂ L of normal extensions gives an abelian
quotient
AutLj−1 (L)/AutLj (L) ≅ AutLj−1 (Lj ) .
Since the subnormal series
AutF0 (L) ▷ AutL0 (L) ▷ AutL1 (L) ▷ · · · ▷ AutLr (L) = {e}
has abelian quotients, it is immediate that AutF0 (L) is soluble. The short
tower F0 ⊂ K ⊂ L of normal extensions gives
AutF0 (L)/AutK (L) ≅ AutF0 (K) .
By 11F, the group AutF0 (K) is soluble, since it is the image of a soluble
group under a morphism of groups. This is what we had to prove.
Are you convinced that we have captured what you thought it meant for
an equation to be solvable by radicals? The only objection that I can think
of is that we are requiring all the roots, rather than just one root, to lie in a
radical extension. Let’s say that the phrase ‘g(x) has a solution by radicals’
means the latter. Then 35.12 gives an easy solution to the following.
Exercise 35I. Show that :
i) an irreducible polynomial has a solution by radicals if and only if it is
solvable by radicals;
ii) a polynomial has a solution by radicals if and only if at least one of its
irreducible factors is solvable by radicals;
iii) a polynomial is solvable by radicals if and only if all of its irreducible
factors are solvable by radicals.
36. The general equation of degree n.
Given any field F of characteristic 0, and an integer n > 0, form the field
F0 = F (a0 , a1 , · · · , an−1 ), where { a0 , · · · , an−1 } is algebraically independent
over F , and let
g(x) = xn + an−1 xn−1 + · · · + a1 x + a0 ∈ F0 [x] .
One refers to g(x) = 0 as the general equation of degree n over F. We are
thinking of the ai as ‘formal symbols’, so to speak. Let K be a splitting field
for g(x) over F0 . Of course, F0 ⊃ F and K ⊃ F are not finite extensions.
Write
(∗)
g(x) = (x − r1 )(x − r2 ) · · · (x − rn )
in K[x], so that K = F0 (r1 , · · · , rn ).
Theorem 36.1. The Galois group of g(x) over F0 is isomorphic to the
symmetric group Sn .
Remark. Since Sn is not soluble for n ≥ 5 by 12.1, it follows that
g(x) = 0 is not solvable by radicals for n ≥ 5. Informally, this means that
there can be no formula, involving only + , − , × , ÷ , kth roots
and the coefficients, for solving all polynomial equations of a given degree
(5 or greater), even for cleverly chosen F. The first accepted proof of this
famous result was found by Abel for degree 5. It put an end to 2,500 years of
attempts to find such formulae (except by a few cranks). Ruffini had earlier
produced a proof which turned out to also be substantially complete and
correct.
Proof. Multiplying out the right hand side of (∗) gives
(∗∗)
ai = ± en−i (r1 , · · · , rn )
by 22.4, so ai ∈ F (r1 , · · · , rn ), yielding
K = F (a0 , · · · , an−1 , r1 , · · · , rn ) = F (r1 , · · · , rn ) .
The proof is completed by showing that { r1 , · · · , rn } is algebraically
independent over F, since then, permutations σ ∈ Sn give automorphisms of
K by
g(r1 , · · · , rn )/h(r1 , · · · , rn ) ↦ g(rσ(1) , · · · , rσ(n) )/h(rσ(1) , · · · , rσ(n) ) .
Since these permute the set {r1 , . . . , rn }, they fix each ai by (∗∗), and so fix
all elements of F0 , as required.
Suppose then that h(r1 , · · · , rn ) = 0, where the polynomial
h(x1 , · · · , xn ) ∈ F [x1 , · · · , xn ] must be shown to be zero. Let
s(x1 , · · · , xn ) = ∏σ∈Sn h(xσ(1) , · · · , xσ(n) ) ∈ SymmF [x1 , · · · , xn ] .
Using 22.3, there is a unique s• (x1 , · · · , xn ) in F [x1 , · · · , xn ] with
s(x1 , · · · , xn ) = s• (e1 (x), e2 (x), · · · , en (x)) .
Then 0 = s(r1 , · · · , rn ) (since h(r) = 0)
= s• (e1 (r), · · · , en (r)) = s• (±an−1 , · · · , ±a0 ) .
Since the coefficients of s• are in F, and { ± a0 , · · · , ± an−1 } is algebraically independent over F, we get s• (x1 , · · · , xn ) = 0 in F [x1 , · · · , xn ].
Thus s(x1 , · · · , xn ) = 0, and so h(x1 , · · · , xn ) = 0, as required.
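The relation (∗∗) between the coefficients and the elementary symmetric polynomials can be spot-checked by machine. Here is a short Python sketch using exact rational arithmetic; the sample roots below are arbitrary choices, not from the text:

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def e(k, roots):
    # elementary symmetric polynomial e_k evaluated at the given roots
    return sum(prod(c) for c in combinations(roots, k))

def expand(roots):
    # coefficients [a0, ..., a_{n-1}, 1] of (x - r1)...(x - rn), lowest degree first
    coeffs = [Fraction(1)]
    for r in roots:
        coeffs = [Fraction(0)] + coeffs            # shift: multiply by x
        for i in range(len(coeffs) - 1):
            coeffs[i] -= r * coeffs[i + 1]         # then subtract r times the old polynomial
    return coeffs

roots = [Fraction(1), Fraction(-2), Fraction(3, 2)]   # arbitrary sample roots
n = len(roots)
coeffs = expand(roots)
for i in range(n):
    # (**): a_i = (-1)^(n-i) e_{n-i}(r1, ..., rn)
    assert coeffs[i] == (-1) ** (n - i) * e(n - i, roots)
```

The assertions pass for any choice of sample roots, which is the content of 22.4.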
Exercise 36A. Show that if some of the elements in an algebraically
independent set are replaced by their negatives, the set remains algebraically
independent (as used just above).
37. Radically unsolvable over Q.
The non-existence of a general formula leaves open the possibility that,
over some fields, every equation might be solvable by radicals, but involving
such a plethora of special cases that they couldn’t be welded together into
a ‘single formula’. Over any algebraically closed field, such as C or the
algebraic numbers, this is exactly what happens, since there’s no need to
build any tower at all. And over R, one has C = R(i) with i2 = −1 ∈ R, so
again, any g(x) ∈ R[x] is solvable by radicals over R.
Exercise 37A. Let B ⊃ A be a radical extension. Show that a polynomial
in A[x] being solvable by radicals over A and over B are equivalent.
But over Q and many other fields, one can write down explicit polynomials g(x) which are not solvable by radicals. Below is one of degree 5 over
Q. First here is a result which we could have proved earlier but didn’t need
then.
Theorem 37.1. If K ⊃ F is normal, then
|AutF (K)| = [K : F ] .
Note. The proof applies to finite normal extensions in characteristic
zero, or more generally, to any simple extension which is the splitting field of
some polynomial in F [x] which has no repeated root.
Exercise 37B. For the field K in the previous section, find an F0 -basis
(which must have “n!” elements by 37.1).
Proof. Choose a primitive element α, so that K = F (α). Let the
minimal polynomial p(x) over F of α have degree [K : F ] = n, say. Let
α = α1 , α2 , · · · , αn be the roots of p(x), which are distinct by 29.4, and
in K by 35.8 ii). It is clear, by considering degrees, that K = F (αi ) for all
i. Given θ ∈ AutF (K), we have θ(α) = αi for some i, and this determines θ,
by 35.2 and 35.3, so
|AutF (K)| ≤ n .
But for each i, the simple extension principle yields an isomorphism from
K = F (α) to K = F (αi ) sending α to αi , since α and αi have the same
minimal polynomial p(x), so we have equality above.
Theorem 37.2. Let g(x) ∈ Q[x] be an irreducible of degree 5 with exactly two roots in C \ R [and therefore three real roots]. Then the Galois
group of g(x) over Q is isomorphic to S5 .
Remark. Since S5 is not soluble, such a g(x) is therefore not solvable by
radicals over Q. For example, x5 − 4x + 2 will fill the bill; elementary calculus
shows that it has exactly three real roots, and Eisenstein’s criterion (20A)
with p = 2 gives irreducibility.
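The root count for x5 − 4x + 2 is easy to confirm by machine. The Python sketch below (sample points chosen ad hoc) counts sign changes; since the derivative 5x4 − 4 has only two real zeros, three sign changes give exactly three real roots:

```python
def g(x):
    return x**5 - 4*x + 2

# g'(x) = 5x^4 - 4 vanishes only at x = ±(4/5)^(1/4), so g has at most
# three real roots; three sign changes then give exactly three.
samples = [-2, -1, 0, 1, 2]
values = [g(x) for x in samples]          # [-22, 5, 2, -1, 26]
sign_changes = sum(1 for a, b in zip(values, values[1:]) if (a > 0) != (b > 0))
assert sign_changes == 3
```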
Proof. Let { α, ᾱ, α1 , α2 , α3 } be the roots of g(x), with the αi real.
Complex conjugation maps the splitting field, K, of g(x) to itself [by 35.8v),
taking L to be C, and K to be the splitting field within C of g(x)]. This
gives an element of AutQ (K) which acts as a 2-cycle, α ↔ ᾱ , on the set of
roots of g(x). Now |AutQ (K)| = [K : Q] by the previous theorem, and so it
is divisible by [Q(α) : Q] = 5, which happens to be a prime. Thus AutQ (K)
contains an element of order 5 by the Cauchy theorem 11.2. Only cycles can
have prime order in Sn [see 4.8], so AutQ (K) also has an element which acts as a 5-cycle on the set of roots. But S5 is generated by any { 2-cycle, 5-cycle } by 6C. Thus AutQ (K) is the full permutation group of the roots of
g(x), as required.
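The generation fact quoted from 6C can be verified by brute force: closing { 2-cycle, 5-cycle } under composition must produce all 120 elements of S5 . A Python sketch, with permutations encoded as tuples of images of 0, . . . , 4 (an encoding chosen here for convenience):

```python
def compose(p, q):
    # (p o q)(i) = p(q(i)); permutations as tuples of images of 0..4
    return tuple(p[q[i]] for i in range(5))

two_cycle = (1, 0, 2, 3, 4)       # the transposition (1 2)
five_cycle = (1, 2, 3, 4, 0)      # the 5-cycle (1 2 3 4 5)

# close the generating set under composition; in a finite group this
# closure is exactly the generated subgroup (inverses are positive powers)
group = {two_cycle, five_cycle}
while True:
    new = {compose(p, q) for p in group for q in group} - group
    if not new:
        break
    group |= new

assert len(group) == 120          # the whole of S5
```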
38. The Galois correspondence.
Passing from fields to their automorphism groups has an inverse, obtained
by forming the field of elements fixed by each member of a given group.
The precise version of this in the theorem below has many applications to
polynomial equations and elsewhere in field theory. For example, it follows
immediately that there are only finitely many intermediate fields for any
finite extension in characteristic zero, since a finite group certainly has only
finitely many subgroups.
In other books, the theorem below is often proved before the application
to solvability by radicals, since it is needed for the converse to 35.11. We
have already proved at least half of it.
The theorem has important analogues both beyond field theory, and
within field theory to non-finite extensions and to characteristic p. As for
the latter, the reader might wish to check that our proof below applies to
finite extensions with no restriction on characteristic, as long as the assumption of normality is taken to mean condition i) in 35.8. This should be fairly
credible in view of the remark after the proof of 35.8, indicating that i) implies all the other conditions. Note however that 37.1 then would need a new
proof. The phrase Galois extension is often used in connection with 35.8i),
rather than ‘normal extension’. That the equivalent 35.8 conditions ii), iv)
and v) do not imply i) without an additional assumption (namely separability
: each element in K is the root of a polynomial in F [x] which has no repeated
roots—see Appendix B) can be seen by means of the following example.
If K is a field of characteristic p, then the map α ↦ αp is a field map
K → K, and therefore injective, but not necessarily surjective if K is infinite.
The example in Section 34 shows this, but, more simply, take
K = Zp (β) ⊃ Zp (β p ) = F,
where β is transcendental over Zp . Then K ⊃ F satisfies iv) in 35.8 since
K is the splitting field of xp − β p over F. [Note that, in K[x], we have
xp − β p = (x − β)p .] But the extension does not satisfy 35.8i) since, for any
θ ∈ AutF (K), we have β p = θ(β p ) = θ(β)p , so θ(β) = β, since x ↦ xp is
injective. Thus θ is the identity map, and 35.8i) fails for every α ∈ K \ F.
Less relevant here, iii) does not imply { ii), iv), v) } in 35.8 for characteristic p : one can find a finite extension which is neither simple nor a splitting
field—for example,
Z3 (α, β) ⊃ Z3 (α3 , β 3 )
where { α, β } is algebraically independent over Z3 .
Exercise 38A. Complete the details of this example.
To avoid the trouble of going back through our arguments carefully, we’ll
just state the theorem in characteristic zero.
Fundamental Theorem 38.1. The Galois Correspondence. Let
K ⊃ F be a (finite) normal extension in characteristic zero. Denote by
IN T (F, K) the set of fields E with F ⊂ E ⊂ K. Denote by SG(F, K) the
set of subgroups of AutF (K). Define
G = GFK : IN T (F, K) −→ SG(F, K)
by G(E) := AutE (K). Define
F = FFK : SG(F, K) −→ IN T (F, K)
by
F(Γ) := { α ∈ K : θα = α for all θ ∈ Γ }
(called the ‘fixed field of Γ’). Then:
i) the maps G and F are mutually inverse bijections;
ii) they reverse inclusions : E ⊂ E′ ⇐⇒ G(E) ⊃ G(E′) ;
iii) we have |G(E)| = [K : E] ;
iv) the extension E ⊃ F is normal if and only if G(E) is a normal subgroup
of G(F ), in which case
AutF (E) ≅ G(F )/G(E) .
Remark. It follows from iii) that
[E : F ] = |G(F )|/|G(E)| .
Superficially more generally, the ‘relative distances’ in the towers
F ⊂ E1 ⊂ E2 ⊂ K
and
G(F ) ⊃ G(E1 ) ⊃ G(E2 ) ⊃ G(K) = {e}
are related as one would expect:
[E2 : E1 ] = |G(E1 )|/|G(E2 )| .
To see this, just write [E2 : E1 ] as [K : E1 ]/[K : E2 ], and apply iii). Alternatively, apply the Galois correspondence of K ⊃ E1 . In general
[E2 : E1 ] ≥ |AutE1 (E2 )| ,
by the first half of the argument in 37.1.
Exercise 38B. Deduce that, for a tower of fields F ⊂ E1 ⊂ E2 ⊂ K,
with K ⊃ F as in the theorem, we have :
E2 ⊃ E1 is normal ⇔ AutE2 (K) is a normal subgroup of AutE1 (K) .
Proof of the Galois correspondence theorem. To begin, it is a
routine verification to check that F is well-defined, i.e. that F(Γ) is a subfield
of K containing F.
(i) That F(G(E)) = E has already been proved: F(G(E)) ⊃ E is trivial,
and the reverse inclusion is immediate from condition i) of 35.8 for normal
extensions, K ⊃ E being such an extension by 35.9i). The non-obvious
inclusion for showing that G(F(Γ)) = Γ is G(F(Γ)) ⊂ Γ. Let E = F(Γ) and
[K : E] = n. Then
|G(F(Γ))| = |AutE (K)| = n
by 37.1, so it suffices to prove that |Γ| ≥ n. Write K = E(α) and define
g(x) = ∏θ∈Γ {x − θ(α)} ,
a polynomial in K[x] which has degree equal to |Γ|. Then, for any θ0 ∈ Γ,
g(x) = ∏θ∈Γ {x − θ0 θ(α)} ,
so θ0 fixes the coefficients of g(x). Since E = F(Γ), this implies that g(x) ∈
E[x]. But g(α) = 0, so g(x) is divisible by the minimal polynomial of α over
E. The latter has degree n, so degree(g(x)) ≥ n, as required.
(ii) This is obvious.
(iii) This is 37.1, once again using that K ⊃ E is normal.
(iv) In view of 35.9ii), it remains only to prove that if E ⊃ F is not a
normal extension (where E is a subfield of K), then AutE (K) is not a normal
subgroup of AutF (K). The contrapositive form of the proof in 35.8, that ii) implies i), proves the existence of θ ∈ AutF (K) with θ(E) ≠ E : { Choose an irreducible g(x) ∈ F [x] with roots α ∈ E and β ∉ E [since 35.8ii) fails
for the extension E ⊃ F ]; then construct
F ⊂ F (α) ⊂ K
‖      ↓ψ      ↓θ
F ⊂ F (β) ⊂ K
where ψ and θ are isomorphisms mapping α to β.} Then, by i) of the present
theorem, Autθ(E) (K) ≠ AutE (K), so the following lemma completes the
proof (as well as reproving the converse, 35.9ii)a), and describing how conjugacy of subgroups in AutF (K) behaves.)
Lemma 38.2. For any normal extension K ⊃ F, intermediate field E,
and θ ∈ AutF (K),
Autθ(E) (K) = θ ◦ AutE (K) ◦ θ−1 .
Proof. It is a routine verification that θ ◦ φ ◦ θ−1 ∈ Autθ(E) (K) for any
φ ∈ AutE (K). The reverse inclusion follows either from the fact that the
two groups have the same order (by earlier parts of the fundamental theorem), or else from the just proved inclusion, replacing θ by θ−1 and E by θ(E).
Each of the Galois groups which we calculated earlier gives an illustration
of the fundamental theorem. See also Section 40 for another example.
x2 + 1 over R :

        C              {e}
        ↑               ↓
        R          {e, complex conjugation}
More generally, the Galois correspondence for any quadratic (i.e. degree
2) extension takes this form. In fact this holds for any normal extension of
prime degree p (except that the group is cyclic of order p).
x5 − 1 over Q: (Here ω = e2πi/5 .)

        Q(ω)               {e}
          ↑                 ↓
     Q(ω² + ω³)         {e, θ²}
          ↑                 ↓
          Q           {e, θ, θ², θ³}
This example and the next one show the two possibilities which can occur
for normal extensions of degree 4 (since there are only two groups of order
4, up to isomorphism).
x12 − 1 over Q: (Here ω = e2πi/12 .)

               Q(ω)                              {e}
            /    |    \                       /    |    \
   Q(ω + ω⁵)   Q(ω²)   Q(ω + ω¹¹)      {e, φ}  {e, ψ}  {e, φ ◦ ψ}
            \    |    /                       \    |    /
                Q                          {e, φ, ψ, φ ◦ ψ}
Here we have

ω + ω⁵ = − (ω⁷ + ω¹¹) = ω³ = i ;
ω + ω¹¹ = ω + ω̄ = 2Re(ω) = √3 = − (ω⁷ + ω⁵) ;
Q(ω²) = Q(ω⁴) .
x3 − 2 over Q: (Here ω = e2πi/3 .)

                 Q(ω, ∛2)
            /      |       |       \
      Q(ω)    Q(∛2)   Q(∛2 ω)   Q(∛2 ω²)
            \      |       |       /
                      Q

                    {e}
            /      |        |        \
        A3    {e, (12)}  {e, (13)}  {e, (23)}
            \      |        |        /
                      S3
(x2 − 2)(x2 − 3) over Q: Note that the ‘lattices’ for x12 − 1 are isomorphic to the ones here. We use ± to denote the automorphism sending √2 to +√2 and √3 to −√3; similarly for ∓ and =.

             Q(√2, √3)                         {e}
            /    |    \                     /    |    \
      Q(√2)   Q(√3)   Q(√6)          {e, ±}  {e, ∓}  {e, =}
            \    |    /                     \    |    /
                Q                        {e, ±, ∓, =}
x4 − 2 over Q: Here τ maps ⁴√2 to itself and maps ⁴√2 i to −⁴√2 i; and σ maps ⁴√2 to ⁴√2 i and maps ⁴√2 i to −⁴√2 . The Galois correspondence pairs the intermediate fields with the subgroups as follows:

Q(⁴√2, ⁴√2 i) ↔ {e} ;

Q(⁴√2) ↔ {e, τ} ,   Q(⁴√2 i) ↔ {e, σ²τ} ,   Q(i, √2) ↔ {e, σ²} ,
Q(⁴√2 (1 + i)) ↔ {e, στ} ,   Q(⁴√2 (1 + i)³) ↔ {e, σ³τ} ;

Q(√2) ↔ {e, σ², τ, σ²τ} ,   Q(i) ↔ {e, σ, σ², σ³} ,   Q(√2 i) ↔ {e, σ², στ, σ³τ} ;

Q ↔ D4 ≅ {e, σ, σ², σ³, τ, στ, σ²τ, σ³τ} .
In the x5 − 1 example, one verifies that Q(ω 2 + ω 3 ) is the unique strictly
intermediate field by calculating that a0 + a1 ω + a2 ω 2 + a3 ω 3 is fixed by θ2
if and only if a1 = 0 and a2 = a3 (where each ai ∈ Q). The intermediate
fields in the other examples are similarly straightforward calculations using
the definition of the Galois group.
Large Exercise 38C. Do these calculations.
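The fixed-element calculation just described can be mechanized. The Python sketch below represents elements of Q(ω) by their coordinates in the basis 1, ω, ω², ω³, using ω⁴ = −1 − ω − ω² − ω³; the choice of generator θ : ω ↦ ω² is an assumption (2 is a primitive root mod 5), but any other primitive root gives the same θ²:

```python
from fractions import Fraction

# Elements of Q(w), w = e^(2*pi*i/5), as coordinate vectors (a0, a1, a2, a3)
# in the basis 1, w, w^2, w^3, using w^4 = -1 - w - w^2 - w^3.
def sigma(a, k):
    # the automorphism w -> w^k (k = 1, 2, 3, 4), applied to a
    out = [Fraction(0)] * 4
    out[0] = Fraction(a[0])
    for j in range(1, 4):
        m = (k * j) % 5                   # w^j -> w^(kj mod 5)
        if m == 4:                        # rewrite w^4 in the basis
            for t in range(4):
                out[t] -= a[j]
        else:
            out[m] += a[j]
    return out

theta = lambda a: sigma(a, 2)             # assumed generator of the Galois group
theta2 = lambda a: sigma(a, 4)            # theta squared, since w -> w^2 -> w^4

fixed = [Fraction(c) for c in (2, 0, 7, 7)]     # a1 = 0 and a2 = a3
moved = [Fraction(c) for c in (2, 1, 7, 7)]
assert theta2(fixed) == fixed
assert theta2(moved) != moved
assert theta(theta(theta(theta(moved)))) == moved   # theta has order 4
```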
Suppose one wished to compute [K : Q], where K = Q(e2πi/3 , e2πi/5 ) , the splitting field of (x3 − 1)(x5 − 1) over Q. The method of Section 26 narrows the answer down to 4 or 8, using the following diagram, where the arrows are labeled with information about the degree. The two inequalities each follow from the label on the parallel arrow by 26A. Let λ = e2πi/3 and ω = e2πi/5 .

                K
         ≤4 /      \ ≤2
        Q(λ)        Q(ω)
           2 \      / 4
                Q
To verify that 8 is the answer amounts to proving that λ ∉ Q(ω). If it
were in Q(ω), the x5 − 1 example above shows that Q(λ) = Q(ω 2 + ω 3 ), so
λ = a0 + a2 (ω 2 + ω 3 ) = a0 + a2 µ (say) ,
with ai ∈ Q. Now
µ2 = ω 4 + 2ω 5 + ω 6 = (−1 − ω − ω 2 − ω 3 ) + 2 + ω = 1 − µ .
Thus
0 = λ² + λ + 1 = (a0² + a2² + a0 + 1) + (2a0 a2 − a2² + a2 )µ .
It is now easy to see that no rationals a0 , a2 give zero for both bracketed
terms in the last expression.
Exercises 38D. (I) Complete the last argument.
(II) Show that, over Q, the polynomial (x5 − 1)(x7 − 1) has splitting field
of degree 24.
(III) See Goldstein pp. 250-251 for a proof [using Gauss’ theorem 33.1
on the irreducibility of cm (x)] of the special case of Dirichlet’s theorem, which
says that there are infinitely many primes in every sequence 1+k, 1+2k, 1+
3k, · · · (for k > 0) . Using this, the fundamental theorem, and the fact that,
for a primitive nth root, ω, of unity,
AutQ (Q(ω)) ≅ Z×n ,
show that, for any finite abelian group Γ, there exists a ‘short’ tower of
normal extensions
Q ⊂ E ⊂ Q(ω)
[for some such ω] such that AutQ (E) ≅ Γ.
Thus every finite abelian group is isomorphic to the Galois group over Q
of some polynomial. Some deeper related theorems are as follows.
Šafarevič (1954). Every finite soluble group is (up to isomorphism) the
Galois group over Q of some polynomial.
Hilbert (1892). For all n, the symmetric group Sn is the Galois group
over Q of some polynomial (in fact, · · · of some irreducible of degree n in
Q[x]).
The argument in 37.2 depends on 5 being a prime, but general n is harder.
The question of whether every finite group is the Galois group over Q of
some polynomial is a notorious unsolved problem (as of 1994).
Kronecker-Weber (1853, 1877). If E ⊃ Q is a (finite) normal extension with AutQ (E) abelian, then for some root, ω, of unity, the field E is a
subfield of Q(ω).
39. (Fermat prime)-gons are constructible.
Recall that Gauss’ theorem, 33.2, on the constructibility of regular n-gons was not completely proved. It remained to show that if p = 2r + 1 is a
Fermat prime, then the regular p-gon is constructible; equivalently, · · · then
e2πi/p ∈ P. This is equivalent to finding a tower
Q = F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fr = Q(e2πi/p ),
with each successive extension being of degree 2. Now Q(e2πi/p ) ⊃ Q is
normal, being the splitting extension for xp − 1. So, by the fundamental
theorem, this is equivalent to finding a descending series of subgroups,
AutQ (Q(e2πi/p )) = G0 ⊃ G1 ⊃ G2 ⊃ · · · ⊃ Gr = {e},
such that each group has index 2 in the previous one. But the group
AutQ (Q(e2πi/p )) is abelian (by 35.5) of order 2r (since the cyclotomic polynomial cp (x) = xp−1 + xp−2 + · · · + 1 has degree 2r ), so the existence of this
series of subgroups is an easy consequence of 13.2, which gives the structure
of finite abelian groups. In fact, the group at issue is cyclic, making the
descending series unique and its existence even more obvious.
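For p = 17 the relevant group facts are easy to check by machine. The Python sketch below uses the easily verified fact that 3 generates Z×17 , and exhibits the unique descending chain of subgroups of 2-power index:

```python
# Z17* is cyclic of order 16, generated by 3:
p = 17
assert {pow(3, k, p) for k in range(16)} == set(range(1, 17))

# In a cyclic group of order 2^r the subgroups form a single chain,
# each of index 2 in the previous one: <3> > <3^2> > <3^4> > <3^8> > {1}.
chain = [{pow(3, (2 ** i) * k, p) for k in range(16)} for i in range(5)]
sizes = [len(H) for H in chain]
assert sizes == [16, 8, 4, 2, 1]
assert all(chain[i] > chain[i + 1] for i in range(4))   # strictly descending
```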
Exercise 39A. Give the detailed proof that any abelian group whose
order is a power of 2 has such a tower of subgroups.
Large Exercise 39B. For p = 17 give explicit details of constructing
the tower of fields to show that the regular 17-gon can be produced with a
straight-edge and compass. Do it for p = 5 first, to warm up.
The other straight-edge & compass leftover is to complete the proof of
27.4. We should show that any root of x4 − 2x − 2 generates an extension
of Q which has no strictly intermediate fields. Now the polynomial has
Galois group S4 , by 39C below. That group has exactly four subgroups of
index four, each being the symmetric group on a three element subset of
{ 1, 2, 3, 4 }. None of these is contained in any subgroup of index two, since
A4 is the only subgroup of order 12 in S4 . This completes the proof, using
the Galois correspondence.
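The uniqueness of A4 can also be seen computationally: a subgroup of index two is normal, so it contains every square g², and the squares in S4 already make up A4 . A Python sketch:

```python
from itertools import permutations

def compose(p, q):
    # permutations of {0,1,2,3} as tuples of images
    return tuple(p[q[i]] for i in range(4))

S4 = list(permutations(range(4)))
squares = {compose(s, s) for s in S4}

# a subgroup of index 2 is normal, hence contains every square g^2;
# close the squares under composition:
H = set(squares)
while True:
    new = {compose(a, b) for a in H for b in H} - H
    if not new:
        break
    H |= new

def is_even(p):
    # parity via inversion count
    inv = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return inv % 2 == 0

A4 = {p for p in S4 if is_even(p)}
assert H == A4 and len(H) == 12   # so A4 is the only subgroup of order 12
```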
Exercise 39C. i) Show that the Galois group over F of an irreducible in
F [x] acts transitively on the set of roots of the polynomial; i.e. for any two
roots it has an element mapping the first to the second.
ii) Show that the only proper transitive subgroups of S4 which contain a
transposition are isomorphic to D4 . (To define such groups, place the integers
1 to 4 on the vertices of the square in the definition of D4 .)
iii) Artin (14.6.14, p.564) gives a cubic polynomial, the ‘resolvent’, whose
irreducibility would eliminate the D4 above as a possibility for the Galois
group of a given irreducible quartic over Q. Show that this cubic is x3 − 4
when the quartic is x4 − 2x − 2.
Since, as in the proof of 37.2, complex conjugation gives the needed
transposition, this exercise shows that x4 − 2x − 2 has Galois group S4 over
Q, as required. Hilbert’s result, getting Sn as a Galois group over Q, is done
differently, since the theory of quartics is special.
40. Automorphisms of finite fields.
Since Fpn is the splitting field of x^(p^n) − x over Fp , the extension Fpn ⊃
Fp satisfies ii), iv) and v) of 35.8. In fact, it even satisfies i), i.e., it is a
Galois extension, as discussed before the fundamental theorem. Thus the
characteristic p version of that theorem would apply, in fact to any extension
Fpab ⊃ Fpa of finite fields.
Without appealing to the fundamental theorem, one can, as follows, directly calculate the group AutFpa (Fpab ), and verify the truth of the fundamental theorem in this instance by inspection. We have
|AutFp (Fpn )| ≤ n ,
since the extension can be written as a simple extension by an element of
degree n; and an automorphism is determined by its effect on that element,
for which there are at most “n” choices, the other roots of the relevant
minimal polynomial. (As it happens, the proof of 37.1, that
|AutF (K)| = [K : F ] ,
depended only on K ⊃ F being simple and being a splitting extension.
Therefore it applies to all extensions involving finite fields. So we have equality above, but this will be deduced more concretely below.) The Frobenius
self-map, θK : x ↦ xp , of any field K of characteristic p, is evidently bijective of order n when K = Fpn ; in fact θr maps each x to x^(p^r) , and so has fixed field Fps with s = GCD{n , r}. Thus
AutFp (Fpn ) ≅ Cn with generator θFpn ,
since the left hand side is a group of order at most n, and we have found an
element of that order. More generally,
AutFpa (Fpab ) ≅ Cb with generator (θFpab )a .
One sees this by noting that the claimed generator generates a cyclic group
of order b, which is the same order as that of the automorphism group, since
[Fpab : Fpa ] = b.
For example, here is the Galois correspondence for Fp12 :
(Here θ = θFp12 .)

              Fp12
             /     \
         Fp6        Fp4
        /    \     /
     Fp3      Fp2
        \    /
          Fp

                    {e}
                  /     \
          {e, θ⁶}        {e, θ⁴, θ⁸}
          /      \       /
{e, θ³, θ⁶, θ⁹}   {e, θ², θ⁴, θ⁶, θ⁸, θ¹⁰}
           \         /
      {e, θ, θ², · · · , θ¹¹}
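The Frobenius computation can be made concrete in the smallest interesting case, F8 over F2 . The Python sketch below models F8 as Z2 [x]/(x3 + x + 1) (the choice of irreducible modulus is ours), with elements stored as 3-bit integers:

```python
# GF(8) modelled as Z2[x]/(x^3 + x + 1); an element is a 3-bit integer
# whose bits are the polynomial's coefficients.
MOD = 0b1011                        # x^3 + x + 1

def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a                  # add (XOR) the current shift of a
        a <<= 1
        if a & 0b1000:              # reduce as soon as degree reaches 3
            a ^= MOD
        b >>= 1
    return r

def frob(a):
    return mul(a, a)                # the Frobenius map a -> a^2

elements = list(range(8))
assert all(frob(frob(frob(a))) == a for a in elements)   # theta^3 = identity
assert any(frob(a) != a for a in elements)               # so theta has order 3
assert {a for a in elements if frob(a) == a} == {0, 1}   # fixed field F2
```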
41. Galois theoretic proof of the
fundamental theorem of (19th century) algebra.
Using the fact that a real polynomial of odd degree has a real root, it
follows that
(I) R ⊃ R is the only extension of R with odd degree.
For, any element α in such an extension would give R(α) ⊃ R of odd
degree. But the minimal polynomial of α over R is irreducible in R[x], and
so it has degree 1. Thus α ∈ R, as required.
The fact, that any extension K ⊃ F of degree 2 has the form F (α) ⊃ F
where α2 ∈ F but α ∉ F , implies that
(II) C has no extension of degree 2.
For, x2 − z splits as (x − w)(x + w) in C[x], where, if z = re^(iθ) , then w = √r e^(iθ/2) .
Finally, note that
(III) if K1 ⊃ R has degree 2, then K1 ≅ C.
For, with K1 = R(α), where α2 ∈ R but α ∉ R, then necessarily α2 < 0. Define φ : C → K1 by φ(a + ib) = a + (αb/√(−α2 )) , where √(−α2 ) of course means the positive real whose square is −α2 . It is easily checked that φ is an isomorphism.
Combining Galois theory with some non-trivial group theory (but not
using anything dependent on the ‘Fundamental Theorem of Algebra’) yields:
Lemma 41.1. If K ⊃ R is a normal (finite) extension, then it has degree
at most 2.
It follows from (III) that K ≅ R or C.
Proof. Let [K : R] = 2s q where q is odd.
First we’ll show that q = 1. By the Galois correspondence and the existence of Sylow subgroups (11.1), AutR (K) has a subgroup of order 2s which
has the form AutE (K) where R ⊂ E ⊂ K. But then the degree [E : R] is q
above, so q = 1 by (I).
Now since |AutR (K)| = 2s , there is a series of groups
AutR (K) = G0 ▷ G1 ▷ · · · ▷ Gs = {e},
where each |Gi /Gi+1 | = 2, by Exercise 11D (solubility of p-groups). Applying the inverse Galois correspondence yields a tower
R = K 0 ⊂ K1 ⊂ · · · ⊂ K s = K ,
where each [Ki+1 : Ki ] = 2. Assume that s > 0. Then K1 ≅ C by (III). But
then K2 cannot exist by (II), so s = 1, as required.
Corollary 41.2. C has no finite extensions other than itself; i.e., C is
algebraically closed.
Proof. Suppose that L ⊃ C is a finite extension. Let α ∈ L, let g(x) be
the minimal polynomial of α over R, and let K ⊃ R be a splitting extension
for (x2 + 1)g(x). Then K ≅ C by 41.1, so g(x) splits in C[x] and α ∈ C,
as required.
Remarks. (1) It is possible to give a ‘rabbit out of hat’ proof using less
field theory than above and no group theory. One avoids the Sylow theorem
by a tricky induction and a magic formula.
(2) A real closed field R is one which has a linear order—[a transitive
relation < with exactly one of a < b, a = b, b < a holding for each pair
(a, b)] satisfying the axioms for an ordered field—[for all a, b and c, we have
a < b ⇒ a + c < b + c ; (0 < c & a < b) ⇒ ac < bc],
and such that:
i) each positive element has a square root in R—this implies that the
order is unique, once the field operations have been given, since then, a <
b ⇐⇒ b − a is a non-zero square; and
ii) each polynomial of odd degree in R[x] has a root in R.
Then x2 + 1 has no root in R. The proof in this section quickly generalizes
to show that R(i) is algebraically closed, when i2 = −1. An example of such
an R is R ∩ A, the field of real algebraic numbers (as is R itself, of course).
(3) A surprising fact is that C has infinitely many automorphisms outside
of the group
AutR (C) = { e , complex conjugation } .
These cannot map R to itself, since AutQ (R) = {e}.
Exercise 41A. Prove this last statement. First show that an automorphism would preserve positivity and therefore order.
It follows that C has subfields isomorphic to R other than R itself. These
extra automorphisms of C are not continuous. They may be proved to exist
by finding an intermediate field E between Q and C such that C ⊃ E is
algebraic and no element of E is algebraic over Q. Take E = Q(S) (the field
generated by Q ∪ S), where S is a maximal algebraically independent (over
Q) subset of C. (The existence of such an S follows from Zorn’s Lemma.)
Then E has many automorphisms, for example ones agreeing on S with any
given self-bijection of S. Now extend such automorphisms to automorphisms
of C, using the transfinite extension principle alluded to after the proof of
A10 in the appendix following Section 28.
APPENDIX B. Separability and the
Galois correspondence in arbitrary characteristic.
Definition. A polynomial is separable if and only if it has distinct roots;
that is, g(x) is separable if and only if GCD{g(x), Dg(x)} = 1.
This is independent of both the field in which we regard g(x) as having
its coefficients, and also of the extension in which it splits. If g(x) ∈ F [x],
then clearly g(x) is separable if and only if (I) its irreducible factors in F [x]
are separable, and (II) each occurs with exponent 1 in the factorization of
g(x). Essentially we’re only interested in separability of irreducibles, once the
base field F has been fixed. Note also that p1 (x)α1 · · · ps (x)αs has the same
splitting field over F as p1 (x) · · · ps (x), for distinct irreducibles pi (x) in F [x]
and αi ≥ 1.
Caution. As an element of Q[x], the polynomial x2 + 1 is separable; but
the element of Z2 [x] denoted similarly is not separable. Note also that some
authors define p1 (x)α1 · · · ps (x)αs as above to be separable as long as each irreducible factor pi (x) has no repeated roots. This seems to complicate things
a bit, since then separability would be a function of both the polynomial and
the field you are using.
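The GCD criterion, and the Caution above, can be illustrated by machine. The Python sketch below represents polynomials as coefficient lists, lowest degree first:

```python
from fractions import Fraction

def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def derivative(g):
    # formal derivative Dg of a coefficient list [c0, c1, ...]
    return trim([i * c for i, c in enumerate(g)][1:] or [0])

def poly_gcd_Q(a, b):
    # monic GCD in Q[x] by Euclid's algorithm
    a, b = trim(list(a)), trim(list(b))
    while b != [0]:
        r = trim(list(a))
        while len(r) >= len(b) and r != [0]:
            c = Fraction(r[-1]) / b[-1]           # cancel the leading term
            off = len(r) - len(b)
            for i in range(len(b)):
                r[off + i] -= c * b[i]
            r = trim(r[:-1])
        a, b = b, r
    return [Fraction(c) / a[-1] for c in a]

g = [1, 0, 1]                                    # x^2 + 1
assert poly_gcd_Q(g, derivative(g)) == [1]       # separable as an element of Q[x]

# but over Z2 the derivative 2x is identically zero, so GCD{g, Dg} = g != 1:
dg_mod2 = [c % 2 for c in derivative(g)]
assert dg_mod2 == [0, 0]
```

Indeed x2 + 1 = (x + 1)2 in Z2 [x], confirming the repeated root.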
Definition. Given K ⊃ F, an element α ∈ K is separable over F if and
only if its minimal polynomial over F is separable. (This depends on F, but
not on K).
We say that K ⊃ F is a separable extension if and only if all α in K are
separable over F.
For example, in characteristic zero, all K ⊃ F are separable. If β is
transcendental over Zp , then Zp (β) ⊃ Zp (β p ) is not separable: β is not
separable over Zp (β p ) since xp − β p = (x − β)p ; although β is certainly
separable over Zp (β). The minimal polynomial of β over Zp (β p ) is in fact
xp − β p , although all that we need to know here is that the former is a nonlinear factor of the latter.
Theorem B1. i) For any finite K ⊃ F, we have
|AutF (K)| ≤ [K : F ] .
More precisely, if K is written as F (α1 , · · · , αℓ ), with gi (x) being the minimal polynomial of αi over F (α1 , · · · , αi−1 ), and if mi is the number of roots in K of gi (x), then |AutF (K)| = m1 m2 · · · mℓ .
ii) If K is the splitting field over F of a separable polynomial in F [x],
then |AutF (K)| = [K : F ].
Remark. B1ii) is proved below, after the statements of the other three
theorems, and is all that we use for B4 below. The proof of B1i) is similar.
Theorem B2. For any finite K ⊃ F, the following are equivalent:
i) for all α ∈ K\F , ∃ θ ∈ AutF (K) with θ(α) ≠ α
(i.e. 35.8i) holds; i.e. F(G(F )) = F in 38.1, the fundamental theorem);
ii) the extension K ⊃ F is both normal and separable;
iii) the field K is the splitting field over F of some separable polynomial in
F [x].
Note. The implications i) ⇔ ii) ⇒ iii) are immediate from the statement
and proof of 35.8.
Remark. B1i) easily implies that |AutF (K)| = [K : F ] when K ⊃ F is
normal separable. This would give B1ii) if we could use B2[iii) ⇒ ii)], but
the proof below of B2[iii) ⇒ ii)] uses B1ii).
Definition. The extension K ⊃ F is called a Galois extension if and
only if the conditions in B2 hold.
Theorem B3. i) A finite extension K ⊃ F is simple if and only if
|IN T (F, K)| < ∞; that is, if and only if there are finitely many intermediate
fields.
ii) If K ⊃ F is separable and finite, then K ⊃ F is simple.
Remark. B3i)⇐ is very easy, and is all we use in proving:
Theorem B4. The fundamental theorem, 38.1, holds for any finite
Galois extension in any characteristic.
The trick is to first prove that the composite

    IN T (F, K) −−G−→ SG(F, K) −−F−→ IN T (F, K)

is the identity (almost by definition). But then IN T (F, K) must be finite,
since SG(F, K) is. By B3i)⇐, the extension K ⊃ F is simple, and now the
same proof as in 38.1, that
SG(F, K) −→ IN T (F, K) −→ SG(F, K)
is the identity, works. Parts ii) and iv) of 38.1 are proved as before. Also
38.1iii) is just B1ii) above, in view of B2, and since K ⊃ F being normal
and separable trivially implies that K ⊃ E is also normal and separable, for
any E ∈ IN T (F, K).
Proof of B1ii). We prove by induction on [K1 : F1 ] that, for all diagrams

    F1 −→ K1
  φ ↓≅
    F2 −→ K2

where K1 ⊃ F1 and K2 ⊃ F2 are splitting field extensions for separable
polynomials which correspond under φ, the number of isomorphisms
ψ : K1 → K2 , completing the diagram to make it commute, is [K1 : F1 ]
(which equals [K2 : F2 ], of course). The
initial step is trivial. For the inductive step, given two φ-corresponding
irreducibles, gi (x) ∈ Fi [x], which divide the φ-corresponding polynomials
referred to above, choose a root α ∈ K1 for g1 (x). Let the polynomials g1 (x)
and g2 (x) have degree m. Then there are “m” roots, γ, of g2 (x) in K2 by
separability, so there are “m” choices of commutative diagram as follows:

    F1 −→ F1 (α) −→ K1
  φ ↓≅       θ ↓≅
    F2 −→ F2 (γ) −→ K2 ,    with θ(α) = γ .
For each such θ, by induction there are “[K1 : F1 (α)]” isomorphisms ψ :
K1 → K2 completing the diagram. Since any ψ at the beginning of the proof
arises from a unique such θ, and since
[K1 : F1 ] = [K1 : F1 (α)][F1 (α) : F1 ] = m[K1 : F1 (α)] ,
this completes the induction. Statement B1ii) is the special case in which
F1 = F2 = F ; φ = the identity ; and K1 = K2 = K .
Proof of B2[iii) ⇒ ii)]. Let E = FG(F ), and let K be the splitting
field over F of the separable polynomial g(x) ∈ F [x]. Then F ⊂ E ⊂ K, so
K is the splitting field over E of g(x) as well. By B1ii), we have
|AutE (K)| = [K : E] and |AutF (K)| = [K : F ] .
By the definition of E, we have AutF (K) = AutE (K) . This yields [K : F ] =
[K : E] . Thus [E : F ] = 1, and so E = F, as required.
Additional Galois theory problems—many books have lots of good
problems, especially Hungerford .
1. If Σ_{i=0}^m ai x^i is irreducible in F [x], then so is Σ_{i=0}^m a_{m−i} x^i .
2. The following are equivalent for a field F of characteristic p :
i) every irreducible in F [x] is separable;
ii) the Frobenius map, F → F ; a ↦ a^p , is surjective.
(Such a field is called perfect, as are all fields of characteristic 0, where
i) holds necessarily.)
3. Every finite field is perfect.
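Problems 2 and 3 can be checked mechanically for the prime fields. The following Python sketch (an illustration only, not a substitute for the proofs asked for) verifies that the Frobenius map on Zp is surjective, and in fact the identity map, by Fermat's little theorem:

```python
# Problems 2 and 3 for the prime fields Z_p: the Frobenius a -> a^p is
# surjective (in fact the identity map, by Fermat's little theorem).

def frobenius_image(p):
    return {pow(a, p, p) for a in range(p)}

for p in [2, 3, 5, 7, 11, 13]:
    assert frobenius_image(p) == set(range(p))       # surjective: perfect
    assert all(pow(a, p, p) == a for a in range(p))  # and the identity
```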
4. Assume that K ⊃ F and L ⊃ F are both finite, that K is isomorphic
   to a subfield of L, and that L is isomorphic to a subfield of K, both
   isomorphisms fixing elements of F . Show that L ≅ K. Does this hold
   without the finiteness assumption?
5. For every finite K ⊃ F , there exists a field L such that : i) L ⊃ K; ii)
L ⊃ F is normal; iii) if M ⊃ K is an extension with M ⊃ F normal,
then L is isomorphic to a subfield of M , fixing elements of F. Prove
that L exists and is unique up to isomorphisms fixing elements of F .
(The extension L ⊃ F is called the normal closure of
K ⊃ F .)
6. Given K ⊃ F and B, C both in IN T (F, K), let B ∨ C (the ‘compositum’) be the subfield of K generated by B ∪ C. Prove that the normal
closure of K ⊃ F in problem 5. can be written
(K1 ∨ K2 ∨ · · · ∨ Kr ) ⊃ F ,
for some r ≥ 1 and Ki ∈ IN T (L, F ) such that each Ki is isomorphic
to K over F.
[This ∨ together with ∩ are the binary operations in the lattice
IN T (F, K).]
7. If K ⊃ F is Galois and E ∈ IN T (F, K), then K ⊃ E is also Galois.
8. For every m ≥ 2, give an example of K ⊃ F with [K : F ] = m and
   K ≠ F (γ) for every γ with γ^m ∈ F. To what extent can this be done
in characteristic 0?
9. (See problem 6 above.) If K ⊃ F is Galois and g(x) is irreducible in
F [x], then all irreducibles in the factorization of g(x) in K[x] have the
same degree. Show also that this can fail for non-Galois extensions.
10. Prove that every finite group is isomorphic to AutF (K) for at least one
finite K ⊃ F in characteristic 0, (even with F ⊃ Q finite—the solution
that will make you famous is doing it for F = Q).
11. Let g(x) = 0 be the ‘general equation of degree m’ over F [as in Section
36] giving the field F0 .
i) Is g(x) irreducible in F0 [x]?
ii) Does this hold in arbitrary characteristic?
iii) Is g(x) separable in arbitrary characteristic?
12. For any positive integer n, find a finite extension K ⊃ F such that
K ≠ F (γ1 , . . . , γn ) for any γ1 , . . . , γn .
13. Illustrations of subgroup (non-)existence implying subfield
(non-)existence:
i) If K ⊃ F is Galois (finite) with AutF (K) soluble, then there
exists E ∈ IN T (F, K) with [E : F ] prime.
ii) Does i) hold without the assumption on AutF (K)?
[Hint: A6 has no subgroups of prime index.]
iii) If a prime p divides [K : F ] where K ⊃ F is any finite extension
in characteristic 0, does there exist E ∈ IN T (F, K) with [K :
E] = p?
iv) Does i) hold for any prime dividing [K : F ]?
APPENDIX C. Solubility implies solvability.
Let us again return to the world of characteristic 0 (and, as usual, consider
only finite extensions). We shall prove that if g(x) ∈ F [x] has soluble Galois
group over F , then g(x) is solvable by radicals over F (the converse of 35.11).
This excellent result is due to Galois. It is false in this simple form when the
characteristic is nonzero.
What must be done is to show that a normal extension E ⊃ F , with
AutF (E) soluble, can be embedded in a radical extension; i.e. there is a
short tower F ⊂ E ⊂ K for some radical extension K ⊃ F . This is Theorem
C7 below. The obvious first try is (using solubility) to write down a tower
of groups,
AutF (E) = G0 ▷ G1 ▷ G2 ▷ · · · ▷ Gr = {1} ,
with successive quotients abelian, or stronger, cyclic of prime order. Then,
by the Galois correspondence, we have Gi = AutFi (E) for a tower of fields
F = F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fr = E .
Letting Gi /Gi+1 be cyclic of prime order pi , we have the forlorn hope that
Fi+1 = Fi (βi ) with βi^{pi} ∈ Fi . This at least motivates the following question:
When can an extension K ⊃ L with [K : L] = p, a prime, be written as
K = L(β) with β^p ∈ L? There is a nice clean sufficient condition:
Theorem C1. If K ⊃ L is a normal extension of prime degree p such
that L contains a primitive pth root of unity, then K = L(β) for some β with
β^p ∈ L.
The need for normality and roots of unity (give examples to show this !)
complicates the “first try” above; but below C7 is proved by an induction
on degree which is an amplification of that “first try”.
Note that, when p = 2, the hypotheses are automatically satisfied; but
we already know the conclusion as well.
There is an easy exercise providing a weak partial converse to C1: if
p is a prime and L(β) ⊃ L is a normal extension of degree p with β^p ∈ L,
then L(β) (not necessarily L) contains a primitive pth root of unity.
The proof of C1 below will use the norm

    N : K^× −→ L^× ,    k ↦ ∏_{θ∈G} θ(k) ,

defined for any normal extension K ⊃ L, where G := AutL (K).
Lemma C2. The function N is a well defined morphism of (multiplicative)
groups. Furthermore, N (ℓ) = ℓ^{[K:L]} for all ℓ ∈ L^× ; and N (θ(k)) = N (k)
for all k ∈ K^× and θ ∈ AutL (K).
Exercise CA. Prove this.
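Such identities can also be tested exactly in a small concrete case. The Python sketch below takes K = Q(√2) ⊃ L = Q, where G = {id, θ} with θ(a + b√2) = a − b√2; the encoding of elements as pairs of rationals is our own device, not the text's:

```python
from fractions import Fraction as F

# Exact check of Lemma C2 for K = Q(sqrt2) over L = Q, where
# G = Aut_L(K) = { id, theta } and theta(a + b*sqrt2) = a - b*sqrt2.
# A pair (a, b) of rationals represents a + b*sqrt2.

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

def norm(x):
    a, b = x                 # N(x) = x * theta(x) = a^2 - 2*b^2, in Q
    return a * a - 2 * b * b

x, y = (F(3), F(1)), (F(1), F(-2))            # 3 + sqrt2 and 1 - 2*sqrt2
assert norm(mul(x, y)) == norm(x) * norm(y)   # N is multiplicative
l = F(5, 3)
assert norm((l, F(0))) == l ** 2              # N(l) = l^[K:L] for l in L
assert norm((x[0], -x[1])) == norm(x)         # N(theta(k)) = N(k)
```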
Lemma C3. (Hilbert’s Theorem 90.) If K ⊃ L is a normal extension
with AutL (K) being a cyclic group generated by θ, then for all y in K^× , we
have N (y) = 1 ⇔ there exists an x ∈ K^× with y = x^{−1} θ(x).
Proof. The ⇐ half is trivial, since N is a group morphism, and N (θ(x)) =
N (x). For the other half, we need another technical but fundamental result,
due to Dedekind:
Theorem C4. Let G be any group, and let F be any field. Then the
set of all group morphisms from G to F × , regarded as a subset of the vector
space, F G , of all functions from G to F , is linearly independent.
Remark. Group morphisms from G to F × were called characters, particularly when F = C. This is still the case; but, since the year 1899, certain
other functions from G to F are also called characters, as explained in Section 48. In 48.7, we prove a stronger result (orthonormality rather than just
linear independence), applying more generally (to these post-1899 characters
as well), but also less generally (G is assumed to be finite). Since we need the
result for the infinite group G = F × , an elementary proof of C4 is provided
below.
Corollary C5. Any set of automorphisms of a field is linearly independent. (Just take G to be F × itself.)
Proof of C4. Suppose, for a contradiction, that we have a non-trivial
linear relation
    f1 θ1 + f2 θ2 + · · · + fn θn = 0        (I)
where each fi is in F , and the θi are distinct morphisms from G to F × .
In fact, choose such a relation with n minimal, so that all fi are non-zero.
Multiplying by f1−1 , we may assume that f1 = 1. Clearly n > 1. Choose
h ∈ G with θ1 (h) ≠ θ2 (h). For any g ∈ G, we have
    θ1 (g) + f2 θ2 (g) + · · · + fn θn (g) = 0        (II)

    θ1 (h)θ1 (g) + f2 θ2 (h)θ2 (g) + · · · + fn θn (h)θn (g) = 0        (III)

by applying (I) to g and to hg. Multiplying (III) by θ1 (h)^{−1} gives

    θ1 (g) + f2 θ1 (h)^{−1} θ2 (h)θ2 (g) + · · · + fn θ1 (h)^{−1} θn (h)θn (g) = 0 .        (IV)
Then (II) minus (IV) yields
    f2 [1 − θ1 (h)^{−1} θ2 (h)]θ2 (g) + · · · + fn [1 − θ1 (h)^{−1} θn (h)]θn (g) = 0 .
This holds for all g, so by the minimality of n, we have
    fi [1 − θ1 (h)^{−1} θi (h)] = 0
for all i ≥ 2. But fi ≠ 0, so θ1 (h) = θi (h), which for i = 2 gives a contradiction.
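The theorem can also be illustrated numerically for the cyclic group G = Zn, whose n characters k ↦ ω^{jk} into C^× assemble into a Vandermonde-type matrix of full rank. The following Python sketch (an illustration only, with a naive rank computation) checks this for small n:

```python
import cmath

# The n characters chi_j : Z_n -> C^x, chi_j(k) = omega^(j*k) with
# omega = e^(2*pi*i/n), give an n x n matrix [chi_j(k)]; Dedekind's
# theorem says its rows are linearly independent, i.e. it has full rank.

def character_matrix(n):
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** (j * k) for k in range(n)] for j in range(n)]

def rank(matrix, eps=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols, r = len(m), len(m[0]), 0
    for c in range(cols):
        if r == rows:
            break
        p = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[p][c]) < eps:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(rows):
            if i != r:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

for n in range(1, 8):
    assert rank(character_matrix(n)) == n   # the characters are independent
```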
Continuation of the proof of C3. To prove ⇒, suppose that θ has
order n. Define y0 , y1 , · · · , yn−1 inductively, by y0 := y and

    yi := y θ(yi−1 ) = y θ(y) θ^2(y) · · · θ^i(y) .

Note that yn−1 = N (y) = 1. Using C5, the set { id, θ, θ^2 , · · · , θ^{n−1} } is
linearly independent, so we may choose some z with

    y0 z + y1 θ(z) + · · · + yn−2 θ^{n−2}(z) + yn−1 θ^{n−1}(z) ≠ 0 .

Let x^{−1} be the left hand side. A straightforward calculation shows that
y θ(x^{−1}) = x^{−1} , as required.
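A concrete instance of C3 is the extension Q(i) ⊃ Q, with θ = complex conjugation and N(y) = y θ(y). The Python sketch below (the element y = (3 − 4i)/5 and the choice z = 1 are our own illustrative values) also exercises the construction of x from the proof, with n = 2:

```python
# Hilbert's Theorem 90 in the cyclic extension Q(i) over Q, where theta
# is complex conjugation and N(y) = y * theta(y).  The element
# y = (3 - 4i)/5 has norm 1, so the lemma promises y = x^(-1)*theta(x);
# we build x as in the proof of C3, with n = 2 and z = 1.

def theta(z):
    return z.conjugate()

y = complex(3, -4) / 5
assert abs(y * theta(y) - 1) < 1e-12       # N(y) = 1

z = 1                                      # any z with y0*z + y1*theta(z) != 0
x_inv = y * z + (y * theta(y)) * theta(z)  # x^(-1) := y0*z + y1*theta(z)
x = 1 / x_inv
assert abs(theta(x) / x - y) < 1e-12       # i.e. y = x^(-1) * theta(x)
```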
Proof of C1. It suffices to find a β ∈ K \ L such that β^p is fixed by all
elements of AutL (K): this is because K = L(β) for any β ∈ K \ L (since
there are no intermediate fields), and β^p ∈ L (since no element of K \ L is
fixed by the Galois group). If ω ∈ L is a primitive pth root of unity, then

    N (ω) = ω^{[K:L]} = ω^p = 1 .

Now |AutL (K)| = [K : L] = p, a prime, so AutL (K) is a cyclic group. Let θ
be a generator. Then, by C3⇒, there exists β ∈ K such that ω = β^{−1} θ(β).
Rewrite this as θ(β) = βω. Since βω ≠ β, we get β ∈ K \ L. Since
(βω)^p = β^p ω^p = β^p , we get θ(β^p) = β^p , and so β^p is fixed by the whole
Galois group, as required.
Lemma C6. Suppose given a diagram of field extensions,

    F ⊂ E
    ∩     ∩
    F̂ ⊂ Ê ,
where E ⊃ F and Ê ⊃ F̂ are both splitting extensions for the same polynomial in F [x]. Then the map
    AutF̂ (Ê) −→ AutF (E) ,    θ ↦ θ|E
(defined by restricting), is a well defined morphism of groups, and is injective.
Exercise CB. Prove this, using the most basic ideas near the start of
Section 35.
Theorem C7. Let E ⊃ F be a normal extension in characteristic 0 for
which AutF (E) is soluble. Then there exists a radical extension K ⊃ F such
that E ⊂ K.
Proof. Proceed by induction on [E : F ]. The initial step is obvious. For
the inductive step, by solubility, there is a prime p such that AutF (E) has an
index p normal subgroup N —‘the top of the tower’. Let F̂ = F (ω), where
ω is a primitive pth root of unity, and let Ê = E(ω). If E ⊃ F is a splitting
extension for g(x) ∈ F [x], then Ê ⊃ F̂ is also one for g(x), and Ê ⊃ F is
one for (xp − 1)g(x). So the latter two extensions are normal. By C6, the
group AutF̂ (Ê) is isomorphic to a subgroup of AutF (E); and so, by 11H, it
is soluble.
Case i). If [Ê : F̂ ] < [E : F ], then, by the inductive hypothesis, there
is a radical extension K ⊃ F̂ such that Ê ⊂ K. But F̂ = F (ω), where
ω^p = 1 ∈ F , so K ⊃ F is also radical. Since E ⊂ Ê ⊂ K , we are done.
Case ii). If [Ê : F̂ ] = [E : F ](= b, say), then AutF̂ (Ê) is isomorphic to
AutF (E) by the restriction map, since they are finite groups of the same order
b. Let N̂ ⊴ AutF̂ (Ê) be the normal subgroup which corresponds to N under
this isomorphism. Let D ∈ IN T (Ê, F̂ ) be the intermediate field for which
AutD (Ê) = N̂ , using the Galois correspondence. Then the extension Ê ⊃ D
is normal with soluble Galois group N̂ . Its degree is “p” times smaller than
[E : F ], so by the inductive hypothesis, there is a radical extension K ⊃ D
such that Ê ⊂ K. Now D ⊃ F̂ is normal, because N̂ is a normal subgroup
of AutF̂ (Ê). This extension has degree p. Thus, by C1, we can write D as
F̂ (β) with β^p ∈ F̂ . As we did in Case i), write F̂ as F (ω) with ω^p ∈ F .
Combining the last two sentences with the radicality of K ⊃ D, it follows
that K ⊃ F is also a radical extension. Since E ⊂ Ê ⊂ K, that’s it.
Exercise CC. (CubiCs !)
i) Show how, by ‘completing the cube’, finding solutions to the general
equation, x^3 + ax^2 + bx + c = 0 , of degree 3 may be reduced to solving
x^3 + px + q = 0 .
Let

    u = [ (1/2)( −q + √( q^2 + 4p^3/27 ) ) ]^{1/3}

and

    v = [ (1/2)( −q − √( q^2 + 4p^3/27 ) ) ]^{1/3} .
ii) Show that u^3 + v^3 + q = 0 .
iii) Show that 3uv + p = 0 .
iv) Deduce that u + v is a root of x^3 + px + q = 0 .
This gives a solution to the general cubic, one which is ‘radical’, as claimed
earlier (and as follows also from what we just proved in this Appendix, since
S3 and all its subgroups are soluble).
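The formulae can be tested numerically. In the Python sketch below, the sample values p = −2, q = 1 are our own choice, and v is obtained from u via iii), i.e. 3uv = −p, which is the compatible choice of cube root discussed next:

```python
import cmath

# Sample reduced cubic x^3 + p*x + q with p = -2, q = 1 (our choice).
p, q = -2.0, 1.0
s = cmath.sqrt(q * q + 4 * p ** 3 / 27)   # the square root in the formulae
u = ((-q + s) / 2) ** (1 / 3)             # one fixed cube root for u
v = -p / (3 * u)                          # the compatible v, forcing iii)
w = cmath.exp(2j * cmath.pi / 3)          # a primitive cube root of unity

assert abs(u ** 3 + v ** 3 + q) < 1e-9    # part ii)
assert abs(3 * u * v + p) < 1e-9          # part iii)
for root in (u + v, w * u + w * w * v, w * w * u + w * v):
    assert abs(root ** 3 + p * root + q) < 1e-9   # all three roots
```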
What about the general solution? Well, there are lots of combinations of
choices for the square and cube roots in our formulae. OOPS! It looks as
though we have produced far more than three roots for a cubic polynomial !
The square roots are no problem : Clearly ii) only works if we choose
the same square root in the two formulae; and changing them both to their
negatives just interchanges u and v. So the choice of square roots is illusory;
it doesn’t result in any extra solutions to the cubic.
Let’s be a bit more formal about where we’re working. Assume that F is
a field whose characteristic is neither 2 nor 3 (although the previous bracketed
comment only applies in characteristic 0). Show that:
i)* If a, b and c in i) are in F , then so are p and q.
ii)* The identity in ii) holds when the square root and cube roots in the
formulae are interpreted to be any three fixed elements in some extension of
F whose square or cube, respectively, is as indicated.
iii)* In an extension of F , prove that, for each choice of cube root giving
u, there is at most one choice of cube root giving v so that iii) holds; and
exactly one choice if the extension is ‘big enough’.
This gets us down from nine to (at most) three solutions of the form u+v,
so we have the ‘general solution’, as required.
v) Show that if u + v is one solution, then the others are ωu + ω^2 v and
ω^2 u + ωv, where ω is a primitive cube root of unity.
vi) To reinforce all of this, multiply out
    (x − u − v)(x − ωu − ω^2 v)(x − ω^2 u − ωv) ,
where u and v are ‘compatible’, as explained in iii)*.
As readers of this book realize (though not all students seem to), the command: ‘Show that object O is a solution to equation E’ is usually much easier
to obey than the commands: ‘Find a solution to equation E’ or ‘Find all solutions to equation E’. The renaissance Italians who first solved cubics and
quartics were of course faced with the harder questions. Due to their success,
we needed only to consider the easier one. But to see how these formulae for
roots of cubics might arise, you can find the analysis of the harder questions
in many texts; for example, Rotman, pp.25–28.
Exercise CD. (DisCriminants !) Assuming characteristic 0, let g(x) ∈
F [x] have splitting field E over F . By indexing its roots—that is, by writing
g(x) = (x − r1 )(x − r2 ) · · · (x − rn ) ∈ E[x] ,
we get the Galois group as a subgroup of Sn , and we can define
    ∆ = ∏_{i<j} (ri − rj ) ∈ E .
i) Show that ∆ depends, up to sign, only on g(x) and E, and not on the
indexing. More precisely, show that, if the roots are re-indexed as s1 , · · · , sn ,
where si = rσ(i) for some permutation σ, then ∆ changes by being multiplied
by the sign of σ.
The discriminant of g(x) is defined to be
    D = ∆^2 ,
which is clearly independent of the indexing. Obviously D = 0 if and only if
g(x) has a repeated root. (This is a minor bit of discrimination practised by
the discriminant.)
ii) Show that D ∈ F .
iii) Let G be the Galois group of g(x) over F , and let H = G ∩ An ,
regarding G ⊂ Sn as above. Show that the intermediate field F(H) (notation
from the Galois correspondence) is F (∆); in particular,

    G ⊂ An if and only if √D ∈ F .
(So that’s another thing which the discriminant discriminates !)
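One can cross-check the two descriptions of D numerically: computing ∆² from the indexed roots of a reduced cubic, and comparing with the coefficient formula −4p³ − 27q² stated in Exercise CE below. The sample cubics in this Python sketch are our own:

```python
import cmath

def roots_of_reduced_cubic(p, q):
    """The three roots u+v, w*u + w^2*v, w^2*u + w*v from Exercise CC."""
    s = cmath.sqrt(q * q + 4 * p ** 3 / 27)
    u = ((-q + s) / 2) ** (1 / 3)
    v = -p / (3 * u)                      # compatible cube root: 3uv = -p
    w = cmath.exp(2j * cmath.pi / 3)
    return [u + v, w * u + w * w * v, w * w * u + w * v]

for p, q in [(-2.0, 1.0), (1.0, 1.0), (-7.0, 6.0)]:
    r = roots_of_reduced_cubic(p, q)
    delta = (r[0] - r[1]) * (r[0] - r[2]) * (r[1] - r[2])
    # D = delta^2 agrees with the coefficient formula -4p^3 - 27q^2
    assert abs(delta ** 2 - (-4 * p ** 3 - 27 * q ** 2)) < 1e-6
```

(For p = −7, q = 6 the cubic is (x − 1)(x − 2)(x + 3), with D = 400.)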
Exercise CE. Refer to CC, and assume the characteristic to be 0.
i) Show that the two corresponding cubics in CC i) have the same discriminant, and it is −4p^3 − 27q^2 . (More precisely, show that

    ∆ = 3(1 + 2ω) √( q^2 + 4p^3/27 )    [ and (1 + 2ω)^2 = −3 ] ,

where the square root is the one chosen in the formula for the roots of cubics,
assuming the roots to be indexed in the following order: u+v, ωu+ω^2 v, ω^2 u+
ωv.) Find another formula for the discriminant of the general (unreduced)
cubic, in terms of its coefficients a, b and c. Generalize the first phrase to
arbitrary degree: show how any monic of degree n can be replaced by a
‘reduced’ one, whose coefficient of x^{n−1} is zero, where the two polynomials
have roots differing merely by a ‘translation’, and have the same discriminant.
ii) Check directly from the formula for the discriminant of a cubic, g(x),
that D = 0 if and only if GCD{ g(x), Dg(x) } ≠ 1, where the last D denotes
the formal derivative. Now use the explicit formula for the roots to show
directly that these conditions are equivalent to the existence of a repeated
root.
iii) Prove the following easy facts about Galois groups of cubics:
A product of three linears in F [x] has trivial Galois group.
A linear times an irreducible quadratic in F [x] has Galois group cyclic of
order 2.
An irreducible cubic in F [x] has Galois group A3 or S3 , and we get A3 if and
only if ∆ ∈ F .
iv) Show that when F = Q and g(x) is irreducible in Q[x], and if we
take the splitting field which is inside C, then g(x) has three real roots when
D > 0; and only one real root when D < 0 [in which case the Galois group
is necessarily S3 , by iii)]. (Further discrimination !)
V. Modules over PIDs
and Similarity of Matrices.
The next five sections give a straightforward (but very important) generalization of 13.2, and a major application to matrix theory. Basic facts
concerning linear algebra and abelian groups are assumed, as well as unique
factorization in the ring (see sections 13, 17 and 25). The first section,
42, consists of the minimum needed theory for modules over a commutative ring. The student will likely find that later algebra courses emphasize
modules very strongly. Section 43 contains the structure theorem for finitely
generated modules over a Euclideanizable domain. Thus the above title misleads, motivating:
Project. Reprove everything in sections 43 and 44, assuming only that
R is a PID (or else, consult Hartley & Hawkes, Jacobson or Hungerford).
In Section 44, part of the more common approach for proving the structure theorem via Smith normal form is given. All of this specializes, with
R = Z, to abelian groups in Section 45, and to the application, with
R = F [x], to similarity of matrices in Section 46.
We shall label by ‘RV’ any statement whose proof is a routine verification,
left to the reader. These should provide a vehicle for plenty of mathematical
recreation. Only a few other exercises are included, except at the end of the
final section, 46.
42. Basics on modules.
In this section, R is any commutative ring (with 1, of course).
Definitions. An R-module (if R is a field, also called an R-vector space—
see 25) is an abelian group M together with a scalar multiplication

    R × M −→ M ;    (r, m) ↦ r · m ,
such that, for all r, r′ in R and m, m′ in M :

    (r + r′) · m = r · m + r′ · m ;    r · (m + m′) = r · m + r · m′ ;

    r · (r′ · m) = (rr′) · m ;    1 · m = m .
Such an M is finitely generated (if R is a field, also said to be spanned by
a finite set, which implies that M is finite dimensional) if and only if there
is a subset { m1 , · · · , mn } of M such that each m can be written in the
form r1 · m1 + · · · + rn · mn for some ri in R. The set { m1 , · · · , mn } is
then a set of generators for M . (If R is a field, it is also called a spanning
set, and, when n is minimal, a basis.) A module which can be generated by
one element (to be fussy, by a singleton set) is called cyclic—when R = Z,
this agrees with our previous notation for abelian groups.
A group morphism θ : M1 → M2 between R-modules is a module morphism (if R is a field, also called a linear transformation ) if and only if θ
also satisfies θ(r · m) = r · θ(m) for all r ∈ R and m ∈ M1 . Such a θ is a
module isomorphism if and only if it is bijective.
Having written these definitions down, we’ll normally cease using the “·”,
since the ring multiplication and the module scalar multiplication can always
be distinguished by the context.
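The module axioms are easy to verify mechanically in a toy case. The following Python sketch treats Z/6 as a Z-module (our own illustration; the ranges of test values are arbitrary):

```python
# The Z-module M = Z/6: elements 0..5, with scalar multiplication
# r . m = r*m mod 6.  Check all four module axioms on a range of scalars.
M = range(6)
R = range(-10, 11)
s = lambda r, m: (r * m) % 6

for r in R:
    for rp in R:
        for m in M:
            assert s(r + rp, m) == (s(r, m) + s(rp, m)) % 6   # axiom 1
            assert s(r, s(rp, m)) == s(r * rp, m)             # axiom 3
            for mp in M:
                assert s(r, (m + mp) % 6) == (s(r, m) + s(r, mp)) % 6  # 2
for m in M:
    assert s(1, m) == m                                       # axiom 4
```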
RV i) If θ is an isomorphism, then θ^{−1} is a module morphism, and so an
isomorphism. Then we write M1 ≅ M2 , so that ≅ is an equivalence relation,
using also that identity maps and composites of morphisms are morphisms.
A subgroup N of an R-module M is a submodule (if R is a field, also
called a subspace ) if and only if N also satisfies rn ∈ N for all n ∈ N and
r ∈ R. Then the quotient group M/N is given the scalar multiplication
r(m + N ) := (rm) + N .
RV ii) This scalar multiplication is well-defined, and makes M/N into a
module, the ‘quotient module’.
Let θ : M1 → M2 be a module morphism.
RV iii) Kerθ and Imθ are submodules of M1 and M2 , respectively.
RV iv) For submodules N1 ⊂ Kerθ ⊂ M1 and Imθ ⊂ N2 ⊂ M2 , the
factorization of θ as a composite,
M1 −→ M1 /N1 −→ N2 −→ M2 ,
consists of three module morphisms: the canonical surjection; followed by
the map sending m1 + N1 to θ(m1 ); followed by the inclusion.
RV v) Taking N1 = Kerθ and N2 = Imθ, the group isomorphism
M1 /Kerθ → Imθ in the middle above is in fact a module isomorphism.
If M1 , · · · , Mt are R-modules, their external direct sum is

    M1 ⊕ · · · ⊕ Mt := { (m1 , · · · , mt ) : mi ∈ Mi } ,
with its usual abelian group structure using coordinatewise addition, and
given the coordinatewise scalar multiplication:
r(m1 , · · · , mt ) := (rm1 , · · · , rmt ) .
RV vi) This makes M1 ⊕ · · · ⊕ Mt into an R-module.

RV vii) M1 ⊕ (M2 ⊕ · · · ⊕ Mt ) ≅ M1 ⊕ · · · ⊕ Mt .

This is virtually tautological rather than merely routine. The other associativity isomorphisms for ⊕ hold; vii) happens to be the one needed later.
RV viii) Using its ring operations, R is itself an R-module.
RV ix) The submodules of the module R are precisely the ideals I of the
ring R.
Thus, for example, R/I is a quadruply abused notation: it denotes a set,
an abelian group, a ring, an R-module, and an R/I-module.
RV x) Let M be an R-module, and S a subring of R. Then restricting the scalar multiplication,

    S × M −−inclusion−→ R × M −−scalar mult.−→ M ,

makes the abelian group M into an S-module.
For example, any C-vector space may be thought of in addition as an R-vector space, of double the dimension. Any F [x]-module ‘is’ also an F -vector
space.
In x) we could have had R and S being any two commutative rings, with
a fixed ring morphism φ : S → R. But only the case when φ is inclusion of
a subring will be used below.
43. Structure of finitely generated modules
over euclidean domains.
Assume now that R is a Euclideanizable domain, although we only need
it to be a PID for the proof given of 43.2 to work.
Theorem 43.1. If M is a finitely generated R-module, then there are
integers k ≥ 0 and ℓ ≥ 0, and non-zero non-invertible elements di of R, with
d1 | d2 | · · · | dℓ , such that

    M ≅ R/(d1 ) ⊕ R/(d2 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ R^k .

[Here the ideal in R generated by d is denoted (d), and is a submodule of R
by RV ix).]
Theorem 43.2. With notation as in 43.1, if also we have non-negative
integers j and m, and elements e1 | · · · | em , with no ei zero or invertible,
such that

    R/(e1 ) ⊕ · · · ⊕ R/(em ) ⊕ R^j ≅ R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ R^k ,

then j = k, ℓ = m, and, for all i, the elements di and ei are associates in R.
We may paraphrase these theorems as saying that every finitely generated
module is ‘uniquely’ a direct sum of cyclic modules. Combining these two
theorems, we have a classification for finitely generated modules M over a
Euclideanizable domain (which actually holds more generally over any PID):
each such M corresponds to an invariant

    (k ; [d1 ] , · · · , [dℓ ]) ,

where [d] denotes the ‘associate class’ of d in R. The set of isomorphism
classes of such modules is thus in 1-1 correspondence with the set of sequences
which can serve as an invariant. Let us re-emphasize that possibly k or ℓ is
zero, that no di is zero or invertible, and that di divides di+1 for each i < ℓ.
This section will give efficient proofs of these theorems, proofs which don’t
reveal in any very direct way how to calculate the invariant. An algorithm
for doing that is discussed in the next section.
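To anticipate: for R = Z, one such algorithm reduces a presentation matrix of M by integer row and column operations (the Smith normal form approach that Section 44 alludes to). Here is a minimal Python sketch for small integer matrices (our own rough implementation, for illustration only):

```python
def smith_invariants(m):
    """Diagonal entries d1 | d2 | ... of the Smith normal form of m,
    computed by integer row and column operations."""
    m = [row[:] for row in m]
    invariants = []
    while m and m[0]:
        entries = [(abs(v), i, j) for i, row in enumerate(m)
                   for j, v in enumerate(row) if v]
        if not entries:
            break
        _, i, j = min(entries)                  # smallest non-zero entry
        m[0], m[i] = m[i], m[0]                 # move it to position (0,0)
        for row in m:
            row[0], row[j] = row[j], row[0]
        pivot = m[0][0]
        for row in m[1:]:                       # clear the first column...
            q = row[0] // pivot
            for k in range(len(row)):
                row[k] -= q * m[0][k]
        q_cols = [v // pivot for v in m[0]]     # ...and the first row
        for row in m:
            for k in range(1, len(row)):
                row[k] -= q_cols[k] * row[0]
        if any(row[0] for row in m[1:]) or any(m[0][1:]):
            continue                            # remainders left: retry
        bad = next((i for i, row in enumerate(m[1:], 1)
                    if any(v % pivot for v in row)), None)
        if bad is not None:                     # pivot must divide the rest
            for k in range(len(m[0])):
                m[0][k] += m[bad][k]            # mix that row in and retry
            continue
        invariants.append(abs(pivot))
        m = [row[1:] for row in m[1:]]          # recurse on the submatrix
    return invariants

# Z^3 modulo the row space of the matrix is Z/2 + Z/2 + Z/12, in the
# divisibility-ordered form of 43.1 (with R = Z and k = 0):
assert smith_invariants([[2, 0, 0], [0, 4, 0], [0, 0, 6]]) == [2, 2, 12]
assert smith_invariants([[2, 1], [0, 2]]) == [1, 4]
```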
Proof of Theorem 43.1.
If M = {0}, the theorem holds with ℓ = k = 0, using the usual conventions. Proceed by induction on n := the minimum number of generators for
M , to prove the slightly stronger statement that

    the theorem holds with, in addition, k + ℓ = n .
Start the induction with n = 0 for which M = {0} as above. [If this makes
you nervous—it shouldn’t—, specialize and simplify the following proof to
n = 1, and it gives a proof for an initial case with n = 1.]
For the inductive step, take n > 0 and fix M such that M can be generated
by some n-element set, but not by any (n − 1)-element set. Define G to be
the following subset of R :
G := { r ∈ R : ∃{m1 , · · · , mn } generating M ,
and {r2 , · · · , rn } ⊂ R, with rm1 +
,
i>1
ri m i = 0 } .
Note that no element of G is invertible, since the minimality of n would
be contradicted if we could express m1 in terms of the other generators:
m1 = − r^{−1} Σ_{i>1} ri mi . Clearly 0 ∈ G.
Case 1. Assume that G = {0}:
Pick a set { m1 , · · · , mn } generating M , and define θ : Rn → M by
θ(r1 , · · · , rn ) = r1 m1 + · · · + rn mn .
It is straightforward to check that θ is a module morphism. It is surjective
by the definition of the statement: ‘{ m1 , · · · , mn } generates M ’. It is
injective because G = {0}: [If (r1 , · · · , rn ) ∈ Kerθ, then each ri ∈ G, since
we can re-order the generating set as
{ mi , m1 , · · · , mi−1 , mi+1 , · · · , mn } .
But ri ∈ G implies ri = 0, so Kerθ = {(0, 0, · · · , 0)}, as required.] Thus θ is
an isomorphism, so that the theorem holds for M with ℓ = 0 and k = n.
Case 2. Assume that G ≠ {0}:
Let δ : R \ {0} → N be a fixed function such that (R, δ) is a Euclidean
domain. Now choose d1 so that d1 ≠ 0 and d1 ∈ G, with δ(d1 ) ≤ δ(r) for
all non-zero r ∈ G. [The crucial thing is that ≤ is a discrete linear order on
N. We couldn’t do this if we only knew that δ took values in the non-negative
rationals or reals, for example.]
Claim i): For all { m1 , · · · , mn } generating M , if

    d1 m1 + Σ_{i>1} ri mi = 0 ,

then d1 divides rj for all j > 1.
Proof. Let rj = qd1 + s, where either s = 0 or δ(s) < δ(d1 ) (using the
division algorithm). Then

    0 = d1 m1 + (qd1 + s)mj + Σ_{i ≠ 1, j} ri mi
      = smj + d1 (m1 + qmj ) + Σ_{i ≠ 1, j} ri mi .
But
B := { mj , m1 + qmj , m2 , · · · , mj−1 , mj+1 , · · · , mn }
is an n-element set which generates M : [It suffices to express each mi as a
linear combination from B—this is trivial for i > 1, and for i = 1 we have
m1 = (1)(m1 + qmj ) + (−q)(mj ).] Thus s ∈ G. But s ≠ 0 and δ(s) < δ(d1 )
would contradict the definition of d1 , so s = 0, as required.
Claim ii): For all { m1 , · · · , mn } generating M , if

    d1 m1 + Σ_{i>1} ri mi = 0 = Σ_{i≥1} ri′ mi ,

then d1 divides r1′ .
Proof. Write r1′ = q′ d1 + s′ where either s′ = 0 or δ(s′) < δ(d1 ). Then

    0 = 0 − q′ 0 = Σ_{i≥1} ri′ mi − q′ ( d1 m1 + Σ_{i>1} ri mi )
      = s′ m1 + Σ_{i>1} (ri′ − q′ ri ) mi .

Thus s′ ∈ G, so δ(s′) < δ(d1 ) is not possible, as required.
Now, using i), fix some n-element set { m1 , · · · , mn } generating M ,
and a relation

    d1 ( m1 + Σ_{i>1} ti mi ) = 0

for some t2 , · · · , tn in R. Define

    m̄1 = m1 + Σ_{i>1} ti mi .
Claim iii): The set A := { m̄1 , m2 , m3 , · · · , mn } generates M .
Proof. It suffices to express each mi as a linear combination from A.
This is trivial for i > 1, since then mi ∈ A. For i = 1, just note that
m1 = m̄1 − t2 m2 − t3 m3 − · · · − tn mn .
Let < m̄1 > be the (cyclic) submodule of M generated by m̄1 ; that is,
< m̄1 > := { rm̄1 : r ∈ R } .
Claim iv): The set
{ m2 + < m̄1 >, m3 + < m̄1 >, · · · , mn + < m̄1 > }
generates the quotient module M/ < m̄1 >.
Proof. Given m ∈ M , use iii) to choose ri ∈ R such that

    m = r1 m̄1 + Σ_{i>1} ri mi .

Then

    m + < m̄1 > = ( Σ_{i>1} ri mi + r1 m̄1 ) + < m̄1 >
                = ( Σ_{i>1} ri mi ) + < m̄1 >
                = Σ_{i>1} ri (mi + < m̄1 >) ,

as required.
Claim v): We have M/ < m̄1 > ≅ R/(d2 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ R^k
for some k ≥ 0 , ℓ − 1 ≥ 0 , and non-zero non-invertible d2 | d3 | · · · | dℓ such
that k + ℓ − 1 ≤ n − 1 .
We’ll soon see that = holds, not <, in the last inequality.
Proof. By iv), M/ < m̄1 > can be generated by “n − 1” elements, so
this is immediate from the inductive hypothesis.
Claim vi): < m̄1 > ≅ R/(d1 ) .
Probably (d1 ) should be written < d1 > for consistency, but earlier in field theory
we always used (d1 ) when thinking of it as an ideal rather than as a submodule.
Proof. Define φ : R → < m̄1 > by φ(r) = rm̄1 . Then φ is easily seen
to be a module morphism. It is surjective by the definition of
< m̄1 >. Now d1 m̄1 = 0 by definition of m̄1 , so d1 ∈ Kerφ, and thus
(d1 ) ⊂ Kerφ. But
    r ∈ Kerφ =⇒ 0 = rm̄1 = rm̄1 + 0m2 + 0m3 + · · · + 0mn
             =⇒ d1 | r    (by ii))
             =⇒ r ∈ (d1 ) .
Thus Kerφ ⊂ (d1 ), giving Kerφ = (d1 ); and so

    R/(d1 ) = R/Kerφ ≅ Imφ = < m̄1 > ,
as required.
Claim vii): M ≅ < m̄1 > ⊕ (M/ < m̄1 >) .
Proof. Given m ∈ M , write
    m = r1 m̄1 + Σ_{i>1} ri mi ,

using iii). Define ψ : M −→ < m̄1 > ⊕ (M/ < m̄1 >) by

    ψ(m) := (r1 m̄1 , m + < m̄1 >) .

To show that this is well defined, suppose that

    r1 m̄1 + Σ_{i>1} ri mi = r1′ m̄1 + Σ_{i>1} ri′ mi .

Then (r1 − r1′)m̄1 + Σ_{i>1} (ri − ri′)mi = 0 . By ii) and iii), d1 | (r1 − r1′). Thus
(r1 − r1′)m̄1 = 0 since d1 m̄1 = 0. So r1 m̄1 = r1′ m̄1 , showing that the first
component in the definition of ψ is well-defined, as required. Also ψ is easily
seen to be a module morphism. It is surjective, by the definition of < m̄1 >.
Finally, to prove that ψ is injective, suppose that m = r1 m̄1 + Σ_{i>1} ri mi
is in Kerψ. Then

    (0, 0) = ψ(m) = (r1 m̄1 , m + < m̄1 >) ,

so r1 m̄1 = 0 and Σ_{i>1} ri mi = m ∈ < m̄1 >. Writing Σ_{i>1} ri mi as r0 m̄1 ,
we see that d1 | (−r0 ) by ii). Therefore r0 m̄1 = 0, that is, Σ_{i>1} ri mi = 0.
Thus m = 0.
Claim viii): M ≅ R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ R^k ,

for d1 as defined at the beginning of Case 2, and for k and d2 , · · · , dℓ as in
v).
Proof. This is immediate, combining vii), vi) and v).
Claim ix): d1 is non-zero and non-invertible, k + ℓ = n, and d1 | d2 if it
happens that ℓ > 1.
Proof. The first statement is clear since G contains no invertible elements.
Let θ : M → D be any isomorphism as in viii), where D is the direct sum
in viii). Define elements as follows, where the right-hand sides are non-zero
in the ith slot:
u1 , · · · , uℓ in M , by θ(ui ) := (0, · · · , 0, 1 + (di ), 0, · · · , 0) ,
and
vℓ+1 , · · · , vℓ+k in M , by θ(vi ) := (0, · · · , 0, 1, 0, · · · , 0) .
Then { u1 , · · · , uℓ , vℓ+1 , · · · , vℓ+k } generates M since it maps under the
isomorphism θ into a set of generators for D. Thus k + ℓ ≥ n, since M cannot
be generated by fewer than “n” elements. By v), k + ℓ ≤ n, so k + ℓ = n, as
required. Also if ℓ > 1, then
d1 (1 + (d1 ), 0, · · · , 0) = 0D = d2 (0, 1 + (d2 ), 0, · · · , 0) .
Thus, applying θ−1 , we see that d1 u1 = 0M = d2 u2 . Thus
d1 u1 + d2 u2 + 0u3 + · · · + 0vℓ+k = 0 . Hence d1 | d2 by i), as required.
Since d2 | d3 | · · · by v), the last two claims complete the induction, and
so, the proof of Theorem 43.1.
Proof of Theorem 43.2.
This will be presented as a sequence of definitions and routine verifications.
Definition. The torsion submodule, TM , of a module M is
TM := { m ∈ M : rm = 0 for some r ≠ 0 } .
RV i): TM is a submodule of M .
RV ii): If θ : M → N is a module isomorphism, then θ(TM ) = TN , so
that TM ≅ TN . Furthermore,
M/TM ≅ N/TN via m + TM ↦ θ(m) + TN .
RV iii): If D is the direct sum in 43.1, then
TD ≅ R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) and D/TD ≅ Rk .
Combining iii) [and its analogue for the other direct sum in 43.2] with
ii), the hypothesis of 43.2 yields
RV iv): R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) ≅ R/(e1 ) ⊕ · · · ⊕ R/(em )
and
RV v): Rk ≅ Rj .
This has split the proof into two parts: a ‘torsion’ part, iv); and a ‘free’
part, v), which we deal with first.
RV vi): Let QR be the field of fractions of R, and let θ : Rj → Rk
be an isomorphism of R-modules from v). Then φ : QR^j → QR^k , given by
φ(a1 /b, · · · , aj /b) := (1/b)θ(a1 , · · · , aj ), is an isomorphism of vector spaces
over QR .
It now follows from v), vi), and elementary linear algebra that j = k, as
required, since QR^j and QR^k are isomorphic vector spaces, and so they have
the same dimension.
Now let p be any irreducible in R. Thus R/(p) [as a ring] is in fact a field.
For any non-negative integer α and R-module M , let
pα M := { pα m : m ∈ M } .
RV vii): pα+1 M is a submodule of pα M , itself a submodule of M .
RV viii): The abelian group pα M / pα+1 M is an R/(p)-vector space,
using the scalar multiplication
(r + (p)) ( m + pα+1 M ) := rm + pα+1 M .
The main point is to check that this is well-defined.
RV ix): If M ≅ N as R-modules, then pα M/pα+1 M ≅ pα N/pα+1 N as
R/(p)-vector spaces.
Lemma x): For M = R/(d), the dimension over R/(p) of the vector
space pα M/pα+1 M is
1 if pα+1 | d in R ; 0 if not .
(For example, when R = Z , we have 2Z12 /4Z12 ≅ Z2 whereas 4Z12 /8Z12 ≅ {0} .)
Proof. Let λ0 := pα + (d) = pα (1 + (d)) ∈ pα M . For any other element
λ = pα b + (d) of pα M , clearly λ = bλ0 , and so
λ + pα+1 M = (b + (p)) ( λ0 + pα+1 M ) .
Thus λ0 + pα+1 M spans pα M/pα+1 M over R/(p). It remains to prove that
λ0 + pα+1 M is non-zero if and only if pα+1 | d in R. Now
[λ0 + pα+1 M is zero] ⇐⇒ [ pα + (d) = pα+1 c + (d) for some c ∈ R ]
⇐⇒ [ pα (1 − pc) ∈ (d) for some c ∈ R ] .
The last statement implies that pα+1 does not divide d, since
pα (1 − pc) = de would otherwise contradict unique factorization, the
left-hand side not being divisible by pα+1 .
Conversely, if pα+1 does not divide d, we must prove the existence of c
and e in R with pα (1 − pc) = de. Write this as d′ e + pα+1−µ c = pα−µ , where
d = pµ d′ with GCD { p, d′ } = 1. It is clear that such e and c can be found,
since the equation d′ x + pα+1−µ y = 1 has a solution (x, y) in R × R.
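Lemma x) can be checked mechanically in the case R = Z, M = Z/d: the quotient pα M / pα+1 M has p elements exactly when pα+1 | d, and is trivial otherwise. A small sketch (function name ours, used only for this illustration):

```python
def quotient_order(d, p, a):
    """Order of (p^a * Z/d) / (p^(a+1) * Z/d), computed by brute force."""
    num = {(p ** a * m) % d for m in range(d)}        # p^a M as a subset of Z/d
    den = {(p ** (a + 1) * m) % d for m in range(d)}  # p^(a+1) M
    return len(num) // len(den)                       # index = order of the quotient

# The examples from the text: 2*Z12/4*Z12 = Z2, but 4*Z12/8*Z12 = {0}.
assert quotient_order(12, 2, 1) == 2   # 2^2 = 4 divides 12: dimension 1 over Z/2
assert quotient_order(12, 2, 2) == 1   # 2^3 = 8 does not divide 12: dimension 0
assert quotient_order(12, 3, 0) == 3   # 3 divides 12: dimension 1 over Z/3
```
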
RV xi): As R/(p)-vector spaces,
pα (M1 ⊕ M2 )/pα+1 (M1 ⊕ M2 ) ≅ (pα M1 /pα+1 M1 ) ⊕ (pα M2 /pα+1 M2 ) .
The proof of Theorem 43.2 is now completed as follows:
Consider pα L / pα+1 L, where L is the left-hand side in iv). Using x)
and the iteration of xi) to any finite number of summands, its dimension
over R / (p) is the number of i for which pα+1 divides di . By iv) and ix) this
must equal the number of i for which pα+1 divides ei . Since this is true for all
irreducibles p and non-negative integers α, the divisibility conditions di | di+1
and ei | ei+1 and unique factorization in R show that ℓ = m and di ∼ ei
for all i, since these “numbers of i” referred to above clearly determine the
prime power factorizations of all di and ei up to an invertible factor. (See
also the definition of elementary divisors at the end of Section 44).
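The counting argument just used can be made concrete over R = Z: the numbers c(p, α) = #{ i : pα+1 | di } determine each invariant factor, given d1 | d2 | · · · | dℓ. The sketch below (helper names ours) rebuilds the chain from those counts alone:

```python
def c(ds, p, a):
    """Number of i with p^(a+1) dividing d_i."""
    return sum(1 for d in ds if d % p ** (a + 1) == 0)

def reconstruct(ds, primes):
    """Rebuild the divisibility chain from the counts c(p, a) alone."""
    l = len(ds)
    out = []
    for i in range(1, l + 1):
        d = 1
        for p in primes:
            # multiplicity of p in d_i: how many a have c(p, a) >= l - i + 1
            mult = sum(1 for a in range(64) if c(ds, p, a) >= l - i + 1)
            d *= p ** mult
        out.append(d)
    return out

assert reconstruct([2, 12], [2, 3]) == [2, 12]
assert reconstruct([6, 12, 24], [2, 3]) == [6, 12, 24]
```
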
Remark. The proof of 43.2 evidently applies when R is any PID, (any
UFD??), although 43.1 fails in general for UFD’s.
Exercise 43A. Justify the last remark by an example. Hint: Try taking
the ring R to be Z[x].
(a) Take M as the quotient (R ⊕ R)/D, where D is the cyclic submodule
generated by (2, x). Show that:
i) M has no torsion (i.e. rm = 0 implies r = 0 or m = 0 for r ∈ R and
m ∈ M );
ii) M is not cyclic;
iii) the module defined in exactly the same way, except that R = Q[x], is
cyclic.
Assuming that M were a direct sum of quotients of R, deduce from i) that all
the summands are isomorphic to R, from ii) that the number of summands
is greater than 1, and from iii) that the number of summands is less than 2.
(b) Perhaps simpler is just to take M to be the ideal generated by {2, x},
as a submodule of Z[x].
(c) Now show that the modules in parts (a) and (b) are actually isomorphic (so your hard work in one of the previous two parts was unnecessary).
Exercise 43B. Can you generalize 43A(b) to find an example (at least)
for any UFD which has a non-principal ideal generated by two elements?
ADDENDUM. An alternative proof that iv) and the divisibility
conditions imply that ℓ = m and di ∼ ei for all i.
Definition. ‘The’ annihilator, ann(M ), of an R-module M , is any generator for the ideal
{ r ∈ R : rm = 0 for all m ∈ M }
( using that R is a PID).
RV xii): This is an ideal, and so, since R is a PID, the element ann(M )
is well-defined up to associates.
RV xiii): If M ≅ N as R-modules, then ann(M ) and ann(N ) are associates.
RV xiv): If L is the left-hand side in iv), then ann(L) ∼ dℓ .
(Use the fact that di | di+1 ).
It follows from iv), xiii) and xiv) that dℓ ∼ em , as required. Now let E
be the right-hand side in iv), and let θ : L → E be a module isomorphism.
Let m0 ∈ L be any element for which rm0 = 0 implies that dℓ | r [for
example, m0 may be any element whose last component is 1 + (dℓ )]. Let
< m0 > := { rm0 : r ∈ R }, the cyclic submodule generated by m0 .
RV xv): We have L/ < m0 > ≅ E/ < θ(m0 ) > as modules, via
m + < m0 > ↦ θ(m) + < θ(m0 ) > .
Lemma xvi): L/ < m0 > ≅ R/(d1 ) ⊕ · · · ⊕ R/(dℓ−1 ) .
Given this lemma, the proof is completed by induction on ℓ: combining
xvi) [and its analogue for E/ < θ(m0 ) > ] with xv) yields
R/(d1 ) ⊕ · · · ⊕ R/(dℓ−1 ) ≅ R/(e1 ) ⊕ · · · ⊕ R/(em−1 ) .
Thus, by the inductive hypothesis, ℓ − 1 = m − 1 and di ∼ ei for 1 ≤ i < ℓ,
as required.
To prove the lemma, we use the method of the next section. Letting
mi = (0, · · · , 0, 1 + (di ), 0, · · · , 0), the module L has generators {m1 , · · · , mℓ }
and relations di mi = 0 for 1 ≤ i ≤ ℓ. Thus L/ < m0 > also has “ℓ” generators
with the ‘same’ relations plus an extra one given by m0 + < m0 > = 0. Thus
the matrix C giving the relations is

    [ d1                 ]
    [     d2        0    ]
    [         . .        ]
    [  0            dℓ   ]
    [ b1  b2  · · ·  bℓ  ]     [of shape (ℓ + 1) × ℓ ],

where m0 = b1 m1 + b2 m2 + · · · + bℓ mℓ . But the possible choices of m0 are
exactly those for which
GCD{d1 d2 . . . dℓ , b1 d2 . . . dℓ , d1 b2 . . . dℓ , · · · , d1 d2 . . . dℓ−1 bℓ } ∼ d1 d2 . . . dℓ−1 .
(For this proof it is not enough to simply take m0 = mℓ , that is, bℓ = 1 and
all other bi = 0, because we need the analogue for E.) But then the above
matrix is easily reduced by row/column operations to

    [ 1                      ]
    [     d1          0      ]
    [         . .            ]
    [  0           dℓ−1      ]
    [ 0   0   · · ·    0     ]  ,
as required.
Note that this argument can be re-written so that it does not use anything
from Section 44. On the other hand, nothing from Section 44 which was just
used depends at all on the theorem, 43.2, which was being partially reproved.
44. Generators, relations and elementary operations.
We shall continue to assume that R is a Euclideanizable domain, although
the first few paragraphs apply to any commutative ring. Let C = (cij ) be
a matrix in Rp×n , the set of p × n matrices with entries in R. Define an
R-module:
MC := Rn / < Σ_{j=1}^{n} cij ej : 1 ≤ i ≤ p > ,
where ej = (0, · · · , 0, 1, 0, · · · , 0) ∈ Rn , and the denominator is the submodule
generated by the given (“p”) elements. Thus MC = Rn /ImθC , where θC :
Rp → Rn is the module morphism uniquely determined by requiring that
θC (fi ) = Σ_j cij ej , where fi := (0, · · · , 1, · · · , 0) ∈ Rp .
Remark. Any module M which is isomorphic to MC , via some isomorphism
which maps ej + ImθC to (say) mj ∈ M for each j, will be said
to be ‘given by generators m1 , · · · , mn and relations Σ_j cij mj = 0’. This is
equivalent to the existence of a module surjection Rn → M , mapping ej to
mj , with kernel equal to ImθC . In M , the mi generate, and the relations
hold; but this also captures the more subtle idea that all other relations in
M follow from the given ones. For a brief discussion of the analogous matters
relating to (non-commutative) groups, see the last paragraphs of Section 11.
Proposition 44.1. If also D ∈ Rp×n , and if there exist matrices P ,
invertible in Rp×p , and Q, invertible in Rn×n , such that P CQ = D, then
MC ≅ MD .
Proof. First one checks that θC′C = θC ◦ θC′ whenever the matrix product
C′C is defined, by evaluating both sides on the ‘standard basis’ generators,
and using that both maps are module morphisms.
Thus the diagram

               θC
    Rp   −−−−−→   Rn
    θP ↑             ↓ θQ
               θD
    Rp   −−−−−→   Rn

is commutative. Furthermore, the vertical arrows denote isomorphisms since,
for example, θP ◦ θP −1 = θP −1 P = θI , which is the identity map of Rp . It
follows that θQ maps ImθC isomorphically onto ImθD , and so determines an
isomorphism
MC = Rn /ImθC −→ Rn /ImθD = MD
by
v + ImθC ↦ θQ (v) + ImθD ,
as required.
Exercise 44A. Formulate and prove a converse to 44.1. (This won’t be
used below.)
Proposition 44.2. If, for integers m, ℓ, p and n, the p × n diagonal
matrix D, beginning with “m” 1’s, is

    [ 1                             ]
    [     . .                       ]
    [         1             0       ]
    [           d1                  ]
    [               . .             ]
    [   0               dℓ          ]
    [                         0     ]

for di ∈ R, then
MD ≅ R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ Rn−m−ℓ .
Corollary 44.3. If (as in 44.1) P CQ = D, for D as in 44.2 with
non-zero non-invertible di such that di | di+1 for all i < ℓ, then the invariant
of MC is (n − m − ℓ ; [d1 ], · · · , [dℓ ]).
Proof of 44.2. Again we use the ‘RV-style’; most of the RVs below are
nearly tautological, particularly iv) and v).
Definition. Given module morphisms θi : Mi′ → Mi for 1 ≤ i ≤ n,
define
θ1 ⊕ · · · ⊕ θn : M1′ ⊕ · · · ⊕ Mn′ −→ M1 ⊕ · · · ⊕ Mn
as usual (coordinatewise) by (m1′ , · · · , mn′ ) ↦ (θ1 m1′ , · · · , θn mn′ ) .
RV i): Im(θ1 ⊕ · · · ⊕ θn ) = (Imθ1 ) ⊕ · · · ⊕ (Imθn ) , so
(M1 ⊕ · · · ⊕ Mn )/Im(θ1 ⊕ · · · ⊕ θn ) ≅ (M1 /Imθ1 ) ⊕ · · · ⊕ (Mn /Imθn )
via the map: coset of (m1 , · · · , mn ) ↦ (m1 + Imθ1 , · · · , mn + Imθn ).
RV ii): If D′ is the same as D, except that it is a square matrix of shape
n × n, then θD and θD′ have the same image.
RV iii): For D′ as in ii), we have θD′ = θ1 ⊕ · · · ⊕ θn , where
θi : R → R is multiplication by
1 if 1 ≤ i ≤ m ;
di−m if m + 1 ≤ i ≤ m + ℓ ;
0 if i > m + ℓ .
RV iv): If θ : R → R is multiplication by d, then Imθ = (d).
RV v): R/(1) ≅ {0} ; R/(0) ≅ R ; {0} ⊕ M ≅ M .
Now the proof is finished as follows:
MD := Rn /ImθD = Rn /ImθD′                                  by ii)
= Rn /Im(θ1 ⊕ · · · ⊕ θn )                                  by iii)
≅ (R/Imθ1 ) ⊕ · · · ⊕ (R/Imθn )                             by i)
= (R/(1))m ⊕ R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ (R/(0))n−m−ℓ       by iii), iv)
≅ R/(d1 ) ⊕ · · · ⊕ R/(dℓ ) ⊕ Rn−m−ℓ                        by v) .
Assertion 44.4. For each C, there exist P, Q and D as in Corollary 44.3.
Rather stronger: there is an ‘effective method’ (or algorithm) to actually
calculate P, Q and D.
Remarks. The matrix D in 44.3 is sometimes called the Smith normal
form; and applying an algorithm as below is called reduction to Smith normal
form.
Before giving an informal description of the algorithm, we relate this to
the approach used in several texts on this subject. Given M with generators
{ m1 , · · · , mn }, there is obviously a module surjection Rn → M determined by ej ↦ mj . What is not obvious (but is true) is that the kernel of
this surjection is finitely generated, say by “p” elements of Rn (stronger:
even with p ≤ n). In fact any submodule of a finitely generated module over
a PID is itself finitely generated.
Exercise 44B. Prove this, using only the fact that ideals are finitely generated (i.e. R is Noetherian)—the case of Rn is the crucial one, since the
general case will follow by mapping the free module onto it. (There is a
proof in Artin, p.469.)
It follows that M ≅ MC for some matrix C (in fact, many different C); one
says that every finitely generated module over a PID is ‘finitely presentable’.
From this, 44.1, 44.2 and 44.4 evidently give a second proof of Theorem
43.1. With a little more work, one can also give a ‘matrix proof’ of Theorem
43.2; that is, given C, the matrix D in 44.3 is (up to associates) unique (but
P and Q are not). Actually, the logic can be reversed to deduce the above
from 43.1 and 43.2.
In many texts, the above approach is used. We chose to separate out
the ‘effective methods’, proofs of whose existence are not necessarily the
best proofs of 43.1 and 43.2. Often the proofs given of the existence and
correctness of the algorithm are only part way towards a proof acceptable in
computation theory.
Exercise 44C. Show that, when R is a field, the existence of Smith
normal form is equivalent to the fact that any linear transformation V → W
between finite dimensional R-vector spaces can be represented by a matrix
of the form

    [ I  0 ]
    [ 0  0 ]  .

(When V = W , one is of course allowed different bases for the domain and
codomain.)
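Over the field Q, the normal form of Exercise 44C can be produced by the four operation types alone. Below is a naive sketch using exact rational arithmetic (the function name and the sample matrix are ours, chosen only for illustration):

```python
from fractions import Fraction

def rank_normal_form(M):
    """Reduce M over Q by r/c operations to [[I_r, 0], [0, 0]]; return (D, r)."""
    A = [[Fraction(x) for x in row] for row in M]
    p, n = len(A), len(A[0])
    r = 0
    while True:
        # find a pivot in the untouched south-east block
        pivot = next(((i, j) for i in range(r, p) for j in range(r, n)
                      if A[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        A[r], A[i] = A[i], A[r]                      # Ir: swap rows
        for row in A:                                # Ic: swap columns
            row[r], row[j] = row[j], row[r]
        inv = 1 / A[r][r]
        A[r] = [inv * x for x in A[r]]               # IIIr: scale pivot row to 1
        for i in range(p):                           # IIr: clear column r
            if i != r and A[i][r] != 0:
                f = A[i][r]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        for j in range(n):                           # IIc: clear row r
            if j != r and A[r][j] != 0:
                f = A[r][j]
                for i in range(p):
                    A[i][j] -= f * A[i][r]
        r += 1
    return A, r

D, r = rank_normal_form([[1, 2, 3], [2, 4, 6], [0, 1, 1]])
assert r == 2 and D[0] == [1, 0, 0] and D[1] == [0, 1, 0] and D[2] == [0, 0, 0]
```
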
Informal description of an algorithm for 44.4. As in linear algebra,
there are three types of row operation on matrices in Rp×n :
Ir . Interchange two rows.
IIr . Add a multiple of one row to a different row.
IIIr . Multiply a row by an invertible in R.
These are reversible, as are the analogous column operations, denoted
Ic , IIc , IIIc . The following assertion is then proved as in linear algebra over
a field: for all C and D, there exist invertible P, Q with P CQ = D if and
only if some sequence of row/column operations leads from C to D.
The procedure, for converting a given C by r/c operations to a D as in
44.2, goes by induction on size (as in linear algebra when R is a field). It
suffices to show how to get a top left entry which divides all other entries;
then one ‘kills’ the rest of the first row and column, and proceeds (southeast!) inductively.
To do this, pick a Euclidean function δ on R. It suffices to show that if
a non-zero matrix entry of minimum δ does not divide some other entry,
then certain operations will produce a matrix with smaller such minimum δ.
Iteration of this must terminate eventually, at which point a non-zero entry
of minimum δ dividing all other entries can be moved to the top left, using the
operations Ir and Ic . Assume then that T has an entry tij with δ(tij ) ≤ δ(tkℓ )
for all tkℓ ≠ 0, but tij fails to divide some entry of T .
i) If it doesn’t divide an entry in its own row, say tiℓ , write this entry
tiℓ = qtij + r where r ≠ 0 and δ(r) < δ(tij ). Then subtracting q times the jth
column from the ℓth has the desired result of lowering the minimum value of
δ.
ii) If tij doesn’t divide an entry in its own column, just interchange “row”
and “column” in i).
iii) Finally suppose that tij divides all entries in its own row and column,
but does not divide tkℓ . Let tiℓ = rtij . Add (1 − r) times the jth column to
the ℓth column. Perhaps this decreases the minimum δ, as required. If not,
note that the new (i, ℓ)th entry is tij and the new (k, ℓ)th entry is congruent
to tkℓ modulo tij . Thus we are now in case ii) above, with (i, j) replaced by
(i, ℓ).
This completes the sketch of the algorithm. Programming and increasing
efficiency may be safely left to an underling.
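For the underling in question, here is one possible transcription of the sketch over R = Z, with δ = absolute value as the Euclidean function. It returns only D; tracking P and Q is further bookkeeping. The function name and sample matrices are ours:

```python
def smith(M):
    """Diagonalize an integer matrix by the row/column operations above."""
    A = [row[:] for row in M]
    p, n = len(A), len(A[0])
    for t in range(min(p, n)):
        while True:
            entries = [(i, j) for i in range(t, p) for j in range(t, n) if A[i][j]]
            if not entries:
                break
            i, j = min(entries, key=lambda e: abs(A[e[0]][e[1]]))
            A[t], A[i] = A[i], A[t]                        # Ir: move pivot up
            for row in A:                                  # Ic: move pivot left
                row[t], row[j] = row[j], row[t]
            piv = A[t][t]
            for i in range(t + 1, p):                      # IIr: reduce column t
                q = A[i][t] // piv
                A[i] = [a - q * b for a, b in zip(A[i], A[t])]
            for j in range(t + 1, n):                      # IIc: reduce row t
                q = A[t][j] // piv
                for i in range(p):
                    A[i][j] -= q * A[i][t]
            if any(A[i][t] for i in range(t + 1, p)) or \
               any(A[t][j] for j in range(t + 1, n)):
                continue              # cases i)/ii): a smaller remainder appeared
            bad = next(((i, j) for i in range(t + 1, p) for j in range(t + 1, n)
                        if A[i][j] % piv), None)
            if bad is None:
                break                 # pivot divides the whole remaining block
            A[t] = [a + b for a, b in zip(A[t], A[bad[0]])]  # case iii)
        if A[t][t] < 0:
            A[t] = [-x for x in A[t]]                      # IIIr: fix sign
    return A

assert smith([[2, 0], [0, 3]]) == [[1, 0], [0, 6]]
assert smith([[2, 4], [4, 8]]) == [[2, 0], [0, 0]]
```
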
Exercise 44D. As an example with some applicability, the reader can
easily show how to convert

    [ e  0 ]          [ GCD{e, f }       0       ]
    [ 0  f ]    to    [     0       LCM{e, f }   ]

for non-zero e and f in R. (Recall that GCD {e, f } can be written as se + tf ,
and that these matrices have the same determinant.)
It follows that
R/(e) ⊕ R/(f ) ≅ R/(GCD{e, f }) ⊕ R/(LCM{e, f }) .
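Over R = Z, the hint GCD{e, f} = se + tf yields an explicit operation sequence; a sketch (helper names ours) tracing it step by step:

```python
def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def to_gcd_lcm(e, f):
    """Row/column-reduce [[e, 0], [0, f]] to [[gcd, 0], [0, lcm]]."""
    g, s, t = ext_gcd(e, f)
    A = [[e, 0], [0, f]]
    A[0] = [A[0][0] + t * A[1][0], A[0][1] + t * A[1][1]]  # IIr: R1 += t*R2
    for row in A:                                          # IIc: C2 += s*C1
        row[1] += s * row[0]                               #   top entry is now g
    for row in A:                                          # Ic: swap the columns
        row[0], row[1] = row[1], row[0]
    q = A[0][1] // A[0][0]
    for row in A:                                          # IIc: C2 -= q*C1
        row[1] -= q * row[0]
    q = A[1][0] // A[0][0]
    A[1] = [a - q * b for a, b in zip(A[1], A[0])]         # IIr: R2 -= q*R1
    if A[1][1] < 0:
        A[1] = [-x for x in A[1]]                          # IIIr: multiply by -1
    return A

assert to_gcd_lcm(4, 6) == [[2, 0], [0, 12]]   # gcd 2, lcm 12
```
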
This may be iterated to convert any direct sum of modules R / (d) to
one in the invariant factor form of Theorem 43.1. This also shows that if
d = p1^α1 p2^α2 · · · ps^αs is the unique factorization of d in R (with pi distinct), then
R/(d) ≅ R/(p1^α1 ) ⊕ · · · ⊕ R/(ps^αs ) .
Applying this to each dj occurring in the invariant of M , we obtain a second
canonical decomposition of M , as a direct sum of modules R/(pα ) plus a ‘free’
part Rk . This is called the elementary divisor form of M . It is unique except
that there is no natural way to choose an ordering for the multiset of powers
pα of irreducibles. There is a straightforward way to recover the invariant
factors dj from the elementary divisors pα : the last one, dℓ , is the product,
over all p which occur, of the highest power of p occurring; deleting these, dℓ−1
is obtained in the same way from the remaining elementary divisors, and so
on. This demonstrates the uniqueness of the multiset of elementary divisors
as a consequence of the uniqueness of the sequence of invariant factors. It is
also perhaps the simplest way to see how to convert an arbitrary direct sum
of modules of the form R/(d) to invariant factor form: one first produces
the elementary divisor form by factoring each such d, and then passes to the
invariant factor form as above.
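The recovery procedure just described can be written down directly over R = Z: repeatedly take, for each prime that still occurs, its highest remaining power; the product is the last invariant factor, and so on. A sketch (function names ours):

```python
from collections import defaultdict

def spf(q):                      # smallest prime factor, by trial division
    p = 2
    while q % p:
        p += 1
    return p

def invariant_factors(elem_divs):
    """From a multiset of prime powers, e.g. [4, 2, 3, 9], to d1 | ... | dl."""
    by_prime = defaultdict(list)
    for q in elem_divs:
        by_prime[spf(q)].append(q)
    for powers in by_prime.values():
        powers.sort(reverse=True)        # highest power of each prime first
    ds = []
    while any(by_prime.values()):
        d = 1
        for p in by_prime:
            if by_prime[p]:
                d *= by_prime[p].pop(0)  # take the highest remaining power
        ds.append(d)                     # produces d_l, then d_{l-1}, ...
    return ds[::-1]                      # ascending: d1 | d2 | ... | dl

assert invariant_factors([4, 2, 3, 9]) == [6, 36]    # Z2+Z4+Z3+Z9 = Z6+Z36
assert invariant_factors([2, 2, 2]) == [2, 2, 2]
```
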
45. Finitely generated abelian groups revisited.
If ( A , + ) is an abelian group, then a Z-module, ( A , + , · ), can
be defined by specifying
0Z · a := 0A
and, for n > 0 ,
n · a := a + (n − 1) · a
[inductively]
and
(−n) · a := − (n · a) .
It is elementary to check that this gives a well defined Z-module, and that a
group morphism is also a Z-module morphism. Thus the notions of ‘abelian
group’ and ‘Z-module’ are interchangeable. Furthermore, being finitely generated as a group and as a Z-module are equivalent. The previous sections
therefore give the classification of finitely generated abelian groups from Section 13, including the effective method to reduce a finitely presented abelian
group to invariant factor form
Zd1 ⊕ · · · ⊕ Zdℓ ⊕ Zk for 1 < d1 | d2 | · · · | dℓ ,
and to elementary divisor form
Zp1^α1 ⊕ · · · ⊕ Zps^αs ⊕ Zk .
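The inductive definition of the scalar action at the start of this section transcribes into a few lines; here the abelian group is taken to be Z12 under addition mod 12, an assumption made purely for the example, and the action is checked against ordinary multiplication:

```python
def scalar(n, a, add, zero, neg):
    """The Z-module action on an abelian group (A, +), as defined above."""
    if n == 0:
        return zero                                       # 0.a := 0_A
    if n > 0:
        return add(a, scalar(n - 1, a, add, zero, neg))   # n.a := a + (n-1).a
    return neg(scalar(-n, a, add, zero, neg))             # (-n).a := -(n.a)

add = lambda x, y: (x + y) % 12       # the group Z12
neg = lambda x: (-x) % 12

# the module action agrees with ordinary multiplication mod 12
assert all(scalar(n, a, add, 0, neg) == (n * a) % 12
           for n in range(-20, 20) for a in range(12))
```
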
FROM HERE ONWARDS, THE READER WILL NEED A MORE
THOROUGH BACKGROUND IN LINEAR ALGEBRA—Chapters
1, 3 and 4 of Artin would suffice.
46. Similarity of matrices.
Let F be a field and let A ∈ F n×n . Recall from linear algebra that another
n × n matrix B is similar to A if and only if there is an invertible S in F n×n
such that B = S −1 AS. It is easy to see that similarity is an equivalence
relation. By regarding A as a linear transformation, and then thinking of
that transformation as the scalar multiplication by x in a module over F [x],
and finally applying the basic theorems 43.1 and 43.2, we shall get a 1–1
correspondence between similarity classes of n × n matrices and sequences
of non-constant monic polynomials, d1 (x) | d2 (x) | d3 (x) | · · · | dℓ (x), whose
degrees sum to n. By knowing some of the details of how this works, one
can then find a set of representatives (for similarity classes) which are quite
simple matrices—the rational canonical form and (when F is algebraically
closed) the Jordan form, both given near the end of this section. One needs
only to write down exactly one matrix which produces each given sequence
of polynomials. In contrast with what appears to be the common approach
in texts for undergraduates, we shall go almost immediately to the heart
of the matter in 46.2 below to show how, given the matrix A, to actually
calculate the di (x), which are called the invariant factors of A (since they are
the invariant factors of the module associated to A). More precisely, we find
a matrix, with polynomial entries, which gives a presentation by generators
and relations of the module associated to the scalar-entried matrix A—the
former matrix turns out to be simply xI − A. Then r/c operations complete
the job of getting the invariant factors. From this, all the theory related to:
the minimal polynomial, dℓ (x); the characteristic polynomial, which is the
product of all the di (x); etc... drops out without any fuss.
First we show how to get a 1–1 correspondence between similarity classes
(of matrices) and isomorphism classes (of F [x]-modules which are finite dimensional as F -vector spaces). The ‘information’ contained in the matrix A
above may be ‘coded’ as a pair consisting of F n×1 together with the linear
operator on F n×1 which is defined to be left multiplication by A. Up to
similarity, this information is coded by:
i) any n-dimensional F -vector space M , together with
ii) any F -linear operator on M which happens to be representable by the
matrix A in some F -basis for M .
Now suppose instead that M denotes an F [x]-module which, when regarded as an F -vector space, is n-dimensional. The map m ↦ x · m gives
an F -linear operator on M . In a sense, there is no additional information
to be had from M , besides its F -vector space structure together with this
operator: for each g(x) ∈ F [x], the map m ↦ g(x) · m is determined by this
information using the module axioms:
(a0 + a1 x + a2 x2 + · · ·) · m = a0 · m + a1 · (x · m) + a2 · (x · (x · m)) + · · · .
Below we shall make precise this 1-1 correspondence, between the set of
similarity classes of square F -matrices and the set of isomorphism classes of
F [x]-modules M with dimF M < ∞, and then use the results from sections
43 and 44 to completely ‘solve’ the ‘problem’ of similarity.
Definition. For A ∈ F n×n , let N (A) be the F [x]-module whose underlying
abelian group is F n×1 , with scalar multiplication
g(x) · v := g(A)v     (matrix multiplication)
for all columns v = (y1 , · · · , yn )^T ∈ F n×1 .
Exercise 46A. Check that N (A) is a module.
Clearly dimF N (A) = n.
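The scalar multiplication g(x) · v := g(A)v can be evaluated without forming any matrix power, by Horner's rule. A small sketch with plain lists, where g is given by its coefficient list [a0, a1, ...] (names ours):

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def act(g, A, v):
    """g(A) v, evaluated by Horner's rule: w := A w + c v for each coefficient."""
    w = [0] * len(v)
    for c in reversed(g):
        w = [x + c * y for x, y in zip(mat_vec(A, w), v)]
    return w

A = [[0, 1], [0, 0]]                            # A^2 = 0
# g(x) = 1 + x acts as I + A: (y1, y2) -> (y1 + y2, y2)
assert act([1, 1], A, [5, 7]) == [12, 7]
# x . (x . v) = A^2 v = 0, as the module axioms require
assert act([0, 0, 1], A, [5, 7]) == [0, 0]
```
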
Theorem 46.1. i) The matrix A is similar to B if and only if we have
N (A) ≅ N (B) as F [x]-modules.
ii) Furthermore, every F [x]-module M with dimF (M ) < ∞ is isomorphic to
some module N (A) .
Proof. Assuming that A is similar to B, choose an invertible S with
S −1 AS = B. Then, by induction on j , Aj S = SB j for all j ≥ 0. Thus
( Σ_j cj Aj ) S = S Σ_j cj B j ;
that is, g(A)S = Sg(B) for any g(x) ∈ F [x]. Using · and ⊙ for scalar
multiplications in N (A) and N (B) respectively, this says that
g(x) · (Sv) = S(g(x) ⊙ v)
for all v ∈ F n×1 . Thus left multiplication by S defines a module morphism
from N (B) to N (A) , which is an isomorphism since S −1 exists.
Conversely, suppose that θ : N (B) → N (A) is an F [x]-module isomorphism.
It is, in particular, an invertible F -linear operator on F n×1 , and
therefore(∗) it is given as left multiplication by some invertible S ∈ F n×n ; i.e.
θ(v) = Sv. But now, since θ(x ⊙ v) = x · (θ(v)), we have SBv = ASv for all
v. Thus(∗) SB = AS or S −1 AS = B, as required. (At (∗) , some very basic
linear algebra has been used.)
As for the last claim in the proposition, take n = dimF (M ), choose any
basis { b1 , · · · , bn } for M as a vector space over F , and determine an
F -linear map
φ : F n×1 −→ M
by mapping the ith standard basis vector to bi . It is mechanical to check that
φ is a module isomorphism from N (A) to M , where A is the matrix which
represents (x·) with respect to the given basis for M .
We now have the desired 1–1 correspondence between similarity classes
of matrices and isomorphism classes of modules. It takes the similarity class
of A to the isomorphism class of N (A) . This map is well defined by half of
46.1i); injective by the other half; and surjective by 46.1ii).
Theorem 46.2. As F [x]-modules, N (A) ≅ MxI−A .
That is, taking R = F [x] and C = xI − A ∈ (F [x])n×n in Section 44 (with
p = n), the module MC is isomorphic to N (A) . Equivalently, a presentation of
N (A) by “n” generators and “n” relations is defined using the matrix xI − A.
Proof. Let ψ : (F [x])n → N (A) be the unique F [x]-module morphism
which maps ei = (0, · · · , 1, · · · , 0) ∈ (F [x])n to
gi = (0, · · · , 0, 1, 0, · · · , 0)^T ∈ N (A) [with the 1 in the ith position].
(Recall that the latter module equals F n×1 as a set. It is
helpful here to distinguish ei (whose components are constant polynomials) from
gi (which is almost the same and would often be identified with ei ). There are
lots of other surjective module morphisms besides ψ which will produce various
other choices for generators and relations; but ψ seems to be particularly natural,
and the resulting presentation matrix, xI − A, is as simple as one could hope
for.)
Continuing with the proof, we shall be looking at the following morphisms
of modules:
              θxI−A               ψ
    (F [x])n  −−−−→   (F [x])n   −→   N (A) .
Clearly ψ is surjective. We must show that its kernel agrees with Im(θxI−A )
(in the notation of Section 44). This image is the submodule generated by
{ c1 , · · · , cn }, where
ci = (−a1i , − a2i , · · · , x − aii , · · · , − ani ) ,
since ci = θxI−A (ei ) [and the ei ’s generate the domain of θxI−A ]. Each ci is
in the kernel of ψ, since
ψ(ci ) = ψ ( xei − Σ_j aji ej ) = xgi − Σ_j aji gj = 0 ,
the last equality being given by
xgi = Agi = (a1i , · · · , ani )^T = Σ_j aji gj ,
from the definition of scalar multiplication in N (A) . Now let J be an abbreviated
name for Im(θxI−A ), the F [x]-submodule generated by { c1 , · · · , cn }.
We have just proved that ψ factors through a surjective morphism (F [x])n /J →
N (A) . It remains to show that the latter is an isomorphism. For this it suffices
to check that
dimF ((F [x])n /J) ≤ n ,
which follows by showing that
X := { e1 + J, · · · , en + J }
spans (F [x])n /J over F . Since the elements hN,i := (0, · · · , xN , · · · , 0) [with
the power of x in the ith position] span (F [x])n over F , we shall prove by
induction on N that
hN,i + J ∈ SpanF (X) .
This is clear for N = 0 since h0,i = ei . For the inductive step, note that a
straightforward calculation with the definitions of hN,i and ci yields
hN,i = xN −1 ci + a1i hN −1,1 + a2i hN −1,2 + · · · + ani hN −1,n .
But xN −1 ci ∈ J by the definition of J, and aki hN −1,k + J ∈ SpanF (X) for
each k by the inductive hypothesis, completing the proof.
Proposition 46.3. If, for invertible F [x]-matrices P and Q of size n × n,
we have

    P (xI − A)Q = D :=  [ 1                                ]
                        [    . .                           ]
                        [        1                0        ]
                        [          d1 (x)                  ]
                        [                 . .              ]
                        [   0                 dℓ (x)       ]
                        [                            0     ]  ,

with “m” 1’s on the diagonal together with monics dj (x), then
i) m + ℓ = n = Σ_1^ℓ deg dj (x), so that there are, in fact, no zeros on the
diagonal of D, and
det(xI − A) = d1 (x)d2 (x) · · · dℓ (x) .
ii) Assume also that 1 ≠ d1 (x) | d2 (x) | · · · | dℓ (x) . Then D is uniquely
determined by A, and the invariant of the module N (A) (or of MxI−A by
46.2) is (0 ; [d1 (x)], · · · , [dℓ (x)]).
Proof. i) Since P is invertible, we have
1 = det(P −1 P ) = det(P −1 )det(P ) ,
and so detP is invertible in F [x], and therefore lies in F . (In fact, for any
R and any P ∈ Rn×n , the inverse, P −1 , exists in Rn×n if and only if detP is
invertible in R.
Exercise 46B. Use the ‘adjoint formula’ for the inverse to prove the “if”
half.)
Similarly, detQ is invertible in F [x], and so is also in F . Thus, applying
det to the display in the proposition, detD is a non-zero constant multiple
of det(xI − A). But the latter is clearly monic of degree n. Thus D has no
zeros on the diagonal, so its determinant is d1 (x) · · · d# (x), which is monic,
and therefore agrees with det(xI − A), and has degree n.
ii) By 44.1, MxI−A has the same invariant as MD . By 44.2 and i) above,
this invariant is (0 ; [d1 (x)], · · · , [dℓ (x)]). By 43.2, there can be no other such
diagonal matrix D1 satisfying P1 (xI − A)Q1 = D1 for invertible P1 and Q1 .
Assertion 46.4. For each A in F n×n , there exist P, Q and D as in
46.3ii). They may be obtained by applying r/c operations over F [x]:
    I      xI − A      I
    ⇓         ⇓        ⇓
    ·       · · ·      ·
    ⇓         ⇓        ⇓
    P         D        Q          (BACH?)
This is the specialization of 44.4, taking R = F [x].
Definition. The invariant factors, d1 (x), · · · , dℓ (x), of N (A) will also be
referred to as the invariant factors of A—similarly for the elementary divisors
of A, the multiset consisting of the powers of monic irreducibles occurring in
the unique factorizations of all the dj (x).
Corollary 46.5. Two square F -matrices are similar iff they have the
same sequence of invariant factors (or equivalently, · · · the same multiset of
elementary divisors).
This is immediate from 43.1, 43.2, 46.1, 46.2 and 46.3, since we have:
A is similar to B ⇐⇒ N (A) ≅ N (B) ⇐⇒ MxI−A ≅ MxI−B ⇐⇒
xI − A and xI − B reduce by r/c ops. to the same Smith normal form .
The latter refers to the matrix D in 46.3ii), requiring all the diagonal entries
to be monic.
Much of the above is summarized by four bijections:
Set of similarity classes of square F -matrices of all sizes .
    ↕  basic linear algebra
Set of ≅ classes of linear operators V → V, where V varies over finite
dimensional F -vector spaces, and (T : V → V ) ≅ (S : W → W ) ⇐⇒ (defn.)
there is a linear isomorphism P : W → V such that P −1 T P = S .
    ↕  start of this section
Set of isomorphism classes of F [x] -modules M for which dimF M < ∞ .
    ↕  major theorem ( 43.1, 43.2 )
Set of all sequences 1 ≠ d1 (x) | d2 (x) | · · · | dℓ (x) of monics in F [x] .
    ↕  rinky-dink process
Set of finite multisets of positive powers of monic irreducibles in F [x] .
A canonical form for similarity is a set of square F -matrices (of hopefully rather simple description) such that every square F -matrix is similar to
exactly one in the set. In other words, the set is formed by picking out just
one (hopefully rather handsome) matrix from each similarity class.
With the theory above, it suffices to find one matrix, for each multiset
consisting of powers of monic irreducibles in F [x], whose elementary divisors
form that multiset. Note that r/c operations over F [x] exist to convert xI −A
into

    [ 1                              ]
    [    . .                 0       ]
    [        1                       ]
    [          d1 (x)                ]
    [                 . .            ]
    [   0                 dℓ (x)     ]  ,

where d1 (x), · · · , dℓ (x) are the invariant factors of A, and others to convert
the latter to

    [ q1 (x)                     ]
    [         . .        0       ]
    [             . .            ]
    [   0             qn (x)     ]  ,
where the qi (x) are the elementary divisors of A, supplemented by 1’s to make
their number up to n, and planted in any desired order on the diagonal. Now
if

    A =  [ A′   0  ]
         [ 0   A′′ ]  ,

with A′ and A′′ square, then

    xI − A =  [ xI′ − A′        0       ]
              [    0       xI′′ − A′′   ]  ,

for suitable identity matrices I′ and I′′ . It follows that the multiset of
elementary divisors for such an A is the union of those for A′ and A′′ . (The
same is not true of the invariant factors—one must first ‘fracture’ the monics
in the union into elementary divisors and then re-assemble. These last points
are probably more easily seen in terms of modules than of matrices—after
shuffling the free part, it is obvious that a direct sum of modules in elementary
divisor form is still in that form; but a direct sum of two in invariant
factor form can seldom be put into that form just by permuting summands.)
Thus it suffices to produce one canonical form matrix for each power, p(x)^α, of a monic irreducible p(x). Then, for a given elementary divisor multiset,
to produce the corresponding canonical form matrix, one assembles these as
blocks down the diagonal, the number of blocks being the size of the multiset,
with the blocks in some arbitrarily chosen order.
Rational Canonical Form. For each monic in F [x],
q(x) = x^s + a_{s−1} x^{s−1} + · · · + a_0 ,
it is easy to find r/c operations over F [x] which convert
xI − B_{q(x)}

to

    diag( 1, . . . , 1, q(x) ) ,
where the companion matrix

    B_{q(x)} :=  [   0     1                     ]
                 [   0     0     1               ]
                 [   ⋮            ⋱              ]
                 [   0                      1    ]
                 [ −a_0  −a_1   · · ·   −a_{s−1} ] .
(Do the cases s = 2 and 3 to convince yourself of this—the formal proof of
an assertion like this deserves to be left in the closet!)
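The cases suggested above are also easy to check numerically. Here is a short sketch (my own illustration, not part of the text) that builds the companion matrix described above with numpy; the helper name `companion` is mine.

```python
import numpy as np

def companion(coeffs):
    """Companion matrix B_q of the monic q(x) = x^s + a_{s-1}x^{s-1} + ... + a_0,
    for coeffs = [a_0, ..., a_{s-1}]: 1's on the superdiagonal, bottom row -a_i."""
    s = len(coeffs)
    B = np.zeros((s, s))
    B[:-1, 1:] = np.eye(s - 1)          # superdiagonal of 1's
    B[-1, :] = [-a for a in coeffs]     # bottom row -a_0, ..., -a_{s-1}
    return B

# q(x) = x^3 - 2x^2 - 5x + 6 = (x - 1)(x + 2)(x - 3)
B = companion([6.0, -5.0, -2.0])

# numpy recovers the characteristic polynomial (leading coefficient first),
# which for a companion matrix is q(x) itself:
print(np.poly(B))                          # close to [1, -2, -5, 6]
print(np.sort(np.linalg.eigvals(B).real))  # close to [-2, 1, 3]
```

The eigenvalues of B_{q(x)} are exactly the roots of q(x), which is one way to see why companion blocks are convenient canonical-form ingredients.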
Thus, given A ∈ F^{n×n}, taking q_i(x) to be p(x)^α once for each occurrence of p(x)^α among the elementary divisors of A, 'the' rational canonical form of A will be a matrix

    BLOCK DIAGONAL ( B_{q_1(x)} , B_{q_2(x)} , · · · ) .
Jordan Canonical Form. Here assume that F is algebraically closed
(which is equivalent to saying that all monic irreducibles in F [x] have the
form x − λ, as λ varies over F ; for example, F = C. ) It is easy to find r/c
operations which convert

    xI − J_{α,λ}

to

    diag( 1, . . . , 1, (x − λ)^α ) ,

where the Jordan block

    J_{α,λ} :=  [ λ   1             ]
                [     λ   1         ]
                [         ⋱    ⋱    ]
                [              λ  1 ]
                [                 λ ]   ∈ F^{α×α} .
(Do a couple of examples to convince yourself of this.)
‘The’ Jordan canonical form of A will then be block diagonal, with one
Jordan block as above for each occurrence of (x − λ)α among the elementary
divisors of A.
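A quick numerical sketch (my own, with the convention of 1's on the superdiagonal as above) shows why the elementary divisor of a single Jordan block is (x − λ)^α: the block minus λI is nilpotent of index exactly α.

```python
import numpy as np

def jordan_block(alpha, lam):
    """J_{alpha,lam}: lam on the diagonal, 1's on the superdiagonal."""
    J = lam * np.eye(alpha)
    J[np.arange(alpha - 1), np.arange(1, alpha)] = 1.0
    return J

J = jordan_block(4, 2.0)
N = J - 2.0 * np.eye(4)                               # the nilpotent part of J

# the minimal polynomial of J is (x - 2)^4: N^4 = 0, but N^3 is not 0
print(np.allclose(np.linalg.matrix_power(N, 4), 0))   # True
print(np.allclose(np.linalg.matrix_power(N, 3), 0))   # False
```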
By using 46.2, we have quickly derived both of these canonical forms,
as well as algorithms to produce them. One can derive, by a lengthier argument, just the existence of these canonical forms, without first proving
46.2. One need only analyze the general cyclic F [x]-module. This may
make the rational and Jordan blocks appear to be more natural. Here is a
quick summary—see Artin, pp.478–482, for a more detailed treatment. Let
v_1 be a module generator for the cyclic F[x]-module F[x]/⟨d(x)⟩, where d(x) is monic of degree s. Then g(x)v_1 = 0 only for multiples g(x) of d(x). Thus {v_1, v_2, · · · , v_s} is linearly independent over F, where v_i := x^{i−1} v_1. With respect to that basis, it is easily calculated that the matrix of the linear operator (x·) is the rational block whose bottom row consists of the negatives of the coefficients of d(x) [i.e. the companion matrix of d(x)]. To get the Jordan block from before, assume that d(x) = (x − λ)^α, so that α = s. This time use instead the basis whose ith member is (x − λ)^{i−1} v_1.
A number of otherwise rather difficult results in matrix theory are easy
consequences of this theory. In linear algebra, det(xI − A) is called the
characteristic polynomial of A. By 46.3, this is the product, d_1(x) · · · d_ℓ(x), of the invariant factors of A. The minimal polynomial of A is the monic which generates the ideal { g(x) ∈ F[x] : g(A) = 0 }. But this, by definition, is the annihilator of the module N(A). Now, for any PID, say R, when d_i | d_{i+1} for all i, we have

    ann ( R/(d_1) ⊕ · · · ⊕ R/(d_ℓ) )  =  (d_ℓ) .

(See the alternative proof of Theorem 43.2). Thus the minimal polynomial of A is d_ℓ(x), the 'largest' invariant factor. We immediately deduce two results:
i) The minimal polynomial divides the characteristic polynomial; or equivalently, f(A) = 0, where f(x) = det(xI − A). This is known as the Cayley-Hamilton theorem.
ii) Every irreducible factor of the characteristic polynomial occurs as a
factor of the minimal polynomial.
In particular, every root of the characteristic polynomial occurs as a root
of the minimal polynomial (which is equivalent to ii) when F is algebraically
closed). These roots are known also as eigenvalues. When F is algebraically
closed, they are the diagonal entries in the Jordan form; and the dimension
of the eigenspace corresponding to λ is the number of Jordan blocks in which
λ is the diagonal element [i.e. the number of elementary divisors of the form (x − λ)^α ]. The diagonalizability theorems from linear algebra all follow
immediately. Note that for any F , a matrix has a Jordan form as long as
its minimal polynomial is a product of linear factors (and conversely). It is
diagonalizable as long as these factors are distinct.
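The Cayley-Hamilton theorem just deduced lends itself to a quick numerical sanity check; the sketch below (mine, not the text's) evaluates the characteristic polynomial at a random integer matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(4, 4)).astype(float)

f = np.poly(A)          # characteristic polynomial coefficients, leading 1 first
n = A.shape[0]

# evaluate f at the matrix A:  f(A) = sum_i c_i A^(n-i)
fA = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(f))
print(np.allclose(fA, 0, atol=1e-6))    # True: f(A) = 0, Cayley-Hamilton
```

Note that this is a floating-point check, not a proof; Exercise 46G below explains why the "obvious" symbolic proof fails.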
Many readers will have first encountered xI − A in dealing with eigenvalues. It is perhaps ironic that deciding whether two matrices are similar
(that is, carrying out reduction of xI − A to Smith normal form) is in general
easier than finding exact eigenvalues (which requires one to solve polynomial
equations).
We have used in this section the ubiquitous passage back and forth between matrices and linear transformations which is a central topic in elementary linear algebra. But all the main results have been stated in terms of
square matrices, since this is simpler and more direct. The student should
translate each such statement to the equivalent statement concerning linear
operators. For example, ‘block diagonal matrix’ will translate to ‘direct sum
decomposition into invariant subspaces’.
We have emphasized finding the invariant factors of a matrix, and thereby
finding its ‘canonical form(s)’. The question of actually finding an invertible
matrix S which conjugates the given matrix to its canonical form is now a
straightforward application of basic linear algebra—the methods here will
produce F-bases for the relevant vector spaces, and S is then a 'change-of-basis' matrix. Multiplying will produce a matrix which conjugates between
any two given similar matrices, once both conjugating matrices have been
found which convert the givens to their common canonical form.
Exercise 46C. Try to analyze your example from 43A for the non-PID,
Z[x], by the methods of this chapter (U.S.–style?), and analyse what goes
‘wrong’ (British–style?).
Exercise 46D. Look for matrices in the similarity section of your linear
algebra text(s) and calculate their rational and (when existing) Jordan forms.
Exercise 46E. i) For n = 1, 2, and 3, find formulae for the numbers of similarity classes in (F_p)^{n×n}.
ii) Also find formulae for the numbers of conjugacy classes in their groups, GL(n, F_p), of invertibles.
Can you generalize these to arbitrary n? (See 31C.)
Exercise 46F. Find all matrices in F^{2×2} which are similar only to themselves. Now generalize this to F^{n×n}. Explain this in terms of linear operators.
Exercise 46G. Is the following ‘proof’ of the Cayley-Hamilton theorem
convincing? If not, why not?
“Since f (x) = det(xI − A), we have
f (A) = det(AI − A) = det(0) = 0 !!”
Exercise 46H. Show that A and B are similar (over F ) if and only if
xI − A and xI − B are ‘row/column equivalent’ (over F [x]).
Exercise 46I. i) Write down the short proof, based on the results of this
section, that, over an algebraically closed field, A is similar to a diagonal
matrix if and only if its minimal polynomial has no repeated roots.
ii) Give an example to show that the “if” above can fail over ‘general’ fields.
iii) Show that, if A^2 = A (i.e. A is idempotent), then A is diagonalizable (over any field).
iv) Prove that two idempotent matrices are similar if and only if they are
‘row/column equivalent’.
Explain these in terms of linear operators.
Exercise 46J. Prove that, if A is a matrix which is a Jordan or rational
block, then A is not similar to any block diagonal matrix with more than
one block down the diagonal. Explain this in terms of linear operators.
Exercise 46K. i) Prove that every square matrix over an algebraically
closed field is similar to a matrix of the form D + N, where D is a diagonal matrix, N^i = 0 for some i (i.e. N is nilpotent), and DN = ND.
ii) Prove that a matrix over such a field is nilpotent if and only if it has only
0 as an eigenvalue.
iii) Prove that a matrix over any field is invertible if and only if it doesn’t
have 0 as an eigenvalue.
Explain these in terms of linear operators.
Exercise 46L. Prove that an element of finite order in GL(n, C) is
similar to a diagonal matrix whose diagonal entries are roots of unity; and
conversely. Explain this in terms of linear operators.
Exercise 46M. Is every square matrix similar to its transpose?
Exercise 46N. ‘Classify’ finitely generated modules over
i) Z[i], where i^2 = −1;
ii) C[µ], where µ^2 = 0.
Exercise 46O. Find all the similarity classes in C^{5×5} whose characteristic polynomial is (x + 5)^2 (x − 7)^3. In each such class, write down both a sample
matrix and the dimensions of the eigenspaces for all eigenvalues.
Exercise 46P. Show that, if A and B commute and both are diagonalizable, then there is an invertible S such that both S^{−1}AS and S^{−1}BS are diagonal matrices (over any field). Explain this in terms of linear operators.
Show that the commuting condition is needed.
Exercise 46Q. Prove that the minimal polynomial of a block diagonal
matrix is the LCM of the minimal polynomials of its blocks.
Exercise 46R. Let J and R be the Jordan and rational blocks, respectively, corresponding to (x − λ)^n. For small (all?) n, find S such that S^{−1}JS = R.
Exercise 46S. Prove that, if A and B are in F^{n×n}, and K ⊃ F is a field
extension, then A and B are similar over F if and only if they are similar
over K.
VI. Group Representations
Sections 47 to 50 study the ways in which a finite group can be related
to subgroups of the group, GL(n, C), of all invertible n × n matrices, by
morphisms from the former to the latter. These are called representations of
the group, and called faithful representations when the morphism is injective.
When faithful, the representation gives an isomorphism, ‘realizing’ the group
as a group of matrices. In general, it realizes a quotient of the group as a
matrix group. We shall only make a small start on this subject, which is
probably larger than the rest of group theory put together. There are some
surprisingly nice results, even at the very beginning of the subject. We’ll
finish with the famous Burnside (p, q)-theorem which says that a group whose
order is divisible by fewer than three primes is necessarily soluble. This is an
excellent example of an application of representations to the structure theory
of groups, one which had no proof outside representation theory for many
decades. The applications of group representations in geometry, topology,
harmonic analysis, differential equations, physics, chemistry, number theory
(for example, Wiles’ proof, in 1994, of Fermat’s last theorem), and probability
theory are undoubtedly more important even than its applications purely
within algebra. Interesting applications to combinatorics also occur.
47. G-modules & representations.
There are two equivalent ways of thinking about representations of a finite
group—as G-modules and as G-matreps, both defined below. We’ll pass back
and forth freely between the two concepts, once we see this equivalence in
47.1.
Definition. Let G be a group and let S be a set. An action of G on S
is a function
    G × S −→ S ;  (g, s) ↦ g · s ,
which satisfies 1 · s = s and a · (b · s) = (ab) · s for all a and b in G and all
s ∈ S.
We shall say that two actions of G, on S by · and on T by ∗, are isomorphic if and only if there is a bijective function Γ : S → T such that Γ(g · s) = g ∗ Γ(s) for all g ∈ G and s ∈ S.
G-modules & representations
211
See the remarks after 11.1 giving a list of examples of G-actions. We used
them to prove theorems 11.1 (Sylow) and 11.4, and they occurred on every
page in the sections on Galois theory. A set together with a given G-action
on the set is sometimes called a G-set.
Definition. Let G be a finite group. A G-module is a finite dimensional
C-vector space V together with an action of G on the set V such that the
action is linear; i.e. for each g, the map v ↦ g · v is a linear operator on V ;
i.e.
g · (αv + βw) = α(g · v) + β(g · w)
for all complex numbers α and β, all g ∈ G, and all v and w in V .
Two G-modules are said to be isomorphic if and only if they are isomorphic
(as in the previous definition) by some isomorphism, Γ, which is also C-linear.
The dimension of V (some call it degree) is just its vector space dimension.
Remark. There is a ring called the complex group algebra of G (see the
beginning of Section 50), which is non-commutative if G is, and such that
a G-module is virtually the same thing as a module over the group algebra
(in the sense of the previous chapter except for the non-commutativity). A
more sophisticated treatment of the subject dealt with here sees it as a special
case of module theory over non-commutative rings with a certain property,
semisimplicity, which the group algebra has.
Definition. Let G be a finite group. A G-matrep is a group morphism
ρ : G → GL(n, C) for some n ≥ 0. (By convention, when n = 0, the general
linear group is the trivial group—after all, the group of linear operators on
the zero vector space is a trivial group.) The jargon ‘matrep’ is not used
elsewhere— it saves us from saying ‘matrix representation’ quite a few times
(and permits the joke before 47N)—I request the reader’s forgiveness for
resorting to bad literary taste in the interests of economy.
The above G-matrep is said to be equivalent to some other given G-matrep
λ : G → GL(m, C) if and only if both m = n and there exists an invertible
matrix P such that λ(g) = P −1 ρ(g)P for all g ∈ G. (This last definition may
seem a bit unnatural—it is justified by the next proposition.)
Exercise 47A. Show that isomorphism, on the class of G-modules, and
equivalence, on the class of G-matreps, are both equivalence relations.
The ideas of G-module and G-matrep are almost identical, in the following
sense. Given a G-matrep ρ : G → GL(n, C), let V = Cn×1 and let the action
of every g on V be left multiplication by the matrix ρ(g). Given a G-module
V , choose any basis for V and map g to ρ(g) := the matrix representing the
operator (g ·) with respect to the chosen basis.
Proposition 47.1. i) These are well defined maps back and forth, from
the class of G-matreps to the class of G-modules, and from the class of pairs,
(G-module, basis), to the class of G-matreps.
ii) Equivalent G-matreps go to isomorphic G-modules.
iii) Starting with a G-module, the equivalence class of the G-matrep produced
is independent of which basis is used.
iv) Isomorphic G-modules produce equivalent G-matreps.
v) Starting with a G-module and doing the two maps consecutively gives a
G-module isomorphic to the one you started with.
vi) The same holds for G-matreps and equivalence.
Thus we obtain a bijection between the set of isomorphism classes of
G-modules and the set of equivalence classes of G-matreps.
Proof. It is a routine verification to check that starting from a G-matrep
produces a G-module, and also the other way round, giving i). As for ii), if the
matrix P demonstrates that ρ is equivalent to λ, then left multiplication by
P gives a linear operator on Cn×1 which may easily be seen to be a G-module
isomorphism mapping the G-module obtained from ρ to that obtained from
λ. For iii) and iv), if Γ is an isomorphism between two G-modules, and they
have been given bases in order to produce corresponding G-matreps, then
the matrix P which represents Γ using these two bases is easily checked to
provide an equivalence between the G-matreps corresponding to the given
G-modules. When the two G-modules are equal, this proves iii), and when
they’re not necessarily equal it also does iv). Starting from a G-matrep, the
corresponding G-module is Cn×1 , which has the standard basis. Using this
basis, the G-matrep you get is actually the one you started with (not just
equivalent to it). This proves vi). Starting with a G-module and a basis
for it, let Γ be the unique linear isomorphism from it to Cn×1 which takes
the given basis to the standard basis. Then Γ is readily checked to be an
isomorphism of G-modules. Its codomain is the G-module associated to that
G-matrep which is associated to the starter with its given basis. This proves
v).
Exercise 47B. Do all the routine verifications in detail to convert this
into a complete proof.
Definition. Define the direct sum, of two G-modules V and W, to be the vector space V ⊕ W of ordered pairs, with the G-action

    g · (v, w) := (g · v, g · w) .

Define the direct sum ρ ⊕ λ of two G-matreps to be the map which sends g to the block diagonal matrix

    [ ρ(g)     0   ]
    [   0     λ(g) ] .
Exercise 47C. i) Show that both of these external direct sum operations are well defined.
ii) Prove that if V ≅ V′ and W ≅ W′, then V ⊕ W ≅ V′ ⊕ W′.
iii) Prove the analogue of ii) for G-matreps and equivalence.
iv) Show that if V and ρ correspond under the bijection in 47.1, and W and λ do as well, then so do V ⊕ W and ρ ⊕ λ .
Definition. A G-invariant subspace (also called a G-submodule), of a
G-module V , is any vector subspace W such that g · w ∈ W for all w ∈ W
and g ∈ G.
A G-module V is called irreducible if and only if it has precisely two invariant subspaces (which are therefore {0} and V itself—in particular, V ≠ {0}).
Exercise 47D. Determine what the corresponding concepts are in the
language of G-matreps.
Definition. A G-module V is the internal direct sum of a finite sequence
V1 , V2 , · · · of G-submodules if and only if each element of V can be written
uniquely as a sum of vectors, one from each Vi . This is equivalent to stating
that the map

    V_1 ⊕ V_2 ⊕ · · · −→ V ,

which sends (v_1, v_2, · · ·) to v_1 + v_2 + · · ·, is an isomorphism of G-modules.
Examples. The representations of the trivial group are obvious, so let’s
try to determine all the representations of a cyclic group, {1, g}, of order 2.
For such a matrep ρ, any linear operator T with matrix ρ(g) satisfies T^2 = I.
Therefore the space on which T acts splits uniquely as the internal direct
sum of the (+1)-eigenspace of T and the (−1)-eigenspace of T . Picking any
bases in these two eigenspaces, each of them splits, but usually non-uniquely,
into internal direct sums of one dimensional invariant subspaces. Putting
this together we see that G = {1, g} has only two irreducible modules, each
of dimension 1 over C, on which (g ·) acts as the identity in one case, and as
−I on the other ‘irrep’. Furthermore an arbitrary G-module splits uniquely
as the internal direct sum of two invariant subspaces, each of which splits
non-uniquely as an internal direct sum of irreducible invariant submodules.
All of the irreducibles within either one of the two earlier invariant direct
summands are isomorphic to each other.
We’ll see that the general features just noted carry over to an arbitrary finite
group, the number of irreducibles being finite but usually larger than 2, and the
dimensions of some of these irreducibles being larger than 1 when the group is
non-abelian.
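The eigenspace splitting just described can be sketched concretely. In the snippet below (my illustration, not the text's), g acts on C^2 by swapping coordinates, and the projections (I + T)/2 and (I − T)/2 pick out the (+1)- and (−1)-eigenspaces.

```python
import numpy as np

# a representation of C2 = {1, g}: any T with T^2 = I; here g swaps coordinates
T = np.array([[0., 1.], [1., 0.]])
Pplus, Pminus = (np.eye(2) + T) / 2, (np.eye(2) - T) / 2

print(np.allclose(Pplus + Pminus, np.eye(2)))   # the eigenspaces fill the space
print(np.allclose(T @ Pplus, Pplus))            # g acts as +1 on the first
print(np.allclose(T @ Pminus, -Pminus))         # and as -1 on the second
```

Each print is True: the space splits as the direct sum of the two eigenspaces, and each eigenspace is a sum of 1-dimensional invariant subspaces on which g acts by ±1.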
Definition. A G-module is trivial when (g ·) is the identity map for all
group elements g. A G-matrep is trivial when it maps all group elements to
the identity matrix. These are corresponding concepts. The one dimensional
trivial representation is the only one of these which is irreducible. The zero
representation (i.e. n = 0 for matreps and dimension is zero for G-modules)
is trivial, not irreducible, and even more uninteresting than the other trivial
representations; but it is needed to avoid extra phrases in definitions and
results.
Examples continued. Let's look at the cyclic group G = {1, g, g^2} of order three, the cyclic group H = {1, h, h^2, h^3} of order four, and the Klein group K = {1, a, b, ab} of order four. A G-matrep, ρ, is determined by knowing ρ(g), since ρ(g^2) = ρ(g)^2 and ρ(1) = I. Also ρ(g)^3 = I. Similarly, an H-matrep λ is determined by knowing λ(h); and λ(h)^4 = I. Finally, a K-matrep µ is determined by knowing both µ(a) = A and µ(b) = B; and they are matrices satisfying A^2 = I = B^2 and AB = BA.
By elementary considerations of eigenspaces, it follows that a G-module is
uniquely the internal direct sum of three invariant subspaces, corresponding
to the three cube roots of unity, occurring as eigenvalues for ρ(g). Each of
these three is itself a direct sum of 1-dimensional irreducibles, all isomorphic.
The same is true of an H-module, except that there are four distinct irreducibles. The fact that A and B commute yields a similar result for K, with
four irreducibles possible, all of dimension one and corresponding to the four
combinations of ±1, occurring as eigenvalues for A and B. (In each of these
cases, the number of copies of a given irreducible within a given G-module
might be zero!)
Exercise 47E. Write out the complete details for these claims above.
Examples cont'd. Let's now look at the smallest non-commutative group, S3. It has at least two 1-dimensional (and therefore irreducible) representations: the trivial one, and the action on C given by (σ ·) := multiplication by sign(σ). It also has a linear action on C^3 given by permuting the coordinates, i.e.

    σ · (z_1, z_2, z_3) := (z_{σ^{−1}(1)}, z_{σ^{−1}(2)}, z_{σ^{−1}(3)}) .
Exercise 47F. Check that this is a representation, and that we must use
the inverse of σ on the right-hand side—we’ve been working with left actions;
with right actions the inverse wouldn’t occur.
This last representation is not irreducible; the subspace {(z, z, z)} is a
1-dimensional trivial invariant submodule. A complementary submodule is
    { (z_1, z_2, z_3) : z_1 + z_2 + z_3 = 0 } .
Exercise 47G. i) Prove that this last set is an invariant S3 -submodule, and
is irreducible.
ii) Try to prove that S3 has only these three irreducible representations, up
to isomorphism of course.
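Before attempting the exercise, a numerical sanity check of the invariance claims is easy (this sketch is mine; `perm_matrix` is an assumed helper name). Each permutation matrix fixes (1,1,1) and preserves coordinate sums, so both subspaces above are invariant.

```python
import numpy as np
from itertools import permutations

# S3 on C^3: the matrix P_sigma with P[sigma(j), j] = 1 sends e_j to e_{sigma(j)},
# which realizes (sigma . z)_i = z_{sigma^{-1}(i)}
def perm_matrix(sigma):
    P = np.zeros((3, 3))
    for j, i in enumerate(sigma):
        P[i, j] = 1.0
    return P

mats = [perm_matrix(s) for s in permutations(range(3))]
ones = np.ones(3)

# the diagonal {(z,z,z)} is invariant: every P fixes (1,1,1)
print(all(np.allclose(P @ ones, ones) for P in mats))   # True
# the sum-zero plane is invariant: coordinate sums are preserved
print(all(np.allclose(ones @ P, ones) for P in mats))   # True
```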
Here is a table summarizing these facts and a few others concerning the
five examples.
    group   #conj. classes   #irreps.   order   order = Σ dim^2 (irreps.)
    C2           2               2        2     2 = 1^2 + 1^2
    C3           3               3        3     3 = 1^2 + 1^2 + 1^2
    C4           4               4        4     4 = 1^2 + 1^2 + 1^2 + 1^2
    D2           4               4        4     4 = 1^2 + 1^2 + 1^2 + 1^2
    S3           3               3        6     6 = 1^2 + 1^2 + 2^2
The numerical coincidences in the table are not coincidental: in order of
relative difficulty, here are three facts, which we’ll prove in sections 47, 48
and 50 respectively.
The squared dimensions of the irreps add up to the order of the group!
The number of irreps is the number of conjugacy classes in the group!
The dimension of each irrep divides the order of the group!
Our ‘empirical’ evidence for these is admittedly not especially strong just
yet. In fact it doesn’t seem obvious that a finite group cannot have infinitely
many non-isomorphic irreducibles, possibly even ones whose dimensions get
arbitrarily large.
First let’s establish, for any finite group G, the decomposition of every
G-module into irreducibles.
Maschke's Theorem 47.2. If W is an invariant subspace of a G-module V, then there is another invariant subspace U such that V is the internal direct sum of W and U .
Corollary 47.3. Any G-module is the direct sum of (finitely many)
irreducible submodules.
The corollary is immediate by induction on the dimension of the G-module.
By ‘convention’ (actually by logic), the zero module is the direct sum of the
empty set of irreducibles.
Definition. A G-map, θ : V → V′ , between two G-modules is a linear
map such that θ(g · v) = g · θ(v) for all g ∈ G and v ∈ V .
Thus an isomorphism is a bijective G-map.
Lemma 47.4. The kernel and image of a G-map both are invariant
subspaces.
Exercise 47H. Prove this.
Proof of 47.2. It suffices to find a G-map θ : V → W such that θ(w) = w
for all w ∈ W —for linear algebra shows that V then splits as the direct sum
of W and the kernel of θ, so the result follows from half of 47.4. By linear
algebra, we know that there is a linear map φ : V → W which has the
splitting property φ(w) = w for all w ∈ W . Use the following ‘averaging
trick' to define θ so that it will be a G-map with the splitting property. Let

    θ(v) := |G|^{−1} Σ_{g∈G} g^{−1} · φ(g · v) .
It is a routine verification that θ has the required properties. Let's just check that it is a G-map:

    θ(g_0 · v) = |G|^{−1} Σ_{g∈G} g^{−1} · φ(gg_0 · v) = |G|^{−1} Σ_{h∈G} g_0 h^{−1} · φ(h · v) = g_0 · θ(v) ,

letting h = gg_0 .
Exercise 47I. Do the other verifications to finish this proof.
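The averaging trick can be watched in action on the S3 example from before (a sketch of mine, under the assumption that W = span{(1,1,1)} inside C^3): start from a projection φ onto W that is not a G-map, and average it over the group.

```python
import numpy as np
from itertools import permutations

# S3 acting on C^3 by permuting coordinates
def perm_matrix(sigma):
    P = np.zeros((3, 3))
    for j, i in enumerate(sigma):
        P[i, j] = 1.0
    return P

G = [perm_matrix(s) for s in permutations(range(3))]

# W = span{(1,1,1)} is invariant; phi projects onto W but is NOT a G-map
phi = np.array([[1., 0., 0.], [1., 0., 0.], [1., 0., 0.]])

# the averaging trick: theta(v) = |G|^{-1} sum_g g^{-1} phi(g v)
theta = sum(np.linalg.inv(g) @ phi @ g for g in G) / len(G)

print(np.allclose(theta, np.full((3, 3), 1/3)))           # averaged projection
print(all(np.allclose(theta @ g, g @ theta) for g in G))  # theta IS a G-map
print(np.allclose(theta @ np.ones(3), np.ones(3)))        # still identity on W
```

All three prints are True; the kernel of θ is then the invariant complement U of Maschke's theorem, here the sum-zero plane.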
Definition. Given two G-modules V and W , define HomG (V, W ) to be
the set of all G-maps from V to W .
Proposition 47.5. The set HomG (V, W ) is a subspace of the vector
space, HomC (V, W ), of all linear maps from V to W .
Exercise 47J. Prove this.
Exercise 47K. Prove that if V ≅_G V′ and W ≅_G W′, then HomG(V, W) ≅_C HomG(V′, W′). Here we are using ≅_G to denote isomorphisms of G-modules, and ≅_C to denote isomorphisms of complex vector spaces.
Schur's Lemma 47.6. Let V and W be irreducible G-modules.
i) If V ≇ W, then HomG(V, W) is the zero subspace.
ii) If V ≅ W, then HomG(V, W) is a 1-dimensional space, all of whose non-zero elements are isomorphisms.
iii) Taking V = W, the space HomG(V, V) consists of all the scalar multiples of the identity.
Proof. i) Suppose that θ ∈ HomG (V, W ) is non-zero. By 47.4, its image
is a non-zero invariant subspace of W , and so is all of W , by irreducibility.
By 47.4, its kernel is a proper invariant subspace of V , and so is zero, by
irreducibility. Thus θ is an isomorphism, contradicting hypothesis.
ii) follows immediately from iii), by taking V′ = W′ = V in 47K. Its second half also follows from the argument given for i).
iii) Let θ ∈ HomG (V, V ). Let α be an eigenvalue of θ. Then θ − αI is a
G-map with a non-zero kernel, so its kernel is V , by irreducibility. Thus it
is the zero map, as required.
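Part iii) can be verified computationally for the 2-dimensional irrep of S3 (the sum-zero plane), here written in the basis v1 = (1,−1,0), v2 = (0,1,−1); the basis choice and the resulting generator matrices are my own, easily checked by hand. The space of matrices commuting with the whole image turns out to be exactly 1-dimensional.

```python
import numpy as np

A = np.array([[-1., 1.], [0., 1.]])     # image of the transposition (1 2)
B = np.array([[0., -1.], [1., -1.]])    # image of the 3-cycle (1 2 3)

# sanity: the defining relations hold
print(np.allclose(A @ A, np.eye(2)))                         # True
print(np.allclose(np.linalg.matrix_power(B, 3), np.eye(2)))  # True

# Schur iii): write M |-> (MA - AM, MB - BM) as a linear map on the
# four matrix units and compute the dimension of its kernel
cols = []
for k in range(4):
    E = np.zeros((2, 2)); E.flat[k] = 1.0
    cols.append(np.concatenate([(E @ A - A @ E).ravel(),
                                (E @ B - B @ E).ravel()]))
K = np.array(cols).T
print(4 - np.linalg.matrix_rank(K))     # 1: only scalar matrices commute
```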
Lemma 47.7. The 'HomG functor' distributes over direct sums; that is, for G-modules V_α, W_β, V and W,

    HomG( ⊕_α V_α , W )  ≅_C  ⊕_α HomG(V_α , W ) ,

and

    HomG( V , ⊕_β W_β )  ≅_C  ⊕_β HomG(V , W_β ) .
Proof. Consider the canonical maps from linear algebra, which prove
the same isomorphisms, except that HomG is replaced by HomC . A routine calculation shows that these maps restrict to isomorphisms between the
subspaces in the lemma.
Now we can prove the most important aspect of the uniqueness of the
decomposition of a G-module into a direct sum of irreducibles.
Theorem 47.8. Let V be a G-module. Suppose that

    ⊕_α V_α  ≅_G  V  ≅_G  ⊕_γ V_γ ,

where all the V_α and V_γ are irreducible G-modules. Let W be any irreducible G-module. Then the number of α for which V_α is isomorphic to W equals the number of γ for which V_γ is isomorphic to W.
This says that the number of times that a given irreducible appears in
a decomposition of a G-module depends only on the G-module and not on
the particular decomposition. We have seen that there is no uniqueness of
the actual irreducible direct summands, when a given isomorphism class of
irreducible occurs more than once in a decomposition of some G-module into
an internal direct sum.
Proof. We show below that the number at issue in the case of subscripts
α is the dimension of HomG (V, W ). By symmetry this will also be true
for subscripts γ, as required. (The main point is to find a formula for the
number which is manifestly independent of the details of any decomposition!)
    dim_C HomG(V, W)  =  dim_C HomG( ⊕_α V_α , W )                     (by 47K and 47.7)

                      =  Σ_α dim_C HomG(V_α , W)

                      =  Σ_{V_α ≅ W} dim_C HomG(V_α , W)  +  Σ_{V_α ≇ W} dim_C HomG(V_α , W)

                      =  Σ_{V_α ≅ W} 1  +  Σ_{V_α ≇ W} 0               (by 47.6)

                      =  #{ α : V_α ≅ W } .
This completes the proof.
Corollary to the proof 47.9. The number of times that an irreducible
G-module W occurs as a direct summand in a G-module V is dimC HomG (V, W ).
Exercise 47L. Prove that it’s also dimC HomG (W, V ).
We shall have quite a few references to Schur’s lemma. Here’s another
one.
Theorem 47.10. If G is abelian, then every irreducible G-module is
1-dimensional.
Proof. For every h ∈ G, the map (h ·) is a G-map on any G-module.
For,
h · (g · v) = (hg) · v = (gh) · v = g · (h · v) .
If the G-module is irreducible, it is immediate from iii) of Schur’s lemma
that (h ·) is multiplication by a scalar. This being true for all h, every vector
subspace is invariant. But any vector space of dimension greater than 1 has
plenty of non-trivial proper vector subspaces.
So far we haven't produced even one construction for a non-trivial G-module for an arbitrary finite group G. Here is one which will have plenty of use.
Definition. For each G, define a G-module RG , called the regular representation of G, as follows. As a vector space, let RG be the space with basis
{rg : g ∈ G}, so that the dimension is |G|. Define the action by specifying
h · rg := rhg and extending linearly. This gives the formula
h·
,
,
αg rg =
g∈G
αh−1 g rg ,
g∈G
for complex numbers αg .
Exercise 47M. Verify that RG is a G-module.
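As a concrete instance (my own sketch, with the helper name `rho` assumed), the regular representation of C3 consists of permutation matrices on the basis r_1, r_g, r_{g^2}:

```python
import numpy as np

# regular representation of C3 = {1, g, g^2}: h . r_g = r_{hg}
def rho(k, n=3):
    """Matrix of (g^k .): it sends r_{g^j} to r_{g^{j+k}}."""
    P = np.zeros((n, n))
    for j in range(n):
        P[(j + k) % n, j] = 1.0
    return P

# homomorphism check: rho(1) rho(2) = rho(3) = rho(0) = I
print(np.allclose(rho(1) @ rho(2), np.eye(3)))   # True

# the eigenvalues of rho(1) are the three cube roots of unity, each once:
# every 1-dimensional irrep of C3 occurs exactly once in R_G (cf. 47.12 below)
w = np.linalg.eigvals(rho(1))
print(np.allclose(np.abs(w), 1.0))               # all on the unit circle
print(abs(w.sum()) < 1e-9)                       # 1 + omega + omega^2 = 0
```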
Lemma 47.11. For any G-module W, we have

    HomG(R_G , W )  ≅_C  W .

On the left, W is regarded as a G-module; but on the right it is regarded as merely a vector space.
Using 47.9, and applying 47.11 with W irreducible, we obtain
Corollary 47.12. Every irreducible G-module occurs as a direct summand in RG a number of times equal to its dimension.
This could be written

    R_G  ≅_G  ⊕_W W^{⊕(dim W)} ,

where the direct sum is over irreducible G-modules, one for each isomorphism class, and W^{⊕n} denotes a direct sum of "n" copies of W.
Taking dimensions, we get

    |G|  =  Σ_W (dim W)^2 ,
proving the first claim after the table following 47G. In particular, the
number of isomorphism classes of irreducibles is finite.
Corollary to the Corollary 47.13. An abelian group G has a total of
“|G|” irreducible representations (all 1-dimensional).
Proof of 47.11. Define a map
Γ : HomG (RG , W ) −→ W ,
by

    θ ↦ θ(r_1) .
It is evident that Γ is a linear map. Let θ ∈ Ker(Γ). Then
θ(rg ) = θ(g · r1 ) = g · θ(r1 ) = g · Γ(θ) = g · 0 = 0 .
Since θ is linear, and {rg } is a basis, we get θ = 0. Thus Γ is injective. It
remains to show that Γ is surjective. Given w ∈ W , define a linear map θw
from RG to W by specifying it on the basis we’re using:
θw (rg ) := g · w .
As long as we can show that θw is a G-map, we’re finished, since
Γ(θw ) = θw (r1 ) = 1 · w = w ,
as required. Let v = Σ_{g∈G} α_g r_g . The needed calculation is:

    θ_w(h · v) = θ_w [ Σ_{g∈G} α_g (h · r_g) ]
               = θ_w ( Σ_{g∈G} α_g r_{hg} )
               = θ_w ( Σ_{k∈G} α_{h^{−1}k} r_k )
               = Σ_{k∈G} α_{h^{−1}k} (k · w)
               = Σ_{g∈G} α_g (hg · w)
               = h · Σ_{g∈G} α_g (g · w)
               = h · Σ_{g∈G} α_g θ_w(r_g)
               = h · θ_w ( Σ_{g∈G} α_g r_g )
               = h · θ_w(v) .
Remarks. i) The representation RG may appear to you to have been
‘pulled out of a hat’. Here is a general construction which converts a finite
G-set, X, into a G-module RX . When the G-set is G acting on itself by left
multiplication, this construction produces a G-module isomorphic to RG .
Define RX to be the complex vector space with basis X, and let the linear
action on this vector space be the unique linear extension of the given action
on the basis X. For example, the 3-dimensional representation of S3 earlier
was of this form, with X being the S3 -set which defines S3 . It is unusual for
RX to be irreducible. It never is for decomposable G-sets—those which can
be written as the disjoint union of two non-empty G-subsets—and seldom is
even when X is indecomposable.
ii) One can think of the proof of 47.11 as saying that RG is the ‘free
G-module on one generator r1 ’ : for any G-module W and any w ∈ W , there
is a unique G-map θw : RG → W mapping r1 to w. This is analogous to Z[x]
being the free commutative ring on one generator x: it maps uniquely by a
ring morphism into any given ring S, sending x to any preassigned s ∈ S.
There are free groups, free abelian groups, free modules, · · ·, in each case on
one generator or, more generally, on any set of generators. See the mapping
properties in Section 21, called ‘extension principles’ there. It is often best
to emphasize the mapping property rather than any particular construction
of the free object. The mapping property guarantees that the free object is
unique up to a unique isomorphism. Of course a particular construction is
needed to be sure that the free object exists.
Exercise 47N. Prove in detail the statements above about mapping
properties.
The results we have just proved are more often proved using the character
theory of the next section. The methods just used are among Schur’s many
basic contributions to representation theory. About ten years earlier in 1899,
Frobenius had invented the subject of character theory of finite groups (see
Section 48), and had proved versions of most of what we’re doing in this
chapter. There had been a long history of studying characters for abelian
groups before that, with applications to physics, number theory, and probability; see Mackey.
Representations of finite abelian groups. A major difficulty often
occurs in trying to give all the irreducible representations of some group
explicitly (—inventing a better matrep?). Abelian groups are not typical
of the subject in most ways, including this. We know that such a G has
exactly “|G|” irreducible representations, all 1-dimensional. Let’s write them
down explicitly. We can let G be the product Ca × Cb × · · ·, by the structure
theorem for finite abelian groups in multiplicative notation. We don’t need
the divisibility conditions on the orders of the factors for what follows. Let
α, β, · · · be fixed complex numbers which are primitive roots of unity of
orders a, b, · · ·. Let g, h, · · · be generators for the factors in the previous
direct product. For each sequence of integers i, j, · · · with 0 ≤ i < a, 0 ≤
j < b, · · ·, let ρi,j,··· be the representation in which g ↦ α^i , h ↦ β^j , · · ·.
It is easy to see that this determines a unique 1-dimensional representation.
To see that this is the complete list as we vary the sequence i, j, · · ·, use
the following easy exercise to deduce that no two of these representations are
equivalent.
Exercise 47O. For any group G, suppose that ρ and µ are equivalent
G-matreps. Show that if ρ(g) = αI, then µ(g) = αI.
Exercise 47P. Find four distinct 1-dimensional representations of the
dihedral group D4 ; also find a 2-dimensional irreducible—recall the definition
of that group! (But give complex, not real, representations.) Deduce that
you’ve found all of its representations. Do the same for the quaternion group
of order 8.
Addendum. We haven’t made precise the claim after the first example,
concerning the canonical decomposition of a G-module V into a direct sum
of invariant subspaces, indexed by the irreducibles W , and each containing
all copies of W which occur as invariant subspaces of V . This summand is
called the W -isotypical component. A character theory version accessible to
the masses is given in 48M. For the elite who have learned about tensor
products (see Appendix ⊗), here is a more elegant version.
Given also a finite dimensional vector space U , we can make U ⊗C W
into a G-module with action ∗, by requiring that (g ∗) should be the unique
linear operator on U ⊗C W which maps u ⊗ w to u ⊗ (g · w). This doesn't
depend at all on the G-module W being irreducible. We'll revert now to "·"
rather than "∗".
Exercise 47Q. a) Prove that U ⊗C W is a G-module.
b) Prove that this G-module is the direct sum of "dim U " copies of W .
Now revert to the assumption that W is irreducible, so that b) gives the
decomposition of U ⊗C W into irreducibles.
c) Show that there is a 1-1 correspondence between decompositions of U ⊗C W
into an internal direct sum of irreducibles and decompositions of U into an
internal direct sum of 1-dimensional subspaces.
For V above, take U = HomG (W, V ), and define a map

Γ : ⊕_W HomG (W, V ) ⊗C W −→ V ,

(where the direct sum is over all isomorphism classes of irreducible G-modules)
by requiring that, on the W th component of the domain of Γ, we set Γ(θ ⊗
w) = θ(w). So the map Γ is basically evaluation of G-maps. Denote the
image of that W th component under Γ as VW , and call it the W th isotypical
component of V .
Theorem 47.14. i) The map Γ is an isomorphism of G-modules.
ii) Thus V is the internal direct sum of G-invariant subspaces VW .
iii) Every invariant subspace of V is uniquely of the form

⊕_W Γ(UW ⊗ W )
for a collection of vector subspaces UW ⊂ HomG (W, V ), and all such internal
direct sums are invariant.
iv) An invariant subspace is a direct sum of copies of an irreducible W0 if
and only if all the corresponding UW for W ≠ W0 are zero. It is isomorphic
to W0 itself if, in addition, UW0 is 1-dimensional. Thus there is a 1-1 correspondence between invariant subspaces of V which are isomorphic to an
irreducible W and 1-dimensional vector subspaces of HomG (W, V ).
Notice how this last statement includes Schur’s lemma. The proof is
straightforward, and will be left as Exercise 47R.
Exercise 47S. If X is a G-set and x ∈ X, the orbit of x is the set
{ gx : g ∈ G }. Show that there is a natural 1-1 correspondence between it
and the cosets of G modulo the stabilizer subgroup of x, which is defined to
be { g ∈ G : gx = x }.
This principle is used in the proofs of 11.1, 11.4, 50.10, and 51.2.
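Exercise 47S is easy to sanity-check by machine. As a quick illustration (not part of the text), here is the orbit-stabilizer count for S4 acting on {0, 1, 2, 3}:

```python
from itertools import permutations

G = list(permutations(range(4)))  # S4 acting on {0, 1, 2, 3}
x = 0
orbit = {g[x] for g in G}                      # { gx : g in G }
stabilizer = [g for g in G if g[x] == x]       # { g : gx = x }
# the orbit is in 1-1 correspondence with cosets modulo the stabilizer
assert len(orbit) * len(stabilizer) == len(G)  # 4 * 6 = 24
```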
48. Characters of representations.
For any finite group G, let CG/∼ be the vector space of all those functions
Θ : G → C such that Θ(a−1 ba) = Θ(b) holds for all a and b. This we’ll call
the space of class functions on G. The dimension of CG/∼ is clearly equal
to the number of conjugacy classes in G. In this section, we shall associate,
to each representation of G, an element of CG/∼ , called the character of the
representation. The irreducible characters will be shown to be a basis for
CG/∼ (but not the obvious basis). Thus we’ll have proved the second claim
after the table following 47G—the number of irreps equals the number of
conjugacy classes.
Aside. (Aside to the aside—as Shakespeare could have told you
(but for having better taste in humour), an aside is a digression which
gets twice the gas kilometreage.) The genre of proof above—establishing
an enumerative result with an algebraic proof—occurs often. (Ex. By considering the homogeneous components of the ring F [s1 , s2 , · · ·]/Ideal{ s_i² − s_{2i} :
i > 0}, where si has degree i, show that the number of partitions of an integer into odd parts is equal to its number of partitions into distinct parts.)
Combinatorialists spend much worthwhile effort on making such proofs more
direct. (Ex. Do it for this last example.) However, for the theorem above,
there is no known explicit bijection, defined ‘simultaneously’ for all groups,
which maps the set of irreps onto the set of conjugacy classes.
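For the curious, the odd-parts/distinct-parts identity in the exercise above is easy to confirm numerically; this Python sketch (function names are mine) counts both kinds of partitions directly for small n.

```python
from functools import lru_cache

def partitions(n, parts):
    """Count partitions of n using elements of `parts` (sorted ascending,
    each part usable any number of times)."""
    @lru_cache(maxsize=None)
    def count(remaining, idx):
        if remaining == 0:
            return 1
        if idx == len(parts) or parts[idx] > remaining:
            return 0
        # either use parts[idx] again, or move past it
        return count(remaining - parts[idx], idx) + count(remaining, idx + 1)
    return count(n, 0)

def distinct_partitions(n):
    """Count partitions of n into strictly increasing (i.e. distinct) parts."""
    @lru_cache(maxsize=None)
    def count(remaining, smallest):
        if remaining == 0:
            return 1
        return sum(count(remaining - k, k + 1) for k in range(smallest, remaining + 1))
    return count(n, 1)

for n in range(1, 15):
    odd = partitions(n, tuple(range(1, n + 1, 2)))
    assert odd == distinct_partitions(n)
```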
There are many other applications of characters besides the above theorem, which also has a ‘(group algebra)-style’ proof. Indeed, as indicated
earlier, Frobenius’ original formulation of the subject was expressed entirely
in terms of characters; representations didn’t occur at all. We begin by
defining the character of a representation.
Definition. For A = (Aij ) ∈ Cn×n , define the trace of A to be the sum
of its diagonal entries:
tr(A) := Σ_{i=1}^{n} Aii .
Proposition 48.1. The trace of a matrix is also the sum of its eigenvalues, counted with their multiplicities as roots of its characteristic polynomial.
For all square matrices A and invertible P , we have
tr(P −1 AP ) = tr(A) .
Proof. The first statement follows by examining the coefficient of the
second highest power of x in
det(xI − A) = ∏_j (x − λj ) .
The second follows from the first, since P −1 AP and A have the same characteristic polynomial. Here's a direct proof. Let Q = (Qij ) = P −1 . Then

tr(P −1 AP ) = Σ_i (QAP )ii
            = Σ_i Σ_{j,k} Qij Ajk Pki
            = Σ_{j,k} ( Σ_i Pki Qij ) Ajk
            = Σ_{j,k} (P Q)kj Ajk
            = Σ_{j,k} δkj Ajk      (Kronecker delta)
            = Σ_j Ajj
            = tr(A) .
Corollary 48.2. For any linear operator T : V → V , we have a well
defined scalar tr(T ) := tr(A), where A represents T with respect to any basis
of the finite dimensional vector space V .
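Numerically, 48.1 is easy to spot-check. The sketch below does it for a 2 × 2 example with hand-rolled matrix routines, so that it needs no libraries; the code is an illustration only.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(P):
    # explicit inverse of a 2x2 matrix
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1.0, 2.0], [3.0, 4.0]]
P = [[2.0, 1.0], [5.0, 3.0]]  # det = 1, so invertible
conj = matmul(matmul(inv2(P), A), P)
assert abs(trace(conj) - trace(A)) < 1e-9  # tr(P^{-1} A P) = tr(A)
```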
Remark. For readers who have learned about tensor products, here is
the direct definition of the trace function on operators:
trace : HomC (V, V ) −→ V ∗ ⊗C V −→ C ,
where V ∗ is the dual of V , the right-hand map is evaluation, the linear map
which sends f ⊗ v to f (v), and the left-hand map is the inverse of the linear
isomorphism which sends f ⊗ v to [w ↦ f (w)v] . This will not be used below.
Definition. Let V be a G-module. The character of V is the function
χV : G → C defined by χV (g) := tr[(g ·) : V → V ] .
The corresponding object χρ for a G-matrep ρ is evidently given by defining
χρ (g) := tr[ρ(g)] .
Corollary 48.3. The function χρ (resp. χV ) is a class function, and
depends only on the equivalence class of ρ (resp., on the isomorphism class
of V ).
In the case of a matrep ρ, this is clear, since ρ(a−1 ba) = ρ(a)−1 ρ(b)ρ(a),
and since an equivalent matrep takes values of the form P −1 ρ(a)P . For G-modules, the proof is similar, or may be done by passing to the corresponding
matrep.
Remarks. i) Since the eigenvalues of a finite order matrix are roots
of unity, character values are sums of roots of unity, and therefore algebraic numbers. They are not arbitrary algebraic numbers, by the results of
Abel/Galois/Ruffini. As well, in the next section, 49, we’ll see that they are
so-called algebraic integers.
ii) For 1-dimensional matreps, there is no distinction between ρ and its
character χρ . In that case (but only in that case), we have
χρ (ab) = χρ (a)χρ (b) ,
since ρ is a morphism of groups.
iii) Exercise 48A. Prove that tr(AB) = tr(BA) for all matrices A and B.
(For invertible matrices, this is immediate from 48.1). It follows that χρ (ab) =
χρ (ba) (or, use that ab and ba are conjugate).
iv) The dimension of ρ is χρ (1).
Proposition 48.4. The character of the regular representation is given
by
χRG (g) = |G|  if g = 1 ;
        = 0    if g ≠ 1 .
Proof. Recall that g · rh = rgh . Thus, for fixed g, the coefficient of rh in
g · rh is 0 if g ≠ 1, and is 1 if g = 1. Summing over h gives the result.
Exercise 48B. More generally, given a finite G-set X, find a formula for
the character of the associated linear representation, RX , in terms of fixed
points; and show that it reduces to 48.4 when X is G under left multiplication.
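Proposition 48.4 (and the fixed-point formula asked for in 48B) can be confirmed mechanically for S3: the coefficient count in the proof says the character of RG at g is the number of h with gh = h, which is |G| at the identity and 0 elsewhere. A small Python check, offered as an illustration only:

```python
from itertools import permutations

# the regular representation of S3: G acts on itself by left multiplication,
# and the matrix of g is a permutation matrix whose trace = #{h : g*h = h}
G = list(permutations(range(3)))

def compose(g, h):  # (g ∘ h)(i) = g[h[i]]
    return tuple(g[h[i]] for i in range(3))

identity = (0, 1, 2)
for g in G:
    fixed = sum(1 for h in G if compose(g, h) == h)
    assert fixed == (len(G) if g == identity else 0)
```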
The following easy fact shows that the characters of all representations
are known once we have the irreducible characters.
Proposition 48.5. We have χV ⊕W (g) = χV (g) + χW (g) .
Exercise 48C. Prove this.
Exercise 48D. Calculate the characters of all the irreducible representations in the previous section, without peeking below until you’ve finished.
Each finite group has a character table. In it, one lists the names of the
irreducible representations down the left-hand side and the conjugacy classes
across the top, and enters the character values into the body of the table;
and then decorates it to taste with various bells and whistles. A handy one
to include with each conjugacy class is the size of the class. Here are the
examples from the previous section, where I’ve added an extra bottom line
(not officially part of the character table) giving the characters of the regular
representation (which is not irreducible for non-trivial groups). The names
ρ² , µν, · · · will be explained later in 48H.
C2        [1]  [1]
           1    g
χtriv      1    1
χρ         1   −1
χreg       2    0

D2        [1]  [1]  [1]  [1]
           1    a    b    ab
χtriv      1    1    1    1
χµ         1   −1    1   −1
χν         1    1   −1   −1
χµν        1   −1   −1    1
χreg       4    0    0    0

C3        [1]  [1]  [1]
           1    g    g²
χtriv      1    1    1
χρ         1    ω    ω²
χρ²        1    ω²   ω
χreg       3    0    0          (ω = e^{2πi/3})

C4        [1]  [1]  [1]  [1]
           1    h    h²   h³
χtriv      1    1    1    1
χρ         1    ω    ω²   ω³
χρ²        1    ω²   ω⁴   ω⁶
χρ³        1   −i   −1    i
χreg       4    0    0    0     (ω = e^{2πi/4})

S3           [1]       [3]        [2]
           1-cycle   2-cycles   3-cycles
χtriv         1         1          1
χsign         1        −1          1
χother        2         0         −1
χreg          6         0          0
χstandard     3         1          0
For S3 we’ve added two illegitimate lines at the bottom. The standard
representation is the 3-dimensional linearization, after 47E and from Remark
i) before 47N, of the defining action for S3 on { 1, 2, 3 }.
Looking at these we see that the extra row(s) on the bottom are in agreement with 48.5. If you add up each column in the character table itself,
weighting each entry by multiplying by the entry to its left in the first column, you get the entry for the regular representation, i.e.

χreg = Σ_{irreps ρ} (dim ρ) χρ .
This is equivalent to 47.12, and is also a special case of orthogonality below.
Similarly, the equation
χstandard = χother + χtriv
checks out. Note also that it is normal practice to put the identity element
first, so that the leftmost column lists the dimensions of the irreducibles. The
overworked word ‘degree’ is often used to mean dimension.
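The weighted column sums can be checked in one line for S3; in the sketch below (illustrative only) the table is entered by hand from above.

```python
# character table of S3: rows triv, sign, and the 2-dimensional irreducible;
# columns are the classes of 1, the transpositions, and the 3-cycles
table = {'triv': [1, 1, 1], 'sign': [1, -1, 1], 'other': [2, 0, -1]}
dims = {name: row[0] for name, row in table.items()}  # dimension = value at 1

chi_reg = [sum(dims[name] * row[c] for name, row in table.items())
           for c in range(3)]
assert chi_reg == [6, 0, 0]  # |G| at the identity and 0 elsewhere, as in 48.4
```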
Exercise 48E. Expanding on 47P, calculate the character tables for
both non-abelian groups of order 8.
Now look closely at the seven character tables you’ve got [not at the
illegitimate lower line(s)].
Firstly they are square, with (say) “m” rows and “m” columns; that’s the
theorem (yet to be proved) from the introductory paragraph of this section.
Next, the columns, regarded as vectors in Cm , form an orthogonal basis
with respect to the standard (Hermitian) inner product, with each vector of
length √( |G| / (class size) ) .
Finally, the rows, regarded as vectors in Cm , form an orthogonal basis,
with each vector of length √|G| , with respect to a weighting of the standard
(Hermitian) inner product, where we "weight" by multiplying each term in the
calculation of the standard inner product by the number of elements in that
conjugacy class.
These last three statements hold for any finite group. (If they don’t hold
for your answer to 48E, then start again. If they do, then you’re almost
certainly correct.) The last two observations are called the character orthogonality relations. They are far easier to remember in the above form, as
handwaving facts with a schematic picture of a table in your head, than to
remember as the formulae given in 48.12 and in the proof of 48.7 below.
Another way to remember the row orthogonality is that the irreducible characters form an orthonormal set with respect to a standard Hermitian inner
product on the space of all complex valued functions on G. We discuss this
further after the proof of 48.6.
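If you want to see the two orthogonality statements in action before the proofs, here is a Python check against the character table of S3 (entered by hand; the character values here are real, so conjugation can be ignored). A numerical illustration only; the general proofs follow.

```python
# character table of S3, class sizes 1, 3, 2 (columns: 1, transpositions, 3-cycles)
table = [
    [1,  1,  1],   # trivial
    [1, -1,  1],   # sign
    [2,  0, -1],   # 2-dimensional irreducible
]
sizes = [1, 3, 2]
order = sum(sizes)  # |G| = 6

# row orthogonality: the class-size-weighted inner product of two rows is 1 or 0
for i in range(3):
    for j in range(3):
        ip = sum(s * a * b for s, a, b in zip(sizes, table[i], table[j])) / order
        assert ip == (1 if i == j else 0)

# column orthogonality: plain inner product of two columns is |G|/|C| or 0
for c in range(3):
    for d in range(3):
        ip = sum(table[r][c] * table[r][d] for r in range(3))
        assert ip == (order // sizes[c] if c == d else 0)
```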
Perhaps one can prove orthogonality without using the equality between
the numbers of conjugacy classes and of irreps. Suppose that we’ve done so,
and that these two numbers differ for some group. By taking the entries of
its character table and dividing each by the square root of the number of
elements in the conjugacy class label for that entry's column, form a non-square matrix M . Orthogonality tells us that we have a matrix for which
both of the square matrices M tr M and M M tr are diagonal matrices with
non-zero diagonal entries. But this is ridiculous, since there’s certainly no
pair of non-square matrices M and N of complementary shape such that the
square matrices M N and N M both have maximal rank—the rank cannot
exceed the smaller of the two matrix dimensions involved. We shall redo
this proof of the squareness of the character table directly in 48.11 below,
without having to quote these rank facts from linear algebra.
The proofs of orthogonality below look a bit formidable. However, they
could be reproduced by a well trained drone who was provided with the following information—so remember this and the rest is essentially calculation.
In both cases you apply Schur’s Lemma.
For row orthogonality, apply it to the following map, depending on an
arbitrary linear map α : V1 → V2 between G-modules with actions · and ∗
(this is the 'Maschke averaging trick') :

φα : V1 −→ V2 ,    v ↦ Σ_{g∈G} g⁻¹ ∗ α(g · v) .
For column orthogonality, apply it to the following map, defined for any
G-module U and any conjugacy class C in G:

ψC : U −→ U ,    u ↦ Σ_{h∈C} h · u .
Lemma 48.6. These two maps are in fact G-maps.
Proof. The proof for φα is exactly the same as that within Maschke’s
theorem, 47.2. As for the other, using the fact that C is a conjugacy class
to justify the fourth equality,
ψC (g · u) = Σ_{h∈C} hg · u = Σ_{h∈C} g g⁻¹hg · u
           = g · Σ_{h∈C} g⁻¹hg · u = g · Σ_{k∈C} k · u = g · ψC (u) ,

as required.
To discuss row orthogonality, here's some useful notation. Let CG denote
the vector space of all functions from G to C. Give it the following Hermitian
inner product:

⟨ α | β ⟩ := |G|⁻¹ Σ_{g∈G} α(g) β(g)‾ .
The standard basis for CG consists of the functions which take one value
equal to 1, and the rest equal to 0. Then the above is the standard inner
product which would make this into an orthonormal basis, except that we’ve
divided by |G|. Now row orthogonality may be stated as follows.
Theorem 48.7. The set of irreducible characters is an orthonormal set
in CG , with respect to the inner product < | >. In particular, that set is
linearly independent.
Remark. Note that characters have the property

χµ (g⁻¹) = χµ (g)‾ ,

since the eigenvalues for A⁻¹ are the reciprocals of those for A, and since
z⁻¹ = z̄ for a root of unity z. Thus we have

⟨ χρ | χµ ⟩ = |G|⁻¹ Σ_{g∈G} χµ (g⁻¹) χρ (g) .
Exercise 48F. Assume that { z(g) : g ∈ G } is a collection of numbers
for which z(a⁻¹ba) = z(b) for all a and b; i.e. a class function. Let the right-hand sum below be over all the conjugacy classes C in G, with gC being some
chosen element in C. Prove that

Σ_{g∈G} z(g) = Σ_C |C| z(gC ) .

Deduce that 48.7 is really the same as row orthogonality of the character
table.
Proof of 48.7. As we said earlier, this will follow fairly mechanically
by applying Schur's lemma to the G-map, φα , between two irreducible G-modules V1 and V2 . For any choice of C-linear map α, that G-map must be
zero if V1 and V2 are not isomorphic, and is a scalar multiple of the identity
map when V1 = V2 .
First suppose that ρ and µ are inequivalent irreducible matreps, obtained
by choosing bases in the non-isomorphic irreducible G-modules V1 and V2
respectively. Let X be the matrix representing α with respect to these bases.
We therefore have, from the definition of φα ,

Σ_{g∈G} µ(g⁻¹) X ρ(g) = 0

for all matrices X of a suitable size, where the right-hand side is a zero
matrix. Looking at the (i, ℓ)th entry in this identity, we get

Σ_{j,k; g∈G} µ(g⁻¹)ij Xjk ρ(g)kℓ = 0

for all (i, ℓ) and all choices of complex numbers Xjk . Making one of these
choices to be 1, and the rest to be 0, it follows that for all i, j, k, and ℓ, we
have

Σ_{g∈G} µ(g⁻¹)ij ρ(g)kℓ = 0 .

Specialize this by taking i = j and k = ℓ, sum over all (i, k), and divide by
|G|:

|G|⁻¹ Σ_{g∈G} ( Σ_i µ(g⁻¹)ii ) ( Σ_k ρ(g)kk ) = 0 .

This is exactly

|G|⁻¹ Σ_{g∈G} χµ (g⁻¹) χρ (g) = 0 ,

that is, ⟨ χρ | χµ ⟩ = 0, as required.
Now instead take µ = ρ, an irreducible matrep coming from the irreducible G-module V1 (= V2 ) with some basis. Again let X represent α using
that basis. By the first paragraph of the proof, we get

Σ_{g∈G} ρ(g⁻¹) X ρ(g) = zI ,

for some complex number z. Letting n = dim V1 be the size of the matrices,
and taking the trace of both sides, we obtain z = |G| tr(X)/n. The (i, ℓ)th
entry gives

Σ_{j,k; g∈G} ρ(g⁻¹)ij Xjk ρ(g)kℓ = n⁻¹ |G| Σ_k Xkk   if i = ℓ ;
                                 = 0                 if i ≠ ℓ .

This holds for all (i, ℓ) and all choices of complex numbers Xjk . As in the
paragraph above, 'equating coefficients' of the X's yields

Σ_{g∈G} ρ(g⁻¹)ij ρ(g)kℓ = δjk δiℓ |G|/n      (Kronecker deltas).

This leads quickly to

Σ_{i,k; g∈G} ρ(g⁻¹)ii ρ(g)kk = |G| ,

which is the same as ⟨ χρ | χρ ⟩ = 1, as required.
Corollary 48.8. i) Two G-modules are isomorphic if (and only if) their
characters are equal.
ii) If V ≅ ⊕_U U^{⊕nU} and W ≅ ⊕_U U^{⊕mU} are the decompositions of two
G-modules into direct sums of irreducibles U , then

⟨ χV | χW ⟩ = Σ_U nU mU = dim HomG (V, W ) .

iii) In particular, the number of copies of an irreducible U in the direct
sum decomposition of V is ⟨ χV | χU ⟩.
iv) A G-module V is irreducible if and only if ⟨ χV | χV ⟩ = 1 .
Proof. If V and W are two G-modules, decomposed into direct sums
of irreducibles as in ii), then χV = Σ_U nU χU and similarly for W , using the
mU . Then i) is immediate, since the linear independence of the irreducible
characters yields nU = mU for all U . The first equality in ii) is clear from the
usual calculation of an inner product of two vectors expressed in terms of an
orthonormal set. The second equality in ii) is immediate from 47.9 and 47L.
Part iii) is a special case of ii). Part iv) follows because a sum, Σ_U (nU )², of
squares of non-negative integers can only equal 1 when one of them is 1 and
the rest are 0.
Remark. Note how the two significant, but initially obscure, technical
objects, ⟨ | ⟩ and dim HomG , from this and the previous sections, have
turned out to be the same object in different clothing !
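Corollary 48.8 turns character arithmetic into multiplicity counts. As an illustration (numbers entered by hand from the S3 table; the values are real, so no conjugation is needed), here is the splitting of the standard 3-dimensional permutation character:

```python
sizes = [1, 3, 2]  # class sizes in S3
order = sum(sizes)

def ip(chi1, chi2):
    # <chi1 | chi2>; the S3 character values are real, so no conjugation needed
    return sum(s * a * b for s, a, b in zip(sizes, chi1, chi2)) / order

standard = [3, 1, 0]                  # the 3-dimensional permutation character
assert ip(standard, standard) == 2    # 1^2 + 1^2: exactly two irreducible pieces
assert ip(standard, [1, 1, 1]) == 1   # one copy of the trivial representation
assert ip(standard, [2, 0, -1]) == 1  # one copy of the 2-dim irreducible
assert ip(standard, [1, -1, 1]) == 0  # the sign representation does not occur
```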
Now let’s return to the other two observations concerning character tables,
namely squareness and column orthogonality.
Lemma 48.9. Let ρ be an irreducible G-matrep of dimension n, and let
gC ∈ C, a conjugacy class in G. Then

Σ_{g∈C} ρ(g) = n⁻¹ |C| χρ (gC ) I .
Proof. If ρ is the matrep associated to an irreducible G-module U with
some basis, then the left-hand side of the identity to be proved is the matrix
of the G-map ψC (defined before 48.6). By Schur’s lemma, it must be zI for
some z. Taking traces immediately yields the fact that z = n−1 |C| χρ (gC ),
as required.
Corollary 48.10. For any class function f ∈ CG/∼ , and any irreducible
G-matrep ρ of dimension n, we have

|G|⁻¹ Σ_{g∈G} f (g) ρ(g) = n⁻¹ ⟨ χρ | f‾ ⟩ I .

Proof. With the outside sum over all conjugacy classes C, and with
notation as in 48.9, the left-hand side above is

|G|⁻¹ Σ_C Σ_{g∈C} f (g) ρ(g) = |G|⁻¹ Σ_C f (gC ) Σ_{g∈C} ρ(g)
    = n⁻¹ |G|⁻¹ Σ_C f (gC ) |C| χρ (gC ) I = n⁻¹ |G|⁻¹ Σ_{g∈G} f (g) χρ (g) I ,

which is the right-hand side. (Alternatively, there is a G-module with a
self-G-map for which the left-hand side is the matrix.)
Theorem 48.11. The set of irreducible characters of a finite group is an
orthonormal basis for the subspace of class functions on that group (with respect to the restriction of our standard Hermitian inner product on the space
of all functions). In particular, the number of irreducible representations of
a group equals its number of conjugacy classes.
Proof. Since the irreducible characters form an orthonormal set, they
will be a basis for CG/∼ , as required, as long as no non-zero class function,
f , is orthogonal to all of them. But if ⟨ χρ | f‾ ⟩ = 0 for all irreducible
characters χρ (an equivalent condition, since conjugation permutes the set of
irreducible characters), then from 48.10 we obtain

Σ_{g∈G} f (g) ρ(g) = 0 ,

for all ρ which are irreducible, and therefore also for all ρ whatsoever. Taking
the latter to be the regular representation, and applying this to the element
r1 , we get

Σ_{g∈G} f (g) rg = 0 .

Thus f (g) = 0 for all g, as required, since { rg : g ∈ G } is linearly
independent.
Exercise 48G. Show that if all the irreducible representations of G are
1-dimensional, then G is abelian (the converse of 47.13).
Here are the column orthogonality relations for the character table.
Theorem 48.12. Let g and h be elements of G. Summing over all
equivalence classes of irreducible G-matreps ρ, we have

Σ_ρ χρ (g)‾ χρ (h) = |G|/|C|  if g, h are in the same conjugacy class C ;
                   = 0        if g and h are not conjugate .
Proof. Let ∆C : G → C be the characteristic function of the conjugacy
class C ; i.e. ∆C (g) is 1 or 0 according as g ∈ C or g ∉ C. Directly from the
definition of the inner product we see that

⟨ χρ | ∆C ⟩ = |G|⁻¹ |C| χρ (gC ) ,

where, as usual, gC is a chosen element in C. Therefore, using the ancient
formula for writing a vector in terms of an orthonormal basis,

∆C = Σ_ρ ⟨ ∆C | χρ ⟩ χρ = |G|⁻¹ |C| Σ_ρ χρ (gC )‾ χρ .

Let D also denote a conjugacy class. Using a daring form of the Kronecker
delta, we get

δC,D = ∆C (gD ) = |G|⁻¹ |C| Σ_ρ χρ (gC )‾ χρ (gD ) .
This is just what the theorem is asserting.
Orthogonality can be used to help complete a partly cooked character
table. For example, if you know all but the last row, then, by row orthogonality, the last row is determined up to multiplying its elements by a complex
number of modulus 1. But its first entry is a positive integer, so this determines the row completely. This may easily result in a situation where you
know all the characters of a group, but don’t have explicit formulae for all
of its representations. You’ll be in good company—Frobenius determined
the characters of all the symmetric groups (these characters actually happen to be integers), but it was several decades later when Young and Specht
wrote down the actual representations. More dramatically: in 1911, Schur
determined the characters of a sequence of groups, of orders 2·n!, which map
surjectively onto the symmetric groups (‘the projective characters of the symmetric group’); but it was 1988 before the representations themselves were
finally found by Maxim Nazarov.
The vector space CG is a ring, and CG/∼ a subring, using multiplication
of values (so-called 'pointwise multiplication'). The subset consisting
of the characters of all (not necessarily irreducible) representations is obviously closed under addition. But it is also closed under multiplication—it
is a 'semi-ring'—as we see from 48K and 48L below, which depend on the
tensor product.
Exercise 48H. (This explains the names of some of the representations
in the character tables exhibited earlier.) Show that if ρ and µ are G-matreps,
with ρ of dimension 1, then g ↦ ρ(g)µ(g) is also a matrep, which is irreducible
if and only if µ is. Furthermore, g ↦ ρ(g)⁻¹ is a matrep.
Exercise 48I. Find two 1-dimensional irreps of Sn for any n > 1. Show
that the standard representation splits as the direct sum of irreps of dimensions n − 1 and 1. Find a fourth irrep, also of dimension n − 1, when n > 3.
Determine the character table of S4 —note how only integers somehow magically occur.
Exercise 48J. Show that if each element of a group is conjugate to its
inverse, then all of the character values of the group are real.
Exercise 48K. Let V be a G-module and W an H-module, for any two
finite groups, G and H. Determine an action of G × H on V ⊗C W by
requiring

(g, h) · (v ⊗ w) := (g · v) ⊗ (h · w) .
Explain how this, for fixed (g, h), determines a well defined linear map.
Prove that you get an action of G × H. Show that
χV ⊗W (g, h) = χV (g)χW (h) .
How does this relate to 48H? Prove that

⟨ χV ⊗W | χV ′⊗W ′ ⟩ = ⟨ χV | χV ′ ⟩ ⟨ χW | χW ′ ⟩ .

Deduce that if V and W are irreducible, then so is V ⊗ W . Deduce further
that, if also V ′ and W ′ are irreducible, then V ⊗ W and V ′ ⊗ W ′ are isomorphic as (G × H)-modules if and only if both V ≅G V ′ and W ≅H W ′ .
Conclude that the irreducible (G × H)-modules are exactly the V ⊗ W , as
V ranges over all irreducible G-modules, and W ranges over all irreducible
H-modules.
Exercise 48L. Given a group morphism β : G → K, and a K-module
V , define β ∗ V , the restriction of V along β, to be V with the G-action
g · v := β(g) · v . Prove that this is a G-module. (Sometimes this is
given less generally in two cases: when β is inclusion of a subgroup, it is
called restriction; and when β is projection onto a quotient group, it’s called
'lifting'.) When K = G × G and β is the diagonal map g ↦ (g, g), we can
apply 48K and form V • W := β ∗ (V ⊗ W ) for a pair of G-modules. This
is also called their tensor product, and is seldom irreducible. It generalizes
48H : prove that
χV •W = χV χW .
Deduce that the set of characters is closed under multiplication. By multiplying known characters, one can often produce new representations of
interest—starting with the trivial, sign, and standard representations, generate the complete character table of S5 using various tensor products and
restrictions.
Exercise 48M. Show that the projection of V to its U -isotypical component is given by

v ↦ |G|⁻¹ dim U Σ_{g∈G} χU (g)‾ g · v .
That is, this map is the identity on the U -isotypical component, and is zero
on all the other isotypical components. (See 47.14.)
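The projection formula of 48M can be tested on the 3-dimensional permutation representation of S3, where the trivial component is the 'averaging' part and the standard 2-dimensional component is what is left over. Here is a sketch using exact rational arithmetic (all names are mine; the characters involved are real, so the conjugation in the formula is immaterial here):

```python
from itertools import permutations
from fractions import Fraction

G = list(permutations(range(3)))  # S3 permuting the coordinates of C^3

def act(g, v):
    # left action: (g·v)[g[i]] = v[i]
    w = [Fraction(0)] * 3
    for i in range(3):
        w[g[i]] = v[i]
    return w

def char_value(g, chars):
    # chars = (value at identity, at transpositions, at 3-cycles)
    fixed = sum(1 for i in range(3) if g[i] == i)
    return chars[{3: 0, 1: 1, 0: 2}[fixed]]

def project(v, chars, dim):
    # v ↦ |G|^{-1} (dim U) Σ_g χ_U(g) g·v   (χ real here, so no conjugate needed)
    out = [Fraction(0)] * 3
    for g in G:
        c = char_value(g, chars)
        out = [o + c * x for o, x in zip(out, act(g, v))]
    return [Fraction(dim, len(G)) * o for o in out]

v = [Fraction(5), Fraction(-1), Fraction(2)]
triv = project(v, (1, 1, 1), 1)    # trivial component: the averaging part
std = project(v, (2, 0, -1), 2)    # standard 2-dimensional component
assert triv == [Fraction(2)] * 3                # mean of (5, -1, 2) is 2
assert [a + b for a, b in zip(triv, std)] == v  # sign rep does not occur
assert project(triv, (1, 1, 1), 1) == triv      # the projection is idempotent
```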
49. Algebraic integers.
The following concept has several applications in mathematics besides
the applications in the next section to representation theory. Readers who
have not yet studied the definitions beginning Section 43 can specialize R
to Z below, and change the term R-module to abelian group. All of our
applications will be in that case.
Definition. Let S ⊃ R be an extension of commutative rings. An
element s ∈ S is said to be integral over R if and only if the conditions in
the following theorem hold. The usual definition is condition i). The term
‘algebraic integer’ is synonymous with ‘integral over Z’.
Theorem 49.1. With notation as above, the following conditions are
equivalent.
i) There is a monic polynomial in R[x] having s as a root.
ii) The ring R[s] is finitely generated as an R-module.
iii) There exists an intermediate ring T (i.e. S ⊃ T ⊃ R) with s ∈ T and
such that T is finitely generated as an R-module.
Proof. To prove that i) implies ii), note that if some monic polynomial
as in i) has degree n, then all higher powers of s can be expressed as R-linear
combinations of the set { 1, s, · · · , sn−1 }, using that polynomial. Therefore
that finite set of powers generates R[s] as an R-module. That ii) implies iii)
is trivial, taking T to be R[s]. To see that iii) implies i), see Hungerford
VIII 5.3 for the general case. Here is a shorter argument when R = Z, the
only case which we’ll use. By 13F, any subset of a finitely generated abelian
group generates a subgroup which can be generated by a finite subset of the
given set. Taking the group to be T and the set to consist of the powers of
s, this provides a guarantee that some power of s is a Z-linear combination
of lower powers of s, as required.
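To see condition ii) concretely: s = 1 + √2 is a root of the monic x² − 2x − 1, so s² = 2s + 1, and every power of s is a Z-linear combination of {1, s}. A quick numerical check (illustrative only; the reduction rule s² = 2s + 1 drives the loop):

```python
import math

# s = 1 + sqrt(2) satisfies s^2 = 2s + 1, so Z[s] is generated by {1, s}
def power_coords(n):
    """Return (c0, c1) with s^n = c0 + c1*s, computed by repeated reduction."""
    c0, c1 = 1, 0
    for _ in range(n):
        # multiply c0 + c1*s by s:  c0*s + c1*s^2 = c1 + (c0 + 2*c1)*s
        c0, c1 = c1, c0 + 2 * c1
    return c0, c1

s = 1 + math.sqrt(2)
for n in range(8):
    c0, c1 = power_coords(n)
    assert abs(s ** n - (c0 + c1 * s)) < 1e-6
```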
Corollary 49.2. The subset of elements in S which are integral over R
is a subring of S and an extension of R.
Proof. The set of products of pairs of elements, the first from a set of
R-module generators for R[s] and the second from such a set for R[t], is a
set of R-module generators for R[s, t]. Taking T in 49.1iii) to be R[s, t],
we see that if s and t are both integral over R, then so are s ± t and st, as
required. (This proves generally that the ring generated by the union of any
two ‘integral extensions’ is also integral.)
Corollary 49.3. If S is finitely generated as an R-module, then all of
its elements are integral over R.
Proof. Take T = S in iii) of 49.1.
Proposition 49.4. The intersection of Q with the ring of algebraic
integers in C is Z. That is, a rational which is an algebraic integer is actually
an integer.
Proof. If the rational b/c in lowest form is a root of Σ_{i=0}^{n} ai x^i for integers
ai with an = 1, then

b^n + a_{n−1} b^{n−1} c + a_{n−2} b^{n−2} c² + · · · + a0 c^n = 0 .
It follows easily that any prime dividing c would divide b also. Therefore
c = ±1, as required.
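Proposition 49.4 is the familiar 'rational root test' for monic integer polynomials: any rational root is an integer, and (looking at the constant term) it must divide a0. A small, purely illustrative search routine built on that fact:

```python
def rational_roots_of_monic(coeffs):
    """Integer roots of x^n + a_{n-1}x^{n-1} + ... + a0, given coeffs = [a0, ..., a_{n-1}]
    with a0 != 0.  By 49.4 a rational root must be an integer dividing a0."""
    a0 = coeffs[0]
    n = len(coeffs)
    candidates = {d for d in range(1, abs(a0) + 1) if a0 % d == 0}
    candidates |= {-d for d in candidates}
    value = lambda x: x ** n + sum(c * x ** i for i, c in enumerate(coeffs))
    return sorted(x for x in candidates if value(x) == 0)

assert rational_roots_of_monic([-2, 0]) == []      # x^2 - 2: sqrt(2) is irrational
assert rational_roots_of_monic([2, -3]) == [1, 2]  # x^2 - 3x + 2 = (x-1)(x-2)
```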
Any root of unity is an algebraic integer, and therefore any sum of roots
of unity is also. Thus, since character values for finite groups are always sums
of roots of unity, we immediately get the following result, taking us back to
the central topic of this chapter.
Proposition 49.5. If G is a finite group, then its character values, χρ (g),
are algebraic integers for all g ∈ G and all G-matreps ρ.
Proposition 49.6. Let S and S′ both be extensions of R. A morphism
of rings, φ : S → S′ , which fixes all elements of R, necessarily maps elements
integral over R in S to elements integral over R in S′ . In particular, any ring
morphism between commutative rings maps algebraic integers to algebraic
integers.
Proof. If, for a polynomial f , we have f (s) = 0, then
f (φ(s)) = φ(f (s)) = φ(0) = 0 .
We finish this section with a few unmotivated special facts, which will be
crucial in the next section.
Proposition 49.7. Suppose that α ∈ C is an algebraic integer with
|α| < 1 and such that |β| ≤ 1 for all (Galois) conjugates β of α. Then α = 0.
Proof. Let γ be the product of all the conjugates of α. Then γ is fixed
by the Galois group over Q of the minimal polynomial of α. Therefore it is
in Q. But γ is an algebraic integer, so it’s in Z by 49.4. Being less than 1 in
absolute value, it must be 0. Hence α = 0 as well.
Lemma 49.8. If each αi is a complex root of unity, then
| α1 + · · · + αk | ≤ k .
Equality holds if and only if all the αi are equal.
Proof. This is a standard application of the triangle inequality in R2 ,
since each αi has modulus 1.
Proposition 49.9. If α = (α1 + · · · + αk )/k is an algebraic integer in C,
and if each αi is a root of unity, then either α = 0 or else αi = α for all i.
Proof. If the αi are not all equal, then 49.8 implies that the hypotheses
of 49.7 hold, and then its conclusion gives us exactly what we want. ( This
uses that ‘conjugation’ commutes with addition, and ‘maps’ roots of unity
to roots of unity.)
Proposition 49.10. Suppose that a and b are relatively prime integers,
and z is a complex number such that both z and az/b are algebraic integers.
Then z/b is also an algebraic integer.
Proof. We have z/b = (u)(az/b) + (v)(z) for integers u and v such that
ua + vb = 1. Now apply 49.2.
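The Bézout coefficients u and v in this proof are computable by the extended Euclidean algorithm; a small sketch (the values a = 3, b = 5 and the sample z are our choices):

```python
# The identity behind 49.10: if ua + vb = 1, then
#   z/b = u*(a*z/b) + v*z,
# exhibiting z/b as an integer combination of two algebraic integers.
from fractions import Fraction

def egcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = egcd(b, a % b)
    return (g, v, u - (a // b) * v)

a, b = 3, 5                      # relatively prime
g, u, v = egcd(a, b)
assert g == 1 and u*a + v*b == 1

z = Fraction(7)                  # any z works; a rational keeps things exact
assert z / b == u * (a * z / b) + v * z
print(f"u={u}, v={v}: z/b = {u}*(az/b) + {v}*z")
```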
50. Dimension divides order &
the Burnside (p, q)-theorem.
First let’s define the group algebra of a finite group. This will not be used
in any serious way; only the fact that the set in 50.3 below is a commutative
ring satisfying 50.3 is needed later.
Definition. The group algebra of G is denoted C[G]. As a set, it is
the set of all formal linear combinations, Σ_{g∈G} zg g, where the coefficients are
complex numbers. These are added and multiplied in the obvious way, where
the multiplication uses the operation in the group:
( Σ_{g∈G} zg g ) ( Σ_{g∈G} wg g ) := Σ_{g∈G} ( Σ_{ab=g} za wb ) g .
Proposition 50.1. i) This produces a ring—in fact, a C-algebra in the
sense of Section 52—which is in general non-commutative.
ii) It is commutative if and only if G is an abelian group.
Exercise 50A. Prove this.
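As a sanity check on 50.1 (not a substitute for the exercise), the multiplication above is convolution, and for G = S3 it is associative but visibly non-commutative. A sketch, with permutations written as tuples:

```python
# Group algebra C[G] for G = S3, with multiplication by convolution:
# (sum z_g g)(sum w_g g) = sum_g ( sum_{ab=g} z_a w_b ) g.
from itertools import permutations

G = list(permutations(range(3)))            # S3 as tuples

def compose(p, q):
    """(p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def conv(x, y):
    """Multiply two elements of C[S3], stored as dicts g -> coefficient."""
    z = {g: 0 for g in G}
    for a, za in x.items():
        for b, wb in y.items():
            z[compose(a, b)] += za * wb
    return z

def delta(g):
    return {h: (1 if h == g else 0) for h in G}

s = delta((1, 0, 2))                        # a transposition
t = delta((1, 2, 0))                        # a 3-cycle
u = {g: 1 for g in G}                       # sum of all group elements

assert conv(s, t) != conv(t, s)             # C[S3] is non-commutative
assert conv(conv(s, t), u) == conv(s, conv(t, u))   # associativity sample
print("C[S3]: associative, non-commutative")
```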
We’ll only be using the following commutative subring.
Proposition 50.2. The centre of C[G] (i.e. the set of elements which
commute with every element) is the vector subspace with basis over C equal
to { eC : C is a conjugacy class in G }, where
eC := Σ_{g∈C} g .
Proof. The linear independence of { eC : · · · } is obvious. That eC
commutes with everything is a routine verification, so we get one inclusion.
For the opposite one, let s = Σ_{g∈G} zg g be any element in the centre of the
group algebra. To show that z_{h^{−1}gh} = zg , just calculate with the equality
sh = hs.
Exercise 50B. Give all the details of this proof.
Proposition 50.3. The set of integer linear combinations of the basis
above, Span_Z{ eC : C is a conjugacy class in G }, consists entirely of algebraic integers.
Proof. This is immediate from 49.3, observing that the set is a subring.
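For G = S3, both the centrality of the class sums and the subring observation can be checked by machine (a sketch; permutations are written as tuples):

```python
# Class sums e_C in Z[S3]: they are central, and products of class sums
# are integer combinations of class sums (so Span_Z{e_C} is a subring).
from itertools import permutations

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0] * 3
    for i in range(3):
        q[p[i]] = i
    return tuple(q)

def conv(x, y):
    z = {g: 0 for g in G}
    for a, za in x.items():
        for b, wb in y.items():
            z[compose(a, b)] += za * wb
    return z

def delta(g):
    return {h: (1 if h == g else 0) for h in G}

# conjugacy classes of S3
classes = []
seen = set()
for g in G:
    if g not in seen:
        cls = {compose(compose(h, g), inverse(h)) for h in G}
        seen |= cls
        classes.append(frozenset(cls))

class_sums = [{g: (1 if g in C else 0) for g in G} for C in classes]

# centrality: e_C * g == g * e_C for every g
for e in class_sums:
    for g in G:
        assert conv(e, delta(g)) == conv(delta(g), e)

# e_C * e_D has constant coefficients on each conjugacy class,
# hence lies in Span_Z{e_C}
for e in class_sums:
    for f in class_sums:
        prod = conv(e, f)
        for C in classes:
            assert len({prod[g] for g in C}) == 1
print("class sums of S3 are central; products lie in Span_Z{e_C}")
```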
Recall 48.9, which said that if ρ is an irreducible G-matrep and C is a
conjugacy class in G with gC ∈ C, then
Σ_{g∈C} ρ(g) = m(C, ρ)I ,
where
m(C, ρ) := |C| χρ (gC )/dimρ .
Proposition 50.4. The map Σ_C zC eC ↦ Σ_C zC m(C, ρ) is a ring morphism from the centre of the group algebra to C.
Exercise 50C. Prove this.
Combining 50.4, 50.3 and 49.6, we get
Corollary 50.5. For all conjugacy classes C in G and all irreducible
G-matreps ρ, the complex number m(C, ρ) is an algebraic integer.
Now we can prove the last of the three claims made after the table in
Section 47.
Theorem 50.6. The dimension of any irreducible G-module divides the
order of the finite group G.
Proof. Denoting the irreducible matrep as ρ, we have
1 = < χρ | χρ > = |G|^{−1} Σ_{g∈G} χρ(g)χρ(g)
  = |G|^{−1} dimρ Σ_C {|C| χρ(gC)/dimρ} {χρ(gC)} .
Both factors {between the brace-brackets} in the last sum are algebraic integers, by 50.5 and 49.5, respectively. Since the set of algebraic integers is
closed under multiplication and addition, we see that |G|/dimρ is an algebraic integer. But it’s a rational number. Hence it is an integer, as required,
by 49.4.
Combining this divisibility condition with the squareness of the character
table and the ‘sum of squares’ condition puts strong constraints on what
the dimensions of the irreducibles can be. Every group has at least one
1-dimensional representation, the trivial one.
Thus, for example, a group of order p^2 can only have 1-dimensional representations, since all other divisors are already too big. Therefore the group
is necessarily abelian by 48G, an immediate consequence of the squareness
of the character table. This is a new proof of 11.5, giving a low-level illustration of the use of representation theory to get results about the structure
of finite groups.
Similarly, a group of order p^3 will have, say, “a” irreducibles of dimension
p, and “p^3 − ap^2” of dimension 1. Thus its number of conjugacy classes has
the form p^3 − ap^2 + a for some integer a between 0 and p − 1. The number of
conjugacy classes determines the dimensions of the irreducibles completely.
In fact, only the values a = 0 and 1 occur.
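Constraints like these can be probed by brute force. For |G| = 24 with 5 conjugacy classes (the situation of S4, a standard fact assumed here, not taken from the text), the divisibility and sum-of-squares conditions leave exactly one possible list of dimensions:

```python
# The constraints "each dim divides |G|" and "sum of dims^2 = |G|" are
# very restrictive.  For |G| = 24 with 5 irreducibles (as for S4, a
# standard fact assumed here), brute force leaves a single candidate
# list of dimensions.
from itertools import combinations_with_replacement

order, num_classes = 24, 5
divisors = [d for d in range(1, order + 1) if order % d == 0]

solutions = [dims
             for dims in combinations_with_replacement(divisors, num_classes)
             if sum(d * d for d in dims) == order]

assert solutions == [(1, 1, 2, 3, 3)]
print("only possible dimension list:", solutions[0])
```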
Lemma 50.7. Suppose that χρ (g)/dim(ρ) is an algebraic integer, where
g ∈ G and ρ is an irreducible G-matrep. Then either χρ (g) = 0 or ρ(g) is a
scalar multiple of the identity matrix.
Proof. In 49.9, take k = dimρ, and let α1, · · · , αk be the list of eigenvalues of ρ(g), listed once for each time they occur as roots of the characteristic
polynomial. Each αi is a root of unity, since g has finite order, and χρ(g) = α1 + · · · + αk ; if the αi are all equal to some α, then ρ(g), being diagonalizable, equals αI.
Lemma 50.8. If C is a conjugacy class in G with gC ∈ C, and ρ is
an irreducible G-matrep such that |C| and dimρ are relatively prime, then
χρ (gC )/dimρ is an algebraic integer.
Proof. This is immediate from 49.10 with a = |C|, b = dimρ and
z = χρ (gC ), using 49.5 and 50.5.
Theorem 50.9. If some conjugacy class in a finite group has prime
power order larger than 1, then the group is not simple.
Proof. Let C be that conjugacy class in G, with |C| = p^s and gC ∈ C.
By column orthogonality, 48.12, we have
Σ_ρ χρ(1)χρ(gC) = 0 ,
so that
1 + Σ_{ρ non-trivial} dimρ χρ(gC) = 0 .
Now χρ (gC ) is an algebraic integer by 49.5, and for no algebraic integer z is
1 + pz = 0, by 49.4. Thus we can find an irreducible ρ such that
i) the dimension of ρ is not divisible by p;
ii) χρ(gC) ≠ 0 ;
iii) χρ (gC )/dimρ is an algebraic integer (by 50.8).
From 50.7, we conclude that ρ(gC) is a scalar multiple of the identity. But
the scalar matrices form a normal subgroup, N, of GL(n, C), so ρ^{−1}(N) is
now a non-trivial normal subgroup of G. If it’s proper, we’re finished. If not,
then use instead the normal subgroup Kerρ, which is
a) proper, because ρ is not the trivial representation, and
b) non-trivial, because ρ maps the non-abelian group G into the abelian
group of scalar matrices.
Theorem 50.10. (Burnside) If a finite group has order divisible by
fewer than three primes, then it is soluble.
Proof. Suppose, for a contradiction, that G is a counterexample of smallest order. A non-trivial proper normal subgroup, N , of G would then produce
soluble groups N and G/N . But then G would be soluble, by 11G. Therefore N can’t exist, and so G is a simple non-abelian group. Let its order be
p^a q^b for distinct primes p and q, with b > 0. Let H be a Sylow subgroup
of order q^b. By 11.4, the group H has a non-trivial centre, since it has
prime power order. Let h be a non-identity element in the centre of H. Let
K = { g ∈ G : gh = hg }, the centralizer of h in G. Since K is a subgroup
containing H, we have |K| = p^c q^b, where 0 ≤ c ≤ a. If c = a, then h is in the
centre of G, and the latter would be a proper non-trivial normal subgroup of
G. Therefore c < a. But the conjugacy class of h has order |G|/|K| = p^{a−c}
(see 47S). By 50.9, we have our contradiction.
See the remarks after 12.2 about finite simple groups—a re-phrasing of
50.10 is that a finite non-abelian simple group must have order divisible by
at least three primes. The simplicity of A5 shows that this is a best possible
result in the simplest sense. The renowned Feit-Thompson theorem says that
one of the prime divisors of the order of a finite non-abelian simple group
is necessarily 2. Examples show that there aren’t any odd primes with that
privileged status.
We have pushed ahead into deeper waters in this section, and omitted, for example, some quite important methods for the construction of
representations—induced representations, exterior and symmetric powers,
further development of the products in the exercises at the end of 48, · · · .
Another topic of importance (which is less difficult than some of what we’ve
done) is the analysis of representations over general fields of characteristic
zero by means of comparison with representations over the algebraic closure—
see 50E below.
The student should now read Serre, a beautiful, economical exposition.
After this, much of the research literature on the representation theory of
finite groups, and its applications to number theory, will be accessible. To
learn about infinite groups, read Adams and Fulton-Harris. Also Artin
has a chapter on group representations, with some more complicated examples, some material on infinite groups, and a good set of exercises.
Exercise 50D. i) Invent an averaging trick analogous to Maschke’s to do
the following: Let V be a G-module with an inner product < | >. Construct
another inner product ( | ) on V which is G-invariant, i.e. for all g ∈ G and
all v, w ∈ V , we have
(g · v | g · w) = (v | w) .
ii) Deduce that any G-module has a G-invariant inner product.
iii) Deduce that any G-matrep is equivalent to a G-matrep which takes
all of its values in the unitary group,
U (n) := { A ∈ GL(n) : A is unitary } .
In particular, any finite subgroup of GL(n) is conjugate to a subgroup of
U (n).
iv) Use the existence of an invariant inner product and the idea of ‘orthogonal complement’ to give a new proof of the fact that every G-module
is a direct sum of irreducibles.
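As a numeric illustration of the averaging trick in i), take the rotation representation of the cyclic group of order 3 on R^2 (an example chosen for this sketch, not from the text): start from a non-invariant symmetric form B and average gᵀBg over the group.

```python
# Averaging trick of 50D i): given any inner product B, the average
#   A = (1/|G|) * sum_g  g^T B g
# is G-invariant: g^T A g = A for every g.  Illustrated for the rotation
# representation of Z/3 on R^2 (a choice made for this sketch).
import math

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

group = [rotation(2 * math.pi * k / 3) for k in range(3)]

B = [[2.0, 1.0], [1.0, 3.0]]     # symmetric positive form, not invariant

A = [[0.0, 0.0], [0.0, 0.0]]
for g in group:
    M = mat_mul(mat_mul(transpose(g), B), g)
    for i in range(2):
        for j in range(2):
            A[i][j] += M[i][j] / 3

for g in group:
    gAg = mat_mul(mat_mul(transpose(g), A), g)
    assert all(abs(gAg[i][j] - A[i][j]) < 1e-9
               for i in range(2) for j in range(2))
print("averaged form is G-invariant")
```

Here the average comes out as 2.5 times the identity: the only symmetric forms commuting with a non-trivial rotation are the multiples of the standard one.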
Exercise 50E. Here is something about G-modules, where we modify
the definition in assuming that the ground field has characteristic zero, but
might not be algebraically closed.
i) Show that i) of Schur’s Lemma, 47.6, continues to hold.
ii) Find an example where the group is abelian but has an irreducible representation of dimension larger than 1—think rotation!
iii) Show that parts ii) and iii) of Schur’s Lemma can fail when the field is
not algebraically closed.
iv) Prove the following more general version of Schur’s Lemma: If F is any
field and V is an irreducible G-module, then HomG (V, V ) is a finite dimensional division algebra over F —see the definition after 52D ahead—using
composition as multiplication operation. (We don’t need characteristic zero
here, nor that G is finite.)
This leads nicely into the last two sections of the book, which study
division algebras. A generalization of the ideas in Section 52 can be applied
(among other places) to representation theory, because of iv) above.
APPENDIX. Tensor Products.
Let R be a commutative ring (with 1, of course), which will be fixed until
the last few paragraphs. Let M and N be R-modules. We’ll define another
R-module, M ⊗ N, called the tensor product of M and N.
If you have not studied the basics on modules yet (given here in Section 42), you can take R to be a field, and replace the word “module” by
“vector space” (and below, also replace “module morphism” by “linear transformation”). The few references in earlier sections to tensor products have
involved only that case, indeed with only finite dimensional vector spaces
being needed. For vector spaces, the dimension of M ⊗ N will turn out to
be the product of the dimensions of M and N.
In the general case, the module M ⊗ N will contain elements denoted
m ⊗ n, where m ∈ M and n ∈ N. (Note that this last symbol ⊗ is smaller
than the earlier one.) But not all elements of M ⊗ N will take the form m ⊗ n
; a general element will be writeable as a finite sum of such elements; and this
is a bit messy, since there is no uniqueness in writing elements that way. It
turns out that, for most purposes, it is better to think of M ⊗ N in terms of
a mapping property (also called a universal property) which implicitly defines
it, as we do just below, rather than in terms of notations for its elements.
See Remark ii) after 47.11 and paragraphs after 21B for other information
concerning mapping properties.
For fixed M and N , consider all R-modules P , and all maps
β : M × N −→ P
which are bilinear over R; i.e. for fixed n ∈ N , the map sending m to β(m, n)
is a module morphism from M to P ; and, for fixed m ∈ M , the map N → P
defined by n 9→ β(m, n) is also a morphism of R-modules.
Suggestion. The set M × N is the underlying set of the module M ⊕ N.
But don’t think of it that way in the present context, just as you avoid
doing so in linear algebra when studying bilinear maps defined on V × V (for
example, an inner product). In that context, you don’t usually picture V × V
as the vector space which is the direct sum of V with itself.
Definition. The tensor product of M and N is defined to be any R-module T, together with any bilinear map
ι : M × N −→ T ,
which has the following ‘universal property’ : for every pair (P, β) as above,
there is a unique module morphism β̂ : T → P such that β = β̂ ◦ ι.
Any such T is denoted M ⊗ N.
This definition probably bugs you for at least three reasons : first of all,
we haven’t defined the module T in a unique way; secondly, it is not at all
obvious that any such module T should exist; and finally, the tensor product
has just been defined not only to be a module, but also to include the function
ι.
The last objection can only be smoothed over by saying that we preferred
to not complicate things at the beginning; certainly the module is the more
important thing, and often the term ‘tensor product’ in a discussion means
just the module, not ι. Furthermore, the function ι serves to fix the notation
for the elements referred to earlier: the element m⊗n is defined to be ι(m, n).
Thus the set of elements which can be written in the form m ⊗ n is exactly
the image of ι.
The first two objections are dealt with by proving a couple of propositions.
Proposition 1. Suppose that (T′, ι′) is another choice for a tensor
product of M and N. Then there is a unique module isomorphism, γ : T →
T′, such that ι′ = γ ◦ ι.
Thus (if it exists) the tensor product module is unique up to isomorphism;
and that isomorphism is itself unique, subject to the last equality. This is
what justifies the use of the notation M ⊗ N for T.
Proof. Treating (T, ι) as the tensor product, and taking (P, β) in the
definition to be (T′, ι′), we get a module map γ such that ι′ = γ ◦ ι. Now
just reverse the roles : taking (T′, ι′) as the tensor product, and (T, ι) as
any old pair ‘(module, bilinear map)’, we get a module map α : T′ → T such
that ι = α ◦ ι′.
To show that γ is an isomorphism, we’ll show that α is its inverse. Now
the module morphism α ◦ γ : T → T has the following property : ι = (α ◦
γ) ◦ ι . This says that α ◦ γ is the unique module morphism which arises
in the definition when we take (T, ι) to play both the roles: as the tensor
product and as the general pair. But clearly the identity map from T to
itself is the unique module morphism in question. Thus α ◦ γ is the identity.
Symmetrically, γ ◦ α is the identity map from T′ to itself.
It remains only to prove that γ is unique. But, by the definition of the
tensor product, the uniqueness of γ is immediate from the equality ι′ = γ ◦ ι
and the fact that γ is a module map; bijectivity isn’t needed to prove its
uniqueness.
Proposition 2. For any commutative ring R, and any pair M and N
of R-modules, there is a tensor product of M and N .
Proof. For the moment, think of M × N as merely a set. Let F be
the free R-module with this set as basis. Thus F consists of all finite linear
combinations of pairs (m, n) with coefficients in R. Such a pair itself could
be identified with that element of F , which we’ll denote as [m, n], for which
the coefficient of (m, n) is 1, and all other coefficients are 0.
Let S be the submodule of F generated by the union of the following four
sets :
{ [m + m′, n] − [m, n] − [m′, n] : m, m′ ∈ M, n ∈ N } ,
{ [m, n + n′] − [m, n] − [m, n′] : m ∈ M, n, n′ ∈ N } ,
{ [rm, n] − r[m, n] : r ∈ R, m ∈ M, n ∈ N } ,
and
{ [m, rn] − r[m, n] : r ∈ R, m ∈ M, n ∈ N } .
Define T to be the quotient module F/S. Define
ι : M × N −→ T
by
(m, n) ↦ [m, n] + S .
To prove that ι is bilinear is a straightforward calculation, which will be
left as an exercise. (Note however that bilinearity would fail for the map
M × N → F sending (m, n) to [m, n] ; it is essential that we factor out
the ‘relations’ given by the four sets.)
To prove the universal property, suppose given (P, β) as in the definition.
Let µ : F → P be the unique module morphism which sends each [m, n] to
β(m, n); that is,
μ( Σ_j rj [mj, nj] ) := Σ_j rj β(mj, nj) .
It is now a straightforward calculation to see that each of the elements in the
four sets defining S is in the kernel of µ. Thus there is a module morphism
β̂ : T = F/S −→ P for which β̂(x + S) = µ(x) for all x ∈ F . [This
comes from the basic material on the first isomorphism theorem—Section
42; RVii), iii), iv), v).] Now the equality β = β̂ ◦ ι is immediate by
calculating both mappings on (m, n).
To prove the uniqueness of β̂, note that the set of elements [m, n] + S
generates T , since the elements [m, n] generate F . A module morphism with
domain T is completely determined once we know its values on some set of
generators. But any module morphism ν : T → P for which
β = ν ◦ ι satisfies
ν([m, n] + S) = ν(ι(m, n)) = β(m, n) = β̂([m, n] + S) .
Thus ν(y) = β̂(y) for all y, as required.
Proposition 3. Suppose that R is a field, and that the vector spaces
M and N have bases {mi} and {nj}, respectively. Then M ⊗ N has basis
{mi ⊗ nj }. In particular, the dimension of the tensor product of two vector
spaces is the product of the dimensions of its two ingredients.
Sketch Proof. This will illustrate the following point : In dealing with
the tensor product, it is almost always a bad idea to use a particular construction,
such as the one in the last proof. It’s usually better to go to the universal property.
Let T be the vector space with basis consisting of ‘symbols’ mi ⊗ nj. Let ι :
M × N → T be the function sending ( Σ_i ai mi , Σ_j bj nj ) to Σ_{(i,j)} ai bj mi ⊗ nj ,
where the elements ai and bj are in the field R (and are almost all 0). We
leave it as an exercise for the reader to show that (T, ι) satisfies the definition
of the tensor product.
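In coordinates, the ι of this sketch proof is the familiar Kronecker product of coordinate vectors; a small check of its bilinearity (the helper names below are ours):

```python
# Proposition 3 in coordinates: sending (sum a_i m_i, sum b_j n_j) to
# the length m*n vector of products a_i * b_j is bilinear, and the
# dimension of the tensor product is the product of the dimensions.
from fractions import Fraction

def kron(u, v):
    """Coordinate vector of u (x) v with respect to the basis {m_i (x) n_j}."""
    return [a * b for a in u for b in v]

def add(u, v):
    return [x + y for x, y in zip(u, v)]

def scale(c, u):
    return [c * x for x in u]

u1, u2 = [Fraction(1), Fraction(2)], [Fraction(3), Fraction(-1)]
v = [Fraction(5), Fraction(0), Fraction(7)]

assert len(kron(u1, v)) == len(u1) * len(v)          # dim multiplies
# bilinearity in the first slot:
assert kron(add(u1, u2), v) == add(kron(u1, v), kron(u2, v))
assert kron(scale(Fraction(4), u1), v) == scale(Fraction(4), kron(u1, v))
print("kron is bilinear; dim(M (x) N) = dim M * dim N")
```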
Remark. In the case of general R, the following partial generalization
can be proved: If M and N have generating sets {mi } and {nj } respectively,
then M ⊗ N has generating set {mi ⊗ nj}. In particular, a tensor product
of a pair of finitely generated modules is also finitely generated. The first
assertion is equivalent to the earlier statement that every element in M ⊗ N
can be written as a sum of elements of the form m ⊗ n.
The tensor product has properties resembling the associative, commutative and distributive laws, and the existence of the identity element :
(M ⊗ N) ⊗ P ≅ M ⊗ (N ⊗ P) ;
M ⊗ N ≅ N ⊗ M ;
(M ⊕ N) ⊗ P ≅ (M ⊗ P) ⊕ (N ⊗ P) ;
M ⊗ R ≅ M .
It now follows that we could write down the tensor product of any two
finitely generated modules over a PID in elementary divisor form (see the
remarks after 44D), once the following formulae are known, where p and q
are non-associated irreducibles in R :
(R/ < p^i >) ⊗ (R/ < p^j >) ≅ R/ < p^{min{i,j}} > ;
(R/ < p^i >) ⊗ (R/ < q^j >) ≅ {0} .
We shall leave all the above as exercises. Remember that the mapping
property is the easy way to carry out most of the proofs. For example, if ι :
M × N → M ⊗ N gives the tensor product of M and N (in that order), then
it’s easy to prove that λ : N × M → M ⊗ N, given by λ(n, m) = ι(m, n),
gives the tensor product of N and M (in the reverse order to before). This
will prove the ‘commutativity’ above.
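In the case R = Z, the displayed prime power formulae amount to the single rule Z/< m > ⊗ Z/< n > ≅ Z/< gcd(m, n) >, since taking the min of exponents prime by prime is exactly how the gcd is computed (a standard consequence, stated here without proof):

```python
# For R = Z the displayed formulae say  Z/<m> (x) Z/<n>  =  Z/<gcd(m,n)>:
# min{i, j} on each prime exponent is exactly how gcd is computed.
from math import gcd

p, q = 3, 5
for i in range(0, 4):
    for j in range(0, 4):
        assert gcd(p**i, p**j) == p**min(i, j)   # same prime: R/<p^min{i,j}>
        if i > 0 and j > 0:
            assert gcd(p**i, q**j) == 1          # distinct primes: zero module
print("gcd of prime powers matches the tensor product formulae")
```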
The information in our three propositions suffices for completing the
proofs in previous sections where the tensor product occurs. These are :
the alternative proof of the uniqueness of splitting fields in the remarks after
28.4, including 28A; the remark after 48.2; and Exercises 47Q, 48K, 48L,
as well as Theorem 47.14.
Aside. In differential geometry and physics, the word “tensor” is used in
a way which relates to the material here, with one important extra-algebraic
wrinkle. The concept of a manifold, M, of dimension n is a generalization
of the idea of a surface (the case of dimension 2), such as a sphere, cylinder
or torus (surface of a doughnut). At each point x ∈ M , there is the tangent
space—a vector space, Vx , of dimension n. A vector field on M is a choice
of one vector, vx ∈ Vx , for each x in M . This is always assumed to vary
continuously with x, and usually to vary smoothly (i.e. as a function, it
should have derivatives of all orders). A tensor field on M is a choice of one
vector in Vx ⊗ Vx ⊗ · · · ⊗ Vx for each x ∈ M, where the number of factors
in the tensor product is fixed. This is what the word “tensor” is often used
to mean; the word “field” is omitted, and smoothness tends to go without
saying. Often, some or all of the factors in the last tensor product are replaced
by (Vx)∗, the dual space of Vx. If you see a quite ugly proliferation of sub- and superscripts when reading material on this, what you are seeing are the
component scalars of a tensor with respect to a basis for the iterated tensor
product. This basis will be the one obtained, starting from some fixed basis
for the tangent space, by iterating the basis recipe in the last proposition from
two factors to the number of factors in the tensor product Vx ⊗ Vx ⊗ · · · ⊗ Vx
above. In the case where some factors are the cotangent space (Vx )∗ , one uses
the basis dual to that for Vx .
You will want to get used to ‘diagram chasing’, also called ‘abstract nonsense’, before considering yourself to be a fully fledged mathematician. The
definition and many of the proofs concerning tensor products can be carried
out efficiently using this cultural pursuit. For example, the definition of the
tensor product may be expressed as follows :
                  ι
     M × N ----------> T
          \            :
           \           :
   ∀β (bilinear)       :  ∃! β̂ (morphism)
             \         :
              v        v
                   P
There are occasions when several ‘ground rings’ enter a discussion. To
be clear about which one is being ‘tensored over’, the notation M ⊗ N is
expanded to M ⊗_Z N. For example, if R is any ring whose underlying abelian
group is isomorphic to Z^2, then R ⊗_Z R ≅ Z^4, where R is just thought of as
an abelian group. However, R ⊗_R R ≅ R ≅ Z^2, where the first isomorphism
is of R-modules, but the second only of abelian groups. (R = Z[i] would
do here.) This shows that the result can depend on which ring is ‘tensored
over’.
An example using vector spaces is as follows : C ⊗_C C ≅ C ≅ R^2,
whereas C ⊗_R C ≅ R^2 ⊗_R R^2 ≅ R^4. All of these are (at least) isomorphisms of real vector spaces.
Here is a question which gives some play with individual elements in a
tensor product : Is Q ⊗_Z Q ≅ Q ⊗_Q Q (as abelian groups)? Working on
this will probably produce instances where a ⊗ b = c ⊗ d with a ≠ c and
b ≠ d. For example, in Q ⊗_Z Q, is
(1/2) ⊗ (1/2) − 1 ⊗ (1/4)
an element of order 2?—or is it zero? The element denoted in the same way
in Q ⊗_Q Q is certainly 0 , since it is
(1/2)[1 ⊗ (1/2)] − (1/2)[1 ⊗ (1/2)] .
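One way to experiment with the Q ⊗_Z Q question: the multiplication map a ⊗ b ↦ ab is Z-bilinear, hence well-defined on Q ⊗_Z Q, and it sends the displayed element to 0. Whether that forces the element itself to be 0, i.e. whether this map is injective, is the real content of the exercise.

```python
# The multiplication map  Q (x)_Z Q -> Q,  a (x) b |-> a*b,  is bilinear
# over Z, so it is well-defined on the tensor product.  We represent an
# element as a list of (a, b) pairs standing for  sum a_i (x) b_i.
from fractions import Fraction as F

def mult_map(tensors):
    """Image of sum a (x) b under the multiplication map."""
    return sum((a * b for a, b in tensors), F(0))

element = [(F(1, 2), F(1, 2)), (F(-1), F(1, 4))]   # (1/2)(x)(1/2) - 1(x)(1/4)
assert mult_map(element) == 0
print("multiplication map sends the element to 0")
```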
Note also that a ⊗ b could be zero in A ⊗ B, and yet non-zero in A′ ⊗ B′, for
submodules A′ of A, and B′ of B. Give an example ! So the notation a ⊗ b
on its own is inherently ambiguous.
VI. A Dearth of Division Rings
Sections 51 and 52 each discuss a classic fact about division rings: 1)
there are no finite non-commutative ones (Wedderburn) ; and 2) a finite dimensional non-commutative division algebra over R is necessarily isomorphic
to the quaternions (Frobenius). We assume that 1 ≠ 0 in a division ring,
although that wasn’t done in the original definition.
51. Finite division rings are fields.
To begin, here is a number theoretic lemma whose proof uses the simplest
facts about the cyclotomic polynomials cn from Section 33.
Lemma 51.1. Given integers k > 1 and n ≥ 1, suppose that M is a
multiset of positive integers m < n with m | n, such that
k^n = k + Σ_M (k^n − 1)/(k^m − 1) .
Then n = 1 (and so M is empty).
Proof. When m | n and m < n, the cyclotomic polynomial cn(x) divides
(x^n − 1)/(x^m − 1) in Z[x]. Therefore the integer cn(k) divides
k^n − 1 − Σ_M (k^n − 1)/(k^m − 1) ,
which, by the displayed hypothesis, equals k − 1. So write k − 1 = r cn(k) for some integer r. Thus
1 = (k − 1)/(k − 1) = r cn(k)/(k − 1) = (k − 1)^{−1} r ∏_ξ (k − ξ) ,
with the product being over all the complex primitive nth roots, ξ, of unity.
But for n > 1, distances in R^2 give |k − ξ| > (k − 1), so
|(k − 1)^{−1} r ∏_ξ (k − ξ)| > (k − 1)^{−1} |r| ∏_ξ (k − 1) = |r|(k − 1)^{Φ(n)−1} ≥ 1 ,
contradicting the equality in the previous display. Therefore n = 1.
Exercise 51A. Prove that for positive integers k > 1, m, and n,
(k^m − 1) | (k^n − 1) ⇐⇒ m | n .
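Both the lemma and Exercise 51A invite a machine check over small ranges (a sanity check only, not a proof; the bounds below are arbitrary):

```python
# Numeric probe of 51A and Lemma 51.1 over small ranges.
def divides(a, b):
    return b % a == 0

# 51A:  (k^m - 1) | (k^n - 1)  <=>  m | n
for k in range(2, 6):
    for m in range(1, 9):
        for n in range(1, 9):
            assert divides(k**m - 1, k**n - 1) == divides(m, n)

# Lemma 51.1: for n > 1 no multiset M of proper divisors m of n satisfies
#   k^n = k + sum_M (k^n - 1)/(k^m - 1),
# i.e. no nonnegative integer combination of the terms equals k^n - k.
def representable(target, terms):
    """Coin-change style reachability: is target a sum of the terms?"""
    reachable = {0}
    for t in terms:
        new = set()
        for r in reachable:
            s = r + t
            while s <= target:
                new.add(s)
                s += t
        reachable |= new
    return target in reachable

for k in range(2, 6):
    for n in range(2, 7):
        terms = [(k**n - 1) // (k**m - 1)
                 for m in range(1, n) if n % m == 0]
        assert not representable(k**n - k, terms)
print("51A equivalence and Lemma 51.1 confirmed on small ranges")
```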
Recall that the group of invertibles in a (not necessarily commutative)
ring R is denoted R× . The centre of the ring is
Z(R) := { r ∈ R : rs = sr for all s ∈ R } .
The centralizer of an element r is
CR (r) := { s ∈ R : sr = rs } .
These are subrings. We use the same notation for the centralizer subgroup
of an element in a group.
Exercise 51B. Suppose that R is a division ring. Prove that CR (r) also
is. Deduce that, for non-zero r,
CR (r)× = CR× (r) .
Exercise 51C. Recall, from 47S, that if h ∈ C, a conjugacy class in a
group G, then the following map is bijective :
G/CG (h) −→ C
gCG(h) ↦ ghg^{−1} .
Theorem 51.2. (Wedderburn) Any finite division ring is a field.
Proof. Let D be a finite division ring. Then its centre, F , is a field, and
D is a vector space over F, of dimension n, say. Let |F| = k. Thus |D| = k^n.
Then, summing over non-singleton conjugacy classes, C, of the group D× ,
with dC ∈ C and with CD (dC ) of dimension mC over F , we get
k^n − 1 = |D×| = |Z(D×)| + Σ_C |C| = |F×| + Σ_C |D×|/|CD×(dC)|
  = k − 1 + Σ_C (k^n − 1)/(k^{m_C} − 1) .
Each term in the summations is an integer greater than 1, so all the mC are
proper divisors of n, by 51A. But then 51.1 gives n = 1. Therefore D = F ,
as required.
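To make the theorem concrete, here is a sketch building the 9-element field F3[x]/(x^2 + 1) and checking directly that it is a division ring, necessarily commutative, as the theorem predicts (the choice of modulus is ours):

```python
# F9 realized as F3[x]/(x^2 + 1)  (x^2 + 1 is irreducible mod 3, an easy
# check included below).  Every non-zero element has an inverse, so this
# finite ring is a division ring -- and it is commutative, as
# Wedderburn's theorem predicts.
p = 3
assert all((a * a + 1) % p != 0 for a in range(p))   # no root mod 3

def mul(u, v):
    """(u0 + u1 x)(v0 + v1 x) mod x^2 + 1, coefficients mod 3."""
    c0 = u[0] * v[0] - u[1] * v[1]          # x^2 = -1
    c1 = u[0] * v[1] + u[1] * v[0]
    return (c0 % p, c1 % p)

elements = [(a, b) for a in range(p) for b in range(p)]
one = (1, 0)

for u in elements:
    if u == (0, 0):
        continue
    inverses = [v for v in elements if mul(u, v) == one]
    assert len(inverses) == 1               # unique inverse
    assert mul(inverses[0], u) == one       # two-sided

# commutativity (forced, by Wedderburn, for any finite division ring)
assert all(mul(u, v) == mul(v, u) for u in elements for v in elements)
print("F9 is a field with", len(elements), "elements")
```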
Remarks. i) The finite fields were classified in Section 31, so the finite
division rings are completely known.
ii) Desarguean projective geometries are coordinatized by division rings
(see Samuel). It was Hilbert who first observed that Pappus’ Theorem
holds if and only if the coordinate ring is a field. In particular, there is no
‘deduction’ of Pappus’ from Desargues’. However, by Wedderburn’s theorem,
every finite Desarguean projective geometry is Pappian.
iii) There is a very substantial generalization of Wedderburn’s theorem
due to Jacobson: If R is a ring such that, for all its elements r there exists
an integer n > 1 (possibly depending on r) such that r^n = r, then R is
commutative. Wedderburn’s theorem follows by taking n to be the order of
the division ring.
Jacobson’s theorem does not require the assumption that rings have a
1. Could this also be deduced ‘after the fact’ from the following elementary
exercise or an analogue?
Exercise 51D. Show that every (associative) Z-algebra A can be embedded in a ring R, as follows. As an abelian group, take R = Z × A. Define
multiplication by
(n, x)(m, y) := (nm, ny + mx + xy) .
Show that R is commutative if and only if A is commutative.
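The construction in 51D can be checked mechanically; below A = 2Z, a Z-algebra with no identity element (our choice of example):

```python
# The embedding of Exercise 51D, checked on a Z-algebra without 1:
# A = 2Z (the even integers have no multiplicative identity).
# R = Z x A with (n, x)(m, y) = (nm, ny + mx + xy) has identity (1, 0).
import itertools, random

def mul(r1, r2):
    n, x = r1
    m, y = r2
    return (n * m, n * y + m * x + x * y)

def add(r1, r2):
    return (r1[0] + r2[0], r1[1] + r2[1])

random.seed(0)
samples = [(random.randint(-5, 5), 2 * random.randint(-5, 5))
           for _ in range(6)]

for r in samples:
    assert mul((1, 0), r) == r == mul(r, (1, 0))          # (1, 0) is the 1
for r, s, t in itertools.product(samples, repeat=3):
    assert mul(mul(r, s), t) == mul(r, mul(s, t))          # associativity
    assert mul(r, add(s, t)) == add(mul(r, s), mul(r, t))  # distributivity
assert all(mul(r, s) == mul(s, r)
           for r in samples for s in samples)              # 2Z commutative
print("Z x 2Z is a commutative ring with identity (1, 0)")
```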
Exercise 51E. Assume that R is a finite ring with no divisors of zero,
i.e. the integral domain ‘property’ holds, except that commutativity is not
assumed—(ab = 0 ⇒ a = 0 or b = 0). Show that R is a finite field. (See
4A.)
52. Uniqueness of the quaternions.
We begin with more definitions than are really needed here, but it is
helpful to see the larger context. We’ll talk about associative algebras, and
drop the word “associative”. But you should be aware that there are nonassociative algebras, such as Lie algebras (see Fulton-Harris), of tremendous importance in mathematics and physics.
Let R be a commutative ring (with 1, as usual). Recall that a Z-algebra
satisfies the same axioms as a (not necessarily commutative) ring, except
that no identity element is assumed to be in it.
Definition. An (associative) R-algebra is an object A which is simultaneously a Z-algebra and an R-module, such that the two multiplications are
related by the ‘law’, (!!), which occurred in the discussion of the quaternions
H in Section 14 :
for all r ∈ R, and a, b ∈ A,
r · (ab) = (r · a)(b) = (a)(r · b) .
A morphism between R-algebras is a map which preserves all three operations (so it is a morphism of modules, and would be a morphism of rings,
except that no reference to an element 1A occurs).
A canonical example of a non-commutative R-algebra is the algebra,
R^{n×n}, of all n × n matrices over R, for n > 1. Another example is the
group algebra from Section 50.
To justify the a priori notation when R = Z, recall from Section 45 that
an abelian group is automatically a Z-module.
Exercise 52A. Show that the ‘law’ in the last definition holds, so that
there is a Z-algebra structure in the new sense on any Z-algebra in the old
sense.
Exercise 52B. Assume that F is a field, and A is an F -algebra which is
finite dimensional as an F -vector space. Assume that A has no zero divisors.
Show that A, under its two binary operations, is a division ring. Relate this
to Section 19, especially 19.3, and to 51E.
Exercise 52C. Define the centre, Z(A), of an R-algebra A, as done in
the last section. Show that Z(A) is a commutative R-algebra. Given an
R-algebra A with 1A , define a map φ : R → A by sending r to r · 1A . Prove
that φ is a morphism of rings whose image is contained in the centre of A.
Conversely, given a ring A, a commutative ring R, and a ring morphism from
R into the centre of A, show how A may be made into an R-algebra with 1.
(In particular, commutative ring extensions S ⊃ R give R-algebras S.)
Exercise 52D. Show how any ring may be ‘canonically’ converted into
an R-algebra, where R is its centre.
Definitions. Let F be a field. A division algebra over F is an F -algebra,
D, whose underlying Z-algebra is actually a division ring. In particular D
has a 1, so the scalar multiplication by F may alternatively be regarded as
an embedding of F into D, by 52C. Then we may regard F as a subfield
of D. Thus the idea of an element a ∈ D being algebraic over F may be
defined just as in field theory : there exist ci ∈ F , not all zero, such that
c0 + c1 a + c2 a^2 + · · · = 0. Because F ⊂ Z(D), it is immaterial whether the
ci are positioned to the left of the powers of a or not. As one would expect, D is said
to be algebraic over F if all of its elements are. Note also that a morphism
of division algebras over a field F can now be defined to be a ring morphism
which fixes all elements of F . Such a map is necessarily injective, since a
division ring has no interesting ideals, two-sided or otherwise.
Exercise 52E. Since an F -module is just a vector space, the meaning of
D being a finite dimensional algebra is clear. Show that the same argument
as in field theory demonstrates that such an algebra is algebraic over F .
Note that any division ring is a division algebra over its centre, by 52D.
This is really the way we were viewing finite division rings in the last section,
but we didn’t need all these definitions to do so.
We shall deal with Frobenius’ theorem (stated below) by chopping it into
pieces labeled (I) to (IX). In Exercise 52H ahead, a different, better (and
more ‘exercisable’!) proof is suggested. You may want to go directly there
and do it.
(I). First note that there are no proper (not necessarily commutative)
ring extensions R ⊃ C such that
i) C is inside the centre of R;
ii) R has no zero divisors; and
iii) R is algebraic over C.
For if r ∈ R is a root of ∏j (x − zj ) ∈ C[x] (such a product exists: r is
algebraic over C, and every polynomial in C[x] factors into linear factors),
then ∏j (r − zj ) = 0, so, for some j, we have r = zj ∈ C.
In particular, there are no finite dimensional division algebras over C,
other than C itself.
Exercise 52F. Where is i) used above?
(II). Another way of saying that R has only two algebraic field extensions
is that there are no algebraic commutative division algebras over R, up to
isomorphism, other than C and R itself. In particular, this is true as well
with “algebraic” replaced by “finite dimensional”.
Theorem 52.1. (Frobenius) If D is an algebraic division algebra over
R which is not commutative, then D is isomorphic to the quaternions, H, as
a real division algebra.
This applies in particular to finite dimensional non-commutative division algebras over R. Combined with (II), we see that there are only three
algebraic division algebras over R, namely H, C and R. In particular, an
algebraic division algebra over R is necessarily finite dimensional.
Exercise 52G. Produce from your memory of previous sections several
examples of fields F and algebraic F -division algebras which are not finite
dimensional over F . Now produce one which isn’t commutative.
Proof of 52.1. Assume below in (III) to (VIII) that D denotes a central
algebraic R-division algebra; that is Z(D) = R, where we regard R as a
subfield of D via the embedding of R which gives D its module structure.
(III). If ℓ ∈ D \ R, then
R(ℓ) := { r0 + r1 ℓ : ri ∈ R }
is a subfield of D isomorphic to C as an R-algebra.
For ℓ is the root of a real irreducible of degree 2, by algebraicity, so the set
is easily seen to be an algebraic field extension of R. Now apply paragraph
(II) above.
(IV). If m ∈ D \ R and m² ∈ R, then m² < 0.
For m² ≠ 0 since a division ring has no zero divisors; and, if m² were
positive, then x² − m² would have three roots, m and ±√m² , in the field
R(m) ≅ C, though a quadratic over a field has at most two roots. Better:
D can be replaced by R(m), which, by (III), is replaceable by C, for which
the result is obvious.
(V). If R(ℓ) ⊂ D with ℓ² = −1, then there exists m ∈ D \ R(ℓ) with
mℓ = −ℓm.
Because D is central over R, we can choose an element s ∈ D with
m := sℓ − ℓs ≠ 0. A trivial calculation yields the identity (i.e. that ℓ and
m anti-commute). Since R(ℓ) ≅ C by (III), and m doesn’t commute with ℓ
(WHY?), we see that m ∉ R(ℓ).
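The ‘trivial calculation’ left to the reader in (V), written out with m = sℓ − ℓs and ℓ² = −1:

```latex
\ell m \;=\; \ell s \ell - \ell^{2} s \;=\; \ell s \ell + s ,
\qquad
m \ell \;=\; s \ell^{2} - \ell s \ell \;=\; -(\ell s \ell + s) \;=\; -\ell m .
```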
(VI). If R(ℓ) ⊂ D with ℓ² = −1, and m ∈ D \ R(ℓ) with mℓ = −ℓm, then
m² ∈ R.
By (III), we have m² = c0 + c1 m for ci ∈ R. The fact that ℓ and m
anti-commute gives that ℓ and m² commute. Thus
c0 ℓ + c1 ℓm = ℓm² = m²ℓ = c0 ℓ − c1 ℓm .
Thus 2c1 ℓm = 0 and so c1 = 0 as required, since we can cancel in a division
ring.
(VII). Given ℓ and v in D with v ∉ R(ℓ) [and hence ℓ ∉ R(v)], no element
of D outside of R commutes with both v and ℓ.
For, by (I) and (III), such an element would need to lie in both R(ℓ)
and R(v). But we cannot have c0 + c1 ℓ = c′0 + c′1 v with the c’s in R unless
c1 = 0 = c′1 .
(VIII). Given ℓ and v in D with v² = ℓ² = −1, the element ℓv + vℓ
commutes with both v and ℓ.
This is a trivial calculation.
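Written out, the ‘trivial calculation’ for (VIII), using v² = ℓ² = −1:

```latex
\ell(\ell v + v\ell) \;=\; \ell^{2} v + \ell v \ell \;=\; -v + \ell v \ell
\;=\; v\ell^{2} + \ell v \ell \;=\; (\ell v + v\ell)\,\ell ,
```

and the same computation with ℓ and v interchanged shows the element commutes with v.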
(IX). Now assume that D is any R-division algebra which contains H as
a subring. Then no non-zero element of D anti-commutes with all three of
i, j and k.
For such an element y, we have
0 = yk + ky = yij + ijy = (yi + iy)j − i(yj − jy) = − i(yj − jy) .
By cancellation, yj = jy. But y and j also anti-commute, yielding 2yj = 0.
Thus y = 0, as required.
Now let’s polish off the proof. Let D be a non-commutative algebraic
division algebra over R. By (II), its centre is isomorphic to R or C. But
the latter is impossible by (I). Thus D is a central R-algebra. By (III), we
may choose ℓ ∈ D with ℓ² = −1. By (V), we may choose m ∈ D which
anti-commutes with ℓ. By (IV) and (VI), the element m² is a negative real.
Multiplying by a suitable real, we may rechoose m so that m² = −1, and m
still anti-commutes with ℓ. Let n = ℓm. It is now routine to calculate that
ℓ, m and n satisfy all the relations given for i, j, and k in the definition of
H, the quaternions. From this, it is routine to demonstrate that the map
φ : H −→ D ,    c0 + c1 i + c2 j + c3 k ↦−→ c0 + c1 ℓ + c2 m + c3 n ,
is a morphism of rings fixing each real. Its kernel is zero since H is a division
ring. It remains only to show that φ is surjective. Let u ∈ D \ R. By
(III), there is an element v of the form c0 + c1 u with v² = −1, where the
c’s are reals (and so c1 ≠ 0). It suffices to exhibit an element w ∈ φ(H)
with v = −w, since then v ∈ φ(H), and so u ∈ φ(H). We may assume that
v ∉ R(ℓ) ∪ R(m) ∪ R(n), since that union is contained in φ(H). Let
w := ½(vℓ + ℓv)ℓ + ½(vm + mv)m + ½(vn + nv)n .
Now vℓ + ℓv ∈ R by (VIII) and (VII). The same applies to the other two
‘coefficients’ in the definition of w. Thus w is certainly in the image of φ.
It is routine to check that v + w anti-commutes with each of ℓ, m and n.
Now (IX) evidently holds with H replaced by any isomorphic R-algebra, and
i, j, and k replaced by their images under the isomorphism. Therefore, from
(IX), we get v + w = 0, completing the proof.
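The relations which ℓ, m and n = ℓm are claimed to satisfy can be sanity-checked in a concrete model. The sketch below (not the book’s construction) realizes H inside the 2 × 2 complex matrices, with the hypothetical names L, M, N standing for ℓ, m, n:

```python
# Check the quaternion relations for L (= ℓ), M (= m) and N = L·M (= n),
# using the standard embedding of H into 2x2 complex matrices.
# The matrix model itself is an assumption of this sketch, not the book's.

def mul(A, B):
    """Product of two 2x2 matrices given as nested lists [[a, b], [c, d]]."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def neg(A):
    """Negate a 2x2 matrix entrywise."""
    return [[-x for x in row] for row in A]

ONE = [[1, 0], [0, 1]]
L = [[1j, 0], [0, -1j]]   # plays the role of ℓ (i.e. of i)
M = [[0, 1], [-1, 0]]     # plays the role of m (i.e. of j)
N = mul(L, M)             # n = ℓm (i.e. of k)

assert mul(L, L) == neg(ONE) and mul(M, M) == neg(ONE) and mul(N, N) == neg(ONE)
assert mul(M, L) == neg(mul(L, M))          # ℓ and m anti-commute
assert mul(M, N) == L and mul(N, L) == M    # mn = ℓ and nℓ = m, as in H
```

These are exactly the relations i² = j² = k² = −1, ji = −ij, jk = i, ki = j from the definition of H.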
This proof is something of a bag of tricks. There is a more systematic
theory of central division algebras, with applications to number theory and
quadratic forms. Frobenius’ theorem above drops out as a byproduct of this
theory. See Hungerford, and Lam (1973).
Exercise 52H. This gives another, and a very slick, proof of Frobenius’
theorem. See Lam (1991), pp. 219–220, for the details, if necessary. Let D
be a non-commutative algebraic division algebra over R.
(i) Argue as in (III) above that D contains a subalgebra isomorphic to
C. So, without loss of generality, we have C ⊂ D.
(ii) Prove that, as a C-vector space, D decomposes as the direct sum of
D⁺ and D⁻ , where
D± := { d ∈ D : di = ±id } .
(iii) Show that D+ = C.
(iv) Show that µ : D⁻ → D⁺ , sending x to xy, for any fixed non-zero
y ∈ D⁻ , is well defined and an isomorphism of complex vector spaces.
(v) Prove that any such y satisfies y² ∈ R and y² < 0.
(vi) Deduce that D ≅ H as real algebras.
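One way to begin part (ii), a sketch not taken from Lam but easily verified using only i² = −1: split each d ∈ D as

```latex
d \;=\; d^{+} + d^{-},
\qquad
d^{\pm} \;:=\; \tfrac{1}{2}\left( d \mp i\,d\,i \right),
```

and check directly that d± i = ±i d± and that D⁺ ∩ D⁻ = 0.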
You may be wondering about the following. We gave a definition of H as
C × C, similar to the construction of C from R. This used complex conjugation; but there is also a quaternionic conjugation. Why can’t we produce
an algebra structure on H² using analogous formulae? And wouldn’t this
contradict Theorem 52.1? The answer is that the proof of associativity for
H uses commutativity of C in an essential way. There is a construction of
a non-associative R-algebra along the lines above. It is called the Cayley
numbers and, of course, has dimension over R equal to 8 = 2 dimR H. With a
suitable definition of not-necessarily-associative division algebra, it is a very
difficult theorem, proved first in 1957 by Bott, Kervaire, and Milnor using
algebraic topology, that only in dimensions 1, 2, 4 and 8 can there exist such
algebras (when the field is R, and we restrict to finite dimensions). Perhaps
the neatest proof so far would use the ‘postcard-size’ argument in the first
section of Adams-Atiyah. This argument appears to be just an elementary
algebraic manipulation; but it is based on Bott periodicity, which is a topological result of great depth. The argument shows that certain topological
spaces (which not-necessarily-associative division algebra structures on R²ⁿ
would allow you to construct) can only exist when n is 1, 2 or 4. The spaces
which do exist are projective planes based on the algebras. The ‘postcard’
argument ends when the condition 2ⁿ | 3ⁿ − 1 has been deduced.
Exercise 52I. Show that, indeed,
2ⁿ | 3ⁿ − 1 ⇐⇒ n = 1, 2 or 4 .
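The forward implication needs a proof (for instance via the 2-adic valuation of 3ⁿ − 1), but the claim is easy to sanity-check by brute force for small n; the bound 200 below is arbitrary:

```python
# Brute-force check of Exercise 52I for 1 <= n <= 200: the divisibility
# 2^n | 3^n - 1 should hold exactly for n = 1, 2 and 4.
# (A finite check is evidence, not the proof the exercise asks for.)
hits = [n for n in range(1, 201) if (3**n - 1) % 2**n == 0]
assert hits == [1, 2, 4]
```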
References
Adams, J. F. Lectures on Lie Groups. W. A. Benjamin, New York,
1969.
Adams, J. F. and Atiyah, M. F. K-theory and the Hopf invariant.
Quart. J. Math. (Oxford)(2) 17 (1966) 31–38.
Artin, M. Algebra. Prentice Hall, Englewood Cliffs, N.J., 1991.
Atiyah, M. F. and Macdonald, I. G. Introduction to Commutative
Algebra. Addison-Wesley, Reading, Mass., 1969.
Fulton, W. and Harris, J. Representation Theory: A First Course.
Springer-Verlag, New York, 1991.
Greub, W. Multilinear Algebra. Springer-Verlag, New York, 1978.
Goldstein, L. J. Abstract Algebra. Addison-Wesley, Reading, Mass.,
1973.
Hartley, B. and Hawkes, T. O. Rings, Modules and Linear Algebra.
Chapman & Hall, London, 1970.
Herstein, I. N. Topics in Algebra. 2nd ed., Wiley, New York, 1975.
Hungerford, T. W. Algebra. Springer-Verlag, New York, 1974.
Jacobson, N. Basic Algebra I, II. Freeman, San Francisco, 1974, 1980.
Lam, T. Y. The Algebraic Theory of Quadratic Forms. W. A. Benjamin,
Reading, Mass., 1973.
Lam, T. Y. A First Course in Noncommutative Ring Theory. Springer-Verlag, New York, 1991.
Lang, S. Undergraduate Algebra. 2nd ed., Springer-Verlag, New York,
1990.
Macdonald, I. G. Symmetric Functions and Hall Polynomials. Oxford
Univ. Press, Oxford, 1979 (enlarged 2nd ed. to appear).
Mackey, G. W. Harmonic analysis as the exploitation of symmetry—a
historical survey. Bull. A.M.S. 3 (1980) 543–699.
Rotman, J. Galois Theory. Springer-Verlag, New York, 1990.
Samuel, P. Projective Geometry. Springer-Verlag, New York, 1988.
Serre, J. P. Linear Representations of Finite Groups. Springer-Verlag,
New York, 1977.
Stewart, I. Galois Theory. Chapman & Hall, London, 1989.
Index of Notation
GL(n, R), 7, 13
GL(Rn ), 7, 13
∃, iii
∃!, iii
∀, iii
=⇒, iii
HomC (V, W ), 219
HomG (V, W ), 219–221, 237
HomG (V, V ), 220
Jα,λ , 207
A4 , 50
A5 , 40, 248
An , 17, 18, 30, 40, 41, 90
Bq(x) , 206
Ck , 7, 12, 13
CR (r), 259
M1 ⊕ · · · ⊕ Mt , 179
MC , 191
MxI−A , 200
Mn (R), 54
M ap(R, R), 66
N (A) , 199, 208
D (formal derivative), 118
D (discriminant), 175
D2 , 15
D3 , 8, 12, 14
D4 , 7, 12, 14, 38
Dn , 34
O(n), 7, 13
O(Rn ), 7, 13
F × , 120, 170
F (a), 92
F (√β), 103
F (x), 78
F (a1 , · · · , an ), 99
R ⊂ S, 61
R × S, 58
R× , 58, 259
(R × S)× , 59
Rn×n , 54, 261
RG , 222–224
RX , 224, 230
R[s], 61
R[s1 , · · · , sk ], 83
R[x], 54, 63
R[x1 , · · · , xk ], 84
G × H, 24, 43
G/H, 31
G ⊕ H, 43
G ≅ H, 13
G/CG (h), 259
GL(n, C), 210, 213, 248
QR , 76
Qrn, 27, 38, 54
Quad, 101, 125
S ⊃ R, 61
S4 , 50, 158
S5 , 148
Sn , 1, 7, 12, 18, 19, 40, 145, 146, 157
SymmR[x1 , · · · , xn ], 86
U (n), 249
V ≅ W , 95
V ⊕ W , 215
V • W , 241
V ∗ , 229
Vx , 255
(Vx )∗ , 255
Z(A), 261
Z(D), 263
Z(R), 259
[F (a) : F ], 97
[G : H], 21, 31
[K : F ], 96
(k ; [d1 ] , · · · , [d# ]), 180
[g0 ], v, 21
c̄, 54
∩α Hα , 17
µ(r), 122
φ–linear map, 109
φ−1 , iv, 16
φγ (γ ∈ Sk ), 86
φα (linear maps α), 233, 235
ψC , 234, 237
ρ ⊕ λ, 215
π, iv, 112
θ1 ⊕ · · · ⊕ θn , 193
∆ (alternant), 90
√∆ (discriminant), 174
Φ(k), 9, 23, 58, 59, 120, 123–125
χreg , 231
χstandard , 232
a ∼ b, v, 70
a ≡ b(mod H), 19
a | b, 70
b ∈ B, iii
ck (x), 124
cp (x), 126
e, iv, 112
ei , 87
eC , 245
g n , 10
g0 H, 21
g0 + H, 21
i, j, k, 27, 54
m(C, ρ), 246
r/c operations, 195, 203
A, 112
C, iv, 54
C(X), 80
C[G], 245
C× , 7
CG , 234, 240
CG/∼ , 227, 240
Fpn , 121
F×pn , 121
H, 263, 266
Q, iv, 46, 54
Q/Z, 47
R, iv, 8, 14, 54
R4 , 54
R× , 7, 12
Rn , 8
R+ , 14
Z, iv, 8, 12, 14, 18, 46, 54
Zk , v, 8, 12, 13, 29, 34, 57
Z×k , 8, 23
Z×pn , 60, 120
Ck , 101
Lk , 101
Pk , 101
P, 101, 125, 158
ann(M ), 189
AutF(K), 129
deg(∑ aᵢxⁱ ), 67
dim(M ⊗ N ), 254
dimHomG , 221, 237
Imφ, 30
Kerφ, 30
tr(A), 228
tr(T ), 229
!, 38
|G|, 9
∥g∥ (order), 11
∥h∥ (length), 56
∥(g, h)∥, 26
⟨χV | χW ⟩, 237
⟨α | β⟩, 234
⟨I⟩, 237
( | ), 249
!
,!
106, 115, 229, 240, 250
M ⊗ N , 250–252
⊗, 229, 240, 250
m ⊗ n, 250, 251
inherent ambiguity, 257
REFERENCE MATERIAL
Adams, 249
Adams-Atiyah, 266
Artin, 38, 80, 86, 116, 142, 159, 195, 198,
207, 249
Atiyah-Macdonald, 106
Fulton-Harris, 249, 261
Goldstein, 142, 143, 156
Greub, 106
Hartley-Hawkes, 177
Herstein, 76, 143
Hungerford, 166, 177, 242, 266
Jacobson, 177
Lam(1973), 265
Lam(1991), 61
Lang, 143
Macdonald, 90
Mackey, 225
Rotman, 174
Samuel, 260
Serre, 249
Stewart, 142
Index
Abel, Niels, 146
Abel/Galois/Ruffini, Niels/Evariste/Paolo,
i, 230
abelian, 9
group, 43, 177
by generators and relations, 48
simple, 50
abstract nonsense, 256
abstract symbol, 64
abused notation, v, 179
action, 212
of a group, 36
of groups on graphs, 39
Adams, Frank(J.F.), 266
additive group structure of a field, 96
additive notation, 9
adjoint formula, 203
algebra
algebraic, 262
finite dimensional, 262
over C, 245
structure on H2 , 266
algebraic
closure, 110, 115
of Fpn , 115
of Q, 115
division algebra over R, 263
non-commutative, 263–265
element, 61, 84, 93, 97, 262
extension, cardinality of, 112
geometry, 81, 86
integer, 230, 242–247
number, 112
approximation by rationals , 113
topology, 39, 266
algebraically
closed, 110, 198, 206, 208
dependent, 83
independent, 83, 113, 147
over Fp , 127
algebraizing topology, 81
algorithm, 16, 194, 195
alternating
function, 90
group, 18
polynomial, 90
angle trisection, 100, 103
ann(M ), 189
annihilator, 189, 208
arrow-theorist, 65
associates in a ring, 70
associative, 4, 6
algebra, 260, 261
over Z, 53
associativity, 53
for H, 55
isomorphisms, 179
Atiyah, Michael, 266
AutF (K), 129
AutR (C), 162
AutFpa (Fpab ), 159
AutQ (R), 162
automorphism, 110, 129
of finite field, 159
discontinuous, of C, 162
averaging trick, 219
axiom of choice, 115
axioms for groups, 6, 10, 24
basis, 46, 95, 178
for F (x), 97
bijective, iv, 1, 8, 13, 29
bilinear map, 251, 253
binary operation, 4, 51
on set of cosets, 32
block matrix, 206
Bott periodicity, 266
Bott, Raoul, 266
Burnside (p, q)-theorem, 244, 248
cancellation, 10
canonical form, 204
matrix, 206
canonical surjection, 179
Cantor, Georg, 112, 113
cardinal number, iv
of sets of Zp [x]-irreducibles, 122
careful plodder, 65
Cartesian product, 23
Cauchy, Augustin, 36
Cauchy theorem, 148
Cayley’s Theorem, 19
Cayley-Hamilton theorem, 208, 209
Cayley numbers, 266
central division algebra, 263, 265
centralizer of element, 248, 259
centralizer subgroup, 259
centre of
algebra, 261
C[G], 245
group, 37, 248
ring, 259
change-of-basis matrix, 209
character, 170, 227, 229
of abelian groups, 225
of direct sum, 230
of RG , 230
of RX , 230
of symmetric groups, 240
orthogonality relations, 233
table, 231
theory, 225
value, 243
characteristic
of field, 91, 96
polynomial, 198, 208, 228
chemistry, 6, 212
class function, 227, 229, 235
classical algebra, 90
classification, 26, 27, 36, 39
of finite simple groups, 41
of finitely generated abelian groups, 43,
197
of finitely generated modules, 180
codomain, iii
column operation, 195
column orthogonality, 234, 237, 239
combinatorialist, 228
combinatorics, 90, 212
commutative, 4, 8, 9, 26, 34, 53
diagram, 192
ring, 53, 250
companion matrix, 206
complex analysis, 111
composition
factor, 50
quotients, 50
series, 50
computation theory, 195
congruence (mod k), v, 19
congruence class, v, 8
congruent, 19
conjugacy class, 37, 218, 227, 228, 238, 246,
247
conjugate, 141, 244
conjugate to a subgroup, 249
conjugation, 36, 244
conspicuous consumer, 65
constructibility
of n-gons, 125, 158
of 17-gon, 158
of pentagon, 158
constructible point, 101
coordinatewise, 179
correspondence, one-to-one, 13
cotangent space, 256
countable, iv, 112
countable dimensional vector space, 96
cubics, 142, 173–175
discriminant, 175
cycle, 1
cyclic
F [x]-module, 207
group, 18
quotient group, 37
subgroup, 19
cyclotomic polynomial, 124, 158, 258
cylinder, 255
decomposable G-set, 224
decomposition of G-module into irreducibles,
218
Dedekind, Richard, 170
degree, 67, 232
of a over F , 93
of K over F , 96
of an extension, 96
of iterated extension, 98, 156
of representation, 213
over the prime field, 96
Desargues’ theorem, 260
det(xI − A), 208
determinant, 29, 34, 196
diagram chasing, 256
diagonal matrix, 29, 48
diagonalizable, 210
differential equations, 6, 212
differential geometry, 255
dim(V ), 95
dimension, 95
of M ⊗ N , 250
of representation, 213
direct product, 24
of cyclic groups, 43
direct sum, 43
decomposition, 209
of G-matreps, 215
of G-modules, 215
Dirichlet’s theorem, 156
discrete linear order, 182
discriminant, 174, 175
disjoint cycles, 1, 12
distributive law, 54
distributivity, 53
divisibility, 244
of ring elements, 70
division algebra, 250, 262
over C, 262
over R, 260, 263
division algorithm, 67, 182
non-uniqueness, 76
division ring, 53, 258, 261, 262
finite, 258–260
divisor, 69
dog on leash, 118
domain, iii
doubling the cube, 103
dual basis, 256
dual space, 255
Edmonton, 9
effective method, 194, 195
eigenspace, 208, 216, 217
eigenvalue, 12, 208, 220, 228
Eisenstein’s criterion, 83, 100, 122, 126
elementary
divisor, 188
form, 197, 205
from invariant factors, 197
of a matrix, 203
operations, 191
symmetric polynomial, 87, 146
epimorphism, 29
equivalence class, v, 20
equivalence relation, v, 16, 20, 21, 31, 71,
76, 178, 198
equivalent matreps, 214
Euclidean algorithm, 72
Euclidean domain, 75
Euclideanizable domain, 74, 180, 191
Euler, Leonhard, iii, 23
Euler’s function, 9, 120
evaluation, 229
extension
field as vector space, 96
of a ring, 61
Galois, 149
iterated quadratic, 143
normal, 136, 149
normal but not Galois, 149
not simple nor splitting field, 149
radical, 143
radical and normal, 143
separable, 149, 163
simple, 164
extension principles, 85, 86, 225
external direct sum, 179
exterior power, 249
F -algebra, 106
F × , finite subgroups, 120
factorization, 72
of xpn − x, 122
faithful representation, 212
families of simple groups, 40
Feit-Thompson theorem, 248
Fermat, Pierre de, 23, 76
Fermat prime, 125, 158
Fermat’s last theorem, 212
field, 36, 53
additive group structure, 96
algebraic closure, 110, 115
automorphism, 129
characteristic, 91, 96
extension, 61
existence
of field with all roots, 105, 114
of roots in general, 104
finite, structure of, 120
fixed, 148, 150
full of eµ ’s, 98
generated by a, 92
generated by {a1 , · · · , an }, 99
intermediate, 129
map, 92
of algebraic numbers, 112
of constructible numbers, 125
of constructible points, 125, 158
of fractions, 76
of rational functions, 78
of real algebraic numbers, 162
prime, 91
real closed, 162
splitting, 104
theory, 91
tower of, 99, 103
finite
cyclic, 18
dimensional, 178
dimensional algebra, 262
division ring, 262
extension, 96
fields, structure, 120
order, 46
simple group, 248
soluble group, 50
finitely generated
abelian group, 43, 197, 242
group, 43
R-module, 242
finitely presentable, 195
first isomorphism theorem
groups, 33
rings, 57
modules, 253
fixed field, 148, 150
formal derivative, 118
free
abelian group, 46, 224
commutative ring, 224
G-module, 224
group, 39, 47, 224
module, 224, 252
Frobenius, Georg, 225, 228, 240, 258
Frobenius automorphism, 159
Frobenius’ theorem, 262, 263, 266
fundamental group, 39
fundamental theorem of (19th century) algebra, 110, 142, 161
fundamental theorem of algebra, 76, 110
Galoisienne proof, 161, 162
homotopy theoretic proof, 117
G-invariant subspace, 215
G-map, 219
G-matrep, 212, 213
G-module, 212, 213, 249
G-set, 213, 227
G-submodule, 215
Galois conjugate, 244
Galois correspondence, 148, 150, 161, 169
for Fp12 , 160
in any characteristic, 163
examples, 153–155
Galois, Evariste, 129
Galois extension, 139, 149, 164
Galois group, 110, 129
as a subgroup of Sk , 131
conjugacy of subgroups, 152
finiteness, 130
of cubic, 176
of general equation of degree n, 145
of specific polynomials
x12 − 1, 133
x5 − 1, 132
xn − 1, 132, 134
x3 − 2, 131, 135
x4 − 2, 155
xn − λ, 135, 140
x4 − 2x − 2, 158, 159
(x2 − 2)(x2 − 3), 154
order, 134, 147, 150, 159, 164
problem of realizing abstract group, 132,
157
realizing Sn , 157
realizing any abelian group, 132, 157
realizing any soluble group, 157
soluble, 168
transitivity of, 158
Galois theory, 36, 91, 106, 111, 129, 213
fundamental theorem, 127, 150, 165
Gauss, Carl, 111, 113, 124, 125
Gauss’ Lemma, 81, 126
Gauss’ theorem, 81, 158
Gaussian domain, 73
Gaussian integers, 75
GCD, 30, 59, 61, 71, 119, 196
general commutative law, 5, 62
general distributive law, 62
general equation of degree n, 145
non-solvability by radicals, 146
general linear group, 14, 30
generalized associative law, 5, 51, 62
generating set for a vector space, 95
generators and relations, 39, 190–192, 198,
200
for abelian groups, 48
for Sn , 40
geometrical constructions, 143
geometry, 6, 212
greatest common divisor, 71
greatest lower informative bound, 17
ground ring, 256
group, 6
abelian, 43
action, 36
alternating, 18
centre of, 37
cyclic, 18
finitely generated, 43
free abelian, 46
free, 39, 47
fundamental, 39
Galois, 110, 129
general linear, 14, 30
isomorphism, 13
morphism, 29
multiplication table, 15
of order p2 q, 38
of order 2p, 36
of order p2 , 38, 247
of order p3 , 38, 247
of order pq, 36
of prime order, 25
of small order, 25
order, 9, 21
order of element in, 11, 21
quotient, 32
representations, 212
simple, 41
solubility of 2-groups, 161
solubility of p-groups, 37
soluble, 37, 129, 140
special linear, 30
sporadic simple, 41
structure on G/N , 32
Sylow, 161
torsion abelian, 47
words, 39
group algebra, 228, 244, 245, 261
centre of, 245
complex, 213
groupiness, 24
harmonic analysis, 212
hermitian inner product, 232
Hilbert, David, 157, 260
Hilbert’s Nullstellensatz, 81
Hilbert’s Theorem#90, 170
HomG functor, 220
homomorphism, 29
homotopy, 117
ideal, 56, 179
finitely generated, 194
left, 56
principal, 69
right, 56
two-sided, 56
vs. submodule, 184
idempotent, 210
identification, 65
identity
element, 6, 9, 53
of R[x], 64
map, 16, 34
permutation, 1
Imφ, 30
image of a soluble group, 38
indecomposable, 49
G-set, 224
index, 276
of book, 268-282
of subgroup, 21
indexed basis, 98
induced representation, 249
induction, ii, 1
infinite cyclic, 18, 39
infinite extension, 96
injective, iii, 29
inner product, 251
G-invariant, 249
integral
combination, 46
domain, 57
extension, 243
over R, 242
over Z, 242
intermediate field, 129
finitely many, 164
number of, 128, 149
internal direct product, 27, 43
internal direct sum, 43
of G-modules, 216
intersection
of ideals, 72
of subgroups, 17
of subrings, 62
invariant factor, 43, 60, 180, 193
decomposition, 48
form, 196, 197, 205
from elementary divisors, 197
of a matrix, 198, 203
of N (A) , 203
invariant subspace, G-, 215
inverse, 6, 9, 53
of gN , 32
of permutation, 1
of product, 10
invertible, 7, 53, 58
in R[x], 67
in R[x1 , . . . , xk ], 85
matrix, 210, 212
irreducible
character, 227
G-module, 215
abelian G, 222
irreducibles in commutative ring, 72
existence in Q[x], 122
existence in Zp [x], 122
in C[x], 76
in R[x], 76
irreps, 216
dimension divides order, 246
number of, 218, 227, 228, 238
of abelian group, 223, 239
squared dimensions of, 218, 223
isomorphic, 13, 15
G-actions, 213
G-matreps (equivalent), 214
G-modules, 213
isomorphism, 13, 14, 29
extension properties, 137, 138
of modules, 178
of vector spaces, 95
theorem, 1st, 33
theorems, 2nd etc., 35
isotypical component, 226, 227, 241
projection to, 241
iterated field extension, degree of, 98, 156
iterated quadratic extension, 143
Jacobson, Nathan, 260
Jordan block, 207
Jordan canonical form, 198, 206
Jordan form, 208
Jordan-Holder Theorem, 50
juxtaposition, 9, 53
Kerφ, 33
kernel, 30
Kervaire, Michel, 266
kindergarten, 53, 63, 101
Klein 4-group, 15
Kronecker/Weber, Leopold/Heinrich, 157
Krull-Schmidt Theorem, 49
Lagrange’s theorem, 21, 35
lattice, 167
LCM, 27, 71, 196
least common multiple, 71
left coset, 21
Lie algebra, 261
line integral, 111
linear, 95, 213
φ–linear map, 109
action, 36
algebra, ii, 177, 198
combination, 94, 245, 252
independence, 95, 245
of automorphisms, 170
map, 94
operator, 199, 209, 213
pigeon-hole principle, 109
transformation, 7, 178, 198
linearity, 129
linearly independent, 95
Liouville, Joseph, 113
manifold, 255
map, G-, 219
mapping property, 225, 250, 255
matrep, 212, 213
equivalent, 214
matrix, 7, 48, 54
multiplication, 7
representation (matrep), 212
theory, 177
Maschke averaging trick, 219, 233, 249
Maschke’s theorem, 218, 234
matrix group, 212
maximal algebraically independent set, 162
maximal ideal, 78
existence, 80
in C(X), 81
in a PID, 79
Milnor, John, 266
minimal polynomial, 97, 198, 208
of root of unity, 124
module, 177, 250
cyclic, 178
direct sum, 179
elementary divisor form, 197
finitely generated, 178, 194
G-, 212
generators, 178
invariant factor form, 196
isomorphism, 178
morphism, 178
over group algebra, 213
over PID, 177
quotient, 178
theory, 213
Moebius inversion, 122, 124
monomorphism, 29, 30
morphism
composition of, 16
extension principles, 85
image of, 30, 56
kernel of, 56
of algebras, 261
of division algebras, 262
of fields (field map), 92
of groups, 29
of modules, 178, 251
of rings, 56
multiplication
in quotient ring, 57
on G/N , 32
table, 15
multiset, iv, 205
nabobery, 142
nattering nabobery, ii
Nazarov, Maxim, 240
negative, 9
nilpotent, 210
Noetherian ring, 194
non-associative algebra, 261
over R, 266
non-commutative algebraic R-division algebra, 265
non-constructible number of degree four, 104,
158
non-separable extension, example, 164
non-singular matrix, 7, 54
non-solubility of Sn , 40, 42
non-soluble group, 50
non-solvability of polynomial equations, 91,
146, 148
norm, 169
normal
extension, 136, 149
in any characteristic, 139
of prime degree, 169
non-transitivity, 139
vs. subgroup, 150
closure, 167
subgroup, 32, 248
not-necessarily-associative division algebra,
266
number theory, 212
operation-preserving actions, 36
operations on R[x], 64
orbit, 227
order of
finite field, 120
element, 11, 21
elements in Sn , 148
finite non-abelian simple group, 41
group, 9, 21, 218, 223
ordered field, 162
orthogonal
basis, 232
complement, 249
matrix, 7
transformation, 7
orthogonality relations, 233
orthonormality, 170
p-component, 47
p-group, 37
p-primary decomposition, 48
partially symmetric polynomial, 90
partitioning, 21
Pappus’ theorem, 260
partial fractions, 97
partitions, 228
permutation, 1, 12
even/odd, 2, 4
physics, 6, 212, 225, 255
PID, 70, 180, 188, 194
pointwise multiplication, 240
polynomial, 54, 63
g ∗ (x), corresponding to g(x), 107
function, 63, 122
ring, 63
ring in k variables, 84
with Galois group S4 , 158
separable, 163
polynomials in several variables, 83
powers, 10
presentation matrix, 190
prime ideal, 78, 92, 94
in a PID, 79
prime subfield, 91
primitive, 81
element, 126, 147
roots of unity, 120, 124, 169, 258
principal ideal, 69, 70
domain, 70
probability theory, 212, 225
product map, 51
product of representations, 249
projective
characters, 240
geometry, 260
plane, 266
quadratic form, 266
quartics, 142
quaternion, 54
quaternion group, 27, 226
quaternionic
conjugate, 55
conjugation, 266
inverse, 56
quaternions, uniqueness, 260
quotient
and remainder, 68
group, 32
module, 178
of a soluble group, 38
ring, 57
R-algebra, 53
R-module, 177, 242
r/c operations, 198
radical extension, 143, 168, 172
radicals
one solution by, 145
solvable by, 141
rank, 47
rational canonical form, 198, 206
rational function, 78
real algebraic numbers, 162
real closed field, 162
reflection, 34
regular n-gon, 34, 103
regular representation, 222, 230, 232
relative consistency, 64
remainder theorem, 68
repeated roots, 118
representations, 212
applications to number theory, 249
faithful, 212
groups of order p2 , 247
groups of order p3 , 247
induced, 249
lifting, 241
of direct product, 241
of finite abelian groups, 225
of infinite groups, 249
of particular groups
C2 , 216
C3 , 217
C4 , 217
D2 , 217
D4 , 226
Qrn, 226
S3 , 217, 218
S4 , 240
S5 , 241
over general fields of characteristic zero,
249
regular, 222
restriction, 241
sign, of S3 , 217
of Sn , 241
standard, of S3 , 217 , 224, 232
of Sn , 241
tensor product, 241
trivial, 216
unitary, 249
representation theory, 41
resolvent, 159
ring, 53
associates in, 70
centre of, 259
commutative, 56, 250
free, 224
divisibility in, 70
division, 53, 258, 262
extension, 61
first isomorphism theorem, 57
Gaussian integers, 75
generated by s, 61
generated by {s1 , . . . , sk }, 83
ideal in, 56
integral domain, 57
morphism, 56
Noetherian, 194
of class functions, 227, 238
of polynomial functions, 66
of polynomials, in one variable, 54, 63
of polynomials, in several variables, 84
semi-ring, 240
of symmetric polynomials, 86
PID, 70
quotient, 57
subring, 56
UFD, 73
vs. Z-algebra, 53
rinky-dink process, 204
root, 68
existence
of field with all roots, 105, 114
of roots in general, 104
of polynomial, 36
of unity, 12, 243, 244
roots of unity, sum of, 230, 243
rotation, 34, 250
routine verification, 186
row/column
equivalent, 209
operations, 48, 195
row orthogonality, 233, 234
Ruffini, Paolo, 146
rules of algebra, 64
RV, 177
Šafarevič, Igor, 157
scalar matrix, 248
scalar multiplication, 94, 177, 198
Schur’s lemma, 220, 222, 227, 235, 237, 249
Schur, Issai, 225, 240
semi-ring, 240
separability, 149, 163
separable element, 163
Shakespeare, William, 228
sign, 3, 4, 29, 90
representation of S3 , 217
of Sn , 241
similar, 198
similarity
class of matrix, 198
of matrices, 177, 198
‘problem’ of, 199
simple
algebraic extension, 93
extension, 92, 126
group, 41, 247
transcendental extension, 93
simplicity of An , 40
skewfield, 53
smallest simple group, 40
Smith normal form, 177, 194, 195, 208
reduction to, 194
smoothness, 255
solubility, 37, 50
implies solvability, 168, 172
of S4 , 40
of 2-groups, 161
of p-groups, 37
soluble, 37, 129, 140, 248
solvability, 38
solvable by radicals, 129, 135, 141, 168
over C, 142, 147
over R, 142, 147
solvable implies soluble, 142
space of class functions, 238
spanning set, 178
Specht, Wilhelm, 240
special linear group, 30
sphere, 255
splitting, 28
into direct sum, 219
splitting field, 104, 136
existence, 105
for (F, S), 114
for infinitely many polynomials, 114
uniqueness, 106, 109
sporadic simple groups, 40
squaring the circle, 103, 112
squareness of character table, 233, 237, 246
stabilizer subgroup, 227
standard representation of S3 , 217, 224, 232
of Sn , 241
straight-edge & compass constructions, 100,
125
structure of
finite abelian groups, 43-51, 120, 158
finite extensions, 99
finitely generated abelian groups, 43-51
finitely generated modules, 180–190
simple algebraic extensions, 93, 99
structure theory of
groups, 25-27, 38, 91, 212
abelian groups, 38, 43-51
subfields
of C isomorphic to R, 162
of Fpn , 122
subgroup, 16
fixing a subset, 36
generated by a subset, 17
of a finitely generated abelian group, 51
of index two, 34
submodule, 178
G-, 215
vs. ideal, 184
subnormal series, 38, 39
subring, 56
generated by s, 62
subspace, 178
substitution : R[x] → M ap(R, R), 66
sum of squares condition, 246
surjective, iii, 29
surface, 255
Sylow, Ludvig, 35, 213
subgroup, 161, 248
theorems, 35, 38
symmetric
function, 90
polynomial, 86
power, 249
symmetry, 7
tangent space, 255
tensor, 255
field, 255
product, 106, 226, 240, 250, 251
associative law, 254
basis recipe, 254, 256
commutative law, 254, 255
of cyclic PID-modules, 255
of fields, 106
of representations, 241
distributive law, 254
existence, 252
generating set, 254
uniqueness, 252
theory of abelian groups, 38
topological, 111
topology, 6, 81, 212
torsion
abelian group, 47
subgroup, 46
submodule, 186
-free, 46
torus, 255
total degree, 88
tower of groups, 37
tower of fields, 99, 103
trace, 228
transcendental extension, 94, 97
transcendental element, 61, 84
transcendentality
by Cantor’s argument, 112
of specific numbers, 113
transfinite extension principle, 163
transpose, 210
transposition, 1, 3, 6, 12
transpositions, product of, 4
triangle inequality, 244
trisection of angles, 101, 103
trivial
morphism, 30, 34
G-matrep, 216
G-module, 216
UFD, 73, 188
UFDness
failure, 75
of F [x], 73
of F [x1 , . . . , xk ], 85
of any Euclideanizable domain, 74
of any PID, 74
of Z, 74
uncountable dimension, 96
unidentified flying domain, 73
union of subgroups, 17
unique factorization, 70, 177
domain, 73
uniqueness of
field generated by a root, 104
identity element, 7
inverses, 7
quaternions, 56, 260
splitting fields, 121, 255
unit, 53
unitary group, 249
unitary representation, 249
universal property, 250, 251, 253
unsolvable by radicals
over Q, 147
general equation of degree n, 145
specific example, 148
vector field, 255
vector space, 36, 94, 177, 250
Wedderburn, Joseph Henry Maclagan, 258–
260
well-defined, 31
Wiles, Andrew, 212
word in generators, 39
Young, Alfred, 240
Z-algebra, 260, 261
vs. ring, 53
Z-module, 197
Z×k as a direct product of cyclic groups, 60
zero, 9, 68
divisor, 58
representation, 216
Zorn’s lemma, 80, 116, 163