Math 3070 Course Notes Rob Noble October 16, 2014
Contents

1 Mathematical Induction and the Least Integer Principle
2 Integers
3 Unique Factorization
4 Linear Diophantine Equations
5 Congruences
6 Linear Congruences
7 Fermat's and Wilson's Theorems
8 The Divisors of an Integer
9 Perfect Numbers
10 Euler's Theorem and Function
11 Primitive Roots
12 Quadratic Congruences
13 Quadratic Reciprocity
14 Pythagorean Triangles
15 Infinite Descent and Fermat's Conjecture
16 Sums of Squares
17 $x^2 - Ny^2 = 1$

Chapter 1
Mathematical Induction and the Least Integer Principle

Many times in what follows, we will invoke the principle of mathematical induction or the least integer principle to prove results. These principles are equivalent, and are used extensively, sometimes implicitly, in mathematics. There are two equivalent forms of the principle of mathematical induction. These two forms are given below.

Lemma 1 (Principle of Mathematical Induction (First Form)). Let S be a set of integers. If S contains some integer m and is such that

    for all integers n ≥ m, if n ∈ S then n + 1 ∈ S,

then S contains all integers greater than or equal to m.

Lemma 2 (Principle of Mathematical Induction (Second Form)). Let S be a set of integers. If S contains some integer m and is such that

    for all integers n ≥ m, if all of m, m + 1, ..., n ∈ S then n + 1 ∈ S,

then S contains all integers greater than or equal to m.

Although the principle of mathematical induction is equivalent to the least integer principle, sometimes it is more natural to apply the latter to prove a particular result. The least integer principle is stated below along with the dual greatest integer principle.

Lemma 3 (Greatest and Least Integer Principles). Let S be a set of integers. If S is nonempty and bounded above (resp. below) then S has a greatest (resp. least) element.
As stated above, the two forms of the principle of mathematical induction and the least integer principle are equivalent. This forms the content of the following proposition.

Proposition 1. The following statements are equivalent.

1. If S is a set of integers containing some integer m such that

    for all integers n ≥ m, if all of m, m + 1, ..., n ∈ S then n + 1 ∈ S,

   then S contains all integers greater than or equal to m.

2. If S is a nonempty set of integers that is bounded below, then S contains a least element.

We close this section with an example that illustrates the use of these equivalent fundamental concepts. We will then turn to the development of the core material of the course.

Example 1. Use one of the two forms of mathematical induction or the least integer principle to prove the following statements.

(a) For all integers n ≥ 1 we have
\[ 1^3 + 2^3 + \cdots + n^3 = \frac{n^2(n+1)^2}{4}. \]

(b) The Fibonacci numbers $\{f_0, f_1, f_2, \ldots\}$ are defined recursively by
\[ f_0 = 0, \quad f_1 = 1, \quad f_{n+2} = f_{n+1} + f_n \quad (n \ge 0). \]
For all n ≥ 0 we have
\[ f_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n}. \]

(c) For every positive integer n there exist integers q and r such that 0 ≤ r ≤ 3 and n = 4q + r.

We will see later that part (c) above is a special case of what is called the division algorithm, which makes rigorous the well-known process of division with remainder.

Solution. The first form of the principle of mathematical induction will be the most natural to use for part (a), whereas the second form of the principle of mathematical induction and the least integer principle will prove to be the most natural to apply to parts (b) and (c), respectively.

We start with part (a). Let S be the set of all integers greater than or equal to 1 for which the result holds. That is,
\[ S = \left\{ n \ge 1 \;\middle|\; 1^3 + 2^3 + \cdots + n^3 = \frac{n^2(n+1)^2}{4} \right\}. \]
We need to prove that S contains all integers greater than or equal to one, and, by the first form of the principle of mathematical induction, it will be sufficient to show that 1 ∈ S and that for any integer n ≥ 1, if n ∈ S then n + 1 ∈ S. Well, 1 is certainly in S since
\[ 1^3 = 1 = \frac{4}{4} = \frac{1^2(1+1)^2}{4}. \]
Suppose then that for some integer n ≥ 1 we have n ∈ S. We need to prove that under this assumption n + 1 ∈ S as well. To this end, we compute
\begin{align*}
1^3 + 2^3 + \cdots + n^3 + (n+1)^3 &= (1^3 + 2^3 + \cdots + n^3) + (n+1)^3 \\
&= \frac{n^2(n+1)^2}{4} + (n+1)^3 \qquad \text{(since we are assuming } n \in S\text{)} \\
&= (n+1)^2\left(\frac{n^2}{4} + n + 1\right) \\
&= (n+1)^2\,\frac{n^2 + 4n + 4}{4} \\
&= (n+1)^2\,\frac{(n+2)^2}{4} \\
&= \frac{(n+1)^2((n+1)+1)^2}{4}.
\end{align*}
We have shown that n + 1 ∈ S so that we can conclude by the first form of the principle of mathematical induction that S contains all integers greater than or equal to one. Thus, for all integers n ≥ 1, we have
\[ 1^3 + \cdots + n^3 = \frac{n^2(n+1)^2}{4}, \]
as claimed.

We turn now to the proof of part (b). We will use the second form of the principle of mathematical induction. Suppose then that S is the set of all integers greater than or equal to 0 for which the result holds. That is,
\[ S = \left\{ n \ge 0 \;\middle|\; f_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n} \right\}. \]
We need to show that S contains all integers greater than or equal to zero. By the second form of the principle of mathematical induction it is sufficient to show that 0, 1 ∈ S and that for all n ≥ 1, if 0, 1, ..., n ∈ S, then n + 1 ∈ S as well. First of all, 0 ∈ S since
\[ f_0 = 0 = \frac{1}{\sqrt{5}} - \frac{1}{\sqrt{5}} = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{0} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{0}, \]
and 1 ∈ S since
\begin{align*}
\frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{1}
&= \frac{1}{\sqrt{5}}\left(\frac{(1+\sqrt{5}) - (1-\sqrt{5})}{2}\right) \\
&= \frac{1}{\sqrt{5}} \cdot \frac{2\sqrt{5}}{2} \\
&= 1 = f_1.
\end{align*}
Suppose then that we have an integer n ≥ 1 for which 0, 1, ..., n ∈ S. We need to show that this assumption implies that n + 1 ∈ S as well.
Well, we are assuming that n and n − 1 are both in S so that
\[ f_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n}, \]
and
\[ f_{n-1} = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n-1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n-1}. \]
Therefore
\begin{align*}
f_{n+1} &= f_n + f_{n-1} \\
&= \left[\frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n}\right] + \left[\frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n-1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n-1}\right] \\
&= \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n}\left(1 + \frac{2}{1+\sqrt{5}}\right) - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n}\left(1 + \frac{2}{1-\sqrt{5}}\right).
\end{align*}
But
\begin{align*}
1 + \frac{2}{1+\sqrt{5}} &= 1 + \frac{2(1-\sqrt{5})}{(1+\sqrt{5})(1-\sqrt{5})} \\
&= 1 - \frac{2(1-\sqrt{5})}{4} \\
&= 1 - \frac{1-\sqrt{5}}{2} \\
&= \frac{1+\sqrt{5}}{2}.
\end{align*}
Similarly, we have
\[ 1 + \frac{2}{1-\sqrt{5}} = \frac{1-\sqrt{5}}{2}. \]
We conclude that
\begin{align*}
f_{n+1} &= \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n}\left(1 + \frac{2}{1+\sqrt{5}}\right) - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n}\left(1 + \frac{2}{1-\sqrt{5}}\right) \\
&= \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n}\left(\frac{1+\sqrt{5}}{2}\right) - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n}\left(\frac{1-\sqrt{5}}{2}\right) \\
&= \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n+1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n+1}.
\end{align*}
We have therefore shown that n + 1 ∈ S so that S contains all integers greater than or equal to 0, as required.

Finally, we turn to the proof of part (c). We will apply the least integer principle. Let n ≥ 1 be an integer. Define
\[ S = \{\, n - 4q \ge 0 \mid q \text{ is an integer} \,\}. \]
Then S is nonempty since it contains n = n − 4(0). Also, since every member of S is greater than or equal to zero, we see that S is bounded below. By the least integer principle we conclude that S has a least element r. But then r = n − 4q for some q so that n = 4q + r. In order to complete the proof, we need to show that 0 ≤ r ≤ 3. But this must be the case since r ≥ 0 (since it is a member of S), and if r ≥ 4, then we could write
\[ n = 4(q+1) + (r-4), \]
where 0 ≤ r − 4 < r. This would imply that r − 4 = n − 4(q + 1) ∈ S, which would contradict the minimality of r. By contradiction, we conclude that 0 ≤ r ≤ 3, as required.

We will get plenty of additional practice applying the fundamental principles introduced in this section in what follows.

Chapter 2
Integers

This chapter is based on [Dud08, §1].

We denote the set of integers by ℤ, the set of natural numbers, or positive integers, by ℕ, and the set of nonnegative integers by ℕ₀. That is,
\begin{align}
\mathbb{Z} &= \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}; \tag{2.1} \\
\mathbb{N} &= \{1, 2, 3, \ldots\}; \tag{2.2} \\
\mathbb{N}_0 &= \{0, 1, 2, 3, \ldots\}. \tag{2.3}
\end{align}

Definition 1. Let a, b ∈ ℤ. We say that a divides b, written a | b, if and only if there exists d ∈ ℤ for which b = ad. If a does not divide b, we write a ∤ b.

Example 2. We have 2 | 6, 12 | 60, 17 | 17, −5 | 50, and 8 | −24, but 4 ∤ 2 and 3 ∤ 4.

Proposition 2. The relation | satisfies the following properties:

(i) It is transitive. That is, for all a, b, c ∈ ℤ, if a | b and b | c, then a | c.

(ii) For integers $d, a_1, \ldots, a_n, c_1, \ldots, c_n$, if $d \mid a_1, d \mid a_2, \ldots, d \mid a_n$, then $d \mid (c_1 a_1 + \cdots + c_n a_n)$.

Proof. Let $a, b, c, d, a_1, \ldots, a_n, c_1, \ldots, c_n \in \mathbb{Z}$.

(i) Suppose that a | b and b | c. Then, there exist integers e and f such that b = ae and c = bf. But then, we have
\[ c = bf = (ae)f = a(ef). \]
Setting g = ef, we see that c = ag for some integer g so that a | c, as required.

(ii) Suppose that $d \mid a_1, d \mid a_2, \ldots, d \mid a_n$. Then, there exist integers $b_1, \ldots, b_n$ such that $a_1 = db_1, a_2 = db_2, \ldots, a_n = db_n$. But then, we see that
\begin{align*}
c_1 a_1 + \cdots + c_n a_n &= c_1(db_1) + \cdots + c_n(db_n) \\
&= d(c_1 b_1) + \cdots + d(c_n b_n) \\
&= d(c_1 b_1 + \cdots + c_n b_n) \\
&= dh,
\end{align*}
where we have set $h = c_1 b_1 + \cdots + c_n b_n$. Since we have found an integer h such that $c_1 a_1 + \cdots + c_n a_n = dh$, we conclude that $d \mid (c_1 a_1 + \cdots + c_n a_n)$, as required.

Definition 2. Let a, b ∈ ℤ where at least one of a, b is nonzero. The unique integer d satisfying the pair of conditions:

(i) d | a and d | b;

(ii) for any integer c such that c | a and c | b, we have c ≤ d,

is called the greatest common divisor of a and b and is denoted by (a, b) or gcd(a, b).

Implicit in this definition is that such an integer d exists. In order to prove that this is the case, we need to invoke the greatest integer principle stated, along with its dual the least integer principle, in Lemma 3.

Proposition 3. Let a, b ∈ ℤ not both be zero.
Then, the greatest common divisor of a and b is well defined and at least 1. That is, there exists a unique integer d satisfying the conditions (i) and (ii) of Definition 2 and this d is greater than or equal to one.

Proof. Let S be the set of all common divisors of a and b. That is,
\[ S = \{\, c \in \mathbb{Z} \mid c \mid a \text{ and } c \mid b \,\} \subseteq \mathbb{Z}. \]
Since 1 | a and 1 | b, we see that 1 ∈ S so that S ≠ ∅. Further, since any common divisor of a and b is bounded above by max{|a|, |b|} (a divisor of a nonzero integer cannot exceed its absolute value, and at least one of a, b is nonzero), we see that S is bounded above. By the greatest integer principle, S has a largest element d. But then, since d ∈ S, we see that d | a and d | b, and since it is the largest element of S, any integer c for which c | a and c | b must satisfy c ≤ d. Further, since S cannot have two largest elements, we see that d is unique. Finally, since 1 ∈ S, we conclude that d ≥ 1.

Theorem 1. Let a, b ∈ ℤ not both be zero, and let d = (a, b). Then (a/d, b/d) = 1.

Proof. Assume the hypotheses. From Proposition 3 we know that (a/d, b/d) ≥ 1. We complete the proof by showing that (a/d, b/d) ≤ 1. To this end, let g = (a/d, b/d). Since g | a/d and g | b/d, there exist integers u and v such that
\[ \frac{a}{d} = gu, \qquad \frac{b}{d} = gv. \]
Therefore,
\[ a = (gd)u, \qquad b = (gd)v. \]
We conclude that gd is a common divisor of a and b. Since d is the greatest common divisor of a and b, we conclude that gd ≤ d. Finally, since d > 0, we can divide by d to conclude that g ≤ 1, as required.

Definition 3 (Relatively Prime). Let a, b ∈ ℤ not both be zero. We call a and b relatively prime provided (a, b) = 1.

If a and b are large integers, it is impractical to find their greatest common divisor by trial division. The Euclidean Algorithm provides us with an efficient, systematic way of determining greatest common divisors. First, we need to make use of the division algorithm, which relies on the archimedean principle.

Theorem 2 (The Archimedean Principle For Integers). Given any integers a and b with b ≠ 0, there exist integers u and v such that
\[ a \le bu, \qquad a \ge bv. \]
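Before moving on, Definition 2 and Theorem 1 can be checked numerically on small inputs. The following is a minimal Python sketch and is not part of the original notes; the function names `common_divisors` and `gcd_by_definition` are illustrative choices. It finds the greatest common divisor by exactly the trial division that is impractical for large integers, so it is meant only as a sanity check on small examples.

```python
def common_divisors(a, b):
    """Positive common divisors of a and b (not both zero), found by trial division."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    bound = max(abs(a), abs(b))  # no common divisor can exceed this bound
    return [c for c in range(1, bound + 1) if a % c == 0 and b % c == 0]


def gcd_by_definition(a, b):
    """Greatest common divisor in the sense of Definition 2: the largest common divisor."""
    return max(common_divisors(a, b))


d = gcd_by_definition(343, -280)
print(d)                                        # greatest common divisor of 343 and -280
print(gcd_by_definition(343 // d, -280 // d))   # Theorem 1 predicts the value 1
```

Only positive candidates are tested because the largest common divisor is at least 1 by Proposition 3, so negative common divisors never affect the maximum.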
The division algorithm referred to above is given by the following theorem.

Theorem 3. Let a, b ∈ ℤ, with b ≠ 0. There exist unique integers q and r, with 0 ≤ r < |b|, such that a = bq + r.

Proof. Assume the hypotheses and let
\[ S = \{\, a - bq \in \mathbb{N}_0 \mid q \in \mathbb{Z} \,\}. \]
We will see that S is a nonempty set of integers that is bounded below. The least integer principle will then provide us with a least element r, and we'll see that this r together with the q for which r = a − bq satisfy the requirements of the theorem.

By the archimedean principle, there exists an integer q such that a ≥ bq. But then a − bq ≥ 0 so that a − bq ∈ S. S is therefore nonempty. Since every element of S is nonnegative, we see that S is bounded below by 0. By the least integer principle, S has a least element r = a − bq. Then a = bq + r. Further, we see that r = a − bq ≥ 0 since r ∈ S. We conclude the proof of the existence of integers q and r satisfying the conditions of the theorem by proving that r < |b|. But this must be the case since if r ≥ |b|, and |b| = bε for ε = ±1, we could write
\[ a = b(q + \varepsilon) + (r - b\varepsilon), \]
where r > r − bε = r − |b| ≥ 0. This would imply that r − bε = a − b(q + ε) ∈ S, thereby contradicting the minimality of r. Therefore, we have 0 ≤ r < |b|.

We have proved that there is at least one pair of integers q, r satisfying the conditions of the theorem. We now complete the proof by showing that this pair of integers is unique. To this end, suppose that the pairs $q_1, r_1$ and $q_2, r_2$ both satisfy the conditions of the theorem. Then, we have
\begin{align}
a &= bq_1 + r_1, \qquad 0 \le r_1 < |b|; \tag{2.4} \\
a &= bq_2 + r_2, \qquad 0 \le r_2 < |b|. \tag{2.5}
\end{align}
Subtracting (2.4) from (2.5) yields
\[ b(q_2 - q_1) = r_1 - r_2. \tag{2.6} \]
Now, (2.6) implies that $r_1 - r_2$ is a multiple of b. However, we have $-|b| < r_1 - r_2 < |b|$. Since the only multiple of b in this range is 0, we conclude that $r_1 - r_2 = 0$. But then, (2.6) reads $b(q_2 - q_1) = 0$, and since b ≠ 0, we can conclude that $q_2 - q_1 = 0$.
We have therefore shown that $r_1 = r_2$ and $q_1 = q_2$ so that the pair of integers referred to in the statement of the theorem is indeed unique.

Remark 1. We proved the division algorithm in this fashion since it allowed us to do everything while speaking only of integers. We didn't need to discuss real numbers or rational numbers. However, if we allow ourselves the use of the reals and rationals, one can show that, when b > 0, the unique integers q and r referred to in the statement of Theorem 3 are intimately related to the floor and fractional part of a/b, respectively:
\[ q = \left\lfloor \frac{a}{b} \right\rfloor, \qquad \frac{r}{b} = \left\{ \frac{a}{b} \right\}. \]
Here, the floor ⌊x⌋ of a real number x is the largest integer less than or equal to x while the fractional part {x} of a real number x is the difference x − ⌊x⌋. Therefore, the division algorithm is simply the familiar process of division with remainder.

Theorem 3 together with the following lemma will yield the Euclidean algorithm for computing greatest common divisors.

Lemma 4. Let a, b ∈ ℤ not both be zero. If a = bq + r, for integers q and r, then (a, b) = (b, r).

Proof. Firstly, we note that the greatest common divisors in question are well-defined since there is no problem if b ≠ 0, while if b = 0, then r = a ≠ 0. We will use part (ii) of Proposition 2, which states that a common divisor of integers must divide any linear combination of the integers. Let $g_{ab}$ and $g_{br}$ denote the greatest common divisors of a, b and of b, r, respectively. From a = bq + r together with the fact that $g_{ab}$ divides both a and b, we conclude that $g_{ab} \mid r$. Consequently, $g_{ab}$ divides both b and r so that $g_{ab} \le g_{br}$. Similarly, since $g_{br}$ divides both b and r, and a = bq + r, we see that $g_{br}$ also divides a. But then $g_{br}$ divides both a and b so that $g_{br} \le g_{ab}$. Putting these two inequalities together yields $g_{ab} \le g_{br} \le g_{ab}$ so that $g_{ab} = g_{br}$, as required.

We have arrived at last at the Euclidean algorithm.

Theorem 4 (The Euclidean Algorithm). Let a, b ∈ ℤ with b ≠ 0.
If we define sequences $\{q_0, q_1, q_2, \ldots\}$ and $\{r_{-1}, r_0, r_1, r_2, \ldots\}$ by letting $r_{-1} = |b|$ and then applying the division algorithm successively to obtain
\begin{align*}
a &= bq_0 + r_0, & 0 &\le r_0 < |b| \\
b &= r_0 q_1 + r_1, & 0 &\le r_1 < r_0 \\
r_0 &= r_1 q_2 + r_2, & 0 &\le r_2 < r_1 \\
r_1 &= r_2 q_3 + r_3, & 0 &\le r_3 < r_2 \\
r_2 &= r_3 q_4 + r_4, & 0 &\le r_4 < r_3 \\
&\;\;\vdots & &\;\;\vdots
\end{align*}
there is a first index t ≥ 0 such that $r_t = 0$. The greatest common divisor of a and b is then given by $(a, b) = r_{t-1}$.

Proof. We have the decreasing sequence $|b| > r_0 > r_1 > r_2 > \cdots \ge 0$. Therefore, eventually we obtain a first zero remainder $r_t$. We then have
\begin{align*}
a &= bq_0 + r_0, & 0 &\le r_0 < |b| \\
b &= r_0 q_1 + r_1, & 0 &\le r_1 < r_0 \\
r_0 &= r_1 q_2 + r_2, & 0 &\le r_2 < r_1 \\
r_1 &= r_2 q_3 + r_3, & 0 &\le r_3 < r_2 \\
r_2 &= r_3 q_4 + r_4, & 0 &\le r_4 < r_3 \\
&\;\;\vdots & &\;\;\vdots \\
r_{t-3} &= r_{t-2} q_{t-1} + r_{t-1}, & 0 &\le r_{t-1} < r_{t-2} \\
r_{t-2} &= r_{t-1} q_t.
\end{align*}
Successively applying Lemma 4 yields
\[ (a, b) = (b, r_0) = (r_0, r_1) = \cdots = (r_{t-2}, r_{t-1}) = r_{t-1}, \]
as required.

Corollary 1. Let a, b ∈ ℤ not both be zero and d = (a, b). Then

(i) There exist integers x and y such that d = ax + by.

(ii) Every common divisor of a and b divides d.

Proof. Part (i) follows from running the Euclidean algorithm backwards starting with the second to last equation. The details are left to the reader. For part (ii), we use part (i) to find integers x and y such that d = ax + by and then note that any common divisor of a and b must also divide ax + by = d.

Remark 2. We defined the greatest common divisor as the common divisor that is larger than any other common divisor. Using part (ii) of Corollary 1, we could have, instead, defined the greatest common divisor to be the positive common divisor that is divisible by every common divisor.

Example 3. Use the Euclidean algorithm to calculate (343, −280) and (1578, 442). In each case, express the greatest common divisor as a linear combination of the given integers.

Solution. We compute
\begin{align}
343 &= (-280)(-1) + 63; \tag{2.7} \\
-280 &= 63(-5) + 35; \tag{2.8} \\
63 &= 35(1) + 28; \tag{2.9} \\
35 &= 28(1) + 7; \tag{2.10} \\
28 &= 7(4). \notag
\end{align}
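The chain of divisions above can also be generated mechanically. The following Python sketch is an illustration rather than part of the notes; the name `euclid_steps` is hypothetical, and the remainder is computed as `a % abs(b)` so that it satisfies 0 ≤ r < |b| as in Theorem 3 even when the divisor is negative (Python's built-in `divmod` uses a different sign convention for negative divisors).

```python
def euclid_steps(a, b):
    """Run the Euclidean algorithm of Theorem 4 on a and b (b nonzero).

    Returns (steps, g), where steps lists the successive divisions as tuples
    (dividend, divisor, quotient, remainder) and g is the last nonzero
    remainder, or |b| if the very first remainder is already zero.
    """
    if b == 0:
        raise ValueError("b must be nonzero")
    steps = []
    last_nonzero = abs(b)          # plays the role of r_{-1} = |b|
    while b != 0:
        r = a % abs(b)             # remainder with 0 <= r < |b|
        q = (a - r) // b           # exact division, so a == b*q + r
        steps.append((a, b, q, r))
        if r != 0:
            last_nonzero = r
        a, b = b, r                # shift: next dividend and divisor
    return steps, last_nonzero


steps, g = euclid_steps(343, -280)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = ({divisor})({q}) + {r}")
print("last nonzero remainder:", g)
```

Called on the pair 343, −280, the sketch produces the same quotients and remainders as the divisions (2.7) through (2.10), followed by the final division with zero remainder.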
We conclude that (343, −280) = 7. Running equations (2.7)–(2.10) backwards yields
\begin{align*}
7 &= 35 - 28 && \text{from (2.10)} \\
&= 35 - (63 - 35) && \text{from (2.9)} \\
&= -63 + 2(35) \\
&= -63 + 2(-280 + 5(63)) && \text{from (2.8)} \\
&= 2(-280) + 9(63) \\
&= 2(-280) + 9(343 - 280) && \text{from (2.7)} \\
&= 11(-280) + 9(343).
\end{align*}
Similarly, we compute
\begin{align}
1578 &= (442)(3) + 252; \tag{2.11} \\
442 &= 252(1) + 190; \tag{2.12} \\
252 &= 190(1) + 62; \tag{2.13} \\
190 &= 62(3) + 4; \tag{2.14} \\
62 &= 4(15) + 2; \tag{2.15} \\
4 &= 2(2). \notag
\end{align}
We conclude that (1578, 442) = 2. Running equations (2.11)–(2.15) backwards yields
\begin{align*}
2 &= 62 - 4(15) && \text{from (2.15)} \\
&= 62 - 15(190 - 62(3)) && \text{from (2.14)} \\
&= -15(190) + 46(62) \\
&= -15(190) + 46(252 - 190) && \text{from (2.13)} \\
&= 46(252) - 61(190) \\
&= 46(252) - 61(442 - 252) && \text{from (2.12)} \\
&= -61(442) + 107(252) \\
&= -61(442) + 107(1578 - 442(3)) && \text{from (2.11)} \\
&= 107(1578) - 382(442).
\end{align*}

We close this section with a couple of properties of divisibility in the presence of relative primality.

Proposition 4. The following two statements hold.

(i) If a, b, d ∈ ℤ are such that d | ab and (d, a) = 1, then d | b.

(ii) If a, b, m ∈ ℤ are such that a | m, b | m and (a, b) = 1, then ab | m.

Proof. (i) Assume the hypotheses. From part (i) of Corollary 1, we can find integers x and y such that
\[ 1 = dx + ay. \]
Multiplying by b yields
\[ b = bxd + yab. \]
But then, since d divides itself as well as ab, we see that it also divides bxd + yab = b.

(ii) Assume the hypotheses. Since b | m, there exists an integer q such that m = bq. But then, a | m reads a | bq. Since (a, b) = 1, we can invoke part (i) to conclude that a | q. But then, there is an integer r such that q = ar. We conclude that m = bq = bar = (ab)r, so that ab | m, as required.

Chapter 3
Unique Factorization

This chapter is based on [Dud08, §2].

In this section, we prove that the set ℤ of integers has unique factorization. That is, we show that every nonzero integer not equal to 1 or −1 can be factored into a product of prime numbers in an essentially unique way.
This does not hold in general for other sets of numbers, as the following example illustrates. For the purposes of this example, we need to distinguish between primes, which never divide a product without dividing one of the individual factors, and irreducibles, which cannot be factored nontrivially. The reason for this distinction is precisely that, in the case given in the example, we do not have unique factorization, as we will see.

Example 4. Let $\mathbb{Z}[\sqrt{-6}] = \{a + b\sqrt{-6} \mid a, b \in \mathbb{Z}\}$. Call elements of $\mathbb{Z}[\sqrt{-6}]$ irreducible if they cannot be factored nontrivially into the product of two elements of $\mathbb{Z}[\sqrt{-6}]$. Here, by nontrivial factors, we mean elements of $\mathbb{Z}[\sqrt{-6}]$ that do not have absolute value 1. Show that $\mathbb{Z}[\sqrt{-6}]$ does not possess unique factorization into irreducibles. Further, defining primes of $\mathbb{Z}[\sqrt{-6}]$ to be elements of $\mathbb{Z}[\sqrt{-6}]$ that cannot divide a product over $\mathbb{Z}[\sqrt{-6}]$ without dividing one of the individual factors over $\mathbb{Z}[\sqrt{-6}]$, show that there are irreducibles in $\mathbb{Z}[\sqrt{-6}]$ that are not prime.

Solution. Consider the following equations:
\[ -6 = -2 \times 3 = \sqrt{-6} \times \sqrt{-6}. \tag{3.1} \]
We will finish the solution by showing that −2, 3, and $\sqrt{-6}$ are all irreducible in $\mathbb{Z}[\sqrt{-6}]$. If we have a factorization of $e + f\sqrt{-6} \in \{-2, 3, \sqrt{-6}\}$ of the form
\[ e + f\sqrt{-6} = (a + b\sqrt{-6})(c + d\sqrt{-6}), \tag{3.2} \]
then, multiplying by conjugates, we obtain
\[ e^2 + 6f^2 = (a^2 + 6b^2)(c^2 + 6d^2). \]
In our cases of interest, we obtain the equations
\begin{align}
(a^2 + 6b^2)(c^2 + 6d^2) &= 4; \tag{3.3} \\
(a^2 + 6b^2)(c^2 + 6d^2) &= 9; \tag{3.4} \\
(a^2 + 6b^2)(c^2 + 6d^2) &= 6. \tag{3.5}
\end{align}
Therefore, $a^2 + 6b^2$ must be a positive divisor of 4, 9, or 6. This forces $a^2 + 6b^2 \in \{1, 2, 3, 4, 6, 9\}$. If |b| ≥ 2, then $a^2 + 6b^2 \ge 6(2)^2 = 24 > 9$ is too big to lie in this set. Therefore, we have b ∈ {0, 1, −1}. Similarly, we must have a ∈ {0, ±1, ±2, ±3}. Further, if a and b are both nonzero, then $a^2 + 6b^2 = a^2 + 6 \in \{7, 10, \ldots\}$. We conclude that one of a, b is zero (and the other is not).
We conclude that $a + b\sqrt{-6} \in \{\pm\sqrt{-6}, \pm 1, \pm 2, \pm 3\}$. Similarly, $c + d\sqrt{-6} \in \{\pm\sqrt{-6}, \pm 1, \pm 2, \pm 3\}$. The equations one obtains from (3.2) by substituting the relevant values for e and f become
\begin{align*}
-2 &= \pm 1 \times \mp 2; \\
3 &= \pm 1 \times \pm 3; \\
\sqrt{-6} &= \pm\sqrt{-6} \times \pm 1.
\end{align*}
In any case, we obtain a factor with absolute value 1 so that −2, 3, and $\sqrt{-6}$ are irreducible, as required.

Finally, we note that this lack of unique factorization is the reason for the need to distinguish between primes and irreducibles. Indeed, we have just shown that −2, 3, and $\sqrt{-6}$ are all irreducible. However, none of these are prime since (3.1) shows that each divides a product of elements of $\mathbb{Z}[\sqrt{-6}]$ without dividing any of the individual factors:
\begin{align*}
-2 &\mid (\sqrt{-6} \times \sqrt{-6}) \quad \text{but} \quad -2 \nmid \sqrt{-6}; \\
3 &\mid (\sqrt{-6} \times \sqrt{-6}) \quad \text{but} \quad 3 \nmid \sqrt{-6}; \\
\sqrt{-6} &\mid (-2 \times 3) \quad \text{but} \quad \sqrt{-6} \nmid -2 \text{ and } \sqrt{-6} \nmid 3.
\end{align*}
Here, we can say that $-2, 3 \nmid \sqrt{-6}$ and $\sqrt{-6} \nmid -2, 3$ since none of
\[ \frac{\sqrt{-6}}{-2} = \frac{3}{\sqrt{-6}}, \qquad \frac{\sqrt{-6}}{3} = \frac{-2}{\sqrt{-6}} \]
lie in $\mathbb{Z}[\sqrt{-6}]$.

The lack of unique factorization seen in Example 4 does not occur in the set ℤ of integers. This is reflected in the fact that for integers, primes and irreducibles coincide. Before proving that the set ℤ of integers possesses unique factorization, we need some preliminaries, to which we now turn.

Definition 4 (Primes and Irreducibles). A positive integer p > 1 is called prime if whenever it divides a product of integers, it divides one of its factors. That is, p > 1 is prime provided

    for all integers a and b, p | ab ⟹ p | a or p | b.

A positive integer p > 1 is called irreducible if its only positive factors are 1 and p.

Proposition 5. An integer p > 1 is prime if and only if it is irreducible.

Proof. Let p > 1 be an integer.

p prime ⟹ p irreducible: Suppose first that p is prime. Then, whenever p divides a product of integers, it must divide one of the factors. We wish to prove that p is irreducible.
We therefore assume that p can be factored as p = ab for positive integers a and b and then prove, assuming this, that one of a, b is equal to 1 (and the other is equal to p). This will show that the only positive factors of p are 1 and p. Suppose then that p = ab. Then p | ab and so p | a or p | b since p is prime. If p | a, we can write a = pq for some positive integer q. But then, we see that p = ab = pqb so that qb = 1. It follows that q = b = 1, and a = p. Similarly, if p | b, we conclude that b = p and a = 1. Therefore p is irreducible, as required.

p irreducible ⟹ p prime: Suppose now that p > 1 is irreducible. Then, the only positive divisors of p are 1 and p. It follows that for any integer a, (p, a) ∈ {1, p}. We need to prove that if p divides a product of integers, then it must divide one of the factors. Suppose then that for integers a and b, we have p | ab. We complete the proof by establishing that p | a or p | b. We know that (p, a) is either 1 or p. If (p, a) = 1, then we can invoke part (i) of Proposition 4 to conclude that p | b. On the other hand, if (p, a) = p, then p | a. We conclude that p is prime, as required.

The definition of primality implies that whenever a prime divides the product of two integers, it must divide one of the individual factors. We can extend this to any finite number of factors using mathematical induction or the least integer principle. This forms the content of the following proposition.

Proposition 6. Let p be a prime, and suppose that we have integers $a_1, \ldots, a_n$ such that $p \mid a_1 \cdots a_n$. Then, $p \mid a_i$ for some 1 ≤ i ≤ n. In particular, if a prime p divides a product $q_1 \cdots q_n$ of primes $q_1, \ldots, q_n$, then $p = q_i$ for some 1 ≤ i ≤ n.

Proof. Let
\[ S = \{\, 1 \le j \le n \mid p \mid (a_1 \cdots a_j) \text{ but } p \nmid a_i \text{ for any } 1 \le i \le j \,\}. \]
We wish to show that S is empty. This would imply, in particular, that n ∉ S so that $p \mid (a_1 \cdots a_n)$ forces $p \mid a_i$ for some 1 ≤ i ≤ n, as required. Towards a contradiction, suppose that S ≠ ∅.
Then S is a nonempty set of integers that is bounded below. By the least integer principle, S has a least element m. Since m ∈ S, we have $p \mid (a_1 \cdots a_m)$ but $p \nmid a_i$ for any 1 ≤ i ≤ m. From $p \mid (a_1 \cdots a_{m-1})a_m$ and the definition of primality, we conclude that $p \mid (a_1 \cdots a_{m-1})$ or $p \mid a_m$. Since the latter is impossible, we conclude that $p \mid (a_1 \cdots a_{m-1})$. But m is the least element of S and so m − 1 ∉ S. Therefore $p \mid (a_1 \cdots a_{m-1})$ forces $p \mid a_i$ for some 1 ≤ i ≤ m − 1. This is a contradiction and proves that S is indeed empty. As remarked above, this completes the proof that whenever a prime divides a finite product of integers, it must divide one of the individual factors.

The second part follows from the first together with the fact that the only positive divisors of a prime are 1 and the prime itself. Indeed, assuming that p divides the product $q_1 \cdots q_n$ of primes $q_1, \ldots, q_n$, we can conclude from the first part that $p \mid q_i$ for some 1 ≤ i ≤ n. Since the only positive divisors of $q_i$ are 1 and $q_i$, and p ≠ 1, we conclude that $p = q_i$, as required.

Having established that primes and irreducibles coincide for the integers, we now invariably refer to these fundamental integers as primes. We now proceed to the proof of the Fundamental Theorem of Arithmetic, which expresses the fact that integers possess unique factorization.

Lemma 5. Every integer n > 1 is divisible by a prime.

Proof. Let
\[ S = \{\, n > 1 \mid n \text{ is not divisible by a prime} \,\}. \]
We wish to show that S = ∅. We will do this by contradiction. Suppose then that S ≠ ∅. Then S is a nonempty set of integers that is bounded below (by 1 for example). By the least integer principle, S has a least element m > 1. Since m ∈ S, we know that m is not divisible by a prime. In particular, it is composite. Therefore, there exist positive integers a and b such that 1 < a, b < m and m = ab. But then, since a < m, the minimality of m implies that a is not in S.
We conclude that a is divisible by some prime p. But this is a contradiction since the transitivity of divisibility implies that p | a | m so that m is divisible by a prime. This contradiction implies that S is indeed the empty set so that every integer greater than one is divisible by a prime, as required.

Lemma 6. Every integer n > 1 can be written as a finite product of primes.

Proof. We will prove this by invoking the least integer principle. Define S to be the set of all integers greater than one that cannot be expressed as a finite product of primes. We wish to show that S = ∅, and we will do this by contradiction. Suppose then that S ≠ ∅. Then, S is a nonempty set of integers bounded below (by 1 for example). By the least integer principle, we conclude that S has a least element m. Since m is in S, it cannot be expressed as a finite product of primes and so, in particular, cannot itself be prime. Therefore, there exist integers a and b such that 1 < a, b < m and m = ab. But then, a and b are both less than m and so cannot lie in S since m is the least element of S. We conclude that each of a and b is a finite product of primes. Say
\[ a = p_1 \cdots p_n, \qquad b = q_1 \cdots q_k, \]
for primes $p_1, \ldots, p_n, q_1, \ldots, q_k$. We then obtain that
\[ m = ab = p_1 \cdots p_n q_1 \cdots q_k \]
is a finite product of primes, thereby contradicting m ∈ S. This contradiction proves that S is indeed empty so that every integer greater than one can be written as a finite product of primes, as required.

Theorem 5 (Euclid). There are infinitely many primes.

Proof. Towards a contradiction, suppose this is false. Then, there are only finitely many primes, say $p_1, \ldots, p_n$. Consider the positive integer N given by
\[ N = p_1 p_2 \cdots p_n + 1. \]
Since N > 1, we know from Lemma 5 that N is divisible by a prime q. Since we are assuming that the only primes that exist are $p_1, \ldots, p_n$, we see that q must be one of the $p_j$. But this is impossible since then q would divide both N and $p_1 p_2 \cdots p_n$ and consequently would also divide $N - p_1 p_2 \cdots p_n = 1$. This contradiction completes the proof.

Lemma 7. Every composite integer n has a prime divisor less than or equal to $\sqrt{n}$.

Proof. We will once again prove this using the least integer principle. Let S be the set of all composite integers that do not have a prime divisor less than or equal to their square root. We wish to show that S = ∅, and will do this by contradiction. Suppose then that S is nonempty. Then S is a nonempty set of integers, and since each of its members is greater than one, S is bounded below. By the least integer principle S has a least element m. Since m is composite, we can find integers a and b such that 1 < a, b < m and m = ab. If a and b were both prime then, since m only has prime divisors greater than $\sqrt{m}$, we would have $a, b > \sqrt{m}$, which would force
\[ m = ab > \sqrt{m}\sqrt{m} = m, \]
which is impossible. Therefore, at least one of a and b is composite. If a is composite, then it is smaller than m and so does not lie in S. It therefore has a prime divisor p less than or equal to its square root. But then p | a | m and $p \le \sqrt{a} \le \sqrt{m}$, thereby contradicting m ∈ S. Similarly, if b were composite, it would have a prime divisor less than or equal to its square root, and this prime would be a divisor of m less than or equal to $\sqrt{m}$. This contradiction implies that S is indeed empty so that every composite integer has a prime divisor less than or equal to its square root, as required.

We have arrived at the Fundamental Theorem of Arithmetic, which expresses the fact that the set of integers possesses unique factorization.

Theorem 6 (The Fundamental Theorem of Arithmetic). Every positive integer can be written uniquely as a product of primes.

Proof. Here, we consider two products of primes to be the same provided they differ only in the ordering of the primes involved in the product. Let n be a positive integer. If n = 1, then we agree that n is the empty product of primes.
On the other hand, if n > 1, we know from Lemma 6 that n can be written in at least one way as a product of primes. What we need to prove is that this can be done in only one way. Suppose then that
\[ n = p_1 \cdots p_k = q_1 \cdots q_m \tag{3.6} \]
for primes $p_1, \ldots, p_k, q_1, \ldots, q_m$. We complete the proof by showing that $\{p_1, \ldots, p_k\} = \{q_1, \ldots, q_m\}$. This can be done by induction or by the least integer principle, but let's see why this holds using a more natural heuristic argument. From (3.6) we conclude that $p_1 \mid (q_1 \cdots q_m)$. From Proposition 6 we conclude that $p_1 = q_i$ for some 1 ≤ i ≤ m. We can then divide both sides of (3.6) by $p_1 = q_i$ to obtain
\[ p_2 \cdots p_k = q_1 \cdots q_{i-1} q_{i+1} \cdots q_m. \]
Continuing in this fashion, we can pair off each of the $p_\ell$'s with one of the $q_t$'s until no more primes appear on the left hand side. We must then have k = m, for otherwise we would end up with a product of primes equal to 1, which is impossible. We conclude that $\{p_1, \ldots, p_k\} = \{q_1, \ldots, q_m\}$, as required.

From the Fundamental Theorem of Arithmetic, we know that every positive integer can be factored uniquely into a product of primes. Collecting together like primes in this factorization leads to the prime-power factorization of the integer in question. This is the content of the following definition.

Definition 5 (Prime-Power Factorization). Let n be a positive integer. Then n can be written uniquely in the form
\[ n = p_1^{e_1} \cdots p_k^{e_k} \tag{3.7} \]
for distinct primes $p_1, \ldots, p_k$ and positive integers $e_1, \ldots, e_k$. The factorization given by (3.7) is called the prime-power factorization of n.

To conclude this section, we note that in the presence of unique factorization, we have another way of determining the greatest common divisor of two integers. We first state a lemma that characterizes the positive divisors of an integer in terms of its prime divisors.

Lemma 8.
Let $n$ be a positive integer with prime-power factorization given by $n = p_1^{e_1} \cdots p_k^{e_k}$, where the $p_i$ are distinct primes and the $e_i$ are positive integers. Then, the positive divisors of $n$ are those $d$ of the form $d = p_1^{g_1} \cdots p_k^{g_k}$, where, for all $i$, $0 \le g_i \le e_i$.

Proof. It is clear that any integer $d$ of the form stated in the lemma is a divisor of $n$. On the other hand, if $d$ is any divisor of $n$, then its prime-power factorization cannot have any primes distinct from the $p_i$, and cannot have corresponding exponents greater than the $e_i$. This completes the proof.

We have arrived at the method of calculating greatest common divisors from prime-power factorizations.

Theorem 7. Let $m$ and $n$ be positive integers having prime-power factorizations given by
\[ m = p_1^{e_1} \cdots p_k^{e_k}; \qquad n = p_1^{f_1} \cdots p_k^{f_k}, \]
for distinct primes $p_1, \ldots, p_k$ and nonnegative integers $e_1, \ldots, e_k, f_1, \ldots, f_k$. Then, the greatest common divisor $(m, n)$ of $m$ and $n$ is given by
\[ (m, n) = p_1^{\min\{e_1, f_1\}} \cdots p_k^{\min\{e_k, f_k\}}. \]

Proof. From Lemma 8 we know that the positive divisors of $m$ are those integers of the form $p_1^{g_1} \cdots p_k^{g_k}$ where $0 \le g_i \le e_i$ for each $i$, and the positive divisors of $n$ are those integers of the form $p_1^{g_1} \cdots p_k^{g_k}$ where $0 \le g_i \le f_i$ for each $i$. The common divisors of $m$ and $n$ are therefore the integers of the form $p_1^{g_1} \cdots p_k^{g_k}$ where $0 \le g_i \le \min\{e_i, f_i\}$ for each $i$, and thus the greatest common divisor is given by
\[ (m, n) = p_1^{\min\{e_1, f_1\}} \cdots p_k^{\min\{e_k, f_k\}} \]
as claimed.

Chapter 4 Linear Diophantine Equations

This chapter is based on [Dud08, §3].

Theorem 8. Let $a, b, c \in \mathbb{Z}$, and consider the linear diophantine equation
\[ ax + by = c. \tag{4.1} \]
If $(a, b) \nmid c$, then (4.1) has no solutions in integers. On the other hand, if $(a, b) \mid c$, (4.1) has infinitely many solutions in integers parametrized as
\[ x = r + \frac{b}{(a, b)}t, \qquad y = s - \frac{a}{(a, b)}t \qquad (t \in \mathbb{Z}), \tag{4.2} \]
where $r, s$ is any particular solution to (4.1).

Proof.
Now, if $ax + by = c$ has a solution, then $(a, b)$ must divide $c = ax + by$ since it divides both $a$ and $b$ and consequently any linear combination of $a$ and $b$. We are therefore reduced to proving that when $(a, b) \mid c$ the solutions to (4.1) are precisely those pairs $x, y$ of the form given in (4.2). We split the proof of this into two parts. We first show that any pair $x, y$ of the form given in (4.2) is a solution to the linear diophantine equation (4.1), and then show that every solution to (4.1) has the form given by (4.2).

The first part is a simple calculation. Indeed, if $r, s$ is some particular solution to (4.1), and $x, y$ are given by (4.2), then
\[ ax + by = a\left(r + \frac{b}{(a, b)}t\right) + b\left(s - \frac{a}{(a, b)}t\right) = ar + bs = c. \]
We conclude that any pair $x, y$ of the form given by (4.2) is a solution to the linear diophantine equation (4.1).

Conversely, by the Euclidean Algorithm, we know that there exist integers $r_0, s_0$ such that
\[ ar_0 + bs_0 = (a, b). \tag{4.3} \]
Further, from the assumption that $(a, b) \mid c$, we have an integer $d$ such that $c = (a, b)d$. Multiplying both sides of (4.3) by $d$ yields
\[ a(r_0 d) + b(s_0 d) = (a, b)d = c. \]
This proves that we have at least one solution to the linear diophantine equation in question. Suppose then that $r, s$ is any particular solution to (4.1). We complete the proof by showing that for every solution $x, y$ to (4.1) there exists an integer $t$ such that
\[ x = r + \frac{b}{(a, b)}t, \qquad y = s - \frac{a}{(a, b)}t. \]
Since $ar + bs = c$ and $ax + by = c$, we see that $a(x - r) + b(y - s) = 0$. Dividing by $(a, b)$ yields
\[ \frac{a}{(a, b)}(x - r) = \frac{b}{(a, b)}(s - y). \tag{4.4} \]
But then $\frac{a}{(a, b)} \mid \frac{b}{(a, b)}(s - y)$ while $\left(\frac{a}{(a, b)}, \frac{b}{(a, b)}\right) = 1$. We conclude that
\[ \frac{a}{(a, b)} \mid (s - y). \]
We therefore have an integer $t$ such that
\[ s - y = \frac{a}{(a, b)}t. \]
This yields
\[ y = s - \frac{a}{(a, b)}t. \]
Substituting this into (4.4), we obtain
\[ x = r + \frac{b}{(a, b)}t \]
as required.

Example 5. Find all positive integer solutions to $343x - 280y = 49$.

Solution. In Example 3 we found that $(343, -280) = 7$.
Since $7 \mid 49$, Theorem 8 tells us that $343x - 280y = 49$ has infinitely many solutions parametrized as
\[ x = r + \frac{-280}{7}t, \qquad y = s - \frac{343}{7}t \qquad (t \in \mathbb{Z}), \]
where $r, s$ is any particular solution. In Example 3, we found that $343(9) - 280(11) = 7$. Multiplying by 7 yields $343(63) - 280(77) = 49$, so that $r = 63$, $s = 77$ is a particular solution. We conclude from Theorem 8 that all integer solutions to $343x - 280y = 49$ are given by
\[ x = 63 - 40t, \qquad y = 77 - 49t \qquad (t \in \mathbb{Z}). \]
Finally, the requirement that $x, y$ be a positive solution is the requirement that $63 - 40t > 0$ and $77 - 49t > 0$. Equivalently, we require
\[ t < \frac{63}{40} = 1.575, \qquad t < \frac{77}{49} \approx 1.5714. \]
We conclude that the totality of positive solutions to $343x - 280y = 49$ is given by
\[ x = 63 - 40t, \qquad y = 77 - 49t \qquad (t \in \mathbb{Z},\ t \le 1). \]

Linear diophantine equations can also be disguised in the form of word problems. The following example illustrates this.

Example 6. Suppose that you have 5 pennies, 5 nickels, 6 dimes and 10 quarters. Find all the possible ways of making $2.99 in change.

Solution. The equation that needs to be solved is
\[ x + 5y + 10z + 25w = 299. \]
We break this up into three binary linear diophantine equations as follows:
\[ A + 25w = 299; \tag{4.5} \]
\[ B + 10z = A; \tag{4.6} \]
\[ x + 5y = B. \tag{4.7} \]
We will solve these equations in succession. Dividing 299 by 25 yields $299 = 25(11) + 24$. Therefore, a particular solution to (4.5) is given by $A_0 = 24$ and $w_0 = 11$. We conclude that all solutions are given by
\[ A = 24 + 25t, \qquad w = 11 - t \qquad (t \in \mathbb{Z}). \]
Since the number of quarters used is nonnegative and at most 10, we obtain
\[ 0 \le w \le 10 \implies 0 \le 11 - t \le 10. \]
Therefore, we must have $1 \le t \le 11$.

We now turn to (4.6). This is given by $B + 10z = 24 + 25t$, where $1 \le t \le 11$. We take the particular solution $B_0 = 24 + 25t$, $z_0 = 0$. We then obtain from Theorem 8 that all solutions are given by
\[ B = 24 + 25t + 10u, \qquad z = -u \qquad (u \in \mathbb{Z}). \]
Since $z$, being the number of dimes used, is nonnegative and at most 6, we see that we require
\[ 0 \le -u \le 6 \implies -6 \le u \le 0. \]
We finally turn to equation (4.7). This equation is given by $x + 5y = 24 + 25t + 10u$, where we require $1 \le t \le 11$ and $-6 \le u \le 0$. We take the particular solution $x_0 = 24 + 25t + 10u$, $y_0 = 0$, then obtain the totality of solutions given by
\[ x = 24 + 25t + 10u + 5v, \qquad y = -v \qquad (v \in \mathbb{Z}). \]
Since $0 \le x, y \le 5$, we conclude that $-5 \le v \le 0$ and that $0 \le 24 + 25t + 10u + 5v \le 5$. All solutions to our problem are then given by
\[ x = 24 + 25t + 10u + 5v, \qquad y = -v, \qquad z = -u, \qquad w = 11 - t, \]
for integers $t, u, v$ for which $1 \le t \le 11$, $-6 \le u \le 0$, $-5 \le v \le 0$ and $0 \le 24 + 25t + 10u + 5v \le 5$. Note that if $t \ge 3$, then
\[ x = 24 + 25t + 10u + 5v \ge 99 + 10u + 5v \ge 99 - 60 - 25 = 14 > 5. \]
We must therefore have $1 \le t \le 2$. Substituting these two values in for $t$ and then finding the corresponding compatible values for $u$ and $v$ yields the solutions given in the following table.

    t    u    v   |   x   y   z   w
    1   −4   −1   |   4   1   4   10
    1   −3   −3   |   4   3   3   10
    1   −2   −5   |   4   5   2   10
    2   −6   −2   |   4   2   6    9
    2   −5   −4   |   4   4   5    9

We conclude that, in order to make change for $2.99 using the coins we have on hand, we have to use $x$ pennies, $y$ nickels, $z$ dimes, and $w$ quarters, where the quadruple $x, y, z, w$ is one of the five possibilities given in the above table.

Chapter 5 Congruences

This chapter is based on [Dud08, §4].

Many times in mathematics it is useful to consider different objects as being equivalent. In order for this notion of equivalence to be reasonable, we usually force the relation to be an equivalence relation. That is, a reasonable notion of equivalence on a set $X$ should satisfy the following three properties:

(i) For all $x \in X$, $x$ is equivalent to itself.
(ii) For all $x, y \in X$, if $x$ is equivalent to $y$ then $y$ is equivalent to $x$.
(iii) For all $x, y, z \in X$, if $x$ is equivalent to $y$ and $y$ is equivalent to $z$, then $x$ is equivalent to $z$.

These properties are referred to as reflexivity, symmetry and transitivity respectively.
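For a relation on a finite set, the three properties can be checked mechanically. The following Python sketch is only an illustration: the helper names `is_equivalence` and `same_remainder` are hypothetical, and the sample range of integers is an arbitrary choice. It tests the relation "leaves the same remainder on division by 5" together with two relations that each fail one of the properties.

```python
from itertools import product


def is_equivalence(elements, related):
    """Check reflexivity, symmetry and transitivity of `related` on a finite set."""
    elems = list(elements)
    reflexive = all(related(x, x) for x in elems)
    symmetric = all(
        related(y, x) for x, y in product(elems, repeat=2) if related(x, y)
    )
    transitive = all(
        related(x, z)
        for x, y, z in product(elems, repeat=3)
        if related(x, y) and related(y, z)
    )
    return reflexive and symmetric and transitive


def same_remainder(m):
    """Relation identifying integers with equal remainders on division by m > 0."""
    return lambda a, b: a % m == b % m


sample = range(-10, 11)  # arbitrary finite sample of integers
print(is_equivalence(sample, same_remainder(5)))              # True
print(is_equivalence(sample, lambda a, b: a <= b))            # False (not symmetric)
print(is_equivalence(sample, lambda a, b: abs(a - b) <= 1))   # False (not transitive)
```

A finite check of this kind can only refute a property or confirm it on the sample; it is not a substitute for a proof over an infinite set.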
Examples of equivalence relations that we are already familiar with include equality $=$ on any set $X$, the relation given by similarity of matrices on the space $\mathbb{R}^{n \times n}$ of $n \times n$ matrices, as well as the relation given by isomorphism on the set of vector spaces over $\mathbb{R}$. For the purposes of number theory, a very important equivalence relation on the set $\mathbb{Z}$ of integers is obtained by identifying integers that have the same remainder upon division by a particular positive integer. This is the notion of congruence.

Definition 6 (Congruence). Let $a, b \in \mathbb{Z}$ and $m \in \mathbb{N}$. We say that $a$ is congruent to $b$ modulo $m$, written $a \equiv b \pmod{m}$, provided $m \mid (b - a)$.

Proposition 7. Let $m \in \mathbb{N}$. Then, congruence modulo $m$ is an equivalence relation on the set $\mathbb{Z}$ of integers. That is, for all $a, b, c \in \mathbb{Z}$,

(i) (Reflexivity) $a \equiv a \pmod{m}$.
(ii) (Symmetry) If $a \equiv b \pmod{m}$ then $b \equiv a \pmod{m}$.
(iii) (Transitivity) If $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$.

Proof. If we translate the statements given in (i), (ii) and (iii), they become immediately clear. Indeed, using the divisibility notation, these statements read, for all $a, b, c \in \mathbb{Z}$,

(i) $m \mid (a - a)$
(ii) $m \mid (b - a) \implies m \mid (a - b)$
(iii) $m \mid (b - a), (c - b) \implies m \mid (c - a)$.

Statement (i) is clear since every integer divides 0, statement (ii) is clear since any divisor of $b - a$ is also a divisor of $(-1)(b - a) = a - b$, and statement (iii) is clear since any divisor of $b - a$ and $c - b$ must divide the sum $(b - a) + (c - b) = c - a$.

Notation 3. For integers $a$ and $b$ and positive integer $m$, we sometimes denote $a \equiv b \pmod{m}$ using the shorthand notation $a \equiv_m b$.

Since $\equiv_m$ is an equivalence relation on $\mathbb{Z}$, we know that the corresponding equivalence classes form a partition of $\mathbb{Z}$. This is the content of the following theorem.

Theorem 9. Let $m$ be a positive integer. Then every integer is congruent to precisely one of $0, 1, \ldots, m - 1$ modulo $m$.

Proof. This follows from the division algorithm.
Indeed, if $a$ is an integer, we have unique integers $q$ and $r$ such that $0 \le r < m$ and $a = mq + r$. But then $r - a = -mq$ so that $m \mid (r - a)$. We conclude that $a \equiv_m r$, so that every integer is congruent modulo $m$ to its remainder upon division by $m$. Since this remainder $r$ lies in $\{0, 1, \ldots, m - 1\}$, we conclude that every integer is congruent modulo $m$ to at least one of $0, 1, \ldots, m - 1$. On the other hand, if $a$ were congruent to two elements of $\{0, 1, \ldots, m - 1\}$, say $r_1$ and $r_2$, then we'd have $r_1 \equiv_m a \equiv_m r_2$, by symmetry, so that by transitivity we could conclude that $r_1 \equiv_m r_2$. But this implies that $m \mid (r_2 - r_1)$. Since $-m < r_2 - r_1 < m$, we obtain $r_2 - r_1 = 0$ so that $r_1 = r_2$. Therefore, every integer is congruent modulo $m$ to precisely one of $0, 1, \ldots, m - 1$ as claimed.

Given an integer $a$ and positive integer $m$, we refer to the set of all integers to which $a$ is congruent modulo $m$ as the residue class of $a$ modulo $m$. The least nonnegative element in this residue class is the remainder $a$ leaves when divided by $m$. We call this remainder the least residue of $a$ modulo $m$. Recall that, by the division algorithm, we can express the least residue of $a$ modulo $m$ in the form $r = a - mq$ for some integer $q$. In fact, as the following theorem shows, the residue class of $a$ modulo $m$ consists precisely of the integers of this form.

Theorem 10. Let $a, b \in \mathbb{Z}$ and $m \in \mathbb{N}$. Then $a \equiv b \pmod{m}$ if and only if $a = b + km$ for some integer $k$.

Proof. We have
\begin{align*}
a \equiv b \pmod{m} &\iff b \equiv a \pmod{m} \\
&\iff m \mid (a - b) \\
&\iff a - b = km \text{ for some integer } k \\
&\iff a = b + km \text{ for some integer } k.
\end{align*}

Theorem 11. Let $a, b \in \mathbb{Z}$ and $m \in \mathbb{N}$. Then $a \equiv b \pmod{m}$ if and only if $a$ and $b$ leave the same remainder when divided by $m$.

Proof. We know that every integer is congruent modulo $m$ to the remainder it leaves when divided by $m$, and so if $r_a$ is the remainder left when $a$ is divided by $m$ and $r_b$ is the remainder left when $b$ is divided by $m$, we have $a \equiv_m r_a$, $b \equiv_m r_b$. We conclude that $a \equiv_m b$ if and only if $r_a \equiv_m r_b$.
However, $0 \le r_a, r_b < m$, and so since an integer can be congruent to only one of $0, 1, \ldots, m - 1$ modulo $m$, we see that $r_a \equiv_m r_b$ if and only if $r_a = r_b$. All in all, we have shown that $a \equiv_m b$ if and only if $r_a = r_b$, as required.

Summarizing what has been done so far, we have three equivalent ways of expressing that $a \equiv b \pmod{m}$. We could say that $m$ divides $b - a$, or that $a = b + km$ for some integer $k$, or that $a$ and $b$ leave the same remainder when divided by $m$. We now gather together some properties of congruence modulo $m$:

Proposition 8. Let $a, b, c, d$ be integers, and $m$ be a positive integer. The following statements hold:

1. If $a \equiv_m b$ and $c \equiv_m d$ then $a + c \equiv_m b + d$.
2. If $a \equiv_m b$ and $c \equiv_m d$ then $ac \equiv_m bd$.
3. If $a \equiv_m b$ and $d$ is a positive divisor of $m$ then $a \equiv_d b$.
4. If $a \equiv_m b$ and $c > 0$ then $ac \equiv_{mc} bc$.
5. $ab \equiv_m ac$ if and only if $b \equiv_{m/(a,m)} c$.
6. If $ab \equiv_m ac$ and $(a, m) = 1$ then $b \equiv_m c$.
7. If $a \equiv_m b$ then $(a, m) = (b, m)$.

Proof. For (1) and (2), assume that $a \equiv_m b$ and $c \equiv_m d$. Then $a = b + mk$, $c = d + m\ell$ for some integers $k$ and $\ell$. Therefore
\[ a + c = b + d + m(k + \ell), \qquad ac = (b + mk)(d + m\ell) = bd + m(b\ell + kd + mk\ell). \]
In particular, $a + c = (b + d) + mu$ and $ac = bd + mv$ for some integers $u$ and $v$. We conclude that $a + c \equiv_m b + d$ and $ac \equiv_m bd$, as required.

For (3), we go back to the original definition of congruence modulo $m$. If $a \equiv_m b$ then $m \mid (b - a)$. But since $d \mid m$, we see that $d \mid (b - a)$. Consequently $a \equiv_d b$.

We now turn to (4). Suppose that $a \equiv_m b$. Then $a = b + km$ for some integer $k$. Multiplying by $c$ yields $ac = bc + k(mc)$ and we conclude accordingly that $ac \equiv_{mc} bc$.

For (5), suppose first that $ab \equiv_m ac$. Then $m \mid (ac - ab)$. Dividing by $(a, m)$ yields
\[ \frac{m}{(a, m)} \mid \frac{a}{(a, m)}(c - b). \]
But $\left(\frac{m}{(a, m)}, \frac{a}{(a, m)}\right) = 1$ and so we obtain
\[ \frac{m}{(a, m)} \mid (c - b) \]
so that $b \equiv_{m/(a,m)} c$. Conversely, suppose that $b \equiv_{m/(a,m)} c$. Then
\[ \frac{m}{(a, m)} \mid (c - b). \]
Multiplying by $a$ yields
\[ a\frac{m}{(a, m)} \mid (ac - ab). \]
But $m \mid a\frac{m}{(a, m)}$ and so we can conclude from the transitivity of divisibility that $m \mid (ac - ab)$. Therefore, $ab \equiv_m ac$, as required.

Part (6) is an immediate consequence of part (5). Finally, part (7) is simply a restatement of Lemma 4 using different notation. Indeed, if $a \equiv_m b$, then $a = mk + b$ for some integer $k$. We can therefore conclude by Lemma 4 that $(a, m) = (m, b) = (b, m)$, as required.

Remark 4. A special case of part (1) of Proposition 8 provides us with a useful way to switch between representatives for a particular congruence class. Indeed, if $a \equiv_m b$, and $k$ is any integer, then since $km \equiv_m 0$, we see that $a + km \equiv_m b + 0 \equiv_m b$. Therefore, if it is convenient, we can always add or subtract any multiple of $m$ from $a$ without changing its value modulo $m$. In particular, if we want to find the least nonnegative integer in the same congruence class as $a$ (which will be the remainder $a$ leaves when divided by $m$), we need only continue adding or subtracting $m$ from $a$ until we obtain an integer between $0$ and $m - 1$.

Proposition 8 tells us that we can treat congruences the same way as equalities, except we need to be careful with cancellation. We can add, multiply or scale congruences by integers at will, but need to change the modulus when we cancel. For example, we have
\[ 3 \cdot 8 \equiv 3 \cdot 4 \pmod{12}, \quad \text{but} \quad 8 \not\equiv 4 \pmod{12}. \]
The correct cancellation is given by part (5) of Proposition 8:
\[ 3 \cdot 8 \equiv 3 \cdot 4 \pmod{12} \implies 8 \equiv 4 \pmod{12/(12, 3)} \implies 8 \equiv 4 \pmod{4}. \]

Since polynomials with integer coefficients can be built up by successively applying multiplication and addition, we see that Proposition 8 implies that we can substitute into polynomial congruences. This is the content of the following result.

Proposition 9. Let $f(x)$ be a polynomial with integer coefficients, $a, b$ be integers and $m$ be a positive integer. If $a \equiv_m b$ then $f(a) \equiv_m f(b)$.

Using this fact together with the fact that the only possible values for integers modulo $m$ are $0, 1, \ldots, m - 1$ allows for quickly verifying results.
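One way such a verification might be carried out by machine is sketched below in Python. The helper names `values_mod` and `solutions_mod` are hypothetical, and the polynomials and moduli are arbitrary sample choices; the first assertion only spot-checks Proposition 9 on one sample polynomial and does not prove it.

```python
def values_mod(f, m):
    """Sorted list of residues f(x) mod m as x runs over 0, 1, ..., m - 1."""
    return sorted({f(x) % m for x in range(m)})


def solutions_mod(f, target, m):
    """Residues x in 0..m-1 for which f(x) is congruent to target modulo m."""
    return [x for x in range(m) if (f(x) - target) % m == 0]


# Spot check of Proposition 9: shifting the argument by multiples of m
# does not change the value of an integer polynomial modulo m.
def sample_poly(x):
    return 2 * x**3 - 5 * x + 7


m = 12
assert all(
    sample_poly(a) % m == sample_poly(a + k * m) % m
    for a in range(m)
    for k in range(-3, 4)
)

print(values_mod(lambda x: x * x, 4))         # [0, 1]
print(solutions_mod(lambda x: 3 * x, 1, 8))   # [3]
print(solutions_mod(lambda x: x * x, 1, 8))   # [1, 3, 5, 7]
```

The search costs $m$ evaluations of the polynomial, which is practical for small moduli only.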
Indeed, if we wish to determine when a particular polynomial expression can take on a particular value modulo $m$, we need only check each of $0, \ldots, m - 1$ in order to discover the answer. We illustrate this with a couple of examples.

Example 7. Show that an integer of the form $4n + 3$ cannot be the sum of two squares of integers.

Solution. Consider a sum of squares $x^2 + y^2$. Since $x$ and $y$ can only take on the values $0, 1, 2, 3$ modulo 4, we see that $x^2$ and $y^2$ must be congruent to one of
\[ 0^2 \equiv_4 0, \qquad 1^2 \equiv_4 1, \qquad 2^2 = 4 \equiv_4 0, \qquad 3^2 = 9 \equiv_4 1. \]
Therefore, $x^2 + y^2 \equiv_4 0 + 0,\ 0 + 1,\ 1 + 0,\ 1 + 1$. That is, $x^2 + y^2 \equiv_4 0, 1, 2$. We conclude that $x^2 + y^2 \not\equiv 3 \pmod{4}$, as required.

Example 8. Solve the congruences $3x \equiv 1 \pmod{8}$ and $x^2 \equiv 1 \pmod{8}$ for $x \pmod{8}$.

Solution. We could always just plug in each of $0, \ldots, 7$ into the congruences to see which ones work and which ones do not, but in order to get some practice using properties of congruences, we will solve the congruences similarly to how one would solve the analogous equations. We compute
\begin{align*}
3x \equiv 1 \pmod{8} &\implies 3x \equiv 9 \pmod{8} && \text{(since } 1 \equiv_8 9\text{)} \\
&\implies x \equiv 3 \pmod{8} && \text{(by part (6) of Prop. 8)}
\end{align*}
For the second congruence, we proceed as follows:
\begin{align*}
x^2 \equiv 1 \pmod{8} &\implies x^2 \equiv 1 \pmod{2} \\
&\implies 2 \mid (x^2 - 1) \\
&\implies 2 \mid (x - 1)(x + 1) \\
&\implies 2 \mid (x - 1) \text{ or } 2 \mid (x + 1) \\
&\implies x \equiv_2 1, -1 \\
&\implies x \equiv_2 1.
\end{align*}
We conclude that any solution must be congruent to 1 modulo 2. That is, any solution must be odd. Conversely, suppose that $x \equiv 1 \pmod{2}$. Then $x = 2k + 1$ for some integer $k$ so that
\[ x^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 4k(k + 1) + 1. \]
Now, one of $k$, $k + 1$ is even while the other is odd. In any case, we have $k(k + 1) \equiv 0 \cdot 1 = 0 \pmod{2}$. Consequently $4k(k + 1) \equiv 4 \cdot 0 = 0 \pmod{8}$. Finally, we note that this implies that $x^2 = 4k(k + 1) + 1 \equiv_8 0 + 1 = 1$ so that $x$ is a solution to the congruence in question. Therefore
\[ x^2 \equiv 1 \pmod{8} \iff x \equiv 1 \pmod{2}. \]

Chapter 6 Linear Congruences

This chapter is based on [Dud08, §5].
Recall that in Chapter 4 we saw how to solve linear diophantine equations. If we express these equations in congruence notation, we can simplify the process of solving these equations. In particular, instead of invoking the Euclidean algorithm to find a particular solution to our equation, by switching to congruence notation we can sometimes find a particular solution by inspection. We first restate Theorem 8 in terms of congruences.

Now, given the linear diophantine equation $ax + by = c$, we know that the solutions coincide with the solutions of $-ax - by = -c$. We can therefore assume that $b \ge 0$. Also, when $b = 0$, the equation becomes $ax = c$, which fails to be of much interest. We therefore arrive at the equation $ax + by = c$ for $b > 0$. We can then rewrite this equation as $ax \equiv_b c$. Further, we know that when there is a solution $r, s$ to the linear diophantine equation $ax + by = c$, there are infinitely many solutions given by
\[ x = r + \frac{b}{(a, b)}t, \qquad y = s - \frac{a}{(a, b)}t \qquad (t \in \mathbb{Z}). \tag{6.1} \]
Denoting $(a, b)$ by $g$, we can express (6.1) as
\[ x \equiv_{b/g} r. \]
Therefore, when there exists a solution, there is precisely one congruence class of solutions for $x$ modulo $\frac{b}{(a, b)}$. But, with $g = (a, b)$, we have $x \equiv r \pmod{b/g}$ if and only if $x$ is congruent to one of $r, r + \frac{b}{g}, \ldots, r + (g - 1)\frac{b}{g}$ modulo $b$. Therefore, when $ax \equiv c \pmod{b}$ has a solution, it has precisely $g = (a, b)$ solutions modulo $b$ (which correspond to the unique solution modulo $b/g$). Switching to the more familiar notation obtained by using $m$ in place of $b$ and $b$ in place of $c$, we obtain the following theorem.

Theorem 12. Consider the linear congruence
\[ ax \equiv b \pmod{m} \tag{6.2} \]
for integers $a$ and $b$ and positive integer $m$. If $(a, m) \nmid b$, (6.2) has no solutions, while if $(a, m) \mid b$, (6.2) has precisely $(a, m)$ solutions modulo $m$.

Example 9. Solve the linear diophantine equation $343x - 280y = 49$ by converting the equation to a linear congruence.

Solution.
We came across this linear diophantine equation in Example 5. There, we found that the integer solutions were given by
\[ x = 63 - 40t, \qquad y = 77 - 49t \qquad (t \in \mathbb{Z}). \]
We now show how to obtain this via congruences. We start by rewriting our linear diophantine equation as the congruence
\[ -280y \equiv 49 \pmod{343}. \]
Dividing through by 7 (remembering to divide the modulus by 7 as well) yields
\[ -40y \equiv 7 \pmod{49}. \]
Replacing $-40$ by 9, to which it is congruent modulo 49, yields
\[ 9y \equiv 7 \pmod{49}. \]
One can obtain via a quick application of the Euclidean algorithm that $9(11) - 2(49) = 1$. Thus $9(11) \equiv 1 \pmod{49}$. We then multiply our congruence $9y \equiv 7 \pmod{49}$ by 11 to obtain
\[ 9(11)y \equiv 7(11) \pmod{49}, \]
which reduces to
\[ y \equiv 77 \pmod{49}. \]
We can therefore write $y = 77 + 49s$ for some integer $s$. Defining $t = -s$, we obtain $y = 77 + 49s = 77 - 49t$. Substituting this into our original equation and solving for $x$ yields $x = 63 - 40t$ as expected.

We close this section with a very important theorem that allows us to solve systems of simultaneous linear congruences. It is the celebrated Chinese Remainder Theorem. First we need a lemma.

Lemma 9. Let $m$ and $n$ be relatively prime positive integers and $a$ and $b$ be arbitrary integers. If $a \equiv_m b$ and $a \equiv_n b$ then $a \equiv_{mn} b$.

Proof. This is simply a restatement of part (ii) of Proposition 4. Indeed, if $a \equiv_m b$ and $a \equiv_n b$, then $m \mid (b - a)$ and $n \mid (b - a)$. Since $(m, n) = 1$, we can conclude that $mn \mid (b - a)$. That is, $a \equiv b \pmod{mn}$, as required.

We are now ready to state and prove the Chinese Remainder Theorem.

Theorem 13 (The Chinese Remainder Theorem). Let $a_1, \ldots, a_k$ be integers and $m_1, \ldots, m_k$ be positive integers that are relatively prime in pairs: $(m_i, m_j) = 1$ for $i \ne j$. The system of congruences
\begin{align*}
x &\equiv a_1 \pmod{m_1} \\
x &\equiv a_2 \pmod{m_2} \\
&\;\;\vdots \\
x &\equiv a_k \pmod{m_k}
\end{align*}
has a unique solution modulo the product $m_1 m_2 \cdots m_k$.

Proof. Let $m = m_1 \cdots m_k$. For each $j$, we have $(m_1 \cdots m_{j-1} m_{j+1} \cdots m_k, m_j) = 1$ and so we can express 1 as a linear combination of $m_1 \cdots m_{j-1} m_{j+1} \cdots m_k$ and $m_j$. If this linear combination is given by
\[ (m_1 \cdots m_{j-1} m_{j+1} \cdots m_k) b_j + m_j c_j = 1, \]
then we have
\[ (m_1 \cdots m_{j-1} m_{j+1} \cdots m_k) b_j \equiv 1 \pmod{m_j}. \]
For ease of notation, we will write $\frac{m}{m_j}$ instead of $m_1 \cdots m_{j-1} m_{j+1} \cdots m_k$. Set
\[ x_0 = \sum_{j=1}^{k} \frac{m}{m_j} b_j a_j. \tag{6.3} \]
We claim that the residue class of $x_0$ modulo $m$ is the unique solution modulo $m$ we are after. First of all, for any $1 \le i \le k$, we have
\[ x_0 = \sum_{j=1}^{k} \frac{m}{m_j} b_j a_j \equiv \frac{m}{m_i} b_i a_i \equiv (1) a_i = a_i \pmod{m_i} \]
since every term in the sum except the $i$-th term is divisible by $m_i$. We conclude that $x_0$ is indeed a solution to our system of congruences. On the other hand, if $x$ is any solution to our system of congruences, then, for any $1 \le i \le k$, we have
\[ x \equiv a_i \equiv x_0 \pmod{m_i}. \]
Since $(m_1, m_2) = 1$, we can invoke Lemma 9 to conclude that
\[ x \equiv x_0 \pmod{m_1 m_2}. \]
Then, since $(m_1 m_2, m_3) = 1$, we can invoke Lemma 9 once again to obtain
\[ x \equiv x_0 \pmod{m_1 m_2 m_3}. \]
Continuing in this fashion, we eventually obtain
\[ x \equiv x_0 \pmod{m} \]
as required.

The Chinese Remainder Theorem guarantees that we will always be able to find a solution to a system of linear congruences modulo relatively prime moduli, and we could use (6.3) to write down this solution. In practice, however, it is usually easier just to solve the congruences in succession. We illustrate this with an example.

Example 10. Find the unique solution modulo 60 to the following system of linear congruences:
\begin{align*}
3x &\equiv 2 \pmod{4} \tag{6.4} \\
2x &\equiv 1 \pmod{3} \tag{6.5} \\
3x &\equiv 4 \pmod{5}. \tag{6.6}
\end{align*}

Solution. We start by rewriting these congruences in the form $x \equiv a \pmod{m}$ by multiplying by a suitable integer to eliminate the coefficient of $x$. Since $3 \cdot 3 = 9 \equiv 1 \pmod{4}$, $2 \cdot 2 = 4 \equiv 1 \pmod{3}$ and $3 \cdot 2 = 6 \equiv 1 \pmod{5}$, we multiply (6.4) by 3, (6.5) by 2, and (6.6) by 2. We obtain
\begin{align*}
x &\equiv 6 \pmod{4} \tag{6.7} \\
x &\equiv 2 \pmod{3} \tag{6.8} \\
x &\equiv 3 \pmod{5}. \tag{6.9}
\end{align*}
We now solve the congruences (6.7), (6.8), (6.9) in succession. From (6.7), we find that $x = 6 + 4k$ for some integer $k$.
We then substitute this into (6.8) to obtain
\[ 6 + 4k \equiv 2 \pmod{3}. \]
This simplifies to
\[ k \equiv 2 \pmod{3}, \]
since $6 \equiv 0 \pmod{3}$ and $4 \equiv 1 \pmod{3}$. We conclude that $k = 2 + 3\ell$ for some integer $\ell$ so that
\[ x = 6 + 4k = 6 + 4(2 + 3\ell) = 14 + 12\ell. \]
We then substitute this into (6.9) to obtain
\[ 14 + 12\ell \equiv 3 \pmod{5}. \]
This simplifies to
\[ 4 + 2\ell \equiv 3 \pmod{5} \]
since $14 \equiv 4 \pmod{5}$ and $12 \equiv 2 \pmod{5}$. Thus
\[ 2\ell \equiv -1 \equiv 4 \pmod{5}. \]
Dividing by 2 (which is valid since $(2, 5) = 1$), or equivalently, multiplying by 3, we obtain
\[ \ell \equiv 2 \pmod{5}. \]
We conclude that $\ell = 2 + 5m$ for some integer $m$. Finally, this yields
\[ x = 14 + 12\ell = 14 + 12(2 + 5m) = 38 + 60m. \]
The unique solution modulo 60 is then given by $x \equiv 38 \pmod{60}$.

Chapter 7 Fermat's and Wilson's Theorems

This chapter is based on [Dud08, §6].

In this section, we prove the following two theorems.

Theorem 14 (Fermat's Little Theorem). Let $a, p \in \mathbb{Z}$ with $p$ prime. Then
\[ a^p \equiv a \pmod{p}. \]
In particular, if $(a, p) = 1$, then
\[ a^{p-1} \equiv 1 \pmod{p}. \]

Theorem 15 (Wilson's Theorem). A positive integer $p$ is a prime if and only if
\[ (p - 1)! \equiv -1 \pmod{p}. \]

We start the proof of Fermat's Little Theorem with the following lemma.

Lemma 10. Let $a \in \mathbb{Z}$ and $m \in \mathbb{N}$ be such that $(a, m) = 1$. Then the least residues of
\[ a, 2a, 3a, \ldots, (m - 1)a \pmod{m} \]
are
\[ 1, 2, 3, \ldots, m - 1 \]
in some order. That is, modulo $m$, multiplication by an integer $a$ relatively prime to $m$ simply permutes $1, 2, \ldots, m - 1$.

Proof. If we can show that none of $a, 2a, \ldots, (m - 1)a$ is congruent to 0 modulo $m$ and that no two of these multiples of $a$ are congruent modulo $m$, then we will be done. Indeed, this will imply that $a, 2a, \ldots, (m - 1)a$ are $m - 1$ distinct nonzero residue classes modulo $m$. Since there are only $m - 1$ such residue classes, namely $1, 2, \ldots, m - 1$, we will be able to conclude that
\[ \{a, 2a, \ldots, (m - 1)a\} = \{1, 2, \ldots, m - 1\} \pmod{m}. \]
To this end, suppose that $ja \equiv 0 \pmod{m}$ for some $1 \le j \le m - 1$.
Then, since $(a, m) = 1$, we would have to conclude that $j \equiv 0 \pmod{m}$, thereby contradicting $1 \le j \le m - 1$. We have therefore shown that none of the multiples of $a$ in question is congruent to 0 modulo $m$. Next, if, for some $1 \le i, j \le m - 1$, we had $ia \equiv ja \pmod{m}$, then using $(a, m) = 1$, we could cancel $a$ from both sides to obtain $i \equiv j \pmod{m}$. Since $i$ and $j$ both lie between 1 and $m - 1$, we conclude that $i = j$. Therefore no two of the multiples of $a$ in question are congruent modulo $m$. This completes the proof.

We are now prepared to prove Fermat's Little Theorem:

Proof of Fermat's Little Theorem. For $(a, p) > 1$ (that is, when $p \mid a$), $a^p \equiv a \pmod{p}$ reads $0 \equiv 0 \pmod{p}$, which clearly holds. We can therefore assume that $(a, p) = 1$. We then need to prove that $a^{p-1} \equiv 1 \pmod{p}$. To this end, we first invoke Lemma 10 to conclude that $a, 2a, \ldots, (p - 1)a$ is simply a reordering of $1, 2, \ldots, p - 1$ modulo $p$. We can then multiply together these residues to obtain
\[ a(2a)(3a) \cdots [(p - 1)a] \equiv 1(2)(3) \cdots (p - 1) \pmod{p}. \]
Simplifying yields
\[ a^{p-1}(p - 1)! \equiv (p - 1)! \pmod{p}. \]
Finally, since $(p - 1)! = (p - 1)(p - 2) \cdots 2(1)$ is a product of positive integers less than $p$, we see that $((p - 1)!, p) = 1$. We can therefore divide each side by $(p - 1)!$ to obtain
\[ a^{p-1} \equiv 1 \pmod{p} \]
as required.

We turn now to the proof of Wilson's Theorem. We need a preliminary lemma.

Lemma 11. Let $p$ be a prime. Then, the congruence $x^2 \equiv 1 \pmod{p}$ has precisely two solutions: $1$ and $-1 \equiv p - 1 \pmod{p}$.

Proof. Indeed, $x^2 \equiv 1 \pmod{p}$ is equivalent to $p \mid (x^2 - 1) = (x - 1)(x + 1)$. Since $p$ is prime, this is equivalent to $p \mid (x - 1)$ or $p \mid (x + 1)$. That is, $x \equiv 1 \pmod{p}$ or $x \equiv -1 \equiv p - 1 \pmod{p}$.

From the Euclidean Algorithm, we know that given any two relatively prime integers $a$ and $m$, there exist integers $x$ and $y$ such that $ax + my = 1$.
In fact, since this equation implies that the greatest common divisor of $a$ and $m$ is a positive divisor of 1, we see that $a$ and $m$ are relatively prime if and only if $ax + my = 1$ for some integers $x$ and $y$. In turn, for $m > 0$, this is equivalent to the existence of an integer $x$ such that $ax \equiv 1 \pmod{m}$. That is, for $m > 0$, $(a, m) = 1$ is equivalent to $a$ having an inverse modulo $m$. Further, if $x$ and $y$ are both inverses of $a$ modulo $m$, then we'd have $ax \equiv 1 \equiv ay \pmod{m}$, which would imply that $x \equiv y \pmod{m}$, since $(a, m) = 1$ allows us to divide congruences modulo $m$ by $a$. We conclude that the integers relatively prime to $m$ are precisely the ones that have an inverse modulo $m$, and that when an inverse exists, it is unique modulo $m$. We can therefore refer to the inverse of $a$ modulo $m$ when it exists, and denote it using the familiar notation $a^{-1}$.

When $m$ is equal to a prime $p$, every integer that is not a multiple of $p$ is relatively prime to $p$ and so has an inverse modulo $p$. What Lemma 11 says is that the only residue classes that are their own inverses modulo a prime $p$ are $1$ and $p - 1$. So, out of the residue classes $0, 1, \ldots, p - 1$, only 0 fails to have an inverse modulo $p$, and the only two that are their own inverses are $1$ and $p - 1$. We summarize this in the following lemma.

Lemma 12. Let $m$ be a positive integer and $a$ be an arbitrary integer. Then, $a$ has an inverse modulo $m$ if and only if $(a, m) = 1$. When this is the case, the inverse is uniquely determined modulo $m$ and denoted by $a^{-1}$. In the special case where $m = p$ is a prime, the residue classes possessing an inverse modulo $p$ are $1, 2, \ldots, p - 1$, and among these, only $1$ and $p - 1$ are their own inverse.

We now have all that is required to prove Wilson's Theorem:

Proof of Wilson's Theorem. Suppose first that $p$ is prime. If $p = 2$, then $(p - 1)! = 1! = 1 \equiv -1 \pmod{2}$. We can therefore assume that $p$ is odd. Consider the product $1(2) \cdots (p - 2)(p - 1)$ of all the nonzero residue classes modulo $p$.
By Lemma 12, we know that each of these residue classes has a unique inverse, and the only two that are equal to their inverse are $1$ and $p - 1$. Each of $2, 3, \ldots, p - 2$ therefore gets multiplied by its inverse to yield 1 modulo $p$, reducing the product to $p - 1$, which is $-1$ modulo $p$. That is, denoting the inverse of $a$ modulo $p$ by $a^{-1}$, we obtain
\begin{align*}
(p - 1)! &= 1(2) \cdots (p - 2)(p - 1) \\
&\equiv 1(2 \cdot 2^{-1})(3 \cdot 3^{-1}) \cdots \left[\frac{p - 1}{2}\left(\frac{p - 1}{2}\right)^{-1}\right](p - 1) \\
&\equiv 1(1)(1) \cdots (1)(p - 1) \\
&= p - 1 \\
&\equiv -1 \pmod{p}.
\end{align*}
This completes the proof of the "only if" direction of Wilson's Theorem.

Conversely, suppose that $m$ is composite. We need to prove that $(m - 1)! \not\equiv -1 \pmod{m}$. But this is easily proved with the help of Lemma 12. Indeed, the fact that $m$ is composite implies that $m$ has a nontrivial proper positive divisor $d$ with $1 < d < m$. But then $d$ appears in the product that defines $(m - 1)!$ so that $d$ is a common divisor of $(m - 1)!$ and $m$. We conclude that $(m - 1)!$ and $m$ fail to be relatively prime so that $(m - 1)!$ cannot have an inverse modulo $m$. In particular, $(m - 1)!$ cannot be congruent to $-1$ modulo $m$ (or to any other invertible residue class modulo $m$). This completes the proof of Wilson's Theorem.

Fermat's Little Theorem provides us with an efficient method of finding the least residue of large powers of integers modulo primes. We illustrate this with the following example.

Example 11. Find the least residue of $5^{5754}$ modulo 17.

Solution. We compute
\begin{align*}
5^{5754} &= \left(5^{16}\right)^{359} \cdot 5^{10} && \text{(since } 5754 = (16)(359) + 10\text{)} \\
&\equiv (1)^{359} \cdot 5^{10} \pmod{17} && \text{(by Fermat's Little Theorem)} \\
&= \left(5^2\right)^5 = (25)^5 \\
&\equiv 8^5 \pmod{17} \\
&= \left(8^2\right)^2 \cdot 8 = (64)^2 \cdot 8 \\
&\equiv (-4)^2 \cdot 8 = 128 \\
&\equiv 9 \pmod{17}.
\end{align*}

Chapter 8 The Divisors of an Integer

This chapter is based on [Dud08, §7].

In this section, two important members of the class of multiplicative functions are introduced. One of these is the number of divisors function $d$ that assigns to a positive integer the number of its positive divisors.
The other is the sum of the positive divisors function $\sigma$ that assigns to a positive integer the sum of its positive divisors. We start by defining multiplicative functions and then proceed to the introduction of these two particular examples.

Definition 7 (Multiplicative Function). A function $f$ defined on the set of positive integers $\mathbb{N}$ is called multiplicative provided
\[ f(mn) = f(m)f(n) \]
for all positive integers $m$ and $n$ with $(m, n) = 1$. A multiplicative function $f$ is called completely multiplicative provided
\[ f(mn) = f(m)f(n) \]
for all positive integers $m$ and $n$.

Note that the values of a multiplicative function $f$ are completely determined by its values on prime powers. Indeed, if $f(p^k)$ is known for all prime powers $p^k$, then, for any $n \in \mathbb{N}$, we have a prime-power factorization
\[ n = p_1^{e_1} \cdots p_r^{e_r} \]
for distinct primes $p_1, \ldots, p_r$ and positive integers $e_1, \ldots, e_r$. Since the prime powers $p_i^{e_i}$ are relatively prime in pairs, we must have
\[ f(n) = f(p_1^{e_1}) \cdots f(p_r^{e_r}). \]
Similarly, if $f$ is completely multiplicative, its values are completely determined by the values it takes on at primes. Indeed, with $n$ as above, if we know the values of the $f(p_i)$, we must have
\[ f(n) = f(p_1)^{e_1} \cdots f(p_r)^{e_r}. \]
This is similar to the fact that a linear transformation of vector spaces is completely determined by its values on a basis. Indeed, with respect to multiplication, the set of primes can be considered a basis for the set of positive integers, and then completely multiplicative functions can be considered as the "linear transformations" in this context. Indeed, this situation is made rigorous if we consider scalar multiplication to be given by exponentiation and vector addition to be given by product.

We illustrate the determination of multiplicative (resp. completely multiplicative) functions by their values on prime powers (resp. primes) in the following example.

Example 12.
Let $f$ and $g$ be functions defined on the set $\mathbb{N}$ of positive integers. Suppose further that
\[ f(2^2) = 3,\quad f(7) = -2;\qquad g(2) = -4,\quad g(5) = 7. \]
(i) Assuming that $f$ is multiplicative, find $f(28)$.
(ii) Assuming that $g$ is completely multiplicative, find $g(500)$.

Solution. For part (i), we note that $28 = 2^2 \cdot 7$. Therefore, since $f$ is multiplicative, we have
\[ f(28) = f(2^2 \cdot 7) = f(2^2)f(7) = 3(-2) = -6. \]
For part (ii), we note that $500 = 2^2 \cdot 5^3$. Therefore, since $g$ is completely multiplicative, we have
\[ g(500) = g(2^2 \cdot 5^3) = g(2)^2 g(5)^3 = (-4)^2 (7)^3 = 5488. \]

We turn now to the two examples of multiplicative functions we will investigate in this section. We will use the notation $\sum_{d \mid n}$ to denote the sum over the set of all positive divisors $d$ of $n$. With this notation, we make the following definition.

Definition 8. Let $r \in \mathbb{N}_0$. We define the function $\sigma_r$ on $\mathbb{N}$ by
\[ \sigma_r(n) = \sum_{d \mid n} d^r. \]
Two particular cases of interest are obtained by taking $r = 0$ and $r = 1$. For $r = 0$ we obtain the number of positive divisors function $d$ defined on $\mathbb{N}$ by
\[ d(n) = \sum_{d \mid n} 1, \]
while for $r = 1$ we obtain the sum of the positive divisors function $\sigma$ defined on $\mathbb{N}$ by
\[ \sigma(n) = \sum_{d \mid n} d. \]

The main result to be proved in this section is that for all $r \geq 0$, the function $\sigma_r$ is multiplicative. Taking $r = 0$ and $r = 1$ will prove the multiplicativity of the functions $d$ and $\sigma$ in particular. We prove the multiplicativity of the $\sigma_r$ by combining prime-power factorizations with induction. The details are given below.

Theorem 16. Let $r \in \mathbb{N}_0$. The function $\sigma_r$ defined on $\mathbb{N}$ by
\[ \sigma_r(n) = \sum_{d \mid n} d^r \qquad (n \in \mathbb{N}) \]
is multiplicative.

Proof. Let $r \in \mathbb{N}_0$ and $m, n \in \mathbb{N}$ be such that $(m, n) = 1$. We need to prove that $\sigma_r(mn) = \sigma_r(m)\sigma_r(n)$. Now, we have prime-power factorizations
\[ m = p_1^{e_1} \cdots p_k^{e_k}; \tag{8.1} \]
\[ n = q_1^{f_1} \cdots q_\ell^{f_\ell}, \tag{8.2} \]
where $p_i \neq p_j$ for $i \neq j$, $q_i \neq q_j$ for $i \neq j$, and the $e_i$ and $f_j$ are positive integers. Since $(m, n) = 1$, we also have that the $p_i$ are distinct from the $q_j$. Therefore, if we can show that for any product $P_1^{g_1} \cdots P_t^{g_t}$ of distinct prime powers $P_1^{g_1}, \ldots, P_t^{g_t}$ (that is, powers of distinct primes $P_1, \ldots, P_t$), we have
\[ \sigma_r(P_1^{g_1} \cdots P_t^{g_t}) = \sigma_r(P_1^{g_1}) \cdots \sigma_r(P_t^{g_t}), \tag{8.3} \]
we’d be able to conclude that
\[
\begin{aligned}
\sigma_r(mn) &= \sigma_r\bigl(p_1^{e_1} \cdots p_k^{e_k} q_1^{f_1} \cdots q_\ell^{f_\ell}\bigr) && \text{(from (8.1) and (8.2))}\\
&= \sigma_r(p_1^{e_1}) \cdots \sigma_r(p_k^{e_k})\,\sigma_r(q_1^{f_1}) \cdots \sigma_r(q_\ell^{f_\ell}) && \text{(from (8.3))}\\
&= \sigma_r(p_1^{e_1} \cdots p_k^{e_k})\,\sigma_r(q_1^{f_1} \cdots q_\ell^{f_\ell}) && \text{(from (8.3))}\\
&= \sigma_r(m)\sigma_r(n) && \text{(from (8.1) and (8.2))}
\end{aligned}
\]
as required. We have therefore reduced the proof to establishing (8.3) for distinct prime powers $P_1^{g_1}, \ldots, P_t^{g_t}$. We will establish this by induction on the number $t$ of prime powers appearing in the product.

To this end, let $S$ be the set of all $t \geq 1$ such that $\sigma_r$ is multiplicative for the product of $t$ distinct prime powers. We show that $S$ contains all of $\mathbb{N}$ by induction. For $t = 1$, there is nothing to show since we only have one prime power in question. Both sides of (8.3) are therefore equal to $\sigma_r(P_1^{g_1})$. We conclude that $1 \in S$.

Fix a positive integer $t$ and suppose that $t \in S$. We complete the proof by showing that $t + 1 \in S$. Consider then a product $P_1^{g_1} \cdots P_t^{g_t} P_{t+1}^{g_{t+1}}$ of distinct prime powers $P_1^{g_1}, \ldots, P_{t+1}^{g_{t+1}}$. Define
\[ N = P_1^{g_1} \cdots P_t^{g_t}, \]
so that $(N, P_{t+1}^{g_{t+1}}) = 1$, our product is given by $N P_{t+1}^{g_{t+1}}$, and we are assuming as inductive hypothesis that
\[ \sigma_r(N) = \sigma_r(P_1^{g_1}) \cdots \sigma_r(P_t^{g_t}). \tag{8.4} \]
Let $d_1, \ldots, d_s$ be the positive divisors of $N$ greater than $1$. Since $(N, P_{t+1}) = 1$, all the positive divisors of the product $N P_{t+1}^{g_{t+1}}$ are given by the following array:
\[
\begin{array}{ccccc}
1 & d_1 & d_2 & \cdots & d_s\\
P_{t+1} & d_1 P_{t+1} & d_2 P_{t+1} & \cdots & d_s P_{t+1}\\
P_{t+1}^2 & d_1 P_{t+1}^2 & d_2 P_{t+1}^2 & \cdots & d_s P_{t+1}^2\\
\vdots & \vdots & \vdots & & \vdots\\
P_{t+1}^{g_{t+1}} & d_1 P_{t+1}^{g_{t+1}} & d_2 P_{t+1}^{g_{t+1}} & \cdots & d_s P_{t+1}^{g_{t+1}}
\end{array}
\]
In order to compute $\sigma_r(N P_{t+1}^{g_{t+1}})$, we raise each of the positive divisors in the above array to the $r$-th power and then sum the resulting numbers.
If we set $d_0 = 1$ and sum by rows we obtain
\[
\begin{aligned}
\sigma_r\bigl(N P_{t+1}^{g_{t+1}}\bigr)
&= \sum_{j=0}^{s} d_j^r + \sum_{j=0}^{s} (d_j P_{t+1})^r + \sum_{j=0}^{s} \bigl(d_j P_{t+1}^2\bigr)^r + \cdots + \sum_{j=0}^{s} \bigl(d_j P_{t+1}^{g_{t+1}}\bigr)^r\\
&= \sum_{j=0}^{s} d_j^r + P_{t+1}^r \sum_{j=0}^{s} d_j^r + \bigl(P_{t+1}^2\bigr)^r \sum_{j=0}^{s} d_j^r + \cdots + \bigl(P_{t+1}^{g_{t+1}}\bigr)^r \sum_{j=0}^{s} d_j^r\\
&= \Bigl(1 + P_{t+1}^r + \bigl(P_{t+1}^2\bigr)^r + \cdots + \bigl(P_{t+1}^{g_{t+1}}\bigr)^r\Bigr) \sum_{j=0}^{s} d_j^r\\
&= \sigma_r\bigl(P_{t+1}^{g_{t+1}}\bigr)\sigma_r(N)\\
&= \sigma_r(N)\,\sigma_r\bigl(P_{t+1}^{g_{t+1}}\bigr).
\end{aligned}
\]
We conclude from (8.4) that
\[ \sigma_r\bigl(P_1^{g_1} \cdots P_{t+1}^{g_{t+1}}\bigr) = \sigma_r\bigl(N P_{t+1}^{g_{t+1}}\bigr) = \sigma_r(P_1^{g_1}) \cdots \sigma_r(P_t^{g_t})\,\sigma_r\bigl(P_{t+1}^{g_{t+1}}\bigr) \]
as required. We conclude that $t + 1 \in S$. By induction the proof is complete.

Chapter 9 Perfect Numbers

This chapter is based on [Dud08, §8]. In this section, we introduce perfect numbers. We then give the complete characterization of the even perfect numbers due to Euclid and Euler.

Definition 9 (Perfect Numbers). A positive integer $n$ is said to be perfect if it is equal to the sum of its proper positive divisors. That is, $n$ is perfect provided
\[ n = \sum_{d \mid n} d - n \iff \sigma(n) = 2n. \]

Example 13. The first four perfect numbers are 6, 28, 496, and 8128. These numbers are perfect since they are all equal to the sum of their proper positive divisors:
\[
\begin{aligned}
6 &= 1 + 2 + 3;\\
28 &= 1 + 2 + 4 + 7 + 14;\\
496 &= 1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248;\\
8128 &= 1 + 2 + 4 + 8 + 16 + 32 + 64 + 127 + 254 + 508 + 1016 + 2032 + 4064.
\end{aligned}
\]
We note that
\[ 6 = 2^{2-1}(2^2 - 1);\quad 28 = 2^{3-1}(2^3 - 1);\quad 496 = 2^{5-1}(2^5 - 1);\quad 8128 = 2^{7-1}(2^7 - 1), \]
and that all of $3 = 2^2 - 1$, $7 = 2^3 - 1$, $31 = 2^5 - 1$ and $127 = 2^7 - 1$ are prime numbers. This is a special case of the main result of this section.

No odd perfect numbers are known, whereas the even perfect numbers have been completely classified by Euler. In order to state this classification, we need to define Mersenne primes. These primes are the ones that are one less than a power of two. In searching for such primes, we need only look at the numbers of the form $2^m - 1$ with $m$ prime, as shown by the following proposition.

Proposition 10. Let $m \in \mathbb{N}$.
If $2^m - 1$ is prime then $m$ is itself prime.

Proof. We will prove the contrapositive. That is, we will show that if $m$ is composite then so too is $2^m - 1$. But this follows easily since if $m = ab$ for integers $a$ and $b$ with $1 < a, b < m$, then we have the factorization
\[ 2^m - 1 = 2^{ab} - 1 = (2^a - 1)\bigl(1 + 2^a + 2^{2a} + \cdots + 2^{(b-1)a}\bigr), \]
where both factors $2^a - 1$ and $1 + 2^a + 2^{2a} + \cdots + 2^{(b-1)a}$ lie strictly between $1$ and $2^m - 1$. This shows that $2^m - 1$ is composite, as required.

This brings us to the definition of Mersenne primes.

Definition 10 (Mersenne Prime). A prime is called a Mersenne prime if it is one less than a power of 2.

By Proposition 10, the Mersenne primes are the prime numbers of the form $2^p - 1$ for $p$ prime. We have arrived at the characterization of the even perfect numbers due to Euclid and Euler.

Theorem 17 (Euclid, Euler). The even perfect numbers are precisely those numbers $n$ of the form
\[ n = 2^{p-1}(2^p - 1) \tag{9.1} \]
where $p$ is a prime and $2^p - 1$ is a (Mersenne) prime.

Proof. We first show that every integer $n$ of the form (9.1) is a perfect number. This was shown by Euclid. We then complete the proof by showing that every even perfect number $n$ has the form given by (9.1). This is the contribution of Euler.

The first part is a simple calculation. Indeed, since $2^{p-1}$ is a power of two and $2^p - 1$ is odd, we see that $(2^{p-1}, 2^p - 1) = 1$. We conclude from the multiplicativity of $\sigma$, together with the primality of $2^p - 1$, that for $n$ defined by (9.1),
\[
\begin{aligned}
\sigma(n) &= \sigma\bigl(2^{p-1}(2^p - 1)\bigr)\\
&= \sigma(2^{p-1})\,\sigma(2^p - 1)\\
&= (1 + 2 + \cdots + 2^{p-1})\bigl(1 + (2^p - 1)\bigr)\\
&= \frac{2^p - 1}{2 - 1} \cdot 2^p\\
&= 2\bigl[2^{p-1}(2^p - 1)\bigr]\\
&= 2n.
\end{aligned}
\]
We conclude that $n$ is perfect as claimed.

Conversely, suppose that $n$ is an even perfect number. We need to show that there exists a prime $p$ such that $2^p - 1$ is also prime and $n = 2^{p-1}(2^p - 1)$. Suppose that $e$ is the power of 2 in the prime-power factorization of $n$. Then $n = 2^e m$ where $e \geq 1$ and $m$ is odd. Note that $m > 1$: if $m = 1$ then $n = 2^e$ and $\sigma(n) = 2^{e+1} - 1$ is odd, so that $\sigma(n) \neq 2n$. Since $m$ and $1$ are then distinct positive divisors of $m$, we have $\sigma(m) \geq m + 1 > m$. We can therefore write $\sigma(m) = m + s$ for some positive integer $s$.
But then, since $n$ is perfect and $\sigma$ is multiplicative, we must have
\[ 2n = \sigma(n) \iff 2^{e+1} m = \frac{2^{e+1} - 1}{2 - 1}(m + s). \]
Therefore, we have
\[ 2^{e+1} m - (2^{e+1} - 1)m = (2^{e+1} - 1)s, \]
or,
\[ m = (2^{e+1} - 1)s. \]
We conclude that $s$ is a positive divisor of $m$ with $s < m$ (since $2^{e+1} - 1 \geq 3$). From $\sigma(m) = m + s$ we can conclude that $s$ and $m$ are the only positive divisors of $m$. We conclude that $m$ is prime and $s = 1$. Thus $m = 2^{e+1} - 1$ is a Mersenne prime. From Proposition 10 we conclude that $e + 1 = p$ for some prime $p$ so that
\[ n = 2^e m = 2^{p-1}(2^p - 1), \]
for primes $p$ and $2^p - 1$, as required.

Chapter 10 Euler’s Theorem and Function

This chapter is based on [Dud08, §9]. Recall that the proof of Fermat’s little theorem used the fact that, for a prime modulus $p$, the invertible residue classes are the classes $1, 2, \ldots, p - 1$ to conclude that for $(a, p) = 1$,
\[ a^{p-1} \equiv 1 \pmod{p}. \]
If we reconstruct the same argument using a general modulus $m \in \mathbb{N}$, we get Euler’s generalization of Fermat’s little theorem. First we introduce Euler’s $\varphi$-function, which counts the number of invertible congruence classes modulo a particular integer.

Definition 11. We define Euler’s $\varphi$-function on $\mathbb{N}$ by
\[ \varphi(n) = \#\{1 \leq m \leq n \mid (m, n) = 1\} \qquad (n \in \mathbb{N}). \]

Since we have seen that the invertible classes modulo $n$ are precisely the ones corresponding to integers relatively prime to $n$, we see that $\varphi(n)$ is equal to the number of invertible residue classes modulo $n$. This observation allows us to generalize the proof of Fermat’s little theorem to obtain Euler’s generalization below.

Theorem 18 (Euler’s Theorem). Let $a \in \mathbb{Z}$ and $n \in \mathbb{N}$. If $(a, n) = 1$ we have
\[ a^{\varphi(n)} \equiv 1 \pmod{n}. \]

Proof. We take our cue from the proof of Fermat’s little theorem and consider the set $S$ of invertible elements modulo $n$. As we have seen, this set $S$ contains $\varphi(n)$ classes and is given by
\[ S = \{1 \leq m \leq n \mid (m, n) = 1\}. \]
As in the proof of Fermat’s little theorem, we show that multiplication by $a$ is a permutation of $S$.
We will then be able to conclude that $S = aS$ modulo $n$ (where $aS = \{ax \mid x \in S\}$) so that multiplying the elements of $S$ together yields
\[ \prod_{x \in S} (ax) \equiv \prod_{x \in S} x \pmod{n}. \tag{10.1} \]
Finally, since each $x \in S$ is relatively prime to $n$ (and so can be cancelled from (10.1)) and there are $\varphi(n)$ elements in $S$, we conclude that
\[ a^{\varphi(n)} \equiv 1 \pmod{n} \]
as required. We conclude the proof by observing that for all $x \in S$, the least residue of $ax$ lies in $S$, and that for distinct $x, y \in S$ the products $ax$ and $ay$ are incongruent modulo $n$. In exactly the same fashion as in the proof of Fermat’s little theorem, multiplication by $a$ is then a permutation of $S$, as required.

We now show that Euler’s $\varphi$-function is another example of a multiplicative function. This will allow for efficient calculation of its values.

Theorem 19. Euler’s $\varphi$-function is multiplicative.

Proof. In order to prove the theorem, we need to show that, for positive integers $m$ and $n$ with $(m, n) = 1$, we have $\varphi(mn) = \varphi(m)\varphi(n)$. That is, we need to verify that the number of invertible residue classes modulo $mn$ is equal to the product of the number of invertible residue classes modulo $m$ and the number of invertible residue classes modulo $n$. We will do this by way of the Chinese remainder theorem.

First we need some notation. Given an integer $r$ and a modulus $m$, we denote by $r_m$ the least residue of $r$ modulo $m$. That is, we let $r_m$ denote the remainder left when $r$ is divided by $m$. Then, if $S_m$, $S_n$ and $S_{mn}$ denote the sets of invertible residue classes modulo $m$, $n$ and $mn$ respectively, we will prove that the map $f : S_{mn} \to S_m \times S_n$ given by
\[ f(r_{mn}) = (r_m, r_n) \]
is a one-to-one correspondence. This will show that
\[ \varphi(mn) = \#S_{mn} = \#(S_m \times S_n) = (\#S_m)(\#S_n) = \varphi(m)\varphi(n) \]
as required. Here, our function $f$ takes as input some integer less than $mn$ and relatively prime to $mn$ and reduces it modulo $m$ and modulo $n$, obtaining the two coordinates of the output ordered pair. In order to complete the proof, we need to show that $f$ is well-defined and that it is one-to-one and onto.
The fact that $f$ is well-defined is a consequence of the fact that, since our input $r_{mn}$ is relatively prime to $mn$, it is also relatively prime to both $m$ and $n$. But then, since
\[ r_{mn} \equiv r_m \pmod{m}, \qquad r_{mn} \equiv r_n \pmod{n}, \]
we see that $(r_m, m) = (r_{mn}, m) = 1$ and $(r_n, n) = (r_{mn}, n) = 1$. Also, since $r_m$ and $r_n$ are least residues modulo $m$ and $n$ respectively, we have $r_m < m$ and $r_n < n$. It follows that $r_m \in S_m$ and $r_n \in S_n$. We conclude that $f(r_{mn}) \in S_m \times S_n$, as required.

Having established that the definition of $f$ makes sense, we proceed to showing that it is one-to-one and onto. We’ll see that this is basically a restatement of the Chinese remainder theorem. Indeed, given any pair $(a_m, a_n) \in S_m \times S_n$, the Chinese remainder theorem provides us with a solution to the system of congruences
\[ x \equiv a_m \pmod{m}, \qquad x \equiv a_n \pmod{n}. \]
Here we have used the assumption that $(m, n) = 1$. But then, we have $x_m = a_m$ and $x_n = a_n$. It follows that
\[ f(x_{mn}) = (x_m, x_n) = (a_m, a_n) \]
so that $f$ is onto as claimed.

We see that the existence part of the Chinese remainder theorem proved that $f$ was onto. The uniqueness part will prove that $f$ is one-to-one. Indeed, if $f(r_{mn}) = f(s_{mn})$, then $r_{mn}$ and $s_{mn}$ are both solutions to the system of congruences
\[ x \equiv r_m \pmod{m}, \qquad x \equiv r_n \pmod{n}. \]
It follows from the Chinese remainder theorem that $r_{mn} \equiv s_{mn} \pmod{mn}$. But this forces $r_{mn} = s_{mn}$ since $r_{mn}$ and $s_{mn}$ both lie between $1$ and $mn$. We conclude that $f$ is one-to-one, as required.

Theorem 19 shows that Euler’s $\varphi$-function is multiplicative. Its values are then completely determined by its values on prime powers. Since these are easily computed, we obtain a general formula for computing $\varphi(n)$ in terms of the prime powers appearing in the prime-power factorization of $n$. We first state a lemma that gives the values of $\varphi$ on prime powers before stating the result for general positive integers $n$.

Lemma 13. Let $p$ be a prime and $e$ be a positive integer. Then
\[ \varphi(p^e) = p^{e-1}(p - 1). \]

Proof.
In order to prove that $\varphi(p^e) = p^{e-1}(p - 1)$, we need to count the number of positive integers not exceeding $p^e$ that are relatively prime to $p^e$. We will do this by subtracting from $p^e$ the number of positive integers not exceeding $p^e$ that possess a nontrivial common factor with $p^e$. Since the integers between $1$ and $p^e$ that possess a nontrivial common factor with $p^e$ are exactly the multiples of $p$, namely
\[ p,\ 2p,\ 3p,\ \ldots,\ p^{e-1}p, \]
we see that there are $p^{e-1}$ of these integers. We conclude that
\[ \varphi(p^e) = p^e - p^{e-1} = p^{e-1}(p - 1) \]
as required.

We have arrived at the general formula for the values of $\varphi$ at positive integers $n$.

Theorem 20. Let $n \in \mathbb{N}$ have the prime-power factorization
\[ n = p_1^{e_1} \cdots p_r^{e_r}, \]
for distinct primes $p_1, \ldots, p_r$ and positive integers $e_1, \ldots, e_r$. We have the formula
\[ \varphi(n) = p_1^{e_1 - 1}(p_1 - 1) \cdots p_r^{e_r - 1}(p_r - 1). \]

Proof. This is a simple consequence of Theorem 19 and Lemma 13. Indeed, from Theorem 19 we conclude that
\[ \varphi(n) = \varphi(p_1^{e_1}) \cdots \varphi(p_r^{e_r}), \tag{10.2} \]
and from Lemma 13 we conclude that for each $1 \leq i \leq r$ we have
\[ \varphi(p_i^{e_i}) = p_i^{e_i - 1}(p_i - 1). \tag{10.3} \]
Putting (10.2) and (10.3) together yields
\[ \varphi(n) = \varphi(p_1^{e_1}) \cdots \varphi(p_r^{e_r}) = p_1^{e_1 - 1}(p_1 - 1) \cdots p_r^{e_r - 1}(p_r - 1) \]
as required.

As a corollary, we note that the formula given in Theorem 20 can be expressed in an alternative way.

Corollary 2. Let $n$ be a positive integer and $p_1, \ldots, p_r$ be the distinct primes appearing in the prime-power factorization of $n$. Then
\[ \varphi(n) = n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_r}\right). \]

Proof. Indeed, if the prime-power factorization of $n$ is given by $n = p_1^{e_1} \cdots p_r^{e_r}$, we can use Theorem 20 to obtain
\[
\begin{aligned}
\varphi(n) &= p_1^{e_1 - 1}(p_1 - 1) \cdots p_r^{e_r - 1}(p_r - 1)\\
&= \left(\frac{p_1^{e_1}}{p_1}\right) \cdots \left(\frac{p_r^{e_r}}{p_r}\right)(p_1 - 1) \cdots (p_r - 1)\\
&= (p_1^{e_1} \cdots p_r^{e_r})\left(\frac{p_1 - 1}{p_1}\right) \cdots \left(\frac{p_r - 1}{p_r}\right)\\
&= n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_r}\right)
\end{aligned}
\]
as required.

We now illustrate Theorem 20 by way of an example.

Example 14. Use Theorem 20 to compute $\varphi(500)$ and $\varphi(588)$.

Solution. We start by decomposing 500 and 588 into their prime-power factorizations.
This gives
\[ 500 = 2^2 \cdot 5^3, \qquad 588 = 2^2 \cdot 3 \cdot 7^2. \]
Applying Theorem 20 yields
\[ \varphi(500) = 2^{2-1}(2 - 1)\,5^{3-1}(5 - 1) = 2 \cdot 25 \cdot 4 = 200, \]
and
\[ \varphi(588) = 2^{2-1}(2 - 1)\,3^{1-1}(3 - 1)\,7^{2-1}(7 - 1) = 2 \cdot 2 \cdot 7 \cdot 6 = 168. \]

We give an example similar to Example 11 that illustrates how one can apply Euler’s Theorem to compute the least residue of large powers modulo arbitrary positive integers.

Example 15. Find the least residue of $5^{1549}$ modulo 588.

Solution. In Example 14 we calculated $\varphi(588) = 168$. Euler’s Theorem then tells us that for $(a, 588) = 1$ we have
\[ a^{168} \equiv 1 \pmod{588}. \]
Since the prime factors of 588 are 2, 3, and 7, we can apply this result for any integer $a$ that is not divisible by any of 2, 3 and 7. Therefore, we find that
\[
\begin{aligned}
5^{1549} &= (5^{168})^9 \cdot 5^{37} && \text{(since $1549 = 9 \cdot 168 + 37$)}\\
&\equiv_{588} 1^9 \cdot 5^{37} && \text{(since $5^{168} \equiv_{588} 1$)}\\
&= (5^4)^9 \cdot 5 && \text{(since $37 = 4 \cdot 9 + 1$)}\\
&= 625^9 \cdot 5 && \text{(since $5^4 = 625$)}\\
&\equiv_{588} 37^9 \cdot 5 && \text{(since $625 \equiv_{588} 37$)}\\
&= (37^2)^4 \cdot 37 \cdot 5 && \text{(since $9 = 2 \cdot 4 + 1$)}\\
&= 1369^4 \cdot 37 \cdot 5 && \text{(since $37^2 = 1369$)}\\
&\equiv_{588} 193^4 \cdot 37 \cdot 5 && \text{(since $1369 \equiv_{588} 193$)}\\
&= (193^2)^2 \cdot 37 \cdot 5 && \text{(since $4 = 2 \cdot 2$)}\\
&= 37249^2 \cdot 37 \cdot 5 && \text{(since $193^2 = 37249$)}\\
&\equiv_{588} 205^2 \cdot 37 \cdot 5 && \text{(since $37249 \equiv_{588} 205$)}\\
&= 42025 \cdot 37 \cdot 5 && \text{(since $205^2 = 42025$)}\\
&\equiv_{588} 277 \cdot 37 \cdot 5 && \text{(since $42025 \equiv_{588} 277$)}\\
&= 10249 \cdot 5 && \text{(since $277 \cdot 37 = 10249$)}\\
&\equiv_{588} 253 \cdot 5 && \text{(since $10249 \equiv_{588} 253$)}\\
&= 1265 && \text{(since $253 \cdot 5 = 1265$)}\\
&\equiv_{588} 89 && \text{(since $1265 \equiv_{588} 89$)}.
\end{aligned}
\]
We conclude that the least residue of $5^{1549}$ modulo 588 is equal to 89.

We conclude this section with a result that we prove using a clever argument due to Gauss.

Theorem 21. For positive integers $n$ we have
\[ \sum_{d \mid n} \varphi(d) = n. \]

Proof. The idea of the proof is to partition the set $\mathbb{N}_n$ of positive integers less than or equal to $n$ into equivalence classes, where two positive integers $d_1$ and $d_2$ less than or equal to $n$ are considered equivalent if they have the same greatest common divisor with $n$. That is, for an integer $d$ with $1 \leq d \leq n$, we define $C_d$ by
\[ C_d = \{1 \leq g \leq n \mid (g, n) = d\}. \]
Since, for any $g$ and $d$, we have $(g, n) = d$ if and only if $d$ divides both $g$ and $n$ and $\left(\frac{g}{d}, \frac{n}{d}\right) = 1$, we conclude that for all positive divisors $d$ of $n$,
\[
\begin{aligned}
\#C_d &= \#\{1 \leq g \leq n \mid (g, n) = d\}\\
&= \#\left\{1 \leq \frac{g}{d} \leq \frac{n}{d} \;\middle|\; \left(\frac{g}{d}, \frac{n}{d}\right) = 1\right\}\\
&= \#\left\{1 \leq h \leq \frac{n}{d} \;\middle|\; \left(h, \frac{n}{d}\right) = 1\right\}\\
&= \varphi\left(\frac{n}{d}\right).
\end{aligned}
\]
Now, we introduce a little bit of notation. If $S$ is a collection of sets, we use the notation $\dot{\bigcup} S$ to denote the disjoint union of the sets in $S$. That is, $\dot{\bigcup} S$ is the set consisting of all elements $x$ that lie in one of the sets in $S$, and the dot on top of the union symbol is there to remind us that the sets in $S$ share no elements in common (are disjoint). Using this notation, we can express the fact that the classes $C_d$ partition the set $\mathbb{N}_n = \{1, 2, 3, \ldots, n\}$ by
\[ \mathbb{N}_n = \dot{\bigcup}\,\{C_d \mid d \geq 1 \text{ and } d \mid n\}. \]
Combining this with $\#C_d = \varphi\left(\frac{n}{d}\right)$ yields
\[
\begin{aligned}
n &= \sum_{j=1}^{n} 1\\
&= \sum_{j \in \mathbb{N}_n} 1\\
&= \sum_{d \mid n} \sum_{j \in C_d} 1\\
&= \sum_{d \mid n} \#C_d\\
&= \sum_{d \mid n} \varphi\left(\frac{n}{d}\right)\\
&= \sum_{\frac{n}{d} \mid n} \varphi(d)\\
&= \sum_{d \mid n} \varphi(d),
\end{aligned}
\]
where the last equality follows from the fact that summing over $d$ instead of $\frac{n}{d}$ changes nothing but the order of the summands. This completes the proof.

Chapter 11 Primitive Roots

This chapter is based on [Dud08, §10]. Given a positive integer $m$, we will denote the set of congruence classes modulo $m$ by $\mathbb{Z}/m\mathbb{Z}$ and the subset of invertible classes by $(\mathbb{Z}/m\mathbb{Z})^\times$. It is common in Abstract Algebra to denote these sets by $\mathbb{Z}_m$ and $\mathbb{Z}_m^\times$, respectively, but we will avoid this notation due to the fact that for primes $p$, and for the purposes of Number Theory, the notation $\mathbb{Z}_p$ is typically reserved for the $p$-adic integers rather than the integers modulo $p$.

For those familiar with abstract algebra, $\mathbb{Z}$ is an integral domain, $m\mathbb{Z}$ is an ideal of $\mathbb{Z}$ and $\mathbb{Z}/m\mathbb{Z}$ is the corresponding quotient ring, which explains the use of the symbol “/”, but for our purposes, we can ignore this inherent algebraic structure and simply consider $\mathbb{Z}/m\mathbb{Z}$ as notation for the integers modulo $m$.
Similarly, $(\mathbb{Z}/m\mathbb{Z})^\times$ is not just a set but is in fact an abelian group under multiplication, but this knowledge is not required in what follows; we can again simply consider $(\mathbb{Z}/m\mathbb{Z})^\times$ as notation, this time for the invertible elements modulo $m$ (those that are relatively prime to $m$).

We know that $(\mathbb{Z}/m\mathbb{Z})^\times$ consists precisely of the congruence classes corresponding to integers that are relatively prime to $m$. Both $\mathbb{Z}/m\mathbb{Z}$ and $(\mathbb{Z}/m\mathbb{Z})^\times$ are finite sets, where the former contains $m$ elements and the latter contains $\varphi(m)$ elements. Now, we know that every element of $(\mathbb{Z}/m\mathbb{Z})^\times$ is invertible modulo $m$. What is shown in this section is that we can obtain the inverse of an invertible element $a$ by raising it to a suitable power. This leads us to the notion of the order of elements, and specifically to the study of primitive roots, which are the elements of largest possible order.

Definition 12. Let $m \in \mathbb{N}$ and $a \in (\mathbb{Z}/m\mathbb{Z})^\times$. The least positive integer $k$ such that $a^k \equiv_m 1$ is called the order of $a$ modulo $m$, denoted $\mathrm{ord}_m(a)$.

Proposition 11. Let $m \in \mathbb{N}$ and $a \in (\mathbb{Z}/m\mathbb{Z})^\times$. The order of $a$ modulo $m$ is well-defined.

Proof. What needs to be shown here is that a least positive $k$ such that $a^k \equiv_m 1$ exists. We will do this by way of the least integer principle. Let $S = \{k \in \mathbb{N} \mid a^k \equiv_m 1\}$. Since the elements in $S$ are all positive, we see that $S$ is bounded below. We complete the proof by showing that $S$ is nonempty and then invoking the least integer principle. To this end, we note that $(\mathbb{Z}/m\mathbb{Z})^\times$ is closed under powers. Indeed, if $a \in (\mathbb{Z}/m\mathbb{Z})^\times$, then $a$ is invertible modulo $m$. Say $ab \equiv_m 1$. If $k$ is any positive integer, it follows that
\[ a^k b^k = (ab)^k \equiv_m 1^k = 1. \]
It follows that $a^k$ is also invertible (with inverse $b^k$). We conclude that
\[ \{a^k \mid k \in \mathbb{N}\} \subseteq (\mathbb{Z}/m\mathbb{Z})^\times. \]
Since $(\mathbb{Z}/m\mathbb{Z})^\times$ contains only $\varphi(m)$ elements, we conclude that the set of powers of $a$ modulo $m$ is finite. Therefore, there must exist distinct positive integers $k < \ell$ such that $a^k \equiv_m a^\ell$.
This implies, after cancelling the invertible factor $a^k$, that $a^{\ell - k} \equiv_m 1$, so that $\ell - k \in S$. We conclude that $S$ is nonempty and then invoke the least integer principle to obtain a least element $k \in S$. But then $k$ is the least positive power of $a$ congruent to $1$ modulo $m$. That is, $\mathrm{ord}_m(a) = k$ exists.

Lemma 14. Let $m, k, \ell \in \mathbb{N}$ and $a \in (\mathbb{Z}/m\mathbb{Z})^\times$. We have
\[ a^k \equiv_m a^\ell \iff k \equiv_{\mathrm{ord}_m(a)} \ell. \]
In particular, we have
\[ a^k \equiv_m 1 \iff \mathrm{ord}_m(a) \mid k. \]

Proof. This follows readily from the division algorithm. Note that the “$\Leftarrow$” direction also holds if we replace $\mathrm{ord}_m(a)$ by $\varphi(m)$, and this formed the basis for our method of using Euler’s theorem to reduce large powers of integers prime to $m$. To prove the result, we proceed as follows. Since we are dealing with invertible elements, we are free to use negative exponents. We can also assume, without loss of generality, that $k \leq \ell$. We then have
\[ a^k \equiv_m a^\ell \iff a^{\ell - k} \equiv_m 1. \]
Since we also have
\[ k \equiv_{\mathrm{ord}_m(a)} \ell \iff \ell - k \equiv_{\mathrm{ord}_m(a)} 0, \]
we are reduced to proving, with $n = \ell - k \geq 0$, that
\[ a^n \equiv_m 1 \iff n \equiv_{\mathrm{ord}_m(a)} 0. \]
We do this by way of the division algorithm. Write $n = \mathrm{ord}_m(a)q + r$ for (unique) integers $q$ and $r$ such that $0 \leq r < \mathrm{ord}_m(a)$. Suppose first that $a^n \equiv_m 1$. We have the following chain of implications:
\[
\begin{aligned}
a^n \equiv_m 1 &\implies a^{\mathrm{ord}_m(a)q + r} \equiv_m 1\\
&\implies \bigl(a^{\mathrm{ord}_m(a)}\bigr)^q a^r \equiv_m 1\\
&\implies 1^q a^r \equiv_m 1\\
&\implies a^r \equiv_m 1\\
&\implies r = 0.
\end{aligned}
\]
Here, the last implication follows from the fact that $\mathrm{ord}_m(a)$ is the smallest positive power of $a$ congruent to $1$ modulo $m$, since we know that $0 \leq r < \mathrm{ord}_m(a)$. Conversely, suppose that $n \equiv_{\mathrm{ord}_m(a)} 0$. We then have an integer $q$ such that $n = \mathrm{ord}_m(a)q$. It follows that
\[ a^n = a^{\mathrm{ord}_m(a)q} = \bigl(a^{\mathrm{ord}_m(a)}\bigr)^q \equiv_m 1^q = 1 \]
as required.

Corollary 3. Let $m \in \mathbb{N}$ and $a \in (\mathbb{Z}/m\mathbb{Z})^\times$. Then $\mathrm{ord}_m(a) \mid \varphi(m)$.

Proof. This is a simple consequence of combining Euler’s theorem with Lemma 14. Indeed, we know from Euler’s theorem that $a^{\varphi(m)} \equiv_m 1$, so that we may invoke Lemma 14 to conclude that $\mathrm{ord}_m(a) \mid \varphi(m)$, as required.

We now know that modulo $m$ every invertible congruence class has order dividing $\varphi(m)$.
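Corollary 3 and Lemma 14 are easy to check numerically for small moduli. The following is a minimal sketch, not part of the original notes; the function names `euler_phi` and `multiplicative_order` are hypothetical helpers chosen for illustration, and both use brute force rather than the formulas of Theorem 20.

```python
from math import gcd

def euler_phi(m):
    """Count the integers 1 <= k <= m with gcd(k, m) = 1 (Definition 11)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def multiplicative_order(a, m):
    """Least positive k with a^k congruent to 1 modulo m (Definition 12)."""
    if gcd(a, m) != 1:
        raise ValueError("a must be invertible modulo m")
    k, power = 1, a % m
    while power != 1 % m:          # 1 % m handles the degenerate modulus m = 1
        power = (power * a) % m
        k += 1
    return k

# Corollary 3: the order of every invertible class divides phi(m).
for m in range(2, 60):
    phi = euler_phi(m)
    for a in range(1, m):
        if gcd(a, m) == 1:
            assert phi % multiplicative_order(a, m) == 0

# Lemma 14 (second statement): a^k is congruent to 1 exactly when ord_m(a) divides k.
m, a = 17, 2
order = multiplicative_order(a, m)
for k in range(1, 40):
    assert (pow(a, k, m) == 1) == (k % order == 0)
```

The loop in `multiplicative_order` terminates because of Proposition 11: some positive power of an invertible element is congruent to $1$.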
It follows that no element of $(\mathbb{Z}/m\mathbb{Z})^\times$ can have order larger than $\varphi(m)$. This leads us to the definition of primitive roots.

Definition 13. Let $m \in \mathbb{N}$. If there exists $g \in (\mathbb{Z}/m\mathbb{Z})^\times$ of order $\varphi(m)$, then $m$ is said to have a primitive root. Any such $g$ is called a primitive root modulo $m$.

Remark 5. There is another way to define primitive roots that warrants mention. Given an invertible congruence class $g$ modulo $m$, denote by $\langle g \rangle$ the set of powers of $g$ modulo $m$. That is,
\[ \langle g \rangle \equiv_m \{1, g, g^2, \ldots, g^{\varphi(m) - 1}\} \subseteq (\mathbb{Z}/m\mathbb{Z})^\times. \tag{11.1} \]
Note that since $g^{\varphi(m)} \equiv_m 1$, this set consists of all of the integral powers of $g$ modulo $m$. The primitive roots modulo $m$ are precisely those $g \in (\mathbb{Z}/m\mathbb{Z})^\times$ for which we have equality rather than simply containment in (11.1). Indeed, since for primitive roots $g$ modulo $m$, $g^{\varphi(m)}$ is the first power of $g$ congruent to $1$ modulo $m$, we see that the elements of $\langle g \rangle = \{1, g, g^2, \ldots, g^{\varphi(m) - 1}\}$ are distinct modulo $m$. This set is therefore a subset of $(\mathbb{Z}/m\mathbb{Z})^\times$ having the same number of elements as $(\mathbb{Z}/m\mathbb{Z})^\times$. It must therefore be equal to the whole of $(\mathbb{Z}/m\mathbb{Z})^\times$. Similarly, for any $a \in (\mathbb{Z}/m\mathbb{Z})^\times$, the order of $a$ modulo $m$ is equal to the number of elements of $\langle a \rangle$. We therefore always have containment in (11.1) and equality in the case of primitive roots.

We now turn to the determination of the moduli possessing primitive roots. The interest in this classification is that if $m$ possesses a primitive root $g$, then we can generate all of the invertible elements modulo $m$ by taking powers of $g$. We start by showing that every prime possesses a primitive root.

Lemma 15. Let $m \in \mathbb{N}$ and $a \in (\mathbb{Z}/m\mathbb{Z})^\times$ have order $t$ modulo $m$. For any $k \in \mathbb{Z}$ we have
\[ \mathrm{ord}_m(a^k) = \frac{t}{(t, k)}. \]
In particular, $a^k$ and $a$ have the same order modulo $m$ if and only if $(t, k) = 1$.

Proof. First of all, since
\[ (a^k)^{\frac{t}{(t,k)}} = (a^t)^{\frac{k}{(t,k)}} \equiv_m 1^{\frac{k}{(t,k)}} = 1, \]
we see that
\[ \mathrm{ord}_m(a^k) \;\Big|\; \frac{t}{(t, k)}. \tag{11.2} \]
On the other hand, we have $(a^k)^{\mathrm{ord}_m(a^k)} \equiv_m 1$ so that
\[ a^{k\,\mathrm{ord}_m(a^k)} \equiv_m 1. \]
It follows that $t \mid k\,\mathrm{ord}_m(a^k)$ since $t$ is the order of $a$ modulo $m$. Dividing by $(t, k)$ yields
\[ \frac{t}{(t, k)} \;\Big|\; \frac{k}{(t, k)}\,\mathrm{ord}_m(a^k). \]
Finally, since $\frac{t}{(t,k)}$ and $\frac{k}{(t,k)}$ are relatively prime, we can conclude that
\[ \frac{t}{(t, k)} \;\Big|\; \mathrm{ord}_m(a^k). \tag{11.3} \]
From (11.2) and (11.3) together with the fact that we are dealing with positive quantities, we can conclude that
\[ \mathrm{ord}_m(a^k) = \frac{t}{(t, k)} \]
as required.

Lemma 16. Let $f(x)$ be a monic (lead coefficient equal to one) polynomial with integer coefficients of degree $n$ and $p$ be a prime. Then $f(x) \equiv_p 0$ has at most $n$ solutions.

Proof. Let $f(x) = x^n + \sum_{j=0}^{n-1} c_j x^j$. We start by showing that for $a \in \mathbb{Z}/p\mathbb{Z}$, $f(a) \equiv_p 0$ if and only if $x - a$ is a factor of $f$ modulo $p$. It is clear that $x - a$ being a factor of $f$ implies that $f(a) \equiv_p 0$. Conversely, suppose that $f(a) \equiv_p 0$. Then
\[ f(x) \equiv_p f(x) - f(a) \equiv_p (x^n - a^n) + \sum_{j=1}^{n-1} c_j (x^j - a^j) \equiv_p (x - a)g(x) \]
for some monic polynomial $g$ of degree $n - 1$. This is due to the fact that $x - a$ is a factor of $x^\ell - a^\ell$ for all $\ell \geq 1$.

The result now follows from a simple induction on $n$. Indeed, if $n = 1$ then $f(x) = x + c_0$ has a single root. Suppose that $n > 1$ and that every monic polynomial of degree at least $1$ and less than $n$ has no more roots modulo $p$ than its degree. If $f$ has degree $n$, then either $f(x) \equiv_p 0$ has no solutions, or it has a solution $a$, in which case we can write $f(x) \equiv_p (x - a)g(x)$ for some monic polynomial $g$ of degree $n - 1$. Since $p$ is prime, any solution of $f(x) \equiv_p 0$ other than $a$ must be a solution of $g(x) \equiv_p 0$. Since $1 \leq n - 1 < n$, we can then invoke the inductive hypothesis to obtain that $g(x) \equiv_p 0$ has at most $n - 1$ solutions. It follows that $f(x) \equiv_p 0$ has at most $n$ solutions, as required.

Lemma 17. Let $p$ be prime and $d$ be a positive divisor of $p - 1$. Then $x^d \equiv_p 1$ has precisely $d$ solutions modulo $p$.

Proof. Let $r$ denote the number of solutions to $x^d \equiv_p 1$. By Lemma 16, we know that there are at most $d$ solutions to $x^d \equiv_p 1$. That is,
\[ r \leq d. \tag{11.4} \]
On the other hand, since $d \mid p - 1$ we can write $p - 1 = de$ for some $e \in \mathbb{N}$ and obtain the factorization
\[ x^{p-1} - 1 = x^{de} - 1 = (x^d - 1) \sum_{j=0}^{e-1} x^{dj}. \]
By Fermat’s little theorem, there are precisely $p - 1$ solutions to $x^{p-1} \equiv_p 1$, and by invoking Lemma 16, we see that $\sum_{j=0}^{e-1} x^{dj} \equiv_p 0$ has at most $d(e - 1) = p - 1 - d$ solutions. Since $p$ is prime, every solution of $x^{p-1} \equiv_p 1$ is a root of one of the two factors, so the number of solutions to $x^d \equiv_p 1$ is at least $(p - 1) - (p - 1 - d) = d$. That is,
\[ r \geq d. \tag{11.5} \]
Putting (11.4) and (11.5) together yields $r = d$, as required.

Theorem 22. Let $p$ be a prime and $d$ be a positive divisor of $p - 1$. Then there are precisely $\varphi(d)$ elements of $(\mathbb{Z}/p\mathbb{Z})^\times$ of order $d$. In particular, there are $\varphi(p - 1)$ primitive roots modulo $p$.

Proof. Let $p$ be prime and consider the partition of $(\mathbb{Z}/p\mathbb{Z})^\times$ associated to the equivalence relation defined by identifying elements having the same order modulo $p$. We then have
\[ (\mathbb{Z}/p\mathbb{Z})^\times = \dot{\bigcup_{d \mid p-1}} \{a \in (\mathbb{Z}/p\mathbb{Z})^\times \mid \mathrm{ord}_p(a) = d\}. \]
Now, for positive divisors $d$ of $p - 1$, let $\psi(d)$ denote the number of elements in $(\mathbb{Z}/p\mathbb{Z})^\times$ that have order $d$ modulo $p$. We then have
\[ p - 1 = \#(\mathbb{Z}/p\mathbb{Z})^\times = \sum_{d \mid p-1} \psi(d). \]
On the other hand, from Theorem 21 we also have
\[ p - 1 = \sum_{d \mid p-1} \varphi(d). \]
We conclude that
\[ \sum_{d \mid p-1} \psi(d) = \sum_{d \mid p-1} \varphi(d). \tag{11.6} \]
If we can show that $\psi(d) \leq \varphi(d)$ for all $d \mid p - 1$, we would then be able to conclude from (11.6) that $\psi(d) = \varphi(d)$ for all $d \mid p - 1$, as required.

Suppose then that $d$ is a positive divisor of $p - 1$. If $\psi(d) = 0$ then $\psi(d) < \varphi(d)$. On the other hand, if $\psi(d) \geq 1$, then there exists an element $a$ of order $d$ modulo $p$. But then, the $d$ integers $1, a, a^2, \ldots, a^{d-1}$ are distinct modulo $p$ (lest $a^k \equiv_p 1$ for some $0 < k < d$) and are roots of $x^d \equiv_p 1$. Since this congruence has only $d$ solutions, we conclude that these powers of $a$ are all of the solutions. But any element of order $d$ must be a root of $x^d \equiv_p 1$ and therefore equal to one of $1, a, \ldots, a^{d-1}$. But we know from Lemma 15 how to pick out the powers of $a$ that have the same order modulo $p$ as $a$: they are the ones having exponent prime to $d$.
Since there are $\varphi(d)$ of these, we conclude that $\psi(d) = \varphi(d)$. In any case, we have shown that $\psi(d) \leq \varphi(d)$ for all positive divisors $d$ of $p - 1$, as required.

At this point, having proved the existence of primitive roots modulo primes, it is natural to wonder if other moduli possess primitive roots. The answer is yes, and a complete classification of such moduli is given by the following theorem.

Theorem 23. Let $m \in \mathbb{N}$. Then $m$ possesses a primitive root if and only if $m = 1, 2, 4, p^k$, or $2p^k$ where $p$ is an odd prime and $k \in \mathbb{N}$. In any case, there are $\varphi(\varphi(m))$ primitive roots when one exists.

We now look at an example that illustrates the utility of the results of this section.

Example 16. Partition $(\mathbb{Z}/17\mathbb{Z})^\times$ into equivalence classes determined by the identification of elements of the same order.

Solution. We could proceed simply by raising each of the integers from 1 to 16 to successively higher powers until we obtain $1$ modulo $17$ in order to classify the elements of $(\mathbb{Z}/17\mathbb{Z})^\times$ according to their orders, but in order to get practice with the results of this section, we will go about matters differently. Since $\varphi(17) = 16 = 2^4$, and Corollary 3 implies that the only possible orders for elements of $(\mathbb{Z}/17\mathbb{Z})^\times$ are divisors of $\varphi(17)$, we see that the only possible orders for elements of $(\mathbb{Z}/17\mathbb{Z})^\times$ are 1, 2, 4, 8, 16. At this point we could compute $x^a \pmod{17}$ for all $x \in (\mathbb{Z}/17\mathbb{Z})^\times$ and $a \in \{1, 2, 4, 8, 16\}$ by using increasing values for $a$ until we first obtain $1$ modulo $17$ to determine the orders of the elements in $(\mathbb{Z}/17\mathbb{Z})^\times$. This would reduce the workload a little since we have restricted the exponents that we need to test, but we will continue on examining how to apply the results of this section. We know, by Theorem 22, that for each divisor $d$ of 16 there are precisely $\varphi(d)$ elements of $(\mathbb{Z}/17\mathbb{Z})^\times$ of order $d$.
The elements of $(\mathbb{Z}/17\mathbb{Z})^\times$ are therefore split up as follows:
\[
\begin{array}{c|c}
\text{Order } d & \text{Number of elements of } (\mathbb{Z}/17\mathbb{Z})^\times \text{ of order } d\\
\hline
1 & \varphi(1) = 1\\
2 & \varphi(2) = 1\\
4 & \varphi(4) = 2\\
8 & \varphi(8) = 4\\
16 & \varphi(16) = 8
\end{array}
\]
We also know from Lemma 15 how to determine the orders of powers of an element once we know the order of the element itself. In particular, once a primitive root is found, we can apply Lemma 15 to immediately identify all elements of $(\mathbb{Z}/17\mathbb{Z})^\times$ of order $d$ for all $d \mid 16$. In searching for a primitive root, we need only find an element of $(\mathbb{Z}/17\mathbb{Z})^\times$ whose eighth power is not congruent to $1$ modulo $17$. This is due to Lemma 14, which tells us that the order of any element $a$ divides every exponent $n$ for which $a^n \equiv 1 \pmod{17}$. Now, we compute
\[ 3^8 = (3^4)^2 = (81)^2 \equiv_{17} (-4)^2 = 16 \not\equiv 1 \pmod{17}. \]
We conclude that $3$ is a primitive root modulo $17$. We then invoke Lemma 15 to conclude that for $1 \leq a \leq 16$, the order of $3^a$ modulo $17$ is equal to $\frac{16}{\gcd(a, 16)}$. The characterization is then given by:
\[
\begin{array}{c|c}
\text{Order } d & \text{Elements of } (\mathbb{Z}/17\mathbb{Z})^\times \text{ of order } d\\
\hline
1 & 3^{16}\\
2 & 3^{8}\\
4 & 3^{4},\ 3^{12}\\
8 & 3^{2},\ 3^{6},\ 3^{10},\ 3^{14}\\
16 & 3^{1},\ 3^{3},\ 3^{5},\ 3^{7},\ 3^{9},\ 3^{11},\ 3^{13},\ 3^{15}
\end{array}
\]
Reducing these powers of $3$ modulo $17$ yields the partition
\[ (\mathbb{Z}/17\mathbb{Z})^\times = \{1\} \mathbin{\dot{\cup}} \{16\} \mathbin{\dot{\cup}} \{4, 13\} \mathbin{\dot{\cup}} \{2, 8, 9, 15\} \mathbin{\dot{\cup}} \{3, 5, 6, 7, 10, 11, 12, 14\} \]
where we have listed the sets by increasing order of their elements.

Recall that in the proof of Fermat’s Little Theorem, we came across the product
\[ \prod_{a \in (\mathbb{Z}/p\mathbb{Z})^\times} a \equiv_p \prod_{j=1}^{p-1} j = (p - 1)! \]
and that we cancelled this factor from both sides of a particular congruence to obtain our desired result. We subsequently proved Wilson’s Theorem, thereby determining the value of this product modulo $p$. That is, we proved that
\[ \prod_{a \in (\mathbb{Z}/p\mathbb{Z})^\times} a \equiv_p -1. \]
In proving Euler’s generalization of Fermat’s Little Theorem, we came across the analogous product
\[ \prod_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} a \equiv_m \prod_{\substack{1 \leq j \leq m\\ (j, m) = 1}} j \]
and cancelled this factor from both sides of a particular congruence to obtain our desired result. The question arises as to the value of this product modulo $m$.
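One way to get a feel for this question is to compute the product for a few small moduli. The sketch below is an illustration only and is not part of the original notes; the function name `unit_product` is a hypothetical helper, and the computation is plain brute force over the least residues.

```python
from math import gcd

def unit_product(m):
    """Product of all j with 1 <= j <= m and gcd(j, m) = 1, reduced modulo m."""
    product = 1
    for j in range(1, m + 1):
        if gcd(j, m) == 1:
            product = (product * j) % m
    return product

# Report the product as -1 or +1 whenever it is congruent to one of them.
# For m = 2 the two labels coincide, since -1 and 1 are congruent modulo 2.
for m in range(2, 21):
    value = unit_product(m)
    if value == m - 1:
        label = "-1"
    elif value == 1:
        label = "+1"
    else:
        label = str(value)
    print(m, label)
```

For a prime modulus the result agrees with Wilson’s Theorem; for instance `unit_product(17)` returns 16, which is $-1$ modulo $17$.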
One can show by combining Theorem 23 with the Chinese Remainder Theorem and a "singular" version of Hensel's Lemma that this product is always congruent to 1 or $-1$ modulo $m$ and that we obtain $-1$ precisely when $m$ possesses primitive roots. Here we will content ourselves with the partial answer provided by the following proposition.

Proposition 12. Let $m \in \mathbb{N}$ possess primitive roots. Then

(a) We have $x^2 \equiv_m 1$ if and only if $x \equiv_m \pm 1$.

(b) If $m \ge 3$ then $-1$ is the unique element of $(\mathbb{Z}/m\mathbb{Z})^\times$ of order 2.

(c) We have $\prod_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} a \equiv_m -1$.

Proof. Suppose that $m$ possesses a primitive root $g$ so that
\[
(\mathbb{Z}/m\mathbb{Z})^\times = \langle g \rangle = \{1, g, g^2, \ldots, g^{\phi(m)-1}\}.
\]

(a), (b) To prove (a) and (b) it is enough to verify (b), since the solutions of $x^2 \equiv_m 1$ are exactly the elements of order 1 or 2. To this end, we note that the elements of order 2 are the powers $g^a$ $(0 \le a < \phi(m))$ for which $(a, \phi(m)) = \frac{\phi(m)}{2}$. We would then require $a$ to be an odd multiple of $\phi(m)/2$ lying between 0 and $\phi(m) - 1$. The only possibility is given by $a = \phi(m)/2$. We conclude that there is precisely one element of $(\mathbb{Z}/m\mathbb{Z})^\times$ of order two, namely $g^{\phi(m)/2}$. Since $-1$ is clearly of order two, we must have $g^{\phi(m)/2} \equiv_m -1$.

(c) Using the same argument as was used to prove Wilson's Theorem, one can show that for any $m$
\[
\prod_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} a \equiv_m \prod_{\substack{a \in (\mathbb{Z}/m\mathbb{Z})^\times \\ \operatorname{ord}_m(a) = 2}} a.
\]
Indeed, we can pair off each of the invertible elements modulo $m$ with its inverse to obtain a product of 1 as long as the element in question is not its own inverse. In our particular case, there is only one element of order two, namely $-1$, and so this product is congruent to $-1$ modulo $m$, as required.

Chapter 12 Quadratic Congruences

This chapter is based on [Dud08, §11]. In this section we study quadratic congruences modulo odd primes $p$. That is, we study the solutions to congruences of the form $f(x) \equiv_p 0$ where $p$ is an odd prime and $f$ is a polynomial of degree two that has integer coefficients. We first reduce our study to the study of congruences of the form $x^2 \equiv_p a$. Write $f(x) = ax^2 + bx + c$ $(a, b, c \in \mathbb{Z})$.
If $(a, p) \ne 1$, then the congruence $f(x) \equiv_p 0$ reduces to the linear congruence $bx + c \equiv_p 0$ which we already studied in some depth. We can therefore assume that $(a, p) = 1$ so that $a \in (\mathbb{Z}/p\mathbb{Z})^\times$. Further, by multiplying by the inverse of $a$ modulo $p$ if necessary, we may suppose that $f$ is monic (has lead coefficient 1). We have therefore arrived at the study of congruences of the form
\[
x^2 + bx + c \equiv_p 0. \tag{12.1}
\]
The next simplification comes from completing the square in (12.1). If $b$ is odd, we may replace $b$ by $b + p$ which is even and congruent to $b$ modulo $p$. Therefore, we may suppose that $b$ is even so that $b = 2d$ for some integer $d$. But then, we can rewrite (12.1) as
\[
x^2 + 2dx + c = (x + d)^2 + (c - d^2) \equiv_p 0.
\]
This completes the reduction since being able to solve $x^2 \equiv_p d^2 - c$ is equivalent to being able to solve $(x + d)^2 + (c - d^2) \equiv_p 0$: the solutions of one are simply translates of the solutions of the other. We illustrate what has been done so far with an example.

Example 17. Find all solutions to $3x^2 + 4x + 2 \equiv_{11} 0$.

Solution. We start by multiplying by 4 which is the inverse of 3 modulo 11. This gives the congruence
\[
12x^2 + 16x + 8 \equiv_{11} 0 \iff x^2 + 5x + 8 \equiv_{11} 0.
\]
The next step is to prepare for completing the square by replacing 5 with $5 + 11 = 16$ so that the coefficient of $x$ is even. This gives
\[
x^2 + 16x + 8 \equiv_{11} 0.
\]
We now complete the square to obtain
\[
(x + 8)^2 + (8 - 64) \equiv_{11} 0 \iff (x + 8)^2 \equiv_{11} 1.
\]
We have therefore simplified our congruence to one of the form $y^2 \equiv_{11} 1$. Since this congruence has solutions $y \equiv_{11} 1$ and $y \equiv_{11} -1$, we obtain two solutions to our congruence determined by
\[
x + 8 \equiv_{11} 1, \qquad x + 8 \equiv_{11} -1.
\]
The two solutions to our congruence $3x^2 + 4x + 2 \equiv_{11} 0$ are then $x \equiv_{11} -7 \equiv_{11} 4$ and $x \equiv_{11} -9 \equiv_{11} 2$.

We now turn to studying the congruence $x^2 \equiv_p a$ for an odd prime $p$ and arbitrary integer $a$. We first note that there are at most two solutions since we are dealing with a monic quadratic polynomial modulo a prime.
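The reduction carried out above (make the polynomial monic, make the linear coefficient even, complete the square) can be sketched in code. This is an illustrative sketch only: the function name `solve_quadratic_mod_p` is hypothetical, the square roots of $d^2 - c$ are found by brute force rather than by any method from these notes, and `pow(a, -1, p)` assumes Python 3.8 or later.

```python
def solve_quadratic_mod_p(a, b, c, p):
    """Solve a*x^2 + b*x + c = 0 (mod p) for an odd prime p not dividing a."""
    a_inv = pow(a, -1, p)        # inverse of a modulo p (Python 3.8+)
    b = (b * a_inv) % p          # make the polynomial monic
    c = (c * a_inv) % p
    if b % 2 == 1:               # replace b by b + p so that b is even
        b += p
    d = b // 2                   # now x^2 + 2dx + c = (x + d)^2 + (c - d^2)
    target = (d * d - c) % p     # solve y^2 = d^2 - c, then translate x = y - d
    roots = [y for y in range(p) if (y * y) % p == target]
    return sorted((y - d) % p for y in roots)

# Example 17 from the text: 3x^2 + 4x + 2 = 0 (mod 11).
print(solve_quadratic_mod_p(3, 4, 2, 11))
```

For Example 17 the intermediate values are the ones found by hand: the monic form has $b = 5$, which becomes 16, so $d = 8$ and the target is $64 - 8 \equiv_{11} 1$.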
We can say more, however, as is shown by the following proposition.

Proposition 13. Let $p$ be an odd prime and $a \in \mathbb{Z}$. The congruence $x^2 \equiv_p a$ has the unique solution $x \equiv_p 0$ in case $p \mid a$ and has either zero or two solutions otherwise.

Proof. It is clear that if $p \mid a$ then we obtain the unique solution $x \equiv_p 0$. On the other hand, if $p \nmid a$ and $b^2 \equiv_p a$ for some $b$, we also have $(-b)^2 \equiv_p a$ and $b \not\equiv_p -b$. Here we have used the fact that $p$ is odd and $p \nmid b$. We therefore obtain two distinct solutions modulo $p$ if one exists at all.

The congruence $x^2 \equiv_p a$ has a solution for exactly half of the elements $a \in (\mathbb{Z}/p\mathbb{Z})^\times$. In fact, we can distinguish the squares from the non-squares by use of Euler's criterion. Before stating this, we require the following definition.

Definition 14. Let $m \in \mathbb{N}$ and $a \in \mathbb{Z}$. If $x^2 \equiv_m a$ has a solution then we call $a$ a quadratic residue modulo $m$. Otherwise, $a$ is referred to as a quadratic non-residue modulo $m$.

We are now ready to state Euler's criterion.

Theorem 24. Let $p$ be an odd prime.

(a) Exactly half of the invertible elements modulo $p$ are quadratic residues.

(b) For all $a \in (\mathbb{Z}/p\mathbb{Z})^\times$, we have $a^{\frac{p-1}{2}} \equiv_p \pm 1$.

(c) (Euler's criterion) For $a \in (\mathbb{Z}/p\mathbb{Z})^\times$,
\[
a^{\frac{p-1}{2}} \equiv_p 1 \iff a \text{ is a quadratic residue modulo } p.
\]

Proof. Let $g$ be a primitive root modulo $p$ so that $(\mathbb{Z}/p\mathbb{Z})^\times = \{1, g, \ldots, g^{p-2}\}$.

(a) We show that the powers $1, g^2, g^4, \ldots, g^{p-3}$ of $g$ having even exponents are the quadratic residues modulo $p$ and that the powers $g, g^3, g^5, \ldots, g^{p-2}$ of $g$ having odd exponents are the quadratic non-residues modulo $p$. This implies that precisely half of the invertible elements modulo $p$ are quadratic residues. Let $a \equiv_p g^k \in (\mathbb{Z}/p\mathbb{Z})^\times$, where $0 \le k < p - 1$. We need to prove that $a$ is a quadratic residue modulo $p$ if and only if $k$ is even. Suppose first that $a$ is a quadratic residue. Then, there exists $b \in (\mathbb{Z}/p\mathbb{Z})^\times$ such that $a \equiv_p b^2$. But then, we can write $b \equiv_p g^\ell$ for some $0 \le \ell < p - 1$ so that
\[
g^k \equiv_p a \equiv_p b^2 \equiv_p (g^\ell)^2 \equiv_p g^{2\ell}.
\]
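We conclude that $k \equiv_{p-1} 2\ell$ so that, since $p - 1$ is even, $k \equiv_2 0$, as required. Conversely, if $k = 2\ell$ is even then
\[
a \equiv_p g^k \equiv_p g^{2\ell} \equiv_p (g^\ell)^2
\]
so that $a$ is a quadratic residue, as required.

(b) Let $a \in (\mathbb{Z}/p\mathbb{Z})^\times$. Since
\[
\left(a^{\frac{p-1}{2}}\right)^2 \equiv_p a^{p-1} \equiv_p 1,
\]
we see that $a^{\frac{p-1}{2}}$ satisfies $x^2 \equiv_p 1$. Since the only solutions to this congruence are 1 and $-1$ modulo $p$ we conclude that
\[
a^{\frac{p-1}{2}} \equiv_p \pm 1
\]
as required.

(c) Let $a \in (\mathbb{Z}/p\mathbb{Z})^\times$, and suppose that $a \equiv_p g^k$ where $0 \le k < p - 1$. From Part (b) we know that $a^{\frac{p-1}{2}} \equiv_p \pm 1$. What we need to prove, taking the proof of Part (a) into consideration, is that
\[
g^{k\left(\frac{p-1}{2}\right)} \equiv_p 1 \iff k \text{ is even}.
\]
We know that $g^{\frac{p-1}{2}} \equiv_p -1$ is the unique element of $(\mathbb{Z}/p\mathbb{Z})^\times$ of order two and that $g^{p-1} \equiv_p 1$ is the unique element of $(\mathbb{Z}/p\mathbb{Z})^\times$ of order one. We are therefore reduced to showing that
\[
\operatorname{ord}_p\left(g^{k\left(\frac{p-1}{2}\right)}\right) =
\begin{cases}
1 & \text{if } k \text{ is even;} \\
2 & \text{if } k \text{ is odd.}
\end{cases}
\]
But this follows readily since
\[
\operatorname{ord}_p\left(g^{k\left(\frac{p-1}{2}\right)}\right)
= \frac{p-1}{\left(p-1,\ k\,\frac{p-1}{2}\right)}
= \frac{p-1}{\frac{p-1}{2}\,(2, k)}
= \frac{2}{(2, k)}
= \begin{cases}
1 & \text{if } k \text{ is even;} \\
2 & \text{if } k \text{ is odd.}
\end{cases}
\]

We now illustrate what has been done so far with an example.

Example 18. Distinguish the quadratic residues modulo 17 from the quadratic non-residues modulo 17 by direct computation of the values modulo 17 taken on by squares. Show that this agrees with what is obtained by way of Euler's Criterion.

Solution. The direct computation yields:
\[
\begin{aligned}
1^2 &\equiv_{17} 16^2 \equiv_{17} 1 \\
2^2 &\equiv_{17} 15^2 \equiv_{17} 4 \\
3^2 &\equiv_{17} 14^2 \equiv_{17} 9 \\
4^2 &\equiv_{17} 13^2 \equiv_{17} 16 \\
5^2 &\equiv_{17} 12^2 \equiv_{17} 8 \\
6^2 &\equiv_{17} 11^2 \equiv_{17} 2 \\
7^2 &\equiv_{17} 10^2 \equiv_{17} 15 \\
8^2 &\equiv_{17} 9^2 \equiv_{17} 13
\end{aligned}
\]
We conclude that the quadratic residues modulo 17 are 1, 2, 4, 8, 9, 13, 15 and 16 and the quadratic non-residues modulo 17 are 3, 5, 6, 7, 10, 11, 12 and 14. We now turn to Euler's criterion.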
In our case $(p - 1)/2 = 8$ and we compute
\[
\begin{aligned}
1^8 &\equiv_{17} 16^8 \equiv_{17} 1 \\
2^8 &\equiv_{17} 15^8 \equiv_{17} 4^4 \equiv_{17} (4^2)^2 \equiv_{17} 16^2 \equiv_{17} 1 \\
3^8 &\equiv_{17} 14^8 \equiv_{17} 9^4 \equiv_{17} (9^2)^2 \equiv_{17} 13^2 \equiv_{17} 16 \equiv_{17} -1 \\
4^8 &\equiv_{17} 13^8 \equiv_{17} 16^4 \equiv_{17} 1 \\
5^8 &\equiv_{17} 12^8 \equiv_{17} 8^4 \equiv_{17} (8^2)^2 \equiv_{17} 13^2 \equiv_{17} 16 \equiv_{17} -1 \\
6^8 &\equiv_{17} 11^8 \equiv_{17} 2^4 \equiv_{17} 16 \equiv_{17} -1 \\
7^8 &\equiv_{17} 10^8 \equiv_{17} 15^4 \equiv_{17} (15^2)^2 \equiv_{17} 4^2 \equiv_{17} 16 \equiv_{17} -1 \\
8^8 &\equiv_{17} 9^8 \equiv_{17} 13^4 \equiv_{17} (13^2)^2 \equiv_{17} 16^2 \equiv_{17} 1
\end{aligned}
\]
We note that this agrees with the answer obtained by direct computation since we obtain 1 for 1, 2, 4, 8, 9, 13, 15 and 16 and $-1$ otherwise.

We now define the Legendre symbol which provides us with a convenient notation for distinguishing quadratic residues from quadratic non-residues modulo odd primes.

Definition 15 (Legendre symbol). Let $p$ be an odd prime and $a \in \mathbb{Z}$ be relatively prime to $p$. We define the Legendre symbol $(a/p)$ by
\[
\left(\frac{a}{p}\right) =
\begin{cases}
1 & \text{if } a \text{ is a quadratic residue modulo } p; \\
-1 & \text{if } a \text{ is a quadratic non-residue modulo } p.
\end{cases}
\]

By Euler's criterion, we know that for $(a, p) = 1$ we have
\[
a^{\frac{p-1}{2}} \equiv_p
\begin{cases}
1 & \text{if } a \text{ is a quadratic residue modulo } p; \\
-1 & \text{if } a \text{ is a quadratic non-residue modulo } p.
\end{cases}
\]
We can therefore re-write Euler's criterion using the notation of Definition 15 as
\[
a^{\frac{p-1}{2}} \equiv_p \left(\frac{a}{p}\right). \tag{12.2}
\]
The following theorem lists some of the properties of the Legendre symbol.

Theorem 25. Let $p$ be an odd prime and $a, b \in \mathbb{Z}$ be relatively prime to $p$. Then

(a) if $a \equiv_p b$ then $(a/p) = (b/p)$;

(b) $(a^2/p) = 1$;

(c) $(ab/p) = (a/p)(b/p)$;

(d) $(-1/p) = \begin{cases} 1 & \text{if } p \equiv_4 1; \\ -1 & \text{if } p \equiv_4 3. \end{cases}$

Proof. Since $p$ is an odd prime, we have $-1 \not\equiv_p 1$. Therefore, in order to verify an equality of Legendre symbols, it is enough to verify the corresponding congruence modulo $p$. Noting this, each of (a), (b), (c) and (d) follows readily from re-writing the statement using (12.2). Indeed, the statements become:

(a) if $a \equiv_p b$ then $a^{\frac{p-1}{2}} \equiv_p b^{\frac{p-1}{2}}$;

(b) $(a^2)^{\frac{p-1}{2}} \equiv_p 1$;

(c) $(ab)^{\frac{p-1}{2}} \equiv_p a^{\frac{p-1}{2}} b^{\frac{p-1}{2}}$;

(d) $(-1)^{\frac{p-1}{2}} \equiv_p \begin{cases} 1 & \text{if } p \equiv_4 1; \\ -1 & \text{if } p \equiv_4 3. \end{cases}$

Each of these parts is clear except perhaps Part (d).
But Part (d) merely expresses the fact that $\frac{p-1}{2}$ is even if $p \equiv_4 1$ and odd if $p \equiv_4 3$.

Part (d) of this theorem tells us that $(\mathbb{Z}/p\mathbb{Z})^\times$ contains a square root of $-1$ if and only if $p \equiv_4 1$. We will see that combining Theorem 25 with the law of quadratic reciprocity allows for efficient computation of Legendre symbols. The law of quadratic reciprocity relates the Legendre symbols $(p/q)$ and $(q/p)$ for distinct odd primes $p$ and $q$. It says that unless both of $p$ and $q$ are congruent to 3 modulo 4, $p$ is a quadratic residue modulo $q$ if and only if $q$ is a quadratic residue modulo $p$. In case $p \equiv_4 q \equiv_4 3$, precisely one of $p$, $q$ is a quadratic residue modulo the other. The precise statement of this celebrated theorem is given below.

Theorem 26 (The Law of Quadratic Reciprocity). Let $p$ and $q$ be distinct odd primes. Then
\[
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{(p-1)(q-1)}{4}}.
\]
That is,
\[
\left(\frac{p}{q}\right) = (-1)^{\frac{(p-1)(q-1)}{4}} \left(\frac{q}{p}\right) =
\begin{cases}
-\left(\frac{q}{p}\right) & \text{if } p \equiv_4 q \equiv_4 3; \\
\left(\frac{q}{p}\right) & \text{otherwise.}
\end{cases}
\]

We will prove this theorem in the next section. We will also prove the supplementary result that classifies the odd primes $p$ for which 2 is a quadratic residue. The answer is given by the following theorem.

Theorem 27. Let $p$ be an odd prime. Then
\[
\left(\frac{2}{p}\right) =
\begin{cases}
1 & \text{if } p \equiv_8 \pm 1; \\
-1 & \text{if } p \equiv_8 \pm 3.
\end{cases}
\]

In general, for $(a, p) = 1$, $(a/p)$ is completely determined by the value of $p$ modulo $4|a|$. Theorem 27 illustrates this for the case $a = 2$. To close this section, we provide some examples that illustrate the utility of combining Theorems 26 and 27 with Theorem 25 to compute Legendre symbols.

Example 19. Determine whether or not 5335 is a quadratic residue modulo 8209.

Solution. Since $5335 = 5 \cdot 11 \cdot 97$ and 8209 is an odd prime not dividing 5335, we can determine whether or not 5335 is a quadratic residue modulo 8209 by computing the corresponding Legendre symbol $(5335/8209)$. We note for future reference that $5 \equiv_4 1$, $11 \equiv_4 3$, $97 \equiv_4 1$ and $8209 \equiv_4 1$.
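We compute
\[
\begin{aligned}
\left(\frac{5335}{8209}\right)
&= \left(\frac{5 \cdot 11 \cdot 97}{8209}\right) && \text{(Since } 5335 = 5 \cdot 11 \cdot 97\text{)} \\
&= \left(\frac{5}{8209}\right)\left(\frac{11}{8209}\right)\left(\frac{97}{8209}\right) && \text{(By Theorem 25 Part (c))} \\
&= \left(\frac{8209}{5}\right)\left(\frac{8209}{11}\right)\left(\frac{8209}{97}\right) && \text{(By Theorem 26)} \\
&= \left(\frac{4}{5}\right)\left(\frac{3}{11}\right)\left(\frac{61}{97}\right) && \text{(By Theorem 25 Part (a))} \\
&= \left(\frac{2^2}{5}\right)\left(\frac{3}{11}\right)\left(\frac{61}{97}\right) \\
&= (1)\left(\frac{3}{11}\right)\left(\frac{61}{97}\right) && \text{(By Theorem 25 Part (b))} \\
&= \left(\frac{3}{11}\right)\left(\frac{61}{97}\right) \\
&= -\left(\frac{11}{3}\right)\left(\frac{97}{61}\right) && \text{(By Theorem 26)} \\
&= -\left(\frac{2}{3}\right)\left(\frac{36}{61}\right) && \text{(By Theorem 25 Part (a))} \\
&= -(-1)\left(\frac{6^2}{61}\right) && \text{(By Theorem 27 or Theorem 25 Part (d))} \\
&= (1)(1) && \text{(By Theorem 25 Part (b))} \\
&= 1.
\end{aligned}
\]
We conclude that 5335 is a quadratic residue modulo 8209. One can verify that in fact $x^2 \equiv_{8209} 5335$ has the solutions $x \equiv_{8209} \pm 1315$.

Example 20. Determine the value of $\left(\frac{3}{p}\right)$ where $p$ is an odd prime greater than or equal to 5.

Solution. By Theorem 26 we have
\[
\left(\frac{3}{p}\right) = (-1)^{\frac{(3-1)(p-1)}{4}}\left(\frac{p}{3}\right) = (-1)^{\frac{p-1}{2}}\left(\frac{p}{3}\right) = \left(\frac{p}{3}\right) \cdot
\begin{cases}
1 & \text{if } p \equiv_4 1; \\
-1 & \text{if } p \equiv_4 -1.
\end{cases}
\]
But also, we have
\[
\left(\frac{p}{3}\right) =
\begin{cases}
1 & \text{if } p \equiv_3 1; \\
-1 & \text{if } p \equiv_3 -1,
\end{cases}
\]
since the only quadratic residue in $(\mathbb{Z}/3\mathbb{Z})^\times$ is 1. Putting these together yields
\[
\begin{aligned}
\left(\frac{3}{p}\right)
&= \left(\begin{cases} 1 & \text{if } p \equiv_3 1; \\ -1 & \text{if } p \equiv_3 -1 \end{cases}\right)
   \left(\begin{cases} 1 & \text{if } p \equiv_4 1; \\ -1 & \text{if } p \equiv_4 -1 \end{cases}\right) \\
&= \begin{cases}
1 & \text{if } (p \equiv_3 1 \text{ and } p \equiv_4 1) \text{ or } (p \equiv_3 -1 \text{ and } p \equiv_4 -1); \\
-1 & \text{if } (p \equiv_3 1 \text{ and } p \equiv_4 -1) \text{ or } (p \equiv_3 -1 \text{ and } p \equiv_4 1).
\end{cases}
\end{aligned}
\]
However, an application of the Chinese Remainder Theorem shows that
\[
\begin{aligned}
p \equiv_3 1 \text{ and } p \equiv_4 1 &\iff p \equiv_{12} 1, &
p \equiv_3 -1 \text{ and } p \equiv_4 -1 &\iff p \equiv_{12} -1, \\
p \equiv_3 1 \text{ and } p \equiv_4 -1 &\iff p \equiv_{12} -5, &
p \equiv_3 -1 \text{ and } p \equiv_4 1 &\iff p \equiv_{12} 5.
\end{aligned}
\]
We conclude that
\[
\left(\frac{3}{p}\right) =
\begin{cases}
1 & \text{if } p \equiv_{12} \pm 1; \\
-1 & \text{if } p \equiv_{12} \pm 5.
\end{cases}
\]

Example 21. Determine whether or not the congruence $x^2 \equiv_{159} 211$ has solutions. If it has solutions, find them all.

Solution. We note that $159 = 3 \cdot 53$ and so in order to apply the law of quadratic reciprocity, we must answer this question modulo 3 and 53 and then apply the Chinese Remainder Theorem to complete the solution. We start with the congruences $x^2 \equiv_3 211$ and $x^2 \equiv_{53} 211$. We compute
\[
\left(\frac{211}{3}\right) = \left(\frac{1}{3}\right) = 1,
\]
and
\[
\left(\frac{211}{53}\right) = \left(\frac{-1}{53}\right) = 1.
\]
Here we have used the fact that $53 \equiv_4 1$. We conclude that the congruence in question has solutions modulo both 3 and 53 and therefore has solutions modulo 159. In fact
\[
x^2 \equiv_3 211 \equiv_3 1 \iff x \equiv_3 \pm 1,
\]
and
\[
x^2 \equiv_{53} 211 \equiv_{53} -1 \equiv_{53} 529 = 23^2 \iff x \equiv_{53} \pm 23.
\]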
Applying the Chinese Remainder Theorem yields the solutions $x \equiv_{159} \pm 23, \pm 76$.

Chapter 13 Quadratic Reciprocity

This chapter is based on [Dud08, §12]. In this section we prove Gauss' law of Quadratic Reciprocity. As of 2013, there are 246 known proofs of this fundamental result. References to each of these proofs can be found at http://www.rzuser.uni-heidelberg.de/~hb3/rchrono.html. We give Gauss' third proof here, following the exposition given in the textbook. It will be convenient to set some notation before proceeding to the proof.

Notation 6. Let $p$ be an odd prime. We let $\ell_p$ denote the least residue function defined by
\[
\ell_p(n) = \text{the least residue of } n \text{ modulo } p \qquad (n \in \mathbb{Z}).
\]
Let $a$ be an integer relatively prime to $p$. Throughout this section we use the following notation:
\[
\begin{aligned}
L_p(a) &= \{\ell_p(ka) \mid 1 \le k \le (p-1)/2\}, \\
L_p^{>}(a) &= \{x \in L_p(a) \mid x > (p-1)/2\}, \\
L_p^{\le}(a) &= \{x \in L_p(a) \mid x \le (p-1)/2\}, \\
S_p(a) &= \sum_{k=1}^{(p-1)/2} \left\lfloor \frac{ka}{p} \right\rfloor.
\end{aligned}
\]
Here, for a real number $x$, $\lfloor x \rfloor$ denotes the floor of $x$, equal to the greatest integer less than or equal to $x$. We also use the notation $\{x\}$ to denote the fractional part of $x$, equal to $x - \lfloor x \rfloor$. Recall that the division algorithm for dividing $ka$ by $p$ with remainder can be written as
\[
ka = p\left\lfloor \frac{ka}{p} \right\rfloor + \ell_p(ka).
\]
We then have
\[
\ell_p(ka) = p\left\{ \frac{ka}{p} \right\}.
\]

The first result of this section tells us that when we multiply $1, \ldots, (p-1)/2$ by $a$, the numbers between 1 and $(p-1)/2$ that do not occur as a least residue of one of the multiples in question are covered by subtracting from $p$ the least residues of the multiples that are greater than $(p-1)/2$.

Lemma 18. Let $p$ be an odd prime and $a$ be an integer relatively prime to $p$. Then
\[
\{1, 2, \ldots, (p-1)/2\} = L_p^{\le}(a) \mathbin{\dot{\cup}} (p - L_p^{>}(a)),
\]
where $p - L_p^{>}(a)$ denotes the set of all $p - x$ for $x \in L_p^{>}(a)$.

Proof. We need only show that $L_p^{\le}(a) \cap (p - L_p^{>}(a)) = \emptyset$.
Indeed, since multiplication by $a$ is known to permute the invertible elements modulo $p$, we would then be able to conclude that $L_p^{\le}(a) \mathbin{\dot{\cup}} (p - L_p^{>}(a))$ is a subset of $\{1, 2, \ldots, (p-1)/2\}$ containing $(p-1)/2$ elements. We would then obtain the equality we are after. Suppose then that for some $1 \le k, \ell \le (p-1)/2$ we have $ka \equiv_p p - \ell a$. We would then have $(k + \ell)a \equiv_p 0$ so that, since $p$ is prime, $k + \ell \equiv_p 0$ or $a \equiv_p 0$. Since we know that $a \not\equiv_p 0$, this implies that $k + \ell \equiv_p 0$. However, $1 < k + \ell < p$ and so this is impossible. We conclude by contradiction that the union is disjoint, as required.

Now, Lemma 18 provides us with two distinct representations of the same set of invertible elements modulo $p$. We take our cue from the proofs of Fermat's Little Theorem and Euler's Theorem and multiply together the elements of the set in question and cancel a particular factor to obtain a significant congruence. We obtain Gauss' Lemma as a result.

Theorem 28 (Gauss' Lemma). Let $p$ be an odd prime and $a$ be an integer relatively prime to $p$. Then, we have
\[
\left(\frac{a}{p}\right) = (-1)^{\#L_p^{>}(a)}.
\]
That is, $a$ is a quadratic residue modulo $p$ if and only if $\#L_p^{>}(a)$ is even.

Proof. To prove Gauss' Lemma, we use a familiar trick: we multiply together invertible elements using two different characterizations of the elements and then cancel a common factor from both sides of the resulting congruence. In this case, we invoke Lemma 18 to write
\[
\{1, 2, \ldots, (p-1)/2\} = L_p^{\le}(a) \mathbin{\dot{\cup}} (p - L_p^{>}(a)).
\]
Multiplying together the elements of this set yields
\[
\begin{aligned}
\left(\frac{p-1}{2}\right)!
&\equiv_p \prod_{\ell_p(ak) \in L_p^{\le}(a)} (ak) \prod_{\ell_p(ak) \in L_p^{>}(a)} (p - ak) \\
&\equiv_p \prod_{\ell_p(ak) \in L_p^{\le}(a)} (ak) \prod_{\ell_p(ak) \in L_p^{>}(a)} (-ak) \\
&= (-1)^{\#L_p^{>}(a)} \prod_{\ell_p(ak) \in L_p(a)} (ak) \\
&= (-1)^{\#L_p^{>}(a)} \prod_{k=1}^{(p-1)/2} (ak) \\
&= (-1)^{\#L_p^{>}(a)} a^{(p-1)/2} \prod_{k=1}^{(p-1)/2} k \\
&= (-1)^{\#L_p^{>}(a)} a^{(p-1)/2} \left(\frac{p-1}{2}\right)!
\end{aligned}
\]
Cancelling the (invertible) element $[(p-1)/2]!$ from both sides yields
\[
a^{(p-1)/2} \equiv_p (-1)^{\#L_p^{>}(a)}.
\]
Finally, we can invoke Euler's criterion to conclude that
\[
\left(\frac{a}{p}\right) \equiv_p a^{(p-1)/2} \equiv_p (-1)^{\#L_p^{>}(a)},
\]
so that, since $p$ is odd,
\[
\left(\frac{a}{p}\right) = (-1)^{\#L_p^{>}(a)}
\]
as required.

We now have all that is required to prove Theorem 27 that determines the value of $(2/p)$:

Proof of Theorem 27. Recall that Theorem 27 determined the value of $(2/p)$ for an odd prime $p$ as
\[
\left(\frac{2}{p}\right) =
\begin{cases}
1 & \text{if } p \equiv_8 \pm 1; \\
-1 & \text{if } p \equiv_8 \pm 3.
\end{cases}
\]
We now prove this claim by invoking Gauss' Lemma (Theorem 28). We need to determine the parity of $\#L_p^{>}(2)$. The multiples of 2 in question are
\[
2, 4, 6, \ldots, p - 1.
\]
These are already least residues modulo $p$ and so we need only count how many of $2, 4, \ldots, p - 1$ are greater than $\frac{p-1}{2}$. It is clear that the multiples in question that satisfy this condition are
\[
\begin{cases}
\frac{p-1}{2} + 2,\ \frac{p-1}{2} + 4,\ \ldots,\ p - 1 & \text{if } \frac{p-1}{2} \text{ is even;} \\
\frac{p-1}{2} + 1,\ \frac{p-1}{2} + 3,\ \ldots,\ p - 1 & \text{if } \frac{p-1}{2} \text{ is odd.}
\end{cases}
\]
Therefore, the number of such integers is
\[
\begin{cases}
\frac{p-1}{4} & \text{if } p \equiv_4 1; \\
\frac{p+1}{4} & \text{if } p \equiv_4 3.
\end{cases}
\]
This number is even if $p \equiv_8 \pm 1$ and odd if $p \equiv_8 \pm 3$, as required.

The next result we will need to prove Gauss' law of Quadratic Reciprocity is the following lemma.

Lemma 19. Let $p$ be an odd prime and $a$ be an odd integer relatively prime to $p$. Then
\[
S_p(a) \equiv_2 \#L_p^{>}(a).
\]

Proof. We compute
\[
\begin{aligned}
pS_p(a) &= p \sum_{k=1}^{(p-1)/2} \left\lfloor \frac{ka}{p} \right\rfloor \\
&= p \sum_{k=1}^{(p-1)/2} \left( \frac{ka}{p} - \left\{ \frac{ka}{p} \right\} \right) \\
&= \sum_{k=1}^{(p-1)/2} (ka) - \sum_{k=1}^{(p-1)/2} \ell_p(ka) \\
&= a \sum_{k=1}^{(p-1)/2} k - \sum_{\ell_p(ka) \in L_p^{\le}(a)} \ell_p(ka) - \sum_{\ell_p(ka) \in L_p^{>}(a)} \ell_p(ka).
\end{aligned}
\]
But, since by Lemma 18
\[
\{1, 2, \ldots, (p-1)/2\} = L_p^{\le}(a) \mathbin{\dot{\cup}} (p - L_p^{>}(a)),
\]
we have
\[
\begin{aligned}
\sum_{\ell_p(ka) \in L_p^{\le}(a)} \ell_p(ka)
&= \sum_{k=1}^{(p-1)/2} k - \sum_{\ell_p(ka) \in L_p^{>}(a)} (p - \ell_p(ka)) \\
&= \sum_{k=1}^{(p-1)/2} k - p \cdot \#L_p^{>}(a) + \sum_{\ell_p(ka) \in L_p^{>}(a)} \ell_p(ka).
\end{aligned}
\]
We conclude that
\[
\begin{aligned}
pS_p(a) &= a \sum_{k=1}^{(p-1)/2} k - \sum_{\ell_p(ka) \in L_p^{\le}(a)} \ell_p(ka) - \sum_{\ell_p(ka) \in L_p^{>}(a)} \ell_p(ka) \\
&= a \sum_{k=1}^{(p-1)/2} k - \sum_{k=1}^{(p-1)/2} k + p \cdot \#L_p^{>}(a) - 2 \sum_{\ell_p(ka) \in L_p^{>}(a)} \ell_p(ka) \\
&= p \cdot \#L_p^{>}(a) + (a - 1) \sum_{k=1}^{(p-1)/2} k - 2 \sum_{\ell_p(ka) \in L_p^{>}(a)} \ell_p(ka).
\end{aligned}
\]
Taking this equation modulo 2 and using the facts that $a \equiv_2 1$ and $p \equiv_2 1$ yields
\[
S_p(a) \equiv_2 \#L_p^{>}(a)
\]
as required.

The final result needed in our proof of Gauss' Law of Quadratic Reciprocity is the following theorem.

Theorem 29. Let $p$ and $q$ be distinct odd primes. Then
\[
S_p(q) + S_q(p) = \frac{(p-1)(q-1)}{4}.
\]

Proof. Consider the line segment given by
\[
y = \frac{q}{p}x, \qquad 0 < x \le \frac{p-1}{2}.
\]
We know that the total number of points $(x, y)$ for $x, y \in \mathbb{Z}$ and $1 \le x \le (p-1)/2$, $1 \le y \le (q-1)/2$ is equal to
\[
\frac{p-1}{2} \cdot \frac{q-1}{2} = \frac{(p-1)(q-1)}{4}.
\]
We will complete the proof by counting these integer points in a different way and obtaining a total of $S_p(q) + S_q(p)$. We will split the grid of integer points in question into three classes based on where they lie with respect to the line $y = (q/p)x$:

(On the line:) We first note that no integer point under consideration can lie on the line. Indeed, if $x$ and $y$ are integers and $y = (q/p)x$, then $x$ would be a multiple of $p$ which is impossible for $1 \le x \le (p-1)/2$.

(Below the line:) Here we need to count the number of points $(x, y)$ with integer coordinates having $1 \le x \le (p-1)/2$ and $1 \le y < (q/p)x$. Such points automatically satisfy $y \le (q-1)/2$, since $(q/p)x < q/2$. Since we have seen that no integer point in question lies on the line, we can replace the condition $1 \le y < (q/p)x$ by the condition $1 \le y \le \lfloor (q/p)x \rfloor$. For each value of $x$ there are exactly $\lfloor (q/p)x \rfloor$ such points. We conclude that the total number of such points is given by
\[
\sum_{x=1}^{(p-1)/2} \left\lfloor \frac{qx}{p} \right\rfloor = S_p(q).
\]

(Above the line:) Here, similarly to counting the points below the line, we see that we need to count the number of points $(x, y)$ with integer coordinates having $1 \le y \le (q-1)/2$ and $1 \le x \le \lfloor (p/q)y \rfloor$. The total number is then given by
\[
\sum_{y=1}^{(q-1)/2} \left\lfloor \frac{py}{q} \right\rfloor = S_q(p).
\]
We have therefore shown that the total number of integer points $(x, y)$ with $1 \le x \le (p-1)/2$ and $1 \le y \le (q-1)/2$ is equal to $S_p(q) + S_q(p)$, as required.

We are now prepared to prove Gauss' law of Quadratic Reciprocity.

Proof of Gauss' law of Quadratic Reciprocity (Theorem 26). We have
\[
\begin{aligned}
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right)
&= (-1)^{\#L_q^{>}(p)} (-1)^{\#L_p^{>}(q)} && \text{(By Theorem 28)} \\
&= (-1)^{S_q(p)} (-1)^{S_p(q)} && \text{(By Lemma 19)} \\
&= (-1)^{S_q(p) + S_p(q)} \\
&= (-1)^{\frac{(p-1)(q-1)}{4}} && \text{(By Theorem 29)}
\end{aligned}
\]
as required.

We close this section with a couple of results, the first of which gives sufficient conditions for 2 to be a primitive root modulo a prime, and the second of which provides a generalization of Euler's criterion relevant in the search for cubic residues.

Proposition 14. Let $p$ and $q$ be primes such that $q = 4p + 1$. Then 2 is a primitive root modulo $q$.

Proof. Assume the hypotheses. We know that the order of 2 modulo $q$ must divide $\phi(q)$ which is equal to $q - 1$ since $q$ is prime. Since $q - 1 = 4p$, we obtain
\[
\operatorname{ord}_q(2) \in \{1, 2, 4, p, 2p, 4p\}.
\]
In order to prove that 2 is indeed a primitive root modulo $q$, we need to eliminate the first five possibilities. Also, since any possibility that is eliminated automatically eliminates all of its divisors, we are reduced to showing that $2^{2p}$ and $2^4$ are not congruent to 1 modulo $q$. Well, we have
\[
2^{2p} = 2^{\frac{q-1}{2}} \equiv_q \left(\frac{2}{q}\right)
\]
by Euler's criterion. By Theorem 27, we also know that
\[
\left(\frac{2}{q}\right) =
\begin{cases}
1 & \text{if } q \equiv_8 \pm 1; \\
-1 & \text{if } q \equiv_8 \pm 3.
\end{cases}
\]
Since $p \equiv_2 1$, we see that $4p \equiv_8 4$ so that $q = 4p + 1 \equiv_8 5 \equiv_8 -3$. We therefore have $(2/q) = -1$ so that
\[
2^{2p} \equiv_q \left(\frac{2}{q}\right) \equiv_q -1.
\]
We have therefore ruled out the cases 1, 2, $p$ and $2p$ from contention for the order of 2 modulo $q$. We complete the proof by showing that $2^4 \not\equiv_q 1$, thereby forcing $\operatorname{ord}_q(2) = 4p = \phi(q)$, as required. But this is straightforward. Indeed,
\[
2^4 \equiv_q 1 \implies q \mid (2^4 - 1) = 15.
\]
In turn, this forces $q \in \{3, 5\}$ which is impossible since $q = 4p + 1 > 5$. All in all, we have shown that the least power $a$ for which $2^a \equiv_q 1$ is $a = 4p = \phi(q)$ so that 2 is a primitive root modulo $q$, as required.
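Proposition 14 is easy to test numerically. The sketch below is an illustration rather than part of the proof: the helper names `is_prime`, `multiplicative_order` and `counterexamples` are hypothetical, primality is checked by trial division, and the order of 2 is found by repeated multiplication.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def multiplicative_order(a, m):
    """Least k >= 1 with a^k = 1 (mod m); assumes gcd(a, m) = 1 and m > 1."""
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k

def counterexamples(limit):
    """Pairs (p, q) with p < limit, p and q = 4p + 1 both prime,
    for which 2 is NOT a primitive root modulo q."""
    return [(p, 4 * p + 1)
            for p in range(2, limit)
            if is_prime(p) and is_prime(4 * p + 1)
            and multiplicative_order(2, 4 * p + 1) != 4 * p]

print(counterexamples(500))   # Proposition 14 predicts an empty list
```

The smallest admissible pair is $p = 3$, $q = 13$, where the powers of 2 run through all twelve invertible residues before returning to 1.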
We have seen that when $p \equiv_2 1$, the quotient $(p-1)/2$ can be formed and that this quotient, due to Euler's criterion, when employed as an exponent allows us to distinguish quadratic residues modulo $p$ from quadratic nonresidues modulo $p$. It might be expected that when $p \equiv_3 1$, so that the quotient $(p-1)/3$ can be formed, there is a generalization of Euler's criterion which will allow us to use this quotient as an exponent to distinguish cubic residues modulo $p$ from cubic nonresidues modulo $p$. Here the term cubic residue is used to describe the invertible elements modulo $p$ that can be written as a cube modulo $p$. This is in fact the case, and we will close this section by stating and proving this generalization of Euler's criterion. It should be noted that there is nothing special here about 2 or 3. One can reproduce the arguments used in the proof of Euler's criterion to obtain a generalization that can be used to distinguish $q$-th power residues from $q$-th power nonresidues for any prime $q$. We note first that saying a prime is congruent to 1 modulo 3 is the same as saying that a prime is congruent to 1 modulo 6 since any such prime must be odd.

Proposition 15. Let $p$ be prime. If $p \not\equiv_6 1$ then every element of $(\mathbb{Z}/p\mathbb{Z})^\times$ is a cubic residue. On the other hand, if $p \equiv_6 1$ then $a \in (\mathbb{Z}/p\mathbb{Z})^\times$ is a cubic residue if and only if
\[
a^{\frac{p-1}{3}} \equiv_p 1.
\]

Proof. Let $g$ be a primitive root modulo $p$. Then
\[
\operatorname{ord}_p(g^3) = \frac{\operatorname{ord}_p(g)}{(\operatorname{ord}_p(g), 3)} = \frac{p-1}{(p-1, 3)} =
\begin{cases}
p - 1 & \text{if } 3 \nmid (p-1); \\
\frac{p-1}{3} & \text{if } 3 \mid (p-1).
\end{cases}
\]
That is,
\[
\operatorname{ord}_p(g^3) =
\begin{cases}
p - 1 & \text{if } p \not\equiv_3 1 \\
\frac{p-1}{3} & \text{if } p \equiv_3 1
\end{cases}
=
\begin{cases}
p - 1 & \text{if } p \not\equiv_6 1 \\
\frac{p-1}{3} & \text{if } p \equiv_6 1.
\end{cases}
\]
We conclude that for $p \not\equiv_6 1$, the element $g^3$ is a primitive root modulo $p$ so that every invertible element can be written as a power of $g^3$. Since every power of $g^3$ is a cube, we see that in this case every invertible element is a cubic residue. Suppose, on the other hand, that $p \equiv_6 1$. In this case, $g^3$ has order $\frac{p-1}{3}$.
We now complete the proof by showing that for $a \in (\mathbb{Z}/p\mathbb{Z})^\times$, $a$ is a cubic residue modulo $p$ if and only if $a^{(p-1)/3} \equiv_p 1$. Let $a = g^k$ for some $k$. We have
\[
\begin{aligned}
a^{(p-1)/3} \equiv_p 1
&\iff \operatorname{ord}_p\left(a^{(p-1)/3}\right) = 1 \\
&\iff \operatorname{ord}_p\left(g^{k(p-1)/3}\right) = 1 \\
&\iff \frac{\operatorname{ord}_p(g)}{\left(\operatorname{ord}_p(g),\ \frac{k(p-1)}{3}\right)} = 1 \\
&\iff \frac{p-1}{\left(p-1,\ \frac{k(p-1)}{3}\right)} = 1 \\
&\iff \frac{\frac{p-1}{3} \cdot 3}{\frac{p-1}{3}\,(3, k)} = 1 \\
&\iff \frac{3}{(3, k)} = 1 \\
&\iff (3, k) = 3 \\
&\iff k \equiv_3 0.
\end{aligned}
\]
We are therefore reduced to proving that $g^k$ is a cubic residue if and only if $k \equiv_3 0$. But this is clear since $3 \mid p - 1$ and this allows us to construct the following chain of equivalences:
\[
\begin{aligned}
g^k \text{ is a cubic residue modulo } p
&\iff g^k \equiv_p (g^\ell)^3 \text{ for some } \ell \\
&\iff k \equiv_{p-1} 3\ell \text{ for some } \ell \\
&\iff k \equiv_3 0.
\end{aligned}
\]

Remark 7. Note that we can re-state Euler's criterion as follows. Let $p$ be prime. If $p \not\equiv_2 1$ then every element of $(\mathbb{Z}/p\mathbb{Z})^\times$ is a quadratic residue. On the other hand, if $p \equiv_2 1$ then $a \in (\mathbb{Z}/p\mathbb{Z})^\times$ is a quadratic residue if and only if
\[
a^{\frac{p-1}{2}} \equiv_p 1.
\]
This makes it reasonable to consider Proposition 15 as a generalization of Euler's criterion.

We have another result related to cubic residues that is analogous to the result that for primes $p$, $x^2 \equiv_p 1$ has a nontrivial solution if and only if $p$ is odd.

Proposition 16. Let $p$ be a prime. Then $x^3 \equiv_p 1$ has nontrivial solutions if and only if $p \equiv_6 1$.

Proof. Suppose first that $x^3 \equiv_p 1$ has a nontrivial solution $a \not\equiv_p 1$. Then $\operatorname{ord}_p(a) = 3$ so that $3 \mid \phi(p) = p - 1$. This shows that $p \equiv_3 1$. However, as $p \ne 2$, we see that $p \equiv_2 1$ as well so that $p \equiv_6 1$, as required. Conversely, suppose that $p \equiv_6 1$. Consider the factorization
\[
x^3 - 1 = (x - 1)(x^2 + x + 1).
\]
We show that $x^3 \equiv_p 1$ has a nontrivial solution by proving that $x^2 + x + 1 \equiv_p 0$ has a solution; such a solution is not congruent to 1, since $x \equiv_p 1$ would give $3 \equiv_p 0$, which is impossible for $p \equiv_6 1$. We do this by completing the square and invoking quadratic reciprocity. We have
\[
4x^2 + 4x + 4 = (2x + 1)^2 + 3.
\]
Therefore, we obtain nontrivial solutions if and only if $(2x + 1)^2 \equiv_p -3$ has a solution. That is, we obtain nontrivial solutions if and only if $(-3/p) = 1$. We now turn to computing the value of this Legendre symbol.
\[
\begin{aligned}
\left(\frac{-3}{p}\right) &= \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) \\
&= \left(\begin{cases} 1 & \text{if } p \equiv_4 1 \\ -1 & \text{if } p \equiv_4 -1 \end{cases}\right)
   \left(\begin{cases} 1 & \text{if } p \equiv_{12} \pm 1 \\ -1 & \text{if } p \equiv_{12} \pm 5 \end{cases}\right) \\
&= \begin{cases} 1 & \text{if } p \equiv_{12} 1 \text{ or } 7 \\ -1 & \text{if } p \equiv_{12} 5 \text{ or } 11 \end{cases} \\
&= \begin{cases} 1 & \text{if } p \equiv_6 1 \\ -1 & \text{if } p \equiv_6 -1 \end{cases}
\end{aligned}
\]
We conclude that $x^3 \equiv_p 1$ has nontrivial solutions if and only if $p \equiv_6 1$, as required.

Chapter 14 Pythagorean Triangles

This chapter is based on [Dud08, §16]. The goal of this section is to find all integer solutions to Pythagoras' quadratic diophantine equation
\[
x^2 + y^2 = z^2.
\]
We first note that if $d = (x, y)$, then $d \mid z$. We could then divide through by $d^2$ to obtain
\[
\left(\frac{x}{d}\right)^2 + \left(\frac{y}{d}\right)^2 = \left(\frac{z}{d}\right)^2,
\]
where $\left(\frac{x}{d}, \frac{y}{d}\right) = 1$. Therefore, we may suppose that $x$ and $y$ are relatively prime. Indeed, if we can solve the equation in this case, the general case is obtained by simply multiplying our solution by the relevant greatest common divisor. We have therefore reduced our problem to finding the integer solutions to
\[
x^2 + y^2 = z^2, \qquad (x, y) = 1. \tag{14.1}
\]
Next, we note that if any prime divides two of $x, y, z$, then it must also divide the third. In particular, we see that in the case given by (14.1), we also have $(x, z) = (y, z) = 1$. We have therefore arrived at the study of Pythagorean triples $x, y, z$ (which are solutions to $x^2 + y^2 = z^2$ in integers) that are relatively prime in pairs. Finally, it is clear that the solutions come in pairs since for any $w$, $(-w)^2 = w^2$. All in all, we can find all Pythagorean triples as long as we can find all fundamental Pythagorean triples, where the fundamental Pythagorean triples are defined as follows:

Definition 16. A triple $(a, b, c)$ of integers is called a fundamental Pythagorean triple if $a, b, c$ are positive, $a^2 + b^2 = c^2$ and $(a, b) = 1$.

As remarked above, for fundamental Pythagorean triples $(a, b, c)$, the condition $(a, b) = 1$ is equivalent to the condition that $a$, $b$ and $c$ are relatively prime in pairs.
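Definition 16 can be explored by brute force. The following sketch (the function name `fundamental_triples` is hypothetical, and the search is deliberately naive) lists the fundamental Pythagorean triples with $c$ up to a given bound, recording each pair $\{a, b\}$ once with $a < b$, so that the remark about pairwise coprimality can be checked on examples.

```python
from math import gcd, isqrt

def fundamental_triples(bound):
    """Fundamental Pythagorean triples (a, b, c) with a < b and c <= bound."""
    triples = []
    for c in range(1, bound + 1):
        for a in range(1, c):
            b_squared = c * c - a * a
            b = isqrt(b_squared)
            # keep (a, b, c) only if b is an integer, a < b, and gcd(a, b) = 1
            if b * b == b_squared and a < b and gcd(a, b) == 1:
                triples.append((a, b, c))
    return triples

for a, b, c in fundamental_triples(30):
    # gcd(a, b) = 1 should force gcd(a, c) = gcd(b, c) = 1 as well
    print((a, b, c), gcd(a, c), gcd(b, c))
```

Swapping $a$ and $b$ in any listed triple gives another fundamental triple, which is why only one ordering is recorded.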
Further, if $S$ denotes the set of all fundamental Pythagorean triples, the set of all solutions to $x^2 + y^2 = z^2$ in positive integers is given by
\[
\{(da, db, dc) \mid (a, b, c) \in S \text{ and } d \in \mathbb{N}\}.
\]
We turn now to the determination of all fundamental Pythagorean triples. We will use the results of the following proposition.

Proposition 17. Let $(a, b, c)$ denote a fundamental Pythagorean triple so that $a, b, c \in \mathbb{N}$, $(a, b) = (a, c) = (b, c) = 1$ and $a^2 + b^2 = c^2$. Then exactly one of $a$, $b$ is even while the other is odd and $c$ is odd.

Proof. First of all, since $a, b, c$ are relatively prime in pairs, we see that at most one of $a, b, c$ is even. On the other hand, $a$, $b$ and $c$ cannot all be odd since $c^2 = a^2 + b^2$ and so if $a$ and $b$ were both odd, $c$ would have to be even. We conclude that exactly one of $a, b, c$ is even while the other two are odd. We complete the proof by showing that $c$ cannot be the one that is even. We do this by considering our equation modulo 4, and recalling that the even squares are 0 modulo 4 while the odd squares are 1 modulo 4. If $a$ and $b$ were odd and $c$ were even, we would have $a^2 + b^2 \equiv_4 1 + 1 \equiv_4 2$ while $c^2 \equiv_4 0$. This contradiction completes the proof that in any fundamental Pythagorean triple $(a, b, c)$, $c$ is odd and exactly one of $a$, $b$ is even.

In light of Proposition 17, and due to symmetry, we may, and do, suppose that $a$ is even and that $b$ and $c$ are odd for the remainder of this section.

Here, as well as elsewhere, the concept of $p$-adic valuations will prove useful. In order to define this, we first make some preliminary remarks. Let
\[
\mathbb{Q} = \left\{ \frac{m}{n} \,\middle|\, m \in \mathbb{Z},\ n \in \mathbb{N},\ (m, n) = 1 \right\}
\]
denote the set of all rational numbers. The fundamental theorem of arithmetic can then be seen to apply to the set of nonzero rational numbers as follows.

Theorem 30 (Fundamental Theorem of Arithmetic for Rationals). Let $\mathcal{P}$ denote the set of all primes.
Then every nonzero rational number $x$ can be written uniquely in the form
\[
x = \pm \prod_{p \in \mathcal{P}} p^{v_p(x)}, \tag{14.2}
\]
for integers $v_2(x), v_3(x), v_5(x), \ldots$ of which only finitely many are nonzero.

Proof. We first note that it is sufficient to prove the result for positive rationals $x$. Indeed, if we can prove the result for positive values of $x$, then we'd obtain the result for negative values of $x$ simply by introducing a minus sign. Suppose then that $x = \frac{m}{n}$ for positive integers $m$ and $n$ such that $(m, n) = 1$. By the fundamental theorem of arithmetic (for integers) we know that $m$ and $n$ have unique representations of the form given by (14.2) using nonnegative exponents. That is, we have
\[
m = \prod_{p \in \mathcal{P}} p^{v_p(m)}, \qquad n = \prod_{p \in \mathcal{P}} p^{v_p(n)}
\]
for uniquely determined integers $v_2(m), v_3(m), v_5(m), \ldots, v_2(n), v_3(n), v_5(n), \ldots \ge 0$ of which only finitely many are nonzero. We then obtain
\[
\frac{m}{n} = \frac{\prod_{p \in \mathcal{P}} p^{v_p(m)}}{\prod_{p \in \mathcal{P}} p^{v_p(n)}} = \prod_{p \in \mathcal{P}} p^{v_p(m) - v_p(n)}.
\]
Since only finitely many of the differences $v_p(m) - v_p(n)$ are nonzero, we can set $v_p\left(\frac{m}{n}\right) = v_p(m) - v_p(n)$ to see that
\[
\frac{m}{n} = \prod_{p \in \mathcal{P}} p^{v_p(m) - v_p(n)} = \prod_{p \in \mathcal{P}} p^{v_p\left(\frac{m}{n}\right)}
\]
has at least one representation of the form given by (14.2).

To prove uniqueness, we proceed as usual by assuming that we have two potentially different representations of $\frac{m}{n}$ in the form given by (14.2) and then prove that they are in fact equal. To do this, we use the fact that every rational number $x$ can be written uniquely in the form $\frac{m}{n}$ for relatively prime integers $m$ and $n$ with $n \ge 1$. To see this, we simply divide out all common factors from the numerator and denominator of $x$ and arrange for the minus sign, if it is present, to be attached to the numerator of $x$. Suppose then that
\[
\frac{m}{n} = \prod_{p \in \mathcal{P}} p^{v_p\left(\frac{m}{n}\right)} = \prod_{p \in \mathcal{P}} p^{v_p'\left(\frac{m}{n}\right)}
\]
for integers $v_2\left(\frac{m}{n}\right), v_3\left(\frac{m}{n}\right), v_5\left(\frac{m}{n}\right), \ldots, v_2'\left(\frac{m}{n}\right), v_3'\left(\frac{m}{n}\right), v_5'\left(\frac{m}{n}\right), \ldots$ of which only finitely many are nonzero. We need to prove that $v_p\left(\frac{m}{n}\right) = v_p'\left(\frac{m}{n}\right)$ for all primes $p$. We do this as follows.
Let $\mathbb{P}_+$ denote the set of primes for which $v_p\left(\frac{m}{n}\right) \ge 0$, $\mathbb{P}_-$ denote the set of primes for which $v_p\left(\frac{m}{n}\right) < 0$ and define $\mathbb{P}_+'$, $\mathbb{P}_-'$ similarly. We then have
\[ \frac{\prod_{p \in \mathbb{P}_+} p^{v_p\left(\frac{m}{n}\right)}}{\prod_{p \in \mathbb{P}_-} p^{-v_p\left(\frac{m}{n}\right)}} = \frac{m}{n} = \frac{\prod_{p \in \mathbb{P}_+'} p^{v_p'\left(\frac{m}{n}\right)}}{\prod_{p \in \mathbb{P}_-'} p^{-v_p'\left(\frac{m}{n}\right)}}. \]
From the uniqueness of the representation of rational numbers into quotients of relatively prime integers, we conclude that
\[ m = \prod_{p \in \mathbb{P}_+} p^{v_p\left(\frac{m}{n}\right)} = \prod_{p \in \mathbb{P}_+'} p^{v_p'\left(\frac{m}{n}\right)} \]
and
\[ n = \prod_{p \in \mathbb{P}_-} p^{-v_p\left(\frac{m}{n}\right)} = \prod_{p \in \mathbb{P}_-'} p^{-v_p'\left(\frac{m}{n}\right)}. \]
We now invoke the uniqueness part of the fundamental theorem of arithmetic to conclude that
\[ v_p\left(\frac{m}{n}\right) = v_p'\left(\frac{m}{n}\right) \]
for all primes $p$, as required.

We now come to the definition of the $p$-adic valuation on the set $\mathbb{Q}$ of rationals. The $p$-adic valuation of nonzero rational numbers $x$ will be defined to be the exponent $v_p(x)$ that appears in its prime-power factorization given by (14.2). Since it will be convenient to have the $p$-adic valuation defined for all rationals, including zero, we seek a reasonable definition for $v_p(0)$. To this end, we note that for nonzero rationals $x$, $v_p(x)$ is equal to the largest power of $p$ that divides $x$. Here, this largest power is the difference of the largest power appearing in the factorization of the numerator of $x$ and the largest power appearing in the factorization of the denominator of $x$. It would then be reasonable to define $v_p(0)$ to be “the largest power of $p$ that divides 0.” Since every power of $p$ divides 0, it seems reasonable to define $v_p(0) = \infty$. We therefore adopt the conventions that the values of $v_p$ lie in $\mathbb{Z} \cup \{\infty\}$ and $\infty$ is the maximum element of this set. We also adopt the convention that $\infty + a = a + \infty = \infty$ for all $a \in \mathbb{Z} \cup \{\infty\}$. This discussion leads us to the formal definition of $p$-adic valuation.

Definition 17 ($p$-adic valuation). Let $x \in \mathbb{Q}$. We define the $p$-adic valuation of $x$, denoted $v_p(x)$, to be equal to $\infty$ if $x = 0$ and equal to the power of $p$ appearing in the prime-power factorization of $x$ given by (14.2) otherwise.
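As an illustration of Definition 17, the valuation can be computed by repeatedly dividing $p$ out of the numerator and denominator. The following is only a minimal sketch: the function name `vp` is our own choice, the code assumes without checking that `p` is prime, and it represents $\infty$ by `math.inf`.

```python
from fractions import Fraction
from math import inf

def vp(x, p):
    """Return the p-adic valuation of the rational x (math.inf for x == 0)."""
    x = Fraction(x)              # reduces x to lowest terms
    if x == 0:
        return inf               # convention: v_p(0) = infinity
    num, den = abs(x.numerator), x.denominator
    v = 0
    while num % p == 0:          # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:          # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

x = Fraction(12, 25)             # 12/25 = 2^2 * 3 * 5^(-2)
print(vp(x, 2), vp(x, 3), vp(x, 5), vp(x, 7), vp(0, 7))   # 2 1 -2 0 inf
```

Since `Fraction` stores its value in lowest terms, at most one of the two loops does any work for a given prime.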
Before proceeding, we first note an alternative way of defining the $p$-adic valuation of nonzero rational numbers $x$. Given a nonzero rational number $x$, we note that $v_p(x)$ is the unique integer for which $x$ can be written in the form
\[ x = p^{v_p(x)} \frac{m'}{n'}, \qquad (p \nmid m', n'). \]
This is saying nothing more than the fact that given any nonzero rational number $x$, we can factor the largest power of $p$ appearing in the numerator and denominator and be left with a new numerator and denominator that are relatively prime to $p$. Since the factorizations of the numerator $m'$ and denominator $n'$ that are left over will not contain a power of $p$, it is clear that they will be relatively prime to $p$.

Aside 1. Let $p$ be a prime. By defining $|\cdot|_p$ on the set of rational numbers by
\[ |x|_p = p^{-v_p(x)}, \qquad (x \in \mathbb{Q}), \]
we get the $p$-adic absolute value which satisfies the same fundamental properties as the usual absolute value. It takes on only nonnegative values, is only zero when the input is zero, is completely multiplicative, and satisfies (a stronger version of) the triangle inequality. If we add to the set of rationals all numbers we can obtain as limits of (Cauchy) sequences of rationals with respect to the usual absolute value we obtain the field $\mathbb{R}$ of real numbers. In exactly the same way, if we add to the set of rationals all numbers we can obtain as limits of (Cauchy) sequences of rationals with respect to the $p$-adic absolute value we obtain the field $\mathbb{Q}_p$ of $p$-adic numbers. These fields are fundamental in the study of more advanced topics in number theory. As a particular instance of this, we will be able to completely characterize the integers that can be written as the sum of two or four squares by using methods developed in the text. However, in order to classify those integers that can be written as the sum of three squares, one needs to use the fields $\mathbb{Q}_p$ mentioned above. Further remarks in this direction will be made when we study sums of squares.
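The properties listed in Aside 1 can be spot-checked numerically. The sketch below is a sanity check on a small sample of rationals, not a proof; the helper names `vp`, `abs_p`, `samples` and `ultrametric_holds` are our own, and `p` is assumed to be prime. The "stronger version" of the triangle inequality is taken here to be $|x + y|_p \le \max\{|x|_p, |y|_p\}$.

```python
from fractions import Fraction
from itertools import product

def vp(x, p):
    """p-adic valuation of a NONZERO rational x (p assumed prime)."""
    x = Fraction(x)
    num, den, v = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p**(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-vp(x, p))

# A small sample of rationals a/b with -6 <= a <= 6 and 1 <= b <= 6.
samples = [Fraction(a, b) for a, b in product(range(-6, 7), range(1, 7))]

def ultrametric_holds(p):
    """Check |x + y|_p <= max(|x|_p, |y|_p) for all pairs in the sample."""
    return all(abs_p(x + y, p) <= max(abs_p(x, p), abs_p(y, p))
               for x in samples for y in samples)

print(abs_p(Fraction(12, 25), 5), abs_p(18, 3))   # 25 1/9
print(all(ultrametric_holds(p) for p in (2, 3, 5)))
```

Note that a rational with a large power of $p$ in its denominator is $p$-adically large, which is why $|12/25|_5 = 25$.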
We now state a proposition that gives the fundamental properties satisfied by the $p$-adic valuation (as well as every other “non-archimedean valuation”).

Proposition 18. Let $p$ be a prime. The $p$-adic valuation $v_p$ satisfies the following properties for all $x, y \in \mathbb{Q}$:
(a) $v_p(x) \le \infty$.
(b) $v_p(x) = \infty$ if and only if $x = 0$.
(c) $v_p(xy) = v_p(x) + v_p(y)$.
(d) If $y \ne 0$, $v_p\left(\frac{x}{y}\right) = v_p(x) - v_p(y)$.
(e) $v_p(x^a) = a v_p(x)$ for all $a \in \mathbb{Z}$ for which $x^a$ is defined.
(f) $v_p(x + y) \ge \min\{v_p(x), v_p(y)\}$.
(g) If $v_p(x) \ne v_p(y)$ then $v_p(x + y) = \min\{v_p(x), v_p(y)\}$.

Proof. Let $p$ be a prime and $x, y \in \mathbb{Q}$.

(a) This is clear from the definition of the $p$-adic valuation $v_p$.

(b) This is also clear from the definition of the $p$-adic valuation $v_p$.

(c) First suppose that at least one of $x, y$ is equal to zero. In this case, since $\infty + a = a + \infty = \infty$ for all $a \in \mathbb{Z} \cup \{\infty\}$, we see that both sides of the proposed equality are equal to $\infty$. Suppose then that $x \ne 0$ and $y \ne 0$. We can write
\[ x = p^{v_p(x)} \frac{m}{n}, \qquad y = p^{v_p(y)} \frac{m'}{n'} \qquad (p \nmid m, n, m', n'). \]
But then,
\[ xy = p^{v_p(x) + v_p(y)} \frac{mm'}{nn'}, \qquad (p \nmid mm', nn'). \]
We conclude that $v_p(xy) = v_p(x) + v_p(y)$ as was to be shown.

(d) Suppose that $y \ne 0$ so that $\frac{1}{y} \in \mathbb{Q}$. We can write
\[ y = p^{v_p(y)} \frac{m}{n}, \qquad (p \nmid m, n). \]
But then,
\[ \frac{1}{y} = p^{-v_p(y)} \frac{n}{m}, \qquad (p \nmid n, m), \]
and so
\[ v_p\left(\frac{1}{y}\right) = -v_p(y). \tag{14.3} \]
Combining this with Part (c) gives us our result:
\[ v_p\left(\frac{x}{y}\right) = v_p\left(x \cdot \frac{1}{y}\right) = v_p(x) + v_p\left(\frac{1}{y}\right) = v_p(x) - v_p(y). \]

(e) This follows from Parts (c) and (d) by a routine induction. Indeed, the cases $a \in \{0, 1\}$ are clear, and if we assume it holds for a fixed value of $a \ge 1$, then we obtain from Part (c) that
\[ v_p\left(x^{a+1}\right) = v_p(x^a \cdot x) = v_p(x^a) + v_p(x) = a v_p(x) + v_p(x) = (a + 1) v_p(x). \]
We conclude by induction that our result holds for all integers $a \ge 0$. Finally, if $a < 0$ and $x^a$ is defined, we must have $x \ne 0$. We then have $-a > 0$ so that, by what we just proved together with (14.3),
\[ v_p(x^a) = v_p\left(\frac{1}{x^{-a}}\right) = -v_p\left(x^{-a}\right) = -(-a) v_p(x) = a v_p(x). \]
(f), (g) Suppose first that at least one of $x, y$ is equal to zero. Without loss of generality, we can then suppose that $x = 0$. We then see that
\[ v_p(x + y) = v_p(0 + y) = v_p(y) = \min\{\infty, v_p(y)\} = \min\{v_p(x), v_p(y)\}. \]
The inequality in question therefore holds in this case. We are therefore reduced to the case where neither $x$ nor $y$ is equal to zero. In this case, we can write
\[ x = p^{v_p(x)} \frac{m}{n}, \qquad y = p^{v_p(y)} \frac{m'}{n'} \qquad (p \nmid m, n, m', n'). \]
By symmetry, we may suppose, without loss of generality, that $v_p(x) = \min\{v_p(x), v_p(y)\}$. Let $a = v_p(y) - v_p(x) \ge 0$. We then have
\[ x + y = p^{v_p(x)} \frac{m}{n} + p^{v_p(y)} \frac{m'}{n'} = p^{v_p(x)} \left( \frac{m}{n} + p^a \frac{m'}{n'} \right) = p^{v_p(x)} \frac{mn' + p^a m' n}{nn'}. \]
Now, since $p \nmid nn'$, we have
\[ v_p\left( \frac{mn' + p^a m' n}{nn'} \right) \ge 0. \]
Consequently,
\begin{align*}
v_p(x + y) &= v_p\left( p^{v_p(x)} \frac{mn' + p^a m' n}{nn'} \right) \\
&= v_p\left( p^{v_p(x)} \right) + v_p\left( \frac{mn' + p^a m' n}{nn'} \right) && \text{(From Part (c))} \\
&\ge v_p\left( p^{v_p(x)} \right) + 0 \tag{14.4} \\
&= v_p(x) + 0 = v_p(x) = \min\{v_p(x), v_p(y)\}.
\end{align*}
Finally, we note that when $v_p(x) \ne v_p(y)$, we have $a \ge 1$. Since $p \nmid mn'$ while $p \mid p^a m' n$, we conclude that $p \nmid (mn' + p^a m' n)$. As $p \nmid nn'$ as well, we obtain
\[ v_p\left( \frac{mn' + p^a m' n}{nn'} \right) = 0. \]
This allows us to replace the inequality $\ge$ in (14.4) with equality thereby obtaining $v_p(x + y) = \min\{v_p(x), v_p(y)\}$ as required.

We now illustrate the utility of the $p$-adic valuation by proving a lemma that provides us with the final piece needed to obtain our classification of fundamental Pythagorean triples.

Lemma 20. Let $s$ and $t$ be relatively prime positive integers and suppose that $st = r^2$ for some positive integer $r$. Then both $s$ and $t$ are squares. That is, there exist positive integers $m$ and $n$ such that $s = m^2$, $t = n^2$. Further, we have $(m, n) = 1$.

Proof. Assume the hypotheses and let $p$ be a prime. Applying the $p$-adic valuation to both sides of $st = r^2$ yields
\[ v_p(s) + v_p(t) = 2 v_p(r) \equiv_2 0. \tag{14.5} \]
Since $(s, t) = 1$ we see that at least one of $v_p(s)$, $v_p(t)$ is equal to zero. Therefore, we can conclude from (14.5) that both of $v_p(s)$ and $v_p(t)$ are even.
Every exponent appearing in the factorizations of $s$ and $t$ is therefore even so that $s$ and $t$ are both squares. Finally, if $s = m^2$ and $t = n^2$, then $m$ and $n$ must be relatively prime since $s$ and $t$ are relatively prime.

We now have all that is required to give the complete classification of all fundamental Pythagorean triples.

Theorem 31. Let $(a, b, c)$ be a triple of integers with $a$ even. Then $(a, b, c)$ is a fundamental Pythagorean triple if and only if there exist positive integers $m > n$ with $(m, n) = 1$ and $m \not\equiv_2 n$ such that
\begin{align*}
a &= 2mn, \tag{14.6} \\
b &= m^2 - n^2, \tag{14.7} \\
c &= m^2 + n^2. \tag{14.8}
\end{align*}

Proof. First suppose that we have positive integers $m > n$ that are relatively prime and of opposite parity ($m \not\equiv_2 n$). It is then clear that $a$, $b$ and $c$ defined by equations (14.6), (14.7) and (14.8) are positive integers. Further, we see that
\[ a^2 + b^2 = (2mn)^2 + (m^2 - n^2)^2 = m^4 + 2m^2 n^2 + n^4 = (m^2 + n^2)^2 = c^2. \]
Finally, we need to verify that $(a, b) = 1$. But this is clear since $a = 2mn$ is even while $b = m^2 - n^2$ is odd so that any common prime divisor of $a$ and $b$ would have to divide both $m$ and $n$ thereby contradicting $(m, n) = 1$. (Here is where we use the assumption that $m \not\equiv_2 n$ so that exactly one of $m, n$ is even while the other is odd.)

Conversely, suppose that $(a, b, c)$ is a fundamental Pythagorean triple. We need to show that there exist positive integers $m > n$ that are relatively prime and of opposite parity such that (14.6), (14.7) and (14.8) hold. We write $a = 2r$ and then rewrite $a^2 + b^2 = c^2$ as
\[ 4r^2 = c^2 - b^2 = (c - b)(c + b). \tag{14.9} \]
Since $b$ and $c$ are both odd, we see that $c - b$ and $c + b$ are both even. Write $c - b = 2t$ and $c + b = 2s$. Then Equation (14.9) reads $4r^2 = (2s)(2t) = 4st$, which simplifies to $r^2 = st$. We now show that $(s, t) = 1$ in preparation for invoking Lemma 20. First of all, we note that
\[ c = b + 2t = 2s - c + 2t \implies c = s + t, \]
and that similarly, $b = s - t$. Suppose then that some prime $p$ divides both $s$ and $t$.
Then $p$ divides both $b$ and $c$ thereby contradicting the relative primality of $b$ and $c$. We conclude by contradiction that $(s, t) = 1$. We now invoke Lemma 20 to write $s = m^2$, $t = n^2$ for some relatively prime positive integers $m$ and $n$. Since $r^2 = m^2 n^2$ and each of $r, m, n$ is positive, we obtain $a = 2r = 2mn$. Further, $b = s - t = m^2 - n^2$, $c = s + t = m^2 + n^2$. Since $b$ is positive, we must have $m > n$, and since $b$ is odd, we must have $m \not\equiv_2 n$. This completes the proof.

As a corollary to Theorem 31, we get the complete classification of solutions to $x^2 + y^2 = z^2$ in integers.

Corollary 4. Let $x, y, z \in \mathbb{Z}$. Then $x^2 + y^2 = z^2$ if and only if one of $(x, y, z)$, $(y, x, z)$ can be written as
\[ \left( 2d\varepsilon_1 mn,\ d\varepsilon_2 (m^2 - n^2),\ d\varepsilon_3 (m^2 + n^2) \right) \]
for $d, m, n \in \mathbb{N}_0$, $m > n$, $(m, n) = 1$, $m \not\equiv_2 n$, $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in \{1, -1\}$.

Proof. Let $S$ be the set of all triples of the form
\[ \left( 2d\varepsilon_1 mn,\ d\varepsilon_2 (m^2 - n^2),\ d\varepsilon_3 (m^2 + n^2) \right) \]
for $d, m, n \in \mathbb{N}_0$, $m > n$, $(m, n) = 1$, $m \not\equiv_2 n$, $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in \{1, -1\}$. First of all, for $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in \{1, -1\}$, and any integers $d, m, n$, we have
\begin{align*}
(2d\varepsilon_1 mn)^2 + \left( d\varepsilon_2 (m^2 - n^2) \right)^2 &= 4d^2 m^2 n^2 + d^2 (m^2 - n^2)^2 \\
&= 4d^2 m^2 n^2 + d^2 (m^4 - 2m^2 n^2 + n^4) \\
&= d^2 \left( m^4 + 2m^2 n^2 + n^4 \right) \\
&= d^2 (m^2 + n^2)^2 \\
&= \left( d(m^2 + n^2) \right)^2 \\
&= \left( d\varepsilon_3 (m^2 + n^2) \right)^2.
\end{align*}
Therefore every element of $S$ gives a solution. We are therefore reduced to proving that $x^2 + y^2 = z^2$ implies that one of $(x, y, z)$, $(y, x, z)$ lies in $S$. We will do this by first looking at some trivial cases and then invoking Theorem 31 to deal with the other cases. The trivial cases arising when at least one of $x, y$ is equal to zero are dealt with as follows:
\[ \begin{array}{l|l}
(x, y, z) & \text{Values of parameters showing one of } (x, y, z),\ (y, x, z) \in S \\ \hline
(0, b, b),\ (b, 0, b),\ b \in \mathbb{Z} & m = 1,\ n = 0,\ d = |b|,\ \varepsilon_2 = \varepsilon_3 = \frac{b}{|b|} \\
(0, b, -b),\ (b, 0, -b),\ b \in \mathbb{Z} & m = 1,\ n = 0,\ d = |b|,\ \varepsilon_2 = \frac{b}{|b|},\ \varepsilon_3 = -\frac{b}{|b|}
\end{array} \]
Now suppose that neither $x$ nor $y$ is zero, and let $d = (x, y)$. Then $d = (|x|, |y|)$ and $|x|^2 + |y|^2 = |z|^2$. We then obtain that the triple $\left( \frac{|x|}{d}, \frac{|y|}{d}, \frac{|z|}{d} \right)$ is a fundamental Pythagorean triple.
By Theorem 31, there exist positive integers $m > n$ with $(m, n) = 1$ and $m \not\equiv_2 n$ such that one of $\left( \frac{|x|}{d}, \frac{|y|}{d}, \frac{|z|}{d} \right)$, $\left( \frac{|y|}{d}, \frac{|x|}{d}, \frac{|z|}{d} \right)$ is equal to $(2mn, m^2 - n^2, m^2 + n^2)$. We therefore have one of $(x, y, z)$, $(y, x, z)$ in $S$ as is shown by taking
\[ \varepsilon_1 = \frac{x}{|x|}, \qquad \varepsilon_2 = \frac{y}{|y|}, \qquad \varepsilon_3 = \frac{z}{|z|}. \]

We end this section with an example.

Example 22. Determine the right triangles having integer side lengths and area equal to twice their perimeter.

Solution. Let $x$, $y$ and $z$ denote the sides of a right triangle, with $z$ being the hypotenuse. Then, $x, y, z$ are positive integers such that $x^2 + y^2 = z^2$. From Corollary 4, one of $(x, y, z)$, $(y, x, z)$ is equal to $(2dmn, d(m^2 - n^2), d(m^2 + n^2))$ for positive integers $d, m, n$ satisfying $m > n$, $(m, n) = 1$ and $m \not\equiv_2 n$. If the area of our triangle is twice its perimeter, we have
\begin{align*}
\frac{1}{2} xy = 2(x + y + z) &\implies d^2 mn(m^2 - n^2) = 2d(2mn + m^2 - n^2 + m^2 + n^2) \\
&\implies d^2 mn(m - n)(m + n) = 4dm(m + n) \\
&\implies dn(m - n) = 4.
\end{align*}
We conclude that $d$, $n$, and $m - n$ are all positive divisors of 4. Recalling that $m$ and $n$ are relatively prime and of opposite parity, we obtain the possibilities given in the following table:
\[ \begin{array}{cccc|l}
d & n & m - n & m & \text{Side lengths of the corresponding right triangle} \\ \hline
1 & 4 & 1 & 5 & 40,\ 9,\ 41 \\
2 & 2 & 1 & 3 & 24,\ 10,\ 26 \\
4 & 1 & 1 & 2 & 16,\ 12,\ 20
\end{array} \]

Chapter 15 Infinite Descent and Fermat’s Conjecture

This chapter is based on [Dud08, §17]. In this section we introduce Fermat’s method of infinite descent that can be used to show that certain diophantine equations fail to have nontrivial integer solutions. The idea is to proceed by contradiction by supposing that there exists a nontrivial solution, taking such a solution that is smallest in some sense and then obtaining a contradiction by deriving an even smaller solution. This is the description of the “least element” version of the method. The “induction” version of the method constructs, from some given nontrivial solution, an infinite sequence of positive solutions, each smaller than its predecessor.
This version explains why the method is called infinite descent. The classic example of using Fermat’s method of infinite descent is the $n = 4$ case of Fermat’s Last Theorem. We will prove this case later on in this section, but first we set the stage.

Definition 18. Let $f$ be a polynomial in the variables $x_1, \dots, x_n$ with integer coefficients. An integer solution to the diophantine equation $f(x_1, \dots, x_n) = 0$ is called nontrivial if none of the $x_j$ are equal to zero.

The following theorem is known as Fermat’s Last Theorem:

Theorem 32 (Fermat’s Last Theorem). If $n$ is a positive integer greater than 2 then the diophantine equation $x^n + y^n = z^n$ has no nontrivial solutions in integers.

Since $x^1 + y^1 = z^1$ clearly has infinitely many nontrivial integer solutions, and the same is true for $x^2 + y^2 = z^2$ by the previous section, we see that Fermat’s Last Theorem completes the determination of when a power can be written as the sum of two like powers. Before arriving at the $n = 4$ case of Fermat’s Last Theorem, we first state and prove the following lemma that generalizes Lemma 20.

Lemma 21. Let $k, r, s, t, q \in \mathbb{N}$ with $(s, t) = 1$ and $q$ a prime. Suppose that $st = qr^k$. Then one of $s, t$ is a $k$-th power and the other is $q$ times a $k$-th power.

Proof. Assume the hypotheses and let $p$ be a prime. Applying the $p$-adic valuation $v_p$ to both sides of $st = qr^k$ yields
\[ v_p(s) + v_p(t) = v_p(q) + k v_p(r) \equiv_k v_p(q) \equiv_k \begin{cases} 1 & \text{if } p = q; \\ 0 & \text{if } p \ne q. \end{cases} \]
Since $(s, t) = 1$, we know that at least one of $v_p(s)$, $v_p(t)$ is equal to zero. We conclude that one of $v_q(s)$, $v_q(t)$ is equal to zero and the other is congruent to 1 modulo $k$ while for $p \ne q$, $v_p(s) \equiv_k v_p(t) \equiv_k 0$. Since a positive integer is a $k$-th power if and only if each of the exponents appearing in its prime-power factorization is a multiple of $k$, we see that one of $s, t$ is a $k$-th power while the other is $q$ times a $k$-th power.
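Lemma 21 can be illustrated on small numbers. The sketch below uses hypothetical helper names (`kth_root`, `split_coprime_product`) of our own, assumes without checking that `q` is prime, and finds roots by brute-force search, so it is only suitable for small inputs. Given coprime $s, t$ with $st = qr^k$, it reports which factor carries $q$ together with the two $k$-th roots.

```python
from math import gcd

def kth_root(n, k):
    """Return the integer c >= 1 with c**k == n, or None if n is not a k-th power."""
    c = 1
    while c ** k < n:
        c += 1
    return c if c ** k == n else None

def split_coprime_product(s, t, q, k):
    """Given coprime s, t with s*t == q * r**k for some r (q assumed prime),
    return (which, u, v): `which` names the factor divisible by q, and
    u, v are the k-th roots of s and t once that single factor q is removed."""
    if gcd(s, t) != 1 or (s * t) % q != 0 or kth_root(s * t // q, k) is None:
        raise ValueError("hypotheses of the lemma are not satisfied")
    if s % q == 0:
        return "s", kth_root(s // q, k), kth_root(t, k)
    return "t", kth_root(s, k), kth_root(t // q, k)

print(split_coprime_product(25, 18, 2, 2))    # 25 * 18 = 2 * 15**2 -> ('t', 5, 3)
print(split_coprime_product(24, 125, 3, 3))   # 24 * 125 = 3 * 10**3 -> ('s', 2, 5)
```

In the first example $s = 5^2$ is a square and $t = 2 \cdot 3^2$ is twice a square, as the lemma predicts.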
We now apply Fermat’s method of infinite descent to prove the $n = 4$ case of Fermat’s Last Theorem.

Theorem 33. The diophantine equation $x^4 + y^4 = z^2$ has no nontrivial solutions in integers. In particular, $x^4 + y^4 = z^4$ has no nontrivial solutions in integers.

Proof. Towards a contradiction, suppose that $x^4 + y^4 = z^2$ has a nontrivial solution $x, y, z$ in integers. Since the powers involved in the diophantine equation we are considering are even, we can assume that $x$, $y$ and $z$ are all positive. We show now that there is a solution having least positive value for $z$, and then obtain a contradiction by deriving from this solution another solution with an even smaller positive value for $z$. Suppose then that
\[ S = \{ z \in \mathbb{N} \mid x^4 + y^4 = z^2 \text{ for some } x, y \in \mathbb{N} \}. \]
To show that $S$ has a least element, we will show that it is nonempty and bounded below and then invoke the least integer principle. By hypothesis, $S \ne \emptyset$ since we are assuming the existence of a nontrivial solution to our diophantine equation, and, as remarked above, this implies the existence of a solution $x, y, z$ to our diophantine equation having $x, y, z \in \mathbb{N}$. Also, since every element of $S$ is positive, we see that $S$ is bounded below. From the least integer principle, we conclude therefore that $S$ has a least element $z_0$. Let $x_0, y_0, z_0 \in \mathbb{N}$ be a corresponding solution to our diophantine equation. We claim that $x_0$ and $y_0$ are relatively prime. Indeed, if $p$ is a prime dividing both $x_0$ and $y_0$, then from $x_0^4 + y_0^4 = z_0^2$, we would conclude that $p^2 \mid z_0$. But then,
\[ \left( \frac{x_0}{p} \right)^4 + \left( \frac{y_0}{p} \right)^4 = \left( \frac{z_0}{p^2} \right)^2, \]
yielding a “smaller” nontrivial solution to our diophantine equation. Indeed, we would have $\frac{z_0}{p^2} \in S$ even though $\frac{z_0}{p^2} < z_0$. We can therefore write
\[ \left( x_0^2 \right)^2 + \left( y_0^2 \right)^2 = z_0^2, \]
for $(x_0, y_0) = 1$. We conclude that $(x_0^2, y_0^2, z_0)$ is a fundamental Pythagorean triple.
We can therefore assume without loss of generality that $x_0 = 2r$ is even, $y_0, z_0$ are odd and
\begin{align*}
4r^2 = x_0^2 &= 2st, \tag{15.1} \\
y_0^2 &= s^2 - t^2, \tag{15.2} \\
z_0 &= s^2 + t^2, \tag{15.3}
\end{align*}
for some positive integers $s, t$ with $s > t$, $(s, t) = 1$ and $s \not\equiv_2 t$. We know that one of $s, t$ is even while the other is odd. To determine which one is even and which one is odd, we look at (15.2) modulo 4 recalling that even squares are congruent to 0 modulo 4 and odd squares are congruent to 1 modulo 4. Since $y_0$ is odd we obtain
\[ 1 \equiv_4 y_0^2 \equiv_4 s^2 - t^2 \equiv_4 \begin{cases} 1^2 - 0^2 & \text{if } s \text{ is odd and } t \text{ is even;} \\ 0^2 - 1^2 & \text{if } t \text{ is odd and } s \text{ is even} \end{cases} = \begin{cases} 1 & \text{if } s \text{ is odd and } t \text{ is even;} \\ -1 & \text{if } t \text{ is odd and } s \text{ is even.} \end{cases} \]
From this we conclude that $s$ is odd and $t$ is even. Looking at (15.2) once more, we see that $t^2 + y_0^2 = s^2$. Further, it is easy to see that $(t, y_0) = 1$. We therefore have another fundamental Pythagorean triple which yields relatively prime positive integers $m$ and $n$ with $m > n$, $m \not\equiv_2 n$ such that
\begin{align*}
t &= 2mn; \tag{15.4} \\
y_0 &= m^2 - n^2; \tag{15.5} \\
s &= m^2 + n^2. \tag{15.6}
\end{align*}
What we do now is show that all three of $m$, $n$ and $s$ are squares. Equation (15.6) would then provide us with a positive integer smaller than $z_0$ with square equal to the sum of two fourth powers. This is the contradiction we are after. First of all, we see from (15.1) that $st = 2r^2$. By Lemma 21, together with the fact that $t$ is even, we conclude that for some $u, v \in \mathbb{N}$, we have $s = u^2$ and $t = 2v^2$. But then, (15.4) yields $mn = v^2$ so that $m$ and $n$ are both squares. Finally, if $m = a^2$ and $n = b^2$ then (15.6) reads $u^2 = a^4 + b^4$. Since
\[ 0 < u \le u^4 = s^2 < s^2 + t^2 = z_0, \]
we have obtained the contradiction we were after.

We close this section by proving that the only integers that have rational square roots are the perfect squares. We first apply Fermat’s method of Infinite Descent to prove the result for primes, and then show how the general case follows from the Rational Root Theorem.

Proposition 19. Let $p$ be prime. Then $\sqrt{p}$ is irrational.

Proof. Let $p$ be a prime.
We will prove that $\sqrt{p}$ is irrational by employing Fermat’s method of Infinite Descent. Let
\[ S = \{ n \in \mathbb{N} \mid pn^2 = m^2 \text{ for some integer } m \}. \]
Assuming that $\sqrt{p}$ is rational, we can write
\[ \sqrt{p} = \frac{m}{n}, \]
for some positive integers $m$ and $n$. But then $pn^2 = m^2$ so that $n \in S$. We conclude that $S$ is nonempty. Since every element of $S$ is positive, we see as well that $S$ is bounded below. Therefore, by the least integer principle, $S$ has a least element $n_0$. Let $m_0 \in \mathbb{Z}$ be such that $pn_0^2 = m_0^2$. Then $p \mid m_0$ so that $m_0 = pm_1$ for some integer $m_1$. Consequently
\[ pn_0^2 = m_0^2 = p^2 m_1^2, \]
and so $n_0^2 = pm_1^2$. We now see that $p \mid n_0$ so that $n_0 = pn_1$ for some $n_1 \in \mathbb{N}$. This gives
\[ p^2 n_1^2 = n_0^2 = pm_1^2 \]
so that $pn_1^2 = m_1^2$. But this forces $n_1 \in S$ which is a contradiction since $n_1 < pn_1 = n_0$. By contradiction, we conclude that $\sqrt{p}$ is irrational, as required.

We have applied Fermat’s method of Infinite Descent to prove the irrationality of $\sqrt{p}$ for primes $p$. However, for any non square $d \in \mathbb{N}$, $\sqrt{d}$ is irrational. We now prove this generalization as a corollary of the Rational Root Theorem.

Theorem 34 (Rational Root Theorem). Let $f$ be a monic polynomial with integer coefficients. Then every rational root of $f$ is in fact an integer.

Proof. We may suppose that the degree, $k$, of $f$ is positive. Let $f(x) = x^k + \sum_{j=0}^{k-1} a_j x^j$, for integers $a_0, \dots, a_{k-1}$. Suppose that $x_0 = \frac{m}{n}$ is a rational root of $f$. By cancelling common factors from the numerator and denominator of $x_0$ if necessary, we may suppose that $(m, n) = 1$. Since $x_0$ is a root of $f$, we have
\[ f(x_0) = \frac{m^k}{n^k} + \sum_{j=0}^{k-1} a_j \frac{m^j}{n^j} = 0. \]
Multiplying through by $n^k$ yields
\[ m^k + n \sum_{j=0}^{k-1} a_j m^j n^{k-j-1} = 0. \]
Since every exponent of $n$ that appears in the sum is nonnegative, we see that $n$ divides the integer $n \sum_{j=0}^{k-1} a_j m^j n^{k-j-1}$. Therefore $n \mid m^k$. Since $(m, n) = 1$, we see that the only way for this to occur is to have $n = 1$. We conclude that $x_0 = \frac{m}{n} = \frac{m}{1} = m \in \mathbb{Z}$, as required.

Corollary 5.
Let $d \in \mathbb{N}$ not be a square. Then $\sqrt{d}$ is irrational.

Proof. We need to prove that if $\sqrt{d}$ is rational then $d$ is the square of an integer. Consider the monic quadratic polynomial $f$ given by $f(x) = x^2 - d$. From Theorem 34 we know that every rational root of $f$ is in fact an integer. Since $\sqrt{d}$ is a root of $f$, we conclude that if it were rational, it would have to be an integer. But if $\sqrt{d} = n \in \mathbb{Z}$, then $d = n^2$ would be the square of an integer as was to be shown.

Chapter 16 Sums of Squares

This chapter is based on [Dud08, §18, 19] and [Ser73, Appendix to Ch. 4]. In this section, we classify the integers that can be written as sums of squares. We will give complete proofs for the two squares and four squares cases, and a very rough outline for the three squares case. The results are that the only positive integers that cannot be written as a sum of two squares are the ones divisible by a prime $p \equiv_4 3$ to an odd power, the only positive integers that cannot be written as a sum of three squares are the ones of the form $4^a(8b - 1)$ for $a \in \mathbb{N}_0$ and $b \in \mathbb{N}$ and that every positive integer can be written as the sum of four squares. We start with the two squares case.

Theorem 35. A positive integer $n$ can be written as the sum of two squares if and only if $v_p(n)$ is even for all primes $p \equiv_4 3$.

Proof. Let $n \in \mathbb{N}$ be the sum of two squares. Say
\[ n = x^2 + y^2 \tag{16.1} \]
for nonnegative integers $x, y$. Suppose that $p$ is a prime such that $v_p(n)$ is odd. Since $n$ is an integer we have $v_p(n) \ge 0$ and so we conclude from the assumption that $v_p(n)$ is odd that $v_p(n) \ge 1$. We summarize:
\[ v_p(n) \text{ is an odd positive integer.} \tag{16.2} \]
Let $d = (x, y)$. Then, since $d$ divides both $x$ and $y$, we see from (16.1) that $d^2$ divides $n$. Define $x_1 = x/d$, $y_1 = y/d$ and $n_1 = n/d^2$. Dividing (16.1) by $d^2$ yields $n_1 = x_1^2 + y_1^2$.
(16.3) Now, we take the $p$-adic valuation of $n_1$, recalling that the $p$-adic valuation of integers is nonnegative, to obtain
\[ 0 \le v_p(n_1) = v_p\left( \frac{n}{d^2} \right) = v_p(n) - v_p(d^2) = v_p(n) - 2v_p(d). \]
We conclude from this and (16.2) that $v_p(n_1)$ is odd and nonnegative. It is therefore positive so that $p \mid n_1$. From (16.3), we see that either both of $x_1, y_1$ are divisible by $p$ or that neither one of them is divisible by $p$. Since $(x_1, y_1) = 1$ we conclude that neither $x_1$ nor $y_1$ is divisible by $p$ so that they both lie in $(\mathbb{Z}/p\mathbb{Z})^\times$. Considering (16.3) modulo $p$ yields
\[ x_1^2 + y_1^2 \equiv_p n_1 \equiv_p 0, \]
so that $x_1^2 \equiv_p -y_1^2$. Since $y_1 \in (\mathbb{Z}/p\mathbb{Z})^\times$, the inverse $y_1^{-1}$ modulo $p$ exists and we can multiply by $y_1^{-2}$ to obtain
\[ \left( x_1 y_1^{-1} \right)^2 \equiv_p -1. \]
But this forces $p = 2$ or $p$ is odd and $(-1/p) = 1$. Since the latter forces $p \equiv_4 1$, we see that the only primes $p$ for which $v_p(n)$ can be odd are $p = 2$ and $p \equiv_4 1$. Therefore, if $n$ is the sum of two squares then for all primes $p \equiv_4 3$ we have $v_p(n)$ is even.

Conversely, suppose that $v_p(n)$ is even for all primes $p \equiv_4 3$. We can then write the prime-power factorization of $n$ as
\[ n = 2^{v_2(n)} \left( \prod_{p \equiv_4 1} p^{v_p(n)} \right) \left( \prod_{p \equiv_4 3} p^{v_p(n)/2} \right)^2. \]
Since any square is already a sum of two squares (equal to itself plus $0^2$), in order to prove that $n$ is a sum of two squares, it suffices to show that 2 and primes $p \equiv_4 1$ are sums of two squares and that multiplying together integers representable as the sum of two squares yields another integer that is representable as the sum of two squares. We start by showing that the product of representable integers is representable. To do this, we need only combine the observation that
\[ (a^2 + b^2)(c^2 + d^2) = |a - bi|^2 |c + di|^2 = |(ac + bd) + (ad - bc)i|^2 = (ac + bd)^2 + (ad - bc)^2 \tag{16.4} \]
with a routine induction. Therefore, since $2 = 1^2 + 1^2$, we are reduced to proving that every prime $p$ congruent to 1 modulo 4 is representable as the sum of two squares. To this end, we first note that since $p \equiv_4 1$, we have $(-1/p) = 1$.
Therefore, there exists a positive integer $u$ such that $u^2 \equiv_p -1$. This implies that
\[ p \mid (u^2 + 1). \tag{16.5} \]
We will now complete the proof in two different ways. One way will be by Descent, and the other will be algebraic. First we proceed by descent. From (16.5) we see that the set $S$ given by
\[ S = \{ k \in \mathbb{N} \mid kp = x^2 + y^2 \text{ for some } x, y \in \mathbb{N} \} \]
is nonempty. Since it is bounded below, we can invoke the least integer principle to obtain a least element $k$. Say
\[ x^2 + y^2 = kp. \tag{16.6} \]
If we can show that $k = 1$, then we will have $p$ written as the sum of two squares, as required. We now proceed to show this. Let $r$ and $s$ be the representatives modulo $k$ for $x$ and $y$ respectively having least absolute value. Then
\begin{align*}
r \equiv_k x, \qquad & -\frac{k}{2} < r \le \frac{k}{2}; \tag{16.7} \\
s \equiv_k y, \qquad & -\frac{k}{2} < s \le \frac{k}{2}. \tag{16.8}
\end{align*}
Then
\[ r^2 + s^2 \equiv_k x^2 + y^2 = kp \equiv_k 0. \]
We can therefore write $r^2 + s^2 = k_1 k$ for some $k_1 \in \mathbb{N}_0$. Now, if $k_1 = 0$ then $r = s = 0$ so that $x \equiv_k y \equiv_k 0$. By (16.6) we see that this forces $k^2 \mid kp$ so that $k \mid p$. Therefore $k = 1$ or $k = p$. If $k = 1$ we’re done. If $k = p$ then $p \mid x, y$ and
\[ \left( \frac{x}{p} \right)^2 + \left( \frac{y}{p} \right)^2 = 1. \]
But this would force one of $x, y$ to be equal to zero (and the other to be equal to $p$) which is a contradiction since $x, y \in \mathbb{N}$. We can therefore assume that $k_1 \in \mathbb{N}$. We have
\[ (rx + sy)^2 + (ry - sx)^2 = (r^2 + s^2)(x^2 + y^2) = k_1 k^2 p. \tag{16.9} \]
However,
\[ rx + sy \equiv_k r^2 + s^2 \equiv_k 0, \qquad ry - sx \equiv_k rs - sr \equiv_k 0 \]
so that both $rx + sy$ and $ry - sx$ are divisible by $k$. We can therefore divide (16.9) through by $k^2$ to obtain
\[ \left( \frac{rx + sy}{k} \right)^2 + \left( \frac{ry - sx}{k} \right)^2 = k_1 p. \]
However, we have
\[ k \le k_1 k = r^2 + s^2 \le \left( \frac{k}{2} \right)^2 + \left( \frac{k}{2} \right)^2 = \frac{k^2}{2} < k^2, \]
so that $1 \le k_1 < k$ which contradicts the minimality of $k$. The only case that didn’t lead to a contradiction was the case $r = s = 0$ and $k = 1$. Therefore $k = 1$ and $p = x^2 + y^2$ is representable as the sum of two squares, as required.

To provide an alternative proof, we work in the ring $\mathbb{Z}[i]$ of Gaussian integers given by
\[ \mathbb{Z}[i] = \{ a + bi \mid a, b \in \mathbb{Z} \}, \]
where $i$ is a chosen square root of $-1$.
It can be shown that $\mathbb{Z}[i]$ has unique factorization so that primes in $\mathbb{Z}[i]$ correspond to irreducibles in $\mathbb{Z}[i]$. Recall that an element $\alpha$ is called prime when
\[ \alpha \mid \beta\gamma \implies \alpha \mid \beta \text{ or } \alpha \mid \gamma \]
and an element $\alpha$ is called irreducible if
\[ \alpha = \beta\gamma \implies |\beta| = 1 \text{ or } |\gamma| = 1. \]
Since $p \mid (u^2 + 1)$, we see that $p \mid (u + i)(u - i)$. However, as neither $\frac{u}{p} + \frac{1}{p} i$ nor $\frac{u}{p} - \frac{1}{p} i$ lies in $\mathbb{Z}[i]$ (lest $\frac{1}{p} \in \mathbb{Z}$) we see that $p$ is not prime in $\mathbb{Z}[i]$. It follows that it is reducible so that we can write
\[ p = \alpha\beta \tag{16.10} \]
for some $\alpha, \beta \in \mathbb{Z}[i]$ with $|\alpha| \ne 1$ and $|\beta| \ne 1$. Note that for any $\delta \in \mathbb{Z}[i]$, $|\delta|^2 \in \mathbb{Z}$. Indeed, if $\delta = c + di$ then $|\delta|^2 = c^2 + d^2 \in \mathbb{Z}$. From (16.10) we see that $p^2 = |\alpha|^2 |\beta|^2$ so that both of $|\alpha|^2$ and $|\beta|^2$ are integers not equal to 1 that divide $p^2$. The only possibility is to have $|\alpha|^2 = |\beta|^2 = p$. But this completes the proof since if $\alpha = a + bi$, then $p = |\alpha|^2 = a^2 + b^2$, as required.

We now turn to the determination of the positive integers that can be written as the sum of three squares. The proof of the classification of such integers is difficult and so we can only provide an outline. We start with the statement of Gauss’ classification before proceeding to an outline of the proof.

Theorem 36 (Gauss). A positive integer $n$ can be written as the sum of three squares if and only if it is not of the form $4^a(8b - 1)$. In particular, an odd integer $n$ can be written as the sum of three squares if and only if $n \not\equiv_8 -1$.

The proof of Theorem 36 is split into establishing three equivalences:
\begin{align*}
n \text{ is not of the form } 4^a(8b - 1) &\iff -n \text{ fails to be a square modulo some power of } 2; \\
-n \text{ fails to be a square modulo some power of } 2 &\iff n = x^2 + y^2 + z^2 \text{ for some } x, y, z \in \mathbb{Q}; \\
n = x^2 + y^2 + z^2 \text{ for some } x, y, z \in \mathbb{Q} &\iff n = x^2 + y^2 + z^2 \text{ for some } x, y, z \in \mathbb{Z}.
\end{align*}
We will prove the first and third equivalences, but content ourselves with a very rough sketch of a proof of the second equivalence. We prove the first equivalence by way of Hensel’s Lemma.
By the Chinese Remainder Theorem and the Fundamental Theorem of Arithmetic, solving polynomial congruences modulo positive integers is reduced to solving polynomial congruences modulo prime powers. Hensel’s Lemma allows us, under certain conditions, to further reduce this problem to the consideration of polynomial congruences modulo primes.

Theorem 37 (Hensel’s Lemma). Let $f$ be a polynomial with integer coefficients and $p$ be a prime. If there exists an integer $a$ such that
\[ v_p(f(a)) > 2v_p(f'(a)) \tag{16.11} \]
then $f(x) \equiv_{p^j} 0$ has solutions for all $j$.

Sketch of Proof. Assume the hypotheses and define a sequence of rational numbers $\{\alpha_0, \alpha_1, \alpha_2, \dots\}$ recursively by
\[ \alpha_0 = a, \qquad \alpha_{i+1} = \alpha_i - \frac{f(\alpha_i)}{f'(\alpha_i)} \quad (i \ge 0). \]
One then shows that $\lim_{i \to \infty} v_p(f(\alpha_i)) = \infty$. This implies that $f$ has roots modulo every power of $p$ since given a particular power $p^k$, we need only choose an index $i$ such that $v_p(f(\alpha_i)) \ge k$. We would then have
\[ f(\alpha_i) \equiv_{p^k} 0 \]
as required. More comes out of the proof, however. One shows that for all $i \ge 0$ we have
\[ v_p(\alpha_{i+1} - \alpha_i) \ge 2^i, \qquad v_p(f'(\alpha_i)) = v_p(f'(a)), \qquad v_p(f(\alpha_i)) \ge 2^i. \]
We conclude that for each $i$,
\[ \alpha_{i+1} \equiv_{p^{2^i}} \alpha_i, \qquad f(\alpha_i) \equiv_{p^{2^i}} 0. \]

Some remarks are in order. First of all, the condition $v_p(f(a)) > 2v_p(f'(a))$ implies in particular that $v_p(f(a)) > 0$ since $f'(a)$ is an integer and so has nonnegative $p$-adic valuation. Therefore, the situation in which we apply Hensel’s Lemma is the situation where $f$ has a root modulo $p$. This is the case since
\[ v_p(f(a)) > 0 \iff f(a) \equiv_p 0. \]
Now, since
\[ v_p(f'(a)) = 0 \iff f'(a) \not\equiv_p 0, \]
we see that Hensel’s Lemma applies automatically whenever $f$ has a simple root modulo $p$. Indeed, if $f(a) \equiv_p 0$ but $f'(a) \not\equiv_p 0$, then $v_p(f(a)) > 0 = 2v_p(f'(a))$. The point is that whenever $f$ has a simple root modulo $p$, this root can be lifted to higher powers of $p$ without bound to obtain roots of $f$ modulo any power of $p$. This is the nonsingular special case of Hensel’s Lemma given by Corollary 6 below.
Hensel’s Lemma can also be applied, however, when $f'(a) \equiv_p 0$ (the singular case) provided $f$ vanishes at $a$ modulo a sufficiently high power of $p$ (larger than $2v_p(f'(a))$).

Aside 2. Recall that the $p$-adic absolute value $|\cdot|_p$ is defined by
\[ |x|_p = p^{-v_p(x)}, \]
and that we obtain the $p$-adic numbers by adding to the set of rationals all limits of Cauchy sequences of rationals. We have $v_p(x)$ is large when $|x|_p$ is small. We can re-write (16.11) as
\[ |f(a)|_p < |f'(a)|_p^2, \]
and it turns out that $f(x) \equiv_{p^j} 0$ having solutions for all $j$ is equivalent to $f(x) = 0$ having solutions in $p$-adic numbers. So, all in all, Hensel’s Lemma can be interpreted as stating that if we can find an integer, $a$, such that $f(a)$ is sufficiently close to zero (closer than $f'(a)^2$) then we can actually find a $p$-adic integer $b$ (corresponding to the infinite sequence $\{\alpha_0, \alpha_1, \alpha_2, \dots\}$ constructed via Newton’s method) such that $f(b) = 0$.

Often, the following “nonsingular” version of Hensel’s Lemma corresponding to when $f$ has a simple root modulo $p$ is sufficient.

Corollary 6 (Hensel’s Lemma: Nonsingular Case). Let $f$ be a polynomial with integer coefficients and $p$ be a prime. If there exists an integer $a$ such that $f(a) \equiv_p 0$ and $f'(a) \not\equiv_p 0$, then $f(x) \equiv_{p^j} 0$ has solutions for all $j$. In words, if $f$ has a simple root modulo $p$ then $f$ has roots modulo every power of $p$.

Proof. By Theorem 37, it is enough to show that $f(a) \equiv_p 0$ and $f'(a) \not\equiv_p 0$ implies that $v_p(f(a)) > 2v_p(f'(a))$. But this is clear since $f(a)$ and $f'(a)$ are integers so that $v_p(f(a))$ and $v_p(f'(a))$ are nonnegative. We can then restate our hypotheses as follows:
\begin{align*}
f(a) \equiv_p 0 &\iff v_p(f(a)) \ge 1, \\
f'(a) \not\equiv_p 0 &\iff v_p(f'(a)) = 0.
\end{align*}
Therefore, in this case, we have
\[ v_p(f(a)) \ge 1 > 0 = 2 \cdot 0 = 2v_p(f'(a)). \]

We now illustrate the utility of Hensel’s Lemma with an example.

Example 23.
Show that $f(x)$ has roots modulo every power of 3 for the following polynomials $f(x)$:
(a) $f(x) = x^3 + x^2 + x + 1$;
(b) $f(x) = x^2 + x + 223$.

Solution. (a) We consider $f(x)$ modulo 3 to obtain
\[ x^3 + x^2 + x + 1 \equiv_3 0. \]
We see that if there is a solution $a$, it must not be congruent to 0 modulo 3. By Fermat’s Little Theorem, we have $a^3 \equiv_3 a$ and $a^2 \equiv_3 1$. The congruence then becomes
\[ a + 1 + a + 1 \equiv_3 0 \iff 2(a+1) \equiv_3 0 \iff a + 1 \equiv_3 0 \iff a \equiv_3 2. \]
Further, $f'(x) = 3x^2 + 2x + 1$ so that $f'(a) \equiv_3 2a + 1$. If we choose $a \equiv_3 2$ we then have $f(a) \equiv_3 0$ and $f'(a) \equiv_3 (2)(2) + 1 \not\equiv_3 0$. Therefore, by the “nonsingular” version of Hensel’s Lemma, we see that $f(x) \equiv_{3^j} 0$ has solutions for all $j$.

(b) Here, the congruences in question are
\[ x^2 + x + 223 \equiv_{3^j} 0 \quad (j \ge 1). \]
We start with the $j = 1$ case with hopes of being able to apply the nonsingular case of Hensel’s Lemma. Since
\[ 223 \equiv_3 2 + 2 + 3 \equiv_3 7 \equiv_3 1, \]
we see that the congruence in question is given by
\[ x^2 + x + 1 \equiv_3 0 \iff x^2 - 2x + 1 \equiv_3 0 \iff (x-1)^2 \equiv_3 0 \iff x \equiv_3 1. \]
Now we compute $f'(1)$ in hopes that it is not congruent to 0 modulo 3. We are not that lucky in this case since $f'(x) = 2x + 1$, so that $f'(1) \equiv_3 2(1) + 1 \equiv_3 0$. We now see that we need to apply the “singular” version of Hensel’s Lemma. We therefore take note of the 3-adic valuation of $f'(1)$ and try to find a solution to $f(x) \equiv_{3^j} 0$ for $j = 2v_3(f'(1)) + 1$. Since $f'(1) = 2(1) + 1 = 3$, we see that $v_3(f'(1)) = v_3(3) = 1$. We therefore seek a solution $a$ to $f(x) \equiv_{3^3} 0$ for which $v_3(f'(a)) = 1$. Since $223 \equiv_{27} 7$, the congruence in question is given by
\[ x^2 + x + 7 \equiv_{27} 0. \]
Now, we know that any solution must be a solution modulo 3 as well and so must be congruent to 1 modulo 3. Since 1 clearly is not a solution modulo 27, the next integer to try is 4. We compute
\[ f(4) \equiv_{27} 4^2 + 4 + 7 \equiv_{27} 27 \equiv_{27} 0. \]
However, $f'(4) = 2(4) + 1 = 9$. Therefore $v_3(f'(4)) = v_3(3^2) = 2$. We therefore did not manage to find a root $a$ of $f$ modulo $3^3$ for which $v_3(f'(a)) = 1$.
Turning things around, however, since we know that $v_3(f'(4)) = 2$, we will be able to apply Hensel’s Lemma provided $f(4) \equiv_{3^5} 0$. This is in fact the case since
\[ f(4) = 4^2 + 4 + 223 = 243 = 3^5 \equiv_{3^5} 0. \]
We conclude that
\[ v_3(f(4)) \ge 5 > 4 = (2)(2) = 2v_3(f'(4)) \]
so that Hensel’s Lemma applies. We therefore have roots of $f$ modulo every power of 3.

Hensel’s Lemma comes into our outline of a proof of Theorem 36 by establishing the first equivalence mentioned above.

Lemma 22. Let $n \in \mathbb{N}$. Then $n$ is of the form $4^a(8b-1)$ if and only if $-n$ is a square modulo every power of 2.

Proof. Let $n \in \mathbb{N}$ and suppose that $n = 4^a(8b-1)$ for some $a \ge 0$ and $b \ge 1$. Define $f(x) = x^2 + n$. We need to prove that $f(x) \equiv_{2^j} 0$ has solutions for all $j$. By Hensel’s Lemma, it is sufficient to find an integer $m$ such that $v_2(f(m)) > 2v_2(f'(m))$. That is, it is sufficient to find an integer $m$ such that $f(m) \equiv_{2^e} 0$ for some $e \ge 2v_2(f'(m)) + 1$. We note that
\[ n = 4^a(8b-1) = 2^{2a+3}b - (2^a)^2 \equiv_{2^{2a+3}} -(2^a)^2. \]
Therefore
\[ f(2^a) = (2^a)^2 + n \equiv_{2^{2a+3}} (2^a)^2 - (2^a)^2 = 0. \]
Finally, since
\[ v_2(f(2^a)) \ge 2a + 3 > 2a + 2 = 2(a+1) = 2v_2(2\cdot 2^a) = 2v_2(f'(2^a)), \]
we see that Hensel’s Lemma applies and we obtain solutions to $f(x) \equiv_{2^j} 0$ for all $j$. Consequently, $-n$ is a square modulo every power of 2.

Conversely, suppose that $-n$ is a square modulo every power of 2. Write $n = 2^k m$ for $m$ odd. We know that $-n$ is a square modulo $2^{k+1}$ so that, for some $x \in \mathbb{N}$, we have
\[ -2^k m \equiv_{2^{k+1}} x^2. \]
But then we can write $x^2 = -2^k m + 2^{k+1}\ell$ for some $\ell$, which implies that $x^2 = 2^k(2\ell - m)$. But $2\ell - m$ is odd since $m$ is odd, and so
\[ 2v_2(x) = v_2(x^2) = v_2(2^k(2\ell - m)) = v_2(2^k) + v_2(2\ell - m) = k + 0 = k. \]
We conclude that $k = 2a$ is even. Since this gives $n = 4^a m$, we are reduced to proving that $m \equiv_8 -1$. We will do this by showing that $-m$ is a square modulo 8, which yields the desired result since the only odd square modulo 8 is 1. To this end, we know that $-n$ is a square modulo $2^{2a+3}$.
Therefore, for some integer $y$ we have
\[ -4^a m \equiv_{2^{2a+3}} y^2, \tag{16.12} \]
which yields an integer $j$ such that
\[ -4^a m + 2^{2a+3} j = y^2 \implies 2^{2a}(8j - m) = y^2. \]
We conclude that $2^a \mid y$ so that $y = 2^a z$ for some $z$. We now see that (16.12) becomes
\[ -2^{2a} m \equiv_{2^{2a+3}} 2^{2a} z^2 \implies -m \equiv_{2^3} z^2, \]
as was to be shown.

We now proceed to establishing the third equivalence given above by showing that a positive integer $n$ is representable as the sum of three rational squares if and only if it is representable as the sum of three integral squares. This is the content of the following proposition.

Proposition 20. Let $f(X, Y, Z) = X^2 + Y^2 + Z^2$ and $n \in \mathbb{N}$. Then $f(X, Y, Z) = n$ has a solution $X, Y, Z \in \mathbb{Q}$ if and only if $f(X, Y, Z) = n$ has a solution $X, Y, Z \in \mathbb{Z}$.

Proof. We will denote $f(x_1, x_2, x_3)$ by $f(x)$ where $x = [x_1, x_2, x_3]^T$ is the associated column vector. We then see that
\[ f(x) = \|x\|^2 = x \cdot x, \]
where $\|\cdot\|$ denotes the norm and $\cdot$ denotes the dot product defined on vectors. If $f(x) = n$ has a solution in $\mathbb{Z}^3$, then it is clear that $f(x) = n$ has a solution in $\mathbb{Q}^3$. In fact, we can use the same solution. What needs to be proved here is that the existence of a solution over the rationals implies the existence of a solution over the integers. Suppose then that $f(x) = n$ has a solution $v \in \mathbb{Q}^3$. For $1 \le i \le 3$, write $v_i = r_i/s_i$ for integers $r_i, s_i$ with $s_i > 0$. The equation $f(v) = n$ becomes
\[ \frac{r_1^2}{s_1^2} + \frac{r_2^2}{s_2^2} + \frac{r_3^2}{s_3^2} = n. \]
Multiplying through by $(s_1 s_2 s_3)^2$ yields
\[ (r_1 s_2 s_3)^2 + (r_2 s_1 s_3)^2 + (r_3 s_1 s_2)^2 = (s_1 s_2 s_3)^2 n. \]
We see, therefore, that there exists a positive integer $t$ such that $t^2 n = f(x)$ for some $x \in \mathbb{Z}^3$. The set
\[ S = \{t \in \mathbb{N} \mid t^2 n = f(x) \text{ for some } x \in \mathbb{Z}^3\} \]
is then nonempty and bounded below. By the least integer principle it has a least element $t$. Say $f(x) = t^2 n$ for $x \in \mathbb{Z}^3$. We aim to prove that $t = 1$ so that $n$ is represented by $f$ over the integers. Towards a contradiction, suppose that $t > 1$.
For $1 \le i \le 3$, let $y_i$ be the closest integer to $x_i/t$ so that $|y_i - x_i/t| \le \frac{1}{2}$. Define $z = y - \frac{1}{t}x$. Then
\[ f(z) = z \cdot z = \sum_{i=1}^{3} \left(y_i - \frac{x_i}{t}\right)^2 \le \sum_{i=1}^{3} \frac{1}{4} = \frac{3}{4} < 1. \tag{16.13} \]
If $f(z) = 0$, then $\|z\| = 0$ so that $z = 0$. This forces $x = ty$ so that
\[ t^2 n = f(x) = x \cdot x = (ty)\cdot(ty) = t^2 (y \cdot y) = t^2 f(y). \]
But then $f(y) = n$ with $y \in \mathbb{Z}^3$, and so we have a representation of $n$ by $f$ over the integers, which forces $t = 1$ by the minimality of $t$. On the other hand, if $f(z) \ne 0$, then $z \cdot z > 0$. Define
\[ x' = ax + by \]
for $a = f(y) - n$ and $b = 2nt - 2x\cdot y$. Then
\[ \begin{aligned}
f(x') = x' \cdot x' &= (ax + by)\cdot(ax + by) \\
&= a^2\, x\cdot x + 2ab\, x \cdot y + b^2\, y \cdot y \\
&= a^2 f(x) + ab(2nt - b) + b^2 f(y) \\
&= a^2 t^2 n + 2abnt - ab^2 + b^2(a + n) \\
&= a^2 t^2 n + 2abnt - ab^2 + ab^2 + b^2 n \\
&= n(a^2 t^2 + 2abt + b^2) \\
&= n(at + b)^2.
\end{aligned} \]
Thus, with $t' = at + b$ we have $nt'^2 = f(x')$ for $x' \in \mathbb{Z}^3$. However, we have
\[ \begin{aligned}
tt' = at^2 + bt &= t^2\, y\cdot y - t^2 n + 2nt^2 - 2t\, x\cdot y \\
&= t^2\, y \cdot y + t^2 n - 2t\, x \cdot y \\
&= t^2\, y\cdot y - 2t\, x \cdot y + x \cdot x \\
&= (ty - x)\cdot(ty - x) \\
&= t^2 \left(y - \tfrac{1}{t}x\right)\cdot\left(y - \tfrac{1}{t}x\right) \\
&= t^2\, z \cdot z.
\end{aligned} \]
We conclude from (16.13) and our assumption that $z \cdot z > 0$ that $t' = t\, z \cdot z$ is positive and less than $t$. But this contradicts the minimality of $t$.

We now provide a rough sketch of a proof of the second equivalence given above and put everything we have done so far together to obtain an outline of a proof of Theorem 36.

Outline of Proof of Theorem 36. By Lemma 22, we know that $n$ is not of the form $4^a(8b-1)$ if and only if $-n$ fails to be a square modulo some power of 2. Also, by Proposition 20 we know that $n$ can be written as the sum of three integral squares if and only if it can be written as the sum of three rational squares. All in all, we are reduced to proving that with $f(X, Y, Z) = X^2 + Y^2 + Z^2$, we have
\[ f(x) = n \text{ has a solution } x \in \mathbb{Q}^3 \iff -n \text{ fails to be a square modulo some power of 2.} \tag{16.14} \]
Unfortunately, proving this final equivalence lies beyond the scope of these notes.
What is required is the theorem of Hasse-Minkowski, which states that nondegenerate quadratic forms have rational solutions if and only if they have real solutions and $p$-adic solutions for all primes $p$. This applies to $f$, and so $f(x) = n$ has solutions over $\mathbb{Q}$ if and only if it has solutions over $\mathbb{R}$ and each of the $p$-adic fields $\mathbb{Q}_p$. The condition that $f(x) = n$ has solutions over $\mathbb{R}$ forces $n > 0$, and it turns out that we automatically obtain solutions over each $p$-adic field $\mathbb{Q}_p$ for $p$ odd. It therefore all comes down to the $p = 2$ case, and it can be shown that having solutions over $\mathbb{Q}_2$ is equivalent to $-n$ not being a square in $\mathbb{Q}_2$. This, in turn, is equivalent to the right hand side of (16.14).

We now show how Lagrange’s four square theorem follows readily from Theorem 36.

Theorem 38 (Lagrange). Every positive integer can be written as the sum of four squares.

Proof. Let $n \in \mathbb{N}$ and write $n = 4^k m$ for $m$ not divisible by 4. Since $4^k = (2^k)^2$ is a square, it is sufficient to prove that $m$ is a sum of four squares. Indeed, if $m = a^2 + b^2 + c^2 + d^2$, then
\[ n = (2^k a)^2 + (2^k b)^2 + (2^k c)^2 + (2^k d)^2. \]
If $m \not\equiv_8 -1$, then we know from Theorem 36 that $m$ can be written as the sum of three squares. Adding $0^2$ to such an expression shows that every such $m$ can be written as the sum of four squares. On the other hand, if $m \equiv_8 -1$, then Theorem 36 implies that $m - 1$ can be written as the sum of three squares. Indeed, we would have
\[ m - 1 \equiv_8 -2 \equiv_8 6 \quad\text{whereas}\quad 4^a(8b - 1) \equiv_8 \begin{cases} 7 & \text{if } a = 0; \\ -4 \equiv_8 4 & \text{if } a = 1; \\ 0 & \text{if } a \ge 2, \end{cases} \]
so that $m - 1$ is not of the form $4^a(8b-1)$. Writing $m - 1 = a^2 + b^2 + c^2$ shows that $m = a^2 + b^2 + c^2 + 1^2$ is the sum of four squares.

The following table lists the smallest representations of the positive integers less than or equal to 100 as the sum of squares. Here, by smallest, we mean that we use the least number of positive squares necessary, and then pick from all representations using this number of squares the smallest with respect to the lexicographic ordering.
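The selection rule just described (fewest positive squares first, then lexicographically smallest) can be checked by a direct search. The following is a minimal Python sketch under that reading of the rule; the function name `smallest_representation` is an illustrative choice, and the bases of the squares are listed in nondecreasing order, which is the convention the table appears to use.

```python
from math import isqrt


def smallest_representation(n):
    """Bases of the shortest, then lexicographically smallest, sum of positive squares equal to n."""

    def search(remaining, count, minimum):
        # Write `remaining` as a sum of `count` squares of integers >= minimum,
        # in nondecreasing order, always trying the smallest candidate first.
        if count == 0:
            return [] if remaining == 0 else None
        # The smallest of `count` squares is at most remaining / count.
        for x in range(minimum, isqrt(remaining // count) + 1):
            rest = search(remaining - x * x, count - 1, x)
            if rest is not None:
                return [x] + rest
        return None

    # By Lagrange's four square theorem, at most four squares are ever needed.
    for count in range(1, 5):
        found = search(n, count, 1)
        if found is not None:
            return found


for n in (7, 28, 43, 71, 100):
    print(n, smallest_representation(n))
```

Because the search tries smaller bases before larger ones at every position, the first representation it finds with a given number of squares is the lexicographically smallest one.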
\[ \begin{array}{llll}
1 = 1^2 & 2 = 1^2 + 1^2 & 3 = 1^2 + 1^2 + 1^2 & 4 = 2^2 \\
5 = 1^2 + 2^2 & 6 = 1^2 + 1^2 + 2^2 & 7 = 1^2 + 1^2 + 1^2 + 2^2 & 8 = 2^2 + 2^2 \\
9 = 3^2 & 10 = 1^2 + 3^2 & 11 = 1^2 + 1^2 + 3^2 & 12 = 2^2 + 2^2 + 2^2 \\
13 = 2^2 + 3^2 & 14 = 1^2 + 2^2 + 3^2 & 15 = 1^2 + 1^2 + 2^2 + 3^2 & 16 = 4^2 \\
17 = 1^2 + 4^2 & 18 = 3^2 + 3^2 & 19 = 1^2 + 3^2 + 3^2 & 20 = 2^2 + 4^2 \\
21 = 1^2 + 2^2 + 4^2 & 22 = 2^2 + 3^2 + 3^2 & 23 = 1^2 + 2^2 + 3^2 + 3^2 & 24 = 2^2 + 2^2 + 4^2 \\
25 = 5^2 & 26 = 1^2 + 5^2 & 27 = 1^2 + 1^2 + 5^2 & 28 = 1^2 + 1^2 + 1^2 + 5^2 \\
29 = 2^2 + 5^2 & 30 = 1^2 + 2^2 + 5^2 & 31 = 1^2 + 1^2 + 2^2 + 5^2 & 32 = 4^2 + 4^2 \\
33 = 1^2 + 4^2 + 4^2 & 34 = 3^2 + 5^2 & 35 = 1^2 + 3^2 + 5^2 & 36 = 6^2 \\
37 = 1^2 + 6^2 & 38 = 1^2 + 1^2 + 6^2 & 39 = 1^2 + 1^2 + 1^2 + 6^2 & 40 = 2^2 + 6^2 \\
41 = 4^2 + 5^2 & 42 = 1^2 + 4^2 + 5^2 & 43 = 3^2 + 3^2 + 5^2 & 44 = 2^2 + 2^2 + 6^2 \\
45 = 3^2 + 6^2 & 46 = 1^2 + 3^2 + 6^2 & 47 = 1^2 + 1^2 + 3^2 + 6^2 & 48 = 4^2 + 4^2 + 4^2 \\
49 = 7^2 & 50 = 1^2 + 7^2 & 51 = 1^2 + 1^2 + 7^2 & 52 = 4^2 + 6^2 \\
53 = 2^2 + 7^2 & 54 = 1^2 + 2^2 + 7^2 & 55 = 1^2 + 1^2 + 2^2 + 7^2 & 56 = 2^2 + 4^2 + 6^2 \\
57 = 2^2 + 2^2 + 7^2 & 58 = 3^2 + 7^2 & 59 = 1^2 + 3^2 + 7^2 & 60 = 1^2 + 1^2 + 3^2 + 7^2 \\
61 = 5^2 + 6^2 & 62 = 1^2 + 5^2 + 6^2 & 63 = 1^2 + 1^2 + 5^2 + 6^2 & 64 = 8^2 \\
65 = 1^2 + 8^2 & 66 = 1^2 + 1^2 + 8^2 & 67 = 3^2 + 3^2 + 7^2 & 68 = 2^2 + 8^2 \\
69 = 1^2 + 2^2 + 8^2 & 70 = 3^2 + 5^2 + 6^2 & 71 = 1^2 + 3^2 + 5^2 + 6^2 & 72 = 6^2 + 6^2 \\
73 = 3^2 + 8^2 & 74 = 5^2 + 7^2 & 75 = 1^2 + 5^2 + 7^2 & 76 = 2^2 + 6^2 + 6^2 \\
77 = 2^2 + 3^2 + 8^2 & 78 = 2^2 + 5^2 + 7^2 & 79 = 1^2 + 2^2 + 5^2 + 7^2 & 80 = 4^2 + 8^2 \\
81 = 9^2 & 82 = 1^2 + 9^2 & 83 = 1^2 + 1^2 + 9^2 & 84 = 2^2 + 4^2 + 8^2 \\
85 = 2^2 + 9^2 & 86 = 1^2 + 2^2 + 9^2 & 87 = 1^2 + 1^2 + 2^2 + 9^2 & 88 = 4^2 + 6^2 + 6^2 \\
89 = 5^2 + 8^2 & 90 = 3^2 + 9^2 & 91 = 1^2 + 3^2 + 9^2 & 92 = 1^2 + 1^2 + 3^2 + 9^2 \\
93 = 2^2 + 5^2 + 8^2 & 94 = 2^2 + 3^2 + 9^2 & 95 = 1^2 + 2^2 + 3^2 + 9^2 & 96 = 4^2 + 4^2 + 8^2 \\
97 = 4^2 + 9^2 & 98 = 7^2 + 7^2 & 99 = 1^2 + 7^2 + 7^2 & 100 = 10^2
\end{array} \]

We have seen that every positive integer can be written as the sum of four squares. However, we have allowed $0^2$ to appear as one of the summands.
The question arises: “What is the situation for nonzero squares?” By the following proposition, we see that infinitely many positive integers cannot be written as the sum of four nonzero squares but that every sufficiently large positive integer can be written as the sum of five nonzero squares.

Proposition 21.
(a) No odd power of 2 can be written as the sum of four nonzero squares.
(b) Every integer $n > 169$ can be written as the sum of five nonzero squares.

Proof. (a) We prove the result by way of the least integer principle. Let
\[ S = \{r \in \mathbb{N}_0 \mid 2^{2r+1} \text{ can be written as the sum of four nonzero squares}\}. \]
We aim to show that $S = \emptyset$. Suppose not. Then $S$ is a nonempty set of integers that is bounded below. By the least integer principle it has a least element $r$. Say
\[ 2^{2r+1} = x^2 + y^2 + z^2 + w^2 \quad (1 \le x \le y \le z \le w). \tag{16.15} \]
Since
\[ 2^{2r+1} = x^2 + y^2 + z^2 + w^2 \ge 1^2 + 1^2 + 1^2 + 1^2 = 4 = 2^2, \]
we see that $2r + 1 \ge 2$ so that $r \ge 1$. Since this implies that $2r + 1 \ge 3$, we see that
\[ x^2 + y^2 + z^2 + w^2 \equiv_8 2^{2r+1} \equiv_8 0. \]
Since every odd square is congruent to 1 modulo 8, we see that 0, 2 or 4 of $x, y, z, w$ are odd. However, since every even square is congruent to 0 or 4 modulo 8, a calculation shows that
\[ x^2 + y^2 + z^2 + w^2 \equiv_8 \begin{cases} 4 & \text{if all of } x, y, z, w \text{ are odd;} \\ 2 \text{ or } 6 & \text{if exactly two of } x, y, z, w \text{ are odd.} \end{cases} \]
Therefore all of $x, y, z, w$ are even. We now divide (16.15) by 4 to obtain
\[ 2^{2(r-1)+1} = 2^{2r-1} = \left(\frac{x}{2}\right)^2 + \left(\frac{y}{2}\right)^2 + \left(\frac{z}{2}\right)^2 + \left(\frac{w}{2}\right)^2. \]
But $r - 1 \ge 0$, and so this last equation implies that $r - 1 \in S$, thereby contradicting the minimality of $r$. We must then have $S = \emptyset$, as was to be shown.

(b) Observe that
\[ \begin{aligned}
169 &= 13^2 && (16.16) \\
&= 5^2 + 12^2 && (16.17) \\
&= 3^2 + 4^2 + 12^2 && (16.18) \\
&= 1^2 + 2^2 + 8^2 + 10^2. && (16.19)
\end{aligned} \]
Now let $n > 169$ so that $n - 169$ is a positive integer. By Lagrange’s four square theorem we know that $n - 169$ can be written as the sum of four squares. Also, since $n - 169 > 0$, at least one of the squares involved in such a representation must be nonzero. We can therefore write
\[ n - 169 = x^2 + y^2 + z^2 + w^2 \quad (0 \le x \le y \le z \le w,\ w \ne 0). \]
We then have
\[ n = \begin{cases}
13^2 + x^2 + y^2 + z^2 + w^2 & \text{if } x, y, z \ne 0; \\
5^2 + 12^2 + y^2 + z^2 + w^2 & \text{if } x = 0,\ y, z \ne 0; \\
3^2 + 4^2 + 12^2 + z^2 + w^2 & \text{if } x = y = 0,\ z \ne 0; \\
1^2 + 2^2 + 8^2 + 10^2 + w^2 & \text{if } x = y = z = 0.
\end{cases} \]
In any case, we have represented $n$ as the sum of five nonzero squares, as required.

Chapter 17
$x^2 - Ny^2 = 1$

This chapter is based on [Dud08, §20]. We have been interested in this course in the values taken on by polynomials in two variables having integer coefficients for which every term has the same degree. Such polynomials are called binary homogeneous forms. We have already studied this question for linear forms as well as a particular case of a quadratic form. In the linear case, we ask which values $c$ are taken on by the form $ax + by$. We found that $c$ is not taken on by this form when $(a, b) \nmid c$, while when $(a, b) \mid c$, $c$ is taken on by this form infinitely often. Indeed, we found, in case $(a, b) \mid c$, by way of the Euclidean Algorithm that $c$ is taken on by the form $ax + by$ at least once, and that if $ax_0 + by_0 = c$ then $c$ is taken on by the form infinitely often, as is seen by taking
\[ x = x_0 + t\,\frac{b}{(a,b)}, \qquad y = y_0 - t\,\frac{a}{(a,b)} \qquad (t \in \mathbb{Z}). \]
The next level of complexity is given by binary quadratic forms. In this case, we ask which integers $c$ are taken on by the form
\[ ax^2 + bxy + dy^2 \quad (a, b, d \in \mathbb{Z}). \]
The particular case obtained by setting $a = d = 1$ and $b = 0$ was studied earlier in this course. This case calls for the classification of the integers $c$ that can be written as the sum of two squares. In this section we consider another binary quadratic form, namely $x^2 - dy^2$ for $d \in \mathbb{Z}$. We will show that for $d$ positive and not a square, we obtain infinitely many representations of 1 by this form. So, we can obtain infinitely many representations for a given integer by binary linear forms and binary quadratic forms. This is where it stops, however, as is shown by the following theorem.

Theorem 39. Let $f(x) = \sum_{k=0}^{n} a_k x^k$ be a polynomial of degree $n \ge 3$ with no repeated roots. Then, for any integer $c$, the binary form $F$ of degree $n$ given by
\[ F(x, y) = \sum_{k=0}^{n} a_k x^k y^{n-k} \]
takes on the value $c$ only finitely often.

We now turn to determining the solutions to Pell’s equation $x^2 - dy^2 = 1$. That is, we determine when 1 can be represented by the binary quadratic form $x^2 - dy^2$. First of all, since we are dealing with squares, the nontrivial solutions ($xy \ne 0$) are determined by the positive solutions ($x, y > 0$). Also, the equation is not interesting when $d = 0$. Further, by the following lemma, in searching for positive solutions, we can assume that $d$ is positive and not a square.

Lemma 23. If $d < 0$ or $d$ is a square, then there are no nontrivial solutions to $x^2 - dy^2 = 1$.

Proof. If $d < 0$ and neither $x$ nor $y$ is zero then $x^2, y^2 \ge 1$ and $d \le -1$. Therefore
\[ x^2 - dy^2 \ge 1 + 1 = 2 > 1. \]
We therefore have no nontrivial solutions. If $d = m^2$ is a square (with $m \ne 0$, the case $d = 0$ having been set aside), then our equation becomes
\[ x^2 - (my)^2 = (x - my)(x + my) = 1. \]
Therefore $x - my = x + my = \pm 1$. In particular, we have $2my = 0$ so that $y = 0$. We therefore have a trivial solution.

We therefore arrive at the determination of the positive solutions to $x^2 - dy^2 = 1$ for $d \in \mathbb{N}$ not a square. The observation that we can factor our equation as
\[ (x + y\sqrt{d})(x - y\sqrt{d}) = 1 \]
leads us to the question of when a real number of the form $x + y\sqrt{d}$ has an inverse of the same form. To answer this question, we develop some properties of the set $\mathbb{Z}[\sqrt{d}]$ of numbers of this form and its subset $\mathbb{Z}[\sqrt{d}]^{\times}$ consisting of the numbers of this form having inverse also of this form.

Definition 19. Let $d \in \mathbb{N}$ not be a square. We define
1. $\mathbb{Z}[\sqrt{d}] = \{a + b\sqrt{d} \mid a, b \in \mathbb{Z}\}$
2. $\mathbb{Z}[\sqrt{d}]^{\times} = \{a + b\sqrt{d} \mid a, b \in \mathbb{Z},\ (a + b\sqrt{d})(r + s\sqrt{d}) = 1 \text{ for some } r, s \in \mathbb{Z}\}$

Remark 8.
For $d \in \mathbb{N}$ not a square, the set
\[ \mathbb{Q}(\sqrt{d}) = \{a + b\sqrt{d} \mid a, b \in \mathbb{Q}\} \]
is a real quadratic field and the subset $\mathbb{Z}[\sqrt{d}]$ consists of elements that satisfy a quadratic irreducible monic polynomial with integer coefficients. The set of all such elements is called the ring of integers in the number field $\mathbb{Q}(\sqrt{d})$. In fact, unless $d \equiv_4 1$, $\mathbb{Z}[\sqrt{d}]$ will actually be equal to the ring of integers in $\mathbb{Q}(\sqrt{d})$. In case $d \equiv_4 1$, we get more elements in the ring of integers, and it can be shown that the entire ring of integers in $\mathbb{Q}(\sqrt{d})$ is given by
\[ \mathbb{Z}\left[\frac{1 + \sqrt{d}}{2}\right] = \left\{ a + b\,\frac{1+\sqrt{d}}{2} \;\middle|\; a, b \in \mathbb{Z} \right\}. \]

It will be important for us to note that $\mathbb{Z}[\sqrt{d}]$ is an integral domain and that $\mathbb{Z}[\sqrt{d}]^{\times}$ is an abelian group. The relevant definitions are given below.

Definition 20.
1. An abelian group is a set $G$ together with a binary operation $* : G \times G \to G$ such that
(a) (Closure under $*$) For all $a, b \in G$, $a * b \in G$;
(b) (Associativity of $*$) For all $a, b, c \in G$ we have $(a * b) * c = a * (b * c)$;
(c) (Existence of Identity) There exists an element $e \in G$ such that for all $a \in G$ we have $a * e = e * a = a$;
(d) (Existence of Inverse) For all $a \in G$ there exists $b \in G$ such that $a * b = b * a = e$;
(e) (Commutativity of $*$) For all $a, b \in G$, $a * b = b * a$.
2. An integral domain is a set $D$ together with binary operations $+$ and $\cdot$ such that
(a) $(D, +)$ is an abelian group (with identity element denoted by 0);
(b) (Closure under $\cdot$) For all $a, b \in D$, $a \cdot b \in D$;
(c) (Associativity of $\cdot$) For all $a, b, c \in D$ we have $(a \cdot b) \cdot c = a \cdot (b \cdot c)$;
(d) (Existence of Identity) There exists an element $1 \in D$ such that for all $a \in D$ we have $a \cdot 1 = 1 \cdot a = a$;
(e) (Commutativity of $\cdot$) For all $a, b \in D$, $a \cdot b = b \cdot a$;
(f) ($D$ fails to have zero divisors) For all $a, b \in D$, if $ab = 0$ then $a = 0$ or $b = 0$;
(g) ($\cdot$ distributes over $+$) For all $a, b, c \in D$ we have $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$.

Remark 9. We denote $a \cdot b$ by $ab$ when it proves convenient to do so.
We note that when $D$ is an integral domain, the subset $D^{\times}$ consisting of all invertible elements is an abelian group under $\cdot$. Also, fields are precisely the sets of numbers obtained by taking quotients of elements in an integral domain. For example, $\mathbb{Z}$ is an integral domain, and $\mathbb{Q}$ is its field of quotients. By the proposition below, we obtain another example of this: $\mathbb{Z}[\sqrt{d}]$ is an integral domain and the real quadratic field $\mathbb{Q}(\sqrt{d})$ is its field of quotients.

Proposition 22. Let $d \in \mathbb{Z}$. Then $\mathbb{Z}[\sqrt{d}]$ is an integral domain. In particular, $\mathbb{Z}[\sqrt{d}]^{\times}$ is an abelian group.

Proof. Since $\mathbb{Z}[\sqrt{d}] \subseteq \mathbb{C}$, and we are using the usual operations, the verification that $\mathbb{Z}[\sqrt{d}]$ is an integral domain reduces to the verification of closure under the operations in question. Indeed, any of the axioms defining an integral domain that hold in $\mathbb{C}$ will automatically hold in subsets of $\mathbb{C}$ as long as we remain inside the subset when we apply the operations in question. In our case, we are reduced to verifying the following:
1. For all $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$ we have $\alpha + \beta \in \mathbb{Z}[\sqrt{d}]$;
2. $0 \in \mathbb{Z}[\sqrt{d}]$;
3. For all $\alpha \in \mathbb{Z}[\sqrt{d}]$ we have $-\alpha \in \mathbb{Z}[\sqrt{d}]$;
4. For all $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$ we have $\alpha\beta \in \mathbb{Z}[\sqrt{d}]$;
5. $1 \in \mathbb{Z}[\sqrt{d}]$.

In each case, we simply need to note that the element in question can be written in the form $x + y\sqrt{d}$ for integers $x, y$. But this follows from a quick calculation together with the fact that $\mathbb{Z}$ is closed under the operations in question:
1. If $\alpha = a + b\sqrt{d}$ and $\beta = r + s\sqrt{d}$ for integers $a, b, r, s$, then $\alpha + \beta = (a + r) + (b + s)\sqrt{d}$ is of the desired form since $a + r$ and $b + s$ are integers;
2. $0 = 0 + 0\sqrt{d}$ is of the desired form since $0 \in \mathbb{Z}$;
3. If $\alpha = a + b\sqrt{d}$ for integers $a$ and $b$, then $-\alpha = (-a) + (-b)\sqrt{d}$ is of the desired form since $-a, -b \in \mathbb{Z}$;
4. If $\alpha = a + b\sqrt{d}$ and $\beta = r + s\sqrt{d}$ for integers $a, b, r$ and $s$, then $\alpha\beta = (ar + dbs) + (as + br)\sqrt{d}$ is of the desired form since $ar + dbs,\ as + br \in \mathbb{Z}$;
5. $1 = 1 + 0\sqrt{d}$ is of the desired form since $1, 0 \in \mathbb{Z}$.

Aside 3.
Any field extension of $\mathbb{Q}$ of finite degree (i.e. any number field) can be seen to be a finite-dimensional vector space over $\mathbb{Q}$. In our case of interest, we are dealing with the quadratic field $\mathbb{Q}(\sqrt{d}) = \{a + b\sqrt{d} \mid a, b \in \mathbb{Q}\}$. It follows from the fact that $\sqrt{d}$ is irrational that every element of $\mathbb{Q}(\sqrt{d})$ can be written uniquely as a linear combination of 1 and $\sqrt{d}$ with rational coefficients. Therefore $\mathbb{Q}(\sqrt{d})$ is a vector space over $\mathbb{Q}$ of dimension 2 with basis $\{1, \sqrt{d}\}$. A similar argument applies to $\mathbb{Z}[\sqrt{d}]$, but we need to use different language to express this fact. This is due to the fact that vector spaces are only defined when the scalars come from a field like $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$, a number field, or a finite field like the integers modulo a prime. In case the scalars come from an integral domain like $\mathbb{Z}$, we use the notion of a module. Since we don’t always obtain bases for modules over an integral domain, we attach the word “free” to the description in case a basis exists. The following proposition can be seen as expressing the fact that $\mathbb{Z}[\sqrt{d}]$ is a free module of dimension 2 over $\mathbb{Z}$ with basis $\{1, \sqrt{d}\}$.

Proposition 23. Every element $\alpha \in \mathbb{Z}[\sqrt{d}]$ can be written uniquely in the form $\alpha = a + b\sqrt{d}$ for $a, b \in \mathbb{Z}$. That is, for $a, b, r, s \in \mathbb{Z}$ we have
\[ a + b\sqrt{d} = r + s\sqrt{d} \iff a = r \text{ and } b = s. \]

Proof. All we need to note here is that $a + b\sqrt{d} = r + s\sqrt{d}$ forces
\[ (b - s)\sqrt{d} = r - a. \]
Therefore, $b - s \ne 0$ would imply that
\[ \sqrt{d} = \frac{r - a}{b - s} \in \mathbb{Q}, \]
contrary to our assumption that $d$ is not a square. Therefore $b - s = 0$ so that $b = s$. We then have $r = a$ as well.

From now on, when we say that $\alpha = a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]$, it will be understood that $a, b \in \mathbb{Z}$. In light of the previous result, $a$ and $b$ are uniquely determined by $\alpha$. We call $a$ and $b$ the components of $\alpha$. We now define the conjugate and norm of elements of $\mathbb{Z}[\sqrt{d}]$.

Definition 21. Let $\alpha = a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]$.
1. The conjugate of $\alpha$, denoted $\overline{\alpha}$, is defined by $\overline{\alpha} = a - b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]$.
2. The norm of $\alpha$, denoted $N(\alpha)$, is defined by $N(\alpha) = \alpha\overline{\alpha} = a^2 - db^2 \in \mathbb{Z}$.

We can now rephrase our problem as the search for all $\alpha \in \mathbb{Z}[\sqrt{d}]$ having positive components and norm equal to 1. We will use the following result to show that once a single $\alpha$ having positive components and norm equal to 1 can be found, all positive powers of $\alpha$ are also such elements.

Proposition 24. Let $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$. Then $N(\alpha\beta) = N(\alpha)N(\beta)$.

Proof. Suppose that $\alpha = a + b\sqrt{d}$ and that $\beta = r + s\sqrt{d}$ for integers $a, b, r, s$. We compute
\[ \begin{aligned}
N(\alpha\beta) &= N\big((ar + dbs) + (as + br)\sqrt{d}\big) \\
&= (ar + dbs)^2 - d(as + br)^2 \\
&= a^2r^2 + 2dabrs + d^2b^2s^2 - da^2s^2 - 2dabrs - db^2r^2 \\
&= a^2r^2 - d(a^2s^2 + b^2r^2) + d^2b^2s^2 \\
&= (a^2 - db^2)(r^2 - ds^2) \\
&= N(\alpha)N(\beta).
\end{aligned} \]

We now show that $\mathbb{Z}[\sqrt{d}]^{\times}$ consists precisely of the elements of $\mathbb{Z}[\sqrt{d}]$ having norm equal to 1 or $-1$. Our problem is then to determine when we get $+1$ rather than $-1$.

Proposition 25. Let $d \in \mathbb{N}$ not be a square. Then $\mathbb{Z}[\sqrt{d}]^{\times} = \{\alpha \in \mathbb{Z}[\sqrt{d}] \mid N(\alpha) = \pm 1\}$.

Proof. Let $\alpha \in \mathbb{Z}[\sqrt{d}]^{\times}$. Then there exists $\beta \in \mathbb{Z}[\sqrt{d}]$ such that $\alpha\beta = 1$. Taking norms of both sides yields
\[ N(\alpha)N(\beta) = N(\alpha\beta) = N(1) = 1. \]
Since $N(\alpha), N(\beta) \in \mathbb{Z}$, we conclude that $N(\alpha) \in \{\pm 1\}$. Conversely, suppose that $\alpha \in \mathbb{Z}[\sqrt{d}]$ satisfies $N(\alpha) = \pm 1$. We then have
\[ \alpha\overline{\alpha} = \pm 1. \]
We conclude that one of $\overline{\alpha}, -\overline{\alpha}$ is the inverse of $\alpha$. Therefore, $\alpha$ is invertible in $\mathbb{Z}[\sqrt{d}]$ so that $\alpha \in \mathbb{Z}[\sqrt{d}]^{\times}$, as required.

Having established that $\mathbb{Z}[\sqrt{d}]^{\times}$ consists of the elements $\alpha \in \mathbb{Z}[\sqrt{d}]$ with norm equal to $\pm 1$, we see that the invertible elements are split into two categories. We have the ones of norm 1 that correspond to solutions to $x^2 - dy^2 = 1$ and the ones of norm $-1$ that correspond to solutions to $x^2 - dy^2 = -1$. Since we are interested here only in the ones giving $+1$, we introduce the notation $\mathbb{Z}[\sqrt{d}]^{\times}_{+}$ for these elements. That is,
\[ \mathbb{Z}[\sqrt{d}]^{\times}_{+} = \{\alpha \in \mathbb{Z}[\sqrt{d}] \mid N(\alpha) = 1\}. \]
Our problem can now be formulated as determining the elements of $\mathbb{Z}[\sqrt{d}]^{\times}_{+}$ having positive components. The outline of this classification is as follows:
1. We show that for $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]^{\times}_{+}$ with positive components, we have $\alpha < \beta$ if and only if their first components satisfy the same inequality.
2. This allows us to order the elements of interest by ordering their first components.
3. We then apply the least integer principle to the set of first components of the elements of interest to obtain a least first component.
4. We show that this corresponds to a least $\theta > 1$, called the generator for $x^2 - dy^2 = 1$, among our elements of interest.
5. We show that every element of interest greater than one is a positive power of $\theta$.
6. We conclude that all positive solutions to $x^2 - dy^2 = 1$ correspond to $x + y\sqrt{d}$ being a positive power of $\theta$.

Among the above steps in the classification of the elements of $\mathbb{Z}[\sqrt{d}]^{\times}_{+}$ with positive components, the one that is the most difficult to establish is the fact that the set of first components of the elements of interest is nonempty. This is required in order to apply the least integer principle. Equivalently, it is difficult to prove that there exists at least one positive solution to $x^2 - dy^2 = 1$, but once we know that there is at least one, it isn’t too difficult to describe the rest of the solutions. We now complete the classification of the elements of interest by establishing (1)–(6) above. We start with the verification of (1).

Proposition 26. Let $\alpha = a + b\sqrt{d}$ and $\beta = r + s\sqrt{d}$ lie in $\mathbb{Z}[\sqrt{d}]^{\times}_{+}$ and have positive components. Then
\[ \alpha < \beta \iff a < r. \]

Proof. We have $a^2 - db^2 = r^2 - ds^2 = 1$, and $a, b, d, r, s \ge 1$. Therefore
\[ \begin{aligned}
a < r &\implies a^2 < r^2 \\
&\implies db^2 + 1 < ds^2 + 1 \\
&\implies db^2 < ds^2 \\
&\implies b^2 < s^2 \\
&\implies b < s \\
&\implies a + b\sqrt{d} < r + s\sqrt{d} \\
&\implies \alpha < \beta.
\end{aligned} \]
Proving the converse is entirely similar:
\[ \begin{aligned}
a \ge r &\implies a^2 \ge r^2 \\
&\implies db^2 + 1 \ge ds^2 + 1 \\
&\implies db^2 \ge ds^2 \\
&\implies b^2 \ge s^2 \\
&\implies b \ge s \\
&\implies a + b\sqrt{d} \ge r + s\sqrt{d} \\
&\implies \alpha \ge \beta.
\end{aligned} \]
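Proposition 26 and steps 2 to 4 of the outline can be explored numerically. The following is a minimal brute-force sketch in Python, not a method from the notes: the function name `positive_solutions` and the search bound `x_max` are illustrative assumptions, and the search simply tests every first component up to the bound.

```python
from math import isqrt, sqrt


def positive_solutions(d, x_max):
    """All (x, y) with x, y >= 1, x <= x_max and x^2 - d*y^2 = 1, listed by increasing x."""
    solutions = []
    for x in range(2, x_max + 1):
        quotient, remainder = divmod(x * x - 1, d)
        if remainder == 0:
            y = isqrt(quotient)
            # Keep x only when (x^2 - 1)/d is a positive perfect square.
            if y >= 1 and y * y == quotient:
                solutions.append((x, y))
    return solutions


d = 2
found = positive_solutions(d, 600)
values = [x + y * sqrt(d) for x, y in found]
print(found)
# Ordering by first component agrees with ordering the real numbers x + y*sqrt(d).
print(values == sorted(values))
```

The first pair returned has the least first component, so it is the candidate for the generator $\theta$ described in step 4 of the outline, provided the bound is large enough for a solution to appear at all.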
As outlined above, we now define
\[ \begin{aligned}
S &= \{a \in \mathbb{N} \mid a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]^{\times}_{+} \text{ for some } b \in \mathbb{N}\}; \\
S_{\mathbb{Z}[\sqrt{d}]} &= \{a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}] \mid a \in S,\ b \in \mathbb{N},\ a^2 - db^2 = 1\} \\
&= \{a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]^{\times}_{+} \mid a, b \in \mathbb{N}\},
\end{aligned} \]
invoke the least integer principle to obtain a least element in $S$ and then show that this corresponds to a least element $\theta$ of $S_{\mathbb{Z}[\sqrt{d}]}$ whose positive powers provide us with all of the positive solutions to $x^2 - dy^2 = 1$. Since proving that $S$ is nonempty is harder than the rest of the steps involved, we will save the proof of this fact for last.

Proposition 27. With the above notation, we have $S \ne \emptyset$.

Assuming this result for the time being, we see that we can invoke the least integer principle to obtain a least element $a$ of $S$ since $S$ is bounded below. Since $a \in S$, we then have a positive integer $b$ such that $a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]^{\times}_{+}$. In fact, there is exactly one such $b$, since $a^2 - db^2 = 1$ determines $b^2$. Define $\theta = a + b\sqrt{d}$. We now show that $S_{\mathbb{Z}[\sqrt{d}]}$ has $\theta$ as a minimum.

Lemma 24. With the above notation, we have
(a) $S_{\mathbb{Z}[\sqrt{d}]} = \mathbb{Z}[\sqrt{d}]^{\times}_{+} \cap (1, \infty)$;
(b) $\theta = \min S_{\mathbb{Z}[\sqrt{d}]}$.

Proof. (a) It is clear that if $\alpha$ has positive components, then $\alpha > 1$. Conversely, suppose that $\alpha = a + b\sqrt{d} \in \mathbb{Z}[\sqrt{d}]^{\times}_{+}$ is greater than 1. Then, since $\alpha\overline{\alpha} = 1$, we conclude that $0 < \overline{\alpha} < 1$. We have
\[ 1 < \alpha = \overline{\alpha} + 2b\sqrt{d} < 1 + 2b\sqrt{d}. \]
Consequently, $2b\sqrt{d} > 0$ so that $b > 0$. Having established that $b$ is positive, we now obtain
\[ \overline{\alpha} > 0 \implies a - b\sqrt{d} > 0 \implies a > b\sqrt{d} > 0. \]
Therefore, $\alpha$ has positive components so that $\alpha \in S_{\mathbb{Z}[\sqrt{d}]}$, as required.

(b) To prove this part, we first note that since $\theta \in S_{\mathbb{Z}[\sqrt{d}]}$ by construction, we are reduced to proving that for any $\alpha \in S_{\mathbb{Z}[\sqrt{d}]}$, we have $\theta \le \alpha$. But this follows readily from Proposition 26. Indeed, if $\alpha = r + s\sqrt{d}$ for positive $r$ and $s$, then $r \in S$ so that $a \le r$ since $a$ is the least element of $S$. We conclude that $\theta \le \alpha$, as required.

We now show that every element of interest is a positive power of $\theta$.

Proposition 28.
With the above notation, we have $S_{\mathbb{Z}[\sqrt{d}]} = \{\theta^k \mid k \in \mathbb{N}\}$.

Proof. Since $\theta \in S_{\mathbb{Z}[\sqrt{d}]}$ and $S_{\mathbb{Z}[\sqrt{d}]}$ is closed under multiplication (by part (a) of Lemma 24 and the multiplicativity of the norm), a routine induction shows that $\{\theta^k \mid k \in \mathbb{N}\} \subseteq S_{\mathbb{Z}[\sqrt{d}]}$. Conversely, suppose that $\alpha \in S_{\mathbb{Z}[\sqrt{d}]}$. Then $1 < \theta \le \alpha$ so that $\alpha$ lies between two consecutive positive powers of $\theta$. Say
\[ \theta^k \le \alpha < \theta^{k+1} \quad (k \in \mathbb{N}). \]
We obtain upon dividing by $\theta^k$ that
\[ 1 \le \theta^{-k}\alpha < \theta. \]
We must then have $\theta^{-k}\alpha = 1$, for otherwise $\theta^{-k}\alpha \in S_{\mathbb{Z}[\sqrt{d}]}$ due to part (a) of Lemma 24. Since $\theta^{-k}\alpha$ is strictly smaller than $\theta$, this would contradict the minimality of $\theta$. Therefore $\theta^{-k}\alpha = 1$ so that $\alpha = \theta^k$. Since this implies that $S_{\mathbb{Z}[\sqrt{d}]} \subseteq \{\theta^k \mid k \in \mathbb{N}\}$, and we have already established the reverse containment, we conclude that $S_{\mathbb{Z}[\sqrt{d}]} = \{\theta^k \mid k \in \mathbb{N}\}$, as required.

We now have all that is required to classify the positive solutions to $x^2 - dy^2 = 1$.

Theorem 40. Let $d \in \mathbb{N}$ not be a square and $\theta$ be the generator for $x^2 - dy^2 = 1$. The positive solutions to $x^2 - dy^2 = 1$ are precisely the components of the positive powers of $\theta$. That is, all solutions to $x^2 - dy^2 = 1$ in positive integers are given by
\[ x = \frac{\theta^k + \overline{\theta}^{\,k}}{2}, \qquad y = \frac{\theta^k - \overline{\theta}^{\,k}}{2\sqrt{d}} \qquad (k \in \mathbb{N}). \]

Proof. We have seen that the positive solutions correspond to the elements of $S_{\mathbb{Z}[\sqrt{d}]}$, all of which are positive powers of $\theta$. It follows that the positive solutions to $x^2 - dy^2 = 1$ are given by the components of such elements. The last part follows from the observation that the formulae given extract the components of $\theta^k$. Indeed, if $\theta^k = a_k + b_k\sqrt{d}$, then
\[ \theta^k = a_k + b_k\sqrt{d}, \qquad \overline{\theta^k} = a_k - b_k\sqrt{d}. \]
If we solve these equations for $a_k$ and $b_k$, and use the fact that $\overline{\theta^k} = \overline{\theta}^{\,k}$, we obtain
\[ a_k = \frac{\theta^k + \overline{\theta}^{\,k}}{2}, \qquad b_k = \frac{\theta^k - \overline{\theta}^{\,k}}{2\sqrt{d}}, \]
as required.

We now turn to the proof that $S \ne \emptyset$. That is, we establish that there exist positive integers $a$ and $b$ such that $a^2 - db^2 = 1$.

Proof of Proposition 27. We start with a proposition due to Dirichlet regarding approximating irrational numbers by rational numbers. The proof requires invoking the Pigeonhole Principle.
This principle states that if one has $n + 1$ pigeons to place in $n$ pigeonholes, then at least one of the pigeonholes contains at least two pigeons.

Proposition 29. Let $\xi \in \mathbb{R} \setminus \mathbb{Q}$. Then there exist infinitely many rational numbers $x/y$ with $x, y$ relatively prime such that
\[ \left|\xi - \frac{x}{y}\right| < \frac{1}{y^2}. \]

Proof. Let $n \in \mathbb{N}$ and consider the partition of the half-open unit interval given by
\[ [0, 1) = \left[0, \frac{1}{n}\right) \cup \left[\frac{1}{n}, \frac{2}{n}\right) \cup \dots \cup \left[\frac{n-2}{n}, \frac{n-1}{n}\right) \cup \left[\frac{n-1}{n}, 1\right). \]
Recall that for real numbers $\alpha$, the floor of $\alpha$, denoted $\lfloor\alpha\rfloor$, is defined to be the largest integer less than or equal to $\alpha$, and the fractional part of $\alpha$, denoted $\{\alpha\}$, is defined by $\{\alpha\} = \alpha - \lfloor\alpha\rfloor$. It is clear that for any $\alpha \in \mathbb{R}$, we have $\{\alpha\} \in [0, 1)$. Consider the following list of numbers:
\[ \{0\xi\}, \{1\xi\}, \{2\xi\}, \dots, \{n\xi\} \in [0, 1). \]
Each of these $n + 1$ numbers (representing pigeons) lies in one of the $n$ subintervals of $[0, 1)$ listed above (representing the pigeonholes). By the Pigeonhole Principle, we conclude that at least one of the subintervals contains at least two of the numbers. That is, for some $0 \le j \le n - 1$, there exist integers $k$ and $\ell$ with $0 \le k < \ell \le n$ such that
\[ \{k\xi\}, \{\ell\xi\} \in \left[\frac{j}{n}, \frac{j+1}{n}\right). \]
Thus
\[ |\{\ell\xi\} - \{k\xi\}| < \frac{1}{n}. \]
Using the floor rather than the fractional part, we obtain
\[ |(\ell\xi - \lfloor\ell\xi\rfloor) - (k\xi - \lfloor k\xi\rfloor)| < \frac{1}{n}. \]
We therefore have
\[ |(\ell - k)\xi - (\lfloor\ell\xi\rfloor - \lfloor k\xi\rfloor)| < \frac{1}{n}. \]
Let $a = \lfloor\ell\xi\rfloor - \lfloor k\xi\rfloor$, $b = \ell - k$, $g = \gcd(a, b)$ and define $x = a/g$, $y = b/g$. We then have $(x, y) = 1$ and
\[ |gy\xi - gx| < \frac{1}{n}. \]
Dividing by $gy$ and noticing that $0 < y \le n$ and $g \ge 1$ yields
\[ \left|\xi - \frac{x}{y}\right| < \frac{1}{ngy} \le \frac{1}{ny} \le \frac{1}{y^2}. \]
We have therefore shown that there exists a rational $x/y$ with $(x, y) = 1$ such that
\[ \left|\xi - \frac{x}{y}\right| < \frac{1}{y^2}. \]
Now, since $\xi$ is irrational, we have
\[ \left|\xi - \frac{x}{y}\right| > 0. \]
We can then choose an integer $m$ such that
\[ m > \frac{1}{\left|\xi - \frac{x}{y}\right|} \]
and apply the above argument with $n = m$ to obtain relatively prime integers $x_1, y_1$ with $0 < y_1 \le m$ such that
\[ \left|\xi - \frac{x_1}{y_1}\right| < \frac{1}{y_1^2} \]
and in fact
\[ \left|\xi - \frac{x_1}{y_1}\right| < \frac{1}{my_1} < \frac{1}{y_1}\left|\xi - \frac{x}{y}\right| \le \left|\xi - \frac{x}{y}\right|. \]
Repeating the process provides us with relatively prime integers $x_2, y_2$ with $y_2 > 0$ such that
\[ \left|\xi - \frac{x_2}{y_2}\right| < \frac{1}{y_2^2} \quad\text{and}\quad \left|\xi - \frac{x_2}{y_2}\right| < \left|\xi - \frac{x_1}{y_1}\right| < \left|\xi - \frac{x}{y}\right|. \]
Continuing in this fashion, we inductively obtain an infinite sequence of rationals $x_k/y_k$ ($k \ge 1$) such that
\[ \left|\xi - \frac{x}{y}\right| > \left|\xi - \frac{x_1}{y_1}\right| > \left|\xi - \frac{x_2}{y_2}\right| > \left|\xi - \frac{x_3}{y_3}\right| > \dots > 0 \]
and
\[ \left|\xi - \frac{x_k}{y_k}\right| < \frac{1}{y_k^2} \]
for all $k$. Since the distances to $\xi$ are strictly decreasing, the rationals $x_k/y_k$ are pairwise distinct, so there are infinitely many of them.

We use Proposition 29 to prove the following proposition.

Proposition 30. If $d \in \mathbb{N}$ is not a square, then the inequality
\[ |x^2 - dy^2| < 1 + 2\sqrt{d} \]
has infinitely many integer solutions.

Proof. Since $d$ is positive and not a square, we know that $\sqrt{d} \in \mathbb{R} \setminus \mathbb{Q}$. By Proposition 29, we therefore obtain infinitely many rational numbers $x/y$ with $x, y$ relatively prime such that
\[ \left|\sqrt{d} - \frac{x}{y}\right| < \frac{1}{y^2}. \]
Multiplying by $|y|$ yields
\[ |x - y\sqrt{d}| < \frac{1}{|y|}. \]
By the triangle inequality, we then have
\[ |x + y\sqrt{d}| = |(x - y\sqrt{d}) + 2y\sqrt{d}| \le |x - y\sqrt{d}| + 2|y|\sqrt{d} < \frac{1}{|y|} + 2|y|\sqrt{d}. \]
Thus
\[ |x^2 - dy^2| = |x + y\sqrt{d}|\,|x - y\sqrt{d}| < \frac{1}{y^2} + 2\sqrt{d} \le 1 + 2\sqrt{d}. \]
This proves the claim.

We now apply Proposition 30 to establish that $x^2 - dy^2 = 1$ has a solution in positive integers $x, y$. By Proposition 30, we know that there are infinitely many integers $x, y$ such that
\[ |x^2 - dy^2| < 1 + 2\sqrt{d}. \]
There must then exist an integer $m$ such that $1 \le |m| < 1 + 2\sqrt{d}$ and $x^2 - dy^2 = m$ for infinitely many integers $x$ and $y$. This can be seen by applying an extended version of the Pigeonhole Principle: infinitely many pigeons placed in finitely many pigeonholes force some pigeonhole to contain infinitely many pigeons. In particular, we can find two solutions $(x_1, y_1)$ and $(x_2, y_2)$ such that $x_1 \neq \pm x_2$ but $x_1 \equiv_{|m|} x_2$ and $y_1 \equiv_{|m|} y_2$. Again, this can be seen by applying the extended version of the Pigeonhole Principle and using the fact that there are only finitely many congruence classes modulo $|m|$. Now define $\alpha = x_1 - y_1\sqrt{d}$ and $\beta = x_2 - y_2\sqrt{d}$. We then have
\[ \alpha\bar{\beta} = (x_1 - y_1\sqrt{d})(x_2 + y_2\sqrt{d}) = (x_1x_2 - dy_1y_2) + (x_1y_2 - x_2y_1)\sqrt{d}. \]
By construction, the components of $\alpha\bar{\beta}$ are congruent to 0 modulo $|m|$. We can therefore write
\[ \alpha\bar{\beta} = ma + mb\sqrt{d} \]
for some integers $a$ and $b$.
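The claim that both components vanish modulo $|m|$ can be checked directly. The following short verification is a sketch that uses only the congruences $x_1 \equiv_{|m|} x_2$, $y_1 \equiv_{|m|} y_2$ and the equation $x_1^2 - dy_1^2 = m$; here the product in question is $\alpha$ times the conjugate $x_2 + y_2\sqrt{d}$ of $\beta$.

```latex
\begin{align*}
x_1x_2 - dy_1y_2 &\equiv x_1^2 - dy_1^2 = m \equiv 0 \pmod{|m|},\\
x_1y_2 - x_2y_1 &\equiv x_1y_1 - x_1y_1 = 0 \pmod{|m|}.
\end{align*}
```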
Taking the norm of both sides yields
\[ m^2 = N(\alpha)N(\bar{\beta}) = N(\alpha\bar{\beta}) = m^2a^2 - dm^2b^2. \]
Consequently, $a^2 - db^2 = 1$. To complete the proof, we need only establish that $ab \neq 0$. Now, $a \neq 0$ since otherwise $-db^2 = 1$, and the left hand side is non-positive whereas the right hand side is positive. Also, if $b = 0$ then $a = \pm 1$, so that $\alpha\bar{\beta} = \pm m$. Multiplying by $\beta$ and using $\beta\bar{\beta} = N(\beta) = m$ yields $m\alpha = \pm m\beta$, so that $\alpha = \pm\beta$. But this forces $x_1 = \pm x_2$, which is a contradiction. Therefore, $x^2 - dy^2 = 1$ has a nontrivial solution, and replacing $a$ and $b$ by $|a|$ and $|b|$ gives a solution in positive integers, as required.

We close this section with an example.

Example 24. Find all positive solutions to $x^2 - dy^2 = 1$ for $d \in \{2, 3, 5\}$.

Solution. For each value of $d$ in question, we need to find the generator $\theta = a + b\sqrt{d}$ for $x^2 - dy^2 = 1$. We know that it will correspond to the least positive value of $x$ that yields a positive solution. Also, since the corresponding element $\theta = a + b\sqrt{d}$ of $\mathbb{Z}[\sqrt{d}]^{\times}_{+}$ will be greater than 1, we know that $a > 1$. Hence $a$ is the least integer greater than 1 such that $a^2 - 1 = db^2$ for some $b \in \mathbb{N}$. In particular, we require $a^2 \equiv_d 1$. This implies that $a$ must be congruent to $\pm 1$ modulo each prime divisor of $d$. In particular, if $d = p$ is a prime, then $a \in \{p - 1, p + 1, 2p - 1, 2p + 1, \dots\}$. We then look at the numbers
\[ \frac{a^2 - 1}{p}, \qquad a \in \{p - 1, p + 1, 2p - 1, 2p + 1, \dots\},\ a > 1, \]
in ascending order until we come across a square. Once a square is found, we have located the generator whose positive powers yield the positive solutions to $x^2 - py^2 = 1$.

1. For $d = 2$ we have
\[ \frac{3^2 - 1}{2} = 4 = 2^2, \]
so we obtain the generator $\theta = 3 + 2\sqrt{2}$.

2. For $d = 3$ we have
\[ \frac{2^2 - 1}{3} = 1 = 1^2, \]
so we obtain the generator $\theta = 2 + \sqrt{3}$.

3. For $d = 5$ we have
\[ \frac{4^2 - 1}{5} = 3; \qquad \frac{6^2 - 1}{5} = 7; \qquad \frac{9^2 - 1}{5} = 16 = 4^2, \]
so we obtain the generator $\theta = 9 + 4\sqrt{5}$.

In each case, the positive solutions are given by the components of the powers of $\theta$.
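The search carried out in Example 24 can also be sketched in Python. This is a minimal illustration rather than part of the notes: the function names `find_generator` and `positive_solutions` are hypothetical, the search simply tries every $a \ge 2$ instead of restricting to $a \equiv \pm 1$ modulo $p$, and the powers of $\theta = a + b\sqrt{d}$ are computed with the integer recurrence $(x, y) \mapsto (ax + dby,\ bx + ay)$, which is multiplication by $\theta$ written in components.

```python
from math import isqrt


def find_generator(d):
    """Return (a, b) with a > 1 minimal such that a*a - d*b*b == 1.

    Assumes d is a positive integer that is not a square; otherwise the
    search below would never terminate.
    """
    a = 2
    while True:
        q, r = divmod(a * a - 1, d)
        if r == 0:
            b = isqrt(q)
            if b * b == q:  # (a^2 - 1)/d is a perfect square
                return a, b
        a += 1


def positive_solutions(d, count):
    """Return the first `count` positive solutions (x, y) of x^2 - d*y^2 = 1."""
    a, b = find_generator(d)
    x, y = a, b
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        # Multiply x + y*sqrt(d) by theta = a + b*sqrt(d), component-wise.
        x, y = a * x + d * b * y, b * x + a * y
    return solutions


for d in (2, 3, 5):
    print(d, find_generator(d), positive_solutions(d, 3))
```

For $d = 2$ the first three solutions produced are $(3, 2)$, $(17, 12)$ and $(99, 70)$, the components of $\theta$, $\theta^2$ and $\theta^3$ for $\theta = 3 + 2\sqrt{2}$. Working with the integer recurrence avoids floating-point square roots, so the results stay exact for large powers.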