EC119 Mathematical Analysis

Transcription

Nicholas Jackson
Autumn Term 2010–2011
Contents
0 Introduction
  0.1 Course description
  0.2 Aims and objectives
  0.3 Prerequisites
  0.4 Assessment
  0.5 Assignments
  0.6 Recommended books
  0.7 Learning methods
  0.8 The Greek alphabet
1 Set theory
  1.1 Sets and elements
2 Real Numbers
  2.1 The Natural Numbers
  2.2 The Integers
  2.3 The Rational Numbers
  2.4 Irrational and real numbers
  2.5 Some further results for real numbers
    2.5.1 Classifying decimals as rational or irrational
    2.5.2 Absolute value
    2.5.3 The Triangle Inequality
3 Complex Numbers
  3.1 General Definitions
  3.2 Algebra of Complex Numbers
    3.2.1 Addition, negative and subtraction
    3.2.2 Multiplication
    3.2.3 Equality
  3.3 Complex conjugate
    3.3.1 Multiplicative Inverse
    3.3.2 Division
  3.4 The Fundamental Theorem of Algebra
  3.5 Geometric interpretation and the polar form
    3.5.1 Complex multiplication
    3.5.2 Powers: De Moivre’s theorem
    3.5.3 Reciprocal
    3.5.4 Division
    3.5.5 Conjugate
  3.6 Roots of polynomials
    3.6.1 Cube roots of unity
    3.6.2 The nth roots of unity: generalisation of the above
  3.7 Exponential form
    3.7.1 A remarkable result
4 Proof and reasoning
  4.1 Some mathematical terms
  4.2 Why do we need to prove results?
  4.3 Methods of Proof
  4.4 Mathematical induction
5 Functions
  5.1 Domain and codomain
  5.2 Interval Notation for Subsets of R
  5.3 Images and image sets
  5.4 One-one functions: injections
  5.5 Onto functions: surjections
  5.6 One-one and onto functions: bijections
  5.7 Composition of functions
  5.8 Inverse Functions
  5.9 Real-valued functions
  5.10 An alternative definition of function (non-examined)
6 Counting
  6.1 Finite Sets
  6.2 Infinite sets
  6.3 Countable Sets
7 Limits
  7.1 Some well-behaved functions
  7.2 Some general properties of limits
  7.3 Further properties of limits
  7.4 An important limit (1)
  7.5 An important limit (2)
  7.6 Some more properties and examples
  7.7 Limits when x → ±∞
8 Continuity
  8.1 The Intermediate Value Theorem
  8.2 Numerical methods for solving f(x) = 0
    8.2.1 The Newton–Raphson method
    8.2.2 The bisection method
    8.2.3 Direct iteration
  8.3 Monotone functions
9 Differentiability
  9.1 Differentiation rules
  9.2 Differentiating composite functions
  9.3 Differentiating inverse functions
  9.4 Table of derivatives
  9.5 Leibniz’ Theorem (the extended product rule)
  9.6 Rolle’s Theorem
    9.6.1 An application of Rolle’s Theorem
  9.7 The Mean Value Theorem
    9.7.1 Alternative forms of the Mean Value Theorem
  9.8 The Cauchy Mean Value Theorem
10 L’Hôpital’s Rule
  10.1 Limits when f(x) → ±∞ and g(x) → ±∞ as x → a (type ∞/∞)
  10.2 Limits as x → ±∞
  10.3 Further types: 0^0, ∞^0, 1^∞, etc
11 Taylor’s Theorem
  11.1 Taylor’s Theorem – the nth Mean Value Theorem
  11.2 Taylor and Maclaurin Series
  11.3 Remarks
  11.4 Some Maclaurin series
  11.5 Sequences
  11.6 Series
  11.7 Convergence of series
  11.8 D’Alembert’s ratio test
12 Integration
  12.1 Indefinite Integration
  12.2 Linearity Properties
  12.3 Integral types and methods of integration
    12.3.1 Integrals of the form ∫ g(f(x)) f′(x) dx
    12.3.2 Integrals re-expressed in partial fraction form
    12.3.3 Integration by parts
    12.3.4 Integrals of the form ∫ x^α ln x dx
  12.4 Definite integration and the Riemann integral
    12.4.1 The Newton–Leibniz approach
    12.4.2 The Riemann approach
    12.4.3 Some properties of definite integrals
    12.4.4 The Mean Value Theorem for definite integrals
    12.4.5 The Fundamental Theorem of Calculus
  12.5 Improper Integrals
    12.5.1 Infinite Integrals
    12.5.2 Other improper integrals
  12.6 Functions defined by integrals
13 First-order Differential Equations
  13.1 Qualitative approach: ‘knowing the direction and finding the path’
  13.2 Euler’s method
  13.3 Linear equations: the integrating factor method
    13.3.1 Calculating the integrating factor
    13.3.2 Verification
  13.4 Nonlinear equations of ‘separable’ type
14 Second-order Differential Equations
  14.1 Directly-integrable equations
  14.2 Linear second-order equations
    14.2.1 The solution of the related homogeneous equation
    14.2.2 The general solution of the nonhomogeneous equation
  14.3 Second-order linear equations with constant coefficients
    14.3.1 The complementary solution
    14.3.2 The particular solution
  14.4 Stability
0 Introduction
Lecturer      Dr Nicholas Jackson <[email protected]>
              Mathematics B2.38, x28336
Class tutor   Dr Mark Cummings <[email protected]>
Credit        15 CATS
Teaching      Two lectures and one class per week, in term 1.
Assessment    One 1½-hour examination in June (80%)
              Five fortnightly coursework assignments (20%)
Lectures      Mondays 2pm–3pm R0.12, Fridays 11am–12pm L5
Classes       Tuesdays 10am–11am H3.57 and 12pm–1pm H0.58 and Wednesdays 11am–12pm H0.43
0.1 Course description
This is an optional module offered alongside EC123 Mathematical Techniques B, for students who enjoyed
mathematics at A–level and wish to extend and deepen their understanding. This module is neither compulsory
nor a prerequisite for subsequent modules in economic theory, mathematical economics and econometrics, but
will help to provide a sound foundation for studying these subjects.
0.2 Aims and objectives
• To give a rigorous underpinning to the topics and techniques met in EC120 Quantitative Techniques and
EC123 Mathematical Techniques B.
• To introduce some of the more advanced mathematics used by mathematical economists.
• To develop a more detailed understanding of set theory and the calculus of functions of real variables.
• To develop and strengthen your capabilities for abstract thought.
By the end of this module you should have acquired a deeper understanding of the mathematics introduced in
other modules such as EC123 Mathematical Techniques B and EC120 Quantitative Techniques, and obtained a
sound foundation for later specialisation in the mathematical and statistical aspects of economics. You will have
had much more practice at thinking in a logical and precise manner, and will hopefully find your capabilities
for clear and rigorous reasoning are improved in a way that will be useful in the rest of your studies.
0.3 Prerequisites
Note that Further Mathematics A–level is not a prerequisite for this module. All that is required is an aptitude
for, and an interest in learning some additional mathematical techniques and establishing the theory behind what
you do in EC123 Mathematical Techniques B, as indicated by a good grade at A–level. (EC121 Mathematical
Techniques A is not a suitable corequisite.) Of course, those students who have done Further Mathematics may
well have an easier ride with some of the topics, but the final results in the summer don’t necessarily reflect
previous knowledge!
0.4 Assessment
The total mark for the module will comprise the following:
(i) One 1½-hour examination in June (80% weight)
(ii) Five homework assignments (20% weight)
0.5 Assignments
Assignments will usually be handed out at the Monday lecture in odd-numbered weeks, and should be handed
in to the Undergraduate Office by the deadline stated on the assignment sheet: this will typically be 12pm
(noon) on the Friday of the following week. This means that you will have just under two weeks to do the
assignment, and you are strongly recommended not to leave all of it until the Thursday evening. Deadlines
will be strictly enforced, and extensions or exemptions granted only in exceptional (and suitably-documented)
unforeseen circumstances such as illness, personal crisis, or natural disasters.
The class tutor will mark your work and return it to you in the following week’s class. Each assignment will
receive a total mark out of 25, comprising up to 20 marks for your answers to the questions, and up to five
marks for clarity of exposition.
Assignments will typically comprise three sections. Section A will consist of easier questions, which you should
do as a warm-up exercise, but which will not be marked or assessed. Section B will consist of harder questions,
which should be handed in by the specified deadline, and which will be marked for credit. Section C consists of
non-assessed questions which may be useful for revision purposes.
When working on your assignment, you may discuss your ideas with other students, but you must write up
your final solutions independently. Any instances of copying may be dealt with severely. (Please see the
Undergraduate Handbook’s section on plagiarism.)
0.6 Recommended books
There is no one recommended book for this course, as printed lecture notes will be provided. However, you
may wish to refer to some or all of the following:
S Lipschutz, Set Theory and Related Topics, Schaum Outline series, McGraw–Hill (1998) QA 248.L4
F M Hart, Guide to Analysis, Macmillan (2001) QA 300.H2
R P Burn, Numbers and Functions, Cambridge (2000) QA 300.B8
G H Hardy, A Course of Pure Mathematics, Cambridge (2008) QA 303.H2
0.7 Learning methods
Attendance at all lectures and classes is essential. Lecture notes will be distributed, and there is no reliance on
a single textbook. It is especially important to keep up with the problem sheets (which count towards your
final mark for this module), and to try lots of examples throughout the course, as well as regularly reviewing
your lecture notes to make sure you understand everything properly.
Mathematics is not a spectator sport – the only really successful way to learn and understand it is to do it.
This course contains edited highlights of the core of the first year Mathematics degree syllabus. Skipping the
lectures, classes and homework, and then trying to memorise the lecture notes in the last week before the exam
will not work.
You should:
Attend all lectures and classes
Make your own notes
Even though lecture notes are provided, you should augment these with your own notes. First, taking notes
concentrates the mind. Also, the approach used in the lecture may be slightly different from that in the
handouts.
Read your lecture notes
Lecture notes may be distributed ahead of the lecture. If so, you should, at the very least, skim-read them
before the next lecture. Certainly (re)read them after the lecture, and before attempting the assignment.
If you do miss a lecture, collect the notes from the Undergraduate Office and make up the lost work as soon as
possible afterwards. Neither I nor Hiroko will have spare copies.
No one fully understands everything in a lecture as it is being delivered, so it is crucial to work at it afterwards
and before the next lecture.
Do the assignments
Start on these as soon as possible after the relevant lectures. Don’t leave them until the night before you are
due to hand them in, or you’ll have no time to sort out any difficulties you may encounter. Mathematics is not
a spectator sport.
Talk to others
Discuss the material with others on your course; indeed, explaining ideas to someone else is an excellent way of
mastering the material yourself. You may find that other students, for example engineers, scientists, as well as
mathematicians, are doing similar topics.
Use the library
If you find the lecture notes inadequate, look in some of the books suggested.
Seek help
If you have done all of the above, and are still having difficulty, do speak to me at the end of the lecture, or
come to see me during my office hours (no appointment is needed). Don’t suffer in silence.
Work consistently
Since the coursework counts 20% towards your final mark for this module, consistent work really pays off. In
previous years, most students have followed this advice and have done very well overall.
Revise thoroughly
As you will already know, the best way to revise maths is to do lots of problems, not just to read through
lecture notes. Re-work assignment questions, try past papers, etc.
0.8 The Greek alphabet
Name      Uppercase   Lowercase      Name      Uppercase   Lowercase
Alpha     Α           α              Nu        Ν           ν
Beta      Β           β              Xi        Ξ           ξ
Gamma     Γ           γ              Omicron   Ο           ο
Delta     Δ           δ              Pi        Π           π
Epsilon   Ε           ϵ or ε         Rho       Ρ           ρ or ϱ
Zeta      Ζ           ζ              Sigma     Σ           σ
Eta       Η           η              Tau       Τ           τ
Theta     Θ           θ or ϑ         Upsilon   Υ           υ
Iota      Ι           ι              Phi       Φ           φ or ϕ
Kappa     Κ           κ              Chi       Χ           χ
Lambda    Λ           λ              Psi       Ψ           ψ
Mu        Μ           μ              Omega     Ω           ω
1 Set theory
Almost all of pure mathematics relies on the concept of a set, an amorphous collection of objects. Usually,
some additional structure is imposed on the set. We might, for example, impose an ordering on the elements
of the set, which then enables us to use symbols like <, >, ≤ and ≥ with impunity. If we define a notion of
‘distance’ between two elements of the set, we get a structure called a metric space, which is the subject of an
entire 30-lecture second-year module in the Mathematics Department. If we define some sort of ‘multiplication’
or ‘addition’ operation on the elements of the set, then we get a very versatile structure called a group, which
is important in a wide range of mathematical topics, as well as physics and chemistry.
Next term’s module EC133 Linear Algebra is concerned with objects called vector spaces, which again are
really just sets equipped with some extra structure. But for the rest of this section, we’ll just be looking at
ordinary sets with no extra structure.
The book Set Theory and Related Topics by Seymour Lipschutz [4] is a good (and cheap) reference for the
material in this and some later sections.
1.1 Sets and elements
The most detailed and complex definition in the Oxford English Dictionary¹ is that for the word ‘set’, comprising
several hundred distinct senses spread over 26 consecutive pages. For our purposes, the following definition will
suffice:
Definition 1.1 A set is a collection of objects, which are referred to as the members or elements of the set.
Example 1.2
(i) The numbers 137, 1391 and −3.
(ii) The people living in Coventry.
(iii) The solutions of the equation x² + 3x + 2 = 0.
(iv) The odd integers . . . , −5, −3, −1, 1, 3, 5, . . ..
(v) The consonants in the Latin alphabet.
(vi) All mountains on earth more than 8 500 metres high.
(vii) The real numbers.
Sets are usually denoted by capital letters, and the elements, if not specified, by lower case letters.
Definition 1.3 (Membership) If A is a set and x is an element of A, we say that x belongs to A (or x is a
member of A) and we write x ∈ A. We denote non-membership by using the symbol ∉; thus x ∉ A means
that x does not belong to the set A.
We can often describe a set by listing all of its elements. The order in which they are written is unimportant.
Example 1.4
(i) {137, −3, 1391} = {−3, 137, 1391}
(ii) {a, e, i, o, u}
(iii) {. . . , −5, −3, −1, 1, 3, 5, . . .} = {±1, ±3, ±5, . . .}
(iv) {Everest, K2, Kangchenjunga, Lhotse}
¹ For a very readable history of the OED, see the books The Surgeon of Crowthorne [5] and The Meaning of Everything [6] by
Simon Winchester.
Sometimes we use the symbols | or : which mean ‘such that’:
Example 1.5
(i) {x | x² − 3x + 2 = 0}
(ii) {x : x is even}
(iii) {x : 0 ≤ x ≤ 1} – the set of numbers x such that x lies between 0 and 1 inclusive.
(Strictly speaking, these sets are not properly defined, as we haven’t said what sort of number x represents.)
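Set-builder notation has a close analogue in Python's set comprehensions. The following is only an illustrative sketch: it assumes that x ranges over the integers from −10 to 10 (the name universe is ours, not part of the notes), which also supplies the missing information about what sort of number x represents.

```python
# Illustrative sketch: set-builder notation written as Python set comprehensions.
# Assumption: x ranges over the integers -10, ..., 10 (a hypothetical universe).
universe = range(-10, 11)

# {x | x^2 - 3x + 2 = 0}
roots = {x for x in universe if x**2 - 3*x + 2 == 0}

# {x : x is even}, restricted to the universe
evens = {x for x in universe if x % 2 == 0}

# {x : 0 <= x <= 1}, restricted to the integers in the universe
unit = {x for x in universe if 0 <= x <= 1}

# The order in which elements are listed is unimportant, as in Example 1.4(i).
same = {137, -3, 1391} == {-3, 137, 1391}

print(roots, unit, same)
```

Over the real numbers the third set would be a whole interval; the comprehension only picks out the integers 0 and 1 because of the universe we assumed.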
Definition 1.6 (Finite and infinite sets, cardinality) A set is said to be finite if it contains only a finite
number of elements, and infinite if it contains an infinite number of elements. The order or cardinality of a
set A is defined to be the number of elements the set contains, and is written |A| or card(A).
In Example 1.2, the sets (i), (ii), (iii), (v) and (vi) are finite, while (iv) and (vii) are infinite (but differently so –
more on this later).
Definition 1.7 (Equality) Two sets A and B are equal (written A = B) if they contain exactly the same
elements.
Example 1.8
{x : x² = 1} = {−1, 1}
Definition 1.9 (Subsets) If A and B are two sets, we say that A is a subset of B (written A ⊆ B, or
sometimes B ⊇ A) if all the members of A are also in B:
‘whenever x ∈ A then x ∈ B’
The symbol ⊆ means that A could be equal to B (compare with the symbol ≤). If there are additional
elements of B which are not contained in A, then we say that A is a proper subset of B (written A ⊂ B, or
sometimes B ⊃ A).
Clearly A = B if and only if² A ⊆ B and B ⊆ A. That is, every element of A is contained in B, and every
element of B is contained in A.
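As a small illustration of Definition 1.9 and the remark above, Python's built-in set type provides the comparisons <= (subset) and < (proper subset). The sets A and B and the helper equal_by_inclusion below are hypothetical names chosen for this sketch.

```python
# Illustrative sketch of subsets and of equality via mutual inclusion.
A = {1, 2}
B = {1, 2, 3}

is_subset = A <= B   # A is a subset of B
is_proper = A < B    # A is a proper subset of B (subset and not equal)


def equal_by_inclusion(S, T):
    """Return True exactly when S is a subset of T and T is a subset of S."""
    return S <= T and T <= S


print(is_subset, is_proper)
print(equal_by_inclusion({-1, 1}, {1, -1}))
print(equal_by_inclusion(A, B))
```

The helper agrees with Python's own == on sets, which is the point of the remark: two sets are equal precisely when each is contained in the other.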
Definition 1.10 (Intersection) The intersection A ∩ B of two sets A and B is defined to be the set of all
elements which belong to both sets.
A ∩ B := {x : x ∈ A and x ∈ B}
Example 1.11 If S = {a, b, c} and T = {c, e, f, b} then S ∩ T = {b, c}.
If A₁, A₂, …, Aₙ are sets then their intersection is
\[ \bigcap_{i=1}^{n} A_i := A_1 \cap A_2 \cap \dots \cap A_n \]
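The n-fold intersection can be computed by repeatedly applying the two-set intersection. The sketch below reproduces Example 1.11 and then intersects a small family of sets; the family and the function name intersect_all are assumptions made for illustration.

```python
from functools import reduce

# Example 1.11: the intersection of S and T.
S = {"a", "b", "c"}
T = {"c", "e", "f", "b"}
pairwise = S & T


def intersect_all(sets):
    """Intersection A_1 ∩ A_2 ∩ ... ∩ A_n of a non-empty list of sets."""
    return reduce(lambda acc, s: acc & s, sets)


# A hypothetical family A_1, A_2, A_3.
family = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]
common = intersect_all(family)

print(pairwise, common)
```

The list must be non-empty here: an intersection of no sets at all only makes sense relative to some surrounding universe, which this sketch does not model.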
Definition 1.12 (The empty set) The empty set is a set which contains no elements; we represent it with
the symbol ∅, not to be confused with the lowercase Greek letter φ (phi). The empty set is a subset of all
other sets, since all of its elements are elements of any other given set.
Note that ∅ ≠ {0}.
² This phrase, which turns up quite often in mathematics, is occasionally abbreviated as ‘iff’.
Example 1.13
(i) The set of all mountains on earth more than 30 000 feet high is empty.
(ii) If A = {1, 5, 7} and B = {2, 4, 11} then A ∩ B = ∅.
Definition 1.14 (Disjoint sets) If the intersection of two sets is empty, then they are said to be disjoint.
The sets A and B in Example 1.13(ii) are disjoint – they have no elements in common.
Definition 1.15 (Union) Given two sets A and B, their union A ∪ B is defined to be the set of all elements
which belong to A or B or both.
A ∪ B := {x : x ∈ A and/or x ∈ B}
Example 1.16 Using the sets in Example 1.13(ii), A ∪ B = {1, 2, 4, 5, 7, 11}.
The union of sets A₁, A₂, …, Aₙ is
\[ \bigcup_{i=1}^{n} A_i := A_1 \cup A_2 \cup \dots \cup A_n \]
Note that
A ∪ A = A,   A ∪ ∅ = A,   A ∩ A = A,   A ∩ ∅ = ∅
for any set A. Also note that
A ∪ B = B ∪ A,   A ∩ B = B ∩ A
for any sets A and B (we say ∪ and ∩ are commutative operations).
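The union, the disjointness of Example 1.13(ii) and the identities just listed can be spot-checked on concrete sets. This is a sketch on one particular pair of sets, not a proof of the identities; the helper union_all is a hypothetical name.

```python
# Sets from Example 1.13(ii).
A = {1, 5, 7}
B = {2, 4, 11}
empty = set()

union = A | B                 # Example 1.16
disjoint = A.isdisjoint(B)    # Definition 1.14

# The identities noted above, checked for these particular sets.
identities = [
    A | A == A,
    A | empty == A,
    A & A == A,
    A & empty == empty,
    A | B == B | A,   # union is commutative
    A & B == B & A,   # intersection is commutative
]


def union_all(sets):
    """Union A_1 ∪ A_2 ∪ ... ∪ A_n of a list of sets."""
    return set().union(*sets)


print(union, disjoint, all(identities))
```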
Definition 1.17 (The universal set) In any question involving sets, all sets under investigation may be
considered to be subsets of some fixed set, the universal set, which we denote by Ω.
Note that
A ∩ Ω = A,   A ∪ Ω = Ω
for all sets A ⊆ Ω. (We could, in fact, formally define Ω to be the set satisfying these two properties.)
Definition 1.18 (Difference) Given two sets A and B, their difference A \ B is defined to be the set
consisting of all the elements of A which are not also contained in B:
A \ B := {x : x ∈ A and x ∉ B}
This difference is also called the complement of B relative to A.
Note that A \ B need not be the same as B \ A (and, indeed, will not be unless A = B). If A and B are disjoint,
then A \ B = A, otherwise A \ B ⊂ A.
If A is the universal set Ω, then we refer to Ω \ B as simply the complement of B, and denote it variously by
B′, Bᶜ, B̄, B̃ and so forth (although in this course we will adopt the notation B′ exclusively).
Note also that
B ∪ B′ = Ω,   B ∩ B′ = ∅,   (B′)′ = B
for any set B.
Example 1.19
(i) Let A = {1, 2, 3, 4, 5} and B = {2, 4}. Then A \ B = {1, 3, 5}.
(ii) Let A = {1, 2, 3, 4, 5} and B = {2, 4, 6, 8}. Then A \ B = {1, 3, 5} and B \ A = {6, 8}.
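Example 1.19(ii) and the complement rules can be reproduced with Python's set difference operator. In this sketch the universal set omega = {1, ..., 10} and the function complement are assumptions introduced purely for illustration.

```python
# Sets from Example 1.19(ii).
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8}

a_minus_b = A - B   # elements of A that are not in B
b_minus_a = B - A   # elements of B that are not in A

# Hypothetical universal set containing both A and B.
omega = set(range(1, 11))


def complement(S, universe):
    """Complement of S relative to the given universal set."""
    return universe - S


B_prime = complement(B, omega)

print(a_minus_b, b_minus_a, B_prime)
```

With these definitions the three rules for complements hold: B together with B_prime gives omega, they share no elements, and taking the complement twice returns B.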
Definition 1.20 (Cartesian product) Given two sets A and B, their Cartesian product A×B is defined to
be the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B. That is,
A×B := {(a, b) : a ∈ A and b ∈ B}.
In general, the pair (a, b) is not the same as (b, a).
Example 1.21 Suppose A = {1, 2, 3} and B = {a, b}, then
A × B = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}.
In general we could have the Cartesian product of n sets:
A₁ × A₂ × ⋯ × Aₙ := {(a₁, a₂, …, aₙ) : aᵢ ∈ Aᵢ, i = 1, 2, …, n}
In particular, A × A × A × ⋯ × A (n times) is abbreviated to Aⁿ.
We use the Cartesian product when we represent a point in the plane by coordinates (x, y), because x ∈ R and
y ∈ R. The Cartesian product is R × R or R².
Furthermore, Rⁿ consists of ordered elements of the form (x₁, x₂, …, xₙ) where each xᵢ ∈ R. (x₁, x₂, …, xₙ) is
sometimes called an n–tuple. We will consider R², R³ and Rⁿ in more depth next term, in the linear algebra
section of the course.
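For finite sets, the Cartesian product can be enumerated with itertools.product from the Python standard library. The sketch below reproduces Example 1.21 and forms A × A × A; the variable names are ours.

```python
from itertools import product

# Sets from Example 1.21.
A = [1, 2, 3]
B = ["a", "b"]

# A × B as a set of ordered pairs (Python tuples).
AxB = set(product(A, B))

# A × A × A, abbreviated A^3: all ordered triples of elements of A.
A_cubed = set(product(A, repeat=3))

# Ordered pairs are sensitive to order: (1, 2) and (2, 1) are different.
order_matters = (1, 2) != (2, 1)

print(sorted(AxB))
print(len(AxB), len(A_cubed), order_matters)
```

For finite sets the product has |A| · |B| elements, which is why the two counts printed are 6 and 27.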
2 Real Numbers
The main purpose of this module is to investigate various properties of functions of real variables. In particular,
we will study calculus (differentiation and integration) in careful detail, and examine concepts such as limits,
continuity and differentiability in a properly rigorous manner.
In this section, we will spend a little time studying the various classes of real numbers.
2.1 The Natural Numbers
This is the set of ‘counting numbers’ {1, 2, 3, . . .} and is denoted N (for ‘natural’).
We could extend this set slightly so that we include 0 (call it N∗ say, or N₀). But if we extend the set further
to include negative numbers as well, the resultant set will satisfy even more of the basic properties specified.
2.2 The Integers
The set of integers {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} is denoted Z (for ‘Zahl’, the German word for ‘number’).
Z⁺ = {positive integers} = N
Z⁻ = {negative integers} = −N
All integers can be classified as being even (divisible by 2), or odd (not divisible by 2).
All even integers can be written in the form 2n, where n ∈ Z, and all odd integers can be written as 2n + 1, or
if more convenient, 2n − 1, again with n ∈ Z. The evens and odds are disjoint sets:
Z_odd ∩ Z_even = ∅
Z_odd ∪ Z_even = Z
Example 2.1 The sum of two even integers is even, the sum of two odd integers is even, and the sum of an
even and an odd integer is odd.
Proof Let a = 2m and b = 2n be two even integers, where m, n ∈ Z. Then a + b = 2m + 2n = 2(m + n). Since
m and n are both integers, so is m + n and hence a + b = 2(m + n) is an even integer.
Now let a = 2m + 1 and b = 2n + 1. Then a + b = 2m + 2n + 1 + 1 = 2(m + n + 1), which is also even.
Finally, let a = 2m and b = 2n + 1. Then a + b = 2m + 2n + 1 = 2(m + n) + 1, which is an odd integer.
Example 2.2 The product of two even integers is even, the product of two odd integers is odd, and the product
of an even and an odd integer is even.
A consequence of this result is that the square of an even integer is even, and the square of an odd integer is
odd.
Also the converse of this is true: given that a² is even, where a ∈ Z, then a must be even, whereas if a² is an
odd perfect square, then a must be odd.
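The parity rules of Examples 2.1 and 2.2 can be checked by brute force on a finite range of integers. Such a check is no substitute for the algebraic proofs, since it covers only finitely many cases; the range −20 to 20 and the helper is_even are arbitrary choices made for this sketch.

```python
def is_even(n):
    """True when the integer n is divisible by 2 (also correct for negative n)."""
    return n % 2 == 0


rng = range(-20, 21)

# Example 2.1: a + b is even exactly when a and b have the same parity.
sum_rule = all(
    is_even(a + b) == (is_even(a) == is_even(b)) for a in rng for b in rng
)

# Example 2.2: a * b is even exactly when at least one of a, b is even.
product_rule = all(
    is_even(a * b) == (is_even(a) or is_even(b)) for a in rng for b in rng
)

# Consequence and converse: a squared is even exactly when a is even.
square_rule = all(is_even(a * a) == is_even(a) for a in rng)

print(sum_rule, product_rule, square_rule)
```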
2.3 The Rational Numbers
Numbers of the form m/n, where m and n are integers and n ≠ 0, are called rational numbers. The set of all
rational numbers is denoted Q (for ‘quotient’).
The rational numbers are dense; that is, between any two rational numbers is at least one other rational
number:
Lemma 2.3 If a, b ∈ Q and a < b then $\frac{1}{2}(a + b) \in \mathbb{Q}$, and $a < \frac{1}{2}(a + b) < b$.
Proof Let $a = \frac{p}{q}$ and $b = \frac{r}{s}$, where p, q, r, s ∈ Z and q, s ≠ 0. Then
$$\tfrac{1}{2}(a + b) = \frac{1}{2}\left(\frac{p}{q} + \frac{r}{s}\right) = \frac{ps + rq}{2qs} \in \mathbb{Q}$$
since both numerator and denominator are integers and the denominator is non-zero. Now
$$\tfrac{1}{2}(a + b) - a = \frac{ps + rq}{2qs} - \frac{p}{q} = \frac{ps + rq - 2ps}{2qs} = \frac{rq - ps}{2qs} = \frac{1}{2}\left(\frac{r}{s} - \frac{p}{q}\right) = \tfrac{1}{2}(b - a) > 0$$
Hence there is a rational number $\frac{1}{2}(a + b)$ satisfying $a < \frac{1}{2}(a + b)$. Similarly we can show that $\frac{1}{2}(a + b) < b$,
and the result follows.
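The construction in Lemma 2.3 can be tried out with exact rational arithmetic. The following is a minimal sketch using Python's standard `fractions` module; the function name `midpoint` and the sample values 1/3 and 1/2 are illustrative choices, not part of the notes.

```python
from fractions import Fraction

def midpoint(a: Fraction, b: Fraction) -> Fraction:
    """Return (a + b) / 2, the rational number used in Lemma 2.3."""
    return (a + b) / 2

# Illustrative values: a = 1/3 and b = 1/2.
a, b = Fraction(1, 3), Fraction(1, 2)
m = midpoint(a, b)

print(m)          # 5/12
print(a < m < b)  # True
```

Repeating the step with a and m (or m and b) gives yet another rational in between, which is the sense in which the rationals are dense.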
So it looks as though we can ‘fill up’ the whole number line with rational numbers. But this is wrong, as we
shall see next.
Theorem 2.4 $\sqrt{2}$ is not a rational number.
This result is attributed to the school of Pythagoras of Samos (c. 570–495 BC), although a more rigorous proof
is known to have been in existence by 380 BC. This result contradicted a central pillar of the Pythagoreans’
worldview – that the rational numbers underlay the entirety of existence – and an apocryphal legend states that
this was so shocking that Pythagoras had the unfortunate mathematician responsible taken out and drowned.
The following proof uses a technique known as ‘proof by contradiction’ or reductio ad absurdum. Essentially
we start by assuming the desired result is not true, and then proceed logically to show that this leads to an
inconsistency. Thus, our initial assumption must have been incorrect.
Proof Suppose that $\sqrt{2}$ is rational. In that case, it can be written in the form $\sqrt{2} = \frac{p}{q}$, where p and q are
integers which have no factors in common – that is, $\frac{p}{q}$ is a fraction in its lowest terms – and q ≠ 0.
Squaring both sides:
$$2 = \frac{p^2}{q^2} \implies p^2 = 2q^2$$
Thus $p^2$ is an even integer, and by Example 2.2 this means that p also must be even. (If p were odd, then $p^2$
would be odd.)
So we can write p = 2r where r is some integer. Therefore
$$2q^2 = (2r)^2 = 4r^2$$
and so $q^2 = 2r^2$.
Hence q is also even, for the same reason that p is.
This means that both p and q are even, which contradicts the hypothesis that $\frac{p}{q}$ was in its simplest form. So
the only option open to us is to conclude that it is not possible to express $\sqrt{2}$ as a rational number.
This proof uses a subtle but powerful technique called reductio ad absurdum, or proof by contradiction. Broadly
speaking, we start out by assuming the opposite statement to the one we want to prove, and then proceed
logically from that point until we arrive at a statement which is either inconsistent with the original hypothesis
or clearly false. The only possible conclusion then is that the original assumption was false, and therefore that
the thing we wanted to prove must have been true all along.
The mathematician G H Hardy (1877–1947) wrote, in his book A Mathematician’s Apology:
. . . and reductio ad absurdum, which Euclid loved so much, is one of a mathematician’s finest
weapons. It is a far finer gambit than any chess gambit: a chess player may offer the sacrifice of a
pawn or even a piece, but a mathematician offers the game.
2.4 Irrational and real numbers
Since $\sqrt{2}$ is not rational, we call it an irrational number. There is no specific symbol for the irrationals, but
all the rational and irrational numbers constitute the real numbers which we denote by the symbol R.
The symbols R+ and R− have analogous meanings to the symbols Z+ and Z− . We can define the irrational
numbers to be R \ Q. Obviously there are an infinite number of square roots which are irrational – as an
exercise, see if you can adapt the proof of Theorem 2.4 to show that $\sqrt{3}$ is irrational. Similarly, an infinite
number of higher-order roots will also be irrational (eg $\sqrt[3]{5}$).
Irrational numbers which are solutions of polynomial equations with integer coefficients (eg $x^2 = 2$) are said to be algebraic.
Irrational numbers which aren’t algebraic are called transcendental – well-known examples of this second
class are π and e.
It isn’t always easy to prove that a number is irrational, and indeed there have been many interesting numbers
which mathematicians have been convinced were irrational, but which were later shown to be rational.
The Feigenbaum constant δ = 4.669201 . . ., which turns up in the study of dynamical systems and chaos
theory (eg in models of population growth) is believed to be transcendental, but has not been proved to be so.
Chaitin’s constant (the probability that a random algorithm halts) is known to be not only transcendental, but
uncomputable.
It is sometimes possible to prove existence statements without finding explicit numbers which satisfy the
statement.
Proposition 2.5 There exist irrational real numbers x and y such that $x^y$ is rational.
Proof We know that $\sqrt{2}$ is irrational. Consider the number $\sqrt{2}^{\sqrt{2}}$. It is (obviously) either rational or irrational.
If it is rational, then the result follows by setting $x = y = \sqrt{2}$.
If not, then set
$$x = \sqrt{2}^{\sqrt{2}} \quad\text{and}\quad y = \sqrt{2}.$$
Then $x^y = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = (\sqrt{2})^2 = 2$, which is rational.
The fact that $\sqrt{2}$ is irrational implies that between any two rational numbers there is at least one irrational
number.
Lemma 2.6 If a, b ∈ Q with a < b, then $c = a + \frac{\sqrt{2}}{2}(b - a)$ is an irrational number such that a < c < b.
Proof We first show that c is irrational. Again, the proof is by contradiction.
Suppose that c is rational. We have $c = a + \frac{\sqrt{2}}{2}(b - a)$ and rearranging this we get
$$\sqrt{2} = \frac{2(c - a)}{b - a}.$$
Since a, b, c ∈ Q, then $\sqrt{2} \in \mathbb{Q}$; but this is false. Therefore c is irrational.
For the second part, note that $1 < \sqrt{2}$, since 1 < 2. This means that
$$0 < \frac{1}{\sqrt{2}} < 1, \quad\text{or}\quad 0 < \frac{\sqrt{2}}{2} < 1.$$
Hence
$$c = a + \frac{\sqrt{2}}{2}(b - a) < a + (b - a) = b.$$
Obviously
$$a < a + \frac{\sqrt{2}}{2}(b - a),$$
(since b − a > 0, ie b > a).
Remark The rational and irrational numbers are mixed up in a very complicated way. Do not make the
common mistake of thinking that they alternate in some way along the real line.
2.5 Some further results for real numbers
2.5.1 Classifying decimals as rational or irrational
(i) Any finite decimal is rational.
(ii) Any infinite repeating decimal is rational.
(iii) Any infinite, non-repeating decimal is irrational.
Case (i) could be included within case (ii) because we can always append an infinite number of zeros to the end
of any finite decimal.
Example 2.7
(i) $0.14362 = 0.1436200000\ldots = \frac{14362}{100000}$.
(ii) 0.123123123 . . . is infinite but repeating, so is rational.
Let x = 0.123123123 . . ., then multiplying by $10^3 = 1000$,
1000x = 123.123123123 . . . ,
and subtracting x we get 999x = 123 and so $x = \frac{123}{999}$.
Actually in this example we can simplify further by dividing the numerator and denominator by 3 to give
$x = \frac{41}{333}$.
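The subtraction trick in Example 2.7(ii) works for any purely repeating decimal 0.ddd...d ddd...d ... with a block of k digits: multiplying by $10^k$ and subtracting gives $(10^k - 1)x = \text{block}$. The sketch below assumes the decimal starts repeating immediately after the point; the helper name `repeating_decimal_to_fraction` is made up for illustration.

```python
from fractions import Fraction

def repeating_decimal_to_fraction(block: str) -> Fraction:
    """Return the value of 0.(block)(block)(block)... as a fraction.

    If the block has k digits then (10**k - 1) * x = int(block),
    exactly as in the 999x = 123 step of Example 2.7.
    """
    k = len(block)
    return Fraction(int(block), 10**k - 1)

x = repeating_decimal_to_fraction("123")
print(x)  # 41/333
```

`Fraction` reduces to lowest terms automatically, which reproduces the final simplification from 123/999 to 41/333.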
2.5.2 Absolute value
If a ∈ R, then
$$|a| := \begin{cases} a & \text{if } a \geq 0, \\ -a & \text{if } a < 0. \end{cases}$$
This means that $a \leq |a|$. Also $|a| = \sqrt{a^2}$, or $|a|^2 = a^2$. Later we will extend this concept to cover complex
numbers as well.
Note Nowadays, $\sqrt{\phantom{x}}$ is always defined to be the positive square root. So while the solutions of the equation
$x^2 = 4$ are ±2, the symbol $\sqrt{4}$ means just 2, not ±2.
|a| may be interpreted as the distance between ‘a’ and the origin on the real number line.
|a − b| is the distance between a and b, and |a + b| is the distance between a and −b, (or between b and −a).
2.5.3 The Triangle Inequality
$$|a + b| \leq |a| + |b|$$
Proof Now $(|a + b|)^2 = (a + b)^2 = a^2 + 2ab + b^2$. Also, since $ab \leq |ab| = |a||b|$, then
$$(|a + b|)^2 \leq a^2 + 2|a||b| + b^2 = |a|^2 + 2|a||b| + |b|^2 = (|a| + |b|)^2$$
Hence $(|a + b|)^2 \leq (|a| + |b|)^2$, and since all absolute values are non-negative, this leads to
$$|a + b| \leq |a| + |b|$$
3 Complex Numbers
We now complete the standard hierarchy of number systems
N⊂Z⊂Q⊂R⊂C
In the 16th century, there was much interest in solving algebraic equations. For example: find two numbers
whose sum is ten and product forty.
Algebraically, this is the problem
$$x + y = 10, \qquad xy = 40$$
so
$$x(10 - x) = 40 \implies x^2 - 10x + 40 = 0$$
This has solutions $x = \frac{10 \pm \sqrt{-60}}{2} = 5 \pm \sqrt{-15}$. So the solutions are $x = 5 + \sqrt{-15}$ and $y = 5 - \sqrt{-15}$.
Although, at this point, we haven’t yet attached any meaning to $\sqrt{-15}$, if we manipulate these expressions
using the familiar rules of algebra, we can show that x + y = 10 and xy = 40.
Let us consider the equation $x^2 + 1 = 0$. If x is a real number, this equation has no solutions. We know that
in general, a quadratic equation (a polynomial equation of degree 2) has 2, 1 or 0 real solutions. In some
circumstances, it is useful to think of such equations as having two solutions in all cases. If so, we have to
extend our set of real numbers R to accommodate this.
The Swiss mathematician Leonhard Euler (1707–1783) introduced the symbol i for $\sqrt{-1}$. We will define it via
$i^2 = -1$, for reasons which will emerge later. In this case we solve $x^2 - i^2 = 0$ and the equation $x^2 + 1 = 0$ has
solutions x = i and x = −i.
Above we obtained the solutions $x = 5 + \sqrt{-15}$ and $y = 5 - \sqrt{-15}$. We can write $\sqrt{-15}$ as $\sqrt{15i^2} = \sqrt{15}\,i$
(or $i\sqrt{15}$), and so $x = 5 + i\sqrt{15}$ and $y = 5 - i\sqrt{15}$.
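As a quick numerical sanity check (not a proof), Python's built-in complex type can confirm that these two numbers have sum ten and product forty. The variable names are illustrative, and the comparison uses a small tolerance because $\sqrt{15}$ is only stored approximately in floating point.

```python
import math

# x = 5 + i*sqrt(15) and y = 5 - i*sqrt(15), as floating-point complex numbers.
x = complex(5, math.sqrt(15))
y = complex(5, -math.sqrt(15))

total = x + y
product = x * y

print(abs(total - 10) < 1e-12)    # True: the sum is 10
print(abs(product - 40) < 1e-9)   # True: the product is 40, up to rounding
```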
3.1 General Definitions
Any number of the form a + ib (or a + bi) where a and b are real numbers, is called a complex number. We
usually call a complex number z or w rather than x or y. We denote the set of all complex numbers C.
Two complex numbers are added or multiplied term by term using the usual rules of algebra, remembering
that $i^2 = -1$.
Given a complex number z = a + ib, we call the real number a the real part of z and the real number b is the
imaginary part of z. This can be written Re(z) = a and Im(z) = b. Note that these terms are historical and
have no particular significance.
Example 3.1 Re(3 − 5i) = 3 and Im(3 − 5i) = −5, (not −5i).
Given the general complex number z = a + ib, if b = 0, then z is real, so a real number a is equivalent to the
complex number a + 0i.
If z = a + ib and a = 0 then we say that z is purely imaginary. If both a and b are zero, then the complex
number 0 + 0i is equivalent to the real number 0. So zero belongs to C and z = 0 means that both Re(z) = 0
and Im(z) = 0.
3.2 Algebra of Complex Numbers
3.2.1 Addition, negative and subtraction
If z = a + bi and w = c + di then
$$z + w = (a + c) + (b + d)i$$
$$-w = -c - di$$
$$z - w = (a - c) + (b - d)i$$
3.2.2 Multiplication
If z = a + bi and w = c + di then
$$zw = (a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i$$
3.2.3 Equality
If z = a + bi and w = c + di, then z = w if and only if a = c and b = d.
3.3 Complex conjugate
Let z = a + bi be a complex number, then $\bar{z} = a - bi$ is called the complex conjugate of z.
$z\bar{z} = (a + bi)(a - bi) = a^2 + b^2$, which is a positive real number for all nonzero complex numbers.
The real number $\sqrt{a^2 + b^2}$ is called the modulus of z. It is denoted by |z|.
Thus
$$|z|^2 = z\bar{z} = \bar{z}z$$
Lemma 3.2 Let z, w ∈ C
(i) $\overline{z + w} = \bar{z} + \bar{w}$. The conjugate of a sum is the sum of the conjugates.
(ii) $\overline{zw} = \bar{z}\,\bar{w}$. The conjugate of a product is the product of the conjugates.
Proof Let z = a + bi, w = c + di
(i) z + w = (a + c) + (b + d)i, so
$$\overline{z + w} = (a + c) - (b + d)i = (a - bi) + (c - di) = \bar{z} + \bar{w}.$$
(ii) zw = (ac − bd) + (ad + bc)i, so
$$\overline{zw} = (ac - bd) - (ad + bc)i = (a - bi)(c - di) = \bar{z}\,\bar{w}.$$
3.3.1 Multiplicative Inverse
If z = a + bi, z ≠ 0, then $z^{-1} = \frac{1}{z} = \frac{1}{a + bi}$.
Multiplying the numerator and denominator by the complex conjugate a − bi gives
$$z^{-1} = \frac{1}{z} = \frac{a - bi}{(a + bi)(a - bi)} = \frac{a - bi}{a^2 + b^2}.$$
We can now carry out division:
3.3.2 Division
If z = a + bi and w = c + di, w ≠ 0 then
$$\frac{z}{w} = zw^{-1} = (a + bi) \cdot \frac{c - di}{c^2 + d^2} = \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2}.$$
We also have $\overline{\left(\frac{z}{w}\right)} = \frac{\bar{z}}{\bar{w}}$. The conjugate of a quotient is the quotient of the conjugates.
3.4 The Fundamental Theorem of Algebra
Consider the general polynomial of degree n:
$$P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$$
where the coefficients $a_i$ are real numbers and the leading coefficient $a_n \neq 0$.
The polynomial $P_n(x)$ has exactly n roots in C, counted with multiplicity. These can be a combination of real and distinct, real and repeated
or complex.
Note Roots of the polynomial are just solutions of the polynomial equation $P_n(x) = 0$, ie values of x for
which the equation holds.
Example 3.3 Consider the polynomial
$$P_3(x) = x^3 + x^2 + 3x - 5$$
Clearly, $P_3(1) = 0$ because 1 + 1 + 3 − 5 = 0, so x = 1 is a root and (x − 1) must be a factor. We could
search for another real root, but actually we would be unsuccessful. We find the remaining quadratic factor, $x^2 + 2x + 5$, by
synthetic division.
The solutions of $x^2 + 2x + 5 = 0$ are given by
$$x = \frac{-2 \pm \sqrt{4 - 20}}{2} = -1 \pm 2i.$$
So the roots of $P_3(x)$ are 1, −1 + 2i and −1 − 2i, and we can factorise the cubic as follows:
$$x^3 + x^2 + 3x - 5 = (x - 1)[x - (-1 + 2i)][x - (-1 - 2i)] = (x - 1)(x + 1 - 2i)(x + 1 + 2i).$$
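The three roots found in Example 3.3 can be checked by substituting them back into the cubic. This is a small sketch with Python's built-in complex numbers; the function name `p3` simply mirrors $P_3$ and is otherwise an illustrative choice.

```python
def p3(x):
    """Evaluate P3(x) = x^3 + x^2 + 3x - 5 for a real or complex x."""
    return x**3 + x**2 + 3 * x - 5

# The real root and the conjugate pair from Example 3.3.
roots = [1, complex(-1, 2), complex(-1, -2)]

for r in roots:
    # Each value should be zero (up to floating-point rounding).
    print(r, abs(p3(r)) < 1e-9)
```

The same substitution with a non-root such as x = 2 gives a non-zero value (here 13), so the check does discriminate.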
Note that the complex roots occur in a conjugate pair. This will always be the case when the coefficients in the
original polynomial are real: ie all complex roots occur in conjugate pairs, as we shall now demonstrate.
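Theorem 3.4 (The Fundamental Theorem of Algebra) Let
$$P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0 = 0$$
be a polynomial equation, where $a_i \in \mathbb{R}$.
Suppose that z ∈ C satisfies this equation, then so does $\bar{z}$.
Proof We know that P (z) = 0, ie
$$P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_2 z^2 + a_1 z + a_0 = 0.$$
Now consider $P(\bar{z}) = a_n \bar{z}^n + a_{n-1} \bar{z}^{n-1} + \cdots + a_2 \bar{z}^2 + a_1 \bar{z} + a_0$. We want to show that this also is zero.
Using Lemma 3.2, $\bar{z}^k = \overline{z^k}$ for each k, so
$$P(\bar{z}) = a_n \overline{z^n} + a_{n-1} \overline{z^{n-1}} + \cdots + a_2 \overline{z^2} + a_1 \bar{z} + a_0$$
Now since $a_i \in \mathbb{R}$, then $\overline{a_i} = a_i$ (the conjugate of a real number is the number itself), so
$$\begin{aligned}
P(\bar{z}) &= \overline{a_n}\,\overline{z^n} + \overline{a_{n-1}}\,\overline{z^{n-1}} + \cdots + \overline{a_2}\,\overline{z^2} + \overline{a_1}\,\bar{z} + \overline{a_0} \\
&= \overline{a_n z^n} + \overline{a_{n-1} z^{n-1}} + \cdots + \overline{a_2 z^2} + \overline{a_1 z} + \overline{a_0} && \text{using Lemma 3.2} \\
&= \overline{a_n z^n + a_{n-1} z^{n-1} + \cdots + a_2 z^2 + a_1 z + a_0} && \text{using Lemma 3.2} \\
&= \overline{P(z)} = \bar{0} = 0
\end{aligned}$$
so $\bar{z}$ satisfies the equation. Hence any complex roots of the polynomial equation occur in conjugate pairs.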
From this theorem it is clear that a cubic with real coefficients must have one or three real roots (counted with multiplicity).
If a pair of complex conjugate roots are a ± ib, then these give rise to a real quadratic factor because
$$[x - (a + ib)][x - (a - ib)] = x^2 - (a + ib + a - ib)x + (a + ib)(a - ib) = x^2 - 2ax + (a^2 + b^2)$$
Note here that the coefficient of x, viz −2a, is ‘−the sum of the roots’ and the constant term $a^2 + b^2$ is the
‘product of the roots’.
3.5 Geometric interpretation and the polar form
Every complex number z = a + bi can be represented by a point (a, b) in the x–y plane. In this context, the
x–y plane is called the complex plane or Argand diagram.
[Figure: Argand diagram with horizontal axis Re(z) and vertical axis Im(z), showing the point z = a + bi at distance r from the origin and at angle θ to the positive real axis.]
In polar coordinates, the cartesian point (a, b) can be represented as ⟨r, θ⟩. Clearly, a = r cos θ and b = r sin θ.
So we can write
$$z = a + bi = r(\cos\theta + i\sin\theta).$$
If we square and add we get
$$a^2 + b^2 = r^2\cos^2\theta + r^2\sin^2\theta = r^2$$
and so $r = \sqrt{a^2 + b^2}$.
Thus r = |z|, the modulus of z, which was defined earlier. Note that r ≥ 0.
The polar angle θ is called the argument of z, written arg z and is defined by
$$r\cos\theta = a, \qquad r\sin\theta = b$$
or, if there is no ambiguity, by $\theta = \tan^{-1}\left(\frac{b}{a}\right)$.
Since θ ≡ θ + 2kπ for k ∈ Z, we usually define the principal value of the argument to lie in the range
0 ≤ θ < 2π or possibly, −π < θ ≤ π, (0° ≤ θ < 360° or −180° < θ ≤ 180°).
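3.5.1 Complex multiplication
Suppose $z_1 = r_1(\cos\theta_1 + i\sin\theta_1)$ and $z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$, then
$$\begin{aligned}
z_1 z_2 &= r_1 r_2 (\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 + i\sin\theta_2) \\
&= r_1 r_2 \left[\cos\theta_1\cos\theta_2 + i\cos\theta_1\sin\theta_2 + i\sin\theta_1\cos\theta_2 + i^2\sin\theta_1\sin\theta_2\right] \\
&= r_1 r_2 \left[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\right] \\
&= r_1 r_2 \left[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\right]
\end{aligned}$$
If we write this result in shorthand form we have:
$$\langle r_1, \theta_1\rangle \langle r_2, \theta_2\rangle = \langle r_1 r_2, \theta_1 + \theta_2\rangle$$
That is, multiply the moduli and add the arguments.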
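The rule "multiply the moduli and add the arguments" can be compared against ordinary Cartesian multiplication. The sketch below uses `cmath.rect` from Python's standard library to convert from polar to Cartesian form; the helper `polar_multiply` and the sample values ⟨2, π/6⟩ and ⟨3, π/4⟩ are assumptions made for illustration.

```python
import cmath

def polar_multiply(p1, p2):
    """Multiply two complex numbers given as (modulus, argument) pairs."""
    (r1, t1), (r2, t2) = p1, p2
    return (r1 * r2, t1 + t2)

p1 = (2.0, cmath.pi / 6)
p2 = (3.0, cmath.pi / 4)

# Route 1: multiply in polar form, then convert to a + bi.
r, theta = polar_multiply(p1, p2)
via_polar = cmath.rect(r, theta)

# Route 2: convert each factor to a + bi, then multiply directly.
direct = cmath.rect(*p1) * cmath.rect(*p2)

print(abs(via_polar - direct) < 1e-12)  # True: both routes agree
```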
3.5.2 Powers: De Moivre’s theorem
A consequence of the multiplication rule is that if z = ⟨r, θ⟩ in polar form, then
$$z^2 = \langle r^2, 2\theta\rangle, \quad z^3 = \langle r^3, 3\theta\rangle, \quad \ldots, \quad z^n = \langle r^n, n\theta\rangle$$
where n ∈ N. We shall prove this formally later on.
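3.5.3 Reciprocal
If z = r(cos θ + i sin θ) = ⟨r, θ⟩ then
$$\frac{1}{z} = \frac{1}{r(\cos\theta + i\sin\theta)} = \frac{1}{r} \cdot \frac{\cos\theta - i\sin\theta}{(\cos\theta + i\sin\theta)(\cos\theta - i\sin\theta)} = \frac{1}{r}(\cos\theta - i\sin\theta).$$
Now cos(−θ) = cos(θ) and sin(−θ) = − sin(θ) so the above could be written as
$$\frac{1}{z} = \frac{1}{r}\left[\cos(-\theta) + i\sin(-\theta)\right].$$
Thus, in polar form,
$$\frac{1}{z} = \left\langle \frac{1}{r}, -\theta \right\rangle \quad\text{or}\quad z^{-1} = \langle r^{-1}, (-1)\theta\rangle$$
It may also be shown that
$$z^{-m} = \langle r^{-m}, -m\theta\rangle, \quad\text{where } m \in \mathbb{N}.$$
Thus we have the general result
$$z^n = \langle r^n, n\theta\rangle, \quad\text{for all } n \in \mathbb{Z}.$$
3.5.4 Division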
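If $z_1 = \langle r_1, \theta_1\rangle$ and $z_2 = \langle r_2, \theta_2\rangle$, with $z_2 \neq 0$, then
$$\frac{z_1}{z_2} = z_1 z_2^{-1} = \langle r_1, \theta_1\rangle \left\langle \frac{1}{r_2}, -\theta_2 \right\rangle.$$
Using the result for multiplication:
$$\frac{z_1}{z_2} = \left\langle \frac{r_1}{r_2}, \theta_1 - \theta_2 \right\rangle$$
that is, for division, divide the moduli and subtract the arguments.
3.5.5 Conjugate
If z = ⟨r, θ⟩ then $\bar{z} = \langle r, -\theta\rangle$.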
Example 3.5
(i) Mark on an Argand diagram the complex numbers $z_1 = 3 - 2i$, $z_2 = -i$, $z_3 = i^2$, $z_4 = -2 - 4i$, $z_5 = 3$.
Find the modulus and argument of each complex number and hence write in polar form.
(ii) Find the polar form of
(a) $z_1 = -\sqrt{3} + i$
(b) $z_2 = 4 + 4i$
Hence express $z_1 z_2$, $\frac{z_1}{z_2}$, $\frac{1}{z_1}$ and $z_1^4$ in polar form.
(iii) Express $\langle \sqrt{2}, \frac{\pi}{4}\rangle$, $\langle 2, \frac{\pi}{6}\rangle$ and $\langle 2, -\frac{\pi}{6}\rangle$ in Cartesian form a + bi.
3.6 Roots of polynomials
3.6.1 Cube roots of unity
Find the roots of $z^3 - 1$.
Method 1 Since $z^3 - 1 = 0$ when z = 1, then (z − 1) must be a factor, and completing the factorisation
$$z^3 - 1 = (z - 1)(z^2 + z + 1),$$
so $z^3 - 1 = 0$ when z = 1 and when $z^2 + z + 1 = 0$, ie when $z = -\frac{1}{2} \pm \frac{\sqrt{3}}{2}i$.
So the roots of $z^3 - 1$ are 1, $-\frac{1}{2} + \frac{\sqrt{3}}{2}i$ and $-\frac{1}{2} - \frac{\sqrt{3}}{2}i$. These are the three cube roots of 1, or the ‘cube roots of
unity’.
Method 2 We solve $z^3 = 1$ and express both sides of the equation in polar form. If z = ⟨r, θ⟩, then $z^3 = \langle r^3, 3\theta\rangle$
and 1 = ⟨1, 0⟩, so we solve
$$\langle r^3, 3\theta\rangle = \langle 1, 0\rangle,$$
giving $r^3 = 1$ and 3θ = 0.
This means that r = 1. We may be tempted to say that the second equation has the one solution θ = 0, but
this is wrong. We have to remember that the argument of a complex number isn’t defined uniquely: we can
always add on integer multiples of 2π.
Thus we write 3θ = 2kπ, k = 0, 1, 2, . . ., i.e.
3θ = 0, 2π, 4π, 6π, 8π, . . .
Dividing by 3 gives
$$\theta = 0, \; \frac{2\pi}{3}, \; \frac{4\pi}{3}, \; 2\pi, \; \frac{8\pi}{3}, \; \ldots$$
It is clear that after the first three values, 0, $\frac{2\pi}{3}$, $\frac{4\pi}{3}$, the remaining solutions are just repetitions.
So we can write the three solutions as $z = \langle 1, \frac{2k\pi}{3}\rangle$, for k = 0, 1, 2. Writing in full, the solutions are
$$z_0 = \langle 1, 0\rangle = \cos 0 + i\sin 0 = 1 \quad\text{(as expected)}$$
$$z_1 = \left\langle 1, \tfrac{2\pi}{3}\right\rangle = \cos\tfrac{2\pi}{3} + i\sin\tfrac{2\pi}{3} = -\tfrac{1}{2} + i\tfrac{\sqrt{3}}{2}$$
$$z_2 = \left\langle 1, \tfrac{4\pi}{3}\right\rangle = \cos\tfrac{4\pi}{3} + i\sin\tfrac{4\pi}{3} = -\tfrac{1}{2} - i\tfrac{\sqrt{3}}{2}.$$
Remarks
(i) $z_1$ and $z_2$ are complex conjugates.
(ii) The three roots are distributed evenly around the origin with an angular separation of $\frac{2\pi}{3}$.
(iii) If we square either of $z_1$ or $z_2$ we get the other one
$$z_1^2 = \left\langle 1^2, 2 \cdot \tfrac{2\pi}{3}\right\rangle = \left\langle 1, \tfrac{4\pi}{3}\right\rangle = z_2$$
$$z_2^2 = \left\langle 1, \tfrac{8\pi}{3}\right\rangle \equiv \left\langle 1, \tfrac{2\pi}{3}\right\rangle = z_1.$$
So if we denote either of $z_1$ or $z_2$ by ω then the other one is $\omega^2$ and the three solutions are 1, ω and $\omega^2$.
(iv) Consider any general cubic of the form
$$P(z) = z^3 + az^2 + bz + c$$
which has roots α, β and γ, so the cubic can be factorised as
$$P(z) = (z - \alpha)(z - \beta)(z - \gamma).$$
If we multiply out the factors we get
$$z^3 - (\alpha + \beta + \gamma)z^2 + (\alpha\beta + \beta\gamma + \gamma\alpha)z - \alpha\beta\gamma$$
and matching up with $z^3 + az^2 + bz + c$ we see, in particular, that the sum of the roots is α + β + γ = −a.
In (iii) our polynomial is $z^3 - 1$. Since the term in $z^2$ is missing, or equivalently, its coefficient is zero,
then the sum of the roots is 0, and hence
$$1 + \omega + \omega^2 = 0$$
3.6.2 The nth roots of unity: generalisation of the above
Here we solve $z^n = 1$. Following the same method as above, the roots are $z_k = \langle 1, \frac{2k\pi}{n}\rangle$, for k = 0, 1, 2, . . . , (n − 1).
Again the roots are distributed evenly around the origin with an angular separation of $\frac{2\pi}{n}$. Now consider the
following:
(i) In any general polynomial equation
$$z^n + a_{n-1} z^{n-1} + \cdots + a_2 z^2 + a_1 z + a_0 = 0,$$
the sum of the solutions of the equation is equal to ‘−the coefficient of $z^{n-1}$’, i.e. the real number $-a_{n-1}$.
Now when we look at the equation $z^n - 1 = 0$ we see that $a_{n-1} = 0$, so this tells us that the sum of the
roots of the polynomial is 0.
(ii) In finding the roots of $z^n - 1$, we know that one root is $z_0 = 1$. Consider the root $z_1 = \langle 1, \frac{2\pi}{n}\rangle$. Then
$$z_1^2 = \left\langle 1, \tfrac{4\pi}{n}\right\rangle = z_2, \quad z_1^3 = \left\langle 1, \tfrac{6\pi}{n}\right\rangle = z_3, \quad \ldots, \quad z_1^{n-1} = \left\langle 1, \tfrac{2(n-1)\pi}{n}\right\rangle = z_{n-1}$$
Using (i), we have
$$z_0 + z_1 + z_2 + z_3 + \cdots + z_{n-1} = 0$$
and using (ii)
$$1 + z_1 + z_1^2 + z_1^3 + \cdots + z_1^{n-1} = 0.$$
This result may be shown to be true for any of the solutions $z_k$, k ≠ 0. So if we call any one of them ω, (ω ≠ 1)
then
$$1 + \omega + \omega^2 + \omega^3 + \cdots + \omega^{n-1} = 0.$$
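The formula $z_k = \langle 1, \frac{2k\pi}{n}\rangle$ translates directly into code, which makes it easy to test the two claims above numerically: each $z_k$ satisfies $z^n = 1$, and the roots sum to zero. This is an illustrative sketch only; `roots_of_unity` is a made-up helper name, and the choice n = 5 is arbitrary.

```python
import cmath

def roots_of_unity(n: int) -> list:
    """Return the n complex numbers <1, 2*k*pi/n> for k = 0, 1, ..., n-1."""
    return [cmath.rect(1.0, 2 * cmath.pi * k / n) for k in range(n)]

roots = roots_of_unity(5)

# Each root satisfies z^5 = 1 (up to floating-point rounding).
print(all(abs(z**5 - 1) < 1e-9 for z in roots))  # True

# The roots sum to zero, as predicted by the coefficient of z^(n-1).
print(abs(sum(roots)) < 1e-9)                     # True

# Squaring z_1 gives z_2, as in remark (ii).
print(abs(roots[1]**2 - roots[2]) < 1e-9)         # True
```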
Example 3.6
(i) Solve $z^5 = -32$
(ii) Solve $z^4 = -1 + i$
Example 3.7 Find the solutions of the equation $z^5 + z^4 + z^3 + z^2 + z + 1 = 0$.
On the face of it this looks difficult. Now suppose we multiply this throughout by (z − 1) then we get
$$(z - 1)(z^5 + z^4 + z^3 + z^2 + z + 1) = z^6 - 1.$$
We therefore solve $z^6 = 1$ following the method in Section 3.6.2. We then discard the extra solution z = 1. The
remaining five solutions are the ones required.
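The argument of Example 3.7 can be checked numerically: take the six sixth roots of unity, drop z = 1, and substitute the remaining five into the original equation. The names `q` and `sixth_roots` below are illustrative, and the check is a floating-point sanity test rather than a proof.

```python
import cmath

def q(z):
    """Evaluate z^5 + z^4 + z^3 + z^2 + z + 1."""
    return z**5 + z**4 + z**3 + z**2 + z + 1

# Sixth roots of unity <1, 2*k*pi/6> for k = 0, ..., 5.
sixth_roots = [cmath.rect(1.0, 2 * cmath.pi * k / 6) for k in range(6)]

# k = 0 gives the extra solution z = 1 introduced by the factor (z - 1).
solutions = sixth_roots[1:]

print(all(abs(q(z)) < 1e-9 for z in solutions))  # True
print(q(1))                                       # 6, so z = 1 is not a solution
```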
Example 3.8
(i) How would you find the roots of $z^5 - z^4 + z^3 - z^2 + z - 1 = 0$?
(ii) Can you generalise to find the roots of $1 + z + z^2 + \cdots + z^n$?
3.7 Exponential form
For real x we can find the following series representations (justification later on).
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$$
For complex z we define
$$e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \frac{z^4}{4!} + \frac{z^5}{5!} + \cdots$$
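If we now write z = iθ (where θ is real) then we obtain
$$\begin{aligned}
e^{i\theta} &= 1 + i\theta + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \cdots \\
&= 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} + \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)
\end{aligned}$$
We conclude that
$$e^{i\theta} = \cos\theta + i\sin\theta$$
This is known as Euler’s Formula.
Replacing θ by −θ in this formula gives
$$e^{-i\theta} = \cos\theta - i\sin\theta.$$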
Note In this formula, θ must be measured in radians.
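Euler's Formula can be illustrated by summing the first few terms of the series for $e^z$ at z = iθ and comparing with cos θ + i sin θ. The sketch below is only a numerical illustration under stated assumptions: the helper name `exp_series`, the cut-off of 30 terms and the sample angle π/3 are all arbitrary choices.

```python
import math

def exp_series(z: complex, terms: int = 30) -> complex:
    """Partial sum 1 + z + z^2/2! + ... of the series defining e^z."""
    total = 0j
    term = 1 + 0j              # the n = 0 term, z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)    # turn z^n / n! into z^(n+1) / (n+1)!
    return total

theta = math.pi / 3            # angle in radians
approx = exp_series(1j * theta)
exact = complex(math.cos(theta), math.sin(theta))

print(abs(approx - exact) < 1e-12)              # True
print(abs(exp_series(1j * math.pi) + 1) < 1e-9) # True: e^(i*pi) is -1 up to rounding
```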
Now the complex number z = r(cos θ + i sin θ) may be written in the exponential form $z = re^{i\theta}$. We can then
use the usual laws of algebra (without justification!) to obtain the familiar polar results:
Conjugate If $z = re^{i\theta}$, then $\bar{z} = re^{-i\theta}$.
Multiplication If $z_1 = r_1 e^{i\theta_1}$, $z_2 = r_2 e^{i\theta_2}$, then
$$z_1 z_2 = r_1 r_2 e^{i\theta_1} e^{i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)}.$$
Reciprocal If $z = re^{i\theta}$, then $\frac{1}{z} = \frac{1}{re^{i\theta}} = \frac{1}{r}e^{-i\theta}$.
Powers If $z = re^{i\theta}$, then $z^n = r^n e^{in\theta}$, n ∈ Z.
Division If $z_1 = r_1 e^{i\theta_1}$, $z_2 = r_2 e^{i\theta_2}$, then
$$\frac{z_1}{z_2} = \frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2} e^{i(\theta_1 - \theta_2)}.$$
3.7.1 A remarkable result
$$e^{i\pi} = -1$$
4 Proof and reasoning
As you may have realised by now, the mathematics that you encounter in this course may be somewhat different
from the mathematics you did at school. Making the transition can be quite difficult. Mathematics at school is
mostly about learning techniques, problem solving and doing calculations. At University, pure mathematics is
about definitions, theorems and proofs. It can be very theoretical and abstract. In this course, I will attempt
to achieve a balance between these approaches.
At the end of a school lesson, you would expect to understand most if not all of it, although you may need
extra practice by doing exercises before you feel completely confident. At the end of a lecture, this may not be
the case. You may well need to go away, work through the lecture notes on your own or better still, in a group,
before you fully understand the material. This is completely normal, and you should not consider that you are
failing (nor that the lecturer is useless) if you don’t understand immediately. But it does require input from
you in order to master the material. Please don’t be afraid to ask questions, either during or after the lectures,
or in the tutorials.
4.1 Some mathematical terms
Definition A definition introduces and explains some mathematical concept.
Theorem This is a key mathematical result or list of key results in a theory.
Lemma This is a minor mathematical result, usually preparing the way for a theorem.
Corollary This is a straightforward consequence of a theorem, often a special case of it.
Proposition In logic, a proposition is just a statement about something, be it true or false. It is often also
used to describe a statement of something to be proven.
Hypothesis or Conjecture This is an unproven mathematical result. Once it is proven, it may become a
lemma or a theorem.
Proof This is an argument that a mathematician uses to convince others that a result is true. The idea of
proof is very important to the whole subject of mathematics. This will be looked at in the next section.
4.2 Why do we need to prove results?
Mathematics differs from many other disciplines in that it does not depend on previous experience or experiment
in order to justify its results, although such an approach may allow us to guess or conjecture, and form hypotheses.
Mathematicians like to present a topic in an orderly way, setting out what assumptions are made (axioms) and
then progress in a logical manner, with all terms used being defined precisely. Among mathematicians, there is
not always agreement about what constitutes an acceptable proof, ie what axioms are to be chosen.
You may feel little need to justify or prove various results, being quite happy to accept my word for it. This
isn’t a good idea – I could be making it all up, or I could just be wrong. We now look at what might constitute
a rigorous argument or proof.
Proofs have the following characteristics:
• All assumptions are clearly stated.
• All terms are defined.
• There is a logical argument indicating how the conclusion follows from the assumptions.
• All cases are covered.
4.3 Methods of Proof
We have met the following already:
Direct Proof
Start with ‘What I know’.
Set up a chain of argument.
End up with ‘What I want’.
Contradiction
Assume that ‘the required result is false’.
Work to a contradiction.
Exhaustion
Check all possible cases individually.
Demonstration
Use this method when you are asked to ‘Prove there exists . . . ’.
You only need one example.
Counterexample
Use this method when you need to disprove something.
You need to show that it is false for some case.
One example is enough.
4.4 Mathematical induction
A good analogy for this method of proof is that of someone trying to climb a ladder.
In order to get to the top of the ladder, we first have to get on to the first rung; from there we go to the second
rung, to the third and so on. If we can get started, we can get to the top eventually.
Let P (n) be a proposition involving the natural number n.
Definition 4.1 (Principle of mathematical induction) Suppose we have a variable proposition P (n) which
depends on some natural number n. Then if P (1) is true, and if P (k) ⇒ P (k + 1) for every k ∈ N, then P (n)
is true for all n ∈ N.
Example 4.2 Let P (n) be the statement “the sum of the first n natural numbers is $\frac{n(n+1)}{2}$.”
We can prove this to be true for all n ∈ N.
(i) Direct Proof Let the sum be $S_n$, then
$$S_n = 1 + 2 + 3 + \cdots + (n - 1) + n$$
Also, writing in reverse order
$$S_n = n + (n - 1) + (n - 2) + \cdots + 2 + 1$$
Adding each pair we get
$$2S_n = \underbrace{(n + 1) + (n + 1) + (n + 1) + \cdots + (n + 1) + (n + 1)}_{n \text{ times}}$$
So $S_n = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.
(ii) Proof by Induction
(a) (The first rung)
P (1): the sum of 1 natural number is obviously 1. Also from the formula, $S_1 = \frac{1(2)}{2} = 1$,
so the formula holds for n = 1.
(b) Now suppose that the formula holds for some particular value of n, say n = k, i.e. assume P (k) is
true. The question then is, is P (k + 1) true? Does the formula hold for n = k + 1? (Again, the
analogy is: if, somehow or other, we’ve reached the kth rung, can we climb to the (k + 1)th?)
So we assume $S_k = 1 + 2 + \cdots + k = \frac{k(k+1)}{2}$. Then
$$S_{k+1} = S_k + (k + 1) = \frac{k(k+1)}{2} + (k + 1) = \frac{(k+1)(k+2)}{2} = \frac{(k+1)(k+1+1)}{2}$$
This is the same as the formula for $S_k$ when we replace k by k + 1 throughout.
So if P (k) is true, then so is P (k + 1), or P (k) ⇒ P (k + 1).
(c) So now we have
P (k) ⇒ P (k + 1), where k ∈ N, and P (1) is true.
We have the following implications: P (1) is true, and
P (1) ⇒ P (2) ⇒ P (3) ⇒ · · · ⇒ P (n) ⇒ · · ·
so P (n) is true for all n ∈ N.
Example 4.3 The sum of the first n odd numbers is $n^2$, that is,
$$S_n = 1 + 3 + 5 + \cdots + (2n - 1) = \sum_{i=1}^{n} (2i - 1) = n^2.$$
Proof
(i) P (n) : the sum of the first n odd numbers is $n^2$.
(ii) P (1): $S_1 = 1$ and also the formula gives $1^2 = 1$, so P (1) is true.
(iii) Assume P (k) is true: $S_k = 1 + 3 + 5 + \cdots + (2k - 1) = k^2$.
Now $S_{k+1} = S_k + (2k + 1) = k^2 + (2k + 1) = (k + 1)^2$,
which has the same form as $S_k$.
So if P (k) is true then so is P (k + 1).
(iv) Therefore P (n) is true for all n ∈ N.
Example 4.4 For any natural number n, $5^n - 1$ is divisible by 4.
Proof
(i) P (n) is the statement above.
(ii) P (1) : $5^1 - 1 = 4$, which is divisible by 4, so P (1) is true.
(iii) Suppose P (k) is true, i.e. $5^k - 1$ is divisible by 4.
Then we could write $5^k - 1 = 4m$, where m ∈ N.
We now investigate P (k + 1).
$$5^{k+1} - 1 = 5 \cdot 5^k - 1 = 5(4m + 1) - 1 = 20m + 4 = 4(5m + 1)$$
which is clearly divisible by 4. So P (k) ⇒ P (k + 1).
(iv) By the principle of mathematical induction, P (n) is true for all n ∈ N.
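The statements of Examples 4.2, 4.3 and 4.4 can be spot-checked by computer for the first few hundred values of n. Such a check is not a proof (it covers only finitely many cases, whereas induction covers all of N), but it is a useful way to catch a wrongly stated formula. The function name `check_examples` and the limit of 200 are illustrative choices.

```python
def check_examples(limit: int) -> bool:
    """Check Examples 4.2, 4.3 and 4.4 for every n from 1 to limit."""
    for n in range(1, limit + 1):
        # Example 4.2: 1 + 2 + ... + n = n(n + 1)/2
        if sum(range(1, n + 1)) != n * (n + 1) // 2:
            return False
        # Example 4.3: 1 + 3 + ... + (2n - 1) = n^2
        if sum(2 * i - 1 for i in range(1, n + 1)) != n**2:
            return False
        # Example 4.4: 5^n - 1 is divisible by 4
        if (5**n - 1) % 4 != 0:
            return False
    return True

print(check_examples(200))  # True
```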
Example 4.5 What is wrong with the following?
P (n): the sum of the first n even numbers is $n^2 + n + 1$.
Proof? Suppose P (k) is true, that is, $S_k = 2 + 4 + \cdots + 2k = k^2 + k + 1$. Then
$$S_{k+1} = S_k + (2k + 2) = (k^2 + k + 1) + (2k + 2) = k^2 + 3k + 3 = (k + 1)^2 + (k + 1) + 1$$
which has the same form as $S_k$. So P (k) ⇒ P (k + 1).
Therefore P (n) is true for all n ∈ N.
So, for example, the sum of the first 2 even numbers is $2^2 + 2 + 1 = 7$, which is traditionally regarded as an
odd number. In fact the sum is 2 + 4 = 6.
What went wrong?
Example 4.6 Prove that if z is a complex number which in polar form is ⟨r, θ⟩, then $z^n = \langle r^n, n\theta\rangle$ for all
n ∈ N.
5 Functions
We can illustrate the theory in this section by means of the following, rather contrived example
Example 5.1
(i) Consider the following scenario: We have a group of customers and a selection of goods that they can
buy. The rules of the game are as follows:
(a) Every customer has to choose one good (one and only one).
(b) Two or more customers are allowed to choose the same good, (there is a large enough supply).
(c) Not all goods have to be chosen, indeed all customers could choose the same good.
To be more precise, let the customers be a, b, c, d and the goods be p, q, r, s, t, u, v.
We know that a chooses q: so we write this as f (a) = q. Similarly, b chooses p, c chooses q and d chooses
u. So we have f (a) = q, f (b) = p, f (c) = q, f (d) = u.
We can represent this information in a diagram:
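[Diagram: customers a, b, c, d on the left and goods p, q, r, s, t, u, v on the right, with arrows a → q, b → p, c → q, d → u.]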
Note that the only goods chosen are p, q and u. Goods r, s, t and v have not been chosen by anyone.
We say that the domain of f is the set {a, b, c, d}, the codomain of f is the set {p, q, r, s, t, u, v} and the
image set is the set {p, q, u}. f is a function from the domain to the codomain.
(ii) We change the rules. Now, any good chosen by one customer cannot be chosen by anyone else, ie suppose
f (a) = q, f (b) = p, f (c) = t, f (d) = u. Again, some of the goods (r, s and v) have not been chosen.
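[Diagram: customers a, b, c, d on the left and goods p, q, r, s, t, u, v on the right, with arrows a → q, b → p, c → t, d → u.]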
The domain and codomains are as before, but the image set is now {p, q, t, u}. Since no good is chosen by
more than one customer, the function is said to be one-one or injective.
(iii) Yet another rule change: this time there are fewer goods to choose from. More than one customer is
allowed to choose a particular good, as in (i), but now there are none left over. All goods are chosen by
somebody.
Suppose the customers are a, b, c, d as before, the only goods now are p, q and u, and f (a) = q, f (b) = p,
f (c) = q, f (d) = u.
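[Diagram: customers a, b, c, d on the left and goods p, q, u on the right, with arrows a → q, b → p, c → q, d → u.]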
Here the domain is the same as in (i) and (ii), the codomain is {p, q, u} and so also is the image set. Since
the image set is equal to the codomain, the function is said to be onto or surjective.
This rather convoluted example illustrates the general properties of functions which we shall investigate next.
Note that I didn’t give a mathematical formula for the rule for f , just a verbal description of what it does.
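One way to experiment with Example 5.1 is to store each function as a lookup table from customers to goods. The sketch below uses Python dictionaries for this; the helper names `is_injective` and `is_surjective` and the variable names are invented for illustration, and the definitions used are the informal ones from the example (no good chosen twice; every good chosen by somebody).

```python
def is_injective(f: dict) -> bool:
    """True if no element of the codomain is chosen more than once."""
    return len(set(f.values())) == len(f)

def is_surjective(f: dict, codomain: set) -> bool:
    """True if the image set equals the whole codomain."""
    return set(f.values()) == codomain

all_goods = {"p", "q", "r", "s", "t", "u", "v"}

f_i = {"a": "q", "b": "p", "c": "q", "d": "u"}    # part (i)
f_ii = {"a": "q", "b": "p", "c": "t", "d": "u"}   # part (ii)
f_iii = {"a": "q", "b": "p", "c": "q", "d": "u"}  # part (iii), codomain {p, q, u}

print(sorted(set(f_i.values())))                # ['p', 'q', 'u'], the image set in (i)
print(is_injective(f_i), is_injective(f_ii))    # False True
print(is_surjective(f_ii, all_goods))           # False
print(is_surjective(f_iii, {"p", "q", "u"}))    # True
```

Note that (i) and (iii) use the same rule; only the codomain differs, which is why surjectivity depends on the codomain as well as on the rule.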
5.1 Domain and codomain
Definition 5.2 Let A and B be sets. A function f : A → B, where f : x ↦ y, is a rule which assigns to
each x ∈ A a single (that is, exactly one) element y ∈ B. This assigned element is denoted f (x).
A is called the domain of the function f , and B is called the codomain of f .
5.2 Interval Notation for Subsets of R
Suppose a and b are real numbers with a < b. Then the set of all real numbers x such that
(i) a < x < b is denoted by (a, b), and is called an open interval. (Some books use ]a, b[ ).
(ii) a ≤ x ≤ b is denoted by [a, b] and is called a closed interval.
(iii) a ≤ x < b is written [a, b), and we say the interval is semi-open or semi-closed.
(iv) x > a is denoted by (a, ∞).
(v) x ≥ a is denoted by [a, ∞).
(vi) x < b (or x ≤ b) is denoted by (−∞, b) (or (−∞, b]).
(iv), (v) and (vi) are all called semi-infinite intervals. If you prefer, you can use interval notation wherever
appropriate.
Examples 5.3
(i) Let A = B = R then
(a) f (x) = x² is a function which assigns to each x ∈ R, the unique value x².
(b) f (x) = 0, for all x ∈ R is called the zero function.
(c) f (x) = x, for all x ∈ R is called the identity function.
(ii) Let A = {x : x ∈ R and x > 0}, which you could also write as R+ or, using interval notation, (0, ∞), and
let B = R. Then the rule f (x²) = x does not define a function: for example, both 2 and −2 satisfy this
relation for the input 4 (counter-example).
If we write f (x) = $\sqrt{x}$, then this is a function from (0, ∞) to R, since $\sqrt{\ }$ is defined to be the positive
square root.
(iii) Let A = the set of all car owners in GB, B = the set of all registration numbers of cars. f is a function
which assigns to each car owner the registration number of his/her car.
(iv) f , as given in Example 5.1, is a function.
5.3 Images and image sets
In Example 5.1(i) we saw that four of the values in the codomain B were not attained.
Definition 5.4 (Image) In the above definition of function, corresponding to each element x ∈ A is the
unique element f (x) in B. f (x) is the image of x in B.
Definition 5.5 (Image set) The image set or range of the function f : A → B is the set of all values of
f (x) attained in B, with x ∈ A. It is a subset of the codomain B, and we could write the image set as f (A);
clearly f (A) ⊆ B.
In Example 5.1(i), the image set is {p, q, u}.
5.4 One-one functions: injections
In Example 5.1(i), we saw that two elements of the domain, namely a and c, have the same image q. We now
consider functions where this possibility is disallowed.
Definition 5.6 (Injective function) A function f : A → B is one-one, injective, or an injection if f (a) = f (b)
only when a = b.
The function defined in Example 5.3(ii) is one-one.
5.5 Onto functions: surjections
Definition 5.7 (Surjective function) A function f : A → B is onto, surjective, or a surjection if its image
set exactly coincides with its codomain. That is, for any b ∈ B there exists at least one (and possibly more
than one) a ∈ A such that f (a) = b.
5.6 One-one and onto functions: bijections
Definition 5.8 (Bijective function) A function which is both injective and surjective is said to be bijective
or a bijection.
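For functions on finite sets, Definitions 5.6 to 5.8 can be checked mechanically. The following is a minimal Python sketch, assuming a function is stored as a dictionary from domain elements to codomain elements; the helper names (image_set, is_injective and so on) are illustrative and not part of the notes. It uses the three versions of the customers-and-goods function from Example 5.1.

```python
def image_set(f):
    """Return the set of values attained by f (a dict: domain element -> image)."""
    return set(f.values())

def is_injective(f):
    # One-one: distinct domain elements never share an image.
    return len(image_set(f)) == len(f)

def is_surjective(f, codomain):
    # Onto: the image set coincides with the codomain.
    return image_set(f) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

goods = {"p", "q", "r", "s", "t", "u", "v"}
f_i = {"a": "q", "b": "p", "c": "q", "d": "u"}    # Example 5.1(i)
f_ii = {"a": "q", "b": "p", "c": "t", "d": "u"}   # Example 5.1(ii)
f_iii = dict(f_i)                                 # Example 5.1(iii): same rule, codomain {p, q, u}
goods_iii = {"p", "q", "u"}
```

Here f_ii is one-one but not onto, while f_iii is onto but not one-one, so neither is a bijection.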
Much of the above can often be illustrated best using graphs, although this isn’t appropriate for all functions.
Example 5.9
(i) The function f : R → R given by f (x) = x² is not onto, because in this case the image set is the set of
non-negative real numbers, not the whole set R. We could write the image set as R+ ∪ {0}, or R⁺₀ or,
much easier, [0, ∞).
(ii) The function f : R → R⁺₀, f (x) = x², is onto.
(iii) The function in (i) above is not one-one: for example, f (2) = 4 and f (−2) = 4 (counter-example).
(iv) The function f : (0, ∞) → R, where f (x) = x², is one-one.
Proof (Take note of the method of proof used here; also see parts (vi) and (vii).)
We have to show that if f (a) = f (b) for real a, b > 0, then a = b.
Consider f (b) − f (a) = b² − a² = (b − a)(b + a) = 0.
Now (b + a) > 0, since both a and b are positive, so b − a = 0 and therefore a = b.
OR f (a) = f (b) ⇒ a² = b² ⇒ a = ±b. Since neither a nor b is negative, this implies that a = b.
(v) Let X = {1, 2, 3, 4}, Y = {a, b, c, d}, then f : X → Y given by f (1) = b, f (2) = c, f (3) = a, f (4) = d is a
bijection since it is both one-one and onto.
(vi) The function f : R → R given by f (x) = x + 1 is a bijection.
It is onto because the image set f (R) is the whole of the real numbers R.
It is one-one because if, for any a, b ∈ R, f (a) = f (b),
then a + 1 = b + 1 ⇒ a = b.
(vii) Let Y = {x ∈ R : x ≠ 0} = R \ {0}.
The function f : Y → Y given by f (x) = $\frac{1}{x}$ is a bijection.
It is onto because the image set is Y.
It is one-one because for all a, b ∈ Y,
f (a) = f (b) ⇒ $\frac{1}{a} = \frac{1}{b}$ ⇒ a = b (≠ 0).
5.7 Composition of functions
Definition 5.10 (Composite function) Given two functions f : A → B and g : C → D such that f (A) ⊆ C,
we can form the composite function (g ◦ f ) : A → D by defining (g ◦ f )(a) = g(f (a)) for all a ∈ A.
Example 5.11
(i) Suppose f : R → R is given by f (x) = 3x² + 2 and g : R → R is given by g(x) = 5x.
(a) Clearly the image set of f is {x ∈ R : x ≥ 2}, which is a subset of R, the domain of g, so g ◦ f can
be formed.
g ◦ f (x) = g(f (x)) = 5f (x) = 5(3x² + 2) = 15x² + 10.
(b) Similarly, the image set of g is the whole of R, which is the same as the domain of f , so f ◦ g can
be formed.
f ◦ g(x) = f (g(x)) = f (5x) = 3(5x)² + 2 = 75x² + 2.
(ii) Let A be the set of even integers, ie A = {x ∈ Z : x = 2m, for some m ∈ Z}.
Let f : Z → A be given by f (x) = 2x and g : A → Z be given by g(x) = x².
(a) The image set of f is A, so g ◦ f can be formed.
g ◦ f (x) = g(f (x)) = g(2x) = (2x)² = 4x².
(b) The image set of g is {4m² : m ∈ Z}, which is a subset of Z, the domain
of f , so f ◦ g can be formed.
f ◦ g(x) = f (g(x)) = f (x²) = 2x².
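The two composites in Example 5.11(i) can be checked numerically. This is a small Python sketch; the helper name compose is an assumption made for illustration, and integer sample points are used so that the comparisons are exact.

```python
def compose(g, f):
    """Return the composite g ◦ f, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))

f = lambda x: 3 * x**2 + 2   # f(x) = 3x^2 + 2 from Example 5.11(i)
g = lambda x: 5 * x          # g(x) = 5x

g_after_f = compose(g, f)    # should agree with 15x^2 + 10
f_after_g = compose(f, g)    # should agree with 75x^2 + 2

sample_points = range(-5, 6)
```

Note that the two composites differ, so the order of composition matters.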
5.8 Inverse Functions
The inverse function f⁻¹ (where it exists) reverses the effect of f.
That is, if f (x) = y, then f⁻¹(y) = x. In other words f⁻¹(f (x)) = x.
More formally:
Definition 5.12 (Inverse function) Let X and Y be sets, and suppose that f is a function from X to Y .
We say that f is invertible if there exists a function f⁻¹ : Y → X such that:
(i) f⁻¹ ◦ f (x) = x for all x ∈ X,
(ii) f ◦ f⁻¹(y) = y for all y ∈ Y .
We then say that f⁻¹ is the inverse of f .
(i) says that f⁻¹ ◦ f : X → X is the identity function on X.
(ii) says that f ◦ f⁻¹ : Y → Y is the identity function on Y .
Theorem 5.13 f : X → Y is invertible if and only if f is a bijection, (one-one and onto).
Proof (Omitted.)
Examples 5.14
(i) Let X, Y be as in Example 5.9(v). Then f⁻¹(a) = 3, f⁻¹(b) = 1, f⁻¹(c) = 2, f⁻¹(d) = 4.
(ii) Refer to Example 5.9(i) and (ii). Neither of these functions has an inverse.
Suppose f : [0, ∞) → [0, ∞), where f (x) = x². Then writing y = x² we obtain x = $\sqrt{y}$. So we write
f⁻¹ : [0, ∞) → [0, ∞), f⁻¹(x) = $\sqrt{x}$
(X and Y are the same in this case).
(iii) Refer to Example 5.9(vi): f : R → R, f (x) = x + 1. Write y = x + 1, then x = y − 1 and the inverse
function is
f⁻¹ : R → R, f⁻¹(x) = x − 1
(Check: f ◦ f⁻¹(x) = f (f⁻¹(x)) = f (x − 1) = (x − 1) + 1 = x).
(iv) f : R → R+ , f (x) = $e^x$. Write y = $e^x$, then x = ln y, so
f⁻¹ : R+ → R, f⁻¹(x) = ln x
(Check: f⁻¹(f (x)) = f⁻¹($e^x$) = ln($e^x$) = x; similarly, f (f⁻¹(x)) = f (ln x) = $e^{\ln x}$ = x).
(v) Y = R \ {0}
f : Y → Y , where f (x) = $\frac{1}{x}$. Write y = $\frac{1}{x}$, then x = $\frac{1}{y}$, so
f⁻¹ : Y → Y, f⁻¹(x) = $\frac{1}{x}$.
This function is self inverse.
Much of this is easier to grasp with the help of diagrams and graphs.
The graph of f⁻¹ is just the graph of f reflected in the line y = x.
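For a bijection between finite sets, the inverse is obtained simply by reversing every pair. The sketch below, in Python, assumes the function is stored as a dictionary; the helper name inverse is illustrative. It uses the bijection of Example 5.9(v) and checks both conditions of Definition 5.12.

```python
def inverse(f, codomain):
    """Return the inverse of f (a dict) as a dict, or raise if f is not a bijection."""
    values = list(f.values())
    one_one = len(set(values)) == len(values)
    onto = set(values) == set(codomain)
    if not (one_one and onto):
        raise ValueError("f is not a bijection, so it has no inverse")
    return {y: x for x, y in f.items()}

f = {1: "b", 2: "c", 3: "a", 4: "d"}          # Example 5.9(v)
f_inv = inverse(f, {"a", "b", "c", "d"})
```

Swapping the coordinates of each pair (x, y) is the finite analogue of reflecting the graph in the line y = x.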
Examples 5.15 Find inverse functions, if they exist.
(i) f : R → R, f (x) = x³
(ii) f : R− → R+ , f (x) = x²
(iii) f : R+ → R− , f (x) = $-\sqrt{x}$
5.9 Real-valued functions
Many of the functions that we shall discuss in this part of the course will be real valued functions of a real
variable. By ‘of a real variable’ we mean that the domain of the function is R, or some proper subset of R,
(for example, some interval of the real line) and ‘real valued function’ means that the image set is such that
f (x) ∈ R for all x in the domain of f : the codomain will always be taken to be R. Note that we cannot always
take the domain to be the whole of R, so in general, f : A → R, where A ⊆ R.
A real-valued function f of a real variable x is
(i) even if f (−x) = f (x) for all x in the domain of f ,
(ii) odd if f (−x) = −f (x) for all x in the domain of f .
Most functions are neither even nor odd.
[Figure: three example graphs, labelled Even, Odd and Neither.]
The product and the quotient of two even functions, or of two odd functions, are even.
The product and the quotient of an even function and an odd function are odd.
(Note that the domain of any quotient function will have to exclude any values for which the denominator
is zero). In order to refer sensibly to a function as being even or odd, we have to assume that its domain is
symmetric about the origin.
5.10 An alternative definition of function (non-examined)
Let X and Y be sets. A function f : X −→ Y is a subset S of the Cartesian product X × Y such that
(i) for each x ∈ X, (x, y) ∈ S for some y ∈ Y , ‘there is a y corresponding to each x’.
(ii) if (x, y₁) ∈ S and (x, y₂) ∈ S where x ∈ X and y₁ ∈ Y , y₂ ∈ Y , then y₁ = y₂, ‘for each x there corresponds
only one y’.
Example 5.16
(i) Consider the function from N to N given by S = {(1, 2), (2, 4), (3, 6), . . .}.
Clearly, S ⊆ N × N, that is, S is a subset of the Cartesian product N × N. (Check that S satisfies the
conditions above). This function corresponds to the function f : N → N given by f (x) = 2x under our
previous definition.
(ii) Let X = {a, b, c} and Y = {d, e, f, g}. Then S = {(a, d), (b, d), (c, e)} is a function. The function
corresponds to f (a) = d, f (b) = d, f (c) = e.
(iii) Let X = {1, 2, 3} and Y = {6, 7, 8}. Then S = {(1, 7), (2, 8)} is not a function because it does not satisfy
condition (i): there is no pair (3, y) in S.
(iv) Let X and Y be as in (iii) above. Then S = {(1, 7), (1, 8), (2, 6), (3, 6)} is not a function because it does
not satisfy condition (ii). Both y = 7 and y = 8 correspond to x = 1.
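Conditions (i) and (ii) translate directly into a check on a finite set of ordered pairs. The following Python sketch assumes S is given as a set of (x, y) tuples; the name is_function and the test sets X2, Y2 are illustrative assumptions, with the two failing relations modelled on parts (iii) and (iv) above.

```python
def is_function(S, X, Y):
    """Check whether the set S of (x, y) pairs defines a function from X to Y."""
    S = set(S)
    if not all(x in X and y in Y for x, y in S):
        return False                                  # S is not a subset of X × Y
    firsts = [x for x, _ in S]
    covers_domain = set(firsts) == set(X)             # condition (i)
    single_valued = len(firsts) == len(set(firsts))   # condition (ii)
    return covers_domain and single_valued

# Example 5.16(ii)
X1, Y1 = {"a", "b", "c"}, {"d", "e", "f", "g"}
S1 = {("a", "d"), ("b", "d"), ("c", "e")}

# Illustrative relations (assumed sets X2 and Y2)
X2, Y2 = {1, 2, 3}, {6, 7, 8}
S_missing = {(1, 7), (2, 8)}                  # no pair starting with 3: fails (i)
S_double = {(1, 7), (1, 8), (2, 6), (3, 6)}   # two values for x = 1: fails (ii)
```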
Example 5.17 Let X = {1, 2, 3, 4} and Y = {a, b, c}.
(i) Are the following subsets of X × Y functions or not?
(a) S = {(1, a), (2, a), (3, b), (4, c)}
(b) S = {(1, a), (2, b), (4, c)}
(c) S = {(1, a), (1, b), (2, c), (3, c), (4, b)}
(ii) How many subsets of X × Y are there?
(iii) How many of these are functions?
(iv) How many are onto, how many are one-one, and how many are bijections?
6 Counting
6.1 Finite Sets
If the set is finite, then counting how many elements are in the set is theoretically a straightforward process,
even if not so in practice.
Example 6.1 Consider the finite sets in Examples 1.2 and 1.4 (in Section 1).
(i) A = {−3, 137, 1391}
(ii) B = {the people living in Coventry}
(iii) C = {x ∈ R : x² + 3x + 2 = 0}
(iv) D = {b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z}
(v) E = {a, e, i, o, u}
(vi) F = {Everest, K2, Kangchenjunga, Lhotse}
In each case, you would probably count by working systematically from left to right. There is a bijection
between A and the set {1, 2, 3}, between C and {1, 2}, between D and the set {1, . . . , 21}, and between E and
{1, 2, 3, 4, 5}.
The population of Coventry is continually and unpredictably fluctuating, so it’s difficult to say exactly what the
cardinality of B is at any given time. It is, however, finite, and in 2006 was estimated to be just under 310 000.
There are four mountains on Earth with heights above 8 500m, so there is a bijection between F and the set
{1, 2, 3, 4}.
Definition 6.2 A set X is said to be finite if either X = ∅, or for some n ∈ N there is a bijection between X
and the set {1, 2, . . . , n}. In the latter case X is said to have size or cardinality |X| = n; the empty set has
cardinality 0.
6.2 Infinite sets
Definition 6.3 A set X is said to be infinite if it is not finite.
Example 6.4 N, Z, Q, R and C are all infinite (why?) but N ⊂ Z ⊂ Q ⊂ R ⊂ C, so is R more infinite than N,
say?
6.3 Countable Sets
Definition 6.5 A set X is said to be countable if either X is finite or there exists a bijection from N to
X, (that is, there is an exact match between the two sets). A set X is said to be uncountable if it is not
countable.
Example 6.6
(i) X = {2, 4, 6, . . .} is countable because f : N → X, given by f (n) = 2n for all n ∈ N, is a bijection.
(ii) Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} is countable, because we can define a function f : N → Z, given by
$f(n) = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even,} \\ \frac{1-n}{2} & \text{if } n \text{ is odd.} \end{cases}$
f (1) = 0, f (2) = 1, f (3) = −1, f (4) = 2, f (5) = −2 and so on. f is a bijection (why?).
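The piecewise rule above is easy to tabulate. This Python sketch (the function name f mirrors the notes; the range tested is an arbitrary choice) computes the first few values and checks, on a finite initial segment only, that no integer is hit twice and none is missed. That is evidence for, not a proof of, the bijection.

```python
def f(n):
    """Map the natural number n >= 1 to an integer: evens go to n/2, odds to (1 - n)/2."""
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    return n // 2 if n % 2 == 0 else (1 - n) // 2

first_values = [f(n) for n in range(1, 6)]

# On n = 1, ..., 2001 the values should be exactly the integers -1000, ..., 1000.
segment = [f(n) for n in range(1, 2002)]
```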
Lemma 6.7 A subset of a countable set is itself countable.
Example 6.8
(i) The odd numbers are countable.
(ii) The prime numbers are countable.
(iii) The set of all natural numbers greater than 100 is countable.
(iv) Any subset of the natural numbers is countable, and any infinite set of natural numbers has the same
cardinality as N itself.
Definition 6.9 A set X which is infinite but countable is said to have cardinality ℵ0 (aleph null, aleph nought
or aleph zero).
Theorem 6.10 The set of rational numbers Q is countable.
Proof We first consider the positive rationals
Q+ = {q ∈ Q : q > 0}.
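We display these numbers as follows, with numerators across the top and denominators down the side, then follow the arrow pattern:
       1     2     3     4     5     6    . . .
 1    1/1   2/1   3/1   4/1   5/1   6/1   . . .
 2    1/2   2/2   3/2   4/2   5/2   6/2   . . .
 3    1/3   2/3   3/3   4/3   5/3   6/3   . . .
 4    1/4   2/4   3/4   4/4   5/4   6/4   . . .
 5    1/5   2/5   3/5   4/5   5/5   6/5   . . .
 6    1/6   2/6   3/6   4/6   5/6   6/6   . . .
 . . .
The arrows start at 1/1 and zigzag along the diagonals: down to 1/2, diagonally up to 2/1, across to 3/1,
diagonally down through 2/2 to 1/3, down to 1/4, diagonally up through 2/3 and 3/2 to 4/1, across to 5/1,
and so on.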
We create the following list:
1/1, 1/2, 2/1, 3/1, 2/2, 1/3, 1/4, 2/3, 3/2, 4/1, . . . etc
In this list will be some repetitions, because for example,
2/2 ≡ 1/1, 1/2 ≡ 2/4 ≡ 3/6 ≡ . . . etc
Rewrite the list, deleting any entry if it has occurred before, and we get
1/1, 1/2, 2/1, 3/1, 1/3, 1/4, 2/3, 3/2, 4/1, . . .
We can now set up a bijection between N and this (revised) list.
If aₙ is the rational number which is the nth member of this list, then
f : N → Q+ , given by f (n) = aₙ,
is a bijection, and Q+ is countable. We can then extend the list to the whole of Q:
0, a₁, −a₁, a₂, −a₂, . . .
is a complete list of all the rationals. Therefore the function f : N → Q given by
f (1) = 0, f (2n) = aₙ, f (2n + 1) = −aₙ, for all n ∈ N,
is a bijection. Therefore Q is countable.
Remark Although we do not have an explicit formula for the bijection, we have an explicit prescription for
constructing it:
n:      1    2    3    4     5     6    7    8    9    10    11    . . .
f (n):  0    1   −1   1/2  −1/2    2   −2    3   −3   1/3  −1/3   . . .
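The prescription in the proof of Theorem 6.10 can be carried out by a short program. The sketch below is one possible Python rendering, under the assumption that the zigzag visits the diagonals p + q = 2, 3, 4, . . . in alternating directions (p is the numerator, q the denominator); the generator names are illustrative. A fraction is skipped when it is not in lowest terms, which is the same as deleting entries that have occurred before.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def positive_rationals():
    """Yield each positive rational once, following the zigzag along the diagonals p + q = s."""
    for s in count(2):
        # Odd diagonals are traversed upwards (p increasing), even ones downwards.
        numerators = range(1, s) if s % 2 == 1 else range(s - 1, 0, -1)
        for p in numerators:
            q = s - p
            if gcd(p, q) == 1:          # skip repeats such as 2/2 = 1/1
                yield Fraction(p, q)

def all_rationals():
    """Yield 0, a1, -a1, a2, -a2, ... as in the final step of the proof."""
    yield Fraction(0)
    for a in positive_rationals():
        yield a
        yield -a

first_positive = list(islice(positive_rationals(), 9))
first_all = list(islice(all_rationals(), 11))
```

The first eleven values of all_rationals() reproduce the table in the Remark.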
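Theorem 6.11 R is not countable, that is, it is uncountable. There is no bijection from N to R.
Proof (Omitted.)
Remark Since R is not the same size as N it must have its own cardinality. R, or any other set X for which
there is a bijection from R to X, is said to have the cardinality of the continuum, written $2^{\aleph_0}$. (The statement
that this equals ℵ1 (aleph one), the next cardinal after ℵ0, is the continuum hypothesis, which can be neither
proved nor disproved from the standard axioms of set theory.)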
7 Limits
First we present an informal definition:
Definition 7.1 (informal) If we can make f (x) as close as we like to a real number ℓ, by making x sufficiently
close to, but not equal to, the real number a, we say that f (x) approaches or tends to the limit ℓ as x tends
to a. We write either
f (x) → ℓ as x → a
or
$\lim_{x \to a} f(x) = \ell$
This definition means that the ‘distance’ between f (x) and ℓ can be as small as any prescribed number provided
that the ‘distance’ between x and a is small enough. Mathematically this is written as follows:
Definition 7.2 (formal) For a real valued function f of a real variable x, f (x) tends to the limit ℓ when
x → a if for every choice of ε > 0, however small, there is a corresponding number δ > 0 such that
if 0 < |x − a| < δ then |f (x) − ℓ| < ε.
7.1 Some well-behaved functions
There are many functions for which the limit as x tends to a is the value of the function at x = a.
Example 7.3 Consider the functions f : R → R defined by:
(i) f (x) = 1
(ii) f (x) = 5x²
(iii) f (x) = sin x
(iv) f (x) = $e^x$
Such functions are called continuous functions. See Section 8 for more on this.
Example 7.4 Consider the function g : R \ {1} → R defined by $g(x) = \frac{x^2 - 1}{x - 1}$. What happens when x gets
closer and closer to 1? Note that in this example g(1) is not defined, nor is there any value of x which makes
g(x) = 2.
Example 7.5 Consider the following functions:
(i) $g(x) = \frac{x}{x}$
(ii) $g(x) = \begin{cases} x & x \neq 0 \\ 1 & x = 0 \end{cases}$
(iii) $g(x) = \frac{x}{|x|}$
7.2 Some general properties of limits
If f and g are real valued functions of a real variable x and if f (x) → ℓ and g(x) → m as x → a then
(i) k f (x) → kℓ, where k is a constant;
(ii) f (x) ± g(x) → ℓ ± m;
(iii) f (x)g(x) → ℓm;
(iv) $\frac{1}{f(x)} \to \frac{1}{\ell}$, provided ℓ ≠ 0;
(v) $\frac{f(x)}{g(x)} \to \frac{\ell}{m}$, provided m ≠ 0.
7.3 Further properties of limits
(i) The Sandwich Rule (important)
Suppose f (x), g(x), h(x) are defined on some domain and they satisfy the inequality
g(x) ≤ f (x) ≤ h(x),
that is, f is always sandwiched between g and h. If both g(x) and h(x) tend to the same limit ℓ as
x → a, then f (x) → ℓ as x → a.
(ii) If f (x) → ℓ and g(x) → m as x → a and f (x) ≤ g(x) in some neighbourhood of a, then ℓ ≤ m.
7.4 An important limit (1)
Let R∗ = R \ {0}. Consider the function f : R∗ → R defined by $f(x) = \frac{\sin x}{x}$.
The domain has to be R∗ because putting x = 0 leads to $\frac{0}{0}$, which is indeterminate or undefined.
We want to find out what happens to f as x gets close to zero.
Using trigonometry, we can show that
$\sin x < x < \tan x$ for $0 < x < \frac{\pi}{2}$ (where x is measured in radians),
then dividing by sin x
$1 < \frac{x}{\sin x} < \frac{1}{\cos x}$ for $0 < x < \frac{\pi}{2}$,
or
$\cos x < \frac{\sin x}{x} < 1$.
We now let x decrease towards zero, or ‘approach zero through positive values’, which is usually written x → 0+ ,
then cos x increases towards 1.
Hence $\frac{\sin x}{x}$ also increases towards 1, since it is ‘sandwiched’ between cos x and 1.
This result is also true when x approaches zero through negative values, which we write x → 0− , because $\frac{\sin x}{x}$
is an even function.
We can therefore say that $\frac{\sin x}{x} \to 1$ as x → 0 in any manner, or
$\lim_{x \to 0} \frac{\sin x}{x} = 1.$
This limit is important because it is used in proving that the derivative of sin x is cos x.
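The sandwich used in this argument can be observed numerically. The short Python sketch below tabulates cos x, (sin x)/x and 1 for a few positive values of x shrinking towards 0 (the sample values are an arbitrary choice, and the name sandwich is illustrative). It illustrates the inequality in floating-point arithmetic; it does not replace the trigonometric proof.

```python
import math

def sandwich(x):
    """Return the triple (cos x, sin x / x, 1) for a non-zero x in radians."""
    return math.cos(x), math.sin(x) / x, 1.0

# Each row is (x, cos x, sin x / x, 1).
rows = [(x, *sandwich(x)) for x in (1.0, 0.5, 0.1, 0.01, 0.001)]

for x, lower, middle, upper in rows:
    print(f"x = {x:<6} cos x = {lower:.9f}  sin x / x = {middle:.9f}  upper = {upper}")
```

As x decreases, both the lower bound cos x and the middle value climb towards 1, with the middle value staying strictly below 1.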
[Figure 7.1: $f(x) = \frac{\sin x}{x}$]
Remark In the above example we have said nothing about whether or not $\frac{\sin x}{x}$
(i) takes the value 1 for some value of x, or
(ii) has a value when we put x = 0.
In fact $\frac{\sin x}{x}$ is indeterminate when x = 0 as we have stated above. Also we did not need to use any value for
f (0) at any stage of our calculation of the limit. There is no value of x for which $\frac{\sin x}{x} = 1$ and we can say that
the limit is unattained.
This means that when x becomes close to 0, $\frac{\sin x}{x}$ becomes close to 1, and we can make $\frac{\sin x}{x}$ as close as we
please to 1 for all values of x sufficiently close to 0.
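Example 7.6 $\lim_{x \to 0} \frac{\sqrt{1 + x} - 1}{x}$.
Example 7.7 $\lim_{x \to 0} \frac{\tan x - \sin x}{\sin^3 x}$ (remember $\sin^2 x + \cos^2 x = 1$).
Example 7.8 We know that $\frac{\sin x}{x} \to 1$ and (x + 2) → 2 as x → 0. Then
(i) $\frac{5 \sin x}{x} \to$
(ii) $\frac{\sin x}{x} + x + 2 \to$
(iii) $\frac{\sin x \,(x + 2)}{x} \to$
(iv) $\frac{x}{\sin x} \to$ and $\frac{1}{x + 2} \to$
(v) $\frac{\sin x}{x(x + 2)} \to$
7.5
$\lim_{x \to a} \frac{x^m - a^m}{x - a} = m a^{m-1}$, m ∈ Q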
This is easy to prove when m ∈ Z+ = N, and then for m ∈ Z− .
Proof
(i) Let m ∈ N.
Now $x^m - a^m = (x - a)(x^{m-1} + x^{m-2}a + x^{m-3}a^2 + \cdots + a^{m-1})$.
If x ≠ a, then $g(x) = \frac{x^m - a^m}{x - a} = x^{m-1} + x^{m-2}a + x^{m-3}a^2 + \cdots + a^{m-1}$ (m terms).
Letting x → a, then $g(x) \to a^{m-1} + a^{m-2}a + a^{m-3}a^2 + \cdots + a^{m-1} = m a^{m-1}$.
(ii) Now let m ∈ Z− , then we can write m = −n where n ∈ N (and we take a ≠ 0).
Then $x^m - a^m = x^{-n} - a^{-n} = \frac{1}{x^n} - \frac{1}{a^n} = \frac{a^n - x^n}{x^n a^n}$, so by part (i)
$\frac{x^m - a^m}{x - a} = -\frac{x^n - a^n}{x^n a^n (x - a)} \to -\frac{n a^{n-1}}{a^{2n}} = -n a^{-n-1} = m a^{m-1}$.
Thus the result holds for all non-zero m ∈ Z; the case m = 0 is immediate, since both sides are then 0.
The proof is trickier when m is not an integer. This limit is important in proving that the derivative of $x^n$ is
$n x^{n-1}$.
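The limit can be sampled numerically for a positive integer, a negative integer and a non-integer rational exponent. This Python sketch is only a floating-point illustration; the point a = 2, the step sizes and the function names are arbitrary illustrative choices.

```python
def difference_quotient(x, a, m):
    """Evaluate (x^m - a^m) / (x - a) for x != a."""
    return (x**m - a**m) / (x - a)

def predicted_limit(a, m):
    """The claimed limit m * a^(m - 1)."""
    return m * a**(m - 1)

a = 2.0
exponents = (3, -2, 0.5)
steps = (1e-2, 1e-4, 1e-6)

# For each exponent, the quotient at x = a + h for shrinking h.
samples = {m: [difference_quotient(a + h, a, m) for h in steps] for m in exponents}
```

For each exponent the sampled quotients settle towards predicted_limit(a, m), for instance towards 12 when m = 3 and towards −0.25 when m = −2.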
Remarks
(i) The phrase ‘x tends to a’ implies that x must be allowed to approach a from both sides, ie from below
(x → a−) and from above (x → a+), and that we obtain the same answer from both sides.
(ii) f (x) does not necessarily approach ℓ steadily as x → a, as is the case for $\frac{\sin x}{x}$ when x → 0 (see Figure 7.1).
Consider the function f given by $f(x) = x \sin\frac{1}{x}$ (see Figure 7.2).
[Figure 7.2: $f(x) = x \sin\frac{1}{x}$]
The function is even because it is the product of x and a sine function, which are both odd. We find that
f is positive for some values of x near 0, and negative for others. What is critical is that the numerical
difference between f and ℓ should get smaller and smaller, that is, it should be as small as we like,
provided we are close enough to 0.
(iii) The difference between f and ℓ, as described above, must be as small as we require for all values of x
sufficiently near a.
Consider the function $f(x) = \sin\frac{1}{x}$ (see Figure 7.3).
[Figure 7.3: $f(x) = \sin\frac{1}{x}$]
Now sin x = 0 when x = nπ, so $\sin\frac{1}{x} = 0$ when $x = \frac{1}{n\pi}$, n ∈ Z, n ≠ 0, and is as small as we like if x is
sufficiently close to one of these numbers. However, the function does not have a limit as x → 0,
because it does not stay close to 0 for all small values of x.
(iv) In the definition of limit, no mention is made of f (a), the value of f at x = a, indeed the function may
not even be defined when x = a. So when we say ‘all values of x sufficiently close to a’, we mean all
values of x in some small interval about x = a, but excluding the value a itself. This interval may be
written in the form a − δ < x < a + δ, for x ≠ a or, better,
0 < |x − a| < δ.
(|x − a| is the ‘distance’ between ‘x’ and ‘a’). The interval is called a neighbourhood of a.
(v) If f (x) = k, a constant, for all x in some neighbourhood of a, then $\lim_{x \to a} f(x) = k$.
7.6 Some more properties and examples
(i) If f (x) → 0 as x → a, then f (x) is as small as we please for all x sufficiently close to a.
In this case, $\frac{1}{f(x)}$ will not tend to a limit, but will be numerically large as x → a.
Let f (x) = x². Then when x is small (positive or negative), x² is small and positive, and $\frac{1}{x^2}$ is large and
positive. $\frac{1}{x^2}$ can be made as large as we please provided x is sufficiently close to 0.
We say that $\frac{1}{x^2}$ tends to infinity when x tends to 0, and write
$\frac{1}{x^2} \to \infty$ as x → 0.
Similarly, if $g(x) = \frac{1}{x}$, then as x → 0+ , g(x) → +∞, and as x → 0− , g(x) → −∞.
(ii) NB When we say that f (x) tends to the limit ℓ as x → a, we shall understand ℓ to be finite. We shall
not say that $\frac{1}{x^2}$ tends to a limit as x → 0.
In other words, a limit is by definition a finite limit.
(iii) $f(x) = \frac{1}{x} \sin\frac{1}{x}$ (see Figure 7.4).
[Figure 7.4: $f(x) = \frac{1}{x} \sin\frac{1}{x}$]
The function $\sin\frac{1}{x}$ assumes all values between 1 and −1, however close we get to x = 0.
It follows that $f(x) = \frac{1}{x} \sin\frac{1}{x}$ can take any value, positive or negative, however large.
But f (x) does not remain large for all values of x near x = 0, eg it is zero when $x = \frac{1}{n\pi}$, for non-zero n ∈ Z, and
so does not tend to ±∞ as x → 0.
We say that
(a) $\sin\frac{1}{x}$ oscillates finitely between −1 and +1 (see Figure 7.3),
(b) $\frac{1}{x} \sin\frac{1}{x}$ oscillates infinitely as x → 0 (see Figure 7.4).
(iv) (a) The functions $\frac{1}{x}$, $\frac{1}{x^2}$ and $\frac{1}{x} \sin\frac{1}{x}$ are all unbounded near x = 0, because they take arbitrarily
large values (positive or negative) as x → 0.
(b) $\sin\frac{1}{x}$ is said to be bounded near x = 0 because it lies between two fixed numbers for all small
values of x.
The terms ‘bounded’ and ‘unbounded’ can be applied to the behaviour of any function in the neighbourhood
of any number a.
Theorem 7.9 If f (x) → ℓ as x → a, then f (x) is bounded near x = a. (ℓ is finite and when x is close to a,
f (x) is close to ℓ).
7.7 Limits when x → ±∞
If x becomes sufficiently large, $g(x) = \frac{1}{x}$ becomes small and positive, since it can be made as small as we like
provided that x is large enough. Similarly, $f(x) = \frac{\sin x}{x}$ can be made as small as we please for all x sufficiently
large (see Figures 7.1–7.4). We say that $\frac{1}{x}$, $\frac{1}{x^2}$ and $\frac{\sin x}{x}$ tend to 0 as x tends to infinity.
Definition 7.10 (informal) If the values of f (x) can be made as close as we like to the number ℓ by making
x sufficiently large, we say that f (x) tends to the limit ℓ as x tends to infinity. We write
f (x) → ℓ as x → ∞, or $\lim_{x \to \infty} f(x) = \ell$.
We can often find out what happens as x → ∞ by writing $x = \frac{1}{t}$ and then letting t → 0+.
Example 7.11 Consider the behaviour of $f(x) = \frac{x^2 + 1}{x + 1}$ as x → −1 and as x → ±∞.
Example 7.12
(i) $\lim_{x \to \infty} \frac{1}{x^\alpha}$, α > 0
(ii) $\lim_{x \to \infty} \frac{x^2 - 4x + 3}{x^3 - x^2 + x - 1}$
(iii) $\lim_{x \to \infty} \sqrt{x}\left(\sqrt{x + a} - \sqrt{x}\right)$
8 Continuity
We want a mathematically precise concept of a continuous function. Intuitively, the graph of such a function
f (x) should have no breaks or jumps in it. That is, at any given point, the value of the function should be
equal to the limit of the function (from both sides):
Definition 8.1 (Continuity at a point)
The function f (x) is continuous at x = a if limx→a f (x) = f (a). If f (x) either does not have a well-defined
limit at x = a, or if the limit is not equal to f (a) then we say that f (x) is discontinuous at x = a.
We therefore see that f has to be defined at a as well as in the neighbourhood of a.
Examples 8.2
(i) f (x) = |x|, at x = 0.
(ii) $g(x) = \begin{cases} x^2 + 1 & x \ge 1, \\ 2x - 3 & x < 1. \end{cases}$
(iii) $h(x) = \frac{x^2 - a^2}{x - a}$, for x ≠ a; h(a) = 2a.
We are particularly interested in functions which are continuous for most, if not all, of their domain:
Definition 8.3 (Continuity in an interval)
If f is continuous at each point of an open interval (a, b), then f is continuous in (a, b). For a closed interval
[a, b], we add the end conditions
$\lim_{x \to a^+} f(x) = f(a)$ and $\lim_{x \to b^-} f(x) = f(b)$.
Theorem 8.4 The sum, difference, product and quotient of continuous functions are also continuous functions
(except for the zeros of the denominator in the case of quotients).
Care must be taken with domains:
The domain of a sum, difference or product is the intersection of the individual domains.
The domain of a quotient is the intersection of the individual domains, minus any values for which the denominator
is zero.
Examples 8.5 (Some continuous functions)
(i) f : R → R, $f(x) = x^n$, n ∈ N.
(ii) g : R∗ → R, $g(x) = \frac{1}{x^n}$, n ∈ N.
(iii) Polynomials:
p : R → R given by $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$, $a_i$ ∈ R, n ∈ N.
(iv) Rational functions, that is, functions which are quotients of polynomials, $\frac{p(x)}{q(x)}$. The domain excludes
any values of x for which q(x) = 0.
(v) sin(x), continuous for all x ∈ R.
(vi) cos(x), continuous for all x ∈ R.
(vii) exp(x) = $e^x$, continuous for all x ∈ R.
(viii) ln(x), continuous for all x ∈ R with x > 0.
Theorem 8.6 (Continuity of composite functions)
If f : R → R is a continuous function of x at x₁ and g : R → R is also a continuous function of t at t₁, where
x₁ = g(t₁), then F (t) = f (g(t)) is continuous at t = t₁.
Theorem 8.7 (Boundedness of continuous functions on closed intervals)
If f is continuous on [a, b], then it is also bounded on [a, b].
This means that there are constants M and m such that m ≤ f (x) ≤ M , and it attains its bounds (its bounding
values). That is, there are numbers c, d ∈ [a, b] such that f (c) = m and f (d) = M .
Examples 8.8
(i) y = sin(x), and −1 ≤ sin(x) ≤ 1. Bounds are ±1, which are attained.
(ii) $\varphi(x) = \frac{1}{x}$, for 0 < x ≤ 1, and φ(0) = 0. φ is finite in [0, 1], but it is not continuous on this closed interval.
It tends to +∞ as x → 0+, and is not bounded.
(iii) $f(x) = \frac{\sin(x)}{x}$, where $0 < x \le \frac{\pi}{2}$. An upper bound is 1, but $\frac{\sin(x)}{x}$ never takes the value 1 at any point
within the given interval, and it is not continuous on $[0, \frac{\pi}{2}]$ since it is not defined at x = 0.
(iv) $g(x) = \begin{cases} \frac{1}{x}\sin(x) & x \neq 0, \\ 1 & x = 0, \end{cases}$ for $x \in [0, \frac{\pi}{2}]$.
Now, the function is defined at x = 0. We see that M = 1 and $m = \frac{2}{\pi}$, and that $\lim_{x \to 0} \frac{1}{x}\sin(x) = 1 = g(0)$,
hence g is continuous.
Theorem 8.9 If f is continuous and both limits exist, then
$\lim_{x \to a} f(g(x)) = f\left(\lim_{x \to a} g(x)\right)$
Examples 8.10
(i) $\lim_{x \to 1} \sin\left(\frac{x^2 - 1}{x - 1}\right)$
(ii) $\lim_{x \to \infty} e^{1/x}$
8.1 The Intermediate Value Theorem
Theorem 8.11 (Simple version) If f is continuous on [a, b] and f (a) and f (b) have opposite signs, then
there is at least one point c ∈ (a, b) at which f (c) = 0.
Theorem 8.12 (General version) If f is continuous on [a, b] and f (a) = α and f (b) = β, then for any
number γ ∈ (α, β) there is at least one number c ∈ (a, b) at which f (c) = γ.
Both versions of these theorems together are called the Intermediate Value Theorem (IVT). They are very
useful, and in fact are just common sense.
Example 8.13 Show that the polynomial equation p(x) = x⁵ − 5x + 2 = 0 has at least 3 real roots.
By experimentation, we find that p(−2) = −20, p(−1) = 6, p(0) = 2, p(1) = −2, p(2) = 24, so by the IVT there must be a
root in each of the intervals (−2, −1), (0, 1) and (1, 2).
There could be more. How many?
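The experimentation in Example 8.13 amounts to looking for sign changes between consecutive test points. A minimal Python sketch follows, with the helper name sign_change_intervals being an illustrative assumption; by the simple version of the IVT, each interval it reports contains at least one root of the continuous function p.

```python
def p(x):
    return x**5 - 5 * x + 2

def sign_change_intervals(f, points):
    """Return the consecutive pairs (a, b) from points at which f changes sign."""
    return [(a, b) for a, b in zip(points, points[1:]) if f(a) * f(b) < 0]

test_points = [-2, -1, 0, 1, 2]
values = [p(x) for x in test_points]
intervals = sign_change_intervals(p, test_points)
```

A sign change guarantees a root but the absence of one guarantees nothing, so this search gives only a lower bound on the number of real roots.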
Corollary 8.14 If f is continuous on [a, b] and is such that a ≤ f (x) ≤ b whenever a ≤ x ≤ b, then the
equation f (x) = x has a real root in [a, b].
8.2 Numerical methods for solving f(x)=0
We now examine three different numerical methods for solving equations of the form f (x) = 0.
8.2.1 The Newton–Raphson method
For this method, we iteratively calculate a sequence of approximate solutions given by the recurrence relation
$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$.
Newton’s method does not always work in finding a specific root. It may converge to the ‘wrong’ root if you
choose the ‘wrong’ initial value.
Example 8.15 Find, to an accuracy of one decimal place, a root of the quintic polynomial p(x) = x⁵ − 5x + 2.
We first note that p(0) = 2 and p(1) = −2, so by the IVT, there is a solution in the interval (0, 1). Take the
initial approximation to be x₀ = 0.5.
The first derivative of p(x) is p′(x) = 5x⁴ − 5, so we have to iterate $x_{n+1} = x_n - \frac{x_n^5 - 5x_n + 2}{5x_n^4 - 5}$.
Setting x₀ = 0.5, we find that x₁ = 0.4 and x₂ = 0.4021. Since x₁ and x₂ agree to the required accuracy we
can stop, so x = 0.4 to one decimal place. (Carry on for greater accuracy.)
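The iteration of Example 8.15 can be reproduced in a few lines. This Python sketch assumes the derivative is supplied by hand; the function name newton and the fixed number of steps are illustrative choices rather than a robust root-finder (it does not guard against a zero derivative).

```python
def newton(f, df, x0, steps):
    """Return [x0, x1, ..., x_steps] generated by x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

p = lambda x: x**5 - 5 * x + 2
dp = lambda x: 5 * x**4 - 5      # derivative of p

xs = newton(p, dp, 0.5, 3)
for n, x in enumerate(xs):
    print(n, round(x, 6))
```

Starting from 0.5 the first two iterates agree with the values x₁ = 0.4 and x₂ = 0.4021 quoted in the example.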
8.2.2 The bisection method
This is a direct application of the IVT. It always works, but convergence is slow, so it isn’t very efficient.
Example 8.16 Use the bisection method to find an approximate (to an accuracy of one decimal place) solution
of the equation x⁵ − 5x + 2 = 0.
Interval [0, 1]: p(0) > 0, p(1) < 0, so take x = 0.5; p(0.5) < 0.
Interval [0, 0.5]: p(0) > 0, p(0.5) < 0; p(0.25) > 0.
Interval [0.25, 0.5]: p(0.25) > 0, p(0.5) < 0; p(0.375) > 0.
Interval [0.375, 0.5]: p(0.375) > 0, p(0.5) < 0; p(0.4375) < 0.
Interval [0.375, 0.4375]: p(0.375) > 0, p(0.4375) < 0; p(0.40625) < 0.
So the root lies in [0.375, 0.40625], and x = 0.4 to one decimal place.
This is very slow.
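The table of intervals in Example 8.16 can be generated by the following Python sketch. The function name bisect and the choice to record every interval are illustrative; the routine assumes f is continuous with opposite signs at the two end points.

```python
def bisect(f, a, b, steps):
    """Halve [a, b] repeatedly, keeping the half on which f changes sign.

    Returns the list of intervals visited, starting with (a, b).
    """
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    history = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if f(a) * f(mid) <= 0:
            b = mid          # sign change (or root) in the left half
        else:
            a = mid          # sign change in the right half
        history.append((a, b))
    return history

p = lambda x: x**5 - 5 * x + 2
history = bisect(p, 0.0, 1.0, 5)
```

Each step halves the length of the interval, so roughly three or four steps are needed for every extra decimal place.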
8.2.3 Direct iteration
This is when we rearrange the given equation as an iterative formula.
Example 8.17 Use direct iteration to find a solution to the equation x⁵ − 5x + 2 = 0.
Rewrite x⁵ − 5x + 2 = 0 as $x = \frac{1}{5}(x^5 + 2)$. (Why choose this? See below.) Express this as an iteration formula:
$x_{n+1} = \frac{1}{5}(x_n^5 + 2)$.
Starting with x₀ = 0.5 then x₁ = 0.40625, x₂ = 0.402213, x₃ = 0.4021 . . ..
This gives reasonable convergence quite quickly.
An alternative rearrangement is
$x = \sqrt[5]{5x - 2}$
Does this work?
x₀ = 0.5, x₁ = 0.871, x₂ = 1.1866, x₃ = 1.315, x₄ = 1.355, x₅ = 1.367, . . .
This is not converging on the root in (0, 1), but does seem to be converging on to the root in (1, 2). The second
rearrangement does work for the root in (−2, −1) whereas our first rearrangement does not.
How do we tell whether a particular rearrangement will or will not work?
Suppose we want to find the root c of the equation F (x) = 0. Rearrange this to the form
$x_{n+1} = f(x_n)$.
This will converge to the required root c, i.e. c = f (c), if the initial value x₀ and c both lie within an
interval for which
$|f'(x)| \le \alpha$, where α < 1.
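Both rearrangements of x⁵ − 5x + 2 = 0 from Example 8.17 can be run side by side, together with the derivatives that appear in the convergence condition. In this Python sketch the names iterate, g1, g2, dg1 and dg2 are illustrative, the derivatives were worked out by hand, and g2 is written for 5x − 2 ≥ 0 only (a float raised to the power 0.2 is not a real fifth root for negative arguments).

```python
def iterate(g, x0, steps):
    """Return [x0, g(x0), g(g(x0)), ...] with the given number of steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

g1 = lambda x: (x**5 + 2) / 5            # first rearrangement
g2 = lambda x: (5 * x - 2) ** 0.2        # second rearrangement, assumes 5x - 2 >= 0

dg1 = lambda x: x**4                     # derivative of g1
dg2 = lambda x: (5 * x - 2) ** (-0.8)    # derivative of g2, assumes 5x - 2 > 0

xs1 = iterate(g1, 0.5, 3)
xs2 = iterate(g2, 0.5, 5)
```

Near the root in (0, 1), dg1 is small (about 0.03 at x = 0.4) while dg2 is far larger than 1, which is consistent with the first rearrangement converging there and the second moving away towards the root in (1, 2).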
Example 8.18 Show that the equation $x e^x = 1$ has a solution between 0 and 1. Find a root correct to 2
decimal places.
8.3 Monotone functions
Definition 8.19 The function f : X → Y is said to be strictly increasing on a set X if, whenever x₁ < x₂,
f (x₁) < f (x₂)
for all x₁, x₂ ∈ X. Similarly, f is strictly decreasing if, whenever x₁ < x₂,
f (x₁) > f (x₂)
for all x₁, x₂ ∈ X.
Note that f doesn’t have to be continuous in this definition.
A strictly monotone (or strictly monotonic) function is one which is either strictly increasing or strictly
decreasing.
By a (not strictly) monotonic increasing function, we mean that
\[ f(x_1) \le f(x_2) \]
whenever $x_1 < x_2$, for all $x_1, x_2 \in X$.
9 Differentiability
Having considered what it means for a function to be continuous, we now want to examine what it means for a
function to be differentiable. Intuitively, we expect the graph of the function to be smooth, with no jagged
points in it.
Definition 9.1 (Derivative at a point) Suppose f(x) is defined at all points in a neighbourhood of x = a. If the ratio
\[ \frac{f(a+h) - f(a)}{h} \]
tends to a (finite) limit as h → 0, then f is said to be differentiable at a. The value of the limit is called the derivative of f at a, and is denoted by $f'(a)$. That is,
\[ f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}. \]
Note that the limit must exist and be taken from both sides of x = a.
As before, we can extend the concept of differentiability at a single point, to differentiability over an interval:
Definition 9.2 (Derivative in an interval) Suppose f (x) is defined for all x ∈ (a, b). Then f is said to be
differentiable in (a, b) if it is differentiable at all points of the interval as determined by Definition 9.1.
There is an important connection between the concepts of continuity and differentiability:
Theorem 9.3 Differentiable functions are continuous.
(Note that the converse is not true in general: there exist continuous functions which are not differentiable.)
Proof It is often more convenient to write x in place of (a + h); then h is replaced by (x − a), and with this notation the limit becomes
\[ f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}. \]
For $x \ne a$ we have $f(x) - f(a) = \frac{f(x) - f(a)}{x - a}\,(x - a)$, and both factors have limits as x → a, so
\[ \lim_{x \to a} (f(x) - f(a)) = f'(a) \lim_{x \to a} (x - a) = 0, \]
so $\lim_{x \to a} f(x) = f(a)$. This means that if f is differentiable at a, then it is continuous at a.
Corollary 9.4 If f is differentiable on (a, b), then it is continuous on (a, b).
Example 9.5 Are the following functions differentiable? If not, why and where not?
(i) $f(x) = x^n$, $(n \in \mathbb{N})$
(ii) $g(x) = |x|$
Example 9.6 To find the derivative of sin(x) from first principles, we consider the ratio
\[ \frac{\sin(x) - \sin(a)}{x - a}. \]
Using the trigonometric identity
\[ \sin C - \sin D = 2 \cos\frac{C+D}{2}\, \sin\frac{C-D}{2}, \]
we have
\[ \frac{\sin(x) - \sin(a)}{x - a} = \frac{2 \cos\frac{x+a}{2}\, \sin\frac{x-a}{2}}{x - a} = \cos\frac{x+a}{2} \cdot \frac{\sin\frac{x-a}{2}}{\frac{x-a}{2}}, \]
so
\[ \lim_{x \to a} \frac{\sin(x) - \sin(a)}{x - a} = \cos\frac{2a}{2} \cdot 1 = \cos(a). \]
Thus the derivative of sin(x) is cos(x).
Example 9.7 Prove from first principles that $f : \mathbb{R} \to \mathbb{R}$ given by
\[ f(x) = \begin{cases} x^2 & \text{if } x < 0 \\ x^3 & \text{if } x \ge 0 \end{cases} \]
is differentiable at x = 0, and hence for all x.
9.1 Differentiation rules
Suppose f and g are differentiable at x = a, and $k \in \mathbb{R}$ is a non-zero scalar. Then:
(i) $f \pm g$ is differentiable at x = a, with derivative $f'(a) \pm g'(a)$ (Sum/Difference Rule)
(ii) $kf$ is differentiable at x = a, with derivative $kf'(a)$ (Scalar Multiple Rule)
(iii) $fg$ is differentiable at x = a, with derivative $f'(a)g(a) + f(a)g'(a)$ (Product Rule)
(iv) $\frac{f}{g}$ is differentiable at x = a, with derivative $\frac{f'(a)g(a) - f(a)g'(a)}{\{g(a)\}^2}$, provided that $g(a) \ne 0$ (Quotient Rule)
Rules (i) and (ii) together mean that differentiation is a linear transformation.
9.2 Differentiating composite functions
Suppose g is differentiable at x = a and that g(a) = b. Also suppose f is differentiable at b. Then the composite function $f \circ g$ or f(g(x)) is differentiable at a, and its derivative at a is $f'(b)g'(a)$. More generally, if h(x) = f(g(x)) on some interval, then $h'(x) = f'(g(x))g'(x)$.
This is the familiar chain rule when written in Leibniz notation: write u = g(x), y = f(u), so that
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \]
Example 9.8 Find the derivatives of:
(i) $(x^2 + 1)^3$
(ii) $\ln(2x^3 - 1)$
(iii) $x \cos^{-1}(x/2)$
(iv) $x^x$
9.3 Differentiating inverse functions
Suppose that $f : A \to B$ has an inverse $f^{-1} : B \to A$, where A and B are subsets of $\mathbb{R}$. Using the composite relation in Section 9.2, viz h(x) = f(g(x)), we have $x = f(f^{-1}(x)) = f^{-1}(f(x))$. Differentiating the second version with respect to x,
\[ 1 = (f^{-1})'(f(x))\, f'(x) \]
so that
\[ (f^{-1})'(f(x)) = \frac{1}{f'(x)}. \]
This looks more straightforward in Leibniz notation: if $y = f(x)$ then $x = f^{-1}(y)$,
\[ \frac{dy}{dx} = f'(x) \quad \text{and} \quad \frac{dx}{dy} = (f^{-1})'(y), \]
so
\[ \frac{dx}{dy} = 1 \Big/ \frac{dy}{dx}. \]
Example 9.9 Find the derivatives of:
(i) $f(x) = \ln(x)$ and $f^{-1}(x) = \exp(x)$
(ii) $f^{-1}(x) = \sin^{-1}(x/a)$
9.4 Table of derivatives
$f(x)$                  $f'(x)$
$k$                     $0$
$x^n$                   $n x^{n-1}$
$e^{ax}$                $a e^{ax}$
$\ln(ax)$               $\frac{1}{x}$
$\sin(ax)$              $a \cos(ax)$
$\cos(ax)$              $-a \sin(ax)$
$\tan(ax)$              $a \sec^2(ax)$
$\sin^{-1}(x/a)$        $\frac{1}{\sqrt{a^2 - x^2}}$
$\cos^{-1}(x/a)$        $-\frac{1}{\sqrt{a^2 - x^2}}$
$\tan^{-1}(x/a)$        $\frac{a}{a^2 + x^2}$
9.5 Leibniz’ Theorem (the extended product rule)
Let y = uv, where u = u(x) and v = v(x).
The familiar product rule gives $y' = u'v + uv'$, where $u' = u'(x)$, etc.
Applying the product rule again and again gives
\[ y'' = u''v + 2u'v' + uv'' \]
\[ y''' = u'''v + 3u''v' + 3u'v'' + uv''' \]
Clearly this notation is impractical, so instead we write $y' = y^{(1)}$, $y'' = y^{(2)}$, etc. These equations then become
\[ y^{(1)} = u^{(1)}v + uv^{(1)} \]
\[ y^{(2)} = u^{(2)}v + 2u^{(1)}v^{(1)} + uv^{(2)} \]
\[ y^{(3)} = u^{(3)}v + 3u^{(2)}v^{(1)} + 3u^{(1)}v^{(2)} + uv^{(3)} \]
The general result may be proved by induction:
Theorem 9.10 (Leibniz’ Theorem)
\[ y^{(n)} = u^{(n)}v + \binom{n}{1} u^{(n-1)}v^{(1)} + \binom{n}{2} u^{(n-2)}v^{(2)} + \cdots + uv^{(n)} \]
Example 9.11 Find $\frac{d^5}{dx^5}\left(x^2 \sin(x)\right)$.
9.6 Rolle’s Theorem
Theorem 9.12 (Rolle’s Theorem) Suppose the function f is
(i) continuous on the closed interval [a, b],
(ii) differentiable on the open interval (a, b), and
(iii) f (a) = f (b).
Then there is at least one number c ∈ (a, b) such that $f'(c) = 0$.
Question Why an open interval in the second hypothesis? Why not a closed interval?
9.6.1 An application of Rolle’s Theorem
Consider any function f which is continuous and differentiable either on all of $\mathbb{R}$ or on the interval (a, b). Suppose α and β are two solutions of the equation f(x) = 0 which lie within the interval (a, b), that is, f(α) = f(β) = 0. Then the conditions of Rolle’s Theorem are satisfied on [α, β], and between α and β there is at least one point where $f'(x) = 0$.
Now consider two consecutive solutions of $f'(x) = 0$. No more than one solution of f(x) = 0 can lie between consecutive roots of $f'(x)$.
There may be no solution, but if there is one, then there is only one. This may be proved by contradiction using Rolle’s Theorem:
Suppose a and b are consecutive roots of $f'(x)$, that is, $f'(a) = f'(b) = 0$, and that there are two solutions of f(x) = 0 between a and b. By Rolle’s Theorem, there is at least one c lying between these two solutions, and hence in (a, b), where $f'(c) = 0$, so a and b are not consecutive roots of $f'(x)$.
This is a contradiction, hence there cannot be more than one solution of f(x) = 0 between consecutive roots of $f'(x)$.
Example 9.13 Verify that the hypotheses of Rolle’s Theorem are satisfied by the following function on the given interval. Hence find the value(s) of c.
$f(x) = x(1 - x)^3$ on [0, 1].
Theorem 9.14 If f is continuous on [a, b] and differentiable on (a, b) and $f'(x) > 0$ in (a, b), then f is strictly increasing throughout [a, b]. If $f'(x) \ge 0$, then f is increasing (but not necessarily strictly so).
Example 9.15 Show that $x - \sin x$ is an increasing function when x > 0.
Example 9.16 We return to $p(x) = x^5 - 5x + 2 = 0$. We know the following:
(i) There is at least one solution in (−2, −1), in (0, 1) and in (1, 2): a minimum of three solutions.
(ii) There are at most 5 solutions for $x \in \mathbb{R}$ (by the Fundamental Theorem of Algebra). We now show that it has exactly 3 roots and no more.
(iii) $p'(x) = 5x^4 - 5$, and $p'(x) = 0$ when x = ±1, so there is at most one solution of p(x) = 0 between −1 and 1, so using (i) there is exactly one solution in (0, 1).
(iv) For |x| > 1, $p'(x) > 0$, so p is strictly increasing on each of the intervals x < −1 and x > 1.
(v) For x > 1, since p(1) = −2 < 0, there can be at most one solution, which is the solution in (1, 2).
(vi) For x < −1, p(−1) = 6 > 0 and $p'(x) > 0$, so p decreases as we move to the left and so there can be at most one solution, which is the solution in (−2, −1).
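As a numerical sanity check of this conclusion (a sketch only: counting sign changes on a grid can miss roots and proves nothing, unlike the argument above), the following Python code counts the sign changes of p on [−3, 3]. The helper name `count_sign_changes` and the grid size are our own choices.

```python
def p(x):
    """p(x) = x^5 - 5x + 2, as in Example 9.16."""
    return x**5 - 5 * x + 2


def count_sign_changes(f, lo, hi, n):
    """Count sign changes of f between consecutive points of an n-step grid on [lo, hi]."""
    count = 0
    prev = f(lo)
    for i in range(1, n + 1):
        cur = f(lo + (hi - lo) * i / n)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count


changes = count_sign_changes(p, -3.0, 3.0, 600)
print(changes)
```

The count agrees with the three roots located in (−2, −1), (0, 1) and (1, 2).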
Example 9.17 Show that $x e^{\sin x} = \cos x$ has no more than one root in the interval $(0, \frac{\pi}{2})$. First use the Intermediate Value Theorem to prove that it has at least one root.
9.7 The Mean Value Theorem
Theorem 9.18 (The Mean Value Theorem) Suppose that f is
(i) continuous on the closed interval [a, b],
(ii) differentiable in the open interval (a, b),
then there exists at least one number c ∈ (a, b) such that
\[ \frac{f(b) - f(a)}{b - a} = f'(c). \]
This theorem says that the gradient of the chord between the points (a, f (a)) and (b, f (b)) is equal to the
gradient of the tangent to the curve at x = c.
Corollary 9.19 We can multiply throughout by (b − a) to give
\[ f(b) - f(a) = (b - a) f'(c). \]
If, as indicated in the diagram, b > a, and in addition, if $f'(c) > 0$, then f(b) > f(a) and the function is increasing.
Theorem 9.20 If f is continuous on [a, b] and differentiable in (a, b) and $f'(x) = 0$ for all x ∈ (a, b), then f(x) is constant for all x ∈ [a, b].
Corollary 9.21 If both f and g are continuous in [a, b], and are such that $f'(x) - g'(x) = 0$ for all x ∈ (a, b), then f(x) − g(x) is constant for all x ∈ [a, b].
9.7.1 Alternative forms of the Mean Value Theorem
(i) Obviously, since $b \ne a$ the theorem may be written
\[ f(b) = f(a) + (b - a) f'(c), \quad \text{where } a < c < b. \]
We can interpret this as saying that f(b) is approximately equal to f(a), and the term $(b - a) f'(c)$ is the error or remainder in making this approximation.
(ii) Let b = a + h (h can be positive or negative) and c = a + θh, where 0 < θ < 1, then
\[ f(a + h) = f(a) + h f'(a + \theta h). \]
In this case the error or remainder is $h f'(a + \theta h)$.
Example 9.22 If x > 0, show that $\ln(1 + x) < x$.
Solution Let $f(x) = \ln x$. We know that f(x) is continuous and differentiable for all x > 0 and that $f'(x) = \frac{1}{x}$.
Using the Mean Value Theorem in the form
\[ \frac{f(b) - f(a)}{b - a} = f'(c), \]
we have
\[ \frac{\ln(b) - \ln(a)}{b - a} = \frac{1}{c}, \quad \text{where } 0 < a < c < b. \qquad (\ast) \]
Now consider the function $g(x) = \frac{1}{x}$ (that is, the derivative $f'(x)$ of f).
Now $g'(x) = -\frac{1}{x^2} < 0$, so g is a decreasing function, and since 0 < a < c < b,
\[ \frac{1}{b} < \frac{1}{c} < \frac{1}{a}. \qquad (\dagger) \]
Substituting into (∗) gives
\[ \frac{1}{b} < \frac{\ln(b/a)}{b - a} < \frac{1}{a}. \]
Now write a = 1, b = 1 + x (where x > 0), then
\[ \frac{1}{1 + x} < \frac{\ln(1 + x)}{x} < 1, \]
or
\[ \frac{x}{1 + x} < \ln(1 + x) < x \quad \text{for } x > 0. \]
(More than was asked for!)
Note that result (†) could be written down very easily for this problem, without adopting the above approach,
but the method indicated is useful for more complicated problems.
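The double inequality obtained in Example 9.22 is easy to spot-check numerically. The Python sketch below is only an illustration (it checks a handful of sample values and is no substitute for the proof); the function name `bounds` is our own.

```python
import math


def bounds(x):
    """Return (x / (1 + x), ln(1 + x), x) for x > 0."""
    return x / (1 + x), math.log(1 + x), x


samples = (0.01, 0.5, 1.0, 10.0, 1000.0)
for x in samples:
    lower, middle, upper = bounds(x)
    print(x, lower, middle, upper, lower < middle < upper)
```

For small x the three quantities are very close together, which reflects the fact that $\ln(1 + x) \approx x$ near x = 0.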
9.8 The Cauchy Mean Value Theorem
Theorem 9.23 (The Cauchy Mean Value Theorem) If f and g are continuous on [a, b] and are differentiable in (a, b), then there is a number c ∈ (a, b) such that
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}, \quad \text{where } g'(x) \ne 0 \text{ in } (a, b). \]
Important note Contrary to first appearances, this result is not a simple matter of applying the First MVT
to f and g separately and then dividing the one by the other, because application of the First MVT leads to a
different ‘c’ for each function. Also note that when g(x) = x, the Cauchy MVT reverts to the ordinary MVT.
10 L’Hôpital’s Rule
Theorem 10.1 (L’Hôpital’s Rule) If f(a) = 0, g(a) = 0 and $f'(a)$, $g'(a)$ both exist, and $g'(a) \ne 0$, then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \]
Proof We use the definition of a derivative. Since f(a) = 0 and g(a) = 0,
\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f(x) - f(a)}{x - a} \Big/ \frac{g(x) - g(a)}{x - a}. \]
Taking the limit as x → a,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \]
Remarks
(i) This result has nothing to do with the Quotient Rule.
(ii) It doesn’t always work! If not, we try the following theorem.
Theorem 10.2 If f(a) = 0, g(a) = 0 and f and g are differentiable in some neighbourhood of x = a, but not necessarily at x = a itself, and $g'(x) \ne 0$ in this neighbourhood, except possibly at x = a, then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \]
provided this limit exists.
This is proved using the Cauchy Mean Value Theorem.
Example 10.3 Calculate the following limits:
(i) $\lim_{x \to 0} \frac{\sin x}{x}$
(ii) $\lim_{x \to 0} \frac{e^x - 1}{x}$
(iii) $\lim_{x \to 1} \frac{(x - 1)^3}{\ln x}$
(iv) $\lim_{x \to 0} \frac{e^{ax} - e^{-ax}}{\ln(1 + x)}$
When attempting to use L’Hôpital’s rule (version 1), you may get $f'(a) = 0$, $g'(a) = 0$, in which case you need to apply version 1 or 2 again, provided the limits exist:
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} = \lim_{x \to a} \frac{f''(x)}{g''(x)} = \frac{f''(a)}{g''(a)}, \]
and so on.
(i) $\lim_{x \to 0} \frac{\cos x - 1}{x^2}$
(ii) $\lim_{x \to 1} \frac{x^3 + x^2 - x - 1}{x^2 + 2x - 3}$
Note At each stage, the limits must exist.
10.1 Limits when f(x) → ±∞ and g(x) → ±∞ as x → a (type ∞/∞)
We can use L’Hôpital’s Rule. Consider
\[ \lim_{x \to 0} \frac{\ln(\sin x)}{\ln(\sin 2x)}. \]
Then $f(x) = \ln(\sin x)$ and $g(x) = \ln(\sin 2x) \to -\infty$ as x → 0. Also $f'(x) = \frac{\cos x}{\sin x}$ and $g'(x) = \frac{2 \cos 2x}{\sin 2x}$.
Neither of these exists at x = 0, but we can solve as follows:
\[ \lim_{x \to 0} \frac{f'(x)}{g'(x)} = \lim_{x \to 0} \frac{\cos x}{\sin x} \cdot \frac{2 \sin x \cos x}{2 \cos 2x} = \lim_{x \to 0} \frac{\cos^2 x}{\cos 2x} = 1. \]
10.2 Limits as x → ±∞
These can be solved directly in the same way as in the preceding examples, or we could replace x by $\frac{1}{t}$ and let t → 0:
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{t \to 0} \frac{f(1/t)}{g(1/t)}, \]
provided the limits exist.
Example 10.6 $\lim_{x \to \infty} x e^{-x} = \lim_{x \to \infty} \frac{x}{e^x} = \lim_{x \to \infty} \frac{1}{e^x} = 0$, since as x → ∞, $e^{-x} \to 0$.
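(i) $\lim_{x \to \infty} x^n e^{-x}$, where $n \in \mathbb{N}$.
(ii) $\lim_{x \to 0} \frac{e^{x^2} - 1}{\sin(x^2)}$
(iii) $\lim_{x \to 1} \left( \frac{x}{x - 1} - \frac{1}{\ln x} \right)$. (Type ‘∞ − ∞’.)
10.3 Further types: $0^0$, $\infty^0$, $1^\infty$, etc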
For limits like these, try taking logs:
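Example 10.8 Calculate $\lim_{x \to 0} (1 - x)^{1/x}$.
Let $y = (1 - x)^{1/x}$, then $\ln y = \frac{1}{x} \ln(1 - x)$ or $\frac{\ln(1 - x)}{x}$.
Then
\[ \lim_{x \to 0} \frac{\ln(1 - x)}{x} = \lim_{x \to 0} \left( -\frac{1}{1 - x} \right) \Big/ 1 = \lim_{x \to 0} \left( -\frac{1}{1 - x} \right) = -1. \]
Thus $\ln y \to -1$ as x → 0, or $y \to e^{-1}$ as x → 0.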
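A quick numerical check of this limit can be made by evaluating $(1 - x)^{1/x}$ for small x on either side of 0. The Python sketch below is an illustration only (the function name `y` mirrors the notation of the example, and the sample points are our own choice).

```python
import math


def y(x):
    """y = (1 - x)^(1/x), defined for x < 1 with x != 0."""
    return (1 - x) ** (1 / x)


target = math.exp(-1)
for x in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(x, y(x), y(x) - target)
```

The differences shrink roughly in proportion to |x|, consistent with $y \to e^{-1}$ from both sides.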
11 Taylor’s Theorem
The following is a restatement of Theorem 9.18 on page 48, and is proved (as an exercise in Assignment 8) by
applying Rolle’s Theorem to an appropriate continuous and differentiable function.
Theorem 11.1 (First Mean Value Theorem) If f is continuous on [a, b] and differentiable on (a, b) then there exists c ∈ (a, b) such that
\[ f(b) = f(a) + (b - a) f'(c), \quad \text{where } a < c < b. \]
We can now extend this result, by again applying Rolle’s theorem to an appropriate function, to give:
Theorem 11.2 (Second Mean Value Theorem) If f and its first derivative are continuous on [a, b], and f is twice differentiable on (a, b), then there exists c ∈ (a, b) such that
\[ f(b) = f(a) + (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(c), \quad \text{where } a < c < b. \]
In fact, these theorems also hold when b < a, so we could write
\[ f(a + h) = f(a) + R_1 \quad \text{and} \quad f(a + h) = f(a) + h f'(a) + R_2, \]
where h could be positive or negative and the remainders are $R_1 = h f'(c)$, $R_2 = \frac{h^2}{2!} f''(c)$.
Here c lies between a and a + h and, as before, we could write c = a + θh, where 0 < θ < 1.
Alternatively we could write
\[ f(x) = f(a) + R_1 \]
\[ f(x) = f(a) + (x - a) f'(a) + R_2 \]
In this case, $R_1 = (x - a) f'(c)$ and $R_2 = \frac{(x - a)^2}{2!} f''(c)$, where c lies between a and x.
If we ignore the remainder term, we obtain a polynomial approximation T (x) to the function f (x) near x = a.
In the first case, the approximation is just a constant, and in the second, a linear approximation.
Example 11.3 Find polynomial approximations to $f(x) = e^x$ near x = 0.
First approximation
\[ T_0(x) = e^0 = 1, \quad R_1 = x e^c = x e^{\theta x}, \quad (0 < \theta < 1). \]
Second approximation
\[ T_1(x) = e^0 + x e^0 = 1 + x, \quad R_2 = \frac{x^2}{2!} e^{\theta x}. \]
We can continue in this way, until we find the general form of the theorem.
11.1 Taylor’s Theorem – the nth Mean Value Theorem
As before this may be stated in different ways.
Theorem 11.4 (Taylor’s Theorem) Suppose f possesses derivatives of orders 1, 2, . . . , (n − 1) on the closed interval [a, b]. If
(i) f and its derivatives are all continuous on [a, b], and
(ii) f has a derivative of order n on (a, b)
then there is a number c ∈ (a, b), such that
\[ f(b) = f(a) + (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \cdots + \frac{(b - a)^{n-1}}{(n - 1)!} f^{(n-1)}(a) + \frac{(b - a)^n}{n!} f^{(n)}(c). \]
Once again, the last term is the remainder. In this form it is called the Lagrange form of the remainder, that is, we write
\[ R_n = \frac{(b - a)^n}{n!} f^{(n)}(c). \]
Remarks
(i) If n = 1, these results revert to the First Mean Value Theorem.
(ii) It may be more convenient to write x in place of b:
\[ f(x) = f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2!} f''(a) + \cdots + \frac{(x - a)^{n-1}}{(n - 1)!} f^{(n-1)}(a) + R_n \]
where $R_n = \frac{(x - a)^n}{n!} f^{(n)}(c)$, for a < c < x.
(iii) If we ignore the remainder term, we obtain a polynomial approximation to the function f(x) near the point x = a.
(iv) If a = 0, we obtain a special case of Taylor’s Theorem
\[ f(x) = f(0) + f'(0) x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \cdots + \frac{f^{(n-1)}(0)}{(n - 1)!} x^{n-1} + R_n \]
where $R_n = \frac{x^n}{n!} f^{(n)}(\theta x)$, 0 < θ < 1.
(v) The theorems also hold when x < a.
(vi) We can write Taylor’s theorem alternatively as follows. Replace b by a + h and c by a + θh, where 0 < θ < 1, then
\[ f(a + h) = f(a) + h f'(a) + \frac{h^2}{2!} f''(a) + \cdots + \frac{h^{n-1}}{(n - 1)!} f^{(n-1)}(a) + R_n \]
where $R_n = \frac{h^n}{n!} f^{(n)}(a + \theta h)$, for 0 < θ < 1.
Example 11.5
(i) Find a polynomial approximation to $f(x) = e^x$ near x = 0, up to and including the term in $x^4$.
(ii) Estimate the remainder when x = 0.5, 1, 2.
Solution
(i) (a = 0) We have already obtained the first two approximations in Example 11.3. If $f(x) = e^x$, it has derivatives of all orders, and all its derivatives are $e^x$.
Thus $f(0) = f'(0) = f''(0) = f'''(0) = f^{(4)}(0) = 1$, and the fourth order polynomial approximation to $e^x$ is
\[ T_4(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!}. \]
(ii) The remainder is $R_5 = \frac{x^5}{5!} e^{\theta x}$, for 0 < θ < 1.
Consider any fixed positive value of x; then $e^{\theta x}$ is an increasing function of θ on its domain (0, 1).
When θ = 0, $e^{\theta x} = 1$ and when θ = 1, $e^{\theta x} = e^x$, so we can say that
\[ \frac{x^5}{5!} < R_5 < \frac{x^5 e^x}{5!}. \]
Evaluating for different values of x:
x      $R_5$
0.5    $2.6 \times 10^{-4} < R_5 < 4.3 \times 10^{-4}$
1      $8.3 \times 10^{-3} < R_5 < 2.3 \times 10^{-2}$
2      $0.27 < R_5 < 1.97$
So we see that the further we are from x = 0, the greater the error.
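The bounds in the table can be reproduced, and compared with the actual error $e^x - T_4(x)$, using a few lines of Python. This is a sketch under the assumption x > 0 (so that the lower and upper bounds are the right way round); the function names are our own.

```python
import math


def taylor_exp(x, terms):
    """Sum of the first `terms` terms of the Maclaurin series of e^x."""
    return sum(x**k / math.factorial(k) for k in range(terms))


def remainder_bounds(x, n):
    """Lagrange bounds x^n/n! < R_n < x^n e^x / n! for e^x, valid for x > 0."""
    lower = x**n / math.factorial(n)
    return lower, lower * math.exp(x)


results = {}
for x in (0.5, 1.0, 2.0):
    actual = math.exp(x) - taylor_exp(x, 5)   # error of T_4, i.e. R_5
    lower, upper = remainder_bounds(x, 5)
    results[x] = (lower, actual, upper)
    print(x, lower, actual, upper)
```

In each case the actual error lies strictly between the two bounds, and all three quantities grow rapidly as x moves away from 0.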
11.2 Taylor and Maclaurin Series
Suppose that a and x are two distinct real numbers, and I is the closed interval [a, x] with end-points a and x,
(x can be greater or less than a).
Suppose that f satisfies the conditions of Taylor’s theorem on I, then
\[ f(x) = f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2!} f''(a) + \cdots + \frac{(x - a)^{n-1}}{(n - 1)!} f^{(n-1)}(a) + R_n, \qquad (1) \]
that is,
\[ f(x) = S_n + R_n, \]
where $S_n$ is the partial sum $\sum_{k=0}^{n-1} \frac{(x - a)^k}{k!} f^{(k)}(a)$, ie the sum of the first n terms, and $R_n$ is the remainder $\frac{(x - a)^n}{n!} f^{(n)}(c)$, where c lies between a and x.
We now need to find out what happens to Sn + Rn as n → ∞.
There are two questions:
(i) Is the series convergent? Does Sn → a finite limit as n → ∞?
(ii) If the series converges, does it converge to f (x)?
It could well be that the series is convergent, but possibly only on some restricted interval of the real line, the
interval of convergence.
In addition, in order to establish convergence to f (x), we would need to show that for x and a in the interval
of convergence for Sn , the remainder Rn converges to 0 as n → ∞.
So first we should find the range of values of x for which the series $\sum_{k=0}^{\infty} \frac{(x - a)^k}{k!} f^{(k)}(a)$ converges (that is, for which the partial sums $S_n$ converge), then show that $R_n \to 0$, for these values of x, as n → ∞. This can get quite complicated and we shall not pursue this in any detail here.
Note that when a = 0, Taylor series are frequently called Maclaurin Series.
Example 11.6 Consider the Taylor series about x = 0 for $e^x$.
\[ \exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{n-1}}{(n - 1)!} + R_n, \quad \text{where } R_n = \frac{x^n e^{\theta x}}{n!}, \ 0 < \theta < 1. \]
One method for testing series for convergence is called the Ratio Test (see Section 11.8 on page 61 for further details). We compare two consecutive terms in the series as follows:
\[ \left| \frac{u_{n+1}}{u_n} \right| = \left| \frac{x^n}{n!} \cdot \frac{(n - 1)!}{x^{n-1}} \right| = \frac{|x|}{n}. \]
As n → ∞, this tends to 0 < 1, regardless of the value of x. The test tells us that the series converges.
The remainder $R_n = \frac{x^n e^{\theta x}}{n!} \to 0$ as n → ∞, so the series actually converges to the function exp(x) for all x.
This example illustrates how intervals of convergence may be found. They aren’t always the whole of R.
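The behaviour described in Example 11.6 can be observed numerically. In the Python sketch below (an illustration only; `term` and `partial_sum` are our own names, and terms are indexed from 0 so that `term(x, n)` is $x^n/n!$), the ratio of consecutive terms is x/n and the partial sums approach exp(x) even for fairly large |x|.

```python
import math


def term(x, n):
    """The term x^n / n! of the Maclaurin series of e^x."""
    return x**n / math.factorial(n)


def partial_sum(x, n_terms):
    """Sum of the terms x^k / k! for k = 0, ..., n_terms - 1."""
    return sum(term(x, k) for k in range(n_terms))


ratio = term(3.0, 10) / term(3.0, 9)       # should equal 3/10
approx_neg = partial_sum(-5.0, 60)         # compare with exp(-5)
approx_pos = partial_sum(10.0, 80)         # compare with exp(10)
print(ratio, approx_neg, math.exp(-5.0), approx_pos, math.exp(10.0))
```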
Example 11.7 Find the Taylor (Maclaurin) series for the following functions centred at x = 0.
(i) $f(x) = \sin x$
$f(x) = \sin x$            $f(0) = 0$
$f'(x) = \cos x$           $f'(0) = 1$
$f''(x) = -\sin x$         $f''(0) = 0$
$f'''(x) = -\cos x$        $f'''(0) = -1$
$f^{(4)}(x) = \sin x$      $f^{(4)}(0) = 0$
...                        ...
Substituting into formula (1) in Section 11.2, we get
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \]
Once again it may be shown that this series converges for all $x \in \mathbb{R}$.
(ii) $f(x) = (1 + x)^\alpha$, for $\alpha \in \mathbb{R}$.
$f(x) = (1 + x)^\alpha$                                           $f(0) = 1$
$f'(x) = \alpha (1 + x)^{\alpha - 1}$                             $f'(0) = \alpha$
$f''(x) = \alpha(\alpha - 1)(1 + x)^{\alpha - 2}$                 $f''(0) = \alpha(\alpha - 1)$
$f'''(x) = \alpha(\alpha - 1)(\alpha - 2)(1 + x)^{\alpha - 3}$    $f'''(0) = \alpha(\alpha - 1)(\alpha - 2)$
...                                                               ...
So
\[ (1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!} x^3 + \cdots \]
Clearly, if $\alpha \in \mathbb{N}$, the series will stop at some point, and the Taylor series is a finite polynomial. Otherwise, if $\alpha \notin \mathbb{N}$, it may be shown that the series converges for |x| < 1 only.
(iii) $f(x) = \ln(1 + x)$
11.3 Remarks
One series can be used to obtain another.
Example 11.8
\[ \sum_{n=0}^{\infty} (-1)^n x^n = 1 - x + x^2 - x^3 + \cdots = \frac{1}{1 + x}, \quad |x| < 1 \]
Replacing x by $x^2$ gives the new series
\[ \sum_{n=0}^{\infty} (-1)^n x^{2n} = 1 - x^2 + x^4 - x^6 + \cdots = \frac{1}{1 + x^2}, \quad |x| < 1 \]
Example 11.9 Using a standard series, find the series expansion, and interval of convergence for $(2 + 3x)^{1/2}$.
Convergent power series may be differentiated term by term to give a new convergent series whose sum is the
derivative of the sum of the original series.
Example 11.10
\[ \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + \cdots = \frac{1}{1 - x}, \quad |x| < 1 \]
\[ \sum_{n=1}^{\infty} n x^{n-1} = 1 + 2x + 3x^2 + \cdots = \frac{1}{(1 - x)^2}, \quad |x| < 1 \]
etc.
Note that while the domain of both functions in this example is R \ {1}, the domain of the series is the interval
(−1, 1), so in each case, the series cannot be equal to the function outside (−1, 1).
Similarly, we may integrate a convergent power series, term by term, to give a new convergent series whose
sum is the integral of the sum of the original series. The interval of convergence remains the same.
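Example 11.11
\[ \sum_{n=0}^{\infty} (-1)^n x^{2n} = 1 - x^2 + x^4 - x^6 + \cdots = \frac{1}{1 + x^2}, \quad |x| < 1 \]
Integrating,
\[ \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n + 1} = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \tan^{-1} x, \quad |x| < 1. \]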
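The integrated series can be compared numerically with the inverse tangent. The Python sketch below is an illustration only (the name `arctan_series` and the sample points are our own); it sums a finite number of terms and compares the result with `math.atan`.

```python
import math


def arctan_series(x, n_terms):
    """Partial sum of x - x^3/3 + x^5/5 - ... using n_terms terms (intended for |x| < 1)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(n_terms))


for x in (0.1, 0.5, 0.9):
    print(x, arctan_series(x, 30), math.atan(x))
```

The agreement is excellent for small |x| and noticeably poorer near |x| = 1, where many more terms are needed for the same accuracy.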
Series can be multiplied, and one series can be substituted into another.
Example 11.12 Find Maclaurin series for
(i) $e^x \sin x$
(ii) $e^{\sin x}$
Example 11.13 Find the Taylor series for $f(x) = \sin x$, about $x = \frac{\pi}{2}$.
Example 11.14 Limits (third method!). Find
(i) $\lim_{x \to 0} \frac{\sin x}{x}$.
(ii) $\lim_{x \to 0} \frac{1 - \cos x}{x^2}$.
11.4 Some Maclaurin series
It’s a good idea to try to remember the first few of these:
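\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \quad x \in \mathbb{R} \]
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n + 1)!}, \quad x \in \mathbb{R} \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}, \quad x \in \mathbb{R} \]
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} x^n}{n}, \quad x \in (-1, 1] \]
\[ (1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!} x^3 + \cdots, \quad |x| < 1, \ \alpha \notin \mathbb{N} \]
\[ (1 + x)^{-1} = 1 - x + x^2 - x^3 + \cdots = \sum_{n=0}^{\infty} (-1)^n x^n, \quad |x| < 1 \]
\[ (1 + x)^{-2} = 1 - 2x + 3x^2 - 4x^3 + \cdots = \sum_{n=0}^{\infty} (-1)^n (n + 1) x^n, \quad |x| < 1 \]
\[ (1 - x)^{-1} = 1 + x + x^2 + x^3 + \cdots = \sum_{n=0}^{\infty} x^n, \quad |x| < 1 \]
\[ \ln(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots = -\sum_{n=1}^{\infty} \frac{x^n}{n}, \quad x \in [-1, 1) \]
\[ \tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n + 1}, \quad |x| < 1 \]
\[ \sin^{-1} x = x + \frac{1}{2} \cdot \frac{x^3}{3} + \frac{1}{2} \cdot \frac{3}{4} \cdot \frac{x^5}{5} + \frac{1}{2} \cdot \frac{3}{4} \cdot \frac{5}{6} \cdot \frac{x^7}{7} + \cdots, \quad |x| < 1 \]
\[ \tan x = x + \frac{1}{3} x^3 + \frac{2}{15} x^5 + \frac{17}{315} x^7 + \frac{62}{2835} x^9 + \cdots, \quad |x| < \frac{\pi}{2} \]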
Figure 11.1: Graphs of successive polynomial approximations to f(x) = exp(x), $T_0$ to $T_5$
Figure 11.2: Graphs of successive polynomial approximations to f(x) = sin(x), $T_n$ for n = 1, 3, 5, 7, 13, 21
Figure 11.3: Graphs of successive polynomial approximations to f(x) = ln(1 + x), $T_n$ for n = 1, 2, 3, 4, 7, 10
11.5 Sequences
Definition 11.15 A sequence is a function whose domain is the integers or a subset of the integers ($n \in A \subseteq \mathbb{Z}$).
Example 11.16
(i) $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots$ The general term is $u_n = \frac{1}{n}$, for $n \in \mathbb{N}$.
(ii) $a, a + d, a + 2d, \ldots$ The general term is $u_n = a + (n - 1)d$ (an arithmetic sequence).
(iii) $a, ax, ax^2, ax^3, \ldots$ The general term is $u_n = a x^{n-1}$ (a geometric sequence).
In general, if $u_n$ tends to a finite limit as n → ∞, then we say that the infinite sequence converges.
11.6 Series
Definition 11.17 A series is the sum of the terms in a sequence.
Let $u_1, u_2, u_3, \ldots, u_n, \ldots$ be a sequence of real numbers.
If we sum the first n terms in the sequence we obtain the finite sum
\[ S_n = u_1 + u_2 + u_3 + \cdots + u_n = \sum_{k=1}^{n} u_k. \]
We can do this for all n = 1, 2, 3, . . .:
\[ S_1 = u_1, \quad S_2 = u_1 + u_2, \quad S_3 = u_1 + u_2 + u_3, \ \ldots, \quad S_n = u_1 + u_2 + u_3 + \cdots + u_n, \ \ldots \]
That is, we obtain a new sequence of sums: $S_1, S_2, S_3, \ldots, S_n, \ldots$
If this sequence $S_1, S_2, \ldots, S_n, \ldots$ tends to a finite limit as n → ∞, then we say that the infinite series converges.
The value of this limit is called the sum of the infinite series, that is,
\[ \lim_{n \to \infty} S_n = \sum_{n=1}^{\infty} u_n = S. \]
Here, $S_n$ is called the partial sum of the infinite series.
Example 11.18 (The geometric sequence and series) Consider the familiar geometric sequence
\[ 1, x, x^2, x^3, \ldots, x^{n-1}, x^n, \ldots \]
You have seen that this sequence converges when $-1 < x \le 1$. For this sequence we can find a formula for $S_n$, the sum of the first n terms, as follows:
\[ S_n = 1 + x + x^2 + x^3 + \cdots + x^{n-1} \]
Multiplying throughout by x:
\[ x S_n = x + x^2 + x^3 + \cdots + x^{n-1} + x^n \]
Subtracting, $(1 - x) S_n = (1 - x^n)$, so that
\[ S_n = \frac{1 - x^n}{1 - x}, \quad \text{provided that } x \ne 1. \]
What happens as n → ∞?
(i) If |x| < 1, $S_n \to \frac{1}{1 - x}$: Convergent.
(ii) If |x| > 1, $|S_n| \to \infty$: Divergent.
(iii) If x = −1, $S_n = \frac{1 - (-1)^n}{2}$, which oscillates between 0 and 1: Divergent.
(iv) If x = 1, $S_n = n \to \infty$: Divergent.
So the series only converges when |x| < 1, and in this case we can write
\[ S = 1 + x + x^2 + x^3 + \cdots + x^n + \cdots = \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x} = (1 - x)^{-1}, \quad |x| < 1. \]
If we replace x by −x in the above we get
\[ 1 - x + x^2 - x^3 + \cdots + (-1)^n x^n + \cdots = \sum_{n=0}^{\infty} (-1)^n x^n = \frac{1}{1 + x} = (1 + x)^{-1}, \quad |x| < 1. \]
11.7 Convergence of series
For most series, it is impossible to find a neat formula for Sn , the sum of the finite series and so we cannot
proceed as in the example above. Many tests have been developed which enable us to determine whether or
not a series converges, without finding an explicit formula for Sn and also without finding its limiting value,
the sum S. Before we look at one of these, we note the following useful result:
An infinite series $\sum_{n=1}^{\infty} u_n$ cannot converge unless its terms $u_n \to 0$ as n → ∞.
So we see that although the geometric sequence converges when x = 1, (converges to 1), the series certainly
does not converge for this value of x.
The converse is not true, as illustrated in the next example.
Example 11.19
(i) $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$ is convergent. This is a geometric series, with $x = \frac{1}{2}$.
(ii) $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$ is not convergent even though $u_n = \frac{1}{n} \to 0$ as n → ∞.
This series is called the harmonic series.
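The contrast between the two series in Example 11.19 shows up clearly in their partial sums. The following Python sketch is an illustration only (the function names are our own): the geometric partial sums settle down at 2, while the harmonic partial sums keep growing, gaining more than 1/2 every time the number of terms is doubled.

```python
def partial_sum(term, n):
    """Return term(1) + term(2) + ... + term(n)."""
    return sum(term(k) for k in range(1, n + 1))


def geometric_term(k):
    """k-th term of 1 + 1/2 + 1/4 + ..."""
    return 0.5 ** (k - 1)


def harmonic_term(k):
    """k-th term of 1 + 1/2 + 1/3 + ..."""
    return 1.0 / k


for n in (10, 100, 1000):
    print(n, partial_sum(geometric_term, n), partial_sum(harmonic_term, n))

# Terms n+1, ..., 2n of the harmonic series are each at least 1/(2n),
# so together they contribute at least 1/2.
gain = partial_sum(harmonic_term, 2000) - partial_sum(harmonic_term, 1000)
print(gain)
```

Numerical evidence of this kind can only suggest divergence; the doubling argument in the final comment is what actually proves that the harmonic partial sums are unbounded.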
11.8 D’Alembert’s ratio test
Given the series $\sum_{n=1}^{\infty} u_n$ and $\lim_{n \to \infty} \left| \frac{u_{n+1}}{u_n} \right| = \ell$, then:
(i) if $\ell < 1$ the series converges,
(ii) if $\ell > 1$ or if the ratio → ∞ (that is, no finite limit $\ell$ exists) then the series diverges,
(iii) if $\ell = 1$, no conclusion can be drawn, and the test fails.
12 Integration
Contrary to the customary order of presentation, which starts with differentiation and only later considers integration, the ideas of the integral calculus were in fact developed first (see footnote 3). The original ideas were noted by Archimedes in 223 BC, but the major initial contributions were made by Leibniz sometime between 1673 and 1676. He was the first to use the $\int$ notation. (Of course, Newton was involved as well.)
There are many types of integrals: Riemann, Riemann–Stieltjes, Lebesgue, line integrals, double and other multiple integrals, improper integrals of various kinds, and so forth, but all can be interpreted as a generalisation of area. Here, we shall develop the theory in the usual way. First we consider indefinite integrals, then definite integrals, the connection between differentiation and integration, and finally, improper integrals.
12.1 Indefinite Integration
This is simply the reverse process to differentiation. You should be able to recognise all the following standard integrals:
$f(x)$                          $\int f(x)\,dx$
$x^\alpha$                      $\frac{x^{\alpha+1}}{\alpha+1}$, $\alpha \ne -1$
$\frac{1}{x}$                   $\ln |x|$
$e^{ax}$                        $\frac{1}{a} e^{ax}$
$\cos(ax)$                      $\frac{1}{a} \sin(ax)$
$\sin(ax)$                      $-\frac{1}{a} \cos(ax)$
$\sec^2(ax)$                    $\frac{1}{a} \tan(ax)$
$\frac{1}{x^2 + a^2}$           $\frac{1}{a} \tan^{-1} \frac{x}{a}$
$\frac{1}{\sqrt{a^2 - x^2}}$    $\sin^{-1} \frac{x}{a}$
Since $\frac{d}{dx}(C) = 0$, we should always add an arbitrary constant C to the results above.
12.2 Linearity Properties
(i) $\int (f + g) = \int f + \int g$.
(ii) $\int kf = k \int f$, where $k \in \mathbb{R}$ is a scalar constant.
12.3 Integral types and methods of integration
Certain types of integral may be solved by applying particular standard techniques.
12.3.1 Integrals of the form $\int g(f(x)) f'(x)\,dx$
Integrals of the form
(i) $\int (f(x))^n f'(x)\,dx$ or, more generally,
(ii) $\int g(f(x)) f'(x)\,dx$,
are evaluated by substitution.
Example 12.1
(i) $\int (3 - 4x)^{1/5}\,dx$
(ii) $\int \frac{x}{x^2 + 4}\,dx$
(iii) $\int e^{\sin x} \cos x\,dx$
3 H Eves, An Introduction to the History of Mathematics
12.3.2 Integrals re-expressed in partial fraction form
Example 12.2
(i) $\int \frac{1}{(x - 1)(x + 1)}\,dx = \frac{1}{2} \int \left( \frac{1}{x - 1} - \frac{1}{x + 1} \right) dx$
(ii) $\int \frac{1}{(x - 1)(x^2 + 1)}\,dx$
12.3.3 Integration by parts
This is an application of the product rule
\[ (fg)' = f'g + fg' \]
for differentiation. Given the integral $\int f g'$, we rearrange it as follows:
\[ \int f g' = \int \left( (fg)' - f'g \right) = fg - \int f'g. \]
In Leibniz notation:
\[ \int u \frac{dv}{dx}\,dx = uv - \int v \frac{du}{dx}\,dx. \]
Typically, this method is used on integrals of the following forms, where $n \in \mathbb{N}$:
\[ \int x^n e^{ax}\,dx, \quad \int x^n \cos(ax)\,dx, \quad \int x^n \sin(ax)\,dx, \]
\[ \int e^{ax} \cos(bx)\,dx, \quad \int e^{ax} \sin(bx)\,dx. \]
Essentially we are trying to rearrange the integral in such a way as to end up with an easier integral than we began with. This usually means that we have to choose u and $\frac{dv}{dx}$ carefully. The method is also used to evaluate
\[ \int x^\alpha \ln x\,dx, \quad (\alpha \ne -1), \]
including $\int \ln x\,dx$ and $\int \sin^{-1} x\,dx$, and also some integrals which can be solved using alternative methods.
Example 12.3
(i) $\int x^2 e^{-5x}\,dx$
(ii) $\int x \cos(2x)\,dx$
(iii) $\int e^{-x} \sin x\,dx$
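12.3.4 $\int x^\alpha \ln x\,dx$
Let $u = \ln x$, $\frac{dv}{dx} = x^\alpha$, $\frac{du}{dx} = \frac{1}{x}$, $v = \frac{x^{\alpha+1}}{\alpha+1}$, $\alpha \ne -1$, and the integral becomes
\[ \frac{x^{\alpha+1}}{\alpha+1} \cdot \ln x - \int \frac{x^{\alpha+1}}{x(\alpha+1)}\,dx = \frac{1}{1+\alpha} \left( x^{\alpha+1} \ln x - \int x^\alpha\,dx \right) = \frac{1}{1+\alpha} \left( x^{\alpha+1} \ln x - \frac{x^{\alpha+1}}{\alpha+1} \right) + C, \quad \text{for all } \alpha \ne -1. \]
Now consider the case α = −1:
\[ \int \frac{\ln x}{x}\,dx. \]
Let $u = \ln x$, then $\frac{du}{dx} = \frac{1}{x}$, so we use substitution, not parts, and the integral is
\[ \int u\,du = \tfrac{1}{2} u^2 + C = \tfrac{1}{2} (\ln x)^2 + C. \]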
Surprisingly, integration by parts still works for the case α = 0: $\int \ln x\,dx$.
Let $u = \ln x$, $\frac{dv}{dx} = 1$, $\frac{du}{dx} = \frac{1}{x}$, $v = x$, and
\[ \int \ln x\,dx = x \ln x - \int dx = x \ln x - x + C. \]
The integral $\int \sin^{-1} x\,dx$ is similar.
If $u = \sin^{-1} x$, $\frac{dv}{dx} = 1$, $\frac{du}{dx} = \frac{1}{\sqrt{1 - x^2}}$, $v = x$, and
\[ \int \sin^{-1} x\,dx = x \sin^{-1} x - \int \frac{x}{\sqrt{1 - x^2}}\,dx = x \sin^{-1} x + \sqrt{1 - x^2} + C, \]
and so on.
12.4 Definite integration and the Riemann integral
12.4.1 The Newton–Leibniz approach
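In elementary work, we use geometric arguments to establish the existence of the definite integral as the limit of a sum.
[Figure: the graph of y = f(x) between x = a and x = b, with the rectangles used to approximate the area.]
Divide the interval $[a, b]$ into $n$ equal subintervals of width $h = \frac{b-a}{n}$.
Sum the areas of the rectangles indicated above:
$$\begin{aligned}
\text{Total area } S_n &= h\,\{f(a+h) + f(a+2h) + \cdots + f(a+ih) + \cdots + f(a+nh)\} \\
&= h\sum_{i=1}^{n} f(a+ih) \\
&= \frac{(b-a)}{n}\sum_{i=1}^{n} f\!\left(a + i\,\frac{(b-a)}{n}\right).
\end{aligned}$$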
Assuming that it exists, take the limit of Sn as n → ∞, that is, limn→∞ Sn = S.
Write this as
$$\int_a^b f(x)\,dx,$$
which represents the 'area under the graph' of $y = f(x)$ from $x = a$ to $x = b$.
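The sum $S_n$ is easy to evaluate by machine. The sketch below assumes $f(x) = x$ on $[1, 3]$ (values chosen only for illustration) and uses a hypothetical helper `right_endpoint_sum`; as $n$ grows the printed values should approach $\frac12(b^2 - a^2) = 4$.

```python
def right_endpoint_sum(f, a, b, n):
    """S_n = h * sum of f(a + i*h) for i = 1..n, with h = (b - a)/n."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(1, n + 1))

a, b = 1.0, 3.0  # illustrative interval
for n in (10, 100, 1000):
    print(n, right_endpoint_sum(lambda x: x, a, b, n))
```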
Example 12.4 Let $f(x) = x$, then
$$S_n = \frac{(b-a)}{n}\sum_{i=1}^{n}\left(a + i\,\frac{(b-a)}{n}\right) = \frac{(b-a)}{n}\left\{na + \frac{(b-a)}{n}\sum_{i=1}^{n} i\right\}.$$
Now $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ (previously proved by mathematical induction), so
$$S_n = (b-a)a + \frac{(b-a)^2}{n^2}\cdot\frac{n(n+1)}{2},$$
and as $n \to \infty$,
$$S_n \to (b-a)a + \frac{(b-a)^2}{2} = \tfrac12(b-a)(2a + b - a) = \tfrac12(b^2 - a^2).$$
12.4.2
The Riemann approach
The difficulty with the Newton–Leibniz approach is that we automatically assume that f is a continuous
function. Indeed in the diagram, I’ve also made it look smooth – ie differentiable.
The first rigorous theory of integration, in which there is no mention of graphs or diagrams, was given by
G F B Riemann in about 1854, hence the name Riemann integration. Riemann’s integral is exactly the same as
the ordinary familiar definite integral (due to Newton and Leibniz) when f is continuous.
A necessary condition for the existence of the Riemann integral is that $f$ should be bounded on the interval $[a, b]$, that is,
$$m \le f(x) \le M, \qquad \text{for } x \in [a, b].$$
To define the integral, and to find how to perform the limiting process, we now subdivide $[a, b]$ into $n$ parts, not necessarily of the same width, so that
$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b.$$
We call the length of the general interval $[x_i, x_{i+1}]$ $\Delta x_i$, and choose any number $t_i$ inside the interval. We then form the sum
$$S_n = f(t_0)\Delta x_0 + f(t_1)\Delta x_1 + \cdots + f(t_{n-1})\Delta x_{n-1} = \sum_{i=0}^{n-1} f(t_i)\Delta x_i.$$
Sn obviously depends on the function f , on the form of the subdivision, and on the choice of ti .
If, as $n \to \infty$ and simultaneously the largest of the subinterval lengths $\Delta x_i \to 0$, $S_n$ tends to a limit $S$ which is the same for every choice of subdivision and of the $t_i$, then $f$ is said to be Riemann-integrable. It can be shown that every continuous function is Riemann-integrable.
So in addition to continuous functions, this definition of integration also allows us to deal with functions which are not differentiable, and those which are piecewise continuous (functions which have a finite number of finite discontinuities).
Example 12.5
(i) $f(x) = |x|$ is not differentiable at $x = 0$, but the integral $\int_{-1}^{4} |x|\,dx$ exists.
(ii) $f(x) = \operatorname{sgn} x = \begin{cases} -1, & x < 0 \\ 1, & x > 0 \end{cases}$ is not continuous at $x = 0$, but $\int_{-2}^{1} \operatorname{sgn} x\,dx$ exists.
(iii) $f(x) = \frac{1}{x}$. $\int_0^1 \frac{1}{x}\,dx$ does not exist because $f$ does not exist at $x = 0$, i.e. $f$ is not bounded on $[0, 1]$.
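To see Riemann's definition at work on cases (i) and (ii), one can form $\sum f(t_i)\Delta x_i$ over an uneven subdivision with arbitrarily chosen $t_i$. This is only a sketch: the helpers `riemann_sum` and `random_partition`, the fixed random seed and the number of subdivision points are illustrative assumptions. The sums should be close to $8.5$ and $-1$ respectively.

```python
import random

def riemann_sum(f, points, rng):
    """Sum f(t_i) * dx_i with t_i picked at random in [x_i, x_{i+1}]."""
    total = 0.0
    for left, right in zip(points, points[1:]):
        t = rng.uniform(left, right)
        total += f(t) * (right - left)
    return total

def random_partition(a, b, n, rng):
    """A subdivision a = x_0 < ... < x_n = b with unequal widths."""
    inner = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + inner + [b]

def sgn(x):
    return (x > 0) - (x < 0)

rng = random.Random(0)  # fixed seed so the run is repeatable
abs_sum = riemann_sum(abs, random_partition(-1.0, 4.0, 20000, rng), rng)
sgn_sum = riemann_sum(sgn, random_partition(-2.0, 1.0, 20000, rng), rng)
print(abs_sum, sgn_sum)
```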
12.4.3
Some properties of definite integrals
(i) The linearity properties as for indefinite integrals (see p 36).
(ii) $\int_a^a f\,dx = 0$.
(iii) $\int_b^a f\,dx = -\int_a^b f\,dx$.
(iv) If $f \ge g$ throughout $[a, b]$, then $\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx$.
Special case: If $f \ge 0$ throughout $[a, b]$, then $\int_a^b f(x)\,dx \ge 0$.
(v) If $f$ is Riemann-integrable on $(a, b)$ and $m \le f(x) \le M$ on $[a, b]$, then
$$m(b-a) \le \int_a^b f(x)\,dx \le M(b-a).$$
(vi) If $f$ is Riemann-integrable, then so is $|f|$ and
$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx.$$
12.4.4
The Mean Value Theorem for definite integrals
Theorem 12.6 If $f$ is continuous on $[a, b]$ then there is a number $c \in (a, b)$ such that
$$\int_a^b f(x)\,dx = (b-a)f(c).$$
f (c) is the mean value of f over the interval [a, b].
12.4.5
The Fundamental Theorem of Calculus
This is the theorem which connects integration and differentiation. This discovery by Newton and Leibniz
was quite astonishing at the time it was made, and laid the foundation for the development of the Calculus,
considered by some to be the mathematical discovery that fuelled the scientific revolution for the next 200
years.
Theorem 12.7 (The Fundamental Theorem of Calculus)
(i) If $f$ is Riemann-integrable on $(a, b)$ and $F(x) = \int_a^x f(t)\,dt$, then $F$ is a continuous function of $x$ on $[a, b]$.
(ii) Furthermore, if $f$ is continuous on $[a, b]$, then $F$ is differentiable, and $F' = f$. In this case,
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
Note also that
$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x). \tag{2}$$
Equation (2) may well be the most important equation in calculus.
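Equation (2) can be illustrated numerically by integrating and then differentiating. The sketch below assumes $f = \cos$, $a = 0$ and the point $x = 1.2$ purely for illustration; `integral` is a hypothetical midpoint-rule helper and the derivative is taken by a central difference.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """F(x) = integral of cos(t) from 0 to x, computed numerically."""
    return integral(math.cos, 0.0, x)

x, h = 1.2, 1e-4
derivative = (F(x + h) - F(x - h)) / (2 * h)  # numerical dF/dx
print(derivative, math.cos(x))  # the two values should nearly coincide
```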
12.5
Improper Integrals
12.5.1
Infinite Integrals
We wish to attach meaning to integrals of the form
$$\int_a^\infty f(x)\,dx, \tag{3}$$
$$\int_{-\infty}^b f(x)\,dx, \tag{4}$$
$$\int_{-\infty}^\infty f(x)\,dx. \tag{5}$$
Considering type (3), if $f$ is Riemann-integrable on $(a, b)$ for all $b > a$, then we define the infinite integral $\int_a^\infty f(x)\,dx$ to be the following limit, if it exists:
$$\int_a^\infty f(x)\,dx = \lim_{b\to\infty}\int_a^b f(x)\,dx.$$
If the limit exists and is equal to $\ell$, say, we say that the infinite integral converges to $\ell$.
If the limit does not exist, we say that the integral diverges.
There are various tests, not covered here, which establish convergence without actually finding the limit. We shall not adopt this approach; instead we work out limiting values for some infinite integrals.
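Example 12.8
(i) $f(x) = \lambda e^{-\lambda x}$, where $x \ge 0$ and $\lambda > 0$. This is the exponential distribution.
Show that the area under the graph of $f$ over $[0, \infty)$ is 1.
Suppose that $b > 0$.
$$\int_0^b \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^b = 1 - e^{-\lambda b}.$$
As $b \to \infty$, $e^{-\lambda b} \to 0$, so the integral is convergent to 1.
(ii) $\int_0^\infty \frac{1}{1+x^2}\,dx$.
(iii) $\int_1^\infty \frac{1}{x^\alpha}\,dx$. Consider
$$I = \int_1^b \frac{1}{x^\alpha}\,dx = \left[\frac{x^{-\alpha+1}}{1-\alpha}\right]_1^b = \frac{1}{(1-\alpha)}\left(b^{1-\alpha} - 1\right),$$
where $b > 1$ and $\alpha \neq 1$. Now let $b \to \infty$.
If $\alpha > 1$, then $b^{1-\alpha} \to 0$ and the integral converges to $\frac{1}{\alpha-1}$.
If $\alpha < 1$, then $b^{1-\alpha} \to \infty$ and the integral diverges.
If $\alpha = 1$, the integral is $\int_1^b \frac{1}{x}\,dx = [\ln x]_1^b = \ln b - \ln 1 = \ln b$.
This tends to infinity as $b \to \infty$, so the integral diverges.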
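The three cases in (iii) can be observed by tabulating the closed form of $\int_1^b x^{-\alpha}\,dx$ for increasing $b$. This is a small sketch; the function name `integral_to_b` and the sample values of $\alpha$ and $b$ are assumptions for illustration.

```python
import math

def integral_to_b(alpha, b):
    """Closed form of the integral of x**(-alpha) from 1 to b (b > 1)."""
    if alpha == 1:
        return math.log(b)
    return (b ** (1 - alpha) - 1) / (1 - alpha)

for alpha in (2.0, 1.0, 0.5):  # one value from each case
    values = [integral_to_b(alpha, 10.0 ** k) for k in (1, 3, 6)]
    print(alpha, values)
```

For $\alpha = 2$ the values settle towards $\frac{1}{\alpha - 1} = 1$, while for $\alpha = 1$ and $\alpha = 0.5$ they keep growing.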
Type (4) $\int_{-\infty}^b f(x)\,dx$ is similar to type (3) above. We define
$$\int_{-\infty}^b f(x)\,dx = \lim_{a\to-\infty}\int_a^b f(x)\,dx$$
when the limit exists.
Example 12.9 $\int_{-\infty}^0 e^x\,dx$ converges to 1 and $\int_{-\infty}^0 xe^x\,dx$ converges to $-1$.
For type (5) we define $\int_{-\infty}^\infty f(x)\,dx$ by
$$\int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^c f(x)\,dx + \int_c^\infty f(x)\,dx$$
for some real number $c$. So the integral on the left only exists when both the integrals on the right exist. ($c$ is frequently taken to be 0, but this isn't necessary.)
Important note The definition is not
$$\lim_{b\to\infty}\int_{-b}^b f(x)\,dx.$$
Example 12.10 The limit
$$\lim_{a\to-\infty}\int_a^c \sin x\,dx$$
does not exist, and neither does
$$\lim_{b\to\infty}\int_c^b \sin x\,dx,$$
but
$$\lim_{b\to\infty}\int_{-b}^b \sin x\,dx = 0,$$
because $\sin x$ is an odd function.
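The contrast in Example 12.10 shows up immediately in numbers, using $\int_a^b \sin x\,dx = \cos a - \cos b$. In this sketch the helper name `integral_sin`, the choice $c = 0$ and the sample values of $b$ are illustrative assumptions.

```python
import math

def integral_sin(a, b):
    """Exact value of the integral of sin x from a to b."""
    return math.cos(a) - math.cos(b)

bs = (10.0, 100.0, 1000.0)
one_sided = [integral_sin(0.0, b) for b in bs]  # keeps oscillating
symmetric = [integral_sin(-b, b) for b in bs]   # cancels by oddness
print(one_sided)
print(symmetric)
```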
Example 12.11 $\int_{-\infty}^\infty \frac{1}{1+x^2}\,dx$.
Example 12.12 $\int_{-\infty}^\infty xe^{-cx^2}\,dx$, for $c > 0$.
12.5.2
Other improper integrals
We wish to evaluate $\int_a^b f(x)\,dx$, if possible, when $f$ becomes unbounded somewhere in $[a, b]$.
(i) Suppose $f$ becomes infinite at $x = a$, otherwise $f$ is well-behaved.
We define the integral by
$$\int_a^b f(x)\,dx = \lim_{h\to0}\int_{a+h}^b f(x)\,dx, \qquad h > 0,$$
if the limit exists.
(ii) Suppose $f$ becomes infinite at $x = b$.
This is similar to (i).
$$\int_a^b f(x)\,dx = \lim_{h\to0}\int_a^{b-h} f(x)\,dx, \qquad h > 0,$$
if the limit exists.
(iii) Suppose $f$ becomes infinite at $x = c \in (a, b)$.
We define the integral by
$$\int_a^b f(x)\,dx = \lim_{h\to0}\int_a^{c-h} f(x)\,dx + \lim_{k\to0}\int_{c+k}^b f(x)\,dx, \qquad h, k > 0,$$
provided both limits exist. This result can be extended to functions which have a finite number of infinite
discontinuities.
Example 12.13
(i) $\int_0^1 \frac{1}{\sqrt{1-x^2}}\,dx$.
(ii) $\int_0^1 x^{-\alpha}\,dx$.
12.6
Functions defined by integrals
It is important to realise that there are a large number of integrals which cannot be expressed in terms of
elementary functions, and so are frequently used to define new functions. They are computed numerically.
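Examples are
$$\int e^{-x^2}\,dx, \quad \int \frac{e^x}{x}\,dx, \quad \int \frac{\sin x}{x}\,dx, \quad \int \frac{1}{\ln x}\,dx, \quad \int x^x\,dx, \quad \int x^{-x}\,dx, \quad \int \sqrt{\sin x}\,dx.$$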
Example 12.14
(i) The sine-integral function
$$\operatorname{Si}(x) = \int_0^x f(t)\,dt, \qquad \text{where } f(t) = \begin{cases} \dfrac{\sin t}{t}, & t \neq 0 \\ 1, & t = 0 \end{cases}$$
(ii) Euler's gamma function is defined by the improper integral
$$\Gamma(x) = \int_0^\infty t^{x-1}e^{-t}\,dt.$$
In particular, $\Gamma(n+1) = n!$ for $n \in \mathbb{N}$.
(iii) The error function
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt,$$
and there are many more (such as the elliptic functions $F$, $E$ and $\Pi$).
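Functions of this kind are computed numerically, and Python's standard `math` module happens to provide `math.gamma` and `math.erf`. The sketch below compares them with the definitions above; `simpson` and `erf_numeric` are hypothetical helper names introduced only for this check.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def erf_numeric(x):
    """erf(x) computed directly from its defining integral."""
    return 2 / math.sqrt(math.pi) * simpson(lambda t: math.exp(-t * t), 0.0, x)

print(erf_numeric(1.0), math.erf(1.0))
# Gamma(n + 1) alongside n! for n = 1..5
print([(math.gamma(n + 1), math.factorial(n)) for n in range(1, 6)])
```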
It is an astonishing fact that improper integrals $\int_0^\infty f(t)\,dt$ can often be calculated where ordinary integrals $\int_a^b f(t)\,dt$ cannot. The following example makes use of techniques which are beyond the scope of this course.
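Example 12.15 There is no elementary formula for $\int_a^b e^{-t^2}\,dt$, but the value of $\int_0^\infty e^{-t^2}\,dt$ can be calculated precisely. The usual technique involves squaring to get the double integral
$$\left(\int_0^\infty e^{-t^2}\,dt\right)^2 = \int_0^\infty\!\!\int_0^\infty e^{-x^2}e^{-y^2}\,dx\,dy = \int_0^\infty\!\!\int_0^\infty e^{-(x^2+y^2)}\,dx\,dy,$$
and then changing to polar coordinates to get
$$\int_0^{\pi/2}\!\!\int_0^\infty e^{-r^2}\,r\,dr\,d\theta = \int_0^{\pi/2}\left[-\tfrac12 e^{-r^2}\right]_0^\infty d\theta = \frac{\pi}{4},$$
and hence $\int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}$.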
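The value $\frac{\sqrt{\pi}}{2}$ can be confirmed numerically. In this sketch the infinite range is truncated at $t = 10$, an assumption justified by the fact that $e^{-t^2}$ is negligible beyond that point; `simpson` is a hypothetical helper.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Truncate the infinite range at t = 10 (assumed large enough).
approx = simpson(lambda t: math.exp(-t * t), 0.0, 10.0)
print(approx, math.sqrt(math.pi) / 2)
```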
Example 12.16 The normal or Gaussian density function is defined by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \qquad \text{for } x \in (-\infty, \infty).$$
Verify that f does indeed represent a probability density function, and find the mean and variance of the
distribution.
13
First-order Differential Equations
When we solve algebraic, or trigonometric equations, the unknown quantity is a variable. In contrast, the
solution(s) of differential equations are unknown functions.
(i) An ordinary differential equation is an equation containing the derivatives of a function of a single
variable.
(ii) A partial differential equation is one where the unknown function is a function of several variables, so
the equation contains partial derivatives.
(iii) The order of a differential equation is the order of the highest order derivative within the equation.
(iv) A differential equation is said to be linear if it does not contain products, quotients, or powers of the
derivatives, or of the unknown function.
We will use a variety of notations. We should choose the one which is most appropriate to the context, or to
the situation being modelled. For example:
$\frac{dy}{dx} = f(x, y)$ or $y' = f(x, y)$: solve for $y$
$\frac{dx}{dt} = g(t, x)$ or $\dot{x} = g(t, x)$: solve for $x$, etc.
The dot in $\dot{x}$ means 'differentiate with respect to time $t$'.
13.1
Qualitative approach: ‘knowing the direction and finding the path’
Most differential equations do not have solutions that can be found in a neat closed form, so instead, numerical
techniques have to be used. Even when analytical solutions can be found, it is sometimes useful to have a
qualitative picture of the behaviour of possible solutions. Such an approach is illustrated in the following
example.
Example 13.1 $\frac{dy}{dx} = x + y$.
This can be solved analytically (see later), but we can obtain a qualitative picture of the behaviour of $y$ as follows.
We can obtain a direction diagram or gradient field, because at each point of the $x, y$ plane, we know that the gradient $\frac{dy}{dx}$ is equal to $x + y$. For example, at $(1, 2)$, the gradient is 3, at $(-1, 0)$ the gradient is $-1$, and so on.
We represent the gradient at the point (x, y) by a short line segment drawn at the point. When the picture is
complete, we have obtained our gradient field.
The line segments give us a means of drawing curves that follow the direction of the gradient field, because we
know that they are tangential to the solution curves or integral curves y = y(x). In applied subjects they are
also called trajectories or time paths. Generally, there will be an infinite number of them.
Through each point (x, y) will pass one and only one solution curve, so we can determine the precise solution
of our differential equation provided we know a single point on it. In particular, if we know the value of y
when x = 0, we can find the appropriate integral curve. Such a condition is called an initial condition. The
problem of solving the differential equation with a given initial condition is called an initial value problem.
See Figure 13.1 for solution curves corresponding to two sets of initial conditions.
13.2
Euler’s method
Suppose that we wish to solve $\frac{dy}{dx} = f(x, y)$, subject to the initial condition $y = y_0$ when $x = x_0$.
Let $x_1 = x_0 + h$ be a nearby value of $x$, where $h = x_1 - x_0$ is the stepsize, and let $y_1$ be the value of $y$ at $x_1$ on the solution curve through $(x_0, y_0)$. Then we can use the differential equation to find an approximate value for $y_1$. Using the definition of a derivative,
$$\frac{y_1 - y_0}{h} \approx f(x_0, y_0), \qquad \text{or, approximately,} \qquad y_1 = y_0 + hf(x_0, y_0).$$
Repetition of this result gives a table of approximate values of y. This formula can be refined in several ways
to give greater accuracy.
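A minimal sketch of the repeated step follows. It assumes $f(x, y) = x + y$ with $y = 0$ at $x = 0$, for which the exact solution is $y = e^x - 1 - x$; the function name `euler` and the step sizes are illustrative choices, not part of the notes.

```python
import math

def euler(f, x0, y0, h, steps):
    """Return the list of (x, y) points produced by Euler's method."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        y = y + h * f(x, y)  # y_{k+1} = y_k + h f(x_k, y_k)
        x = x + h
        points.append((x, y))
    return points

f = lambda x, y: x + y
exact_at_1 = math.exp(1.0) - 2.0  # exact y(1) for this initial condition
for h, steps in ((0.1, 10), (0.01, 100)):
    x_end, y_end = euler(f, 0.0, 0.0, h, steps)[-1]
    print(h, y_end, exact_at_1)
```

Reducing the stepsize from 0.1 to 0.01 brings the approximation at $x = 1$ noticeably closer to the exact value.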
Example 13.2 $\frac{dy}{dx} = xy$. See Figure 13.2.
[Figure 13.1: Two solution curves for the equation $\frac{dy}{dx} = x + y$, for initial values $y = 3.5$, $x = -5.0$ (left) and $y = 4.1$, $x = -5.0$ (right).]
[Figure 13.2: Two solution curves for the equation $\frac{dy}{dx} = xy$, for initial values $y = 2$, $x = -2$ (left) and $y = -3$, $x = -2$ (right).]
13.3
Linear equations: the integrating factor method
These are equations that are of (or can be rearranged into) the form
$$\frac{dy}{dt} + a(t)y = b(t),$$
where $a(t)$ and $b(t)$ are continuous functions on some interval of the real numbers.
Example 13.3 $t^2\frac{dy}{dt} + 2ty = b(t)$ for $t \neq 0$. Consider the left hand side and recall the product rule for differentiation:
$$\frac{d}{dt}(uv) = u\frac{dv}{dt} + v\frac{du}{dt}.$$
We see that the left hand side of the differential equation may be expressed as $\frac{d}{dt}(t^2y)$.
If we now integrate both sides of the equation with respect to $t$,
$$t^2y = \int b(t)\,dt + C, \tag{6}$$
$$y = \frac{1}{t^2}\left[\int b(t)\,dt + C\right].$$
Note that the arbitrary constant must be inserted at stage (6), not at the end.
Usually, the terms on the left hand side do not combine together as simply as in the preceding example.
The trick is to multiply through by an adjusting function to put it into the required form. This is called an
integrating factor.
13.3.1
Calculating the integrating factor
(i) We start with an equation in the form $\frac{dy}{dt} + a(t)y = b(t)$.
(ii) Evaluate $\int a(t)\,dt$. No arbitrary constant is needed at this stage.
(iii) Determine $u(t) = \exp\left(\int a(t)\,dt\right)$. This is the integrating factor.
(iv) Multiply the differential equation throughout by $u(t)$; it can then be rearranged to $\frac{d}{dt}\big(u(t)y\big) = u(t)b(t)$.
(v) Integrate: $u(t)y = \int u(t)b(t)\,dt + C$, so $y = \frac{1}{u(t)}\left[\int u(t)b(t)\,dt + C\right]$.
13.3.2
Verification
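If
$$u(t) = \exp\left(\int a(t)\,dt\right),$$
then
$$\frac{du}{dt} = a(t)\exp\left(\int a(t)\,dt\right) = a(t)u(t).$$
So
$$\frac{d}{dt}\big(u(t)y\big) = u(t)b(t)$$
becomes
$$u(t)\frac{dy}{dt} + a(t)u(t)y = u(t)b(t),$$
or, since $u(t) \neq 0$,
$$\frac{dy}{dt} + a(t)y = b(t).$$
Example 13.4 $\frac{dx}{dt} = xt$.
Example 13.5 $\frac{dx}{dt} - 2tx = t$.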
The integrating factor is
$$u(t) = \exp\left(\int -2t\,dt\right) = e^{-t^2}.$$
The differential equation then becomes
$$\frac{d}{dt}\left(e^{-t^2}x\right) = te^{-t^2}.$$
Integrating both sides of this equation gives
$$e^{-t^2}x = \int te^{-t^2}\,dt + C = -\tfrac12 e^{-t^2} + C,$$
hence
$$x = Ce^{t^2} - \tfrac12.$$
This is the general solution of the differential equation.
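A general solution found by the integrating factor method can be checked by substituting it back into the equation. The sketch below does this numerically for $x = Ce^{t^2} - \frac12$ and the equation $\frac{dx}{dt} - 2tx = t$; the helper name `residual`, the central-difference derivative and the sample values of $C$ and $t$ are assumptions for illustration.

```python
import math

def x(t, C):
    """Candidate general solution x = C*exp(t^2) - 1/2."""
    return C * math.exp(t * t) - 0.5

def residual(t, C, h=1e-5):
    """dx/dt - 2*t*x - t, with dx/dt from a central difference."""
    dxdt = (x(t + h, C) - x(t - h, C)) / (2 * h)
    return dxdt - 2 * t * x(t, C) - t

for C in (-1.0, 0.0, 2.0):
    # each residual should be zero up to rounding and truncation error
    print(C, [residual(t, C) for t in (-1.0, 0.3, 1.5)])
```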
Example 13.6 Find the general solution of
$$\frac{dx}{dt} = \frac{x}{t} + t.$$
Example 13.7 Consider the differential equation in Example 13.1. Rearrange it to the standard linear form
$$\frac{dy}{dx} - y = x.$$
The integrating factor is $u(x) = \exp\left(-\int dx\right) = e^{-x}$, so the differential equation becomes $\frac{d}{dx}\left(e^{-x}y\right) = xe^{-x}$.
Integrating, we get $e^{-x}y = \int xe^{-x}\,dx = -xe^{-x} - e^{-x} + C = -e^{-x}(1+x) + C$, so that $y = -(1+x) + Ce^x$. This is the general solution.
Now consider various solution curves, and compare with the diagrams obtained earlier:
The solution curve through $(0, 0)$ is given by setting $C = 1$, yielding the solution $y = -(1+x) + e^x$.
The solution curve through $(-2, 0)$ is given by setting $C = -e^2$, which gives the solution $y = -(1+x) - e^{x+2}$.
As $x \to -\infty$, $y \to -(1+x)$ or $(x+y) \to -1$.
Example 13.8 (A special case) As in the previous example, a linear differential equation which is frequently
encountered in applications is
$$\frac{dy}{dt} + ay = b,$$
where $a$ and $b$ are constants. This can be solved in several ways as we shall see later, but for now we'll use the integrating factor method.
The integrating factor is $\exp\left(\int a\,dt\right) = e^{at}$. Thus
$$\frac{d}{dt}\left(e^{at}y\right) = be^{at},$$
$$e^{at}y = \frac{b}{a}e^{at} + C,$$
$$y = Ce^{-at} + \frac{b}{a}.$$
13.4
Nonlinear equations of 'separable' type
Suppose $\frac{dy}{dx} = f(x)g(y)$ where $f$ is some function of just $x$, and $g$ is some function of just $y$. Then we can write this equation in the form
$$\frac{1}{g(y)}\frac{dy}{dx} = f(x)$$
provided that $g(y) \neq 0$. Integrating both sides of this equation with respect to $x$ gives
$$\int \frac{1}{g(y)}\frac{dy}{dx}\,dx = \int f(x)\,dx + C,$$
or
$$\int \frac{1}{g(y)}\,dy = \int f(x)\,dx + C.$$
Note that the arbitrary constant need only be added to one side of the equation.
If possible, evaluate both integrals, and again if possible express the result in the explicit form $y = F(x)$. In this case, $F(x)$ is the general solution of the differential equation.
In addition, we have to consider separately what happens when $g(y) = 0$. In this case, $\frac{dy}{dx} \equiv 0$ and there will be solutions $y = \text{constant}$ for each value of $y$ for which $g(y) = 0$.
If we know any one point on the time path, any arbitrary constant may be determined.
Example 13.9 $\frac{dx}{dt} = xt$. Rearranging this gives
$$\int \frac{1}{x}\,dx = \int t\,dt.$$
Assume first that $x > 0$, then $\ln x = \tfrac12 t^2 + C$, so that $x = \exp\left(\tfrac12 t^2 + C\right)$, which can be written $x = Ae^{\frac12 t^2}$ for some $A > 0$.
Alternatively, if $x < 0$, then the solution is $\ln(-x) = \tfrac12 t^2 + D$. This leads to $x = -Be^{\frac12 t^2}$ for some $B > 0$.
Both cases are covered by the general solution $x = A\exp\left(\tfrac12 t^2\right)$; whether $A > 0$ or $A < 0$ depends on the initial value.
Note We have already solved this equation using the integrating factor method.
Example 13.10 Consider the equation $\frac{dx}{dt} = -2x^2t$. Find the solution curve through the point $(t, x) = (0, -\tfrac12)$.
What happens if we change the initial value of $x$?
Example 13.11 We look at the differential equation $\frac{dy}{dt} + ay = b$ (where $a$ and $b$ are constants) again. Separating the variables gives
$$\int \frac{1}{ay - b}\,dy = -\int dt.$$
Assuming that $ay > b$, we integrate to give $\frac{1}{a}\ln(ay - b) = -t + C$, that is,
$$ay - b = e^{-at+K} = Ae^{-at},$$
where $K = aC$ and $A = e^K > 0$. If we assume $ay < b$ we obtain a similar result.
As before, in both cases the general solution may be rewritten in the form $y = \frac{b}{a} + Ce^{-at}$, where $C$ is an arbitrary constant.
Example 13.12 We consider the following two equations from economics:
$$D(P) = a - bP \qquad \text{(the demand equation)}$$
$$S(P) = \alpha + \beta P \qquad \text{(the supply equation)}$$
All constants $a$, $b$, $\alpha$ and $\beta$ are positive.
Assume $P$ changes continuously with time; that is, $P = P(t)$. The model says that $\dot{P}$ is proportional to excess demand:
$$\frac{dP}{dt} = k\big(D(P) - S(P)\big)$$
for some $k > 0$. Then $\frac{dP}{dt} + k(b + \beta)P = k(a - \alpha)$.
This is of the form of Example 13.8.
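Applying the solution $y = Ce^{-at} + \frac{b}{a}$ of Example 13.8, with $k(b+\beta)$ in place of $a$ and $k(a-\alpha)$ in place of $b$, gives $P(t) = P^* + (P_0 - P^*)e^{-k(b+\beta)t}$, where $P^* = \frac{a-\alpha}{b+\beta}$ and $P_0 = P(0)$. The sketch below evaluates this; the symbols $P^*$ and $P_0$, the function name `price` and all numerical parameter values are assumptions introduced for illustration.

```python
import math

def price(t, P0, a, b, alpha, beta, k):
    """P(t) solving dP/dt + k(b+beta)P = k(a-alpha) with P(0) = P0."""
    rate = k * (b + beta)
    equilibrium = (a - alpha) / (b + beta)
    return equilibrium + (P0 - equilibrium) * math.exp(-rate * t)

# Illustrative parameter values only (not taken from the notes).
params = dict(a=10.0, b=2.0, alpha=1.0, beta=1.0, k=0.5)
print([round(price(t, 5.0, **params), 4) for t in (0, 1, 5, 20)])
```

With these values the price starts at $P_0 = 5$ and decays towards the equilibrium price $P^* = 3$.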
Example 13.13 Consider the (discrete) compound interest equation
$$W_t = (1 + r)W_{t-1} + Y_t - C_t$$
where $W_t$ is the size of the account (wealth), $r$ is the interest rate, $Y_t$ the amount deposited, and $C_t$ the amount withdrawn.
The continuous-time analogue of this is
$$\frac{dW}{dt} = rW(t) + Y(t) - C(t),$$
where r is constant.
14
Second-order Differential Equations
In general, these are of the form
$$\frac{d^2y}{dx^2} = F\left(x, y, \frac{dy}{dx}\right) \qquad \text{or} \qquad \ddot{x} = F(t, x, \dot{x}).$$
Example 14.1 (Legendre polynomials) The linear equation with variable coefficients
$$\frac{d}{dx}\left[(1 - x^2)\frac{d}{dx}P(x)\right] + n(n+1)P(x) = 0,$$
originally studied by Adrien-Marie Legendre (1752–1833), gives rise to a family of functions $P_n(x)$, called Legendre polynomials, the first few of which are:
$P_0(x) = 1$
$P_1(x) = x$
$P_2(x) = \tfrac12(3x^2 - 1)$
$P_3(x) = \tfrac12(5x^3 - 3x)$
$P_4(x) = \tfrac18(35x^4 - 30x^2 + 3)$
$P_5(x) = \tfrac18(63x^5 - 70x^3 + 15x)$
(Check that these satisfy Legendre’s equation.)
Example 14.2 (Bessel functions) The linear equation with variable coefficients
$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - \alpha^2)y = 0$$
was originally studied by Daniel Bernoulli (1700–1782) and Friedrich Bessel (1784–1846) and, in general, has no solutions in terms of elementary functions. Instead, we define a new class of functions (Bessel functions) to be the solutions of this equation.
We shall only consider a restricted subset of second-order differential equations.
14.1
Directly-integrable equations
First, we look at equations that can be integrated directly.
Example 14.3 The equation $\frac{d^2x}{dt^2} = k$, where $k$ is some given constant, can be integrated directly.
Integrating once gives $\frac{dx}{dt} = kt + C$; integrating again gives $x = \tfrac12 kt^2 + Ct + D$, where $C$ and $D$ are two arbitrary constants.
In order to determine them, we need two pieces of information about $x$. These could be:
(i) Two points through which the solution curve passes. This is called a boundary value problem.
(ii) $x$ and the gradient function $\frac{dx}{dt}$ at the same point. This is called an initial value problem.
(iii) If we know x at one point and the gradient at another, this is called a mixed problem.
Example 14.4 The variable $x$ could be missing explicitly, as in the equation
$$\frac{d^2x}{dt^2} = \frac{dx}{dt} + t \quad (\text{or } \ddot{x} = \dot{x} + t), \qquad x(0) = 1, \quad \dot{x}(0) = 2.$$
The easiest approach is to replace $\frac{dx}{dt}$ by a new variable, $y$ say, then the equation becomes a first order one
$$\dot{y} = y + t, \qquad \text{with } y(0) = 2.$$
We have already solved this equation (albeit with different variables):
$$y = -(1 + t) + Ce^t.$$
When $t = 0$, $y = 2$, so $C = 3$.
Now we have
$$\dot{x} = -(1 + t) + 3e^t, \qquad x(0) = 1.$$
Integrating directly with respect to $t$ gives
$$x = -t - \tfrac12 t^2 + 3e^t + D.$$
When $t = 0$, $x = 1$, so $D = -2$. The solution is $x = -\left(2 + t + \tfrac12 t^2\right) + 3e^t$.
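The solution can be checked against both the equation $\ddot{x} = \dot{x} + t$ and the initial values. This is a sketch using central differences for the derivatives; the helper names `first` and `second` and the step size are illustrative assumptions.

```python
import math

def x(t):
    """Candidate solution x = -(2 + t + t^2/2) + 3*exp(t)."""
    return -(2 + t + 0.5 * t * t) + 3 * math.exp(t)

def first(t, h=1e-4):
    """Central-difference approximation to dx/dt."""
    return (x(t + h) - x(t - h)) / (2 * h)

def second(t, h=1e-4):
    """Central-difference approximation to the second derivative."""
    return (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)

print(x(0.0), first(0.0))  # compare with x(0) = 1 and x'(0) = 2
print([second(t) - first(t) - t for t in (0.0, 0.5, 1.0)])  # near zero
```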
14.2
Linear second-order equations
These are any equations of the form
$$a(t)\frac{d^2y}{dt^2} + b(t)\frac{dy}{dt} + c(t)y = f(t) \tag{7}$$
where $a(t)$, $b(t)$, $c(t)$ and $f(t)$ are continuous functions of $t$ on some interval, and $a(t) \neq 0$.
If $f(t) = 0$, the equation is said to be homogeneous.
If $f(t) \neq 0$, the equation is inhomogeneous or nonhomogeneous.
In short-hand notation, we could write (7) as
$$Ly = f(t) \tag{8}$$
where $L$ is the linear differential operator
$$a(t)\frac{d^2}{dt^2} + b(t)\frac{d}{dt} + c(t).$$
14.2.1
The solution of the related homogeneous equation
We first investigate the solution(s) of
$$a(t)\frac{d^2y}{dt^2} + b(t)\frac{dy}{dt} + c(t)y = 0, \qquad \text{or } Ly = 0. \tag{9}$$
Let $V$ be the set of all solutions of (9) and let $u_1, u_2 \in V$, that is,
$$Lu_1 = 0, \qquad Lu_2 = 0.$$
Adding these gives L(u1 +u2 ) = 0, so (u1 +u2 ) is also a solution of (9).
Similarly, if u is a solution of (9) so that Lu = 0, and k ∈ R then L(ku) = 0 and ku is also a solution.
V is a vector space, consisting of all the vectors (ie functions) which are mapped to the zero function by the
linear map L.
We can combine the two properties above and say that Au1 +Bu2 is a solution of (9) for all choices of the
(real) arbitrary constants A and B.
Is the general solution of (9) always of the form $Au_1 + Bu_2$?
No, only when $u_1$ and $u_2$ span the solution space of (9) and are linearly independent. In other words, they must form a basis for the solution space of (9).
Examples 14.5 Some sets of linearly independent functions:
(i) $1, t, t^2, t^3, \dots$
(ii) $\sin t, \cos t, \sin 2t, \cos 2t, \sin 3t, \dots$
(iii) $e^t, e^{2t}, e^{3t}, \dots$
14.2.2
The general solution of the nonhomogeneous equation
The nonhomogeneous differential equation (7) or (8) has the general solution
$$y = Au_1(t) + Bu_2(t) + v(t), \tag{10}$$
where $A$ and $B$ are arbitrary constants.
$Au_1(t) + Bu_2(t)$ is called the complementary solution and is the general solution of (9), the related homogeneous equation.
$v(t)$ is any particular solution of (7) so that $Lv = f(t)$, provided that $v \notin V$, the solution set of the homogeneous equation (9).
In other words, $u_1$, $u_2$ and $v$ must be linearly independent functions.
Verification Substituting into the differential equation (8), and using the definition of a linear transformation, the left-hand side is
$$L(Au_1 + Bu_2 + v) = AL(u_1) + BL(u_2) + L(v) = 0 + 0 + Lv = f(t) = \text{right-hand side}.$$
14.3
Second-order linear equations with constant coefficients
We now specialise further and assume that a, b and c are constants. We shall continue to assume that f
depends on t. (Compare this work with the discrete case considered in Mathematical Techniques B).
14.3.1
The complementary solution
We solve the homogeneous differential equation
$$a\frac{d^2y}{dt^2} + b\frac{dy}{dt} + cy = 0. \tag{11}$$
If we try the solution $y = e^{mt}$, then $\frac{dy}{dt} = me^{mt}$ and $\frac{d^2y}{dt^2} = m^2e^{mt}$.
Substitute into the homogeneous differential equation (11):
$$am^2e^{mt} + bme^{mt} + ce^{mt} = 0,$$
$$e^{mt}\left(am^2 + bm + c\right) = 0.$$
Now $e^{mt} \neq 0$, so $am^2 + bm + c = 0$.
This is called an auxiliary or characteristic equation, which has solutions
$$m_1, m_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
Clearly three cases arise:
Case 1 ($m_1$, $m_2$ are real and distinct, i.e. $b^2 > 4ac$)
$$u_1 = e^{m_1t} \qquad \text{and} \qquad u_2 = e^{m_2t}.$$
These are linearly independent functions, so the complementary solution is a linear combination of them, viz.
$$Ae^{m_1t} + Be^{m_2t},$$
where $A$ and $B$ are arbitrary constants.
Case 2 ($m_1 = m_2 = m$ are real and equal, i.e. $b^2 = 4ac$)
$$u_1 = e^{mt}.$$
Clearly the second solution $u_2$ cannot also be $e^{mt}$, because $u_1$ and $u_2$ must be linearly independent.
The second solution is $u_2 = te^{mt}$. This can be verified by substitution into the differential equation. The complementary solution is
$$Ae^{mt} + Bte^{mt} \qquad \text{or} \qquad e^{mt}(A + Bt),$$
where $A$ and $B$ are again arbitrary.
Case 3 ($m_1, m_2 = \alpha \pm i\beta$ complex conjugate, i.e. $b^2 < 4ac$)
$$u_1 = e^{(\alpha+i\beta)t} = e^{\alpha t}e^{i\beta t} = e^{\alpha t}(\cos\beta t + i\sin\beta t),$$
$$u_2 = e^{(\alpha-i\beta)t} = e^{\alpha t}e^{-i\beta t} = e^{\alpha t}(\cos\beta t - i\sin\beta t).$$
$u_1$ and $u_2$ are linearly independent, but they are complex, and are not really suitable choices for solutions to a real problem. Instead we choose two real linearly independent solutions from the complex subspace spanned by $u_1$ and $u_2$, viz. $e^{\alpha t}\cos\beta t$ and $e^{\alpha t}\sin\beta t$.
That is,
$$\frac{u_1 + u_2}{2} \qquad \text{and} \qquad \frac{u_1 - u_2}{2i}.$$
The general solution is thus
$$e^{\alpha t}(A\cos\beta t + B\sin\beta t).$$
Summary
Roots of characteristic equation    General solution of (11)
real and distinct                   $Ae^{m_1t} + Be^{m_2t}$
real and equal                      $e^{mt}(A + Bt)$
complex conjugate                   $e^{\alpha t}(A\cos\beta t + B\sin\beta t)$
Examples 14.6
(i) $\ddot{x} - 4x = 0$
(ii) $\ddot{x} - 4\dot{x} + 4x = 0$
(iii) $\ddot{x} - 6\dot{x} + 13x = 0$
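The three cases can be turned into a small routine that classifies the roots of $am^2 + bm + c = 0$ and writes down the matching form of the complementary solution. This is a sketch only: the function name `complementary_solution` and the output format are assumptions, and it is applied here to the coefficients of the three equations in Examples 14.6.

```python
def complementary_solution(a, b, c):
    """Describe the general solution of a*y'' + b*y' + c*y = 0 (a != 0)."""
    disc = b * b - 4 * a * c
    if disc > 0:  # real and distinct roots
        root = disc ** 0.5
        m1, m2 = (-b + root) / (2 * a), (-b - root) / (2 * a)
        return f"A*exp({m1:g}t) + B*exp({m2:g}t)"
    if disc == 0:  # real and equal roots
        m = -b / (2 * a)
        return f"exp({m:g}t)*(A + B*t)"
    # complex conjugate roots alpha +/- i*beta
    alpha, beta = -b / (2 * a), (-disc) ** 0.5 / (2 * abs(a))
    return f"exp({alpha:g}t)*(A*cos({beta:g}t) + B*sin({beta:g}t))"

for coeffs in ((1, 0, -4), (1, -4, 4), (1, -6, 13)):
    print(coeffs, complementary_solution(*coeffs))
```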
14.3.2
The particular solution
f(t)                      Trial solution
constant                  $C$
$t$                       $Ct + D$
$t^2$                     $Ct^2 + Dt + E$
polynomial in $t$         polynomial of same degree in $t$
$e^{kt}$                  $Ce^{kt}$, provided $k$ is not equal to either $m_1$ or $m_2$
$\sin at$ or $\cos at$    $C\cos at + D\sin at$
Note The trial solutions listed do not always work.
Example 14.7 $\ddot{x} + 5\dot{x} + 6x = f(t)$
Auxiliary equation $m^2 + 5m + 6 = 0$, so $m = -2, -3$.
Complementary solution $Ae^{-2t} + Be^{-3t}$
Particular solution
(i) $f(t) = 4$. Let $x = C$, then $\dot{x} = \ddot{x} = 0$.
Substituting into the differential equation gives $6C = 4$, so $C = \tfrac23$, and hence the general solution is
$$x = \tfrac23 + Ae^{-2t} + Be^{-3t}.$$
(ii) $f(t) = 3e^{2t}$.
Try $x = Ce^{2t}$, then $\dot{x} = 2Ce^{2t}$ and $\ddot{x} = 4Ce^{2t}$.
Substituting into the differential equation gives
$$(4C + 10C + 6C)e^{2t} \equiv 3e^{2t}, \qquad 20C \equiv 3, \qquad C \equiv \tfrac{3}{20}.$$
The general solution is thus $x = \tfrac{3}{20}e^{2t} + Ae^{-2t} + Be^{-3t}$.
(iii) $f(t) = 8e^{-2t}$.
We could try $Ce^{-2t}$, but this will fail because we already have a solution $e^{-2t}$ in the complementary solution, so we need to find another linearly independent solution. Instead we try $x = Cte^{-2t}$.
Then $\dot{x} = C(1 - 2t)e^{-2t}$, $\ddot{x} = C(4t - 4)e^{-2t}$. (Use Leibniz' theorem?)
Substituting into the differential equation gives $C(4t - 4)e^{-2t} + 5C(1 - 2t)e^{-2t} + 6Cte^{-2t} \equiv 8e^{-2t}$, that is, $(-4 + 5)C \equiv 8$, or $C = 8$.
The general solution is thus
$$x = 8te^{-2t} + Ae^{-2t} + Be^{-3t}.$$
In each of the above cases, determine what happens in the long run.
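Each of the three general solutions can be substituted back into $\ddot{x} + 5\dot{x} + 6x = f(t)$ numerically. In this sketch the constants $A$ and $B$, the test point $t = 0.5$ and the helper name `residual` are arbitrary illustrative choices; derivatives are approximated by central differences, so the residuals are only approximately zero.

```python
import math

A, B = 1.5, -0.7  # arbitrary constants, chosen only for the check

def comp(t):
    """Complementary solution A*exp(-2t) + B*exp(-3t)."""
    return A * math.exp(-2 * t) + B * math.exp(-3 * t)

cases = [  # pairs (f, general solution) for parts (i), (ii), (iii)
    (lambda t: 4.0, lambda t: 2 / 3 + comp(t)),
    (lambda t: 3 * math.exp(2 * t), lambda t: 3 / 20 * math.exp(2 * t) + comp(t)),
    (lambda t: 8 * math.exp(-2 * t), lambda t: 8 * t * math.exp(-2 * t) + comp(t)),
]

def residual(f, x, t, h=1e-4):
    """x'' + 5x' + 6x - f(t), with derivatives from central differences."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return d2 + 5 * d1 + 6 * x(t) - f(t)

residuals = [residual(f, x, 0.5) for f, x in cases]
print(residuals)
```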
Example 14.8 Solve $\ddot{x} + 5\dot{x} + 6x = 3e^{2t}$ subject to $x(0) = 0$ and $\dot{x}(0) = \tfrac{17}{20}$.
14.4
Stability
This is a very important consideration when dealing with dynamical systems.
If small changes in the initial conditions have no effect on the long run behaviour of the solution, the system
is said to be stable. If small changes in initial conditions can lead to significant differences in the long run
behaviour then the system is unstable.
Recall that the solution is of the form
x = Au1 (t) + Bu2 (t) + v(t)
This will be stable if the complementary solution tends to 0 as t → ∞ for all values of A and B; in this case x
tends to the particular solution as t → ∞, which is independent of the initial conditions.
Thus the equation will be stable in the three cases considered above if
(i) m1 , m2 are real and distinct and are both negative, or
(ii) m1 = m2 = m is real and negative, or
(iii) m1 , m2 = α ± iβ are complex conjugate roots with α < 0.
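The three conditions amount to requiring that both roots of the characteristic equation have negative real part, which is easy to test. This is a sketch; the function name `is_stable` is an assumption, and the sample coefficients are those of Example 14.7 and of Examples 14.6 (iii) and (i).

```python
import cmath

def is_stable(a, b, c):
    """True if both roots of a*m**2 + b*m + c = 0 have negative real part."""
    root = cmath.sqrt(b * b - 4 * a * c)
    m1, m2 = (-b + root) / (2 * a), (-b - root) / (2 * a)
    return m1.real < 0 and m2.real < 0

print(is_stable(1, 5, 6))    # roots -2 and -3
print(is_stable(1, -6, 13))  # roots 3 + 2i and 3 - 2i
print(is_stable(1, 0, -4))   # roots 2 and -2
```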
