ML. Numerical Methods
Transcription
Contract POSDRU/86/1.2/S/62485
Universitatea Tehnică din Cluj-Napoca
Universitatea Tehnică "Gheorghe Asachi" din Iaşi
Universitatea din Craiova

Ioan Gavrea, Mircea Ivan

Contents

ML.1 Numerical Methods in Linear Algebra
  ML.1.1 Special Types of Matrices
  ML.1.2 Norms of Vectors and Matrices
  ML.1.3 Error Estimation
  ML.1.4 Matrix Equations. Pivoting Elimination
  ML.1.5 Improved Solutions of Matrix Equations
  ML.1.6 Partitioning Methods for Matrix Inversion
  ML.1.7 LU Factorization
  ML.1.8 Doolittle's Factorization
  ML.1.9 Choleski's Factorization Method
  ML.1.10 Iterative Techniques for Solving Linear Systems
  ML.1.11 Eigenvalues and Eigenvectors
  ML.1.12 Characteristic Polynomial: Le Verrier Method
  ML.1.13 Characteristic Polynomial: Fadeev-Frame Method

ML.2 Solutions of Nonlinear Equations
  ML.2.1 Introduction
  ML.2.2 Method of Successive Approximation
  ML.2.3 The Bisection Method
  ML.2.4 The Newton-Raphson Method
  ML.2.5 The Secant Method
  ML.2.6 False Position Method
  ML.2.7 The Chebyshev Method
  ML.2.8 Numerical Solutions of Nonlinear Systems of Equations
  ML.2.9 Newton's Method for Systems of Nonlinear Equations
  ML.2.10 Steepest Descent Method

ML.3 Elements of Interpolation Theory
  ML.3.1 Lagrange Interpolation
  ML.3.2 Some Forms of the Lagrange Polynomial
  ML.3.3 Some Properties of the Divided Difference
  ML.3.4 Mean Value Properties in Lagrange Interpolation
  ML.3.5 Approximation by Interpolation
  ML.3.6 Hermite-Lagrange Interpolating Polynomial
  ML.3.7 Finite Differences
  ML.3.8 Interpolation of Functions of Several Variables
  ML.3.9 Scattered Data Interpolation. Shepard's Method
  ML.3.10 Splines
  ML.3.11 B-splines
  ML.3.12 Problems

ML.4 Elements of Numerical Integration
  ML.4.1 Richardson's Extrapolation
  ML.4.2 Numerical Quadrature
  ML.4.3 Error Bounds in the Quadrature Methods
  ML.4.4 Trapezoidal Rule
  ML.4.5 Richardson's Deferred Approach to the Limit
  ML.4.6 Romberg Integration
  ML.4.7 Newton-Cotes Formulas
  ML.4.8 Simpson's Rule
  ML.4.9 Gaussian Quadrature

ML.5 Elements of Approximation Theory
  ML.5.1 Discrete Least Squares Approximation
  ML.5.2 Orthogonal Polynomials and Least Squares Approximation
  ML.5.3 Rational Function Approximation
  ML.5.4 Padé Approximation
  ML.5.5 Trigonometric Polynomial Approximation
  ML.5.6 Fast Fourier Transform
  ML.5.7 The Bernstein Polynomial
  ML.5.8 Bézier Curves
  ML.5.9 The METAFONT Computer System

ML.6 Integration of Ordinary Differential Equations
  ML.6.1 Introduction
  ML.6.2 The Euler Method
  ML.6.3 The Taylor Series Method
  ML.6.4 The Runge-Kutta Method
  ML.6.5 The Runge-Kutta Method for Systems of Equations

ML.7 Integration of Partial Differential Equations
  ML.7.1 Introduction
  ML.7.2 Parabolic Partial-Differential Equations
  ML.7.3 Hyperbolic Partial Differential Equations
  ML.7.4 Elliptic Partial Differential Equations

ML.7 Self Evaluation Tests
  ML.7.1 Tests
  ML.7.2 Answers to Tests

Index

Bibliography
ML.1 Numerical Methods in Linear Algebra

ML.1.1 Special Types of Matrices

Let $M_{m,n}(\mathbb{R})$ denote the set of all $m \times n$ matrices with real entries, where $m, n$ are positive integers ($M_n(\mathbb{R}) := M_{n,n}(\mathbb{R})$).

DEFINITION ML.1.1.1 A matrix $A \in M_n(\mathbb{R})$ is said to be strictly diagonally dominant when its entries satisfy the condition
$$|a_{ii}| > \sum_{j=1,\, j \ne i}^{n} |a_{ij}| \qquad \text{for each } i = 1, \dots, n.$$

THEOREM ML.1.1.2 A strictly diagonally dominant matrix is nonsingular.

Proof. Suppose, to the contrary, that the linear system $AX = 0$, $A \in M_n(\mathbb{R})$, has a nonzero solution $X = [x_1, \dots, x_n]^t \in M_{n,1}(\mathbb{R})$. Let $k$ be an index such that
$$|x_k| = \max_{1 \le j \le n} |x_j| > 0.$$
Since $\sum_{j=1}^{n} a_{kj} x_j = 0$, we have
$$a_{kk} x_k = -\sum_{j=1,\, j \ne k}^{n} a_{kj} x_j.$$
This implies that
$$|a_{kk}| \le \sum_{j=1,\, j \ne k}^{n} |a_{kj}| \, \frac{|x_j|}{|x_k|} \le \sum_{j=1,\, j \ne k}^{n} |a_{kj}|.$$
This contradicts the strict diagonal dominance of $A$. Consequently, the only solution to $AX = 0$ is $X = 0$, a condition equivalent to the nonsingularity of $A$.
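As a quick illustration (not part of the original text), the following Mathematica sketch checks strict diagonal dominance for an assumed 3 × 3 matrix and confirms that the matrix is indeed nonsingular:

(* Sketch, with an assumed example matrix: test strict diagonal dominance *)
a = {{4, 1, 2}, {1, 5, 3}, {0, 2, 6}};
n = Length[a];
diagDomQ = And @@ Table[
    Abs[a[[i, i]]] > Total[Abs /@ Delete[a[[i]], i]], {i, n}];
{diagDomQ, Det[a] != 0}
(* {True, True}, in agreement with Theorem ML.1.1.2 *)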
Another special class of matrices is called positive definite.

DEFINITION ML.1.1.3 A matrix $A \in M_n(\mathbb{R})$ is said to be positive definite if
$$\det(X^t A X) > 0 \qquad \text{for every } X \in M_{n,1}(\mathbb{R}),\ X \ne 0.$$
Note that $X^t A X$ is a $1 \times 1$ matrix, so its determinant is its single entry; for $X = [x_1, \dots, x_n]^t$ we have
$$\det(X^t A X) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_j x_i.$$

Using Definition (ML.1.1.3) directly to determine whether a matrix is positive definite can be difficult. The next result provides some necessary conditions that can be used to eliminate certain matrices from consideration.

THEOREM ML.1.1.4 If the matrix $A \in M_n(\mathbb{R})$ is symmetric and positive definite, then:
(1) $A$ is nonsingular;
(2) $a_{kk} > 0$ for each $k = 1, \dots, n$;
(3) $\max_{1 \le k \ne j \le n} |a_{kj}| < \max_{1 \le i \le n} |a_{ii}|$;
(4) $a_{ij}^2 < a_{ii} a_{jj}$ for each $i, j = 1, \dots, n$, $i \ne j$.

Proof. (1) If $X \ne 0$ were a vector satisfying $AX = 0$, then $\det(X^t A X) = 0$, contradicting the assumption that $A$ is positive definite. Consequently, $AX = 0$ has only the zero solution and $A$ is nonsingular.

(2) For an arbitrary $k$, let $X = [x_1, \dots, x_n]^t$ be defined by
$$x_j = \begin{cases} 1, & j = k, \\ 0, & j \ne k, \end{cases} \qquad j = 1, \dots, n.$$
Since $X \ne 0$, $a_{kk} = \det(X^t A X) > 0$.

(3) For $k \ne j$ define $X = [x_1, \dots, x_n]^t$ by
$$x_i = \begin{cases} 0, & i \ne j \text{ and } i \ne k, \\ -1, & i = j, \\ 1, & i = k. \end{cases}$$
Since $X \ne 0$,
$$a_{jj} + a_{kk} - a_{jk} - a_{kj} = \det(X^t A X) > 0.$$
But $A^t = A$, so $a_{jk} = a_{kj}$ and $2a_{kj} < a_{jj} + a_{kk}$. Now define $Z = [z_1, \dots, z_n]^t$ by
$$z_i = \begin{cases} 0, & i \ne j \text{ and } i \ne k, \\ 1, & i = j \text{ or } i = k. \end{cases}$$
Then $\det(Z^t A Z) > 0$, so $-2a_{kj} < a_{kk} + a_{jj}$. We obtain
$$|a_{kj}| < \frac{a_{kk} + a_{jj}}{2} \le \max_{1 \le i \le n} |a_{ii}|,$$
hence $\max_{1 \le k \ne j \le n} |a_{kj}| < \max_{1 \le i \le n} |a_{ii}|$.

(4) For $i \ne j$, define $X = [x_1, \dots, x_n]^t$ by
$$x_k = \begin{cases} 0, & k \ne j \text{ and } k \ne i, \\ \alpha, & k = i, \\ 1, & k = j, \end{cases}$$
where $\alpha$ is an arbitrary real number. Since $X \ne 0$,
$$0 < \det(X^t A X) = a_{ii}\alpha^2 + 2a_{ij}\alpha + a_{jj}.$$
As a quadratic polynomial in $\alpha$ with no real roots, its discriminant must be negative. Thus
$$4(a_{ij}^2 - a_{ii} a_{jj}) < 0 \quad \text{and} \quad a_{ij}^2 < a_{ii} a_{jj}.$$
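As a complement (not from the original text), a symmetric real matrix is positive definite exactly when all of its eigenvalues are positive — a standard fact not proved in this section. The sketch below uses it, together with condition (4) of Theorem ML.1.1.4, on an assumed example matrix:

(* Sketch, assumed example: checks for positive definiteness *)
a = {{2, -1, 0}, {-1, 2, -1}, {0, -1, 2}};
{SymmetricMatrixQ[a],
 Min[Eigenvalues[N[a]]] > 0,                       (* standard criterion *)
 And @@ Flatten[Table[a[[i, j]]^2 < a[[i, i]] a[[j, j]], {i, 3}, {j, i - 1}]]}
(* {True, True, True} *)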
ML.1.2 Norms of Vectors and Matrices

DEFINITION ML.1.2.1 A vector norm is a function $\|\cdot\| : M_{n,1}(\mathbb{R}) \to \mathbb{R}$ satisfying the following conditions:
(1) $\|X\| \ge 0$ for all $X \in M_{n,1}(\mathbb{R})$;
(2) $\|X\| = 0$ if and only if $X = 0$;
(3) $\|\alpha X\| = |\alpha| \|X\|$ for all $\alpha \in \mathbb{R}$ and $X \in M_{n,1}(\mathbb{R})$;
(4) $\|X + Y\| \le \|X\| + \|Y\|$ for all $X, Y \in M_{n,1}(\mathbb{R})$.

The most common vector norms for $n$-dimensional column vectors with real entries, $X = [x_1, \dots, x_n]^t \in M_{n,1}(\mathbb{R})$, are
$$\|X\|_1 = \sum_{i=1}^{n} |x_i|, \qquad \|X\|_2 = \Bigl(\sum_{i=1}^{n} x_i^2\Bigr)^{1/2}, \qquad \|X\|_\infty = \max_{1 \le i \le n} |x_i|.$$

The norm of a vector measures the distance between that vector and the zero vector; the distance between two vectors is defined as the norm of their difference. The concept of distance in $M_{n,1}(\mathbb{R})$ is also used to define the limit of a sequence of vectors.

DEFINITION ML.1.2.2 A sequence $(X^{(k)})_{k \ge 1}$ of vectors in $M_{n,1}(\mathbb{R})$ is said to converge to a vector $X \in M_{n,1}(\mathbb{R})$, with respect to the norm $\|\cdot\|$, if
$$\lim_{k \to \infty} \|X^{(k)} - X\| = 0.$$

The notion of vector norm extends to matrices.

DEFINITION ML.1.2.3 A matrix norm on the set $M_n(\mathbb{R})$ is a function $\|\cdot\| : M_n(\mathbb{R}) \to \mathbb{R}$ satisfying the conditions:
(1) $\|A\| \ge 0$;
(2) $\|A\| = 0$ if and only if $A = 0$;
(3) $\|\alpha A\| = |\alpha| \|A\|$;
(4) $\|A + B\| \le \|A\| + \|B\|$;
(5) $\|AB\| \le \|A\| \cdot \|B\|$,
for all matrices $A, B \in M_n(\mathbb{R})$ and any real number $\alpha$.

It is not difficult to show that if $\|\cdot\|$ is a vector norm on $M_{n,1}(\mathbb{R})$, then
$$\|A\| := \max_{\|X\| = 1} \|AX\|$$
is a matrix norm, called the natural norm (or induced matrix norm) associated with the vector norm. In this text, all matrix norms will be assumed to be natural matrix norms unless specified otherwise.

The $\|\cdot\|_\infty$ norm of a matrix has an interesting representation in terms of the entries of the matrix.

THEOREM ML.1.2.4 If $A = [a_{ij}] \in M_n(\mathbb{R})$, then
$$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|.$$

Proof. Let $X \in M_{n,1}(\mathbb{R})$ be an arbitrary column vector with $1 = \|X\|_\infty = \max_{1 \le i \le n} |x_i|$. We have
$$\|AX\|_\infty = \max_{1 \le i \le n} |(AX)_i| = \max_{1 \le i \le n} \Bigl|\sum_{j=1}^{n} a_{ij} x_j\Bigr| \le \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}| \cdot \max_{1 \le j \le n} |x_j| = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|.$$
Consequently,
$$\|A\|_\infty = \max_{\|X\|=1} \|AX\|_\infty \le \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|. \qquad (\sim)$$
However, if $p$ is an index such that
$$\sum_{j=1}^{n} |a_{pj}| = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$$
and $X$ is chosen with
$$x_j = \begin{cases} 1, & a_{pj} \ge 0, \\ -1, & a_{pj} < 0, \end{cases}$$
then $\|X\|_\infty = 1$ and $a_{pj} x_j = |a_{pj}|$ for $j = 1, \dots, n$. So
$$\|AX\|_\infty = \max_{1 \le i \le n} \Bigl|\sum_{j=1}^{n} a_{ij} x_j\Bigr| \ge \Bigl|\sum_{j=1}^{n} a_{pj} x_j\Bigr| = \sum_{j=1}^{n} |a_{pj}| = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|,$$
which, together with inequality $(\sim)$, gives
$$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|.$$

Similarly, one can prove that
$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|.$$
An alternative method for finding $\|A\|_2$ will be presented in a later section (see Theorem (ML.1.11.5)).

DEFINITION ML.1.2.5 The Frobenius norm (which is not a natural norm) is defined, for a matrix $A = [a_{ij}] \in M_n(\mathbb{R})$, by
$$\|A\|_F = \Bigl(\sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2\Bigr)^{1/2}.$$
One can easily prove that, for any matrix $A$,
$$\|A\|_2 \le \|A\|_F \le \sqrt{n}\, \|A\|_2.$$

Another matrix norm can be defined by
$$\|A\| = \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|.$$
Note that the function defined by $f(A) = \max_{1 \le i,j \le n} |a_{ij}|$ is not a norm: it fails the submultiplicativity condition (5). For instance, if $A = B$ is the $2 \times 2$ matrix with all entries equal to $1$, then $f(AB) = 2 > 1 = f(A) f(B)$.

EXAMPLE ML.1.2.6 Mathematica

(* NORMS *)
n = 4;
a = Table[i^2 - j, {i, 1, n}, {j, 1, n}];
SequenceForm["A = ", MatrixForm[a]]
normInfinity = Max[Table[Sum[Abs[a[[i, j]]], {j, 1, n}], {i, 1, n}]];
SequenceForm["||A||_inf = ", normInfinity]
norm1 = Max[Table[Sum[Abs[a[[i, j]]], {i, 1, n}], {j, 1, n}]];
SequenceForm["||A||_1 = ", norm1]

A = (0 −1 −2 −3; 3 2 1 0; 8 7 6 5; 15 14 13 12)
||A||_inf = 54
||A||_1 = 26

EXAMPLE ML.1.2.7 Mathematica

(* Norm2 *)
A = Table[Min[i, j], {i, -1, 1}, {j, -1, 1}];
norm2[m_] := Sqrt[Max[Abs[ComplexExpand[Eigenvalues[Transpose[m].m]]]]] // FullSimplify
SequenceForm["A = ", MatrixForm[A]]
SequenceForm["A^t A = ", MatrixForm[Transpose[A].A]]
SequenceForm["norm2[A] = ", norm2[A], " = ", N[norm2[A]], "..."]

A = (−1 −1 −1; −1 0 0; −1 0 1)
A^t A = (3 1 0; 1 1 1; 0 1 2)
norm2[A] = Sqrt[2 + Cos[Pi/9] + Sqrt[3] Sin[Pi/9]] = 1.87939...
ML.1.3 Error Estimation

Consider the linear system $AX = B$, where $A \in M_n(\mathbb{R})$ and $B \in M_{n,1}(\mathbb{R})$. It seems intuitively reasonable that if $\tilde{X}$ is an approximation to the solution $X$ of this system and the residual vector $R = B - A\tilde{X}$ has small norm $\|R\|$, then $\|X - \tilde{X}\|$ should be small as well. This is often the case, but certain systems, which occur frequently in practical problems, fail to have this property.

EXAMPLE ML.1.3.1 The linear system $AX = B$ given by
$$\begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 \\ 3.0001 \end{pmatrix}$$
has the unique solution $X = [1, 1]^t$. The poor approximation $\tilde{X} = [3, 0]^t$ has the residual vector
$$R = B - A\tilde{X} = \begin{pmatrix} 3 \\ 3.0001 \end{pmatrix} - \begin{pmatrix} 3 \\ 3.0003 \end{pmatrix} = \begin{pmatrix} 0 \\ -0.0002 \end{pmatrix},$$
so $\|R\|_\infty = 0.0002$. Although the norm of the residual vector is small, the approximation $\tilde{X} = [3, 0]^t$ is obviously quite poor; in fact, $\|X - \tilde{X}\|_\infty = 2$.

In the general situation, we can obtain information about when such problems might occur by considering the norms of the matrix $A$ and of its inverse.

THEOREM ML.1.3.2 Let $\tilde{X}$ be an approximate solution of $AX = B$, where $A$ is a nonsingular matrix, $R$ is the residual vector for $\tilde{X}$, and $B \ne 0$. Then, for any natural norm,
$$\|X - \tilde{X}\| \le \|R\| \cdot \|A^{-1}\|$$
and
$$\frac{\|X - \tilde{X}\|}{\|X\|} \le \|A\| \cdot \|A^{-1}\| \cdot \frac{\|R\|}{\|B\|}.$$

Proof. Since $A$ is nonsingular and $R = B - A\tilde{X} = A(X - \tilde{X})$, we obtain
$$\|X - \tilde{X}\| = \|A^{-1}R\| \le \|A^{-1}\| \cdot \|R\|.$$
Moreover, since $B = AX$, we have $\|B\| \le \|A\| \cdot \|X\|$, and therefore
$$\frac{\|X - \tilde{X}\|}{\|X\|} \le \frac{\|A\| \cdot \|A^{-1}\|}{\|B\|} \cdot \|R\|.$$

DEFINITION ML.1.3.3 The condition number $K(A)$ of a nonsingular matrix $A$ relative to a natural norm $\|\cdot\|$ is defined by
$$K(A) = \|A\| \cdot \|A^{-1}\|.$$

With this notation, the inequalities in Theorem (ML.1.3.2) become
$$\|X - \tilde{X}\| \le K(A) \cdot \frac{\|R\|}{\|A\|} \qquad \text{and} \qquad \frac{\|X - \tilde{X}\|}{\|X\|} \le K(A) \cdot \frac{\|R\|}{\|B\|}.$$
For any nonsingular matrix $A$ and natural norm $\|\cdot\|$, we have
$$1 = \|I\| = \|A \cdot A^{-1}\| \le \|A\| \cdot \|A^{-1}\| = K(A),$$
so $K(A) \ge 1$. The matrix $A$ is said to be well-conditioned if $K(A)$ is close to one, and ill-conditioned when $K(A)$ is significantly greater than one.

The matrix of the system considered in Example (ML.1.3.1) is
$$A = \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix},$$
which has $\|A\|_\infty = 3.0001$. But
$$A^{-1} = \begin{pmatrix} -10000 & 10000 \\ 5000.5 & -5000 \end{pmatrix},$$
so $\|A^{-1}\|_\infty = 20000$. The condition number for the infinity norm is $K(A) = 60002$. Its size certainly keeps us from making hasty accuracy judgments based on the residual vector of an approximation.
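The same condition number can be obtained directly with the built-in matrix norms; the following sketch (not part of the original text) reproduces the computation above:

(* Sketch: condition number of the matrix from Example ML.1.3.1 *)
a = {{1, 2}, {1.0001, 2}};
k = Norm[a, Infinity] Norm[Inverse[a], Infinity]
(* 60002., in agreement with the value found above *)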
ML.1.4 Matrix Equations. Pivoting Elimination

Matrix equations are associated with many problems arising in engineering and science, as well as with applications of mathematics to the social sciences. For solving a matrix equation, partial pivoting elimination is about as efficient as any other method. Pivoting is a process of interchanging rows (partial pivoting) or rows and columns (full pivoting) so as to put a particularly desirable element in the diagonal position from which the pivot is about to be selected.

Let us recall the principal elementary row operations used to transform a matrix equation into a more convenient one with the same solution:
1. Multiply row $i$ by a nonzero constant $\lambda$; denoted $r_i \leftarrow \lambda r_i$.
2. Add row $j$ multiplied by a constant $\lambda$ to row $i$; denoted $r_i \leftarrow r_i + \lambda r_j$.
3. Interchange rows $i$ and $j$; denoted $r_i \leftrightarrow r_j$.

EXAMPLE ML.1.4.1 Consider the matrices
$$A = \begin{pmatrix} 1 & 0 & 1 \\ 2 & -1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}.$$
Let us use the partial pivoting method to solve the matrix equation $AX = B$ for the unknown matrix $X$. By augmenting $A$ with the column matrix $B$ we obtain the augmented matrix
$$C_0 = [A; B] = \begin{pmatrix} 1 & 0 & 1 & ; & 1 \\ 2 & -1 & 0 & ; & 1 \\ 0 & 1 & 1 & ; & 0 \end{pmatrix}.$$
Performing row operations, step by step, on the matrix $C_0$, we obtain the matrices $C_1$, $C_2$, $C_3$.

Step 1: From $C_0$, using the operations $r_1 \leftrightarrow r_2$, $r_1 \leftarrow \frac{1}{2} r_1$, $r_2 \leftarrow r_2 - r_1$, we obtain
$$C_1 = \begin{pmatrix} 1 & -\tfrac{1}{2} & 0 & ; & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & 1 & ; & \tfrac{1}{2} \\ 0 & 1 & 1 & ; & 0 \end{pmatrix}.$$

Step 2: From $C_1$, using the operations $r_2 \leftrightarrow r_3$, $r_3 \leftarrow r_3 - \frac{1}{2} r_2$, $r_1 \leftarrow r_1 + \frac{1}{2} r_2$, we get
$$C_2 = \begin{pmatrix} 1 & 0 & \tfrac{1}{2} & ; & \tfrac{1}{2} \\ 0 & 1 & 1 & ; & 0 \\ 0 & 0 & \tfrac{1}{2} & ; & \tfrac{1}{2} \end{pmatrix}.$$

Step 3: From $C_2$, using the operations $r_1 \leftarrow r_1 - r_3$, $r_3 \leftarrow 2 r_3$, $r_2 \leftarrow r_2 - r_3$, we obtain
$$C_3 = \begin{pmatrix} 1 & 0 & 0 & ; & 0 \\ 0 & 1 & 0 & ; & -1 \\ 0 & 0 & 1 & ; & 1 \end{pmatrix}.$$
The last column of the matrix $C_3$ is the unknown $X$, i.e.,
$$X = A^{-1}B = \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix}.$$

The partial pivoting elimination method can be described as follows. Let $A \in M_n(\mathbb{C})$ and $B \in M_{n,p}(\mathbb{C})$; the matrix equation $AX = B$ is to be solved for the unknown $X \in M_{n,p}(\mathbb{C})$. Consider the augmented matrix $C^{(0)} = [A; B] \in M_{n,n+p}(\mathbb{C})$. A sequence of elementary row operations is performed on the matrix $C^{(0)}$ until the block $A$ is reduced to the identity matrix; when this is done, the solution $X = A^{-1}B$ replaces the block $B$.

We arrange these transformations into $n$ steps. In the $k$th step, a new matrix $C^{(k)}$ is obtained from the existing matrix $C^{(k-1)}$, $k = 1, 2, \dots, n$. At the beginning of step $k$ we compare the moduli of the elements $c^{(k-1)}_{ik}$, $i = k, \dots, n$; the element having the largest modulus is called the pivot element. Then the row containing the pivot element and the $k$th row are interchanged. If, after this interchange, the modulus of the pivot element $c^{(k-1)}_{kk}$ is less than a given value $\varepsilon$, the matrix $A$ is considered to be singular and the procedure stops. Otherwise, the elements of the matrix $C^{(k)}$ are calculated by
$$c^{(k)}_{kj} = c^{(k-1)}_{kj} / c^{(k-1)}_{kk} \qquad (j = k+1, \dots, n+p),$$
$$c^{(k)}_{ij} = c^{(k-1)}_{ij} - c^{(k-1)}_{ik} c^{(k)}_{kj} \qquad (i = 1, \dots, n,\ i \ne k;\ j = k+1, \dots, n+p), \tag{1.4.1}$$
for $k = 1, \dots, n$. The last $p$ columns of the matrix $C^{(n)}$ form the solution $X = A^{-1}B$. At the same time, the determinant of the matrix $A$ can be calculated by the formula
$$\det(A) = (-1)^{\sigma}\, c^{(0)}_{11} c^{(1)}_{22} \cdots c^{(n-1)}_{nn}, \tag{1.4.2}$$
where $c^{(k-1)}_{kk}$ is the pivot element in the $k$th step and $\sigma$ is the number of row interchanges performed. Note that, after the last interchange, it is not necessary to transform columns $1, \dots, k$.

EXAMPLE ML.1.4.2 Mathematica

(* Pivoting elimination; in the singular branch k is set to n + 2,
   so the final test distinguishes it from normal termination at k = n + 1 *)
c = {{1, 0, 1, 1}, {2, -1, 0, 1}, {0, 1, 1, 0}};
Print["c = ", MatrixForm[c]];
det = 1; k = 1; n = 3; p = 1;
While[k <= n,
  If[k != n,
    imx = k; cmx = Abs[c[[k, k]]];
    For[i = k + 1, i <= n, i++,
      If[cmx < Abs[c[[i, k]]], imx = i; cmx = Abs[c[[i, k]]]]];
    If[imx != k,
      For[j = n + p, j >= 1, j--,
        t = c[[imx, j]]; c[[imx, j]] = c[[k, j]]; c[[k, j]] = t];
      det = -det]];
  If[Abs[c[[k, k]]] < 0.1, k = n + 2,
    det = det c[[k, k]];
    For[j = n + p, j >= 1, j--, c[[k, j]] = c[[k, j]]/c[[k, k]]];
    For[i = 1, i <= n, i++,
      If[i != k,
        For[j = n + p, j >= 1, j--,
          c[[i, j]] = c[[i, j]] - c[[i, k]] c[[k, j]]]]];
    Print["Step ", k]; Print[MatrixForm[c]];
    k++]];
If[k == n + 2, Print["Singular"], Print["det = ", det]]

c = (1 0 1 1; 2 −1 0 1; 0 1 1 0)
Step 1: (1 −1/2 0 1/2; 0 1/2 1 1/2; 0 1 1 0)
Step 2: (1 0 1/2 1/2; 0 1 1 0; 0 0 1/2 1/2)
Step 3: (1 0 0 0; 0 1 0 −1; 0 0 1 1)
det = 1

REMARK ML.1.4.3
• If $p = n$ and $B$ is the identity matrix, then the solution of the equation $AX = B$ is $A^{-1}$.
• If $p = 1$ we obtain the Gauss-Jordan method for solving a linear system of equations.
• If $A$ is a strictly diagonally dominant matrix, Gaussian elimination can be performed on any linear system $AX = B$ without row interchanges, and the computations are stable with respect to the growth of round-off errors [BF93, p. 372].

ML.1.5 Improved Solutions of Matrix Equations

Obviously, we cannot expect greater precision in the solution of a matrix equation than the precision of the computer's floating-point word. Unfortunately, for large systems of linear equations, it is not always easy to obtain precision equal to, or even comparable with, the computer's limit. In direct methods of solution, roundoff errors accumulate, and they are magnified to the extent that the matrix is close to singular.

Suppose that the matrix $X$ is the exact solution of the equation $AX = B$. We do not, however, know $X$; we only know a slightly wrong solution, say $\tilde{X}$. Substituting it into $AX = B$ we obtain $\tilde{B} = A\tilde{X}$. In order to find a correction matrix $E$ we solve the equation
$$A E = B - \tilde{B} = AX - A\tilde{X},$$
where the right-hand side $B - \tilde{B}$ is known. Solving it numerically, we obtain a slightly wrong correction $\tilde{E}$; then $\tilde{X} + \tilde{E}$ is an improved solution. An extra benefit is obtained if we repeat the previous steps.
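A minimal sketch of one refinement step (not from the original text; the 3 × 3 system is the one of Example ML.1.4.1, and xApprox is an artificially perturbed solution):

(* Sketch: one step of iterative refinement, Section ML.1.5 *)
a = {{1., 0., 1.}, {2., -1., 0.}, {0., 1., 1.}};
b = {1., 1., 0.};
xApprox = {0.01, -0.98, 0.99};            (* slightly wrong solution *)
e = LinearSolve[a, b - a.xApprox];        (* correction from A.E = B - A.X~ *)
xImproved = xApprox + e
(* close to the exact solution {0, -1, 1}; repeating the step refines it further *)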
ML.1.6 Partitioning Methods for Matrix Inversion

Let $A \in M_n(\mathbb{R})$ be a nonsingular matrix. Consider the partition
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$
where $A_{11} \in M_m(\mathbb{R})$, $A_{12} \in M_{m,p}(\mathbb{R})$, $A_{21} \in M_{p,m}(\mathbb{R})$, and $A_{22} \in M_p(\mathbb{R})$, with $m + p = n$. In order to compute the inverse of $A$, we seek $A^{-1}$ in the form
$$A^{-1} = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}.$$
From
$$A \cdot A^{-1} = I_n = \begin{pmatrix} I_m & 0_{m,p} \\ 0_{p,m} & I_p \end{pmatrix}$$
we deduce
$$A_{11}B_{11} + A_{12}B_{21} = I_m, \qquad A_{21}B_{11} + A_{22}B_{21} = 0_{p,m}, \qquad A_{21}B_{12} + A_{22}B_{22} = I_p.$$
In what follows, we frame each matrix at the moment when it can be effectively calculated. We have $B_{21} = -A_{22}^{-1}A_{21}B_{11}$ and $(A_{11} - A_{12}A_{22}^{-1}A_{21})B_{11} = I_m$, hence
$$B_{11} = (A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1}, \qquad \text{consequently} \qquad B_{21} = -A_{22}^{-1}A_{21}B_{11}.$$
From $A^{-1}A = I_n$ we deduce $B_{11}A_{12} + B_{12}A_{22} = 0_{m,p}$, hence
$$B_{12} = -B_{11}A_{12}A_{22}^{-1}, \qquad \text{and so} \qquad B_{22} = A_{22}^{-1} - A_{22}^{-1}A_{21}B_{12}.$$

By choosing $p = 1$, we obtain the following iterative (bordering) technique for matrix inversion. Let
$$A_1 = [a_{11}], \qquad A_1^{-1} = \Bigl[\frac{1}{a_{11}}\Bigr],$$
and let $A_k \in M_k(\mathbb{R})$ be the matrix obtained by cancelling the last $n - k$ rows and columns of the matrix $A$. We have
$$A_k = \begin{pmatrix} A_{k-1} & u_k \\ v_k & a_{kk} \end{pmatrix}, \qquad A_k^{-1} = \begin{pmatrix} B_{k-1} & x_k \\ y_k & \beta_k \end{pmatrix},$$
$$u_k = [a_{1k}, \dots, a_{k-1,k}]^t, \qquad v_k = [a_{k1}, \dots, a_{k,k-1}],$$
where $A_{k-1}^{-1}$, $v_k$, $u_k$ are known. We deduce
$$A_{k-1}B_{k-1} + u_k y_k = I_{k-1}, \qquad A_{k-1}x_k + u_k\beta_k = 0_{k-1,1}, \qquad v_k x_k + a_{kk}\beta_k = 1,$$
hence $x_k = -A_{k-1}^{-1}u_k\beta_k$ and $(a_{kk} - v_kA_{k-1}^{-1}u_k)\beta_k = 1$, so
$$\beta_k = (a_{kk} - v_kA_{k-1}^{-1}u_k)^{-1}, \qquad x_k = -\beta_k A_{k-1}^{-1}u_k.$$
Using the relation $y_kA_{k-1} + \beta_k v_k = 0_{1,k-1}$, we obtain
$$y_k = -\beta_k v_k A_{k-1}^{-1}, \qquad B_{k-1} = A_{k-1}^{-1} - A_{k-1}^{-1}u_k y_k = A_{k-1}^{-1} + x_k y_k \beta_k^{-1}.$$
In summary, starting from the matrices
$$A_k = \begin{pmatrix} A_{k-1} & u_k \\ v_k & a_{kk} \end{pmatrix},$$
we obtain
$$A_k^{-1} = \begin{pmatrix} B_{k-1} & x_k \\ y_k & \beta_k \end{pmatrix},$$
where
$$\beta_k = (a_{kk} - v_kA_{k-1}^{-1}u_k)^{-1}, \quad x_k = -\beta_k A_{k-1}^{-1}u_k, \quad y_k = -\beta_k v_k A_{k-1}^{-1}, \quad B_{k-1} = A_{k-1}^{-1} + x_k y_k \beta_k^{-1}.$$
Finally, we obtain $A_n^{-1} = A^{-1}$.

EXAMPLE ML.1.6.1 Mathematica

(* Inverse by partitioning *)
CheckAbort[
  A = Table[Min[i, j], {i, 3}, {j, 3}];
  If[A[[1, 1]] == 0, Abort[]];
  inva = {{1/A[[1, 1]]}};
  k = 2;
  While[k <= 3,
    u = Table[{A[[i, k]]}, {i, k - 1}];
    v = {Table[A[[k, j]], {j, k - 1}]};
    d = -v.inva.u + A[[k, k]];
    If[d[[1, 1]] == 0, Abort[]];
    beta = 1/d;
    x = -beta[[1, 1]] inva.u;
    y = -beta[[1, 1]] v.inva;
    B = inva + x.y/beta[[1, 1]];
    inva = Transpose[Join[Transpose[Join[B, y]], Transpose[Join[x, beta]]]];
    k++];
  SequenceForm["A = ", MatrixForm[A], "   inva = ", MatrixForm[inva]],
  Print["Method fails."]]

A = (1 1 1; 1 2 2; 1 2 3)   inva = (2 −1 0; −1 2 −1; 0 −1 1)

(* check *)
Inverse[A] // MatrixForm
(2 −1 0; −1 2 −1; 0 −1 1)

ML.1.7 LU Factorization

Suppose that the matrix $A \in M_n(\mathbb{R})$ can be written as the product of two matrices, $A = L \cdot U$, where $L$ is lower triangular (it has zero entries above the leading diagonal),
$$L = \begin{pmatrix} * & 0 & 0 & 0 \\ * & * & 0 & 0 \\ * & * & * & 0 \\ * & * & * & * \end{pmatrix},$$
and $U$ is upper triangular (it has zero entries below the leading diagonal),
$$U = \begin{pmatrix} * & * & * & * \\ 0 & * & * & * \\ 0 & 0 & * & * \\ 0 & 0 & 0 & * \end{pmatrix}.$$
The matrix $A$ is then said to have an LU factorization or LU decomposition. The LU factorization of a matrix $A$ can be used to solve the linear system of equations $AX = B$.
Using the LU factorization we can solve the system by a two-step process. We write the system in the form $L(UX) = B$. First solve the system $LY = B$ for $Y$; since $L$ is triangular, determining $Y$ requires only $O(n^2)$ operations. Next, the triangular system $UX = Y$ requires an additional $O(n^2)$ operations to determine the solution $X$. Thus only $O(n^2)$ operations are required to solve the system $AX = B$, compared with the $O(n^3)$ needed by Gaussian elimination. Determining the specific matrices $L$ and $U$ requires $O(n^3)$ operations, but once the factorization has been determined, any system involving the matrix $A$ can be solved in this simplified manner.

EXAMPLE ML.1.7.1 Consider the system
$$\begin{cases} x_1 + x_2 = 4 \\ 2x_1 + x_2 - x_3 = 1 \\ 3x_1 - x_2 = 0 \end{cases}$$
One can easily verify the following LU factorization:
$$A = \begin{pmatrix} 1 & 1 & 0 \\ 2 & 1 & -1 \\ 3 & -1 & 0 \end{pmatrix} = L \cdot U = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 4 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 0 & 4 \end{pmatrix}.$$
Solving the system
$$LY = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 4 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 4 \\ 1 \\ 0 \end{pmatrix} = B$$
for $Y$, we find $Y = [4, -7, 16]^t$. Next, solving the system
$$UX = \begin{pmatrix} 1 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 4 \\ -7 \\ 16 \end{pmatrix} = Y$$
for $X$, we find $X = [1, 3, 4]^t$.

Note that there exist matrices which have no LU factorization, e.g.,
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Now, a method for performing the LU factorization of a matrix is presented.

THEOREM ML.1.7.2 Let $A \in M_n(\mathbb{R})$ and let $A_k \in M_k(\mathbb{R})$ be obtained by cancelling the last $n - k$ rows and columns of the matrix $A$ ($k = 1, \dots, n-1$). If the $A_k$ are nonsingular matrices ($k = 1, \dots, n-1$), then there exist a lower triangular matrix $L$ whose leading-diagonal entries are all $1$, and an upper triangular matrix $U$, such that $A = LU$; furthermore, this LU factorization is unique.

Proof. We use mathematical induction. Firstly, find
$$L_2 = \begin{pmatrix} 1 & 0 \\ l_{21} & 1 \end{pmatrix}, \qquad U_2 = \begin{pmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{pmatrix}$$
such that
$$A_2 = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ l_{21} & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{pmatrix}.$$
We obtain:
$$u_{11} = a_{11} \ne 0, \qquad u_{12} = a_{12}, \qquad l_{21} = \frac{a_{21}}{a_{11}}, \qquad u_{22} = a_{22} - \frac{a_{21}a_{12}}{a_{11}} = \frac{\det(A_2)}{a_{11}}.$$
This factorization is unique. Next, suppose that the matrix $A_{k-1}$ has an LU factorization $A_{k-1} = L_{k-1}U_{k-1}$. Since $A_{k-1}$ is nonsingular, the matrices $L_{k-1}$ and $U_{k-1}$ are nonsingular. We look for the matrix $A_k$ in the form
$$A_k = \begin{pmatrix} A_{k-1} & b_k \\ c_k & a_{kk} \end{pmatrix} = L_kU_k = \begin{pmatrix} P_k & 0 \\ l_k & 1 \end{pmatrix} \begin{pmatrix} Q_k & u_k \\ 0 & u_{kk} \end{pmatrix} = \begin{pmatrix} P_kQ_k & P_ku_k \\ l_kQ_k & l_ku_k + u_{kk} \end{pmatrix}.$$
From $A_{k-1} = P_kQ_k$ and the uniqueness property of the factorization, we obtain:
$$P_k = L_{k-1}, \qquad Q_k = U_{k-1}, \qquad u_k = L_{k-1}^{-1}b_k, \qquad l_k = c_kU_{k-1}^{-1}, \qquad u_{kk} = a_{kk} - l_ku_k.$$
For $k = n$, the existence and uniqueness of the LU factorization are proved. Following the proof, we obtain a recurrent method for finding the matrices $L$ and $U$.

EXAMPLE ML.1.7.3 Mathematica

(* LUDecomposition *)
A = {{a11, a12}, {a21, a22}};
n = Dimensions[A][[1]];
SequenceForm["A = ", MatrixForm[A]]

Information["LUDecomposition"]
LUDecomposition[m] generates a representation of the LU decomposition of a matrix m.
Attributes[LUDecomposition] = {Protected}
Options[LUDecomposition] = {Modulus -> 0}

lu = LUDecomposition[A][[1]];
MatrixForm[lu // Factor]
(a11, a12; a21/a11, (−a12 a21 + a11 a22)/a11)

L = Table[If[i > j, lu[[i, j]], If[i == j, 1, 0]], {i, n}, {j, n}] // Factor;
U = Table[If[i <= j, lu[[i, j]], 0], {i, n}, {j, n}] // Factor;
SequenceForm["L = ", MatrixForm[L], "   U = ", MatrixForm[U]]
L = (1, 0; a21/a11, 1)   U = (a11, a12; 0, (−a12 a21 + a11 a22)/a11)

(* TEST *)
MatrixForm[L.U] // Simplify
(a11 a12; a21 a22)
REMARK ML.1.7.4 The determinant of an LU-decomposed matrix is the product of the diagonal entries of $U$ (since $\det(L) = 1$ when $l_{ii} = 1$):
$$\det(A) = \det(U) = \prod_{i=1}^{n} u_{ii}.$$
There are several factorization methods:
• Doolittle's method, which requires $l_{ii} = 1$, $i = 1, \dots, n$;
• Crout's method, which requires the diagonal elements of $U$ to be one;
• Choleski's method, which requires $l_{ii} = u_{ii}$, $i = 1, \dots, n$.

ML.1.8 Doolittle's Factorization

We present an algorithm which produces a Doolittle-type factorization. Consider the matrices
$$L = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ l_{21} & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ l_{n1} & l_{n2} & l_{n3} & \cdots & l_{n,n-1} & 1 \end{pmatrix}, \quad U = \begin{pmatrix} u_{11} & u_{12} & u_{13} & \cdots & u_{1,n-1} & u_{1n} \\ 0 & u_{22} & u_{23} & \cdots & u_{2,n-1} & u_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & u_{nn} \end{pmatrix}, \quad A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix},$$
such that $L \cdot U = A$. This gives the following set of equations:
$$\sum_{k=1}^{\min(i,j)} l_{ik} u_{kj} = a_{ij}, \qquad i, j = 1, \dots, n,$$
which can be solved by arranging them in a certain order. The order is as follows:

  $u_{11} := a_{11}$
  For $j = 2, \dots, n$:
    $u_{1j} = a_{1j}$, $\quad l_{j1} = a_{j1}/u_{11}$
  For $i = 2, \dots, n-1$:
    $u_{ii} = a_{ii} - \sum_{k=1}^{i-1} l_{ik} u_{ki}$
    For $j = i+1, \dots, n$:
      $u_{ij} = a_{ij} - \sum_{k=1}^{i-1} l_{ik} u_{kj}$, $\quad l_{ji} = \Bigl(a_{ji} - \sum_{k=1}^{i-1} l_{jk} u_{ki}\Bigr)/u_{ii}$
  $u_{nn} = a_{nn} - \sum_{k=1}^{n-1} l_{nk} u_{kn}$

It is clear that each $a_{ij}$ is used only once and never again. This means that the corresponding $l_{ij}$ or $u_{ij}$ can be stored in the location that $a_{ij}$ used to occupy; the decomposition is "in place" (the diagonal unit elements $l_{ii}$ are not stored at all).

For $n = 4$, the entries $l_{ij}$ and $u_{ij}$ are found in the order shown in brackets:
$$\begin{pmatrix} u_{11}\,[1] & u_{12}\,[2] & u_{13}\,[4] & u_{14}\,[6] \\ l_{21}\,[3] & u_{22}\,[8] & u_{23}\,[9] & u_{24}\,[11] \\ l_{31}\,[5] & l_{32}\,[10] & u_{33}\,[13] & u_{34}\,[14] \\ l_{41}\,[7] & l_{42}\,[12] & l_{43}\,[15] & u_{44}\,[16] \end{pmatrix}$$
EXAMPLE ML.1.8.1 Mathematica

(* Doolittle Factorization Method *)
a = {{1, 1, 0}, {2, 1, -1}, {3, -1, 0}};
n = Dimensions[a][[1]];
l = Table[0, {i, n}, {j, n}]; u = Table[0, {i, n}, {j, n}];
CheckAbort[
  For[i = 1, i <= n, i++, l[[i, i]] = 1];
  If[a[[1, 1]] == 0, Abort[]];
  u[[1, 1]] = a[[1, 1]];
  For[j = 2, j <= n, j++,
    u[[1, j]] = a[[1, j]];
    l[[j, 1]] = a[[j, 1]]/a[[1, 1]]];
  For[i = 2, i <= n - 1, i++,
    u[[i, i]] = a[[i, i]] - Sum[l[[i, k]] u[[k, i]], {k, i - 1}];
    If[u[[i, i]] == 0, Abort[]];
    For[j = i + 1, j <= n, j++,
      u[[i, j]] = a[[i, j]] - Sum[l[[i, k]] u[[k, j]], {k, i - 1}];
      l[[j, i]] = (a[[j, i]] - Sum[l[[j, k]] u[[k, i]], {k, i - 1}])/u[[i, i]]]];
  u[[n, n]] = a[[n, n]] - Sum[l[[n, k]] u[[k, n]], {k, n - 1}];,
  Print["Factorization impossible"]]
SequenceForm["L = ", MatrixForm[l], "   U = ", MatrixForm[u]]
SequenceForm["A = ", MatrixForm[a], "   L.U = ", MatrixForm[l.u]]

L = (1 0 0; 2 1 0; 3 4 1)   U = (1 1 0; 0 −1 −1; 0 0 4)
A = (1 1 0; 2 1 −1; 3 −1 0)   L.U = (1 1 0; 2 1 −1; 3 −1 0)

EXAMPLE ML.1.8.2 Mathematica

(* Doolittle Factorization Method - "in place" *)
a = {{1, 1, 0}, {2, 1, -1}, {3, -1, 0}};
olda = a; n = Dimensions[a][[1]];
CheckAbort[
  If[a[[1, 1]] == 0, Abort[]];
  For[j = 2, j <= n, j++, a[[j, 1]] = a[[j, 1]]/a[[1, 1]]];
  For[i = 2, i <= n - 1, i++,
    a[[i, i]] = a[[i, i]] - Sum[a[[i, k]] a[[k, i]], {k, i - 1}];
    If[a[[i, i]] == 0, Abort[]];
    For[j = i + 1, j <= n, j++,
      a[[i, j]] = a[[i, j]] - Sum[a[[i, k]] a[[k, j]], {k, i - 1}];
      a[[j, i]] = (a[[j, i]] - Sum[a[[j, k]] a[[k, i]], {k, i - 1}])/a[[i, i]]]];
  a[[n, n]] = a[[n, n]] - Sum[a[[n, k]] a[[k, n]], {k, n - 1}];,
  Print["Factorization impossible"]]
l = Table[0, {i, n}, {j, n}]; u = Table[0, {i, n}, {j, n}];
For[i = 1, i <= n, i++, l[[i, i]] = 1;
  For[j = 1, j <= i - 1, j++, l[[i, j]] = a[[i, j]]]];
For[i = 1, i <= n, i++, For[j = i, j <= n, j++, u[[i, j]] = a[[i, j]]]];
SequenceForm["A = ", MatrixForm[olda], "   New A = L\\U = ", MatrixForm[a]]
SequenceForm["L = ", MatrixForm[l], "   U = ", MatrixForm[u]]
SequenceForm["Test: L.U = ", MatrixForm[l.u]]

A = (1 1 0; 2 1 −1; 3 −1 0)   New A = L\U = (1 1 0; 2 −1 −1; 3 4 4)
L = (1 0 0; 2 1 0; 3 4 1)   U = (1 1 0; 0 −1 −1; 0 0 4)
Test: L.U = (1 1 0; 2 1 −1; 3 −1 0)

ML.1.9 Choleski's Factorization Method

If a square matrix $A$ is symmetric and positive definite, then it has a special triangular decomposition. One can prove that a matrix $A$ is symmetric and positive definite if and only if it can be factored in the form
$$A = L \cdot L^t \tag{1.9.1}$$
where $L$ is lower triangular with nonzero diagonal entries. The factorization (1.9.1) is sometimes referred to as "taking the square root" of $A$. Writing out equation (1.9.1) in components, one readily obtains
$$l_{ii} = \Bigl(a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2\Bigr)^{1/2}, \qquad 1 \le i \le n,$$
$$l_{ji} = \frac{1}{l_{ii}}\Bigl(a_{ij} - \sum_{k=1}^{i-1} l_{ik} l_{jk}\Bigr), \qquad i+1 \le j \le n.$$
See Examples (ML.1.9.1) and (ML.1.9.2).

EXAMPLE ML.1.9.1 Mathematica

(* Cholesky Decomposition Method - 1 *)
<< "LinearAlgebra`Cholesky`"
CholeskyDecomposition::"usage"
For a symmetric positive definite matrix A, CholeskyDecomposition[A]
returns an upper-triangular matrix U such that A = Transpose[U].U.

A = {{4, 0, 0}, {0, 9, 1}, {0, 1, 2}};
MatrixForm[A]
(4 0 0; 0 9 1; 0 1 2)

U = CholeskyDecomposition[A];
MatrixForm[U]
(2 0 0; 0 3 1/3; 0 0 Sqrt[17]/3)

Transpose[U].U == A
True
EXAMPLE ML.1.9.2 Mathematica

(* Cholesky Decomposition Method - 2:  A = L.Transpose[L] *)
n = 3; A = {{4, 0, 0}, {0, 9, 1}, {0, 1, 2}};
CheckAbort[
  L = A;
  For[i = 2, i <= n, i++, For[j = i, j <= n, j++, L[[i, j]] = 0]];
  For[i = 1, i <= n, i++,
    tmp = A[[i, i]] - Sum[L[[i, k]]^2, {k, i - 1}];
    If[tmp > 0, L[[i, i]] = Sqrt[tmp], Abort[]];
    For[j = i + 1, j <= n, j++,
      L[[j, i]] = (A[[j, i]] - Sum[L[[j, k]] L[[i, k]], {k, i - 1}])/L[[i, i]]]];
  Print["A = ", MatrixForm[A], "   L = ", MatrixForm[L]],
  Print["With rounding errors, A is not positive definite"]]

A = (4 0 0; 0 9 1; 0 1 2)   L = (2 0 0; 0 3 0; 0 1/3 Sqrt[17]/3)
A == L.Transpose[L]
True

ML.1.10 Iterative Techniques for Solving Linear Systems

Consider the linear system $AX = B$, where $A \in M_n(\mathbb{R})$ and $B \in M_{n,1}(\mathbb{R})$. An iterative technique for solving this system begins with an initial approximation $X^{(0)}$ to the solution $X$ and generates a sequence of approximations $(X^{(k)})_{k \ge 0}$ that converges to $X$. Writing the system $AX = B$ in the equivalent form
$$PX = QX + B, \qquad \text{where } A = P - Q,$$
the sequence of approximate solutions is generated, after the initial approximation $X^{(0)}$ is selected, by computing
$$PX^{(k+1)} = QX^{(k)} + B, \qquad k = 0, 1, \dots$$
One can prove that, for $\|P^{-1}Q\| < 1$, the sequence $(X^{(k)})_{k \ge 0}$ is convergent.

In order to present some iterative techniques, we consider the standard decomposition of a matrix $A$,
$$A = L + D + U,$$
where:
• $L$ has zero entries on and above the leading diagonal;
• $D$ has zero entries above and below the leading diagonal;
• $U$ has zero entries on and below the leading diagonal:
$$L = \begin{pmatrix} 0 & 0 & 0 \\ * & 0 & 0 \\ * & * & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} * & 0 & 0 \\ 0 & * & 0 \\ 0 & 0 & * \end{pmatrix}, \qquad U = \begin{pmatrix} 0 & * & * \\ 0 & 0 & * \\ 0 & 0 & 0 \end{pmatrix}.$$

• Jacobi's Iterative Method. Write the system $AX = B$ in the equivalent form
$$DX = -(L + U)X + B.$$
From the recurrence formula
$$DX^{(k+1)} = -(L + U)X^{(k)} + B, \qquad k = 0, 1, \dots,$$
where $X^{(k)} = [x_1^{(k)}, \dots, x_n^{(k)}]^t$, we obtain
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j=1,\, j \ne i}^{n} a_{ij} x_j^{(k)}\Bigr), \qquad i = 1, \dots, n;\ k = 0, 1, \dots \tag{1.10.1}$$
If $\|-D^{-1}(L+U)\|_\infty < 1$, i.e.,
$$\max_{1 \le i \le n} \sum_{j=1,\, j \ne i}^{n} \Bigl|\frac{a_{ij}}{a_{ii}}\Bigr| < 1$$
(the matrix $A$ is strictly diagonally dominant), the Jacobi iterative method is convergent.

• Gauss-Seidel Iterative Method. Write the system $AX = B$ in the equivalent form
$$(L + D)X = -UX + B.$$
From
$$(L + D)X^{(k+1)} = -UX^{(k)} + B, \qquad k = 0, 1, \dots,$$
we obtain
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)}\Bigr), \qquad i = 1, \dots, n. \tag{1.10.2}$$
If the matrix $A$ is strictly diagonally dominant, then the Gauss-Seidel method is convergent. Note that there exist linear systems for which the Jacobi method converges but the Gauss-Seidel method does not, and vice versa.

• Relaxation Methods. Let $\omega > 0$. Modifying the Gauss-Seidel procedure (1.10.2) to
$$x_i^{(k+1)} = (1 - \omega)x_i^{(k)} + \frac{\omega}{a_{ii}}\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)}\Bigr), \qquad i = 1, \dots, n, \tag{1.10.3}$$
a suitable choice of positive $\omega$ can lead to significantly faster convergence. Methods involving (1.10.3) are called relaxation methods. For values of $\omega$ in $(0, 1)$, the procedure is called an under-relaxation method and can be used to obtain convergence for some systems that do not converge under the Gauss-Seidel method. For $\omega > 1$, the procedure is called an over-relaxation method and is used to accelerate convergence for systems that already converge under the Gauss-Seidel technique. These methods are abbreviated SOR, for Successive Over-Relaxation.
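The Jacobi step (1.10.1) is easy to sketch; the following fragment (not from the original text) runs it on an assumed strictly diagonally dominant system whose exact solution is $[1, 1, 1]^t$:

(* Sketch: Jacobi iteration (1.10.1) on an assumed system *)
a = {{4., 1., 1.}, {1., 5., 2.}, {1., 2., 6.}};
b = {6., 8., 9.};
x = {0., 0., 0.};
Do[
  x = Table[(b[[i]] - Sum[If[j != i, a[[i, j]] x[[j]], 0], {j, 3}])/a[[i, i]],
    {i, 3}],          (* the whole old vector x is used on the right *)
  {20}];
{x, Max[Abs[b - a.x]]}
(* x close to {1., 1., 1.}, with a small residual *)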
EXAMPLE ML.1.10.1 Mathematica

(* Successive Over-Relaxation *)
n = 3; A = Table[Min[i, j], {i, n}, {j, n}]; B = {6, 11, 14};
CheckAbort[
  Print["A = ", MatrixForm[A], "   B = ", MatrixForm[B]];
  X = Table[0, {i, n}]; X0 = Table[0, {i, n}];
  tol = 0.0001; omega = 1.2; numbiter = 20; k = 1;
  While[k <= numbiter,
    For[i = 1, i <= n, i++,
      X[[i]] = (1 - omega) X0[[i]] +
        omega/A[[i, i]] (B[[i]] - Sum[A[[i, j]] X[[j]], {j, i - 1}] -
           Sum[A[[i, j]] X0[[j]], {j, i + 1, n}])];
    If[Max[Abs[X - X0]] < tol, Print["X = ", MatrixForm[X]]; Abort[]];
    X0 = X; k++];
  Print["Maximum number of iterations exceeded"], Null]

A = (1 1 1; 1 2 2; 1 2 3)   B = (6; 11; 14)
X = (0.999998; 2.00004; 2.99998)
Max[Abs[X − X0]] = 0.0000610555,  Max[Abs[B − A.X]] = 0.0000309092
(* exact solution: {1, 2, 3} *)

ML.1.11 Eigenvalues and Eigenvectors

Let $A \in M_n(\mathbb{R})$ and let $I \in M_n(\mathbb{R})$ be the identity matrix.

DEFINITION ML.1.11.1 The polynomial $P$ defined by
$$P(\lambda) = \det(A - \lambda I)$$
is called the characteristic polynomial of the matrix $A$. $P$ is a polynomial of degree $n$.

DEFINITION ML.1.11.2 If $P$ is the characteristic polynomial of a matrix $A$, the roots of $P$ are called eigenvalues or characteristic values of $A$.

Let $\lambda$ be an eigenvalue of the matrix $A$.

DEFINITION ML.1.11.3 If $X \in M_{n,1}(\mathbb{R})$, $X \ne 0$, satisfies the equation
$$(A - \lambda I)X = 0,$$
then $X$ is called an eigenvector or characteristic vector of $A$ corresponding to the eigenvalue $\lambda$.

DEFINITION ML.1.11.4 The spectral radius $\rho(A)$ of a matrix $A$ is defined by
$$\rho(A) = \max\{\,|\lambda| : \lambda \text{ is an eigenvalue of } A\,\}.$$

The spectral radius is closely related to the norm of a matrix, as shown in the following theorem.

THEOREM ML.1.11.5 If $A \in M_n(\mathbb{R})$, then:
(1) $\|A\|_2 = \sqrt{\rho(A^tA)}$;
(2) $\rho(A) \le \|A\|$, for any natural norm $\|\cdot\|$.

EXAMPLE ML.1.11.6 Mathematica

Eigenvalues[m] gives a list of the eigenvalues of the square matrix m.
Eigenvectors[m] gives a list of the eigenvectors of the square matrix m.
Eigensystem[m] gives a list {values, vectors} of the eigenvalues and eigenvectors of the square matrix m.

(* Eigenvalues, Eigenvectors, Eigensystem *)
A = {{1, 2}, {3, 6}};
MatrixForm[A]
(1 2; 3 6)
Eigenvalues[A]
{7, 0}
Solve[Det[A - lambda IdentityMatrix[2]] == 0, lambda]
{{lambda -> 0}, {lambda -> 7}}
Eigenvectors[A]
{{1, 3}, {-2, 1}}
Eigensystem[A]
{{7, 0}, {{1, 3}, {-2, 1}}}

ML.1.12 Characteristic Polynomial: Le Verrier Method

This algorithm has been rediscovered and modified several times. In 1840, Urbain Jean Joseph Le Verrier provided the basic connection with Newton's identities.

Let $A \in M_n(\mathbb{R})$. The trace of a matrix $A$, denoted $\operatorname{tr}(A)$, is defined by
$$\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}.$$
Write the characteristic polynomial $P$ of a matrix $A$ in the form
$$P(\lambda) = \lambda^n - c_1\lambda^{n-1} - \cdots - c_{n-1}\lambda - c_n.$$
It is known that, if $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$, then
$$c_1 = \lambda_1 + \lambda_2 + \cdots + \lambda_n = \operatorname{tr}(A), \qquad s_k = \lambda_1^k + \lambda_2^k + \cdots + \lambda_n^k = \operatorname{tr}(A^k), \qquad k = 1, \dots, n.$$
Using the Newton formula
$$c_k = \frac{1}{k}\bigl(s_k - c_1 s_{k-1} - c_2 s_{k-2} - \cdots - c_{k-1} s_1\bigr), \qquad k = 2, \dots, n,$$
we obtain:
$$c_1 = \operatorname{tr}(A), \qquad c_2 = \tfrac{1}{2}\bigl(\operatorname{tr}(A^2) - c_1\operatorname{tr}(A)\bigr), \qquad \dots, \qquad c_n = \tfrac{1}{n}\bigl(\operatorname{tr}(A^n) - c_1\operatorname{tr}(A^{n-1}) - \cdots - c_{n-1}\operatorname{tr}(A)\bigr).$$
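A short sketch of the Le Verrier computation (not from the original text), run on the matrix $\min(i,j)$ also used in Example ML.1.13.1 below:

(* Sketch: Le Verrier's method via traces of powers and Newton's formula *)
n = 3; a = Table[Min[i, j], {i, n}, {j, n}];
s = Table[Tr[MatrixPower[a, k]], {k, n}];
c = Table[0, {n}]; c[[1]] = s[[1]];
Do[c[[k]] = (s[[k]] - Sum[c[[i]] s[[k - i]], {i, k - 1}])/k, {k, 2, n}];
charPol = x^n - Sum[c[[i]] x^(n - i), {i, n}]
(* x^3 - 6 x^2 + 5 x - 1, agreeing with CharacteristicPolynomial[a, x] up to sign *)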
ML.1.13 Characteristic Polynomial: Fadeev-Frame Method

J. M. Souriau, also from France, and J. S. Frame, from Michigan State University, independently modified the algorithm to its present form. Souriau's formulation was published in France in 1948, and Frame's method appeared in the United States in 1949. Paul Horst (USA, 1935), along with Faddeev and Sominskii (USSR, 1949), are also credited with rediscovering the technique. Although the algorithm is intriguingly beautiful, it is not practical for floating-point computations.

The Fadeev-Frame algorithm is closely related to the Le Verrier method. Let $\det(A - \lambda I_n) = (-1)^n P(\lambda)$, where
$$P(\lambda) = \lambda^n - c_1\lambda^{n-1} - \cdots - c_{n-1}\lambda - c_n.$$
Using the notations
$$\begin{aligned} A_1 &= A, & c_1 &= \operatorname{tr}(A_1), & B_1 &= A_1 - c_1 I_n, \\ A_2 &= AB_1, & c_2 &= \tfrac{1}{2}\operatorname{tr}(A_2), & B_2 &= A_2 - c_2 I_n, \\ &\ \vdots & &\ \vdots & &\ \vdots \\ A_n &= AB_{n-1}, & c_n &= \tfrac{1}{n}\operatorname{tr}(A_n), & B_n &= A_n - c_n I_n, \end{aligned}$$
and taking into account the fact that the matrix $A$ is a solution of its characteristic equation (the Cayley-Hamilton theorem)(*), we obtain
$$A^n - c_1 A^{n-1} - \cdots - c_{n-1}A - c_n I_n = P(A) = 0.$$
The relation $B_n = A_n - c_n I_n$ serves as a control test: $B_n = 0$. Furthermore, from the relation
$$\det(A) = (-1)^n P(0) = (-1)^{n+1} c_n, \tag{1.13.1}$$
we can determine the determinant of the matrix $A$. If $\det(A) \ne 0$ then, from the relations $AB_{n-1} = A_n = c_n I_n$, using the formula
$$A^{-1} = \frac{1}{c_n} B_{n-1}, \tag{1.13.2}$$
we obtain the inverse of the matrix $A$. Also, note that $(-1)^{n+1}B_{n-1}$ is the adjoint of the matrix $A$.

(*) In 1896 Frobenius became aware of Cayley's 1858 Memoir on the theory of matrices and after this started to use the term matrix. Despite the fact that Cayley only proved the Cayley-Hamilton theorem for 2 × 2 and 3 × 3 matrices, Frobenius generously attributed the result to Cayley (Frobenius, in 1878, had been the first to prove the general theorem). Hamilton proved a special case of the theorem, the 4 × 4 case, in the course of his investigations into quaternions.

EXAMPLE ML.1.13.1 Mathematica

(* Faddeev - Frame *)
n = 3; A = Table[Min[i, j], {i, n}, {j, n}];
c = Table[0, {i, n}];
tr[x_] := Sum[x[[i, i]], {i, Length[x]}];
B = IdentityMatrix[n];
For[k = 1, k <= n - 1, k++,
  c[[k]] = tr[A.B]/k;
  B = A.B - c[[k]] IdentityMatrix[n]];
c[[n]] = tr[A.B]/n;
detA = (-1)^(n + 1) c[[n]]; adjA = (-1)^(n + 1) B;
SequenceForm["A = ", MatrixForm[A]]
SequenceForm["adj(A) = ", MatrixForm[adjA]]
SequenceForm["CharPol(x) = ", x^n - Sum[c[[i]] x^(n - i), {i, n}]]
SequenceForm["Det(A) = ", detA]
CheckAbort[
  If[detA == 0, Abort[], invA = adjA/detA];
  SequenceForm["A^(-1) = ", MatrixForm[invA]],
  Print["Singular matrix"]]

A = (1 1 1; 1 2 2; 1 2 3)
adj(A) = (2 −1 0; −1 2 −1; 0 −1 1)
CharPol(x) = −1 + 5x − 6x^2 + x^3
Det(A) = 1
A^(-1) = (2 −1 0; −1 2 −1; 0 −1 1)

ML.2 Solutions of Nonlinear Equations

ML.2.1 Introduction

In this chapter we consider one of the most basic problems of numerical approximation, the root-finding problem. Only a few types of nonlinear equations can be solved by direct algebraic methods. In general, algebraic equations of fifth and higher order cannot be solved by means of radicals. Moreover, there exists no explicit formula for finding the roots of non-algebraic (transcendental) equations such as
$$e^x + x = 0, \qquad x \in \mathbb{R}.$$
Therefore, root-finding invariably proceeds by iteration, and this is equally true in one or in many dimensions.

Let $x^*$ be a real number and $(x_k)_{k \ge 0}$ a sequence of real numbers that converges to $x^*$.
DEFINITION ML.2.1.1 If positive constants $p$ and $\lambda$ exist such that
$$\lim_{k \to \infty} \frac{|x_{k+1} - x^*|}{|x_k - x^*|^p} = \lambda,$$
then the sequence $(x_k)_{k \ge 0}$ is said to converge to $x^*$ with order $p$ and asymptotic constant $\lambda$.

An iterative technique of the form $x_{k+1} = f(x_k)$, $k \in \mathbb{N}$, is said to be of order $p$ if the sequence $(x_k)_{k \ge 0}$ converges to the solution $x^* = f(x^*)$ with order $p$. In general, a sequence with a higher order of convergence converges more rapidly than a sequence with a lower order. Two orders receive special attention, namely the linear ($p = 1$) and the quadratic ($p = 2$).

Suppose $\varepsilon$ is a bound on the maximum admissible error. We should stop the calculations when the condition
$$|x_k - x^*| < \varepsilon$$
is satisfied; but such a criterion cannot be used directly, because the root $x^*$ is not known. A typical stopping criterion is to stop when the difference between two successive iterates satisfies
$$|x_k - x_{k+1}| < \varepsilon.$$
We can also use as stopping criterion $|x_k - x_{k+1}| \le \varepsilon |x_k|$, etc. Unfortunately, none of these criteria guarantees the required precision.

The following sections deal with several traditional iterative methods.

ML.2.2 Method of Successive Approximation

Let $F : [a, b] \to \mathbb{R}$ be a function. A number $x$ is said to be a fixed point of the function $F$ if $F(x) = x$. Root-finding problems and fixed-point problems are equivalent in the following sense: a point $x$ is a root of the equation $f(x) = 0$ if and only if it is a fixed point of the function $F(x) = x - f(x)$. Although the problem we wish to solve is in root-finding form, the fixed-point form is easier to analyze, and certain fixed-point choices lead to very powerful root-finding techniques.

THEOREM ML.2.2.1 If $F : [a, b] \to [a, b]$ has a derivative such that $|F'(x)| \le \alpha < 1$ for all $x \in (a, b)$, then the function $F$ has a unique fixed point $x^*$, and the sequence $(x_k)_{k > 0}$ defined by
$$x_{k+1} = F(x_k), \qquad k \in \mathbb{N}, \quad x_0 \in [a, b],$$
converges to $x^*$.

Proof. Using Lagrange's mean value theorem, it follows that for all $x, y \in [a, b]$ there exists $c \in (a, b)$ such that
$$|F(x) - F(y)| = |F'(c)| \cdot |x - y| \le \alpha |x - y|.$$
The function $F$ is a contraction, and the proof is concluded by the Banach fixed point theorem.

THEOREM ML.2.2.2 Let $F \in C^p[a, b]$, $p \ge 2$, and let $x^* \in (a, b)$ be a fixed point of $F$. If
$$F'(x^*) = F''(x^*) = \cdots = F^{(p-1)}(x^*) = 0, \qquad F^{(p)}(x^*) \ne 0,$$
then, for $x_0$ sufficiently close to $x^*$, the sequence $(x_k)_{k > 0}$, where $x_{k+1} = F(x_k)$, $k \in \mathbb{N}$, converges to $x^*$ with order $p$.

Proof. Since $|F'(x^*)| = 0 < 1$, there exists a ball centered at $x^*$ on which $F$ is a contraction. By virtue of Banach's fixed point theorem, the sequence $(x_k)_{k \ge 0}$ converges to $x^*$. Using Taylor's formula, there exists $c_k \in (a, b)$ such that
$$x_{k+1} = F(x_k) = F(x^*) + \sum_{i=1}^{p-1} \frac{(x_k - x^*)^i}{i!} F^{(i)}(x^*) + \frac{(x_k - x^*)^p}{p!} F^{(p)}(c_k) = x^* + 0 + F^{(p)}(c_k)\frac{(x_k - x^*)^p}{p!},$$
hence
$$\lim_{k \to \infty} \frac{|x_{k+1} - x^*|}{|x_k - x^*|^p} = \lim_{k \to \infty} \frac{|F^{(p)}(c_k)|}{p!} = \frac{|F^{(p)}(x^*)|}{p!} > 0.$$
Consequently, the method has order $p$.
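As a quick illustration (not from the original text), $F(x) = \cos x$ maps $[0, 1]$ into itself and $|F'(x)| \le \sin 1 < 1$ there, so Theorem ML.2.2.1 applies:

(* Sketch: successive approximation for F[x] = Cos[x] on [0, 1] *)
f[x_] := Cos[x];
x = 0.5;
Do[x = f[x], {50}];
{x, x - Cos[x]}
(* the fixed point of Cos, approximately 0.739085, with a tiny defect *)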
ML.2.3 The Bisection Method

THEOREM ML.2.3.1 If $f \in C[a, b]$ and $f(a)f(b) < 0$, then the sequences $(x_k)_{k \ge 0}$, $(y_k)_{k \ge 0}$, defined by $x_0 = a$, $y_0 = b$ and, for $k \in \mathbb{N}$,
$$x_{k+1}, y_{k+1} \in \Bigl\{x_k, \frac{x_k + y_k}{2}, y_k\Bigr\} \quad \text{such that} \quad y_{k+1} - x_{k+1} = \frac{y_k - x_k}{2}, \quad f(x_{k+1})f(y_{k+1}) \le 0,$$
converge to a root $x^*$ of the equation $f(x) = 0$. Moreover, we have
$$|x_k - x^*| \le \frac{b - a}{2^k}, \qquad k \in \mathbb{N}.$$

Proof. The sequence $(x_k)_{k \ge 0}$ is increasing and the sequence $(y_k)_{k \ge 0}$ is decreasing, with $y_k - x_k \to 0$. It follows that they converge to the same limit $x^*$. From the conditions $f(x_k)f(y_k) \le 0$, letting $k \to \infty$, we deduce $f^2(x^*) \le 0$, that is, $f(x^*) = 0$. Since $y_k - x_k = (b - a)/2^k$, it follows that
$$|x_k - x^*| \le \frac{b - a}{2^k}.$$
This is only a bound for the error arising in the approximation process; the actual error can be much smaller.

The bisection method is also called the binary-search method. It is very slow in converging; however, the method guarantees convergence to a solution and, for this reason, it is often used as a "starter" for more efficient methods.

ML.2.4 The Newton-Raphson Method

The Newton-Raphson method (or simply Newton's method) is one of the most powerful and well-known techniques for the root-finding problem.

THEOREM ML.2.4.1 If $x_0 \in [a, b]$ and $f \in C^2[a, b]$ satisfies the conditions:
(1) $f(a)f(b) < 0$;
(2) $f'$ and $f''$ have no roots in $[a, b]$;
(3) $f(x_0)f''(x_0) > 0$,
then the equation $f(x) = 0$ has a unique solution $x^* \in (a, b)$, and the sequence $(x_k)_{k \ge 0}$ defined by
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}, \qquad k \in \mathbb{N},$$
converges to $x^*$.

We shall prove that the Newton method is of order 2. Indeed, using Taylor's formula, we have
$$0 = f(x^*) = f(x_k) + (x^* - x_k)f'(x_k) + \frac{(x^* - x_k)^2}{2}f''(c_k),$$
where $|x^* - c_k| \le |x^* - x_k|$, which we rewrite in the form
$$x_{k+1} - x^* = \frac{(x^* - x_k)^2 f''(c_k)}{2 f'(x_k)},$$
hence
$$\lim_{k \to \infty} \frac{|x_{k+1} - x^*|}{|x_k - x^*|^2} = \frac{|f''(x^*)|}{2|f'(x^*)|} > 0,$$
that is, the Newton method has order 2.

Figure ML.2.1: Newton's Method

We can also define a sequence of approximations by the formula
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_0)}, \qquad k \in \mathbb{N}.$$
This formula only needs the value of the derivative at the starting point $x_0$; it is known as the simplified Newton formula.
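A small sketch of the Newton iteration (not from the original text), on $f(x) = x^2 - 2$ over $[1, 2]$, where the hypotheses of Theorem ML.2.4.1 hold with $x_0 = 2$:

(* Sketch: Newton's method for f[x] = x^2 - 2, starting at x0 = 2 *)
f[x_] := x^2 - 2;
x = 2`20;                    (* 20-digit start, to make order 2 visible *)
Do[x = x - f[x]/f'[x]; Print[x], {5}]
(* the number of correct digits of Sqrt[2] roughly doubles at each step *)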
ML.2.5 The Secant Method

Newton's method is an extremely powerful technique, but it has a major difficulty: it requires the value of the derivative of the function at each approximation. To circumvent the problem of derivative evaluation in Newton's method, we derive a slight variation. Using the approximation
$$\frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$$
for $f'(x_k)$ in Newton's formula gives
$$x_{k+1} = \frac{x_{k-1}f(x_k) - x_k f(x_{k-1})}{f(x_k) - f(x_{k-1})}, \qquad k \in \mathbb{N}^*.$$
The technique using this formula is called the secant method. Starting with two initial approximations $x_0$ and $x_1$, the approximation $x_{k+1}$ is the $x$-intercept of the line joining $(x_{k-1}, f(x_{k-1}))$ and $(x_k, f(x_k))$, $k = 1, 2, \dots$

ML.2.6 False Position Method

The method of False Position (also called Regula Falsi) generates approximations in a way similar to the secant method, but it provides a test to ensure that the root lies between two successive iterates. The method can be described as follows. Choose $x_0$, $x_1$ such that $f(x_0)f(x_1) < 0$. Let $x_2$ be the $x$-intercept of the line joining $(x_0, f(x_0))$ and $(x_1, f(x_1))$. If $f(x_0)f(x_2) < 0$, let $x_3$ be the $x$-intercept of the line joining $(x_0, f(x_0))$ and $(x_2, f(x_2))$; if $f(x_1)f(x_2) < 0$, let $x_3$ be the $x$-intercept of the line joining $(x_1, f(x_1))$ and $(x_2, f(x_2))$; and so on.

THEOREM ML.2.6.1 Let $f \in C^2[a, b]$, $x_0, x_1 \in [a, b]$, $x_0 \ne x_1$. If the following conditions are satisfied:
(1) $f''(x) \ne 0$ for all $x \in [a, b]$;
(2) $f(x_0)f''(x_0) > 0$ (Fourier condition);
(3) $f(x_0)f(x_1) < 0$,
then the equation $f(x) = 0$ has a unique root $x^*$ between $x_0$ and $x_1$, and the sequence $(x_k)_{k \ge 0}$ defined by
$$x_{k+1} = \frac{x_0 f(x_k) - x_k f(x_0)}{f(x_k) - f(x_0)}, \qquad k \in \mathbb{N},$$
converges to $x^*$.

Figure ML.2.2: False Position Method

ML.2.7 The Chebyshev Method

Let $f \in C^{p+1}[a, b]$ and let $x^*$ be a solution of the equation $f(x) = 0$. Assume that $f'$ has no roots in $[a, b]$, and let $h$ be the inverse of the function $f$. By virtue of Taylor's formula we obtain
$$h(t) = h(y) + \sum_{i=1}^{p} \frac{h^{(i)}(y)}{i!}(t - y)^i + \frac{h^{(p+1)}(c)}{(p+1)!}(t - y)^{p+1},$$
where $|c - y| < |t - y|$. For $t = 0$ and $y = f(x)$ we obtain
$$x^* = x + \sum_{i=1}^{p} \frac{h^{(i)}(f(x))}{i!}(-1)^i f^i(x) + \frac{h^{(p+1)}(c)}{(p+1)!}(-1)^{p+1} f^{p+1}(x),$$
where $|c - f(x)| \le |f(x)|$. Use the notation
$$F(x) = x + \sum_{i=1}^{p} \frac{(-1)^i}{i!} a_i(x) f^i(x), \qquad \text{where } a_i(x) = h^{(i)}(f(x)).$$
The Chebyshev method is obtained by defining the sequence $(x_k)_{k \ge 0}$ such that
$$x_{k+1} = F(x_k), \qquad k \in \mathbb{N}.$$
If $h^{(p+1)}(0) \ne 0$, then the Chebyshev method has order of convergence $p + 1$. In order to calculate the coefficients $a_i(x)$, $i = 1, \dots, p$, we differentiate $p$ times the identity $h(f(x)) = x$, $x \in [a, b]$. We obtain $h'(f(x)) \cdot f'(x) = 1$, i.e.,
$$a_1(x) f'(x) = 1, \qquad a_1(x) = \frac{1}{f'(x)},$$
and
$$a_i(x) = \frac{1}{f'(x)} \frac{d}{dx} a_{i-1}(x), \qquad i = 1, \dots, p, \quad \text{where } a_0(x) = x.$$
For $p = 1$ we obtain $x_{k+1} = x_k - f(x_k)/f'(x_k)$, $k \in \mathbb{N}$, i.e., the Newton method. For $p = 2$ we obtain a Chebyshev method of order 3:
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} - \frac{f''(x_k) f^2(x_k)}{2(f'(x_k))^3}, \qquad k \in \mathbb{N}.$$
The Chebyshev method needs more calculations per step, but it converges more rapidly.

EXAMPLE ML.2.7.1 Mathematica

(* Chebyshev *)
p = 3;
a = Table[0, {p}];
a[[1]] = 1/f'[t];
For[i = 2, i <= p, i++, a[[i]] = Simplify[D[a[[i - 1]], t]/f'[t]]];
F[j_, x_] := t + Sum[(-1)^i/i! a[[i]] f[t]^i, {i, j}] /. t -> x;
TableForm[Table[a[[i]], {i, p}]]
TableForm[Table[F[i, x], {i, p}]]

a1 = 1/f'[t]
a2 = −f''[t]/f'[t]^3
a3 = (3 f''[t]^2 − f'[t] f'''[t])/f'[t]^5

F[1, x] = x − f[x]/f'[x]
F[2, x] = x − f[x]/f'[x] − f[x]^2 f''[x]/(2 f'[x]^3)
F[3, x] = x − f[x]/f'[x] − f[x]^2 f''[x]/(2 f'[x]^3) − f[x]^3 (3 f''[x]^2 − f'[x] f'''[x])/(6 f'[x]^5)
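A numeric companion to the symbolic example above (not from the original text), running the order-3 iteration $F[2, \cdot]$ on $f(x) = x^2 - 2$:

(* Sketch: order-3 Chebyshev iteration (p = 2) for f[x] = x^2 - 2 *)
f[x_] := x^2 - 2;
x = 1`30;
Do[x = x - f[x]/f'[x] - (f''[x] f[x]^2)/(2 f'[x]^3); Print[x], {4}]
(* the number of correct digits roughly triples at each step *)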
ML.2.8 Numerical Solutions of Nonlinear Systems of Equations

Consider the functions $f_i : M_{n,1}(\mathbb{R}) \to \mathbb{R}$, $i = 1, \dots, n$, and $F : M_{n,1}(\mathbb{R}) \to M_{n,1}(\mathbb{R})$, where $F = [f_1, \dots, f_n]^t$. The functions $f_1, f_2, \dots, f_n$ are called the coordinate functions of $F$. Consider the system of equations
$$\begin{cases} f_1([x_1, \dots, x_n]^t) = 0 \\ f_2([x_1, \dots, x_n]^t) = 0 \\ \qquad \vdots \\ f_n([x_1, \dots, x_n]^t) = 0 \end{cases} \tag{2.8.1}$$
in the unknown $x = [x_1, x_2, \dots, x_n]^t \in M_{n,1}(\mathbb{R})$. The system (2.8.1) has the form
$$F(x) = 0. \tag{2.8.2}$$
By denoting $g(x) = F(x) + x$, one can see that the solutions of equation (2.8.2) are the fixed points of $g = [g_1, \dots, g_n]^t$. Let $\|x\|_\infty := \max_{1 \le i \le n} |x_i|$ and let $x^*$ be a solution of equation (2.8.2). Define the ball
$$D := \{x \in M_{n,1}(\mathbb{R}) \mid \|x - x^*\|_\infty \le r\}, \qquad r > 0.$$

THEOREM ML.2.8.1 If the functions $g_i : M_{n,1}(\mathbb{R}) \to \mathbb{R}$ ($i = 1, 2, \dots, n$) satisfy the conditions
$$\Bigl|\frac{\partial g_i}{\partial x_j}(x)\Bigr| \le \frac{\lambda}{n} \qquad (\lambda < 1,\ i, j = 1, \dots, n)$$
for all $x \in D$, then equation (2.8.2) has a unique solution $x^* \in D$. Moreover, for every point $x^{(0)} \in D$ the sequence $(x^{(k)})_{k \ge 0}$,
$$x^{(k+1)} = g(x^{(k)}), \qquad k \in \mathbb{N},$$
converges to the solution $x^*$.

Proof. We prove that the function $g$ is a $\lambda$-contraction on $D$. Let $x, y \in D$. We have
$$|g_i(x) - g_i(y)| \le \sum_{j=1}^{n} \Bigl|\frac{\partial g_i}{\partial x_j}(\xi_i)\Bigr| |x_j - y_j| \le \frac{\lambda}{n} \sum_{j=1}^{n} |x_j - y_j| \le \lambda \|x - y\|_\infty, \qquad i = 1, 2, \dots, n.$$
Hence $\|g(x) - g(y)\|_\infty \le \lambda \|x - y\|_\infty$. Now let us prove that $g(D) \subset D$. Let $x \in D$; then
$$\|g(x) - x^*\|_\infty = \|g(x) - g(x^*)\|_\infty \le \lambda \|x - x^*\|_\infty \le \lambda r < r,$$
hence $g(x) \in D$. Using the Banach fixed point theorem, the sequence $(x^{(k)})_{k \ge 0}$ converges to the unique fixed point of $g$ in $D$. An estimate of the error is given by
$$\|x^{(k)} - x^*\|_\infty < \frac{\lambda^k}{1 - \lambda}\|x^{(1)} - x^{(0)}\|_\infty, \qquad k \in \mathbb{N}.$$

ML.2.9 Newton's Method for Systems of Nonlinear Equations

Let $f_1, f_2, \dots, f_n \in C^1(M_{n,1}(\mathbb{R}))$. In addition to the notations used in Section (ML.2.8), we need the following definition:
$$F'(x) \stackrel{\text{def.}}{=} \Bigl[\frac{\partial f_i}{\partial x_j}(x)\Bigr]_{1 \le i,j \le n}.$$
Assume that at the fixed point $x^*$ the matrix $(F'(x^*))^{-1}$ exists. Taking
$$g(x) = x - (F'(x^*))^{-1} \cdot F(x),$$
we can see that
$$\frac{\partial g_i}{\partial x_j}(x^*) = 0, \qquad i, j = 1, 2, \dots, n.$$
Since $F'$ is continuous, there exists a ball $D$ centered at $x^*$ on which the conditions of Theorem (ML.2.8.1) are satisfied for $g(x) = x - (F'(x))^{-1} \cdot F(x)$. For a starting value $x^{(0)}$ sufficiently close to $x^*$, the sequence $(x^{(k)})_{k \ge 0}$ defined by
$$x^{(k+1)} = x^{(k)} - (F'(x^{(k)}))^{-1} \cdot F(x^{(k)}), \qquad k \in \mathbb{N},$$
converges to $x^*$.

In the case $n = 2$, with $F(x, y) = [f(x, y), g(x, y)]^t$, we have
$$F'(x, y) = \begin{pmatrix} f_x' & f_y' \\ g_x' & g_y' \end{pmatrix}, \qquad (F'(x, y))^{-1} = \frac{1}{f_x'g_y' - f_y'g_x'} \begin{pmatrix} g_y' & -f_y' \\ -g_x' & f_x' \end{pmatrix},$$
$$x^{(k+1)} = x^{(k)} - \frac{f\,g_y' - g\,f_y'}{f_x'g_y' - g_x'f_y'}\bigg|_{(x^{(k)},\,y^{(k)})}, \qquad y^{(k+1)} = y^{(k)} + \frac{f\,g_x' - g\,f_x'}{f_x'g_y' - g_x'f_y'}\bigg|_{(x^{(k)},\,y^{(k)})}, \qquad k \in \mathbb{N}.$$

The weakness of Newton's method arises from the need to compute and invert the matrix $F'(x^{(k)})$ at each step. In practice, explicit calculation of $(F'(x^{(k)}))^{-1}$ is avoided by performing the operation in two steps:
– Step (1): a vector $y^{(k)} = [y_1^{(k)}, \dots, y_n^{(k)}]^t$ is found which satisfies
$$F'(x^{(k)})\, y^{(k)} = -F(x^{(k)});$$
– Step (2): the vector $x^{(k+1)}$ is calculated by the formula $x^{(k+1)} = x^{(k)} + y^{(k)}$.

ML.2.10 Steepest Descent Method

The steepest descent method (also called the gradient method) determines a local minimum of a function $\Psi : \mathbb{R}^n \to \mathbb{R}$. Note that the system (2.8.1) has a solution precisely when the function $\Psi$ defined by $\Psi = f_1^2 + \cdots + f_n^2$ attains the minimal value zero.

The method of steepest descent for finding a local minimum of an arbitrary function $\Psi : \mathbb{R}^n \to \mathbb{R}$ can be intuitively described as follows:
(1) evaluate $\Psi$ at an initial approximation $x^{(0)}$;
(2) determine a direction from $x^{(0)}$ that results in a decrease of the value of $\Psi$;
(3) move an appropriate distance in this direction and call the new vector $x^{(1)}$;
then repeat steps (1) to (3).

More precisely: consider a starting value $x^{(0)}$ and find the minimum of the function $t \mapsto \Psi(x^{(0)} - t\,\operatorname{grad}\Psi(x^{(0)}))$. Let $t_0$ be the smallest nonnegative root of the equation
$$\frac{d}{dt}\,\Psi\bigl(x^{(0)} - t\,\operatorname{grad}\Psi(x^{(0)})\bigr) = 0,$$
and define $x^{(1)} = x^{(0)} - t_0\,\operatorname{grad}\Psi(x^{(0)})$. Step by step, denoting by $t_k$ the smallest nonnegative root of the equation
$$\frac{d}{dt}\,\Psi\bigl(x^{(k)} - t\,\operatorname{grad}\Psi(x^{(k)})\bigr) = 0,$$
take
$$x^{(k+1)} = x^{(k)} - t_k\,\operatorname{grad}\Psi(x^{(k)}), \qquad k \in \mathbb{N}.$$
Steepest descent converges only linearly. As a consequence, the method is mostly used to find a sufficiently accurate starting approximation for Newton-based techniques.
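To close the chapter, here is a small sketch (not from the original text) of the two-step Newton iteration of Section ML.2.9, on an assumed 2 × 2 system $x^2 + y^2 - 4 = 0$, $xy - 1 = 0$:

(* Sketch: Newton's method for an assumed nonlinear 2x2 system *)
bigF[{x_, y_}] := {x^2 + y^2 - 4, x y - 1};
jac[{x_, y_}] := {{2 x, 2 y}, {y, x}};        (* Jacobian F'(x, y) *)
v = {2., 0.5};
Do[v = v + LinearSolve[jac[v], -bigF[v]], {6}];   (* steps (1) and (2) *)
{v, bigF[v]}
(* v close to {1.93185, 0.51764}, with residual near zero *)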
If the required x lies in the interval (x0 , xn ) the problem is called interpolation; if x is outside the interval [x0 , xn ], it is called extrapolation. Interpolation must model the function by some known function called interpolator. By far most common among the functions used as interpolators are polynomials. Rational functions and trigonometric polynomials also turn out to be extremely useful in interpolation. We would like to emphasize the contribution given to the Interpolation Theory by the members of the school of thought in the Theory of Allure, Convexity and Interpolation founded by Elena Popoviciu. ML.3.1 Lagrange Interpolation Let x0 < ∙ ∙ ∙ < xn be a set of distinct real numbers and f be a function whose values are given at these numbers. DEFINITION ML.3.1.1 The polynomial P of degree at most n such that P (xi ) = f (xi ), i = 0, . . . , n, is called the Lagrange Interpolating Polynomial (or simply, Lagrange Polynomial) of the function f with respect to x0 , . . . , xn . The numbers xi are called mesh points or interpolation points. Consider the polynomial P : R → R, P (x) = a0 + a1 x + ∙ ∙ ∙ + an xn . The Lagrange interpolating conditions P (xi ) = f (xi ), i = 0, . . . , n, 53 54 ML.3. ELEMENTS OF INTERPOLATION THEORY are equivalent to the system n a0 + a1 x0 + ∙ ∙ ∙ + an x0 = f (x0 ) a0 + a1 x1 + ∙ ∙ ∙ + an xn = f (x1 ) 1 .. .. .. .. . . . . a + a x + ∙ ∙ ∙ + a xn = f (x ) 0 1 n n n n in unknowns a0 , . . . , an . Since the Vandermonde determinant V (x0 , . . . , xn ) = 1 x0 ∙ ∙ ∙ xn0 1 x1 ∙ ∙ ∙ xn1 . .. .. . . . .. . . 1 xn ∙ ∙ ∙ xnn is different from zero (the points xi are distinct) the above system has a unique solution. The uniqueness of the Lagrange Polynomial allows us to denote it by the symbol L(x0 , . . . , xn ; f ). DEFINITION ML.3.1.2 The divided difference of the function f with respect to the points x0 , . . . , xn , is defined to be the coefficient at xn in the Lagrange interpolating polynomial L(x0 , . . . , xn ; f ) and is denoted by [x0 , . . . , xn ; f ]. For example, [x0 ; f ] = f (x0 ), REMARK [x0 , x1 ; f ] = f (x1 ) − f (x0 ) . x 1 − x0 ML.3.1.3 In the case when the points xi are not distinct, see ( ML.3.6.2 ). For instance, [x0 , x0 , x0 ; f ] = ML.3.2 f 00 (x0 ) 2 . Some Forms of the Lagrange Polynomial From the interpolating conditions: L(x0 , . . . , xn ; f )(xi ) = f (xi ), i = 0, . . . , n, ML.3.2. SOME FORMS OF THE LAGRANGE POLYNOMIAL we obtain the system a0 a0 .. . a 0 a 0 + a1 x0 + ∙ ∙ ∙ + an xn0 + a1 x1 + ∙ ∙ ∙ + an xn1 .. .. . . + a1 xn + ∙ ∙ ∙ + an xnn + a1 x + ∙ ∙ ∙ + a n xn in unknowns a0 , . . . , an . The 1 1 .. . 1 1 = = 55 f (x0 ) f (x1 ) .. . = f (xn ) = L(x0 , . . . , xn ; f )(x) system has a solution if x0 ∙ ∙ ∙ x1 ∙ ∙ ∙ .. . . . . xn ∙ ∙ ∙ x ∙∙∙ xn0 xn1 .. . xnn xn hence, we obtain = 0, f (xn ) L(x0 , . . . , xn ; f )(x) f (x0 ) f (x1 ) .. . L(x0 , . . . , xn ; f )(x) = − 1 x0 1 x1 .. .. . . 1 xn 1 x ∙ ∙ ∙ xn0 f (x0 ) ∙ ∙ ∙ xn1 f (x1 ) .. . . . .. . . n ∙ ∙ ∙ xn f (xn ) 0 ∙ ∙ ∙ xn V (x0 , . . . , xn ) (3.2.1) Consider the polynomials `i ∈ R[X], i = 0, . . . , n, `i (x) = (x − x0 ) . . . (x − xi−1 )(x − xi+1 ) . . . (x − xn ) . (xi − x0 ) . . . (xi − xi−1 )(xi − xi+1 ) . . . (xi − xn ) It is clear that: `i (xk ) = δik , and hence the polynomial P = i, k = 0, . . . n, n X f (xi )`i i=0 satisfies the relations: P (xk ) = f (xk ), k = 0, . . . , n. Since the Lagrange Polynomial has a unique representation we obtain P = L(x0 , . . . , xn ; f ). Hence n X f (xi )`i . (3.2.2) L(x0 , . . . 
, xn ; f ) = i=0 The polynomials `i are called the fundamental polynomials of the Lagrange interpolation. By letting `(x) = (x − x0 )(x − x1 ) . . . (x − xn ), 56 ML.3. ELEMENTS OF INTERPOLATION THEORY we obtain `i (x) = and consequently `(x) , (x − xi )`0 (xi ) L(x0 , . . . , xn ; f )(x) = n X i=0 f (xi ) `(x) . (x − xi )`0 (xi ) (3.2.3) Consider the polynomial Q = L(x0 , . . . , xn ; f ) − (X − x0 ) . . . (X − xn−1 )[x0 , . . . , xn ; f ]. Note that the polynomial Q is of degree at most n − 1 and it satisfies the relations: Q(xi ) = f (xi ), i = 0, . . . , n − 1. It follows that Q = L(x0 , . . . , xn−1 ; f ), and hence L(x0 , . . . , xn ; f ) = L(x0 , . . . , xn−1 ; f ) + (X − x0 ) . . . (X − xn−1 )[x0 , . . . , xn ; f ]. Adding the equalities: L(x0 , x1 ; f ) L(x0 , x1 , x2 ; f ) ............... L(x0 , . . . , xn ; f ) = f (x0 ) + (X − x0 )[x0 , x1 ; f ] = L(x0 , x1 ; f ) + (X − x0 )(X − x1 )[x0 , x1 , x2 ; f ] ... ............................................ = L(x0 , . . . , xn−1 ; f ) +(X − x0 ) . . . (X − xn−1 )[x0 , . . . , xn ; f ], we obtain the Newton Interpolatory Divided Difference Formula: L(x0 , . . . , xn ; f ) = f (x0 ) + (X − x0 )[x0 , x1 ; f ] + ∙ ∙ ∙ +(X − x0 ) . . . (X − xn−1 )[x0 , . . . , xn ; f ]. EXAMPLE ML.3.2.1 1 x0 f (x0 ) 1 x0 L(x0 , x1 ; f ) = − 1 x1 f (x1 ) / 1 x 1 1 x 0 x − x1 x − x0 = f (x0 ) + f (x1 ) x0 − x 1 x1 − x 0 = f (x0 ) + (x − x0 )[x0 , x1 ; f ]. (3.2.4) ML.3.2. SOME FORMS OF THE LAGRANGE POLYNOMIAL 57 Let xi , xj be distinct points, i, j ∈ {0, . . . , n}. Let Lk := L(x0 , . . . , xk−1 , xk+1 , . . . , xn ; f ), k = 0, . . . , n. Consider the polynomial Q= Note that: (X − xi )Li − (X − xj )Lj . xj − xi x i − xj Lj (xi ) = f (xi ), xj − xi xj − xi Li (xj ) = f (xj ), Q(xj ) = xj − xi (xk − xi )f (xk ) − (xk − xj )f (xk ) = f (xk ), Q(xk ) = xj − xi k ∈ {0, . . . , n} \ {i, j}, that is, Q(xi ) = − Q(xk ) = f (xk ), and grad (Q) ≤ n. Hence, k = 0, . . . , n, Q = L(x0 , . . . , xn ; f ). Consequently we obtain the Aitken-Neville formula: L(x0 , . . . , xn ; f ) = X − xi L(x0 , . . . , xi−1 , xi+1 , . . . , xn ; f ) xj − xi X − xj L(x0 , . . . , xj−1 , xj+1 , . . . , xn ; f ) − xj − xi (i, j ∈ {0, . . . , n}, i 6= j). (3.2.5) Consider the notations: Q(i, j) = L(xi−j , . . . , xi ; f )(x), 0 ≤ j ≤ i ≤ n. Using formula ( 3.2.5 ) we obtain the following algorithm Q(i, j) = (x − xi−j )Q(i, j − 1) − (x − xi )Q(i − 1, j − 1) xi − xi−j (i = 1, . . . , n; j = 1, . . . , i). (3.2.6) used to generate recursively the Lagrange Polynomials. For example, with n = 3, the polynomials can be obtained by proceeding as shown in the table below, where each row is completed before the succeeding rows are begun. Q(0, 0) = f (x0 ) Q(1, 0) = f (x1 ) Q(2, 0) = f (x2 ) Q(3, 0) = f (x3 ) ↓ Q(1, 1) ↓ Q(2, 1) → Q(2, 2) . Q(3, 1) → Q(3, 2) → Q(3, 3) 58 ML.3. ELEMENTS OF INTERPOLATION THEORY Using the notations L(i, j) = L(xi , . . . , xj ; f )(x) (0 ≤ i ≤ j ≤ n) from ( 3.2.5 ), we obtain the Neville’s algorithm L(i − m, i) = (x − xi−m )L(i − m + 1, i) − (x − xi )L(i − m, i − 1) xi − xi−m (m = 1, . . . , n; i = n, n − 1, . . . , m). (3.2.7) The various L’s form a “tableau” with “ancestors” on the left leading to a single “descendent” at the extreme right. The Neville’s algorithm is a recursive way of filling in the numbers in the “tableau”, a column at the time, from left to right. For example, with n = 3, we have L(0, 0) = f (x0 ) L(1, 1) = f (x1 ) L(2, 2) = f (x2 ) L(3, 3) = f (x3 ) L(0, 1) ↑ & L(1, 2) & L(0, 2) ↑ & ↑ & L(2, 3) L(1, 3) L(0, 3) ↑ ML.3.2. 
SOME FORMS OF THE LAGRANGE POLYNOMIAL EXAMPLE ML.3.2.2 Mathematica (* Neville Method 1*) n:=3; Clear[x, f, L]; Array[x, n, 0]; Array[L, {n, n}, 0]; f [t ]:=t3 ; For i = 0, i ≤ n, i++, x[i] = ni ; L[i, i] = f [x[i]] ; For[j = 1, j ≤ n, j++, For[i = n, i ≥ j, i–, L[i − j, i] = ((x − x[i − j])L[i − j + 1, i] − (x − x[i])L[i − j, i − 1])/ (x[i] − x[i − j])] ] Simplify[TableForm[Table[L[i, j], {i, 0, n}, {j, 0, n}]]] x 0 − 2x + x2 x3 9 9 1 2 2 L[1, 0] 27 − 9 + 7x − 11x + 2x2 9 9 9 8 10 19x L[2, 0] L[2, 1] 27 −9 + 9 L[3, 0] L[3, 1] L[3, 2] 1 Clear[L]; step = 1; For[j = 1, j ≤ n, j++, For[i = n, i ≥ j, i–, L[i − j, i] = “step”(N [step]); step = step + 1] ]; Print[ “======================================”] Print["The way of filling in the numbers in the table"] Print[“===================================”] Do[{Print[Simplify[L[k, 0]], “ * ”, Simplify[L[k, 1]], “ * ”, Simplify[L[k, 2]], “ * ”, Simplify[L[k, 3]]]}, {k, 0, n}]; Print[“===================================”]; ====================================== The way of filling in the numbers in the table =================================== L[0, 0] * 3.step * 5.step * 6.step L[1, 0] * L[1, 1] * 2.step * 4.step L[2, 0] * L[2, 1] * L[2, 2] * 1.step L[3, 0] * L[3, 1] * L[3, 2] * L[3, 3] =================================== 59 60 ML.3. ELEMENTS OF INTERPOLATION THEORY EXAMPLE ML.3.2.3 Mathematica (* Aitken Method *) << Graphics`Arrow` Array[x, n, 0]; Array[Q, {n, n}, 0]; n:=3; Clear[x, f, Q, points]; points = {}; {};Array[x, 3 f [t ]:=t ; For i = 0, i ≤ n, i++, x[i] = ni ; Q[i, 0] = f [x[i]] ; For[i = 1, i ≤ n, i++, For[j = 1, j ≤ i, j++, AppendTo[points, {j, −i}]; i (x−x[i−j])Q[i,j−1]−(x−x[i])Q[i−1,j−1] ;] Q[i, j] = x[i]−x[i−j] Simplify[TableForm[Table[Q[i, j], {i, 0, n}, {j, 0, i}]]] 0 1 27 8 27 1 x 9 − 29 + 7x − 2x + x2 9 9 2 − 10 + 19x − 11x + 2x2 x3 9 9 9 9 Print[“Lagrange(x) = ”, Simplify[Q[n, n]]] Lagrange(x) = x3 txt points[[i]][[1]]}, points[[i]], {1.2, −2}], n = Table[Text[{−points[[i]][[2]], o n(n+1) i, 1, 2 ]; oi h n ; arrows = Table Arrow[points[[i − 1]], points[[i]]], i, 2, n(n+1) 2 Show[Graphics[txt], Graphics[arrows], Graphics[{PointSize[.025], RGBColor[1, 0, 0], Point/@points}], AspectRatio → Automatic, PlotRange → All] 1, 1 2, 1 2, 2 3, 1 3, 2 3, 3 ML.3.3. SOME PROPERTIES OF THE DIVIDED DIFFERENCE ML.3.3 61 Some Properties of the Divided Difference Some forms of the divided difference can be obtained from the formulas which were derived in the previous section. From ( 3.2.1 ) we deduce [x0 , . . . , xn ; f ] = Eq. ( 3.3.1 ) implies i 1 x0 ∙ ∙ ∙ 1 x1 ∙ ∙ ∙ .. .. . . . . . 1 xn ∙ ∙ ∙ 1 x0 ∙ ∙ ∙ 1 x1 ∙ ∙ ∙ .. .. . . . . . 1 xn ∙ ∙ ∙ [x0 , . . . , xn ; x ] = x0n−1 f (x0 ) x1n−1 f (x1 ) .. .. . . xnn−1 f (xn ) x0n−1 xn0 x1n−1 xn1 .. .. . . n−1 xn xnn . 0, i = 0, . . . , n − 1; 1, i = n. (3.3.1) (3.3.2) From ( 3.2.2 ) we obtain = n X i=0 [x0 , . . . , xn ; f ] f (xi ) . (xi − x0 ) . . . (xi − xi−1 )(xi − xi+1 ) . . . (xi − xn ) (3.3.3) From here we easily get [x0 , . . . , xi , y0 , . . . , yj ; (t − x0 ) . . . (t − xi ) f (t)]t = [y0 , . . . , yj ; f ]. (3.3.4) By identifying the coefficients at xn in ( 3.2.5 ), we obtain the following recurrence formula for divided differences [x0 , . . . , xn ; f ] = [x1 , . . . , xn ; f ] − [x0 , . . . , xn−1 ; f ] . xn − x0 (3.3.5) For instance, with n = 3, the determination of the divided difference from tabulated data points is outlined in the table below: 62 ML.3. 
ELEMENTS OF INTERPOLATION THEORY f (x0 ) & f (x1 ) % & f (x2 ) f (x3 ) [x0 , x1 ; f ] & [x0 , x1 , x2 ; f ] % & % & % & % [x1 , x2 ; f ] [x2 , x3 ; f ] [x1 , x2 , x3 ; f ] [x0 , x1 , x2 , x3 ; f ] % By applying the functional [x0 , . . . , xn ; ∙] to the identity (t − xi ) f (t) = x i − x0 xn − xi (t − xn ) f (t) + (t − x0 ) f (t), xn − x0 xn − x0 t ∈ [t0 , tn ], i = 0, . . . , n, using ( 3.3.4 ), we obtain another recurrence formula [x0 , . . . , xi−1 , xi+1 , . . . , xn ; f ] xi − x0 xn − x i = [x0 , . . . , xn−1 ; f ] + [x1 , . . . , xn ; f ], xn − x 0 xn − x0 (3.3.6) i = 0, . . . , n. The following is the Leibniz formula for divided differences [x0 , . . . , xn ; f g] = n X k=0 [x0 , . . . , xk ; f ] ∙ [xk , . . . , xn ; g]. We give bellow a simple proof of this formula (Ivan, 1998)[Iva98a]: = = [x0 , . . . , xn ; f g] [x0 , . . . , xn ; L(x0 , . . . , xn ; f ) ∙ g] n X (t − x0 ) . . . (t − xk−1 )[x0 , . . . , xk ; f ] ∙ g(t)]t [x0 , . . . , xn ; k=0 = ( 3.3.4 ) = n X k=0 n X k=0 [x0 , . . . , xk ; f ] ∙ [x0 , . . . , xn ; (t − x0 ) . . . (t − xk−1 ) g(t)]t [x0 , . . . , xk ; f ] ∙ [xk , . . . , xn ; g]. Another useful formula is the integral representation (3.3.7) ML.3.4. MEAN VALUE PROPERTIES IN LAGRANGE INTERPOLATION 63 [x0 , . . . , xk ; f ] tZn−1 Z1 Zt1 = dt1 dt2 ∙ ∙ ∙ f (n) (x0 + (x1 − x0 )t1 + ∙ ∙ ∙ + (xn − xn−1 )tn )dtn (3.3.8) 0 0 0 which is valid if f (n−1) is absolutely continuous. This can be proved by mathematical induction on n. If A : C[a, b] → R is a continuous linear functional which is orthogonal to all polynomials P of degree ≤ n − 1, i.e., A(P ) = 0, then the following Peano-type representation is valid A(f ) = Z b f (n) (t) A a n−1 (∙ − t)+ (n − 1)! dt, (3.3.9) for all f ∈ C n [a, b], where u+ := 0 if u < 0 and u+ := u if u ≥ 0. This can be proved by applying the functional A to both sides of the Taylor formula f (x) = f (a) + (x − a)n−1 (n−1) x−a 0 (a) f (a) + ∙ ∙ ∙ + f 1! (n − 1)! Z 1 + (n − 1)! b a n−1 f (n) (t) (x − t)+ dt. The functional [x0 , . . . , xn ; ∙] is continuous on C[a, b]. Hence it has a Peano representation 1 [x0 , . . . , xn ; f ] = n! Z b a n−1 n ∙ [x0 , . . . , xn ; (∙ − t)+ ] f (n) (t) dt, (3.3.10) n−1 k = 1, . . . , n. We would like to emphasize that the Peano kernel n ∙ [x0 , . . . , xn ; (∙ − t)+ ], a B-spline function, is non-negative (see Remark ( ML.3.11.1 )). If the points x0 , . . . , xn are inside the simple closed curve C and f is analytic on and inside C, then Z f (z) dz 1 [x0 , . . . , xn ; f ] = . (3.3.11) 2πi C (z − x0 ) . . . (z − xn ) ML.3.4 Mean Value Properties in Lagrange Interpolation Let x0 , . . . , xn be a set of distinct points in the interval [a, b]. The following mean value theorem gives a bound for the error arising in the Lagrange interpolating-approximating process. 64 ML.3. ELEMENTS OF INTERPOLATION THEORY THEOREM ML.3.4.1 (Lagrange Error Formula) If the function f : [a, b] → R is continuous and possesses an (n + 1)th derivative on the interval (a, b), then for each x ∈ [a, b] there exists a point ξ(x) ∈ (a, b), such that f (x) − L(x0 , . . . , xn ; f )(x) = (x − x0 ) . . . (x − xn ) f (n+1) (ξ(x)) (n + 1)! . Proof. If x ∈ {x0 , . . . , xn } then the formula is satisfied for all ξ ∈ (a, b). Assume that x ∈ [a, b] \ {x0 , . . . , xn }. Define an auxiliary function H : [a, b] → R, such that H(t) = (f (t) − L(x0 , . . . , xn ; f )(t))(x − x0 ) . . . (x − xn ) − (f (x) − L(x0 , . . . , xn ; f )(x))(t − x0 ) . . . (t − xn ). Note that the function H has n + 2 roots, namely x, x0 , . . . , xn . 
By the Generalized Rolle Theorem there exists a point ξ(x) ∈ (a, b) with H (n+1) (ξ(x)) = 0, i.e., f (x) − L(x0 , . . . , xn ; f )(x) = (x − x0 )(x − x1 ) . . . (x − xn ) f (n+1) (ξ(x)) . (n + 1)! The error formula given in Theorem ( ML.3.4.1 ) is an important theoretical result because Lagrange polynomials are extremely used for deriving numerical differentiation and integration methods. The well known Lagrange’s Mean Value Theorem states that if f ∈ C[x0 , x1 ] and f 0 exists on (x0 , x1 ), then [x0 , x1 ; f ] = f 0 (ξ) for some ξ ∈ (x0 , x1 ). This result can be generalized by the following theorem. THEOREM ML.3.4.2 If the function f : [a, b] → R is continuous and has an nth derivative on (a, b), then there exists a point ξ ∈ (a, b) such that [x0 , . . . , xn ; f ] = Proof. Define an auxiliary function F 1 1 F (x) = ... 1 1 f (n) (ξ) n! . : [a, b] → R, such that x0 ∙ ∙ ∙ xn0 f (x0 ) x1 ∙ ∙ ∙ xn1 f (x1 ) .. . . . .. . . .. . . n xn ∙ ∙ ∙ xn f (xn ) x ∙ ∙ ∙ xn f (x) ML.3.5. APPROXIMATION BY INTERPOLATION 65 Note that the function F has the roots x0 , . . . , xn . By the Generalized Rolle Theorem, there exists a point ξ ∈ (a, b) with F (n) (ξ) = 0, i.e., 1 x0 1 x1 .. .. . . 1 xn 0 0 ∙ ∙ ∙ xn0 f (x0 ) ∙ ∙ ∙ xn1 f (x1 ) . .. .. . .. . n ∙ ∙ ∙ xn f (xn ) ∙ ∙ ∙ n! f (n) (ξ) = 0. By expanding the determinant in terms of the last row we obtain 1 x0 ∙ ∙ ∙ xn−1 f (x0 ) 0 1 x1 ∙ ∙ ∙ xn−1 f (x1 ) 1 f (n) (ξ)V (x0 , . . . , xn ) − n! .. .. . . .. .. . . . . . 1 xn ∙ ∙ ∙ xnn−1 f (xn ) and the theorem is proved. ML.3.5 =0 Approximation by Interpolation Let f ∈ C ∞ [a, b] be a function possessing uniform bounded derivatives. THEOREM ML.3.5.1 If (xn )n≥0 is a sequence of distinct points in the interval [a, b], then the sequence of Lagrange Polynomials (L(x0 , . . . , xn ; f ))n≥0 converges uniformly to f on [a, b]. Proof. Since f has uniform bounded derivatives, then there exists a constant M > 0 such that |f (n) (x)| ≤ M, for all x ∈ [a, b] and n ∈ N. Let x ∈ [a, b]. By Theorem ( ML.3.4.1 ) there exists a point ξ ∈ (a, b), such that |f (n+1) (ξ)| |f (x) − L(x0 , . . . , xn ; f )(x)| = |(x − x0 ) . . . (x − xn )| (n + 1)! n+1 (b − a) ≤ M. (n + 1)! By using (b − a)n = 0, n→∞ n! it follows that the sequence (L(x0 , . . . , xn ; f ))n∈N converges uniformly to f on [a, b]. lim ML.3.6 Hermite-Lagrange Interpolating Polynomial Let m, n ∈ N, and α0 , . . . , αn ∈ N∗ such that α0 + ∙ ∙ ∙ + αn = m + 1. 66 ML.3. ELEMENTS OF INTERPOLATION THEORY Consider the numbers yij , i = 0, . . . , n; j = 0, . . . , αi − 1, and the distinct points x0 , . . . , xn . Then there exists a unique polynomial Hm of degree at most m, called the Hermite-Lagrange interpolating polynomial, such that (j) Hm (xi ) = yij , for i = 0, . . . , n; j = 0, . . . , αi − 1. Define `(x) := (x − x0 )α0 . . . (x − xn )αn . The Hermite-Lagrange interpolating polynomial can be written in the form Hm (x) = −j−1 n α i −1 αiX X X i=0 j=0 k=0 yij 1 k!j! (x − xi )αi `(x) (k) x=xi `(x) . (x − xi )αi −j−k (3.6.1) Let lj denotes the j th Lagrange fundamental polynomial, n Y x − xi . lj (x) = xj − xi i=1 i6=j THEOREM ML.3.6.1 If f ∈ C 1 [a, b], then the unique polynomial of least degree agreeing with f and f 0 at x0 , . . . , xn is the polynomial of degree at most 2n + 1 given by H2n+1 (x) = n X f (xj )hj (x) + j=0 where n X f 0 (xj )gj (x), j=0 hj (x) = 1 − 2(x − xj ) lj0 (xj ) lj2 (x), gj (x) = (x − xj ) lj2 (x). Proof. 
Since lj (xi ) = δij , 0 ≤ i, j ≤ n, one can easily show that hj (xj ) = δij , gj (xi ) = 0, 0 ≤ i, j ≤ n and h0j (xi ) = 0, gj0 (xi ) = δij , 0 ≤ i, j ≤ n Consequently H2n+1 (xi ) = f (xi ), 0 H2n+1 (xi ) = f 0 (xi ), i = 0, . . . , n. ML.3.6. HERMITE-LAGRANGE INTERPOLATING POLYNOMIAL REMARK 67 ML.3.6.2 The divided difference for coalescing points [ x 0 , . . . , x 0 , . . . , xn , . . . , x n ; f ] | {z } | {z } α0 −times αn −times is defined by Tiberiu Popoviciu [Pop59] to be the coefficient at xm of the Hermite-Lagrange polynomial satisfying (j) (xi ) = f (j) (xi ), Hm for i = 0, . . . , n; j = 0, . . . , αi − 1. EXAMPLE ML.3.6.3 Mathematica (* Hermite - Lagrange *) (* InterpolatingPolynomial[data, var] gives a polynomial in the variable var which provides an exact fit to a list of data. *) (* The number of knots = n + 1 *) n = 0; Print[“Number of knots = ”, n + 1] Number of knots = 1 (* Multiplicity of knots *) α0 = 2; (*α1 = 1; *) Table [αi , {i, 0, n}] {2} (*ThePdegree of the Hermite − Lagrange polynomial = m *) m= n j=0 αj − 1 1 data = Table[ {xi , Table [ D[f [t], {t, j}], {j, 0, αi − 1}] /. {t → xi }} , {i, 0, n}] {{x0 , {f [x0 ] , f 0 [x0 ]}}} %//TableForm f [x0 ] x0 f 0 [x0 ] H[x ] = InterpolatingPolynomial[data, x]//FullSimplify f [x0 ] + (x − x0 ) f 0 [x0 ] (*The divided difference on multiple knots [x0 , ..., xn ; f ] *) Coefficient [H[x], xm ] //Simplify f 0 [x0 ] 68 ML.3. ELEMENTS OF INTERPOLATION THEORY ML.3.7 Finite Differences Let F = {f | f : R → R} and h > 0. Define the following operators: • the identity operator, 1 : F → F , 1 f (x) = f (x); • the translation, displacement, or shift operator, T : F → F , Tf (x) = f (x + h), introduced by George Boole [Boo60]; • the backward difference operator, ∇ : F → F , ∇f (x) = f (x) − f (x − h); • the forward difference operator, Δ : F → F , Δf (x) = f (x + h) − f (x); • the centered difference operator, δ : F → F , h h −f x− . δf (x) = f x + 2 2 Some properties of the forward difference operator Δ will be presented. Since Δ = T − 1, the following relation can be derived Δn = (T − 1)n , and hence n Δ f (x) = (−1) Tn−k f (x). k k=0 n n X k Moreover, using the relation Ti f (x) = f (x + ih), we obtain n (−1) f (x + (n − k)h). Δ f (x) = k k=0 n We also have n X n k=0 k n X k Δk f (x) = (1 + Δ)n f (x) = Tn f (x) = f (x + nh). (3.7.1) (3.7.2) ML.3.7. FINITE DIFFERENCES EXAMPLE 69 ML.3.7.1 For h = 1, since Δk 1 x = (−1)k k! x(x + 1) . . . (x + k) n X = (−1)k nk x(x + 1) . . . (x + k) k=0 (−1)k k! xk+1 = 1 x+n , by ( 3.7.1 ), we obtain . (3.7.3) By denoting yi = f (x + ih), i = 0, . . . , n, the forward finite differences can be calculated as in the following table: y0 Δy0 Δ 2 y0 Δ3 y 0 y1 Δy1 Δ 2 y1 Δ3 y 1 ∙∙∙ ∙∙∙ ∙∙∙ ∙∙∙ yn−3 Δyn−3 Δ2 yn−3 Δ3 yn−3 yn−2 Δyn−2 Δ2 yn−2 yn−1 Δyn−1 ∙∙∙ ∙∙∙ Δn−1 y0 Δn y0 Δn−1 y1 ∙∙∙ yn It should be noted that the finite difference is a particular case of the divided difference. In fact, with xk = x0 + k h, [x, x + h, . . . , x + nh; f ] n X f (xk ) = (xk − x0 ) . . . (xk − xk−1 )(xk − xk+1 ) . . . (xk − xn ) k=0 = n X (−1)n−k f (x + kh) k!(n − k)!hn n 1 X n−k n = (−1) f (x + kh) n!hn k=0 k k=0 = hence Δn f (x) , n!hn Δn f (x) [x, x + h, . . . , x + nh; f ] = . n!hn (3.7.4) The relationship between the derivative of a function and the forward finite difference is presented in the following theorem. 70 ML.3. ELEMENTS OF INTERPOLATION THEORY THEOREM ML.3.7.2 If the function f : R → R has an nth derivative, then for each x ∈ R there exists ξ ∈ (x, x + nh), such that Δn f (x) = hn f (n) (ξ). 
This is a direct consequence of ( ML.3.4.2 ) and ( 3.7.4 ). Let C be the set of all functions possessing a Taylor expansion at all points x ∈ R. In what follows, Δ is taken as the restriction of the forward finite difference operator to the set C. Consider the differentiating operator D : C → C, Df = f 0 . From h 0 h2 hn f (x) + f 00 (x) + ∙ ∙ ∙ + f (n) (x) + ∙ ∙ ∙ 1! 2! n! (hD)n hD (hD)2 + + ∙∙∙ + + ∙ ∙ ∙ f (x), = 1! 2! n! f (x + h) − f (x) = we deduce Δ = ehD − 1. Likewise, we get: ∇ = 1 − e−hD , hD . 2 In order to represent the operator D by the operator Δ, we use the following result δ = 2 sh ehD = 1 + Δ, to derive hD = ln(1 + Δ) = Δ − Δ 2 Δ 3 Δ4 + − + ∙∙∙ 2 3 4 Furthermore, we obtain: h 2 D 2 = Δ2 − Δ 3 + h3 D 3 = Δ3 − 11 4 5 5 Δ − Δ + ∙∙∙ 12 6 3 4 7 5 Δ + Δ − ∙∙∙ 2 4 Taylor’s Formula can be written in the form f (a + h) = ehD f (a). (3.7.5) ML.3.8. INTERPOLATION OF FUNCTIONS OF SEVERAL VARIABLES 71 Therefore f (a + xh) = exhD f (a) = (ehD )x f (a) = (1 + Δ)x f (a), which known as the Gregory-Newton Interpolating Formula, f (a + xh) = (1 + xΔ + x(x − 1) 2 x(x − 1)(x − 2) 3 Δ + Δ + ∙ ∙ ∙ )f (a), 2! 3! i.e., ∞ X x f (a + xh) = n=0 x ∈ R. n Δn f (a), (3.7.6) From ( 3.7.6 ) it follows that, for any polynomial P of degree at most n, we have P (x + a) = n X x k=0 hence, for all x ∈ R and m ∈ Z, we obtain P (x + m) = Δk P (a), k n X X x k k=0 i=0 PROBLEM k k i (−1)k−i P (m + i). (3.7.7) ML.3.7.3 Let P be a polynomial of degree at most n with real coefficients. Suppose that P takes integer values at some n + 1 consecutive integers. Prove that P takes integer values at any integer number. We simply use representation ( 3.7.7 ) for P . ML.3.8 Interpolation of Functions of Several Variables Let a ≤ x0 < ∙ ∙ ∙ < xn ≤ b and c ≤ y0 < ∙ ∙ ∙ < ym ≤ d. Define F = {f | f : [a, b] × [c, d] → R}, and ui : [a, b] → R, vj : [c, d] → R be functions satisfying the conditions: ui (xk ) = δik , i, k = 0, . . . , n; vj (yh ) = δjh , j, h = 0, . . . , m. 72 ML.3. ELEMENTS OF INTERPOLATION THEORY Define the interpolating operators U, V : F → F , by U f (x, y) = n X f (xi , y)ui (x), i=0 V f (x, y) = m X f (x, yj )vj (y). j=0 Note that: U f (xk , y) = f (xk , y) and V f (x, yh ) = f (x, yh ), k = 0, . . . , n; h = 0, . . . , m; x ∈ [a, b], y ∈ [c, d]. DEFINITION ML.3.8.1 The operator U V is called the tensor product of the operators U and V. Note that we have U V f (x, y) = m n X X f (xi , yj ) ui (x) vj (y) i=0 j=0 and U V f (xk , yh ) = f (xk , yh ), k = 0, . . . , n; h = 0, . . . , m. DEFINITION ML.3.8.2 The operator U ⊕ V := U + V − U V is called the Boolean sum of the operators U and V. Note that: U ⊕ V f (xk , y) = f (xk , y), U ⊕ V f (x, yh ) = f (x, yh ), k = 0, . . . , n; h = 0, . . . , m; x ∈ [a, b], y ∈ [c, d]. ML.3.9 Scattered Data Interpolation. Shepard’s Method In many fields such as geology, meteorology and cartography, we frequently encounter nonuniformly spaced data, also called scattered data, which have to be interpolated. The interpolation problem can be stated as follows. Suppose that a set of n distinct abscissae Xi = (xi , yi ) ∈ R2 , and the associated ordinates zi , i = 1, . . . , n, are given. Then we seek a function f : R2 → R such that f (Xi ) = zi , i = 1, . . . , n. In a global method, the interpolant depends on all the data points at once in the sense that adding, changing, or removing one data point implies that we have to solve the problem again. 
In a local method, adding, changing or removing a data point affects the interpolant only locally, i.e., in a certain subset of the domain. ML.3.9. SCATTERED DATA INTERPOLATION. SHEPARD’S METHOD 73 The Shepard interpolant to the data (Xi ; zi ) ∈ R3 , i = 1, . . . , n, is a weighted mean of the ordinates zi : n X ωi (X) zi , X ∈ R3 , (3.9.1) S(X) := i=1 with weight functions ωi (X) := n Y ρμi (X, Xj ) j=1 j6=i n n Y X ρμi (X, Xj ) k=1 j=1 j6=k where ρ is a metric on R2 and μi > 0. For X 6= Xi , i = 1, . . . , n, we can write 1 ωi (X) = ρμi (X, X n X k=1 i) 1 μi ρ (X, Xk ) , i = 1, . . . , n. This is a kind of inverse distance weighted method, i.e., the larger the distance of X to Xi , the less influence zi has on the value of S at the point X. Now, we assume that ρ is the Euclidean metric on R2 . The functions ωi have the following properties: • ωi ∈ C 0 , continuity; • ωi (X) ≥ 0, positivity; • ωi (Xj ) = δij , interpolation property (S(Xi ) = zi ); • n P ωi (X) = 1, normalization. i=1 It can be shown that the Shepard interpolant ( 3.9.1 ) has • cusps at Xi if 0 < μi < 1, • corners at Xi if μi = 1, • flat spots at Xi if μi > 1, i.e., the tangent planes at such points are parallel to the (x, y) plane. Besides the obvious weaknesses of Shepard’s method, clearly all of the functions ωi (X) have to be recomputed as soon as just one data point is changed, or a data point is added or removed. This shows that Shepard’s method is a global method. There are several ways to improve Shepard’s method, for example by removing the discontinuities in the derivatives and the flat spots at the interpolation points. Even the global character can be made local using the following procedures: 74 ML.3. ELEMENTS OF INTERPOLATION THEORY 1. multiply the weight functions ωi by a mollifying function λi : λi (Xi ) = 1, λi (X) ≥ 0, inside a given neighborhood Bi of Xi , λi (X) = 0, outside of Bi , 2. use weight functions which are locally supported like B-splines, 3. use a recurrence formula. Consider the weight functions ek (X) := k−1 Y ρμ (X, Xj ) j=1 k k Y X , k = 2, . . . n, μ ρ (X, Xi ) j=1 i=1 i6=j and the recurrence formula s1 (X) = z1 , sk (X) = sk−1 (X) + ek (X)(zk − sk−1 (Xk )), k = 2, . . . , n. (3.9.2) We have obtained a recursive formula for the Shepard-type interpolant sn . ML.3.10 Splines Splines are piecewise polynomials with R as their natural domain of definition. They often appear as solutions to extremal problems. Splines also appears as kernels of interpolations formulas. The systematic study of splines has begun with the work of Isaac Jacob Schoenberg in the forties. We would like to emphasize that it is Tiberiu Popoviciu who defined the so called “elementary functions of order n”, later referred as splines. Results of Popoviciu’s work were also known by I. J. Schoenberg (see M. Mardens and I. J. Schoenberg, On the variation diminishing spline approximation methods, Mathematica 8(31),1, 1966, pp. 61–82 (dedicated to Tiberiu Popoviciu on his 60th birthday)). At present, splines are widely used in numerical computation and are indispensable for many theoretical questions. A typical spline is the truncated power 0, if x < a, k (x − a)+ := k (x − a) , if x ≥ a, where k ∈ N∗ . With k = 0, we obtain the Heaviside function 0, if x < a, 0 (x − a)+ := 1, if x ≥ a, Let n, p ∈ N∗ and (a = x0 < ∙ ∙ ∙ < xn = b) be a division of the interval [a, b]. ML.3.11. B-SPLINES DEFINITION 75 ML.3.10.1 A function S ∈ C p−2 [a, b] is called a spline of order p (p = 2, 3, . . 
.), equivalently of degree m = p − 1, with breakpoints xi , if on each interval [xi−1 , xi ] it is a polynomial of degree ≤ m, and on one of them is of degree exactly m. Thus, the splines of order two are broken lines. One can prove that the linear space of splines of order p on the interval [a, b] has the following basis: p−1 {1, x, . . . , xp−1 , (x − x0 )p−1 + , . . . , (x − xn )+ }. ML.3.11 B-splines The representation of a spline as a sum of a truncated powers is generally not very useful because the truncated power functions have large support. A basis which is more local in character can be defined by using B-splines (basic splines) which are splines with the smallest possible support. The B-splines are defined by means of the divided differences n−1 B(x) := B(x0 , . . . , xn ; x) := n [x0 , . . . , xn ; (∙ − x)+ ]. (3.11.1) Important properties of B-splines can be derived from properties of divided differences. As an example, we derive the following recurrence formula which is valid when n ≥ 2 B(x0 , . . . , xn ; x) xn − x n x − x0 B(x0 , . . . , xn−1 ; x) + B(x1 , . . . , xn ; x) , n − 1 xn − x0 xn − x0 1 , if x ∈ [xi , xi+1 ) xi+1 − xi B(xi , xi+1 ; x) = , i = 0, 1, . . . , n − 1. 0, otherwise = In fact, if x is not a knot, by using ( 3.3.4 ) and ( 3.3.6 ), we obtain n−1 ]t B(x0 , . . . , xn ; x) = n [x0 , . . . , xn ; (t − x)+ = n(x − x0 ) n−2 [x0 , . . . , xn−1 , x; (t − x)(t − x)+ ]t xn − x0 + = n(xn − x) n−2 [x1 , . . . , xn , x; (t − x)(t − x)+ ]t x n − x0 n(x − x0 ) n(xn − x) n−2 n−2 [x0 , . . . , xn−1 ; (t − x)+ ]t + [x1 , . . . , xn ; (t − x)+ ]t xn − x0 x n − x0 (3.11.2) 76 ML.3. ELEMENTS OF INTERPOLATION THEORY xn − x n x − x0 B(x0 , . . . , xn−1 ; x) + B(x1 , . . . , xn ; x) . n − 1 xn − x0 xn − x0 If x is a knot (for instance x = x0 ), by using ( 3.3.4 ), we obtain = n−1 ]t B(x0 , . . . , xn ; x0 ) = n [x0 , . . . , xn ; (t − x0 )+ n−2 n−2 = n [x0 , . . . , xn ; (t − x0 )(t − x0 )+ ]t = n [x1 , . . . , xn ; (t − x0 )+ ]t . REMARK ML.3.11.1 From Equation ( 3.11.2 ) one can easily prove that B-splines are non-negative functions. It is also easy to differentiate a B-spline. We have, for any x which is not a knot, B 0 (x0 , . . . , xn ; x) = n [x0 , . . . , xn ; d n−1 ]t (t − x)+ dx n−2 = n(n − 1)[x0 , . . . , xn ; (t − x)+ ]t = n(n − 1) n−2 n−2 [x1 , . . . , xn ; (t − x)+ ]t − [x0 , . . . , xn−1 ; (t − x)+ ]t xn − x0 ( 3.3.5 ) n = B(x1 , . . . , xn ; x) − B(x0 , . . . , xn−1 ; x) . xn − x0 Next, we present some properties of the spline functions of degree 3. Let us consider the points xi = a + ih, h = (b − a)/n, i = −3, −2, . . . , n + 3. We define the functions: Bi (x) := 1 h3 0, (x − xi−2 )3 h3 + 3h2 (x − xi−1 ) + 3h(x − xi−1 )2 − 3(x − xi−1 )3 , h3 + 3h2 (xi+1 − x) + 3h(xi+1 − x)2 − 3(xi+1 − x)3 , (xi+2 − x)3 , 0, x ∈ (−∞, xi−2 ] x ∈ (xi−2 , xi−1 ] x ∈ (xi−1 , xi ] x ∈ (xi , xi+1 ] x ∈ (xi+1 , xi+2 ] x ∈ (xi+2 , ∞). One can prove that the functions Bi belong to C 2 [a, b], for i = −1, 0, 1, . . ., n + 1. Some of their properties can be found in the table below. H HH x HH xi−2 xi−1 xi xi+1 xi+2 Bi (x) 0 1 4 1 0 Bi0 (x) 0 3 h 0 − h3 0 Bi00 (x) 0 6 h2 − h122 6 h2 0 ML.3.11. B-SPLINES 77 (xi , 4) • + • • • xi−2 Let f ∈ C 1 [a, b]. We try to find interpolating conditions: 0 S (x0 ) S(xi ) 0 S (xn ) • • • xi xi−1 Bi xi+1 • xi+2 a spline function S ∈ C 2 [a, b] satisfying the following = f 0 (x0 ), = f (xi ), = f 0 (xn ). i = 0, . . . , n, We try to find such a function S in the form We obtain the system S = a−1 B−1 + a0 B0 + ∙ ∙ ∙ + an Bn + an+1 Bn+1 . 
0 (x0 ) + a−1 B−1 a−1 B−1 (xi ) + 0 (xn ) + a−1 B−1 The matrix of the system is a0 B00 (x0 ) a0 B0 (xi ) +∙∙∙+ +∙∙∙+ a0 B00 (xn ) +∙∙∙+ − h3 0 h3 0 0 1 4 1 0 0 0 1 4 1 0 .. .. .. .. .. . . . . . 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 and its determinant is equal to the 3 −h 1 0 .. . 0 0 0 0 an+1 Bn+1 (x0 ) an+1 Bn+1 (xi ) = f 0 (x0 ) = f (xi ) (i = 0, . . . , n) 0 an+1 Bn+1 (xn ) = f 0 (xn ) ∙∙∙ ∙∙∙ ∙∙∙ ... 0 0 0 .. . ∙∙∙ ∙∙∙ ∙∙∙ 4 1 − h3 0 0 0 0 0 0 .. .. . . 1 0 4 1 0 h3 determinant of the matrix ∙∙∙ 0 0 0 ∙∙∙ 0 0 0 ∙∙∙ 0 0 0 . . . .. .. .. . . . . 0 0 0 0 ∙∙∙ 4 1 0 0 0 0 0 ∙∙∙ 2 4 1 0 0 0 0 ∙ ∙ ∙ 0 0 h3 0 4 1 .. . 0 2 4 .. . 0 0 1 .. . 0 0 0 .. . It is clear that the above matrix is strictly diagonally dominant, hence the system has a unique solution. We shall write the functions Bi in terms of the divided differences. By direct calculation, we can show that Bi (x) = 4! h [xi−2 , xi−1 , xi , xi+1 , xi+2 ; (∙ − x)3+ ], i = −1, 0, . . . , n + 1. 78 ML.3.12 ML.3. ELEMENTS OF INTERPOLATION THEORY Problems Let x0 , . . . , xn be distinct points on the real axis, f be a function defined on these points and `(x) = (x − x0 ) . . . (x − xn ). PROBLEM ML.3.12.1 Let P be a polynomial. Prove that the Lagrange Polynomial L(x0 , . . . , xn ; P ) is the remainder obtained by dividing the polynomial P by the polynomial (X − x0 ) . . . (X − xn ). Let P (x) = Q(x) (x − x0 ) . . . (x − xn ) + R(x). We have R(xi ) = P (xi ) for i = 0, . . . , n. Since the degree of the polynomial R is at most n, we get R = L(x0 , . . . , xn ; P ). PROBLEM ML.3.12.2 Calculate L(x0 , . . . , xn ; X n+1 ). We have X n+1 = 1 ∙ (X − x0 )(X − x1 ) . . . (X − xn ) +X n+1 − (X − x0 )(X − x1 ) . . . (X − xn ). By P. ( ML.3.12.1 ) we obtain L(x0 , . . . , xn ; X n+1 ) = X n+1 − (X − x0 )(X − x1 ) . . . (X − xn ). PROBLEM ML.3.12.3 Let Pn : R → R be a polynomial of degree at most n, n ∈ N. Prove the decomposition in partial fractions Pn (x) (x − x0 ) . . . (x − xn ) = n X k=0 With `(x) = (x − x0 ) . . . (x − xn ), we have 1 x − xi ∙ P (xi ) `0 (xi ) . n P (xi ) L(x0 , . . . , xn ; Pn )(x) X 1 Pn (x) ∙ 0 = = . `(x) `(x) x − x ` (x i i) k=0 PROBLEM ML.3.12.4 Calculate the sum n X i=0 k = 0, . . . , n. xki (xi − x0 ) . . . (xi − xi−1 )(xi − xi+1 ) . . . (xi − xn ) , ML.3.12. PROBLEMS PROBLEM 79 ML.3.12.5 Calculate [x0 , . . . , xn ; xk ], for k = n, n + 1. PROBLEM ML.3.12.6 Calculate [x0 , . . . , xn ; (X − x0 ) . . . (X − xk )], k ∈ N. PROBLEM ML.3.12.7 Prove that L(x0 , . . . , xn ; L(x1 , . . . , xn ; f )) = L(x1 , . . . , xn ; L(x0 , . . . , xn ; f )). PROBLEM ML.3.12.8 Prove that, for x ∈ / {x0 , . . . , xn }, f (t) L(x0 , . . . , xn ; f )(x) . = x0 , . . . , x n ; x−t t (x − x0 ) . . . (x − xn ) With `(x) = (x − x0 ) . . . (x − xn ), we have n X `(x) f (xi ) L(x0 , . . . , xn ; f )(x) = ∙ , x − xi `0 (xi ) k=0 hence f (xi ) n L(x0 , . . . , xn ; f )(x) X x−xi f (t) . = = x0 , . . . , x n ; `(x) `0 (xi ) x−t t k=0 PROBLEM If f (x) = 1 x ML.3.12.9 and 0 < x0 < x1 < ∙ ∙ ∙ < xn , then [x0 , . . . , xn ; f ] = (−1)n x0 . . . x n . 80 ML.3. ELEMENTS OF INTERPOLATION THEORY By using problem ( ML.3.12.8 ), we have 1 L(x0 , . . . , xn ; 1)(0) (−1)n =− . = x0 , . . . , x n ; t t (0 − x0 ) . . . (0 − xn ) x0 . . . x n Solution 2. We write the obvious equality (t − t0 ) . . . (t − xn ) x0 , . . . , x n ; =0 t in the form n x0 , . . . , xn ; t − (t0 + ∙ ∙ ∙ + tn )t n−1 + ∙ ∙ ∙ + (−1) i.e., 1 − (t0 + ∙ ∙ ∙ + tn )0 + ∙ ∙ ∙ + 0 + (−1) n+1 t0 . . . x n n+1 t0 . . . xn = 0, t 1 x0 , . . . , x n ; = 0. 
t ML.4 Elements of Numerical Integration ML.4.1 Richardson’s Extrapolation Extrapolation is applied when it is known that the approximation technique has an error term with a predictable form, one that depends on a parameter, usually the step size h. Richardson’s extrapolation is a technique frequently employed to generate results of high accuracy by using low-order formulas. Let 1 < p1 < p2 < . . . be a sequence of integers and F be a formula that approximates an unknown value M. Suppose that, in a neighborhood of the point zero, the following relation holds M = F (h) + a1 hp1 + a2 hp2 + ∙ ∙ ∙ where a1 , a2 , . . . are constants. The order of approximation for F (h) is O(hp1 ). By denoting F1 (h) := F (h), a11 := a1 , a12 := a2 , . . . we obtain For q > 1, we deduce M = F1 (h) + a11 hp1 + a12 hp2 + ∙ ∙ ∙ . p 1 p 2 h h h M = F1 + a11 + a12 + ∙∙∙ , q q q hence, by letting F2 (h) = we obtain q p1 F 1 h q − F1 (h) q p1 − 1 , M = F2 (h) + 0 ∙ hp1 + a22 hp2 + ∙ ∙ ∙ , i.e., the formula F2 (h) has order of approximation O(hp2 ). Step by step, we deduce approximation formulas of order O(hpj ) : q pj−1 Fj−1 hq − Fj−1 (h) , Fj (h) = q pj−1 − 1 81 (4.1.1) 82 ML.4. ELEMENTS OF NUMERICAL INTEGRATION j = 2, 3, . . . . In the case of the formula M = F (h) + a1 h + a2 h2 + ∙ ∙ ∙ , with q = 2, we deduce F1 (h) := F (h), 2j−1 Fj−1 h2 − Fj−1 (h) Fj (h) := , 2j−1 − 1 j = 2, 3, . . . , n, which are approximations of order O(hj ) for M. For example, with n = 3, the approximations are generated by rows, in the order indicated by the framed entries in the table below. 1 O(h) F1 (h) h 2 F1 ( 2 ) h 4 F1 ( 4 ) h 7 F1 ( 8 ) O(h2 ) 3 F2 (h) h 5 F2 ( 2 ) h 8 F2 ( 4 ) O(h3 ) 6 F3 (h) h 9 F3 ( 2 ) O(h4 ) 10 F4 (h) Each column beyond the first in the extrapolation table is obtained by a simple averaging process. If F1 (h) is an approximation formula of order O(h2 ), then we obtain the following approximations of order O(h2j ) for M : 4j−1 Fj−1 h2 − Fj−1 (h) Fj (h) := , (4.1.2) 4j−1 − 1 j = 2, 3, . . . . ML.4.2 Numerical Quadrature Let f : [a, b] → R be an integrable function. In general, the problem of evaluating the definite integral involves the following difficulties: • the values of the function f are known only at the given points, say x0 , . . . , xn ; • the function f has no antiderivatives; • the antiderivatives of f cannot be determined analytically. Therefore, the basic method involved in approximating the definite integral of f uses a sum of the form n X Ai f (xi ), i=0 where A0 , . . . , An are constants not depending on f. ML.4.3. ERROR BOUNDS IN THE QUADRATURE METHODS DEFINITION 83 ML.4.2.1 A relation of the form Z b f (x) dx = a n X Ai f (xi ) + Rn (f ), i=0 is called a quadrature formula. The number Rn (f ) is the remainder of the quadrature formula ( ML.4.2.1 ). The previous equality is sometimes written in the form Z b n X f (x) dx ≈ Ai f (xi ). a DEFINITION i=0 ML.4.2.2 The degree of accuracy, or degree of precision of the quadrature formula ( ML.4.2.1 ) is the positive integer m such that Rn (X k ) = 0, k = 0, . . . , m, Rn (X m+1 ) 6= 0. The quadrature formulas can be classified as follows: • Newton-Cotes type, when the nodes x0 , . . . , xn are given and the coefficients A0 , . . . , An are determined such that the degree of precision is at least n; • Chebyshev type, when the coefficients A0 , . . . , An are equal one to the other and the nodes x0 , . . . , xn are determined such that the degree of precision is at least n; • Gauss type, when coefficients A0 , . . . , An and the nodes x0 , . . . 
, xn , are determined such that the degree of precision is maximal. ML.4.3 Error Bounds in the Quadrature Methods Let f ∈ C n+1 [a, b]. Consider a quadrature formula Z b n X f (x) dx = Ai f (xi ) + Rn (f ) a i=0 with order of precision n. Using Taylor’s formula with integral form of the remainder Z n X f (j) (a) 1 x j f (x) = (x − t)n f (n+1) (t)dt, (x − a) + j! n! a j=0 x ∈ [a, b], we obtain Rn (f ) 84 ML.4. ELEMENTS OF NUMERICAL INTEGRATION ! Z X n X 1 f (j) (a) j n (n+1) (X − t) f (t)dt (X − a) + Rn j! n! a j=0 = Rn 1 = Rn n! = Z = b a Z Z 1 n! b a 1 n! Z n (n+1) Z (x − t) f = By denoting (X − t) f (t)dt n n (n+1) (x − t) f Z a 1 X (t)dt dx − Ai n! i=0 b t n (n+1) n x a X b a 1 X (t)dx dt − Ai n! i=0 n Z xi a Z (b − t)n+1 1 X Ai (xi − t)n+ − (n + 1)! n! i=0 (xi − t)n f (n+1) (t)dt b a ! (xi − t)n+ f (n+1) (t)dt f (n+1) (t)dt. Z b n n+1 X (b − t) 1 n A (x − t) − Kn = i i + dt, n! i=0 a (n + 1)! which is a constant not depending on f and using the notation kf (n+1) k = max |f (n+1) (x)|, a≤x≤b we obtain an error bound in quadrature formula: |Rn (f )| ≤ Kn kf (n+1) k. ML.4.4 Trapezoidal Rule The idea used to produce trapezoidal rule is to approximate the function f : [a, b] → R by the Lagrange interpolating polynomial L(a, b; f ). THEOREM ML.4.4.1 If f ∈ C 2 [a, b], then there exists ξ ∈ (a, b) such that the following quadrature formula holds Z b (b − a)3 00 f (a) + f (b) − f (ξ). f (x)dx = (b − a) 2 12 a Proof. We have = Z b a Z b L(a, b; f )(x)dx a x−a b−x f (a) + f (b) f (b) + f (a) dx = (b − a) . b−a b−a 2 ML.4.5. RICHARDSON’S DEFERRED APPROACH TO THE LIMIT 85 By using the weighted mean value theorem, we obtain Z b (f (x) − L(a, b; f )(x))dx R1 (f ) = a = Z b a (x − a)(x − b)[a, b, x; f ]dx = [a, b, c; f ] f 00 (ξ) = 2 Z f 00 (ξ) = 2 b a Z b a (x − a)(x − b)dx (x − a)2 − (x − a)(b − a) dx b (x − a)3 (x − a)2 − (b − a) 3 2 a =− (b − a)3 00 f (ξ). 12 We have proved that the trapezoidal quadrature formula has degree of precision 1. To generalize this procedure, choose a positive integer n. Subdivide the interval [a, b] into n subintervals [xi−1 , xi ], xi = a + i b−a , i = 0, . . . , n, and apply trapezoidal rule on each subinterval. We obtain n n X b−a i=1 b−a (f (xi−1 ) + f (xi )) = 2n n and R(f ) = − n X (b − a)3 =− where h = 12n3 i=1 n−1 f (a) + f (b) X f (xi ) + 2 i=1 ! n (b − a)3 1 X 00 f (ξi ) = − f (ξi ) 12n2 n i=1 00 (b − a)3 00 b − a 2 00 f (ξ) = − h f (ξ), 2 12n 12 b−a . Consequently, there exists ξ ∈ (a, b) such that n ! Z b n−1 b − a f (a) + f (b) X (b − a)3 00 f (x)dx = f (xi ) − f (ξ). + 2 n 2 12n a i=1 (4.4.1) This formula is known as the composite trapezoidal rule. It is one of the simplest formulas for numerical integration. ML.4.5 Richardson’s Deferred Approach to the Limit Let T (h) be the result from the trapezoidal rule using step size h and T (l) be the result with step size l, such that h < l. If the second derivative f 00 is “reasonably constant”, then: Z b f (x) dx ≈ T (h) + Ch2 , a 86 ML.4. ELEMENTS OF NUMERICAL INTEGRATION Z We deduce b a f (x) dx ≈ T (l) + Cl2 . C≈ hence Z T (h) − T (l) , l 2 − h2 b a f (x) dx ≈ T (h) + h2 T (h) − T (l) = l 2 − h2 ( hl )2 T (h) − T (l) . ( hl )2 − 1 Rb This gives a better approximation to a f (x) dx than T (h) and T (l). In fact, if the second derivative is actually a constant, then the truncation error is zero. = ML.4.6 Romberg Integration Romberg Integration uses the Composite Trapezoidal Rule to give preliminary approximations and then applies the Richardson Extrapolation processRto improve the approximations. 
b Denoting by T (h) the approximation of the integral I = a f (x)dx given by the Composite Trapezoidal rule, ! m−1 X h f (a) + f (b) + 2 T (h) = f (a + ih) , 2 i=1 where h = (b − a)/m, it can be shown that if f ∈ C ∞ [a, b], the Composite Trapezoidal rule can be written in the form Z b a f (x)dx = T (h) + a1 h2 + a2 h4 + ∙ ∙ ∙ , therefore, we can apply Richardson Extrapolation. Let n ∈ N∗ , b−a hk = k−1 , k = 1, . . . , n. 2 The following Composite Trapezoidal approximations with m = 1, 2, . . ., 2n−1 , can be derived: R1,1 = h1 Rk,1 = f (a) + f (b) , 2 hk f (a) + f (b) + 2 2 2k−1 X−1 i=1 f (a + ihk ) , k = 2, . . . , n. We can rewrite Rk,1 in the form: k−2 2X 1 Rk,1 = Rk−1,1 + hk−1 f (a + (2i − 1)hk ) , 2 i=1 k = 2, . . . , n. ML.4.7. NEWTON-COTES FORMULAS 87 Applying the Richardson Extrapolation to these values we obtain an O(h2j k ) approximation formula defined by 4j−1 Rk,j−1 − Rk−1,j−1 , Rk,j = 4j−1 − 1 k = 2, . . . , n; j = 2, . . . , k. In the case of n = 4, the results generated by these formulas are shown in the following table. R1,1 ? R2,1 - R2,2 - R3,2 R3,1 - R3,3 ) - R4,1 ML.4.7 R4,2 - R4,3 - R4,4 Newton-Cotes Formulas Let xi = a + ih, i = 0, . . . , m and h = (b − a)/m. The idea used to produce Newton-Cotes formulas is to approximate the function f : [a, b] → R by the Lagrange Interpolating Polynomial L(x0 , . . . , xm ; f ). We have Z b f (x)dx = a Z Z b L(x0 , . . . , xm ; f )(x)dx + R(f ), a b f (x)dx = a m X (m) f (xi )Ai + R(f ), i=0 and denoting `(x) = (x − x0 ) . . . (x − xm ), we obtain (m) Ai = Z b a `(x) dx, (x − xi )`0 (xi ) i = 0, . . . , m. (m) For some particular cases the values of the coefficients Ai are given in the following table. 88 ML.4. ELEMENTS OF NUMERICAL INTEGRATION ML.4.8 (m) m Ai h xi 1 1 1 , 2 2 b−a a, b 2 1 4 1 , , 6 6 6 b−a 2 3 1 3 3 1 , , , 8 8 8 8 b−a 3 a, a, a+b , b 2 a + 2b 2a + b , , b 3 3 Simpson’s Rule Simpson’s Rule is a Newton-Cotes formula in the case m = 3. THEOREM ML.4.8.1 If f ∈ C 4 [a, b] then there exists ξ ∈ (a, b) such that Z b (b − a)5 (4) a+b b−a f (a) + 4f + f (b) − f (ξ). f (x)dx = 6 2 2880 a The Composite Simpson’s Rule can be written with its error term as Z b a n−1 − n−1 X f (a) + f (b) X f (xi ) + 2 f + 2 i=1 i=0 b−a f (x)dx = 3n xi + xi+1 2 ! f (4) (ξ)(b − a)5 , 2880n5 where ξ ∈ (a, b). ML.4.9 Gaussian Quadrature Let ρ : [a, b] → R+ be an integrable weight function and f : [a, b] → R be an integrable function. The problem is to find the nodes x0 , . . . , xn ∈ [a, b] and the coefficients A0 , . . . , An such that the quadrature formula Z b n X f (x)ρ(x)dx ≈ Ai f (xi ) a i=0 ML.4.9. GAUSSIAN QUADRATURE 89 have a maximal degree of precision. Let (ϕn )n∈N be the sequence of orthogonal polynomials associated with the weight function ρ, where degree(ϕn ) = n. Let x0 , . . . , xn be the roots of the polynomial ϕn+1 and L(x0 , . . . , xn ; f ) be the Lagrange interpolating polynomial. THEOREM ML.4.9.1 The quadrature formula Z b a f (x)ρ(x)dx ≈ n X Ai f (xi ) = i=0 Z b L(x0 , . . . , xn ; f )(x)ρ(x)dx a has degree of precision 2n + 1. Proof. Let P a polynomial of degree at most 2n + 1. We divide the polynomial P to the polynomial ϕn+1 : P = Qϕn+1 + r, where degree(Q) ≤ n and degree(r) ≤ n. We have: L(x0 , . . . , xn ; P ) = L(x0 , . . . , xn ; Qϕn+1 ) +L(x0 , . . . , xn ; r) = r, | {z } 0 Z R(P ) = Z = Z = b (P (x) − L(x0 , . . . , xn ; P )(x))ρ(x)dx a b (P (x) − r(x))ρ(x)dx a b Q(x)ϕn+1 (x)ρ(x)dx = 0. a It follows that the degree of precision is at least 2n + 1. 
On the other hand, from the equalities R(ϕ2n+1 ) = = Z Z b a 0 b a (ϕ2n+1 (x) − L(x0 , . . . , xn ; ϕ2n+1 )(x)) ρ(x)dx {z } | ϕ2n+1 (x)ρ(x)dx > 0, we deduce that it cannot be greater than 2n + 1. The following result is concerning the coefficients of the Gaussian quadrature formula. THEOREM ML.4.9.2 The coefficients of the Gaussian quadrature formula are positive. 90 ML.4. ELEMENTS OF NUMERICAL INTEGRATION Proof. Let `i (x) = (x − x0 )(x − x1 ) ∙ ∙ ∙ (x − xn ) , x − xi i = 0, . . . , n. Taking into account the fact that the degree of the polynomials `2i is 2n, and that the Gaussian quadrature formula has degree of precision 2n + 1, we obtain Z b a `2i (x)ρ(x)dx Aj `2i (xj ) + 0 = Ai `2i (xi ), j=0 i.e., Ai = i = 0, . . . , n. = n X Rb a `2i (x)ρ(x)dx > 0, `2i (xi ) ML.5 Elements of Approximation Theory The study of approximation theory involves two general types of problems. One problem arises when a function is given explicitly but we wish to find a “simpler” type of function, such as a polynomial, that can be used to determine approximate values of the given function. The other problem in approximation theory is concerned with fitting functions to given data and finding the “best” function in a certain class that can be used to represent the data. ML.5.1 Discrete Least Squares Approximation Consider the problem of estimating the values of a function, given the values yi of the function at distinct points xi , i = 1, . . . , n. An approach for this problem would be to find a line that could be used as an approximating function. The problem of finding the equation y = ax + b of the best linear approximation in the absolute sense requires that values of a and b be found to minimize max |yi − (axi + b)|. 1≤i≤n This is commonly called a minimax problem. Another approach to determine the best linear approximation involves finding values of a and b to minimize n X i=1 |yi − (axi + b)|. This quantity is called the absolute deviation. The least squares method approach to this problem involves determining the best approximation line y = ax + b that minimize the least squares error: n X i=1 (yi − (axi + b))2 . 91 92 ML.5. ELEMENTS OF APPROXIMATION THEORY For a minimum to occur, it is necessary that n ∂ X (yi − (axi + b))2 0 = ∂a i=1 n X ∂ (yi − (axi + b))2 0= ∂b i=1 that is X n (yi − axi − b)xi = 0 i=1 n X (yi − axi − b) = 0 i=1 These equation simplify to the normal equations: X n n n X X 2 a x + b x = xi yi , i i i=n n i=1 n i=1 The solution to this system is i=1 X X x + b ∙ n = yi . a i n a= b= n X i=1 i=1 n X i=1 xi y i − n X xi n X i=1 i=1 n n X X n x2i − ( xi )2 i=1 i=1 x2i n X yi − i=1 n X n i=1 n X i=1 x2i − ( xi yi − n X yi , n X i=1 xi . xi ) 2 i=1 The general problem of approximating a set of data, {(xi , yi )}ni=1 , with an algebraic polynomial Pm (x) = m X a k xk , k=0 m < n − 1, using the least squares procedure is handled in similar manner. It requires choosing the constants a0 , . . . , am to minimize the least squares error E= n X i=1 For E to be minimized it is necessary that ∂E = 0, ∂aj (yi − Pm (xi ))2 . j = 0, . . . , m. ML.5.2. ORTHOGONAL POLYNOMIALS AND LEAST SQUARES APPROXIMATION 93 This gives m + 1 normal equations in m + 1 unknowns, aj , ( m n n X X X j+k ak xi = yi xji , j = 0, . . . , m; m < n − 1. k=0 i=1 i=1 The matrix of the previous system can be written as n X n X i=1 i=1 x0i ∙ ∙ ∙ xm i i=1 i=1 .. .. .. A= . . . X n n X xm ∙∙∙ x2m i i A = B ∙ BT , where x01 ∙ ∙ ∙ x0n B = ... . . . ... . 
xm ∙ ∙ ∙ xm 1 n Taking into account the fact that the Vandermonde type determinant x0 ∙ ∙ ∙ x 0 m+1 1 .. . . .. . . . m x 1 ∙ ∙ ∙ xm m+1 is different from zero, it follows that rank(B) = m + 1, hence A is nonsingular. So, the normal equations have a unique solution. ML.5.2 Orthogonal Polynomials and Least Squares Approximation Suppose that f ∈ C[a, b]. A polynomial of degree n Pn (x) = n X ak x k k=0 is required to minimize the error Z b a (f (x) − Pn (x))2 dx. Define E(a0 , . . . , an ) = Z b a (f (x) − n X ak xk )2 dx. k=0 A necessary condition for the parameters a0 , . . . , an to minimize E is that 94 ML.5. ELEMENTS OF APPROXIMATION THEORY or ( n X ak k=0 Z ∂E = 0, ∂aj b x j+k dx = a Z j = 0, . . . , n b xj f (x)dx, j = 0, . . . , n a The matrix of the above linear system, bj+k+1 − aj+k+1 j+k+1 j,k=0,...,n is known as the Hilbert matrix. It is an ill-conditioned matrix which can be considered as classic example for demonstrating round-off error difficulties. A different technique to obtain least squares approximation will be presented. Let us introduce the concept of a weight function and orthogonality. DEFINITION ML.5.2.1 An integrable function w is called a weight function on the interval [a, b] if w(x) > 0 for all x ∈ (a, b). DEFINITION ML.5.2.2 A set {ϕ1 , . . . , ϕn } is said to be an orthogonal set of functions over the interval [a, b] with respect to a weight function w if Z b 0, when j 6= k, ϕj (x)ϕk (x)w(x) dx = α > 0, when j=k k a If, in addition, αk = 1 for each k = 0, . . . , n, the set is said to be orthonormal. EXAMPLE ML.5.2.3 The set of functions {ϕ0 , . . . , ϕ2n−1 }, ϕ0 (x) = ϕk (x) = ϕn+k (x) = √1 2π √1 cos kx, π √1 sin kx, π k = 1, . . . , n k = 1, . . . , n − 1 is an orthonormal set on [−π, π] with respect to w(x) = 1. ML.5.3. RATIONAL FUNCTION APPROXIMATION EXAMPLE 95 ML.5.2.4 The set of Legendre polynomials, {P0 , P1 , . . . , Pn } where P0 P1 P2 ∙∙∙ = = = ∙∙∙ 1, x, x2 − ∙∙∙ 1 3 is orthogonal on [−1, 1] with respect to the weight function w(x) = 1. EXAMPLE ML.5.2.5 The set of Chebyshev polynomials, {T0 , T1 , . . . , Tn } Tn (x) = cos(n arccos x), x ∈ [−1, 1], is orthogonal on [−1, 1] with respect to the weight function w(x) = √ ML.5.3 1 1 − x2 . Rational Function Approximation A rational function r of degree N has the form r(x) = P (x) Q(x) where P and Q are polynomials whose degrees sum to N . Rational functions whose numerator and denominator have the same or nearly the same degree generally produce approximation results superior to polynomial methods for the same amount of computation effort. (This is based on the assumption that the amount of computation effort required for division is approximately the same as for multiplication). Rational functions also permit efficient approximation for functions which are not bounded near, but outside, the interval of approximation. ML.5.4 Padé Approximation Suppose r is a function of degree N = m + n and has the form P (x) p 0 + p 1 x + ∙ ∙ ∙ + p n xn r(x) = . = Q(x) q0 + q1 x + ∙ ∙ ∙ + qm xm The function r is used to approximate a function f on a closed interval containing zero. Q(0) 6= 0 implies q0 6= 0. Without loss of generality, we can assume that q0 = 1, for if is not the case we simply replace P by P/q0 and Q by Q/q0 . 96 ML.5. ELEMENTS OF APPROXIMATION THEORY The Padé approximation technique chooses N + 1 parameters so that f (k) (0) = r(k) (0), k = 0, . . . , N Padé approximation is an extension of Taylor polynomial approximation to rational functions. When m = 0, it is just the N th Maclaurin polynomial. 
Suppose f has the Maclaurin series expansion f (x) = ∞ X ai x i . i=0 We have f (x) − r(x) = ∞ X ai x i=0 i m X i=0 i qi x − Q(x) n X pi xi i=0 The condition f (k) (0) = r(k) (0), k = 0, . . . , N, is equivalent to the fact that f − r has a root of multiplicity N + 1 at zero. Consequently, we choose q1 , q2 , . . . , qm and p0 , p1 , . . . , pn such that (a0 + a1 x + ∙ ∙ ∙ )(1 + q1 x + ∙ ∙ ∙ + qm xm ) − (p0 + p1 x + ∙ ∙ ∙ + pn xx ) has no terms of degree less than or equal to N . The coefficient of xk in the previous expression is k X i=0 ai qk−i − pk , where we define: pk = 0, k = n + 1, . . . , N ; qk = 0, k = m + 1, . . . , N ; so that the rational function for Padé approximation results from ( k X i=0 ai qk−i = pk , k = 0, . . . , N. ML.5.5. TRIGONOMETRIC POLYNOMIAL APPROXIMATION EXAMPLE 97 ML.5.4.1 The Maclaurin series expansion for ex is 1+x+ x2 2 + ∙∙∙ + xn n! + ∙∙∙ To find the Padé approximation of degree two with n = 1 and m = 1 to ex , we choose p0 , p1 and q1 such that the coefficients of xk , k = 0, 1, 2, in the expression (1 + x + x2 2 )(1 + q1 x) − (p0 + p1 x) are zero. Expanding and collecting terms produces 1 − p0 + (1 + q1 − p1 )x + ( therefore 1 + q1 )x2 + 2 q1 2 x3 = 0 1 − p0 1 + q1 − p 1 = 0 = 0 1/2 + q1 The solution of this system is p0 = 1, p1 = The Padé approximation is r(x) = ML.5.5 1 1 , q1 = − . 2 2 2+x 2−x . Trigonometric Polynomial Approximation Let m, n be two given integers such that m > n ≥ 2. Suppose that a set of 2m paired data 2m−1 points {(xj , yj )}j=0 is given, where xj = −π + j π, m j = 0, . . . , 2m − 1 Consider the set of functions {ϕ0 , . . . , ϕm }, where = 12 ϕ0 (x) ϕk (x) = cos kx, ϕn+k (x) = sin kx, k = 1, . . . , n, k = 1, . . . , n − 1. The goal is to determine the trigonometric polynomial Sn , Sn (x) = 2n−1 X k=0 ak ϕk (x), 98 ML.5. ELEMENTS OF APPROXIMATION THEORY or n−1 a0 X (ak cos kx + an+k sin kx) + an cos nx, + Sn (x) = 2 k=1 to minimize 2m−1 X E(Sn ) = j=0 (yj − Sn (xj ))2 . In order to determine the constants ak , we use the fact that the system {ϕ0 , . . . , ϕ2m−1 } is orthogonal, i.e., 2m−1 k 6= l, 0, X ϕk (xj )ϕl (xj ) = j=0 m, k = l. THEOREM ML.5.5.1 The constants in the summation Sn (x) = a0 2 + n−1 X (ak cos kx + an+k sin kx) + an cos nx k=1 that minimize the least square sum E(a0 , . . . , a2n−1 ) = 2m−1 X j=0 are ak = m for each k = 0, . . . , n, an+k = for each k = 1, . . . , n − 1. 2m−1 1 X (yj − Sn (xj ))2 yj cos kxj j=0 2m−1 1 X m yj sin kxj j=0 Proof. By setting the partial derivatives of E with respect to ak to zero, we obtain so 2m−1 X ∂E 0= =2 (yj − Sn (xj )(−ϕk (xj )), ∂ak j=0 2m−1 X j=0 yj ϕk (xj ) = 2m−1 X j=0 Sn (xj )ϕk (xj ) ML.5.6. FAST FOURIER TRANSFORM = 2m−1 X 2n−1 X j=0 al ϕl (xj )ϕk (xj ) = l=0 99 2n−1 X al 2m−1 X j=0 l=0 ϕl (xj )ϕk (xj ) = m ∙ ak , k = 0, . . . , 2n − 1. ML.5.6 Fast Fourier Transform It was shown in Section ML.5.5 that the least squares trigonometric polynomial has the form m−1 X a0 (ak cos kx + bk sin kx) + am cos mx + Sm = 2 k=1 where 2m−1 1 X ak = yj cos kxj , m j=0 bk = 2m−1 1 X yj sin kxj , m j=0 k = 0, . . . , m, k = 0, . . . , m − 1. The polynomial Sn can be used for interpolation with only a minor modification. The approximation of many thousands of data points require multiplication and addition operations numbering in the millions. The round-off error associated with this number of calculation generally dominates the approximation. In 1965 J. W. Tukey, in a paper with J. W. 
Cooley, published in the Mathematics of Computation, introduced the important Fast Fourier Transform algorithm. They applied a different method to calculate the constants in the interpolating trigonometric polynomial. Their method reduces the number of calculations to thousands compared to millions for the direct technique. The method described by Cooley and Tukey has come to be known either as Cooley-Tukey Algorithm or the Fast Fourier Transform (FFT) Algorithm and led to a revolution in the use of interpolatory trigonometric polynomials. The method consists of organizing the problem so that the number of data points being used can be easily factored, particularly into powers of two. The interpolation polynomial has the form m−1 Sm (x) = a0 + am cos mx X (ak cos kx + bk sin kx). + 2 k=1 The nodes are given by xj = −π + and the coefficients are j π, j = 0, . . . , 2m − 1, m 2m−1 1 X yj cos kxj , ak = m j=0 k = 0, . . . , m, 100 ML.5. ELEMENTS OF APPROXIMATION THEORY bk = 2m−1 1 X yj sin kxj , m j=0 k = 0, . . . , m. Despite both constant b0 and bm are zero, they have been added to the collection for notational convenience. Instead of directly evaluating the constants ak and bk the FFT method compute the complex coefficients ck = 2m−1 X yj ei πjk m k = 0, . . . , 2m − 1. , j=0 We have 2m−1 2m−1 j 1 1 X 1 X i −iπk k(−π+ m π) = yi e = yj eikxj ck e m m j=0 m j=0 2m−1 1 X = yj (cos kxj + i sin kxj ) = ak + ibk . m j=0 Once the constants ck have been determinate, ak and bk can be recovered using the equalities ak + ibk = 1 ck e−iπk , m k = 0, . . . , 2m − 1 Suppose m = 2p for some positive integer p. For each k = 0, . . . , 2m − 1, we have ck + cm+k = 2m−1 X y j ( ei πjk m +e πj(m+k) m )= j=0 But 1 + eiπj = 2m−1 X yj ei πjk m (1 + eiπj ). j=0 0, 2, if j is odd, if j is even, so, replacing j by 2j in the index of the sum, we obtain ck + cm+k = 2 m−1 X y2j e i πjk m 2 . j=0 Likewise ck − cm+k = 2ei n πk m−1 X y2j+1 e i πjk m 2 . j=0 Therefore, we can recover ck , and cm+k , k = 0, . . . , m − 1, hence all ck can be determinate. ML.5.7. THE BERNSTEIN POLYNOMIAL ML.5.7 101 The Bernstein Polynomial Let f be a real valued function defined on the interval [0, 1]. DEFINITION ML.5.7.1 The polynomial Bn (f ) defined by Bn (f )(x) = n X n k=0 k (1 − x) n−k k x f k n , x ∈ [0, 1], is called the Bernstein polynomial related to the function f. The polynomials n pn,k (x) = (1 − x)n−k xk , k k = 0, . . . , n, are called the Bernstein polynomials. They were introduced in mathematics in 1912 by S. N. Bernstein [Lor53] and can be derived from the binomial formula n X n n (1 − x)n−k xk . 1 = (1 − x + x) = k k=0 It follows immediately from the definition of the Bernstein polynomials that they satisfy the recurrence relations pn,k (x) = (1 − x) pn−1,k (x) + x pn−1,k−1 (x), (5.7.1) n = 1, . . . ; k = 1, . . . , n, pn,k (x) nk − x = x(1 − x)(pn−1,k−1 (x) − pn−1,k (x)), (5.7.2) k = 0, . . . , n + 1, ([DL93, p.305]), and they form a partition of unity n X pn,k (x) = 1. (5.7.3) k=0 (For the purpose of this formulas pn,−1 (x) := pn,n+1 (x) := 0). For a function f ∈ C[0, 1], formula ( ML.5.7.1 ) produces a linear map f 7→ Bn f from C[0, 1] to Pn , the space of polynomials of degree at most n. It is positive (i.e., satisfies Bn f ≥ 0 if f ≥ 0) bounded with norm 1, because of ( 5.7.3 ). Using the translation operator T, T f (x) := f (x + 1/n), the identity operator I and the forward difference operator Δ = T − I, we can write Bn (f, x) = (x ∙ T + (1 − x) ∙ I)n f (0). 
ML.5.6 Fast Fourier Transform

It was shown in Section ML.5.5 that the least squares trigonometric polynomial has the form

    S_m(x) = \frac{a_0}{2} + \sum_{k=1}^{m-1} (a_k \cos kx + b_k \sin kx) + a_m \cos mx,

where

    a_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \cos kx_j,   k = 0, ..., m,
    b_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \sin kx_j,   k = 1, ..., m-1.

The polynomial S_m can be used for interpolation with only a minor modification. The approximation of many thousands of data points requires numbers of multiplications and additions running into the millions, and the round-off error associated with this many calculations generally dominates the approximation. In 1965 J. W. Tukey, in a paper written with J. W. Cooley and published in Mathematics of Computation, introduced the important Fast Fourier Transform algorithm. They applied a different method to calculate the constants in the interpolating trigonometric polynomial. Their method reduces the number of calculations to thousands, compared to millions for the direct technique. The method described by Cooley and Tukey has come to be known either as the Cooley-Tukey Algorithm or the Fast Fourier Transform (FFT) Algorithm, and it led to a revolution in the use of interpolatory trigonometric polynomials. The method consists of organizing the problem so that the number of data points being used can be easily factored, particularly into powers of two.

The interpolating polynomial has the form

    S_m(x) = \frac{a_0 + a_m \cos mx}{2} + \sum_{k=1}^{m-1} (a_k \cos kx + b_k \sin kx).

The nodes are given by

    x_j = -\pi + \frac{j}{m}\pi,   j = 0, ..., 2m-1,

and the coefficients are

    a_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \cos kx_j,   k = 0, ..., m,
    b_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \sin kx_j,   k = 0, ..., m.

Although both constants b_0 and b_m are zero, they have been added to the collection for notational convenience.

Instead of directly evaluating the constants a_k and b_k, the FFT method computes the complex coefficients

    c_k = \sum_{j=0}^{2m-1} y_j e^{i\pi jk/m},   k = 0, ..., 2m-1.

We have

    \frac{1}{m} c_k e^{-i\pi k} = \frac{1}{m} \sum_{j=0}^{2m-1} y_j e^{ik(-\pi + j\pi/m)} = \frac{1}{m} \sum_{j=0}^{2m-1} y_j e^{ikx_j}
      = \frac{1}{m} \sum_{j=0}^{2m-1} y_j (\cos kx_j + i \sin kx_j) = a_k + i b_k.

Once the constants c_k have been determined, a_k and b_k can be recovered using the equalities

    a_k + i b_k = \frac{1}{m} c_k e^{-i\pi k},   k = 0, ..., 2m-1.

Suppose m = 2^p for some positive integer p. For each k = 0, ..., m-1, we have

    c_k + c_{m+k} = \sum_{j=0}^{2m-1} y_j ( e^{i\pi jk/m} + e^{i\pi j(m+k)/m} ) = \sum_{j=0}^{2m-1} y_j e^{i\pi jk/m} (1 + e^{i\pi j}).

But

    1 + e^{i\pi j} = 0 if j is odd, and = 2 if j is even,

so, keeping only the even indices and replacing j by 2j in the sum, we obtain

    c_k + c_{m+k} = 2 \sum_{j=0}^{m-1} y_{2j} e^{i\pi jk/(m/2)}.

Likewise,

    c_k - c_{m+k} = 2 e^{i\pi k/m} \sum_{j=0}^{m-1} y_{2j+1} e^{i\pi jk/(m/2)}.

Therefore, we can recover c_k and c_{m+k}, k = 0, ..., m-1, from two sums of the same form of half the length; iterating this splitting, all the c_k can be determined.
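The splitting above translates directly into a recursive program. A minimal sketch (ours, not from the book; it assumes len(y) is a power of two):

    import numpy as np

    def coeffs_c(y):
        # c_k = sum_j y_j exp(i*pi*j*k/m), k = 0, ..., 2m-1, via the even/odd splitting.
        N = len(y)                        # N = 2m
        if N == 1:
            return np.array(y, dtype=complex)
        even = coeffs_c(y[0::2])          # two half-size transforms
        odd = coeffs_c(y[1::2])
        k = np.arange(N // 2)
        twiddle = np.exp(1j * np.pi * k / (N // 2))   # e^{i*pi*k/m}
        # c_k + c_{m+k} = 2*even_k and c_k - c_{m+k} = 2*twiddle_k*odd_k, so:
        return np.concatenate([even + twiddle * odd, even - twiddle * odd])

    y = np.random.rand(8)
    # same values as the direct sum; here c_k = N * ifft(y)_k because e^{i*pi*jk/m} = e^{2*pi*i*jk/N}
    print(np.allclose(coeffs_c(y), len(y) * np.fft.ifft(y)))   # True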
ML.5.7 The Bernstein Polynomial

Let f be a real valued function defined on the interval [0, 1].

DEFINITION ML.5.7.1 The polynomial B_n(f) defined by

    B_n(f)(x) = \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k f(k/n),   x \in [0, 1],

is called the Bernstein polynomial related to the function f.

The polynomials

    p_{n,k}(x) = \binom{n}{k} (1-x)^{n-k} x^k,   k = 0, ..., n,

are called the Bernstein basis polynomials. They were introduced in mathematics in 1912 by S. N. Bernstein [Lor53] and can be derived from the binomial formula

    1 = (1 - x + x)^n = \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k.

It follows immediately from the definition that they satisfy the recurrence relations

    p_{n,k}(x) = (1-x) p_{n-1,k}(x) + x p_{n-1,k-1}(x),   n = 1, 2, ...; k = 1, ..., n,   (5.7.1)

    p_{n,k}(x) (k/n - x) = x(1-x) ( p_{n-1,k-1}(x) - p_{n-1,k}(x) ),   k = 0, ..., n+1,   (5.7.2)

([DL93, p. 305]), and that they form a partition of unity

    \sum_{k=0}^{n} p_{n,k}(x) = 1.   (5.7.3)

(For the purposes of these formulas, p_{n,-1}(x) := p_{n,n+1}(x) := 0.)

For a function f \in C[0,1], Definition ML.5.7.1 produces a linear map f \mapsto B_n f from C[0,1] to P_n, the space of polynomials of degree at most n. It is positive (i.e., it satisfies B_n f \ge 0 if f \ge 0) and bounded with norm 1, because of (5.7.3). Using the translation operator T, Tf(x) := f(x + 1/n), the identity operator I and the forward difference operator \Delta = T - I, we can write

    B_n(f, x) = ( x T + (1-x) I )^n f(0).   (5.7.4)

Consequently, we obtain the Taylor expansion

    B_n(f, x) = ( I + x \Delta )^n f(0),   (5.7.5)

i.e.,

    B_n(f, x) = \sum_{i=0}^{n} \binom{n}{i} \Delta^i f(0) x^i.   (5.7.6)

It follows that B_n(P_k) \subseteq P_k, k = 0, ..., n. It is instructive to compute B_n e_k, e_k(x) = x^k (k = 0, 1, ...). By using (5.7.6) we obtain

    B_n e_k(x) = \sum_{i=0}^{k} \binom{n}{i} x^i \Delta^i e_k(0)
      = \sum_{i=0}^{k} \binom{n}{i} (T - I)^i e_k(0) x^i
      = \sum_{i=0}^{k} \binom{n}{i} \sum_{j=0}^{i} (-1)^{i-j} \binom{i}{j} T^j e_k(0) x^i
      = \sum_{i=0}^{k} \binom{n}{i} \sum_{j=0}^{i} (-1)^{i-j} \binom{i}{j} (j/n)^k x^i,   k = 0, ..., n.

We have obtained

    B_n e_k = \sum_{i=0}^{k} m_{ki} e_i,   k = 0, ..., n,
    where  m_{ki} := \binom{n}{i} \sum_{j=0}^{i} (-1)^{i-j} \binom{i}{j} (j/n)^k,   k, i = 0, ..., n

(note that m_{ki} = 0 if i > k). The linear operator B_n : P_n \to P_n can thus be written in matrix form

    (B_n e_0, ..., B_n e_n)^T = ( m_{ki} )_{k,i=0}^{n} (e_0, ..., e_n)^T.   (5.7.7)

In particular, one easily computes B_n(e_k) for k = 0, 1, 2:

    B_n(e_0, x) = 1,
    B_n(e_1, x) = x,
    B_n(e_2, x) = \frac{n-1}{n} x^2 + \frac{x}{n},
    B_n(e_2, x) - e_2(x) = \frac{x(1-x)}{n}.

By differentiating (5.7.5) r times, we can express the derivatives of the Bernstein polynomial in terms of finite differences:

    B_n^{(r)}(f, x) = n(n-1)\cdots(n-r+1) \, \Delta^r (I + x\Delta)^{n-r} f(0),   r = 0, ..., n,   (5.7.8)

hence

    B_n^{(r)}(f, x) = n(n-1)\cdots(n-r+1) \sum_{k=0}^{n-r} p_{n-r,k}(x) \, \Delta^r T^k f(0),   r = 0, ..., n,   (5.7.9)

that is,

    B_n^{(r)}(f, x) = n(n-1)\cdots(n-r+1) \sum_{k=0}^{n-r} p_{n-r,k}(x) \, \Delta^r f(k/n),   r = 0, ..., n.   (5.7.10)

From (5.7.8), we also obtain

    B_n^{(r)}(f, x) = n(n-1)\cdots(n-r+1) \sum_{k=0}^{n-r} \binom{n-r}{k} x^k \Delta^{r+k} f(0),   r = 0, ..., n.   (5.7.11)

For a function f \in C[0,1], the modulus of continuity is defined by

    \omega(f, t) := \sup_{0 < h \le t} \; \sup_{0 \le x \le 1-h} |f(x+h) - f(x)|,   0 \le t \le 1.

One can easily prove that

    \omega(f, \alpha t) \le (1 + \alpha) \, \omega(f, t),   0 \le t \le 1, \; \alpha \ge 0.

One of the most used properties of the Bernstein operators is the approximation property.

THEOREM ML.5.7.2 If f : [0,1] \to R is bounded on [0,1] and continuous at some point x \in [0,1], then

    \lim_{n \to \infty} B_n(f, x) = f(x).

Proof. Since f is bounded, there exists a constant M such that |f(x)| \le M for all x \in [0,1]. Let \varepsilon > 0. Since f is continuous at x, there exists a \delta > 0 such that |x - y| < \delta implies |f(x) - f(y)| < \varepsilon/2. Let n_\varepsilon be a number such that

    \frac{M}{2 n \delta^2} < \frac{\varepsilon}{2}   if n > n_\varepsilon.

We have

    |B_n f(x) - f(x)| = \left| \sum_{i=0}^{n} p_{n,i}(x) \left( f(i/n) - f(x) \right) \right|
      \le \sum_{|i/n - x| < \delta} p_{n,i}(x) |f(i/n) - f(x)| + \sum_{|i/n - x| \ge \delta} p_{n,i}(x) |f(i/n) - f(x)|
      < \frac{\varepsilon}{2} + \sum_{i=0}^{n} p_{n,i}(x) \, \frac{2M (x - i/n)^2}{\delta^2}
      = \frac{\varepsilon}{2} + \frac{2M}{\delta^2} \, \frac{x(1-x)}{n} \le \frac{\varepsilon}{2} + \frac{M}{2 n \delta^2} < \varepsilon

for all n > n_\varepsilon (here we used \sum_i p_{n,i}(x)(x - i/n)^2 = x(1-x)/n and x(1-x) \le 1/4). \square

Another result related to the approximation property of the Bernstein operator is the following.

THEOREM ML.5.7.3 [Pop42] For all f \in C[0,1], the following relation is valid:

    |f(x) - B_n(f, x)| \le \frac{3}{2} \, \omega\!\left(f, \frac{1}{\sqrt{n}}\right),   x \in [0,1].

Proof. We have

    |f(x) - B_n(f, x)| = |B_n(1, x) f(x) - B_n(f, x)|
      = \left| \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k \left( f(x) - f(k/n) \right) \right|
      \le \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k \, \omega(f, |x - k/n|).

On the other hand, we have

    \omega(f, |x - k/n|) \le \left( 1 + \delta^{-1} |x - k/n| \right) \omega(f, \delta).

By using the Cauchy-Schwarz inequality, we obtain

    \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k |x - k/n|
      \le \sqrt{ \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k (x - k/n)^2 } \cdot \sqrt{ \sum_{k=0}^{n} \binom{n}{k} (1-x)^{n-k} x^k }
      = \sqrt{ \frac{x(1-x)}{n} } \cdot 1 \le \frac{1}{2\sqrt{n}}.

Consequently,

    |f(x) - B_n(f, x)| \le \left( 1 + \frac{1}{2\sqrt{n}} \, \delta^{-1} \right) \omega(f, \delta).

With \delta = n^{-1/2}, the proof is concluded. \square

Let r, p \in N. One of the most important shape preserving properties of the Bernstein operator is presented in the following theorem.

THEOREM ML.5.7.4 If f : [0,1] \to R is convex of order r, then B_n f is also convex of order r.

By using (5.7.10), the proof is obvious. It is this shape preserving property that makes the Bernstein polynomials one of the most used tools in CAGD.

THEOREM ML.5.7.5 If f \in C^p[0,1], then the sequence (B_n^{(i)}(f))_{n=1}^{\infty} converges uniformly to f^{(i)} (i = 0, ..., p).

THEOREM ML.5.7.6 (Voronovskaja 1932, [DL93, p. 307]) If f is bounded on [0,1], differentiable in some neighborhood of a point x \in [0,1], and has a second derivative f''(x), then

    \lim_{n \to \infty} n \left( (B_n f)(x) - f(x) \right) = \frac{x(1-x)}{2} f''(x).

THEOREM ML.5.7.7 ([Ara57]) If f : [0,1] \to R is continuous, then for every x \in [0,1] there exist three distinct points x_1, x_2, x_3 \in [0,1] such that

    B_n(f, x) - f(x) = \frac{x(1-x)}{n} [x_1, x_2, x_3; f].

Taking into account these approximation properties, the Bernstein polynomial sequence is often used as an approximation tool.
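As a quick illustration of Theorems ML.5.7.2 and ML.5.7.3, here is a minimal Python sketch (ours, not from the book) evaluating B_n(f, x) directly from Definition ML.5.7.1; for f(t) = |t - 1/2| we have \omega(f, t) = t, so the bound of Theorem ML.5.7.3 is (3/2)/\sqrt{n}.

    from math import comb

    def bernstein(f, n, x):
        # B_n(f, x) = sum_k C(n,k) (1-x)^(n-k) x^k f(k/n)
        return sum(comb(n, k) * (1 - x) ** (n - k) * x ** k * f(k / n)
                   for k in range(n + 1))

    f = lambda t: abs(t - 0.5)
    for n in (10, 100, 1000):
        err = abs(bernstein(f, n, 0.5) - f(0.5))
        print(n, err, 1.5 / n ** 0.5)   # the error stays below (3/2) omega(f, 1/sqrt(n))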
ML.5.8 Bézier Curves

In the forties, the automobile industry wanted to develop a more modern curved shape for cars. Paul de Faget de Casteljau, at Citroën (1959), and Pierre Etienne Bézier, at Renault (1962), developed what is now known as the theory of Bézier curves and surfaces. Because de Casteljau never published his results, many of the entities in this theory (and of course the theory itself) are named after Bézier.

Let (M_0, ..., M_n) be an ordered system of points in R^p.

DEFINITION ML.5.8.1 The curve defined by

    B[M_0, ..., M_n](t) := \sum_{k=0}^{n} \binom{n}{k} (1-t)^{n-k} t^k M_k,   t \in [0, 1],

is called the Bézier curve related to the system (M_0, ..., M_n).

The curve B[M_0, ..., M_n] mimics the shape of the system (M_0, ..., M_n). The polygon formed by M_0, ..., M_n is called the Bézier polygon or control polygon; similarly, the polygon vertices M_i are called control points or Bézier points. As P. E. Bézier realized during the 1960s, it became an indispensable tool in computer graphics.

The Bézier curve has the following important properties:

• It begins at M_0, heading in the direction M_0 M_1;
• It ends at M_n, heading in the direction M_{n-1} M_n;
• It stays entirely within the convex hull of {M_0, ..., M_n}.

Let us present a recursive method used to determine the Bézier curve B[M_0, ..., M_n]. The Popoviciu-Casteljau Algorithm, presented by Tiberiu Popoviciu ([Pop37, p. 39, (29)]) in his classes (1933), many years before Paul de Faget de Casteljau [dC59], and described below, is probably the most fundamental algorithm in the field of curve and surface design. Let

    M_{0,i}(t) = M_i,
    M_{r,i}(t) = (1-t) M_{r-1,i}(t) + t M_{r-1,i+1}(t)   (r = 1, ..., n; i = 0, ..., n-r).

Then M_{n,0}(t) is the point with parameter value t on the Bézier curve B[M_0, ..., M_n].

[Figure: the construction for n = 3, t = 1/3, showing the intermediate points M_{1,i}(1/3) and M_{2,i}(1/3) on the control polygon of M_0, M_1, M_2, M_3, and the curve point M_{3,0}(1/3).]

Let M_1, M_2, M_3 \in R^3. Another Bézier type curve was defined by Mircea Ivan in [Iva84]:

    I^\alpha[M_1, M_2, M_3](t) = (1-t)^\alpha M_1 + (1 - t^\alpha - (1-t)^\alpha) M_2 + t^\alpha M_3,   t \in [0, 1], \; \alpha > 0.

For a suitable choice of control points, we obtain a continuously differentiable piecewise I^\alpha curve which closely mimics the control polygonal line.
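The Popoviciu-Casteljau recursion is short enough to state as code. A minimal Python sketch (ours, not from the book; the control points are a made-up example matching the figure above):

    import numpy as np

    def de_casteljau(points, t):
        # Point at parameter t on the Bézier curve, via M_{r,i} = (1-t) M_{r-1,i} + t M_{r-1,i+1}.
        M = np.asarray(points, dtype=float)
        n = len(M) - 1
        for r in range(1, n + 1):
            M = (1 - t) * M[:-1] + t * M[1:]   # one level of the triangular scheme
        return M[0]                            # M_{n,0}(t)

    ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]   # n = 3
    print(de_casteljau(ctrl, 1 / 3))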
ML.5.9 The METAFONT Computer System

Sets of characters that are used in typesetting are traditionally known as fonts or type. Albrecht Dürer and other Renaissance men attempted to establish mathematical principles of type design, but the letters they came up with were not especially beautiful. Their methods failed because they restricted themselves to "ruler and compass" constructions, which cannot adequately express the nuances of good calligraphy [Knu86, p. 13]. METAFONT [Knu86], a computer system for the design of fonts suited to raster-based modern printing equipment, gets around this problem by using powerful mathematical techniques. The basic idea is to start with four control points M_0, M_1, M_2, M_3 and consider the Bézier curve B[M_0, M_1, M_2, M_3]. The fonts used to print this book were created by using Bézier curves. Curves traced out by Bernstein polynomials of degree 3 are often called Bézier cubics.

We obtain rather different curves from the same starting points if we number the points differently:

[Figure: three different Bézier cubics obtained by numbering the same four points in different orders.]

The four-point method is good enough to obtain a satisfactory approximation to any curve we want, provided we break it into enough segments and give four suitable control points for each segment.

[Figure: a curve approximated by two cubic segments, with control points 1-4 and 5-8.]

ML.6 Integration of Ordinary Differential Equations

ML.6.1 Introduction

Equations involving one or several derivatives of a function are called ordinary differential equations. A problem involving ordinary differential equations is not completely specified by its equations; even more important in determining how to solve the problem numerically is the nature of the problem's boundary conditions. Boundary conditions can be divided into two broad categories:

• initial value problems;
• two-point boundary value problems.

Let f : [a, b] \times [c, d] \to R be a continuous function possessing continuous partial derivatives of order two. We will consider the following initial value problem:

    \frac{dy}{dt} = f(t, y),   t \in [a, b],   y(a) = y_0,   (6.1.1)

which has a unique solution y \in C^2[a, b]. Define

    h = \frac{b-a}{n},   t_i = a + ih,   i = 0, ..., n,

where n is a positive integer. Some numerical methods for solving the initial value problem (6.1.1) will be presented.

ML.6.2 The Euler Method

Expand the solution y of problem (6.1.1) into a Taylor series:

    y(t_{i+1}) = y(t_i) + h y'(t_i) + \frac{h^2}{2} y''(\xi_i),   t_i < \xi_i < t_{i+1},   i = 0, ..., n-1.

But y'(t) = f(t, y(t)), and hence

    y(t_{i+1}) = y(t_i) + h f(t_i, y(t_i)) + \frac{h^2}{2} y''(\xi_i).

The Euler method is obtained by neglecting the remainder term \frac{h^2}{2} y''(\xi_i) in the above expansion:

    y_{i+1} = y_i + h f(t_i, y_i),   i = 0, ..., n-1.   (6.2.1)

The Euler method is one of the oldest and best known numerical methods for integrating differential equations. It can be improved in a number of ways.
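A minimal Python sketch of (6.2.1) (ours, not from the book):

    def euler(f, a, b, y0, n):
        # Approximate the solution of y' = f(t, y), y(a) = y0, by y_{i+1} = y_i + h f(t_i, y_i).
        h = (b - a) / n
        t, y = a, y0
        ys = [y0]
        for i in range(n):
            y = y + h * f(t, y)
            t = a + (i + 1) * h
            ys.append(y)
        return ys

    # example: y' = y, y(0) = 1, so y(1) = e; Euler gives (1 + 1/100)^100 ~ 2.7048
    print(euler(lambda t, y: y, 0.0, 1.0, 1.0, 100)[-1])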
ML.6.3 The Taylor Series Method

Suppose that equation (6.1.1) has a solution y \in C^{p+1}[a, b], and that y(t_{i+1}) is approximated by the Taylor polynomial

    y(t_i) + h y'(t_i) + \cdots + \frac{h^p}{p!} y^{(p)}(t_i).

Use the relations

    y'(t) = f(t, y(t)),
    y''(t) = f'_t(t, y(t)) + f'_y(t, y(t)) \cdot f(t, y(t)),   etc.,

and denote by T_p(t, v) the expression obtained by replacing y(t) by v in

    g_p(t) = \sum_{j=0}^{p-1} \frac{h^{j+1}}{(j+1)!} \left( \frac{d}{dt} \right)^j f(t, y(t)).

We obtain a Taylor method of order p:

    y_{i+1} = y_i + T_p(t_i, y_i),   i = 0, ..., n-1.

For p = 1, we have g_1(t) = h f(t, y(t)), T_1(t, v) = h f(t, v), hence we obtain the Euler method:

    y_{i+1} = y_i + h f(t_i, y_i),   i = 0, ..., n-1.

For p = 2, we get

    g_2(t) = h f(t, y(t)) + \frac{h^2}{2} \frac{d}{dt} f(t, y(t))
           = h f(t, y(t)) + \frac{h^2}{2} \left( f'_t(t, y(t)) + f'_y(t, y(t)) \cdot f(t, y(t)) \right),   (6.3.1)

hence

    T_2(t, v) = h f(t, v) + \frac{h^2}{2} \left( f'_t(t, v) + f'_y(t, v) \cdot f(t, v) \right),

and we obtain the Taylor method of order two:

    y_{i+1} = y_i + h f(t_i, y_i) + \frac{h^2}{2} \left( f'_t(t_i, y_i) + f'_y(t_i, y_i) \cdot f(t_i, y_i) \right),   i = 0, ..., n-1.

ML.6.4 The Runge-Kutta Method

Let y be a solution of equation (6.1.1). The constants a_1, ..., a_p, b_1, ..., b_{p-1}, c_{11}, c_{21}, c_{22}, ..., c_{p-1,p-1}, given in the relations

    k_1 = h f(t_i, y(t_i)),
    k_2 = h f(t_i + b_1 h, y(t_i) + c_{11} k_1),
    ...
    k_p = h f(t_i + b_{p-1} h, y(t_i) + c_{p-1,1} k_1 + \cdots + c_{p-1,p-1} k_{p-1}),

will be determined such that the functions

    h \mapsto y(t_i + h)   and   h \mapsto y(t_i) + a_1 k_1 + \cdots + a_p k_p

have the same Taylor approximation of order o(h^p).

For example, when p = 2, we need only find the constants a_1, a_2, b_1, c_{11}, given in the relations

    k_1 = h f(t_i, y(t_i)),
    k_2 = h f(t_i + b_1 h, y(t_i) + c_{11} k_1),

such that the two functions h \mapsto y(t_i + h) and h \mapsto y(t_i) + a_1 k_1 + a_2 k_2 have the same Taylor approximation of order o(h^2). The approximation of order o(h^2) of the function y(t_{i+1}) = y(t_i + h) is

    y(t_i) + y'(t_i) h + y''(t_i) \frac{h^2}{2}
      = y(t_i) + f(t_i, y(t_i)) h + \left( f'_t(t_i, y(t_i)) + f'_y(t_i, y(t_i)) f(t_i, y(t_i)) \right) \frac{h^2}{2},

and the approximation of order o(h^2) of the function

    y(t_i) + a_1 k_1 + a_2 k_2 = y(t_i) + a_1 h f(t_i, y(t_i)) + a_2 h f(t_i + b_1 h, y(t_i) + c_{11} k_1)

is

    y(t_i) + a_1 h f(t_i, y(t_i)) + a_2 h \left( f(t_i, y(t_i)) + b_1 h f'_t(t_i, y(t_i)) + c_{11} h f(t_i, y(t_i)) f'_y(t_i, y(t_i)) \right)
      = y(t_i) + (a_1 + a_2) f(t_i, y(t_i)) h + \left( a_2 b_1 f'_t(t_i, y(t_i)) + a_2 c_{11} f'_y(t_i, y(t_i)) f(t_i, y(t_i)) \right) h^2.

Consequently, we obtain the system

    a_1 + a_2 = 1,   a_2 b_1 = \frac{1}{2},   a_2 c_{11} = \frac{1}{2},

hence, by taking a_1 = 1/2, we obtain

    a_2 = \frac{1}{2},   b_1 = 1,   c_{11} = 1.

The Runge-Kutta algorithm of order two can be defined by the equations:

    y_0 = y(t_0),
    k_1 = h f(t_i, y_i),
    k_2 = h f(t_i + h, y_i + k_1),
    y_{i+1} = y_i + \frac{1}{2}(k_1 + k_2),   i = 0, ..., n-1.

The classical Runge-Kutta method (that of order four) can be defined by the following equations:

    y_0 = y(t_0),
    k_1 = h f(t_i, y_i),
    k_2 = h f(t_i + \frac{1}{2}h, y_i + \frac{1}{2}k_1),
    k_3 = h f(t_i + \frac{1}{2}h, y_i + \frac{1}{2}k_2),
    k_4 = h f(t_i + h, y_i + k_3),
    y_{i+1} = y_i + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4),   i = 0, ..., n-1.

ML.6.5 The Runge-Kutta Method for Systems of Equations

Consider the system of differential equations

    \frac{dx_1}{dt} = f_1(t, x_1, ..., x_n),
    ...
    \frac{dx_n}{dt} = f_n(t, x_1, ..., x_n),   (6.5.1)

with initial conditions x_1(t_0) = x_1^0, ..., x_n(t_0) = x_n^0. Using the notations F = [f_1, ..., f_n], X = [x_1, ..., x_n], system (6.5.1) can be written in the form

    \frac{dX}{dt} = F(t, X),

with the initial condition X(t_0) = X_0 = [x_1^0, ..., x_n^0]. The Runge-Kutta method of order four for system (6.5.1) is:

    X_0 = X(t_0),
    K_1 = h F(t_i, X_i),
    K_2 = h F(t_i + \frac{1}{2}h, X_i + \frac{1}{2}K_1),
    K_3 = h F(t_i + \frac{1}{2}h, X_i + \frac{1}{2}K_2),
    K_4 = h F(t_i + h, X_i + K_3),
    X_{i+1} = X_i + \frac{1}{6}(K_1 + 2K_2 + 2K_3 + K_4),   i = 0, ..., n-1.
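A minimal Python sketch of the order-four scheme for systems (ours, not from the book; the harmonic-oscillator example is a made-up test case):

    import numpy as np

    def rk4(F, t0, X0, h, steps):
        # Classical fourth-order Runge-Kutta for dX/dt = F(t, X), X(t0) = X0.
        X = np.asarray(X0, dtype=float)
        t = t0
        for _ in range(steps):
            K1 = h * F(t, X)
            K2 = h * F(t + h / 2, X + K1 / 2)
            K3 = h * F(t + h / 2, X + K2 / 2)
            K4 = h * F(t + h, X + K3)
            X = X + (K1 + 2 * K2 + 2 * K3 + K4) / 6
            t += h
        return X

    # example: x'' = -x written as a first-order system; exact solution (cos t, -sin t)
    F = lambda t, X: np.array([X[1], -X[0]])
    print(rk4(F, 0.0, [1.0, 0.0], 0.1, 63))   # ~ [cos 6.3, -sin 6.3] = [0.9999, -0.0168]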
ML.7 Integration of Partial Differential Equations

ML.7.1 Introduction

Partial differential equations (PDE) are at the heart of many computer analyses and simulations of continuous physical systems. The intent of this chapter is to give the briefest possible useful introduction. We limit our treatment to partial differential equations of order two. A linear partial differential equation of order two has the following form:

    A \frac{\partial^2 u}{\partial x^2} + 2B \frac{\partial^2 u}{\partial x \partial y} + C \frac{\partial^2 u}{\partial y^2} + D \frac{\partial u}{\partial x} + E \frac{\partial u}{\partial y} + F u + G = H,   (7.1.1)

where the coefficients A, B, ..., H are functions of x and y. Equation (7.1.1) is called homogeneous when H = 0; otherwise it is called nonhomogeneous. For a given domain, in most mathematical books the partial differential equations are classified, on the basis of their discriminant \Delta = B^2 - AC, into three categories:

• elliptic, if \Delta < 0;
• parabolic, if \Delta = 0;
• hyperbolic, if \Delta > 0.

A general method to approximate the solution is the grid method. The first step is to choose integers n, m and step sizes h, k such that

    h = \frac{b-a}{n},   k = \frac{d-c}{m}.

Partitioning the interval [a, b] into n equal parts of width h and the interval [c, d] into m equal parts of width k provides a grid, obtained by drawing vertical and horizontal lines through the points with coordinates (x_i, y_j), where

    x_i = a + ih,   y_j = c + jk,   i = 0, ..., n; \; j = 0, ..., m.

The lines x = x_i, y = y_j are called grid lines, and their intersections are the mesh points of the grid.

ML.7.2 Parabolic Partial Differential Equations

Consider a particular case of the heat (or diffusion) parabolic partial differential equation

    \frac{\partial u}{\partial t}(x, t) = \frac{\partial^2 u}{\partial x^2}(x, t),   0 < x < l, \; 0 < t < T,

subject to the conditions

    u(0, t) = u(l, t) = 0,   0 < t < T,
    u(x, 0) = f(x),   0 \le x \le l.

One of the algorithms used to approximate the solution of this equation is the Crank-Nicolson algorithm. It is an implicit algorithm defined by the relations

    \frac{\partial u}{\partial t}(ih, jk) \approx \frac{u_{i,j+1} - u_{ij}}{k},
    \frac{\partial^2 u}{\partial x^2}(ih, jk) \approx \frac{u_{i+1,j+1} - 2u_{i,j+1} + u_{i-1,j+1} + u_{i+1,j} - 2u_{ij} + u_{i-1,j}}{2h^2}.

By letting r = k/h^2, we obtain the system

    2(1+r) u_{i,j+1} - r u_{i-1,j+1} - r u_{i+1,j+1} = r u_{i-1,j} + 2(1-r) u_{ij} + r u_{i+1,j}.

The values at the nodes of line "j+1" are calculated using the already computed values on line "j".

ML.7.3 Hyperbolic Partial Differential Equations

For this type of PDE, we consider the numerical solution of a particular case of the wave equation:

    \frac{\partial^2 u}{\partial t^2}(x, t) = \frac{\partial^2 u}{\partial x^2}(x, t),   0 < x < l, \; 0 < t < T,

subject to the conditions

    u(0, t) = u(l, t) = 0,   0 < t < T,
    u(x, 0) = f(x),   0 \le x \le l,
    \frac{\partial u}{\partial t}(x, 0) = g(x),   0 \le x \le l.

Using the finite difference method, consider

    \frac{\partial^2 u}{\partial x^2}(ih, jk) \approx \frac{u_{i+1,j} - 2u_{ij} + u_{i-1,j}}{h^2},
    \frac{\partial^2 u}{\partial t^2}(ih, jk) \approx \frac{u_{i,j+1} - 2u_{ij} + u_{i,j-1}}{k^2},

hence, letting r = k^2/h^2, we obtain an explicit algorithm:

    u_{i,j+1} = -u_{i,j-1} + 2(1-r) u_{ij} + r (u_{i+1,j} + u_{i-1,j}).

ML.7.4 Elliptic Partial Differential Equations

The approximate solution of the Poisson elliptic partial differential equation

    \frac{\partial^2 u}{\partial x^2}(x, y) + \frac{\partial^2 u}{\partial y^2}(x, y) = f(x, y),

subject to the boundary conditions

    u(x, c) = f_1(x),   u(x, d) = f_2(x),   x \in [a, b],
    u(a, y) = f_3(y),   u(b, y) = f_4(y),   y \in [c, d],

can be found by taking h = k and

    \frac{\partial^2 u}{\partial x^2}(ih, jh) \approx \frac{u_{i+1,j} - 2u_{ij} + u_{i-1,j}}{h^2},
    \frac{\partial^2 u}{\partial y^2}(ih, jh) \approx \frac{u_{i,j+1} - 2u_{ij} + u_{i,j-1}}{h^2},

which leads to the implicit (five-point) algorithm

    u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1} - 4 u_{ij} = h^2 f_{ij}.
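Returning to the parabolic case, here is a minimal Python sketch of the Crank-Nicolson scheme of Section ML.7.2 (ours, not from the book; it uses a dense solver for the tridiagonal system for simplicity, and the test problem u(x, 0) = sin(pi x), whose exact solution is e^{-pi^2 t} sin(pi x), is a made-up example):

    import numpy as np

    def crank_nicolson(f, l, T, n, m):
        # u_t = u_xx on (0, l) x (0, T), u(0,t) = u(l,t) = 0, u(x,0) = f(x);
        # n space steps, m time steps.
        h, k = l / n, T / m
        r = k / h ** 2
        x = np.linspace(0.0, l, n + 1)
        u = f(x)
        u[0] = u[-1] = 0.0
        # left-hand side 2(1+r) u_{i,j+1} - r u_{i-1,j+1} - r u_{i+1,j+1} as a matrix
        A = (np.diag([2 * (1 + r)] * (n - 1))
             + np.diag([-r] * (n - 2), 1) + np.diag([-r] * (n - 2), -1))
        for _ in range(m):
            b = r * u[:-2] + 2 * (1 - r) * u[1:-1] + r * u[2:]
            u[1:-1] = np.linalg.solve(A, b)
        return x, u

    x, u = crank_nicolson(lambda x: np.sin(np.pi * x), 1.0, 0.1, 10, 10)
    print(u.max(), np.exp(-np.pi ** 2 * 0.1))   # both ~ 0.37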
ML.7 Self Evaluation Tests

ML.7.1 Tests

Let x_0, ..., x_n be distinct points on the real axis, f be a function defined at these points, and \ell(x) = (x - x_0) \cdots (x - x_n).

TEST ML.7.1.1 Let P be a polynomial. The remainder obtained by dividing the polynomial P by the polynomial (X - x_0) \cdots (X - x_n) is:
(A) the Lagrange polynomial L(x_0, ..., x_n; P);
(B) the Lagrange polynomial L(x_0, ..., x_n; (X - x_0) \cdots (X - x_n));
(C) the Lagrange polynomial L(x_0, ..., x_n; X^n);
(D) the Lagrange polynomial L(x_0, ..., x_n; X^n P).

TEST ML.7.1.2 The polynomial L(x_0, ..., x_n; X^{n+1}) is:
(A) X^{n+1};
(B) X^{n+1} - (X - x_0)(X - x_1) \cdots (X - x_n);
(C) X^n;
(D) 0.

TEST ML.7.1.3 Let P_n : R \to R be a polynomial of degree at most n, n \in N. Prove the decomposition in partial fractions

    \frac{P_n(x)}{(x - x_0) \cdots (x - x_n)} = \sum_{i=0}^{n} \frac{1}{x - x_i} \cdot \frac{P_n(x_i)}{\ell'(x_i)}.

TEST ML.7.1.4 Calculate the sum

    \sum_{i=0}^{n} \frac{x_i^k}{(x_i - x_0) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)},   k = 0, ..., n.

TEST ML.7.1.5 Calculate [x_0, ..., x_n; X^{n+1}].

TEST ML.7.1.6 Calculate [x_0, ..., x_n; (X - x_0) \cdots (X - x_k)], k \in N.

TEST ML.7.1.7 Prove that

    L(x_0, ..., x_n; L(x_1, ..., x_n; f)) = L(x_1, ..., x_n; L(x_0, ..., x_n; f)).

TEST ML.7.1.8 Prove that, for x \notin \{x_0, ..., x_n\},

    \left[ x_0, ..., x_n; \frac{f(t)}{x - t} \right]_t = \frac{L(x_0, ..., x_n; f)(x)}{(x - x_0) \cdots (x - x_n)}.

TEST ML.7.1.9 If f(x) = 1/x and 0 < x_0 < x_1 < \cdots < x_n, then

    [x_0, ..., x_n; f] = \frac{(-1)^n}{x_0 \cdots x_n}.

TEST ML.7.1.10 For x > 0, we define the function

    H(x) = 1 + \sum_{k=1}^{\infty} \frac{1}{(x+2) \cdots (x+k+1)}.

Show that

a) H(x) = e (x+1) J(x), where J(x) = \int_0^1 e^{-t} t^x \, dt;

b) e^{1/(x+2)} \le H(x) \le \frac{e + x + 1}{x + 2}.

TEST ML.7.1.11
a) Show that the following quadrature formula has degree of exactness 2:

    \int_0^1 f(x) \, dx = \frac{1}{4} f(0) + \frac{3}{4} f\!\left(\frac{2}{3}\right) + R(f).

b) Let f be a function from C^1[0,1]. Show that there exists a point \theta = \theta(f) such that

    R(f) = \frac{1}{36} \left[ \theta, \theta, 0, \frac{2}{3}; f \right].

TEST ML.7.1.12 Show that the following quadrature formula has degree of exactness 2:

    \int_a^b f(x) \, dx = \frac{b-a}{4} \left( f(a) + 3 f\!\left(\frac{a+2b}{3}\right) \right) + R(f),

and that, for f \in C^3[a, b], there exists c = c(f) \in [a, b] such that

    R(f) = \frac{(b-a)^4}{216} f'''(c).

TEST ML.7.1.13 Let f \in C^3[a, b] and let us consider the following quadrature formula:

    \int_a^b f(x) \, dx = \frac{b-a}{4n} \sum_{k=0}^{n-1} \left( f\!\left(a + \frac{k(b-a)}{n}\right) + 3 f\!\left(a + \frac{(3k+2)(b-a)}{3n}\right) \right) + R_n(f).

Show that

    \lim_{n \to \infty} n^3 R_n(f) = \frac{(b-a)^3}{216} \left( f''(b) - f''(a) \right).

TEST ML.7.1.14 Let \Pi^+_{2n} be the set of all polynomials of degree 2n which are positive on the set

    \left\{ \cos \frac{(2k+1)\pi}{2n} : k = 0, 1, ..., n-1 \right\}

and have leading coefficient 1. Using the Gaussian quadrature formula

    \int_{-1}^{1} \frac{f(x)}{\sqrt{1-x^2}} \, dx = \sum_{k=0}^{n-1} A_k f\!\left( \cos \frac{(2k+1)\pi}{2n} \right) + R_n(f),

show that

    \inf_{P \in \Pi^+_{2n}} \int_{-1}^{1} \frac{P(x)}{\sqrt{1-x^2}} \, dx = \frac{\pi}{2^{2n-1}}.

TEST ML.7.1.15 Find a quadrature formula of the following form, with maximum degree of exactness:

    \int_0^1 f(x) \, dx = A f(0) + B f(x_1) + C f(x_2) + R(f),   x_1 \ne x_2, \; x_1, x_2 \in (0, 1).

TEST ML.7.1.16 Find a quadrature formula of the form

    \int_0^1 f(x) \, dx = c_1 \left( f\!\left(\frac{1}{2} - x_1\right) + f\!\left(\frac{1}{2} + x_1\right) \right) + c_2 \left( f'\!\left(\frac{1}{2} - x_1\right) - f'\!\left(\frac{1}{2} + x_1\right) \right) + R(f),   f \in C^1[0, 1],

with maximum degree of exactness.

TEST ML.7.1.17 Let P_n(x) = x^n + a_1 x^{n-1} + \cdots + a_n be a polynomial with real coefficients. Show that

    \int_0^1 P_n^2(x) \, dx \ge \frac{(n!)^4}{[(2n)!]^2 (2n+1)}.

For what polynomial does the inequality above become equality?
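Before the answers, a quick numerical check of Tests ML.7.1.4 and ML.7.1.5 (ours, not from the book; the nodes are a made-up example). The sums in question are divided differences, computed here by the standard recursive definition:

    import numpy as np

    def divided_difference(x, fx):
        # [x_0, ..., x_n; f] via [x_i, ..., x_{i+r}; f] = ([x_{i+1},...] - [x_i,...]) / (x_{i+r} - x_i)
        d = np.array(fx, dtype=float)
        n = len(x)
        for r in range(1, n):
            d[:n - r] = (d[1:n - r + 1] - d[:n - r]) / (np.array(x[r:]) - np.array(x[:n - r]))
        return d[0]

    x = [0.5, 1.0, 2.0, 3.0]                          # n = 3
    print(divided_difference(x, [t ** 3 for t in x]))   # [x_0,...,x_n; X^n] = 1
    print(divided_difference(x, [t ** 4 for t in x]))   # [x_0,...,x_n; X^{n+1}] = x_0+...+x_n = 6.5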
ML.7.2 Answers to Tests

TEST ANSWER ML.7.1.1 Answer (A). Indeed, let P(x) = Q(x)(x - x_0) \cdots (x - x_n) + R(x). We have R(x_i) = P(x_i), i = 0, ..., n. Since the degree of the polynomial R is at most n, we get R = L(x_0, ..., x_n; P).

TEST ANSWER ML.7.1.2 Answer (B). Indeed, we have

    X^{n+1} = 1 \cdot (X - x_0)(X - x_1) \cdots (X - x_n) + X^{n+1} - (X - x_0)(X - x_1) \cdots (X - x_n).

By Test ML.7.1.1, we obtain

    L(x_0, ..., x_n; X^{n+1}) = X^{n+1} - (X - x_0)(X - x_1) \cdots (X - x_n).

TEST ANSWER ML.7.1.3 Since the Lagrange operator L(x_0, ..., x_n; \cdot) preserves polynomials of degree at most n, with \ell(x) = (x - x_0) \cdots (x - x_n) we have

    \frac{P_n(x)}{\ell(x)} = \frac{L(x_0, ..., x_n; P_n)(x)}{\ell(x)} = \sum_{i=0}^{n} \frac{1}{x - x_i} \cdot \frac{P_n(x_i)}{\ell'(x_i)}.

TEST ANSWER ML.7.1.4

    \sum_{i=0}^{n} \frac{x_i^k}{(x_i - x_0) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)} = [x_0, ..., x_n; X^k]
      = 0 if 0 \le k \le n-1, and = 1 if k = n.

See (3.3.2).

TEST ANSWER ML.7.1.5 We have:

    0 = [x_0, ..., x_n; (X - x_0) \cdots (X - x_n)]
      = [x_0, ..., x_n; X^{n+1} - (x_0 + \cdots + x_n) X^n + \cdots]
      = [x_0, ..., x_n; X^{n+1}] - (x_0 + \cdots + x_n) [x_0, ..., x_n; X^n] + 0
      = [x_0, ..., x_n; X^{n+1}] - (x_0 + \cdots + x_n) \cdot 1.

Consequently,

    [x_0, ..., x_n; X^{n+1}] = x_0 + \cdots + x_n.

TEST ANSWER ML.7.1.6 For k \le n-2, the degree of the polynomial (X - x_0) \cdots (X - x_k) is at most n-1, hence

    [x_0, ..., x_n; (X - x_0) \cdots (X - x_k)] = 0.

For k = n-1, the polynomial is monic of degree n, hence

    [x_0, ..., x_n; (X - x_0) \cdots (X - x_{n-1})] = [x_0, ..., x_n; X^n - \cdots] = [x_0, ..., x_n; X^n] - 0 = 1.

For k \ge n, the polynomial (X - x_0) \cdots (X - x_k) takes the value zero at X = x_0, ..., x_n, hence

    [x_0, ..., x_n; (X - x_0) \cdots (X - x_k)] = 0.

TEST ANSWER ML.7.1.7 Since the degree of the polynomial L(x_1, ..., x_n; f) is at most n-1, we have

    L(x_0, ..., x_n; L(x_1, ..., x_n; f)) = L(x_1, ..., x_n; f).

Consider the polynomial P = L(x_1, ..., x_n; L(x_0, ..., x_n; f)). We have

    P(x_i) = L(x_0, ..., x_n; f)(x_i) = f(x_i),   i = 1, ..., n,

hence P = L(x_1, ..., x_n; f).

TEST ANSWER ML.7.1.8 With \ell(x) = (x - x_0) \cdots (x - x_n), we have

    L(x_0, ..., x_n; f)(x) = \sum_{i=0}^{n} \frac{\ell(x)}{x - x_i} \cdot \frac{f(x_i)}{\ell'(x_i)},

hence

    \frac{L(x_0, ..., x_n; f)(x)}{\ell(x)} = \sum_{i=0}^{n} \frac{f(x_i)}{(x - x_i) \ell'(x_i)} = \left[ x_0, ..., x_n; \frac{f(t)}{x - t} \right]_t.

TEST ANSWER ML.7.1.9 By using Problem ML.3.12.8 (Test ML.7.1.8 with f = 1 and x = 0), we have

    \left[ x_0, ..., x_n; \frac{1}{t} \right] = -\frac{L(x_0, ..., x_n; 1)(0)}{(0 - x_0) \cdots (0 - x_n)} = \frac{(-1)^n}{x_0 \cdots x_n}.

Solution 2. We write the obvious equality

    \left[ x_0, ..., x_n; \frac{(t - x_0) \cdots (t - x_n)}{t} \right] = 0

in the form

    \left[ x_0, ..., x_n; t^n - (x_0 + \cdots + x_n) t^{n-1} + \cdots + (-1)^{n+1} \frac{x_0 \cdots x_n}{t} \right] = 0,

i.e.,

    1 - (x_0 + \cdots + x_n) \cdot 0 + \cdots + 0 + (-1)^{n+1} x_0 \cdots x_n \left[ x_0, ..., x_n; \frac{1}{t} \right] = 0.

TEST ANSWER ML.7.1.10
a) We have

    \frac{1}{(x+2) \cdots (x+k+1)} = \frac{\Gamma(x+2)\Gamma(k)}{\Gamma(x+k+2)} \cdot \frac{1}{(k-1)!} = \frac{\beta(x+2, k)}{(k-1)!}.   (7.2.1)

From (7.2.1), we get

    H(x) = 1 + \sum_{k=1}^{\infty} \frac{\beta(x+2, k)}{(k-1)!}
         = 1 + \sum_{k=1}^{\infty} \frac{1}{(k-1)!} \int_0^1 t^{k-1} (1-t)^{x+1} \, dt
         = 1 + \int_0^1 \sum_{k=0}^{\infty} \frac{t^k}{k!} (1-t)^{x+1} \, dt
         = 1 + \int_0^1 e^t (1-t)^{x+1} \, dt = 1 + e \int_0^1 e^{-t} t^{x+1} \, dt.

Integrating by parts, we obtain

    H(x) = 1 + e \left( \left[ -e^{-t} t^{x+1} \right]_0^1 + (x+1) \int_0^1 e^{-t} t^x \, dt \right) = e (x+1) \int_0^1 e^{-t} t^x \, dt = e (x+1) J(x).

b) The function e^{-t} is convex. We consider the weight function w(t) = (x+1) t^x, t \in [0, 1], and the quadrature formula

    \int_0^1 w(t) f(t) \, dt = f\!\left( \frac{x+1}{x+2} \right) + R(f),

with x \ge 0 fixed. The above quadrature formula is a Gauss-type quadrature formula, so for a convex function f we have R(f) \ge 0. For f(t) = e^{-t}, we obtain

    (x+1) J(x) \ge e^{-(x+1)/(x+2)},   i.e.,   H(x) \ge e \cdot e^{-(x+1)/(x+2)} = e^{1/(x+2)}.

For the second inequality we use the trapezoidal rule with the weight function w: by convexity, e^{-t} \le (1-t) + e^{-1} t on [0, 1], so

    H(x) = e \int_0^1 w(t) e^{-t} \, dt \le e \int_0^1 (1-t) w(t) \, dt + \int_0^1 t \, w(t) \, dt = \frac{e}{x+2} + \frac{x+1}{x+2} = \frac{e + x + 1}{x + 2}.
TEST ANSWER ML.7.1.11
a) We have

    R(e_0) = \int_0^1 dx - \frac{1}{4} - \frac{3}{4} = 0,
    R(e_1) = \int_0^1 x \, dx - \frac{3}{4} \cdot \frac{2}{3} = 0,
    R(e_2) = \int_0^1 x^2 \, dx - \frac{3}{4} \cdot \frac{4}{9} = 0,
    R(e_3) = \int_0^1 x^3 \, dx - \frac{3}{4} \cdot \frac{8}{27} = \frac{1}{36}.

b) To begin, we remark that

    f(x) - f\!\left(\frac{2}{3}\right) = \left( x - \frac{2}{3} \right) \left[ x, \frac{2}{3}; f \right],   (7.2.2)

    \int_0^1 x \left( x - \frac{2}{3} \right) dx = 0.   (7.2.3)

The quadrature formula is of interpolation type with nodes 0 and 2/3, and therefore

    R(f) = \int_0^1 x \left( x - \frac{2}{3} \right) \left[ x, 0, \frac{2}{3}; f \right] dx.   (7.2.4)

Because of (7.2.3), the function W(x) = \int_0^x t (t - 2/3) \, dt = x^2 (x-1)/3 satisfies W(0) = W(1) = 0 and W \le 0 on [0, 1], with \int_0^1 (-W(x)) \, dx = 1/36. Integrating by parts in (7.2.4) and applying the mean value theorem for integrals (with the nonnegative weight -W), there exists a point \theta = \theta(f) \in [0, 1] such that

    R(f) = -\int_0^1 W(x) \frac{d}{dx} \left[ x, 0, \frac{2}{3}; f \right] dx = \frac{1}{36} \left. \frac{d}{dx} \left[ x, 0, \frac{2}{3}; f \right] \right|_{x=\theta}.   (7.2.5)

On the other hand, we have

    \frac{d}{dx} \left[ x, 0, \frac{2}{3}; f \right] = \left[ x, x, 0, \frac{2}{3}; f \right].

From here, by using (7.2.5), we obtain

    R(f) = \frac{1}{36} \left[ \theta, \theta, 0, \frac{2}{3}; f \right].

TEST ANSWER ML.7.1.12 A simple computation shows that

    R(e_i) = \int_a^b x^i \, dx - \frac{b-a}{4} \left( a^i + 3 \left( \frac{a+2b}{3} \right)^i \right) = 0,   i = 0, 1, 2,

and R(e_3) = (b-a)^4/36. Let us consider the function g : [0, 1] \to R defined by g(t) = f((1-t)a + tb). Using the previous result (Test ML.7.1.11), we have

    R(g) = \frac{1}{36} \left[ \theta, \theta, 0, \frac{2}{3}; g \right].

Using the mean value theorem for divided differences, it follows that there exists a point c = c(f) \in [0, 1] such that

    \left[ \theta, \theta, 0, \frac{2}{3}; g \right] = \frac{g'''(c)}{6}.

We have g'''(t) = (b-a)^3 f'''((1-t)a + tb) and

    \int_0^1 g(t) \, dt = \frac{1}{4} \left( g(0) + 3 g\!\left(\frac{2}{3}\right) \right) + \frac{g'''(c)}{216}
      = \frac{1}{4} \left( f(a) + 3 f\!\left(\frac{a+2b}{3}\right) \right) + \frac{(b-a)^3 f'''(\tilde{c})}{216}.

If we make the change of variable x = (1-t)a + tb, we get

    \int_a^b f(x) \, dx = \frac{b-a}{4} \left( f(a) + 3 f\!\left(\frac{a+2b}{3}\right) \right) + \frac{(b-a)^4 f'''(\tilde{c})}{216}.

TEST ANSWER ML.7.1.13 Let x_k be the points given by x_k = a + k \frac{b-a}{n}, k = 0, ..., n. Then

    \int_a^b f(x) \, dx = \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} f(x) \, dx.

But, by Test ML.7.1.12 applied on [x_{k-1}, x_k],

    \int_{x_{k-1}}^{x_k} f(x) \, dx = \frac{b-a}{4n} \left( f(x_{k-1}) + 3 f\!\left( \frac{x_{k-1} + 2x_k}{3} \right) \right) + \frac{(b-a)^4}{216 n^4} f'''(c_{k,n}),   (7.2.6)

with c_{k,n} \in [x_{k-1}, x_k], k = 1, ..., n. Since

    \frac{b-a}{n} \sum_{k=1}^{n} f'''(c_{k,n})

is a Riemann sum for the function f''' relative to the division \Delta_n : a = x_0 < x_1 < \cdots < x_n = b, we get from (7.2.6)

    \lim_{n \to \infty} n^3 R_n(f) = \frac{(b-a)^3}{216} \lim_{n \to \infty} \frac{b-a}{n} \sum_{k=1}^{n} f'''(c_{k,n}) \cdot \frac{1}{b-a} \cdot (b-a)
      = \frac{(b-a)^3}{216} \cdot \frac{1}{b-a} \int_a^b f'''(x) \, dx \cdot (b-a) \cdot \frac{1}{b-a}
      = \frac{(b-a)^3}{216} \left( f''(b) - f''(a) \right).

TEST ANSWER ML.7.1.14 The coefficients A_k, k = 0, ..., n-1, are positive. For P \in \Pi^+_{2n}, we have

    \int_{-1}^{1} \frac{P(x)}{\sqrt{1-x^2}} \, dx \ge R_n(P).

Equality in the above relation takes place if and only if P(\cos \frac{(2k+1)\pi}{2n}) = 0 for k = 0, ..., n-1; since P \in \Pi^+_{2n}, this happens for P(x) = \tilde{T}_n^2(x), where \tilde{T}_n is the Chebyshev polynomial relative to the weight w(x) = 1/\sqrt{1-x^2}, having leading coefficient 1, i.e.,

    \tilde{T}_n(x) = \frac{1}{2^{n-1}} \cos(n \arccos x).

On the other hand, we have R_n(P) = R_n(x^{2n}) for any P \in \Pi^+_{2n}, because the formula is exact for degrees up to 2n-1. Therefore,

    R_n(P) = R_n(x^{2n}) = R_n(\tilde{T}_n^2).

From the Gaussian quadrature formula (the nodes are the zeros of \tilde{T}_n, so the quadrature sum vanishes), we get

    R_n(\tilde{T}_n^2) = \int_{-1}^{1} \frac{\tilde{T}_n^2(x)}{\sqrt{1-x^2}} \, dx = \frac{1}{2^{2n-2}} \int_{-1}^{1} \frac{\cos^2(n \arccos x)}{\sqrt{1-x^2}} \, dx.

By making the change of variable x = \cos t in the last integral, we get

    R_n(\tilde{T}_n^2) = \frac{1}{2^{2n-2}} \int_0^{\pi} \cos^2(nt) \, dt = \frac{\pi}{2^{2n-1}}.

So far, we have proved that R_n(P) = \pi/2^{2n-1} for any P \in \Pi^+_{2n}. Therefore,

    \int_{-1}^{1} \frac{P(x)}{\sqrt{1-x^2}} \, dx \ge \frac{\pi}{2^{2n-1}}.

The last inequality together with the identity

    \int_{-1}^{1} \frac{\tilde{T}_n^2(x)}{\sqrt{1-x^2}} \, dx = \frac{\pi}{2^{2n-1}}

concludes our proof.

TEST ANSWER ML.7.1.15 If the quadrature formula is of interpolation type, having the nodes 0, x_1, x_2, then its degree of exactness is at least 2. Let P be a polynomial of degree \ge 3. Then

    P(x) - L_2(P; 0, x_1, x_2)(x) = x (x - x_1)(x - x_2) Q(x),   (7.2.7)

where Q is a polynomial with degree(Q) = degree(P) - 3. From (7.2.7), we get

    R(P) = \int_0^1 x (x - x_1)(x - x_2) Q(x) \, dx.

If R(P) = 0 whenever the degree of Q is at most 1, then x_1 and x_2 are the roots of the polynomial of degree 2 orthogonal, with respect to the weight function w(x) = x, to all polynomials of degree at most 1. This polynomial is

    l(x) = x^2 - \frac{6}{5} x + \frac{3}{10},

and the nodes x_1 and x_2 are its roots:

    x_1 = \frac{6 - \sqrt{6}}{10},   x_2 = \frac{6 + \sqrt{6}}{10}.

The coefficients B, C are given by

    B = \frac{1}{x_1 (x_2 - x_1)} \int_0^1 x (x_2 - x) \, dx = \frac{16 + \sqrt{6}}{36},
    C = \frac{1}{x_2 (x_2 - x_1)} \int_0^1 x (x - x_1) \, dx = \frac{16 - \sqrt{6}}{36}.

Since the quadrature formula has degree of exactness at least 2, we must have R(l) = 0. Since

    R(l) = \int_0^1 l(x) \, dx - A \, l(0),

it follows that

    A = \frac{\int_0^1 l(x) \, dx}{l(0)} = \frac{1/30}{3/10} = \frac{1}{9}.

The quadrature formula now reads

    \int_0^1 f(x) \, dx = \frac{1}{9} f(0) + \frac{16 + \sqrt{6}}{36} f\!\left( \frac{6 - \sqrt{6}}{10} \right) + \frac{16 - \sqrt{6}}{36} f\!\left( \frac{6 + \sqrt{6}}{10} \right) + R(f)   (7.2.8)

and has degree of exactness 4.

Remark. The quadrature formula (7.2.8) is known as Bouzita's quadrature formula.

TEST ANSWER ML.7.1.16 The following calculations hold:

    R(e_0) = 0 \iff 2c_1 = 1,
    R(e_1) = 0 \iff 2c_1 = 1,
    R(e_2) = 0 \iff c_1 \left( \frac{1}{2} + 2x_1^2 \right) - 4x_1 c_2 = \frac{1}{3},
    R(e_3) = 0 \iff c_1 \left( \frac{1}{4} + 3x_1^2 \right) - 6x_1 c_2 = \frac{1}{4}.

We get

    c_1 = \frac{1}{2},   x_1^2 - 4x_1 c_2 = \frac{1}{12}.

Writing the condition R(e_4) = 0 and eliminating c_2 by means of the last relation, x_1 is found from the resulting equation, which has the unique solution

    x_1 = \frac{\sqrt{33}}{110} \in \left( 0, \frac{1}{2} \right),

and then

    c_2 = \frac{x_1^2 - 1/12}{4 x_1}.

The quadrature formula has degree of exactness 4 and reads

    \int_0^1 f(x) \, dx = \frac{1}{2} \left( f\!\left(\frac{1}{2} - x_1\right) + f\!\left(\frac{1}{2} + x_1\right) \right) + c_2 \left( f'\!\left(\frac{1}{2} - x_1\right) - f'\!\left(\frac{1}{2} + x_1\right) \right) + R(f),

with x_1 = \sqrt{33}/110.

TEST ANSWER ML.7.1.17 Let us consider the Gaussian formula of the form

    \int_0^1 f(x) \, dx = \sum_{k=1}^{n} A_k f(x_k) + R(f),

where x_k, k = 1, ..., n, are the roots of the (monic) Legendre polynomial on [0, 1],

    l_n(x) = \frac{n!}{(2n)!} \frac{d^n}{dx^n} \left[ x^n (x-1)^n \right].

The quadrature formula above is exact for any polynomial of degree \le 2n-1, and the coefficients A_k are positive, so

    \int_0^1 P_n^2(x) \, dx \ge R(P_n^2).

Using the fact that R(\cdot) is a linear functional, we obtain

    R(P_n^2) = R(x^{2n}) + R(P_n^2(x) - x^{2n}).

Since P_n^2(x) - x^{2n} is a polynomial of degree at most 2n-1, it follows that R(P_n^2(x) - x^{2n}) = 0, and therefore R(P_n^2) = R(x^{2n}) = R(l_n^2). Since the nodes are the roots of l_n,

    R(l_n^2) = \int_0^1 l_n^2(x) \, dx = \frac{n!}{(2n)!} \int_0^1 l_n(x) \frac{d^n}{dx^n} \left[ x^n (x-1)^n \right] dx.

Performing n integrations by parts (the boundary terms vanish because x^n(x-1)^n has zeros of order n at 0 and 1) and using l_n^{(n)} = n!, this gives

    R(l_n^2) = \frac{(n!)^2}{(2n)!} \int_0^1 x^n (1-x)^n \, dx.

Since

    \int_0^1 x^n (1-x)^n \, dx = \beta(n+1, n+1) = \frac{\Gamma^2(n+1)}{\Gamma(2n+2)} = \frac{(n!)^2}{(2n+1)!},

we obtain

    R(l_n^2) = \frac{(n!)^4}{[(2n)!]^2 (2n+1)},

and the inequality is proved. Equality holds if and only if \sum_{k=1}^{n} A_k P_n^2(x_k) = 0. Since the coefficients A_k are positive, it follows that we must have P_n(x_k) = 0, k = 1, ..., n. It follows that equality holds if and only if P_n = l_n.
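A quick numerical check of the formula obtained in Test ML.7.1.15 (ours, not from the book): the degree of exactness is read off by comparing the quadrature sum with the exact moments \int_0^1 x^k dx = 1/(k+1).

    s = 6 ** 0.5
    nodes = [0.0, (6 - s) / 10, (6 + s) / 10]
    weights = [1 / 9, (16 + s) / 36, (16 - s) / 36]

    for k in range(7):
        exact = 1 / (k + 1)
        approx = sum(w * x ** k for w, x in zip(weights, nodes))
        print(k, abs(exact - approx) < 1e-13)   # True for k = 0, ..., 4; False from k = 5 on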
Bibliography

[Ara57] Oleg Arama. Some properties concerning the sequence of polynomials of S. N. Bernstein (in Romanian). Studii si Cerc. (Cluj), 8(3–4):195–210, 1957.
[Bak76] N. Bakhvalov. Méthodes Numériques. Édition Mir, Moscou, 1976.
[BDL83] R. E. Barnhill, R. P. Dube, and F. F. Little. Properties of Shepard's surfaces. Rocky Mountain Journal of Mathematics, 13(2):365–382, 1983.
[BF93] Richard L. Burden and J. Douglas Faires. Numerical Analysis. PWS-KENT Publishing Company, 1993.
[Blu72] E. K. Blum. Numerical analysis and computation theory and practice. Addison-Wesley, Reading, Mass.–London–Don Mills, Ont., 1972.
[Boo60] G. Boole. A treatise of the calculus of finite differences. Macmillan, Cambridge, London, 1860.
[Bre83] Claude Brezinski. Outlines of Padé approximation. In H. Werner et al., editors, Computational Aspects of Complex Analysis, pages 1–50. Reidel, Dordrecht, 1983.
[But87] J. C. Butcher. The numerical analysis of ordinary differential equations. Runge-Kutta and general linear methods. John Wiley & Sons, Chichester, 1987.
[BVI95] C. Brezinski and J. Van Iseghem. A taste of Padé approximation. In Acta Numerica 1995, pages 53–103. Cambridge Univ. Press, Cambridge, 1995.
[dB78] Carl de Boor. A practical guide to splines. Springer-Verlag, New York–Heidelberg–Berlin, 1978.
[DB03a] Germund Dahlquist and Åke Björck. Numerical methods. Dover Publications, Mineola, NY, 2003. Translated from the Swedish by Ned Anderson; reprint of the 1974 English translation.
[dB03b] Carl de Boor. A divided difference expansion of a divided difference. J. Approx. Theory, 122(1):10–12, 2003.
[dB05] Carl de Boor. Divided differences. Surveys in Approximation Theory, 1:46–69, 2005.
[dC59] Paul de Faget de Casteljau. Outilages méthodes calcul. Technical report, A. Citroën, Paris, 1959.
[DL93] Ronald A. DeVore and George G. Lorentz. Constructive approximation. Springer-Verlag, Berlin–Heidelberg–New York, 1993.
[DM72] W. S. Dorn and D. D. McCracken. Numerical Methods with FORTRAN IV Case Studies. John Wiley & Sons, New York, 1972.
[Far90] G. Farin. Curves and Surfaces for Computer Aided Geometric Design – A Practical Guide. Academic Press, Boston, 1990.
[Gav01] Ioan Gavrea. Aproximarea functiilor prin operatori liniari. Mediamira, Cluj-Napoca, 2001.
[Gon95] H. H. Gonska. Shepard Methoden. In Hauptseminar CAGD 1995, pages 1–34, Duisburg, 1995. Gerhard Mercator Universität.
[GZ93] H. H. Gonska and X. Zhou. Polynomial approximation with side conditions: recent results and open problems. In Proc. of the First International Colloquium on Numerical Analysis, Plovdiv 1992, pages 61–71. VSP International Science Publishers, Zeist, 1993.
[Her78] Charles Hermite. Sur la formule d'interpolation de Lagrange. J. Reine Angew. Math., 84:70–79, 1878.
[HL89] J. Hoschek and D. Lasser. Fundamentals of Computer Aided Geometric Design. A K Peters, 1989.
[IP92] Mircea Ivan and Kálmán Pusztai. Mathematics by Computer. Comprex Publishing House, Cluj-Napoca, 1992.
[IP03] Mircea Ivan and Kálmán Pusztai. Numerical Methods with Mathematica. Mediamira, Cluj-Napoca, 2003.
[Iva73] Mircea Ivan. Mean value theorems in mathematical analysis (Romanian). Master's thesis, Babes-Bolyai University, Cluj, 1973.
[Iva82] Mircea Ivan. Interpolation Methods and Applications. PhD thesis, Babes-Bolyai University, Cluj-Napoca, 1982.
[Iva84] Mircea Ivan. On some Bernstein-Bézier type functions. In Itinerant Seminar on Functional Equations, Approximation and Convexity, pages 81–84, Cluj-Napoca, 1984. Babeş-Bolyai University Press.
[Iva98a] Mircea Ivan. A proof of Leibniz's formula for divided differences. In Proceedings of the third Romanian-German Seminar on Approximation Theory (RoGer-98), pages 15–18, Sibiu, June 1–3, 1998.
[Iva02] Mircea Ivan. Calculus, volume 1. Editura Mediamira, Cluj-Napoca, 2002.
[Iva04] Mircea Ivan. Elements of Interpolation Theory. Mediamira Science Publisher, Cluj-Napoca, 2004.
[Iva05] Mircea Ivan. Numerical Analysis with Mathematica. Mediamira Science Publisher, Cluj-Napoca, 2005. ISBN 973-713-051-0, 252 p.
[Kag03] Yasuo Kageyama. A note on zeros of the Lagrange interpolation polynomial of the function 1/(z − c). Transactions of the Japan Society for Industrial and Applied Mathematics, 13:391–402, 2003.
[Kar53] S. Karlin. Total Positivity. Stanford University Press, 1953.
[KM84] N. V. Kopchenova and I. A. Maron. Computational Mathematics. Mir Publishers, Moscow, 1984.
[Knu86] Donald E. Knuth. The METAFONTbook. Addison-Wesley, Reading, Massachusetts, 1986.
[Kor60] P. P. Korovkin. Linear operators and approximation theory. Translated from the Russian ed. (1959). Gordon and Breach Publishers, New York, 1960.
[Loc82] F. Locher. Numerische Mathematik für Informatiker. Springer-Verlag, Berlin–Heidelberg–New York, 1982.
[Lor53] G. G. Lorentz. Bernstein polynomials. Mathematical Expositions, no. 8. University of Toronto Press, Toronto, 1953.
[Mei02] Erik Meijering. A chronology of interpolation: from ancient astronomy to modern signal and image processing. Proceedings of the IEEE, 90(3):319–342, March 2002.
[MŞ81] G. I. Marciuk and V. V. Şaidurov. Creşterea preciziei soluţiilor în scheme cu diferenţe. Editura Academiei RSR, Bucureşti, 1981.
[New87] Isaac Newton. Philosophiae Naturalis Principia Mathematica. Printed by Joseph Streater by order of the Royal Society, London, 1687.
[NPI+57] Miron Nicolescu, Gh. Pic, D. V. Ionescu, E. Gergely, L. Németi, L. Bal, and F. Radó. The mathematical activity of Professor Tiberiu Popoviciu. Studii şi cerc. de matematică (Cluj), 8(1–2):7–19, 1957.
[Nue89] Guenther Nuernberger. Approximation by spline functions. Springer, Berlin, 1989.
[OR03] John O'Connor and Edmund F. Robertson. MacTutor History of Mathematics Archive. School of Mathematics and Statistics, University of St Andrews, Scotland, 2003.
[Pal03] Radu Paltanea. Approximation by Linear Positive Operators: Estimates with Second Order Moduli. Editura Universităţii Transilvania, Braşov, 2003.
[Pop33] Tiberiu Popoviciu. Sur quelques propriétés des fonctions d'une ou de deux variables réelles. PhD thesis, La Faculté des Sciences de Paris, June 1933.
[Pop34] Tiberiu Popoviciu. Sur quelques propriétés des fonctions d'une ou de deux variables réelles. Mathematica (Cluj), VIII:1–85, 1934.
[Pop37] Tiberiu Popoviciu. Despre cea mai bună aproximaţie a funcţiilor continue prin polinoame. Cinci lecţii ţinute la Facultatea de Ştiinţe din Cluj în anul şcolar 1933–34 (Romanian). Inst. Arte Grafice Ardealul, Cluj, 1937.
[Pop40] Tiberiu Popoviciu. Introduction à la théorie des différences divisées. Bull. Math. Soc. Roumaine Sci., 42(1):65–78, 1940.
[Pop42] Tiberiu Popoviciu. Sur l'approximation des fonctions continues d'une variable réelle par des polynômes. Ann. Sci. Univ. Jassy, Sect. I, Math., 28:208, 1942.
[Pop59] Tiberiu Popoviciu. Sur le reste dans certaines formules linéaires d'approximation de l'analyse. Mathematica (Cluj), 1(24):95–142, 1959.
[Pop72] Elena Popoviciu. Mean Value Theorems and their Connection to the Interpolation Theory (Romanian). Editura Dacia, Cluj, 1972.
[Pop75] Tiberiu Popoviciu. Numerical Analysis. An Introduction to Approximate Calculus (Romanian). Editura Academiei Republicii Socialiste România, Bucharest, 1975.
[Pow81] M. J. D. Powell. Approximation theory and methods. Cambridge University Press, Cambridge, 1981.
[PT95] L. Piegl and W. Tiller. The NURBS Book. Springer-Verlag, Berlin–Heidelberg–New York, 1995.
[PTTF92] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C. Cambridge University Press, 1992.
[RV98] I. Raşa and T. Vladislav. Analiză Numerică. Editura Tehnică, Bucureşti, 1998.
[SB61] M. G. Salvadori and M. L. Baron. Numerical Methods in Engineering. Prentice-Hall, Englewood Cliffs, New Jersey, 1961.
[Şe83] I. Gh. Şabac et al. Matematici speciale, vol. II. Editura Didactică şi Pedagogică, Bucureşti, 1983.
[Ste39] J. F. Steffensen. Note on divided differences. Danske Vid. Selsk. Math.-Fys. Medd., 17(3):1–12, 1939.
[VIS00] Ioan-Adrian Viorel, Dumitru Mircea Ivan, and Loránd Szabó. Metode numerice cu aplicaţii în ingineria electrică. Editura Universităţii din Oradea, Oradea, 2000. ISBN 973-8083-29-X, 210 p.
[Vol90] E. A. Volkov. Métodos numéricos. Editorial Mir, Moscow, 1990. Translated from the Russian by K. P. Medkov.