2. OPTIMISATION & LAGRANGE METHOD
We already know how to find maxima/minima of functions of one variable; we will now look at the analogous problem for functions of two (or more) variables. We usually call maxima/minima critical points for functions of one variable, and stationary points for functions of two. We start by reviewing critical points:

(2.1) CRITICAL POINTS (Functions of 1 variable)

Given a function (of one variable) $f(x)$, we can find its critical points (i.e. local maxima, local minima, points of inflection etc.) by solving the equation

$$\frac{df}{dx} = 0. \tag{$2\cdot1$}$$

To classify this (or these) critical point(s), we simply calculate $f''(x)$ and if

a) $f''(x) < 0$ then $x$ is a Maximum,
b) $f''(x) > 0$ then $x$ is a Minimum,
c) $f''(x) = 0$ then the test is 'Inconclusive'. $\quad(2\cdot2)$

We can also find and classify stationary points for functions of two variables:

(2.2) STATIONARY POINTS (Functions of 2 variables)

Given a function (of two variables) $f(x, y)$, we can find its stationary points (i.e. local maxima, local minima, saddle points, etc.) by solving the equations

$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0. \tag{$2\cdot3$}$$

To classify this (or these) stationary point(s), we calculate $f_{xx}f_{yy} - f_{xy}^2$. If

a) $f_{xx}f_{yy} - f_{xy}^2 < 0$ then $(x, y)$ is a Saddle point,
b) $f_{xx}f_{yy} - f_{xy}^2 > 0$ then $(x, y)$ is a Max./Min. point,
c) $f_{xx}f_{yy} - f_{xy}^2 = 0$ then the test is 'Inconclusive'. $\quad(2\cdot4)$

In case a) we immediately conclude that the stationary point is a saddle point. As the name suggests, a saddle point behaves like a maximum in some directions and like a minimum in others: it looks like a maximum from the 'back' of the saddle, where it sits on the horse's back, and a minimum from the 'side' of the saddle, at the arch of the horse's back.

In case b) we need to test whether the stationary point is a max. or min. If

i) $f_{xx}, f_{yy} < 0$ then $(x, y)$ is a Maximum,
ii) $f_{xx}, f_{yy} > 0$ then $(x, y)$ is a Minimum. $\quad(2\cdot5)$

In case c) the test is inconclusive, and we have no choice but to admit defeat.
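The classification rules (2·4) and (2·5) can be sketched as a small helper. This is only an illustrative sketch: the function name `classify_stationary_point` is an assumption (it is not part of these notes), and the caller is assumed to supply the three second derivatives already evaluated at the stationary point.

```python
def classify_stationary_point(fxx, fyy, fxy):
    """Classify a stationary point from its second partial derivatives.

    fxx, fyy, fxy are the values of the second derivatives evaluated
    at the stationary point (hypothetical helper, for illustration only).
    """
    disc = fxx * fyy - fxy ** 2
    if disc < 0:
        return "saddle"          # case a)
    if disc == 0:
        return "inconclusive"    # case c)
    # case b): disc > 0 forces fxx and fyy to share the same non-zero sign
    return "minimum" if fxx > 0 else "maximum"


# f(x, y) = x^2 + y^2 at (0, 0): fxx = 2, fyy = 2, fxy = 0
print(classify_stationary_point(2, 2, 0))   # minimum
# f(x, y) = x*y at (0, 0): fxx = 0, fyy = 0, fxy = 1
print(classify_stationary_point(0, 0, 1))   # saddle
```

Only the sign of $f_{xx}$ is tested in case b), because a positive value of $f_{xx}f_{yy} - f_{xy}^2$ already guarantees that $f_{yy}$ has the same sign.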
Note: We will prove each of the above results later on, using the Chain Rule. In the meantime, however, we will get our feet wet with some basic examples.

Example 1

Find and classify every stationary point of the following function:

$$f(x, y) = 2x^3 - 2y^3 - 3ax^2 + 3by^2 + 100,$$

and evaluate the function thereat (assuming that both $a$ and $b$ are positive).

First of all we have to find the stationary points by letting $f_x = 0$ and $f_y = 0$:

$$\begin{aligned}
f_x &= 2(3x^2) - 3a(2x), & f_y &= -2(3y^2) + 3b(2y),\\
&= 6(x^2 - ax), & &= -6(y^2 - by),\\
&= 6x(x - a), & &= -6y(y - b).
\end{aligned}$$

We find the stationary points by solving the resulting simultaneous equations:

$$x(x - a) = 0, \quad (1) \qquad y(y - b) = 0. \quad (2)$$

We find that there are four stationary points: $(0, 0)$, $(0, b)$, $(a, 0)$ and $(a, b)$.

Now we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$,

$$f_{xx} = 12x - 6a, \qquad f_{yy} = -12y + 6b, \qquad f_{xy} = 0,$$

we can then calculate $f_{xx}f_{yy} - f_{xy}^2$, and in turn classify these stationary points:

$$\begin{aligned}
f_{xx}f_{yy} - f_{xy}^2 &= (12x - 6a)(-12y + 6b) - (0)^2,\\
&= 6^2(2x - a)(-2y + b) - 0,\\
&= -36(2x - a)(2y - b).
\end{aligned}$$

Unfortunately, we have to check each of the four stationary points individually:

For the point $(0, 0)$, with $f(0, 0) = 100$:
$f_{xx}f_{yy} - f_{xy}^2 = -36ab$, so $(0, 0)$ is a 'Saddle point'.

For the point $(a, 0)$, with $f(a, 0) = 100 - a^3$:
$f_{xx}f_{yy} - f_{xy}^2 = +36ab$, so $(a, 0)$ is a 'Max/Minimum';
$f_{xx} = +6a$, $f_{yy} = +6b$, so $(a, 0)$ is a 'Minimum'.

For the point $(0, b)$, with $f(0, b) = 100 + b^3$:
$f_{xx}f_{yy} - f_{xy}^2 = +36ab$, so $(0, b)$ is a 'Max/Minimum';
$f_{xx} = -6a$, $f_{yy} = -6b$, so $(0, b)$ is a 'Maximum'.

For the point $(a, b)$, with $f(a, b) = 100 - a^3 + b^3$:
$f_{xx}f_{yy} - f_{xy}^2 = -36ab$, so $(a, b)$ is a 'Saddle point'.

Note(1): Finding some of the stationary points is almost always very easy, but being sure we have found each and every one of them can be pretty tricky.

Note(2): Although we usually refer to stationary points as maxima, minima, etc. (and we will continue to do so), it is (strictly speaking) more accurate to refer to them as the points at which the function is maximised, minimised, etc.

Example 2

Find and classify every stationary point of the following function:

$$f(x, y) = x^3 + 3xy^2 - 6xy + 1.$$

First of all we have to find the stationary points by letting $f_x = 0$ and $f_y = 0$:

$$\begin{aligned}
f_x &= 3x^2 + 3(1)y^2 - 6(1)y + 0, & f_y &= 0 + 3x(2y) - 6x(1) + 0,\\
&= 3x^2 + 3y^2 - 6y, & &= 6xy - 6x,\\
&= 3(x^2 + y^2 - 2y), & &= 6x(y - 1).
\end{aligned}$$

However, we have a snag. Solving the resulting set of simultaneous equations,

$$x^2 + y^2 - 2y = 0, \quad (1) \qquad x(y - 1) = 0, \quad (2)$$

seems much more difficult now, since equation (1) cannot be easily factorised, although (2) can. We can get around this problem by substituting the (easily found) solutions of the simplest equation into the most complicated equation. The solutions of the simplest equation, $x(y - 1) = 0$, are $x = 0$ and $y = 1$:

$$\begin{aligned}
\text{For } x = 0:\quad (0)^2 + y^2 - 2y &= 0, & \text{For } y = 1:\quad x^2 + (1)^2 - 2(1) &= 0,\\
y(y - 2) &= 0, & (x - 1)(x + 1) &= 0,\\
y &= 0,\ 2. & x &= 1,\ -1.
\end{aligned}$$

And so we end up with four stationary points: $(0, 0)$, $(0, 2)$, $(1, 1)$ and $(-1, 1)$.

Note: These kinds of coupled/simultaneous equations can potentially be very tricky. We are used to straightforward linear coupled equations, of the form

$$ax + by = A, \qquad cx + dy = D,$$

which are trivially solved (by Gauss elimination, for example), but solving these less straightforward simultaneous equations is (generally) more involved. Classifying the corresponding stationary points, however, is always very easy.

Now we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$,

$$f_{xx} = 6x, \qquad f_{yy} = 6x, \qquad f_{xy} = 6(y - 1),$$

we can then calculate $f_{xx}f_{yy} - f_{xy}^2$, and in turn classify these stationary points:

$$f_{xx}f_{yy} - f_{xy}^2 = 36\left[x^2 - (y - 1)^2\right].$$

As before, we have to check each of these four stationary points individually:

For the point $(0, 0)$, with $f(0, 0) = +1$:
$f_{xx}f_{yy} - f_{xy}^2 = -36$, so $(0, 0)$ is a 'Saddle point'.
For the point $(0, 2)$, with $f(0, 2) = +1$:
$f_{xx}f_{yy} - f_{xy}^2 = -36$, so $(0, 2)$ is a 'Saddle point'.

For the point $(+1, 1)$, with $f(+1, 1) = -1$:
$f_{xx}f_{yy} - f_{xy}^2 = +36$, so $(1, 1)$ is a 'Max/Minimum';
$f_{xx} = +6$, $f_{yy} = +6$, so $(1, 1)$ is a 'Minimum'.

For the point $(-1, 1)$, with $f(-1, 1) = +3$:
$f_{xx}f_{yy} - f_{xy}^2 = +36$, so $(-1, 1)$ is a 'Max/Minimum';
$f_{xx} = -6$, $f_{yy} = -6$, so $(-1, 1)$ is a 'Maximum'.

Example 3

Show, in the usual way, that the function (of two variables) below,

$$f(x, y) = x^3 + y^3 - 2x^2 - 2y^2 + 3xy,$$

has stationary points at $(0, 0)$ and $(1/3, 1/3)$, and investigate the nature of these points.

First of all we have to find the stationary points. Letting $f_x = 0$ and $f_y = 0$:

$$\begin{aligned}
f_x &= 3x^2 + 0 - 2(2x) - 0 + 3(1)y, & f_y &= 0 + 3y^2 - 0 - 2(2y) + 3x(1),\\
&= 3x^2 - 4x + 3y, & &= 3y^2 - 4y + 3x,\\
&= x(3x - 4) + 3y, & &= y(3y - 4) + 3x,
\end{aligned}$$

which results in the (even more) difficult set of simultaneous equations below:

$$x(3x - 4) + 3y = 0, \quad (1) \qquad y(3y - 4) + 3x = 0. \quad (2)$$

We need to exploit the symmetry in the above equations in order to solve them.

Important Note: An important property of symmetric functions is that the derivatives $f_x$ and $f_y$ also exhibit symmetry, as we saw in the previous section on Partial Differentiation. In this case, these equations have the symmetry

$$y = x, \tag{$2\cdot6$}$$

since swapping $x$ for $y$, and $y$ for $x$, in either equation gives us the other one! This being the case, we can replace $y$ with $x$ in both of the original equations:

$$\left.\begin{aligned} x(3x - 4) + 3(x) &= 0,\\ (x)(3(x) - 4) + 3x &= 0, \end{aligned}\right\} \quad\Rightarrow\quad x(3x - 1) = 0,$$

and we end up with the pair of solutions $(x, y) = (0, 0)$ and $(x, y) = (1/3, 1/3)$, since if $x = 0$ or $x = 1/3$, then $y = 0$ or $y = 1/3$, respectively.

Strictly speaking, symmetry alone only tells us that solutions come in pairs $(x, y)$ and $(y, x)$, so we should check that no real solutions lie off the line $y = x$. Subtracting (2) from (1) gives $(x - y)\left(3(x + y) - 7\right) = 0$, so either $y = x$ or $x + y = 7/3$. Substituting $y = 7/3 - x$ into (1) gives $3x^2 - 7x + 7 = 0$, which has no real roots, and so every real solution of the symmetrical equations (1) and (2) is indeed a solution of $y = x$.
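As a quick check on the solution step of Example 3, the sketch below evaluates $f_x$ and $f_y$ exactly at the two candidate points, and computes the discriminant of the quadratic $3x^2 - 7x + 7 = 0$ that arises on the other branch $x + y = 7/3$ (obtained by subtracting equation (2) from equation (1)). The helper name `grad` is illustrative only.

```python
from fractions import Fraction


def grad(x, y):
    """Return (fx, fy) for f = x^3 + y^3 - 2x^2 - 2y^2 + 3xy."""
    fx = 3 * x**2 - 4 * x + 3 * y
    fy = 3 * y**2 - 4 * y + 3 * x
    return fx, fy


third = Fraction(1, 3)
candidates = [(Fraction(0), Fraction(0)), (third, third)]

# Both components should vanish at a stationary point.
gradients = [grad(x, y) for x, y in candidates]
print(gradients)

# Branch x + y = 7/3: equation (1) becomes 3x^2 - 7x + 7 = 0.
# A negative discriminant means this branch has no real solutions.
other_branch_discriminant = (-7) ** 2 - 4 * 3 * 7
print(other_branch_discriminant)
```

Exact fractions are used rather than floating point so that the gradient at $(1/3, 1/3)$ comes out as exactly zero.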
Now we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$,

$$f_{xx} = 6x - 4, \qquad f_{yy} = 6y - 4, \qquad f_{xy} = 3,$$

we can then calculate $f_{xx}f_{yy} - f_{xy}^2$, and in turn classify these stationary points:

$$\begin{aligned}
f_{xx}f_{yy} - f_{xy}^2 &= (6x - 4)(6y - 4) - (3)^2,\\
&= 2^2(3x - 2)(3y - 2) - 9,\\
&= 4(3x - 2)(3y - 2) - 9.
\end{aligned}$$

And once again, we have to check both of these stationary points individually:

For the point $(0, 0)$, with $f(0, 0) = 0$:
$f_{xx}f_{yy} - f_{xy}^2 = +7$, so $(0, 0)$ is a 'Max/Minimum';
$f_{xx} = -4$, $f_{yy} = -4$, so $(0, 0)$ is a 'Maximum'.

For the point $(1/3, 1/3)$, with $f(1/3, 1/3) = -1/27$:
$f_{xx}f_{yy} - f_{xy}^2 = -5$, so $(1/3, 1/3)$ is a 'Saddle point'.

Example 4 (Area & Volume)

An open rectangular five-sided box has a volume of $4\,\mathrm{m}^3$ (four cubic metres). Prove that the dimensions of the box are $2\,\mathrm{m} \times 2\,\mathrm{m} \times 1\,\mathrm{m}$, if the external surface area of the box is at a minimum.

If the box is open at the top, and its dimensions are $x$ = length, $y$ = breadth and $z$ = height, then the Area $A$ and the Volume $V$ of the box can be written:

$$\text{Area: } A(x, y, z) = 2(x + y)z + xy, \qquad \text{Volume: } V(x, y, z) = xyz = 4.$$

We cannot minimise the area as it stands, since it is a function of three variables. However, we can express $A(x, y, z)$ as a function of just two variables by eliminating, say, the variable $z$ using (what is called a constraint) $4 = xyz$:

$$z = \frac{4}{xy}.$$

We can hence write the expression for the area of the box in terms of $x$ and $y$:

$$A(x, y) = 8(x + y)\frac{1}{xy} + xy.$$

So now, to minimise the surface area $A(x, y)$, we need only solve the equations

$$\frac{\partial A}{\partial x} = 0 \quad\text{and}\quad \frac{\partial A}{\partial y} = 0.$$

In fact, as we will see later on when we look at the Lagrange Multiplier Method, there is another way of minimising the area, as a function of three variables, subject to the constraint (a term we will see a lot of) that the volume is fixed.
As usual, $A_x$ and $A_y$ are easily found using standard partial differentiation:

$$\begin{aligned}
\frac{\partial A}{\partial x} &= \frac{\partial}{\partial x}\left(8(x + y)x^{-1}y^{-1} + xy\right), & \frac{\partial A}{\partial y} &= \frac{\partial}{\partial y}\left(8(x + y)x^{-1}y^{-1} + xy\right),\\
&= \frac{\partial}{\partial x}\left(8x^{-1} + 8y^{-1} + xy\right), & &= \frac{\partial}{\partial y}\left(8x^{-1} + 8y^{-1} + xy\right),\\
&= -8x^{-2} + y, & &= -8y^{-2} + x,
\end{aligned}$$

and since the minimum area is found by solving both $A_x = 0$ and $A_y = 0$:

$$\begin{aligned}
-8x^{-2} + y &= 0, & -8y^{-2} + x &= 0,\\
-8 + x^2y &= 0, & -8 + xy^2 &= 0,\\
x^2y &= 8, & xy^2 &= 8.
\end{aligned}$$

As usual, we find the minimum by solving the resulting simultaneous equations:

$$x^2y = 8, \quad (1) \qquad xy^2 = 8. \quad (2)$$

These two equations (again) have the symmetry $y = x$, since (again) swapping $x$ for $y$ and $y$ for $x$ in either of the two equations gives us the other. Therefore:

$$\left.\begin{aligned} x^2(x) &= 8,\\ x(x)^2 &= 8, \end{aligned}\right\} \quad\Rightarrow\quad x^3 - 8 = 0,$$

with the obvious solution $x = 2$ and $y = 2$. (All other solutions for $x$ and $y$ are complex, and have no physical meaning in this context.) To find the corresponding value of $z$, we substitute $x = 2$, $y = 2$ into our expression for $z$:

$$z = \frac{4}{(2)(2)} = 1.$$

The dimensions of the open rectangular five-sided box with a volume of $4\,\mathrm{m}^3$ and a minimum external surface area are $2\,\mathrm{m} \times 2\,\mathrm{m} \times 1\,\mathrm{m}$, as expected. QED.

Example 5

Find and classify every stationary point of the following function:

$$f(x, y) = x^2 + y^2 + \frac{1}{x^2y^2}.$$

First of all we have to find the stationary points. Letting $f_x = 0$ and $f_y = 0$:

$$\begin{aligned}
f_x &= 2x + 0 + (-2x^{-3})y^{-2}, & f_y &= 0 + 2y + x^{-2}(-2y^{-3}),\\
&= 2x - 2x^{-3}y^{-2}, & &= 2y - 2x^{-2}y^{-3},\\
&= 2x^{-3}y^{-2}(x^4y^2 - 1), & &= 2x^{-2}y^{-3}(x^2y^4 - 1),
\end{aligned}$$

we end up with the following (extremely tricky) set of simultaneous equations:

$$x^4y^2 - 1 = 0, \quad (1) \qquad x^2y^4 - 1 = 0. \quad (2)$$

As with the previous question, we clearly need some kind of trick to solve these.

Important Note: These simultaneous equations have even more symmetry than the two in Example 3 (where we had $y = x$). Now we have the symmetry

$$y^2 = x^2, \tag{$2\cdot7$}$$

since swapping $x^2$ for $y^2$, or $y^2$ for $x^2$, in either of those equations automatically gives us the other.
By factorising the symmetry equation $y^2 = x^2$, we see that it is equivalent to $(y - x)(y + x) = 0$, giving us the symmetry $y = x$ and $y = -x$. We can once again exploit this symmetry, by making the substitution $y = \pm x$:

$$\left.\begin{aligned} x^4(\pm x)^2 - 1 &= 0,\\ x^2(\pm x)^4 - 1 &= 0, \end{aligned}\right\} \quad\Rightarrow\quad x^6 - 1 = 0,$$

with the obvious real solutions $x = 1$ and $x = -1$. Since the solutions must satisfy $y = \pm x$, the stationary points are $(x, y) = (1, 1)$, $(1, -1)$, $(-1, -1)$ and $(-1, 1)$.

Now we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$,

$$f_{xx} = 2 + 6x^{-4}y^{-2}, \qquad f_{yy} = 2 + 6x^{-2}y^{-4}, \qquad f_{xy} = 4x^{-3}y^{-3},$$

we can then calculate $f_{xx}f_{yy} - f_{xy}^2$, and in turn classify these stationary points:

$$\begin{aligned}
f_{xx}f_{yy} - f_{xy}^2 &= (2 + 6x^{-4}y^{-2})(2 + 6x^{-2}y^{-4}) - (4x^{-3}y^{-3})^2,\\
&= 4(1 + 3x^{-4}y^{-2})(1 + 3x^{-2}y^{-4}) - 16x^{-6}y^{-6}.
\end{aligned}$$

Thankfully, we can actually check all four stationary points at the same time. So, for each of these four points $(x, y) = (1, 1)$, $(1, -1)$, $(-1, 1)$ and $(-1, -1)$:

$f_{xx}f_{yy} - f_{xy}^2 = +48$, so all points are 'Max/Minima';
$f_{xx} = +8$, $f_{yy} = +8$, so all points are 'Minima'.

We have now seen a number of different types of optimisation problem, but we have not explained where the equations $(2\cdot1)$ to $(2\cdot5)$ came from. The following derivations of critical and stationary points are non-examinable:

(2.3) CRITICAL POINTS (Derivation)

Let us assume that critical points of $f(x)$ are found by solving the equation $f'(x) = 0$. We now derive the conditions we would use to classify these critical points. The Taylor series expansion of an arbitrary function (of just one variable) $f(x)$ a very small distance along the $x$ axis, away from a critical point $x_0$, is

$$\begin{aligned}
f(x_0 + h) &= f(x_0) + \tfrac{1}{1!}h^1 f'(x_0) + \tfrac{1}{2!}h^2 f''(x_0) + \cdots,\\
&= f(x_0) + h\{0\} + \tfrac{1}{2}h^2 f''(x_0) + \cdots,\\
&= f(x_0) + \tfrac{1}{2}h^2 f''(x_0) + \cdots,
\end{aligned}$$

where $h$ denotes this very small distance away from the critical point $x = x_0$.
Apart from taking into consideration that $f'(x_0) = 0$, since $x = x_0$ is a critical point, we can also ignore all terms of higher order (and thus smaller) than $h^2$.

For a function of just one variable $f(x)$, a Minimum is defined as shown below:

$$f(x_0 + h) > f(x_0) \quad\text{for all (small, non-zero) } h,$$

and therefore we can see that for $f''(x_0) > 0$, we have a minimum at $x = x_0$.

For a function of just one variable $f(x)$, a Maximum is defined as shown below:

$$f(x_0 + h) < f(x_0) \quad\text{for all (small, non-zero) } h,$$

and therefore we can see that for $f''(x_0) < 0$, we have a maximum at $x = x_0$.

(2.4) STATIONARY POINTS (Derivation)

Assume that stationary points of $f(x, y)$ are found by solving the equations

$$f_x(x, y) = 0 \quad\text{and}\quad f_y(x, y) = 0.$$

We now derive the conditions we would use to classify the stationary points. The Taylor series of a function (of two variables) $f(x, y)$, very small distances $h$ (along the $x$ axis) and $k$ (along the $y$ axis) away from a stationary point $(x_0, y_0)$, is:

$$\begin{aligned}
f(x_0 + h, y_0 + k) &= f(x_0, y_0) + \tfrac{1}{1!}\left[h f_x(x_0, y_0) + k f_y(x_0, y_0)\right]\\
&\quad + \tfrac{1}{2!}\left[h^2 f_{xx}(x_0, y_0) + 2hk f_{xy}(x_0, y_0) + k^2 f_{yy}(x_0, y_0)\right] + \cdots,\\
&= f(x_0, y_0) + \tfrac{1}{2}\left[h^2 f_{xx}(x_0, y_0) + 2hk f_{xy}(x_0, y_0) + k^2 f_{yy}(x_0, y_0)\right] + \cdots,
\end{aligned}$$

since $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$, given that $(x_0, y_0)$ is a stationary point. By 'completing the square', we can write $f(x_0 + h, y_0 + k)$ in two useful forms:

$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{1}{2f_{xx}}\left[(h f_{xx} + k f_{xy})^2 + k^2\left(f_{xx}f_{yy} - f_{xy}^2\right)\right], \quad (1)$$

$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{1}{2f_{yy}}\left[h^2\left(f_{xx}f_{yy} - f_{xy}^2\right) + (k f_{yy} + h f_{xy})^2\right], \quad (2)$$

where $f_{xx}$, $f_{yy}$ and $f_{xy}$ mean $f_{xx}(x_0, y_0)$, $f_{yy}(x_0, y_0)$ and $f_{xy}(x_0, y_0)$ respectively.

For a function of two variables $f(x, y)$, a Minimum is defined as shown below:

$$f(x_0 + h, y_0 + k) > f(x_0, y_0) \quad\text{for all (small) } h, k,$$

and so for $f_{xx}f_{yy} - f_{xy}^2 > 0$ and $f_{xx} > 0$, $f_{yy} > 0$, we have a minimum at $(x_0, y_0)$.
For a function of two variables $f(x, y)$, a Maximum is defined as shown below:

$$f(x_0 + h, y_0 + k) < f(x_0, y_0) \quad\text{for all (small) } h, k,$$

and so for $f_{xx}f_{yy} - f_{xy}^2 > 0$ and $f_{xx} < 0$, $f_{yy} < 0$, we have a maximum at $(x_0, y_0)$.

The above conditions for maxima and minima can be derived from either one of equations (1) and (2), since for $f_{xx}f_{yy} - f_{xy}^2 > 0$ we must have either that $f_{xx}$ and $f_{yy}$ are both positive, or that they are both negative. Similarly, the following condition for saddle points can be derived using either of the two (modified versions of) equations (1) and (2).

Finally, by converting to the polar coordinates $h = \varepsilon\cos\theta$, $k = \varepsilon\sin\theta$, where $\varepsilon$ is a very small radius, we can write $f(x_0 + h, y_0 + k)$ in the following forms:

$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{\varepsilon^2\sin^2\theta}{2f_{xx}}\left[(f_{xx}\cot\theta + f_{xy})^2 + \left(f_{xx}f_{yy} - f_{xy}^2\right)\right],$$

$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{\varepsilon^2\cos^2\theta}{2f_{yy}}\left[\left(f_{xx}f_{yy} - f_{xy}^2\right) + (f_{yy}\tan\theta + f_{xy})^2\right].$$

(In the first form the factor $\sin^2\theta$ has been taken outside the bracket, which leaves $\cot\theta$ inside it; in the second, $\cos^2\theta$ has been taken outside, which leaves $\tan\theta$.)

For a function of two variables $f(x, y)$, a Saddle Point is defined as follows:

$$f(x_0 + h, y_0 + k) \gtrless f(x_0, y_0) \quad\text{depending on } \theta,$$

(i.e. in some directions we have a maximum, in other directions a minimum), and so for $f_{xx}f_{yy} - f_{xy}^2 < 0$ we have a saddle point, i.e. for some angles $\theta$ we have maxima, for others minima, regardless of the values of $f_{xx}$, $f_{yy}$, $f_{xy}$. It may not be totally obvious why this is the case, but in light of the fact that

$$-\infty < \tan\theta < +\infty \quad\text{for}\quad -\pi/2 < \theta < +\pi/2$$

(and $\cot\theta$ likewise takes every real value), we can see that regardless of whether $f_{xx}$ or $f_{yy}$ are positive or negative, we can always find one angle that makes the squared term vanish, so that the bracket is negative, and another angle that makes the squared term large, so that the bracket is positive. These force, in turn, the conditions for minima or maxima:

$$f(x_0 + h, y_0 + k) > f(x_0, y_0) \quad\text{or}\quad f(x_0 + h, y_0 + k) < f(x_0, y_0).$$

(This can be a very difficult point to explain on paper, and so this derivation may fall short. If this is the case, please just ask me to talk ye through it.)

(2.5) CONSTRAINTS

So far, we have found the stationary points for functions where $x$ and $y$ were independent variables.
But what if $x$ and $y$ were instead dependent variables?

(a) Independent Variables

We find the stationary points of the function

$$f(x, y) = x^2 + y^2,$$

where $x$ and $y$ are the usual independent variables, by solving these equations:

$$\begin{aligned}
f_x &= 2x + 0, & f_y &= 0 + 2y,\\
&= 2x = 0, & &= 2y = 0.
\end{aligned}$$

This gives us a stationary point at $(x, y) = (0, 0)$ corresponding to a minimum.

(b) Dependent Variables

We find the stationary points of the function

$$f(x, y) = x^2 + y^2,$$

where $x$ and $y$ are now dependent variables, related by the straight line equation $3x + 4y = 25$ (more generally called a constraint), by eliminating one of the two variables, and finding the stationary points of the resulting function, as in Example 4. By eliminating $y$ using $y = (25 - 3x)/4$, we can re-write $f(x, y)$ as the function

$$f(x) = x^2 + \tfrac{1}{16}(25 - 3x)^2.$$

Therefore, to minimise this function (of one variable) we only need to solve

$$\begin{aligned}
f' = \frac{d}{dx}\left[x^2 + \tfrac{1}{16}(25 - 3x)^2\right] &= 0,\\
2x + \tfrac{1}{16}\,2(25 - 3x)(-3) &= 0,\\
\tfrac{1}{8}(25x - 75) &= 0.
\end{aligned}$$

So, the solution is $x = 3$, and since the constraint is $y = \tfrac{1}{4}(25 - 3x)$, $y = 4$. This gives us a stationary point at $(x, y) = (3, 4)$ corresponding to a minimum.

Note that the absolute minimum for the function $f(x, y) = x^2 + y^2$ at $(0, 0)$ is $f(0, 0) = 0$ when it is unconstrained, whereas if we impose the constraint $3x + 4y = 25$, we find that the absolute minimum for the function $f(x, y) = x^2 + y^2$ at $(3, 4)$ is $f(3, 4) = 25$. Clearly the absolute minimum is higher (and thus worse) due to the constraint. Furthermore, the more constraints and restrictions we place on the system, the lower we expect the maxima to be, and the higher we expect the minima to be! Strictly, it would be possible for a system to have the same maxima/minima if a constraint were placed on it, but these maxima/minima could never be better!

Exercise 1: Find the stationary point(s) of each of the following functions:

a) $f(x, y) = x^2 + y^2$,
b) $f(x, y) = xy$,
c) $f(x, y) = (x - y)^2$.
Find these stationary points once again, subject now to each of the constraints:

i) $Ax + By = C$,
ii) $y - Ax^2 - Bx - C = 0$,
iii) $x - y^2 = 0$,

by either substituting for, or eliminating, one of the variables $x$, $y$ in each case.

Note: In many cases in both Engineering and Mathematical Physics, it can be extremely difficult to eliminate one of the variables using the equation of constraint. The Lagrange-Multiplier Method can get around this problem. Consider, for example, if we wanted to minimise the function of two variables $f(x, y) = xy$, subject to the constraint $x^2 + 8xy + 7y^2 = 180$. In this case, it would be very difficult to get $x$ in terms of $y$, or $y$ in terms of $x$, but (as we will now see) this can be solved easily and systematically using the Lagrange-Multiplier Method:

(2.6) LAGRANGE MULTIPLIER METHOD

To find the stationary point(s) of the function of two variables $f(x, y)$, subject to the constraint equation $g(x, y) = 0$, we need to solve the following equations:

$$\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} = 0, \tag{$2\cdot7$}$$

where $\lambda$ is the Lagrange multiplier. The Lagrange-Multiplier method is broadly:

1) Solve the above (Lagrange-Multiplier) Equations;
2) Determine the value(s) of the Lagrange-Multiplier;
3) Use the constraint to find $(x, y)$ for each value of $\lambda$;
4) Determine the value(s) of $f(x, y)$ for each value of $(x, y)$.

Example 6

Use Lagrange's Multiplier method to find the minimum value of

$$f(x, y) = x^2 + y^2,$$

subject to the following equation of constraint (equation of a hyperbolic curve):

$$x^2 + 8xy + 7y^2 = 180.$$

Before starting, it is always a good idea to be absolutely clear on the problem:

Find Stationary Points of $f(x, y) = x^2 + y^2$,
Subject to the Constraint $g(x, y) = x^2 + 8xy + 7y^2 - 180$.

Note(1): We must always express these constraints in the form $g(x, y) = 0$. Obviously, this will always be possible, since an arbitrary equation of the form LHS = RHS can always be re-written in the form $g(x, y) = 0$; in this case LHS − RHS = 0.
First of all we substitute $f(x, y)$, the main function, and $g(x, y)$, the function appearing in the constraint equation, into the Lagrange-Multiplier Equations:

$$\begin{aligned}
\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} &= 0, & \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} &= 0,\\
(2x) + \lambda(2x + 8y) &= 0, & (2y) + \lambda(8x + 14y) &= 0,\\
(1 + \lambda)x + 4\lambda y &= 0, & 4\lambda x + (1 + 7\lambda)y &= 0.
\end{aligned}$$

Our next move is to determine the value(s) of $\lambda$, the Lagrange-Multiplier(s).

Note(2): The solutions will correspond to the minimum distance (squared) between the origin $(0, 0)$ and any point $(x, y)$ on the given hyperbolic curve. For the time being, however, we will ignore this geometrical interpretation.

Now we have to determine the value(s) of $\lambda$ that give us the most general non-trivial solution for $(x, y)$. In this case, the best way to do this is by re-writing the simultaneous equations using standard matrix notation (as shown below):

$$\left.\begin{aligned} (1 + \lambda)x + 4\lambda y &= 0,\\ 4\lambda x + (1 + 7\lambda)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad \begin{pmatrix} 1 + \lambda & 4\lambda\\ 4\lambda & 1 + 7\lambda \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix},$$

so that the $\lambda$ giving us the most general solution for $x$ and $y$ can be found via:

$$\begin{aligned}
\begin{vmatrix} 1 + \lambda & 4\lambda\\ 4\lambda & 1 + 7\lambda \end{vmatrix} &= 0,\\
(1 + \lambda)(1 + 7\lambda) - 16\lambda^2 &= 0,\\
7\lambda^2 - 16\lambda^2 + 8\lambda + 1 &= 0,\\
-(9\lambda + 1)(\lambda - 1) &= 0.
\end{aligned}$$

We have to determine the stationary points corresponding to each value of $\lambda$. So, for the first value, being $\lambda = -\tfrac{1}{9}$, our two equations collapse into just one:

$$\left.\begin{aligned} \left(1 + \left(-\tfrac{1}{9}\right)\right)x + 4\left(-\tfrac{1}{9}\right)y &= 0,\\ 4\left(-\tfrac{1}{9}\right)x + \left(1 + 7\left(-\tfrac{1}{9}\right)\right)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad 2x - y = 0.$$

Substituting this equation ($y = 2x$) into the constraint $x^2 + 8xy + 7y^2 = 180$:

$$\begin{aligned}
x^2 + 8x(2x) + 7(2x)^2 - 180 &= 0,\\
45x^2 - 180 &= 0,\\
45(x + 2)(x - 2) &= 0,
\end{aligned}$$

and hence we get the solutions $x = 2$ (with $y = 4$), and $x = -2$ (with $y = -4$):

Solutions: $(x, y) = (\pm 2, \pm 4)$ corresponding to $f = 20$.

And for the second value, being $\lambda = 1$, our equations again collapse into one:

$$\left.\begin{aligned} (1 + (1))x + 4(1)y &= 0,\\ 4(1)x + (1 + 7(1))y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad x + 2y = 0.$$

Substituting this equation ($x = -2y$) into the constraint $x^2 + 8xy + 7y^2 = 180$:

$$\begin{aligned}
(-2y)^2 + 8(-2y)y + 7y^2 - 180 &= 0,\\
-5y^2 - 180 &= 0,\\
-5(y + 6i)(y - 6i) &= 0,
\end{aligned}$$

and hence we get no real solutions.
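The arithmetic for the real branch of Example 6 can be checked numerically. The sketch below recomputes the two multipliers from $9\lambda^2 - 8\lambda - 1 = 0$, solves $45x^2 = 180$ with $y = 2x$, and confirms that the resulting points satisfy the constraint and both Lagrange-Multiplier equations. The function names are illustrative assumptions, not part of the notes.

```python
import math


def f(x, y):
    return x**2 + y**2


def g(x, y):
    # Constraint written in the form g(x, y) = 0.
    return x**2 + 8 * x * y + 7 * y**2 - 180


def lagrange_residuals(x, y, lam):
    """Left-hand sides of f_x + lam*g_x = 0 and f_y + lam*g_y = 0."""
    return (2 * x + lam * (2 * x + 8 * y),
            2 * y + lam * (8 * x + 14 * y))


# Roots of 9*lam^2 - 8*lam - 1 = 0 via the quadratic formula.
disc = 8**2 + 4 * 9 * 1
lams = sorted([(8 - math.sqrt(disc)) / 18, (8 + math.sqrt(disc)) / 18])

# lam = -1/9 gives y = 2x, and the constraint reduces to 45 x^2 = 180.
x = math.sqrt(180 / 45)
points = [(x, 2 * x), (-x, -2 * x)]

for px, py in points:
    print((px, py), f(px, py), g(px, py), lagrange_residuals(px, py, lams[0]))
```

The residuals are only zero up to floating-point rounding, so any comparison with zero should use a small tolerance.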
If the above values were acceptable, we would have the solutions $y = 6i$ (with $x = -12i$) and $y = -6i$ (with $x = 12i$):

Solutions: $(x, y) = (\mp 12i, \pm 6i)$ corresponding to $f = -180$.

However, we cannot accept complex solutions here; these belong to the topic of complex variables and conformal mappings.

Example 7

Using the Lagrange-Multiplier Method, or otherwise, find both the maximum and minimum values of the function $f(x, y) = xy$ on the circle

$$x^2 + y^2 = 8.$$

As in Example 6, it is important to lay out the problem as clearly as possible:

Find Stationary Points of $f(x, y) = xy$,
Subject to the Constraint $g(x, y) = x^2 + y^2 - 8$.

So, we substitute $f(x, y)$ and $g(x, y)$ into the Lagrange-Multiplier Equations:

$$\begin{aligned}
\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} &= 0, & \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} &= 0,\\
(y) + \lambda(2x) &= 0, & (x) + \lambda(2y) &= 0,\\
2\lambda x + y &= 0, & x + 2\lambda y &= 0.
\end{aligned}$$

And as before, we determine the value(s) of $\lambda$ that give us the most general non-trivial solution for $(x, y)$, by re-writing these equations in matrix notation:

$$\left.\begin{aligned} 2\lambda x + y &= 0,\\ x + 2\lambda y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad \begin{pmatrix} 2\lambda & 1\\ 1 & 2\lambda \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix},$$

so that the $\lambda$s giving us the most general solution for $(x, y)$ can be found via:

$$\begin{aligned}
\begin{vmatrix} 2\lambda & 1\\ 1 & 2\lambda \end{vmatrix} &= 0,\\
(2\lambda)^2 - 1^2 &= 0,\\
(2\lambda + 1)(2\lambda - 1) &= 0.
\end{aligned}$$

For the first value, being $\lambda = -\tfrac{1}{2}$, these equations again collapse into just one:

$$\left.\begin{aligned} 2\left(-\tfrac{1}{2}\right)x + y &= 0,\\ x + 2\left(-\tfrac{1}{2}\right)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad y - x = 0.$$

Substituting this equation ($y = x$) into the constraint $x^2 + y^2 - 8 = 0$ we get:

$$\begin{aligned}
x^2 + (x)^2 - 8 &= 0,\\
2(x^2 - 4) &= 0,\\
2(x - 2)(x + 2) &= 0,
\end{aligned}$$

and the solutions are therefore $x = 2$ (with $y = 2$), and $x = -2$ (with $y = -2$):

Solutions: $(x, y) = (\pm 2, \pm 2)$ corresponding to $f = +4$.

(As it turns out, each of these stationary points will correspond to a maximum.)

For the second value, being $\lambda = \tfrac{1}{2}$, the equations collapse into one yet again:

$$\left.\begin{aligned} 2\left(+\tfrac{1}{2}\right)x + y &= 0,\\ x + 2\left(+\tfrac{1}{2}\right)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad y + x = 0.$$

Substituting this equation ($y = -x$) into the constraint $x^2 + y^2 - 8 = 0$ we get:

$$\begin{aligned}
x^2 + (-x)^2 - 8 &= 0,\\
2(x^2 - 4) &= 0,\\
2(x - 2)(x + 2) &= 0,
\end{aligned}$$

and the solutions are therefore $x = 2$ (with $y = -2$), and $x = -2$ (with $y = 2$):

Solutions: $(x, y) = (\pm 2, \mp 2)$ corresponding to $f = -4$.

(As it turns out, each of these stationary points will correspond to a minimum.)

The maximum of $f(x, y) = xy$, subject to $x^2 + y^2 = 8$, is therefore $f = +4$, and the minimum is $f = -4$.

Note: We could also have solved this problem if we had known that the parametric equations of the circle $x^2 + y^2 = 8$, centred at the origin $(0, 0)$, are:

$$x(\theta) = 2\sqrt{2}\cos(\theta), \qquad y(\theta) = 2\sqrt{2}\sin(\theta).$$

Using the above, the constraint is automatically satisfied, and the function is:

$$\begin{aligned}
f(\theta) &= 2\sqrt{2}\cos(\theta)\,2\sqrt{2}\sin(\theta),\\
&= 8\cos(\theta)\sin(\theta),\\
&= 4\sin(2\theta).
\end{aligned}$$

It is easy to show that the stationary points of this function are at $\theta = \pm\tfrac{\pi}{4}, \pm\tfrac{3\pi}{4}$. (These correspond to each of the same four solutions as before, as we expect.)

Example 8 (Standard Ellipse)

Use Lagrange's multiplier method to obtain the maximum and the minimum values that the function $f(x, y) = x^2 + y^2$ can have on the ellipse described by

$$b^2x^2 + a^2y^2 = a^2b^2.$$

As usual, we substitute $f(x, y)$, the function we wish to optimise, and $g(x, y) = b^2x^2 + a^2y^2 - a^2b^2$, the function appearing in the constraint equation, into the Lagrange equations:

$$\begin{aligned}
\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} &= 0, & \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} &= 0,\\
(2x) + \lambda(2b^2x) &= 0, & (2y) + \lambda(2a^2y) &= 0,\\
2(\lambda b^2 + 1)x &= 0, & 2(\lambda a^2 + 1)y &= 0.
\end{aligned}$$

(Normally, we might be tempted to solve coupled/simultaneous equations by simply eliminating $x$ or $y$ in what we might call the 'old-fashioned' way, but in this example we can clearly see the advantage of using matrix methods.)
We write these two simultaneous equations in matrix notation in order to find the value(s) of $\lambda$ that give us the most general non-trivial solution for $(x, y)$:

$$\left.\begin{aligned} 2(\lambda b^2 + 1)x + (0)y &= 0,\\ (0)x + 2(\lambda a^2 + 1)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad \begin{pmatrix} 2(\lambda b^2 + 1) & 0\\ 0 & 2(\lambda a^2 + 1) \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix},$$

so that the $\lambda$s giving us the most general solution for $(x, y)$ can be found via:

$$\begin{aligned}
\begin{vmatrix} 2(\lambda b^2 + 1) & 0\\ 0 & 2(\lambda a^2 + 1) \end{vmatrix} &= 0,\\
4(\lambda b^2 + 1)(\lambda a^2 + 1) &= 0.
\end{aligned}$$

We have to determine the stationary points corresponding to each value of $\lambda$. For $\lambda = -1/b^2$, one of the two equations vanishes altogether, leaving us with:

$$\left.\begin{aligned} 2\left((-b^2/b^2) + 1\right)x + (0)y &= 0,\\ (0)x + 2\left((-a^2/b^2) + 1\right)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad 2\left(1 - a^2/b^2\right)y = 0.$$

Because $a \neq b$ (since the condition $a = b$ would give us a circle equation), the solution $y = 0$ can be substituted into the constraint $b^2x^2 + a^2y^2 = a^2b^2$ to get

$$\begin{aligned}
b^2x^2 + a^2(0)^2 - a^2b^2 &= 0,\\
b^2(x^2 - a^2) &= 0,\\
b^2(x - a)(x + a) &= 0.
\end{aligned}$$

This then gives us the solutions $x = a$ (with $y = 0$), and $x = -a$ (with $y = 0$):

Solutions: $(x, y) = (\pm a, 0)$ corresponding to $f = +a^2$.

(This solution corresponds to a minimum if $b > a$, and a maximum if $b < a$.)

For $\lambda = -1/a^2$, however, the other equation vanishes, leaving us instead with:

$$\left.\begin{aligned} 2\left((-b^2/a^2) + 1\right)x + (0)y &= 0,\\ (0)x + 2\left((-a^2/a^2) + 1\right)y &= 0, \end{aligned}\right\} \quad\Rightarrow\quad 2\left(1 - b^2/a^2\right)x = 0.$$

Again, because $a \neq b$, the solution $x = 0$ can be substituted into the constraint:

$$\begin{aligned}
b^2(0)^2 + a^2y^2 - a^2b^2 &= 0,\\
a^2(y^2 - b^2) &= 0,\\
a^2(y - b)(y + b) &= 0.
\end{aligned}$$

This then gives us the solutions $y = b$ (with $x = 0$), and $y = -b$ (with $x = 0$):

Solutions: $(x, y) = (0, \pm b)$ corresponding to $f = +b^2$.

(This solution corresponds to a maximum if $b > a$, and a minimum if $b < a$.)

Note: We could also have solved this problem if we had known that the parametric equations of the ellipse $b^2x^2 + a^2y^2 = a^2b^2$, centred at $(0, 0)$, are:

$$x(\theta) = a\cos(\theta), \qquad y(\theta) = b\sin(\theta).$$

Using these, the constraint is automatically satisfied and we write the function $f(x, y) = x^2 + y^2$ as:

$$\begin{aligned}
f(\theta) &= a^2\cos^2(\theta) + b^2\sin^2(\theta),\\
&= a^2 + (b^2 - a^2)\sin^2(\theta),
\end{aligned}$$

so that $f'(\theta) = (b^2 - a^2)\sin(2\theta)$. We could hence show that the stationary points of this function are to be found at $\theta = 0, \pm\pi/2, \pi$ (corresponding to exactly the same solutions $(\pm a, 0)$ and $(0, \pm b)$ as before).

Example 9 (Minimum Distance)

Use the Lagrange multiplier method to obtain the least distance between the origin $(0, 0)$ and the curve (an ellipse) described by

$$3x^2 + 4xy + 6y^2 = 140.$$

Note (Distance): Generally speaking, the distance between $(a, b)$ (any point) and $(x, y)$ (a point on a curve $g(x, y) = 0$) is given by the following expression:

$$D_{(a,b)} = \sqrt{(x - a)^2 + (y - b)^2}, \tag{$2\cdot8$}_1$$

and therefore the distance between $(0, 0)$ (the origin) and $(x, y)$ (some point on a curve $g(x, y) = 0$) is given by the virtually equivalent expression below:

$$D_{(0,0)} = \sqrt{x^2 + y^2}. \tag{$2\cdot8$}_2$$

Strictly speaking, we are supposed to minimise $\sqrt{x^2 + y^2}$, the distance between $(0, 0)$ (the origin) and $(x, y)$ (a point on the curve). However, it is convenient (and mathematically equivalent) to minimise the 'distance-squared' function

$$f(x, y) = x^2 + y^2,$$

subject to the same constraint $g(x, y) = 3x^2 + 4xy + 6y^2 - 140 = 0$. Therefore:

Find Stationary Points of $f(x, y) = x^2 + y^2$,
Subject to the Constraint $g(x, y) = 3x^2 + 4xy + 6y^2 - 140$.

Because $x$ and $y$ are both real variables (not imaginary or complex variables), the distance will always be positive, as we intuitively expect it to be. Therefore, minimising the distance function is, at least mathematically speaking, exactly equivalent to minimising the distance-squared function, i.e. $f(x, y) = x^2 + y^2$.

(We have to remember that this distance-squared function is only used to find the point(s) which minimise/maximise the distance; it is only a middle-man. The solutions must then be substituted into the expression for the distance $D$.)

As per usual, we start by solving the Lagrange-Multiplier Equations (as below):

$$\begin{aligned}
\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} &= 0, & \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} &= 0,\\
(2x) + \lambda(6x + 4y) &= 0, & (2y) + \lambda(4x + 12y) &= 0,\\
(1 + 3\lambda)x + 2\lambda y &= 0, & 2\lambda x + (1 + 6\lambda)y &= 0.
\end{aligned}$$
Once again, we want to determine the value(s) of λ, the Lagrange multiplier, that gives us the most general (non-trivial) solution for (x, y). And yet again, by re-writing the simultaneous equations in standard matrix notation, we get: ) (1 + 3λ) x + 2λy = 0, 0 x (1 + 3λ) 2λ . = ⇒ 0 y 2λ (1 + 6λ) 2λx + (1 + 6λ) y = 0, p (If we had tried instead to minimise the distance function x2 + y 2 , we would have had great difficulty solving the resulting simultaneous equations, despite the fact that they ultimately would have yielded precisely the same solutions). To determine the λs giving us the most general solution for x and y, we solve: (1 + 3λ) 2λ = 0, 2λ (1 + 6λ) (1 + 3λ) (1 + 6λ) − 4λ2 = 0, (7λ + 1) (2λ + 1) = 0. 21 We have to determine the stationary points corresponding to each value of λ. For the first value, being λ = − 17 , our equations once again collapse into one: ) 1 + 3 − 71 x + 2 − 71 y = 0, ⇒ 2x − y = 0. 2 − 17 x + 1 + 6 − 17 y = 0, Substituting this equation (y = 2x) into the constraint 3x2 +4xy +6y 2 = 140: 3x2 + 4x (2x) + 6 (2x)2 − 140 = 0, 35 (x2 − 4) = 0, 35 (x + 2) (x − 2) = 0, and we hence find the solutions x = 2 (with y = 4), and x = −2 (with y = −4); Solutions: (x, y) = (±2, ±4) corresponding to f = 20. At this point, we have to recall that the function we are trying to minimise, is the distance between (0, 0) (the origin) and (x, y) (a point on the curve), thus q L (±2, ±4) = (±2)2 + (±4)2 , √ = 4 + 16, √ = 20. √ √ (This corresponds to what turns out to be a minimum length of 20 = 2 5). For the second value, being λ = − 21 , the equations yet again collapse into one: ) 1 + 3 − 12 x + 2 − 12 y = 0, ⇒ −x − 2y = 0. 2 − 12 x + 1 + 6 − 21 y = 0, Substituting this equation (x = −2y) into the constraint 3x2 + 4xy + 6y 2 = 140: 3 (−2y)2 + 4 (−2y) x + 6y 2 − 140 = 0, 10 (y 2 − 14) = 0, √ √ 35 y + 14 y − 14 = 0, √ √ √ and√we get solutions y = − 14 (with x = 2 14), y = 14 (with x = −2 14); √ √ Solutions: (x, y) = ± 14, ∓2 14 corresponding to f = 70. 
Again, we have to recall that the function we are trying to minimise is the distance between $(0, 0)$ (the origin) and $(x, y)$ (a point on the curve), and so

\[
L(\mp 2\sqrt{14}, \pm\sqrt{14}) = \sqrt{(\mp 2\sqrt{14})^2 + (\pm\sqrt{14})^2} = \sqrt{56 + 14} = \sqrt{70}.
\]

(This solution corresponds to what turns out to be a maximum length of $\sqrt{70}$.)

(2.7) LAGRANGE MULTIPLIER METHOD (Derivation)

Since the stationary points of the function $f(x, y)$ are, by definition, given by

\[ df = \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy = 0, \]

and because, given some constraint in the form $g(x, y) = 0$, we must also have

\[ dg = \frac{\partial g}{\partial x}\, dx + \frac{\partial g}{\partial y}\, dy = 0, \]

the stationary points must simultaneously satisfy the following two equations:

\[ \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy = 0, \qquad (1) \]
\[ \frac{\partial g}{\partial x}\, dx + \frac{\partial g}{\partial y}\, dy = 0. \qquad (2) \]

We can simplify this by re-writing the equations in matrix form:

\[
\left.
\begin{aligned}
f_x\, dx + f_y\, dy &= 0, \\
g_x\, dx + g_y\, dy &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}
\begin{pmatrix} dx \\ dy \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\]

and finding values of $f_x$, $f_y$, $g_x$ and $g_y$ giving non-trivial solutions for $dx$, $dy$. The most general (non-trivial) solution(s) for $dx$ and $dy$ are found by solving:

\[
\begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix} = 0
\;\Rightarrow\; f_x g_y - g_x f_y = 0.
\]

Rearranging, we can see that the condition to optimise $f(x, y)$, subject to the (equation of) constraint $g(x, y) = 0$, can be written in the form:

\[
f_x g_y - g_x f_y = 0, \qquad f_x g_y = f_y g_x, \qquad f_x / g_x = f_y / g_y.
\]

To find the points $(x, y)$ that satisfy the above equation, it is more convenient to call the common ratio $-\lambda$ and solve instead the following equivalent pair of coupled/simultaneous equations:

\[ f_x + \lambda g_x = 0, \qquad f_y + \lambda g_y = 0, \]

where $\lambda$ is an unknown constant called the Lagrange Multiplier. QED

Note: The standard Lagrange-Multiplier Equations (shown above) can easily be modified to handle three or more dimensions, and two or more constraints.
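The determinant condition $f_x g_y - g_x f_y = 0$ from the derivation can be checked numerically against Example 9. The following is a minimal Python sketch, not part of the original notes; the helper names `grad_f`, `grad_g` and `g` are illustrative assumptions. It evaluates the determinant at the four stationary points found in Example 9 and recovers $\lambda$ from $f_x + \lambda g_x = 0$.

```python
import math

def grad_f(x, y):
    # f(x, y) = x^2 + y^2 (distance squared from the origin)
    return 2 * x, 2 * y

def grad_g(x, y):
    # g(x, y) = 3x^2 + 4xy + 6y^2 - 140
    return 6 * x + 4 * y, 4 * x + 12 * y

def g(x, y):
    return 3 * x**2 + 4 * x * y + 6 * y**2 - 140

r14 = math.sqrt(14)
# Stationary points found in Example 9
candidates = [(2, 4), (-2, -4), (-2 * r14, r14), (2 * r14, -r14)]

results = []
for x, y in candidates:
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    det = fx * gy - gx * fy   # should vanish at a constrained stationary point
    lam = -fx / gx            # from fx + lam * gx = 0
    dist = math.hypot(x, y)   # distance from the origin
    results.append((g(x, y), det, lam, dist))
    print(f"({x:.4f}, {y:.4f}): g={g(x, y):.1e}, det={det:.1e}, "
          f"lambda={lam:.4f}, distance={dist:.4f}")
```

Up to floating-point rounding, each point should satisfy the constraint, give a vanishing determinant, and return $\lambda = -1/7$ (distance $\sqrt{20}$) or $\lambda = -1/2$ (distance $\sqrt{70}$).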
Clearly, we could imagine a whole host of problems where we might have a function of not two but three variables, subject to an equation of constraint also involving three variables; for example, if we wanted to optimise the distance between a point and a surface in three dimensions, rather than the distance between a point and a curve in two dimensions. We will show that such problems can be solved relatively easily using a (modified) Lagrange-Multiplier Method.

(2.8) LAGRANGE MULTIPLIER METHOD (3-Dimensions)

To find the stationary point(s) of the function of three variables $f(x, y, z)$, subject to the constraint equation $g(x, y, z) = 0$, we solve the three equations:

\[
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \qquad
\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \qquad
\frac{\partial f}{\partial z} + \lambda \frac{\partial g}{\partial z} = 0, \qquad (2 \cdot 9)
\]

where $\lambda$ is the Lagrange multiplier. Then, we proceed as usual:

1) Solve the above (Lagrange-Multiplier) Equations;
2) Determine the value(s) of the Lagrange Multiplier;
3) Use the constraint to find $(x, y, z)$ for each value of $\lambda$;
4) Determine the value(s) of $f(x, y, z)$ for each value of $(x, y, z)$.

Example 10 Use the Lagrange multiplier method to determine the minimum distance from the origin $(0, 0, 0)$ to any point $(x, y, z)$ on the plane defined by $4x - 4y + 2z = 36$.

As usual, it is a good idea to have no ambiguity about the problem in question:

Find Stationary Points of $f(x, y, z) = x^2 + y^2 + z^2$,
Subject to the Constraint $g(x, y, z) = 4x - 4y + 2z - 36 = 0$,

bearing in mind that the distance between the origin $(0, 0, 0)$ and a point $(x, y, z)$ on the plane is $\text{Distance} = \sqrt{x^2 + y^2 + z^2}$, and that the distance-squared function $f(x, y, z) = x^2 + y^2 + z^2$ is again used purely for the sake of convenience.

First off, we solve the Lagrange Multiplier equations (of which there are three):

\[
\begin{aligned}
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} &= 0, & \frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} &= 0, & \frac{\partial f}{\partial z} + \lambda \frac{\partial g}{\partial z} &= 0, \\
(2x) + \lambda (4) &= 0, & (2y) + \lambda (-4) &= 0, & (2z) + \lambda (2) &= 0, \\
2x + 4\lambda &= 0, & 2y - 4\lambda &= 0, & 2z + 2\lambda &= 0.
\end{aligned}
\]

Our next move is to determine the value(s) of $\lambda$, the Lagrange multiplier(s).
Note: The easiest way to do this is to use the above equations (which can be simplified to $x = -2\lambda$, $y = 2\lambda$, $z = -\lambda$) and substitute them into the constraint:

\[
4x - 4y + 2z = 36, \qquad
4(-2\lambda) - 4(2\lambda) + 2(-\lambda) = 36, \qquad
-8\lambda - 8\lambda - 2\lambda = 36, \qquad
-18\lambda = 36.
\]

Hence $\lambda = -2$. Having used the constraint already, we find that by back-substituting the Lagrange multiplier $\lambda$ into the original equations, the solution $(x, y, z) = (4, -4, 2)$ drops into our laps without having to use the constraint again. The minimum distance is therefore $\sqrt{16 + 16 + 4} = 6$. Amazingly, this question was quicker than its two-dimensional counterparts.

Example 11 Use the Lagrange multiplier method to determine the maximum and minimum temperature on the sphere of radius $5\sqrt{2}$ centred at the origin $(0, 0, 0)$, where the temperature is the function of position shown below:

\[ T(x, y, z) = 273 + 2z\,(3x + 4y). \]

As usual, it is a good idea to have no ambiguity about the problem in question:

Find Stationary Points of $T(x, y, z) = 273 + 6xz + 8yz$,
Subject to the Constraint $g(x, y, z) = x^2 + y^2 + z^2 - 50 = 0$.

First off, we solve the Lagrange Multiplier equations (of which there are three):

\[
\begin{aligned}
\frac{\partial T}{\partial x} + \lambda \frac{\partial g}{\partial x} &= 0, & \frac{\partial T}{\partial y} + \lambda \frac{\partial g}{\partial y} &= 0, & \frac{\partial T}{\partial z} + \lambda \frac{\partial g}{\partial z} &= 0, \\
(6z) + \lambda (2x) &= 0, & (8z) + \lambda (2y) &= 0, & (6x + 8y) + \lambda (2z) &= 0, \\
\lambda x + 3z &= 0, & \lambda y + 4z &= 0, & 3x + 4y + \lambda z &= 0.
\end{aligned}
\]

Our next move is to determine the value(s) of $\lambda$, the Lagrange multiplier(s). However, we will not get the 'free lunch' here that we got in Example 10! We must once again re-write these equations in matrix notation, in order to find the values of $\lambda$ that give us the most general non-trivial solution for $(x, y, z)$:

\[
\left.
\begin{aligned}
\lambda x + (0)\, y + 3z &= 0, \\
(0)\, x + \lambda y + 4z &= 0, \\
3x + 4y + \lambda z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{pmatrix} \lambda & 0 & 3 \\ 0 & \lambda & 4 \\ 3 & 4 & \lambda \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]

so that the $\lambda$s giving us the most general solution for $(x, y, z)$ can be found via

\[
\begin{vmatrix} \lambda & 0 & 3 \\ 0 & \lambda & 4 \\ 3 & 4 & \lambda \end{vmatrix} = 0,
\qquad \lambda^3 - 16\lambda - 9\lambda = 0,
\qquad \lambda\,(\lambda - 5)(\lambda + 5) = 0.
\]

We have to determine the stationary points corresponding to each value of $\lambda$.
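As a quick check of the determinant condition above, the following minimal Python sketch (not part of the original notes; the helper names `det3` and `lagrange_matrix` are illustrative assumptions) expands the $3 \times 3$ determinant by cofactors and searches a small range of integers for the values of $\lambda$ at which it vanishes.

```python
def det3(m):
    # Cofactor expansion of a 3x3 determinant along the first row
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def lagrange_matrix(lam):
    # Coefficient matrix of the Lagrange equations in Example 11
    return [[lam, 0, 3],
            [0, lam, 4],
            [3, 4, lam]]

# Integer values of lambda in [-10, 10] for which the determinant vanishes
roots = [lam for lam in range(-10, 11) if det3(lagrange_matrix(lam)) == 0]
print(roots)  # [-5, 0, 5]

# The determinant agrees with the cubic lambda^3 - 25*lambda on this range
agrees = all(det3(lagrange_matrix(k)) == k**3 - 25 * k for k in range(-10, 11))
print(agrees)  # True
```

The integer search is only a convenience for this example, where the roots happen to be integers; in general the cubic would have to be solved properly.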
For $\lambda = 0$, one of these three equations vanishes altogether, leaving us with:

\[
\left.
\begin{aligned}
3z &= 0, \\
4z &= 0, \\
3x + 4y &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
z = 0, \quad y = -3x/4.
\]

By substituting these back into the constraint $x^2 + y^2 + z^2 = 50$, we find that the stationary point is $(\pm 4\sqrt{2}, \mp 3\sqrt{2}, 0)$, giving us a temperature of $T = 273$.

For $\lambda = +5$, we find that once again one of these equations disappears, since the third equation (below) is a superposition of the first two, leaving us with:

\[
\left.
\begin{aligned}
5x + 3z &= 0, \\
5y + 4z &= 0, \\
3x + 4y + 5z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
5x + 3z = 0, \quad 5y + 4z = 0.
\]

By substituting these (i.e. $x = -3z/5$, $y = -4z/5$) back into the constraint, we get the stationary point $(\pm 3, \pm 4, \mp 5)$, giving us a temperature of $T = 23$.

For $\lambda = -5$, we find that once again one of these equations disappears, since the third equation is again a superposition of the first two, leaving us with:

\[
\left.
\begin{aligned}
-5x + 3z &= 0, \\
-5y + 4z &= 0, \\
3x + 4y - 5z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
5x - 3z = 0, \quad 5y - 4z = 0.
\]

By substituting these (i.e. $x = +3z/5$, $y = +4z/5$) back into the constraint, we get the stationary point $(\pm 3, \pm 4, \pm 5)$, giving us a temperature of $T = 523$.

Clearly the sphere is hottest at the points $(\pm 3, \pm 4, \pm 5)$, at 523 °K (where °K denotes degrees Kelvin), and is coolest at the points $(\pm 3, \pm 4, \mp 5)$, at 23 °K.

Example 12 Use the Lagrange multiplier method to determine the maximum and minimum distance from the origin to any point on the ellipsoid defined by $x^2/4 + y^2/9 + z^2/16 = 1$.

As usual, it is a good idea to have no ambiguity about the problem in question:

Find Stationary Points of $f(x, y, z) = x^2 + y^2 + z^2$,
Subject to the Constraint $g(x, y, z) = x^2/4 + y^2/9 + z^2/16 - 1 = 0$,

bearing in mind that the distance between the origin $(0, 0, 0)$ and a point $(x, y, z)$ on the ellipsoid is $\text{Distance} = \sqrt{x^2 + y^2 + z^2}$, and that the distance-squared function $f(x, y, z) = x^2 + y^2 + z^2$ is yet again used just for the sake of convenience.
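First off, we solve the Lagrange Multiplier equations (of which there are three):

\[
\begin{aligned}
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} &= 0, & \frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} &= 0, & \frac{\partial f}{\partial z} + \lambda \frac{\partial g}{\partial z} &= 0, \\
(2x) + \lambda (2x/4) &= 0, & (2y) + \lambda (2y/9) &= 0, & (2z) + \lambda (2z/16) &= 0, \\
(\lambda + 4)\, x &= 0, & (\lambda + 9)\, y &= 0, & (\lambda + 16)\, z &= 0.
\end{aligned}
\]

Our next move is to determine all three values of $\lambda$, the Lagrange multiplier. The three values that the Lagrange multiplier takes are actually in plain view, but it is still possible, albeit a little heavy-handed, to find them (as before) by re-writing the three equations in matrix notation, as shown below:

\[
\left.
\begin{aligned}
(\lambda + 4)\, x + (0)\, y + (0)\, z &= 0, \\
(0)\, x + (\lambda + 9)\, y + (0)\, z &= 0, \\
(0)\, x + (0)\, y + (\lambda + 16)\, z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{pmatrix} \lambda + 4 & 0 & 0 \\ 0 & \lambda + 9 & 0 \\ 0 & 0 & \lambda + 16 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]

so that the $\lambda$s giving us the most general solution for $(x, y, z)$ can be found via

\[
\begin{vmatrix} \lambda + 4 & 0 & 0 \\ 0 & \lambda + 9 & 0 \\ 0 & 0 & \lambda + 16 \end{vmatrix} = 0,
\qquad (\lambda + 4)(\lambda + 9)(\lambda + 16) = 0.
\]

We have to determine the stationary points corresponding to each value of $\lambda$.

For $\lambda = -4$, we obviously get $y = 0$, $z = 0$, and (after substituting $y = z = 0$ into the constraint equation) $x = \pm 2$. The solutions are therefore $(\pm 2, 0, 0)$.

For $\lambda = -9$, we obviously get $x = 0$, $z = 0$, and (after substituting $x = z = 0$ into the constraint equation) $y = \pm 3$. The solutions are therefore $(0, \pm 3, 0)$.

For $\lambda = -16$, we obviously get $x = 0$, $y = 0$, and (after substituting $x = y = 0$ into the constraint equation) $z = \pm 4$. The solutions are therefore $(0, 0, \pm 4)$.

Clearly the shortest distance to the ellipsoid is 2 units, and the longest distance is twice that, 4 units. From a geometrical point of view, this result is not surprising, since these are the lengths of the shortest and longest semi-axes of the ellipsoid.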
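The result of Example 12 can be sanity-checked numerically. The sketch below is not part of the original notes: it assumes the standard angular parametrisation $x = 2\sin\varphi\cos\theta$, $y = 3\sin\varphi\sin\theta$, $z = 4\cos\varphi$ of the ellipsoid (an assumption introduced here for illustration), samples points on a grid, and compares the smallest and largest sampled distances with the values found by the Lagrange method.

```python
import math

def on_ellipsoid(x, y, z, tol=1e-12):
    # Constraint g(x, y, z) = x^2/4 + y^2/9 + z^2/16 - 1 = 0
    return abs(x**2 / 4 + y**2 / 9 + z**2 / 16 - 1) < tol

# Stationary points found in Example 12
stationary = [(2, 0, 0), (-2, 0, 0), (0, 3, 0), (0, -3, 0), (0, 0, 4), (0, 0, -4)]
assert all(on_ellipsoid(*p) for p in stationary)
distances = sorted({math.sqrt(x * x + y * y + z * z) for x, y, z in stationary})
print(distances)  # [2.0, 3.0, 4.0]

# Grid sample of the ellipsoid (assumed parametrisation, for illustration only)
n = 180
samples = []
for i in range(n + 1):
    phi = math.pi * i / n
    for j in range(n):
        theta = 2 * math.pi * j / n
        x = 2 * math.sin(phi) * math.cos(theta)
        y = 3 * math.sin(phi) * math.sin(theta)
        z = 4 * math.cos(phi)
        samples.append(math.sqrt(x * x + y * y + z * z))

d_min, d_max = min(samples), max(samples)
print(f"sampled min = {d_min:.6f}, sampled max = {d_max:.6f}")
```

The sampled extremes should agree with 2 and 4 up to rounding, while the intermediate value 3 (at $(0, \pm 3, 0)$) is a stationary point that is neither the minimum nor the maximum distance.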