Lipschitzian Optimization, DIRECT Algorithm, and Applications
Yves Brise
April 1, 2008

Outline
1 Lipschitzian Optimization
2 DIRECT Algorithm
3 Applications

Function Optimization
Problem: For a function $f : D \subseteq \mathbb{R}^d \to \mathbb{R}$, find $\min_{x \in D} f(x)$.
Simple Bounds: Mostly we will assume $l_i \le x_i \le u_i$ for all $i \in [d]$, i.e. every variable $x_i$ has some lower bound $l_i$ and some upper bound $u_i$. This means $D$ is a hyperrectangle.

Taxonomy of Methods
(figure not reproduced in the transcription)

Lipschitzian Optimization
Shubert (1972) “A Sequential Method Seeking the Global Maximum of a Function”
Definition: A function $f : D \subseteq \mathbb{R}^d \to \mathbb{R}$ is called Lipschitz-continuous if there exists a positive constant $K \in \mathbb{R}^+$ such that
\[ |f(x) - f(x')| \le K\,|x - x'| \quad \forall x, x' \in D. \]
Problem: We consider the minimization problem $\min_{x \in D} f(x)$, where $f$ is Lipschitz-continuous and $D$ has simple bounds.

Shubert’s Algorithm in 1D
If we substitute $a$ and $b$ for $x'$ in the definition of Lipschitz-continuity, we get the following two conditions for $f(x)$, where $x \in [a, b]$:
\[ f(x) \ge f(a) - K(x - a), \]
\[ f(x) \ge f(b) + K(x - b). \]
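The two conditions above can be turned into a piecewise-linear ("sawtooth") lower bound that is refined one evaluation at a time. The following Python sketch is one possible reading of Shubert's method in 1D, not the author's implementation: the function name `shubert_minimize`, the stopping tolerance `tol`, and the iteration cap `max_iter` are assumptions made for illustration. It repeatedly evaluates `f` where the current lower bound is smallest, i.e. at the intersection of the two bounding lines of some subinterval.

```python
import heapq


def shubert_minimize(f, a, b, K, tol=1e-6, max_iter=10_000):
    """Sawtooth lower-bound minimization of f on [a, b] (illustrative sketch).

    K is assumed to be a valid Lipschitz constant of f on [a, b].
    Returns (x_best, f_best, lower_bound).
    """
    def tooth(xl, fl, xr, fr):
        # Intersection of the lines f(xl) - K (x - xl) and f(xr) + K (x - xr).
        x = 0.5 * (xl + xr) + (fl - fr) / (2.0 * K)
        bound = 0.5 * (fl + fr) - 0.5 * K * (xr - xl)
        return bound, x, xl, fl, xr, fr

    fa, fb = f(a), f(b)
    x_best, f_best = (a, fa) if fa <= fb else (b, fb)
    heap = [tooth(a, fa, b, fb)]  # min-heap keyed by the lower bound
    lower = heap[0][0]
    for _ in range(max_iter):
        # The subinterval with the smallest bound gives the global lower bound.
        lower, x, xl, fl, xr, fr = heapq.heappop(heap)
        if f_best - lower <= tol:
            break  # best value found is within tol of the lower bound
        fx = f(x)
        if fx < f_best:
            x_best, f_best = x, fx
        # Split the subinterval at x; each half gets its own tooth.
        heapq.heappush(heap, tooth(xl, fl, x, fx))
        heapq.heappush(heap, tooth(x, fx, xr, fr))
    return x_best, f_best, lower
```

For example, `shubert_minimize(lambda x: (x - 1.0) ** 2, -2.0, 3.0, K=6.0, tol=1e-3)` uses $K = 6$ because $|f'(x)| \le 6$ on $[-2, 3]$. A larger `K` stays valid but makes the bound looser and the search more global.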
Global vs. Local Search
The two lower bounds on $[a, b]$ intersect at $X$, where their common value is $B$:
\[ X(a, b, f, K) = \frac{a + b}{2} + \frac{f(a) - f(b)}{2K}, \]
\[ B(a, b, f, K) = \frac{f(a) + f(b)}{2} - \frac{K(b - a)}{2}. \]

Pros and Cons of Lipschitzian Optimization
Pros
+ Global search possible
+ Deterministic, no need for multiple runs
+ Few parameters apart from K, so no need for fine-tuning
+ K gives a bound on the error, no need to rely on arbitrary stopping criteria such as the number of iterations
Cons (of Shubert’s algorithm)
- Lipschitz constant has to be known
- Speed of convergence (global vs. local)
- Computational complexity in higher dimensions

Problems of Lipschitzian Optimization
Problem 1: Specifying K
K might not be easily accessible. DIRECT needs no prior knowledge and uses all possible constants. Sounds terrific, but how...
Problem 2: Convergence Speed
The parameter K is a trade-off between global and local search. By using all possible K, DIRECT balances better between global and local search.
Problem 3: Combinatorial Complexity in Higher Dimensions
Lipschitzian Optimization is initialized by evaluating the function at the corners of a hyperrectangle. We have to make $O(2^d)$ evaluations.

DIRECT in 1D
Jones, Perttunen, Stuckman (1993) “Lipschitzian Optimization Without the Lipschitz Constant”
The name DIRECT stands for DIviding RECTangles, but also captures the fact that it is a direct search technique.
Key idea: Sample the function at the center of the rectangle.

Division of Intervals
When dividing the search space we have to make sure that previous function evaluations are not lost, i.e. they are still at the center of some interval.
⇒ Instead of a bisection we do a trisection.

Lipschitz Bound
\[ f(x) \ge f(c) + K(x - c) \quad \text{for } x \le c, \]
\[ f(x) \ge f(c) - K(x - c) \quad \text{for } x \ge c. \]
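As a short worked step, the two center-based bounds above can be combined into a single lower bound for the whole interval. The derivation below assumes, as in the trisection scheme, that $c$ is the midpoint of $[a, b]$. The resulting quantity $f(c) - K(b - a)/2$ is the per-interval bound that DIRECT compares across intervals.

```latex
\begin{align*}
f(x) &\ge f(c) - K\,|x - c| && \text{for all } x \in [a, b], \\
|x - c| &\le \frac{b - a}{2} && \text{since } c = \frac{a + b}{2}, \\
\min_{x \in [a, b]} f(x) &\ge f(c) - K\,\frac{b - a}{2}.
\end{align*}
```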
Potentially Optimal Intervals
Let $S$ be the partition of $[a, b]$ into subintervals, $|S| = m$.
Definition: An interval $j \in S$ is called potentially optimal if there exists some constant $\tilde{K} \ge 0$ such that the following conditions hold,
\[ f(c_j) - \tilde{K}\,\frac{b_j - a_j}{2} \le f(c_i) - \tilde{K}\,\frac{b_i - a_i}{2} \quad \forall i \in S, \tag{1} \]
\[ f(c_j) - \tilde{K}\,\frac{b_j - a_j}{2} \le f_{\min} - \epsilon\,|f_{\min}|, \tag{2} \]
where $\epsilon \ge 0$.

Summary DIRECT in 1D
Input: $a, b \in \mathbb{R}$, $f(\cdot)$, $\epsilon \ge 0$
Output: $f_{\min}$
Initialize;
repeat
    Identify set S of potentially optimal intervals;
    for s ∈ S do
        Evaluate new center points and subdivide s;
until too many iterations;
return $f_{\min}$;

Division of Hypercubes
1. Evaluate $f$ at $c \pm \delta e_i$, where $e_i$ is the $i$-th unit vector.
2. Subdivide along directions with best function values first.
This way the largest rectangles contain the best function values.

Division of Hyperrectangles
Only divide along the set of longest sides.
Rectangles have side lengths either $3^{-k}$ or $3^{-(k+1)}$, for $k \in \mathbb{N}$.
This fact is essential for the convergence of DIRECT.
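The 1D summary and the definition of potentially optimal intervals can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the names `potentially_optimal` and `direct_1d`, the tuple layout `(center, half_width, f_center)`, and the fixed iteration budget `max_iter` (standing in for "until too many iterations") are all hypothetical. For each interval the code checks whether conditions (1) and (2) leave a non-empty range of admissible $\tilde{K} \ge 0$.

```python
import math


def potentially_optimal(intervals, f_min, eps):
    """Indices j for which some K >= 0 satisfies conditions (1) and (2).

    intervals: list of (center, half_width, f_center) tuples.
    """
    chosen = []
    for j, (_, h_j, f_j) in enumerate(intervals):
        # K >= 0 and condition (2) give a lower limit on K.
        k_lo = max(0.0, (f_j - f_min + eps * abs(f_min)) / h_j)
        k_hi = math.inf
        feasible = True
        for i, (_, h_i, f_i) in enumerate(intervals):
            if i == j:
                continue
            if math.isclose(h_i, h_j, rel_tol=1e-9):
                # Same size: condition (1) reduces to f(c_j) <= f(c_i).
                if f_j > f_i:
                    feasible = False
                    break
            elif h_i < h_j:
                # Smaller intervals push the lower limit on K up.
                k_lo = max(k_lo, (f_j - f_i) / (h_j - h_i))
            else:
                # Larger intervals cap K from above.
                k_hi = min(k_hi, (f_i - f_j) / (h_i - h_j))
        if feasible and k_lo <= k_hi:
            chosen.append(j)
    return chosen


def direct_1d(f, a, b, eps=1e-4, max_iter=30):
    """Sketch of DIRECT on [a, b]; returns (x_best, f_best)."""
    c = 0.5 * (a + b)
    intervals = [(c, 0.5 * (b - a), f(c))]
    for _ in range(max_iter):
        f_min = min(fc for _, _, fc in intervals)
        chosen = potentially_optimal(intervals, f_min, eps)
        keep = set(chosen)
        next_intervals = [iv for k, iv in enumerate(intervals) if k not in keep]
        for j in chosen:
            c, h, fc = intervals[j]
            third = h / 3.0  # half-width of each piece after trisection
            # The old center stays the center of the middle piece.
            next_intervals.append((c, third, fc))
            for new_c in (c - 2.0 * third, c + 2.0 * third):
                next_intervals.append((new_c, third, f(new_c)))
        intervals = next_intervals
    x_best, _, f_best = min(intervals, key=lambda iv: iv[2])
    return x_best, f_best
```

A call such as `direct_1d(lambda x: (x - 0.3) ** 2, 0.0, 1.0)` needs no Lipschitz constant. The pairwise check is quadratic in the number of intervals per iteration, which keeps the sketch close to the definition; practical implementations usually find the same intervals through a lower convex hull of the points (half-width, f(center)).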
Convergence of DIRECT
Theorem (Jones, Perttunen, Stuckman, 1993): DIRECT samples a dense subset of the unit cube, i.e. for any point $x$ in the unit hypercube and $\delta > 0$, DIRECT will eventually sample a point $y$ such that $\|x - y\|_2 \le \delta$.
Proof: Let $D$ be the $d$-dimensional unit hypercube. A rectangle $R$ that has been involved in $r$ divisions will have $j := r \bmod d$ sides of length $3^{-(k+1)}$ and $d - j$ sides of length $3^{-k}$, where $k = (r - j)/d$. The radius of $R$ is therefore
\[ \frac{\sqrt{j\,3^{-2(k+1)} + (d - j)\,3^{-2k}}}{2}, \]
which goes to zero as $r$ approaches infinity. Let $t \in \mathbb{N}$ be the current iteration, and $r_t \in \mathbb{N}$ the fewest number of divisions undergone by any rectangle.

Convergence of DIRECT
Proof (cont’d): Claim: $\lim_{t \to \infty} r_t = \infty$. Assume otherwise: there exists $t'$ after which $r_t$ never changes, i.e. $\lim_{t \to \infty} r_t = r_{t'}$. After iteration $t'$ there will be a finite number of rectangles (say $N$) of maximal size. The one with the lowest function value will be potentially optimal, and therefore subdivided. This only leaves $N - 1$ maximal rectangles. After $N - 1$ further iterations $r_t$ has increased by 1, which contradicts the assumption.
Corollary (Jones, Perttunen, Stuckman, 1993): If the function $f$ is continuous in a neighborhood of a global minimizer, with $f^* := \min_{x \in D} f(x)$, then DIRECT converges to $f^*$.
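To make the first step of the proof concrete, the radius formula can be evaluated numerically. The helper below is only an illustration; the name `rectangle_radius` is an assumption, and the function simply transcribes the expression for a rectangle of the unit hypercube that has been involved in `r` divisions.

```python
import math


def rectangle_radius(r, d):
    """Center-to-vertex distance after r divisions of the unit cube in R^d."""
    j = r % d          # number of sides of length 3^-(k+1)
    k = (r - j) // d   # completed rounds of division along all d sides
    return math.sqrt(j * 3.0 ** (-2 * (k + 1)) + (d - j) * 3.0 ** (-2 * k)) / 2.0


if __name__ == "__main__":
    # Print the radius for the first few division counts in three dimensions.
    for r in range(0, 13, 3):
        print(r, rectangle_radius(r, 3))
```

Every additional division replaces one side of length $3^{-k}$ by one of length $3^{-(k+1)}$, so the radius shrinks strictly with `r`. This is why $r_t \to \infty$ is enough for the sampled points to become dense.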
Convergence of DIRECT
Finkel, Kelley (2004) “Convergence Analysis of the DIRECT Algorithm”, application of nonsmooth analysis.
Definition: The generalized directional derivative of $f$ at $x \in D$ in direction $v$ is
\[ f^{\circ}(x, v) := \limsup_{\substack{y \to x,\ y \in D \\ t \downarrow 0,\ y + tv \in D}} \frac{f(y + tv) - f(y)}{t}. \]
Theorem (Finkel, Kelley, 2004): If $f$ is Lipschitz-continuous on $D$ and $x^*$ is a cluster point of the sequence of DIRECT’s “best points”, then $f^{\circ}(x^*, v) \ge 0$.

Speed Up for Easy Functions
Gablonsky, Kelley (2001) “A Locally-Biased Form of The DIRECT Algorithm”
If there are only a few local minima, DIRECT spends a lot of unnecessary time exploring unvisited territory.
Idea: Group rectangles by $L^\infty$ norm, i.e. by their longest side and not their diameter. This leads to a reduction in the number of groups, especially among the large and unimportant rectangles.
There is little theoretical, but at least some experimental, evidence that this scheme works.

Additive Scaling Problem
Finkel, Kelley (2006) “Additive Scaling and the DIRECT Algorithm”
Theorem (Finkel, Kelley, 2006): Let $R$ be a hypercube sampled by DIRECT, and $\alpha(R)$ its size. Suppose that $R$ is in the set of the smallest rectangles and that $f(c(R)) = f_{\min}$. If
\[ \epsilon > \frac{2\,\alpha(R)\,K}{|f(c(R))|\,\bigl(\sqrt{d + 8} - \sqrt{d}\bigr)}, \]
then $R$ will not be subdivided until all rectangles in its neighborhood are of the same size as $R$.

Additive Scaling Problem
Theorem (Finkel, Kelley, 2006): Let $R$ be a hypercube sampled by DIRECT. Suppose there exists a hyperrectangle $T$ such that $\alpha(T) > \alpha(R)$. If
\[ f_{\min} > \frac{K \sqrt{d}}{\epsilon\,\bigl(\sqrt{1 + 8/d} - 1\bigr)}, \]
then $R$ will not be subdivided.
The authors propose a variant of the definition of potential optimality,
\[ f(c_R) - \tilde{K}\,\alpha(R) \le f_{\min} - \epsilon\,|f_{\min} - f_{\mathrm{median}}|. \]
Experimental results indicate that the modified DIRECT is stable under additive scaling.

Aircraft Routing
Bartholomew-Biggs, Parkhurst, Wilson (2002) “Using DIRECT to Solve an Aircraft Routing Problem”
Two-dimensional (Euclidean) shortest path problem subject to obstacle regions, rendezvous time, speed and maneuverability constraints, visibility, etc.
Optimization of a multivariate non-differentiable function. Therefore, DIRECT seems to be a good approach.
Waypoints are the variables of the function to optimize.
Restart slightly improves performance.

Component Design
Zhu, Bogy (2002) “DIRECT Algorithm and Its Application to Slider Air-Bearing Surface Optimization”, design of hard-drive heads.
Shape of the magnetic head is crucial to movement. Height above the disk is referred to as “flying” height.
DIRECT outperforms Simulated Annealing: better convergence rate and better result.

Last Slide
Open semester thesis: “Global Function Optimization for Aircraft Routing”, implementation of DIRECT in C++.
http://www.ti.inf.ethz.ch/ew/teaching/dst.html
[email protected]
Thank you for your attention.