Lipschitzian Optimization, DIRECT Algorithm, and Applications

Yves Brise
April 1, 2008
Outline
1. Lipschitzian Optimization
2. DIRECT Algorithm
3. Applications
Function Optimization
Problem
For a function $f : D \subseteq \mathbb{R}^d \to \mathbb{R}$, find
$$\min_{x \in D} f(x).$$
Simple Bounds
Mostly we will assume $l_i \le x_i \le u_i$ for all $i \in [d]$, i.e. every variable $x_i$
has some lower bound $l_i$ and some upper bound $u_i$. This means $D$ is
a hyperrectangle.
Taxonomy of Methods
Lipschitzian Optimization
Shubert (1972)
“A Sequential Method Seeking the Global Maximum of a Function”
Definition
A function $f : D \subseteq \mathbb{R}^d \to \mathbb{R}$ is called Lipschitz-continuous if there
exists a positive constant $K \in \mathbb{R}^+$ such that
$$|f(x) - f(x')| \le K\,|x - x'|, \quad \forall x, x' \in D.$$
Problem
We consider the following minimization problem
$$\min_{x \in D} f(x),$$
where $f$ is Lipschitz-continuous and $D$ is given by simple bounds.
Shubert’s Algorithm in 1D
If we substitute $a$ and $b$ for $x'$ into the definition of Lipschitz-continuity
we get the following two conditions for $f(x)$, where $x \in [a, b]$,
$$f(x) \ge f(a) - K(x - a),$$
$$f(x) \ge f(b) + K(x - b).$$
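The two conditions above can be checked numerically. The following is a minimal sketch, assuming the illustrative function $f(x) = x^2$ on $[-1, 2]$ with $K = 4$ (neither is taken from the slides); the helper name `lower_bound` is hypothetical.

```python
def lower_bound(x, a, b, fa, fb, K):
    """Larger of the two Lipschitz lower bounds on f(x) for a <= x <= b."""
    from_a = fa - K * (x - a)   # line of slope -K through (a, f(a))
    from_b = fb + K * (x - b)   # line of slope +K through (b, f(b))
    return max(from_a, from_b)

# Assumed example: f(x) = x^2 on [-1, 2]; |f'(x)| <= 4 there, so K = 4 is valid.
f = lambda x: x * x
a, b, K = -1.0, 2.0, 4.0

# The bound never exceeds f on a grid of sample points.
samples = [a + i * (b - a) / 100 for i in range(101)]
assert all(lower_bound(x, a, b, f(a), f(b), K) <= f(x) + 1e-12 for x in samples)
```

The bound is tight at both endpoints and forms a "V" in between, which is what Shubert's algorithm refines.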
Global vs. Local Search
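The two lower bounds intersect at $X(a, b, f, K)$, where the combined bound attains its minimum value $B(a, b, f, K)$:
$$X(a, b, f, K) = \frac{a + b}{2} + \frac{f(a) - f(b)}{2K},$$
$$B(a, b, f, K) = \frac{f(a) + f(b)}{2} - \frac{K(b - a)}{2}.$$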
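To see how $X$ and $B$ drive the search, here is a minimal sketch of a Shubert-style loop in 1D. It assumes $K$ is a valid Lipschitz constant; the function name `shubert_minimize`, the tolerance-based stopping rule, and the test function are illustrative assumptions rather than details from the slides.

```python
import heapq

def shubert_minimize(f, a, b, K, tol=1e-4, max_evals=100_000):
    """Returns (best_x, best_f, lower_bound) for min f on [a, b]."""
    fa, fb = f(a), f(b)
    best_x, best_f = (a, fa) if fa <= fb else (b, fb)
    evals = 2

    def entry(l, fl, r, fr):
        x = (l + r) / 2 + (fl - fr) / (2 * K)      # X(l, r, f, K)
        bound = (fl + fr) / 2 - K * (r - l) / 2    # B(l, r, f, K)
        return (bound, x, l, fl, r, fr)

    # Min-heap keyed on B: the interval with the lowest bound is refined next.
    heap = [entry(a, fa, b, fb)]
    while evals < max_evals:
        bound, x, l, fl, r, fr = heapq.heappop(heap)
        if best_f - bound <= tol:                  # K yields an error bound
            return best_x, best_f, bound
        fx = f(x)
        evals += 1
        if fx < best_f:
            best_x, best_f = x, fx
        heapq.heappush(heap, entry(l, fl, x, fx))
        heapq.heappush(heap, entry(x, fx, r, fr))
    return best_x, best_f, heap[0][0]

# Assumed example: f(x) = (x - 1)^2 on [-1, 3], where |f'| <= 4 < K = 5.
x_best, f_best, lb = shubert_minimize(lambda x: (x - 1) ** 2, -1.0, 3.0, K=5.0)
```

A larger $K$ lowers every $B$ in proportion to the interval width, so wide intervals are preferred (global search); a smaller $K$ favours intervals with low endpoint values (local search).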
Pros and Cons of Lipschitzian Optimization
Pros
+ Global search possible
+ Deterministic, no need for multiple runs
+ Few parameters apart from K, so no need for fine-tuning
+ K gives a bound on the error, so there is no need to rely on arbitrary
stopping criteria such as the number of iterations
Cons (of Shubert’s algorithm)
- Lipschitz constant has to be known
- Speed of convergence (global vs. local)
- Computational complexity in higher dimensions
Problems of Lipschitzian Optimization
Problem 1: Specifying K
K might not be easily accessible. DIRECT needs no prior knowledge
and uses all possible constants. Sounds terrific, but how?
Problem 2: Convergence Speed
The parameter K is a trade-off between global and local search. By
using all possible K, DIRECT balances better between global and
local search.
Problem 3: Combinatorial Complexity in Higher Dimensions
Lipschitzian Optimization is initialized by evaluating the function at the
corners of a hyperrectangle. We have to make $O(2^d)$ evaluations.
DIRECT in 1D
Jones, Perttunen, Stuckman (1993)
“Lipschitzian Optimization Without the Lipschitz Constant”
The name DIRECT stands for DIviding RECTangles, but also
captures the fact that it is a direct search technique.
Key idea: Sample the function at center of rectangle.
Division of Intervals
When dividing the search space we have to make sure that previous
function evaluations are not lost, i.e. they are still at the center of
some interval.
⇒ Instead of a bisection we do a trisection.
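A small check of why trisection is used: the middle third shares its center with the parent interval, so the existing function evaluation is reused. This is a sketch with hypothetical helper names `trisect` and `center`.

```python
def trisect(a, b):
    """Split [a, b] into three equal subintervals, as (left, right) pairs."""
    third = (b - a) / 3
    return [(a, a + third), (a + third, b - third), (b - third, b)]

def center(interval):
    lo, hi = interval
    return (lo + hi) / 2

parent = (0.0, 1.0)
children = trisect(*parent)

# The middle child keeps the parent's center, so f(center) is not lost.
assert abs(center(children[1]) - center(parent)) < 1e-12

# With a bisection, neither half is centered at the parent's center.
halves = [(0.0, 0.5), (0.5, 1.0)]
assert all(center(h) != center(parent) for h in halves)
```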
Lipschitz Bound
$$f(x) \ge f(c) + K(x - c) \quad \text{for } x \le c,$$
$$f(x) \ge f(c) - K(x - c) \quad \text{for } x \ge c.$$
Potentially Optimal Intervals
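Let $S$ be the partition of $[a, b]$ into subintervals, $|S| = m$.
Definition
An interval $j \in S$ with endpoints $a_j, b_j$ and center $c_j$ is called potentially optimal if there exists some
constant $\tilde{K} \ge 0$ such that the following conditions hold,
$$f(c_j) - \tilde{K}\,\frac{b_j - a_j}{2} \le f(c_i) - \tilde{K}\,\frac{b_i - a_i}{2} \quad \forall i \in S, \tag{1}$$
$$f(c_j) - \tilde{K}\,\frac{b_j - a_j}{2} \le f_{\min} - \epsilon\,|f_{\min}|, \tag{2}$$
where $\epsilon \ge 0$ and $f_{\min}$ is the lowest function value found so far.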
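Conditions (1) and (2) are linear in $\tilde{K}$, so for each interval they reduce to a range of admissible constants. The sketch below checks that range by brute force in $O(m^2)$; the function name `potentially_optimal`, the tuple layout, and the sample data are assumptions made for illustration (practical implementations usually use a convex-hull scan instead).

```python
import math

def potentially_optimal(intervals, eps=1e-4):
    """intervals: list of (a_i, b_i, f(c_i)). Returns indices satisfying (1) and (2)."""
    half = [(b - a) / 2 for a, b, _ in intervals]
    fc = [f for _, _, f in intervals]
    fmin = min(fc)
    result = []
    for j in range(len(intervals)):
        # Condition (2) together with K >= 0 gives a lower bound on K.
        k_low = max(0.0, (fc[j] - fmin + eps * abs(fmin)) / half[j])
        k_high = math.inf
        feasible = True
        for i in range(len(intervals)):
            dd = half[j] - half[i]
            df = fc[j] - fc[i]
            if dd > 0:        # i is smaller than j: (1) requires K >= df / dd
                k_low = max(k_low, df / dd)
            elif dd < 0:      # i is larger than j: (1) requires K <= df / dd
                k_high = min(k_high, df / dd)
            elif df > 0:      # same size, but i has a better center value
                feasible = False
        if feasible and k_low <= k_high:
            result.append(j)
    return result

# Hypothetical partition of [0, 9] with made-up center values.
intervals = [(0, 3, 1.0), (3, 4, 0.5), (4, 5, 0.2), (5, 6, 0.6), (6, 9, 1.5)]
chosen = potentially_optimal(intervals)
```

In this example the best large interval (index 0) and the best small interval (index 2) are selected: one for a large $\tilde{K}$ (global search), one for a small $\tilde{K}$ (local search).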
Summary
DIRECT in 1D
Input: $a, b \in \mathbb{R}$, $f(\cdot)$, $\epsilon \ge 0$
Output: $f_{\min}$
Initialize;
repeat
    Identify set $S$ of potentially optimal intervals;
    for $s \in S$ do
        Evaluate new center points and subdivide $s$;
until too many iterations;
return $f_{\min}$;
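The summary above can be turned into a short runnable sketch. It assumes a brute-force test for potential optimality, a fixed iteration budget as the stopping rule, and the hypothetical name `direct_1d`; it is meant as an illustration, not as the reference implementation.

```python
import math

def direct_1d(f, a, b, eps=1e-4, max_iters=20):
    """Sketch of DIRECT in 1D; returns (x_best, f_min)."""
    # Each interval is stored as (center, half_width, f(center)).
    c0 = (a + b) / 2
    intervals = [(c0, (b - a) / 2, f(c0))]
    for _ in range(max_iters):
        fmin = min(fc for _, _, fc in intervals)
        chosen = []
        for j, (_, h_j, f_j) in enumerate(intervals):
            # Range of constants K for which interval j is potentially optimal.
            k_low = max(0.0, (f_j - fmin + eps * abs(fmin)) / h_j)
            k_high, ok = math.inf, True
            for _, h_i, f_i in intervals:
                if h_i < h_j * (1 - 1e-9):       # smaller interval
                    k_low = max(k_low, (f_j - f_i) / (h_j - h_i))
                elif h_i > h_j * (1 + 1e-9):     # larger interval
                    k_high = min(k_high, (f_i - f_j) / (h_i - h_j))
                elif f_i < f_j:                  # same size, better value
                    ok = False
            if ok and k_low <= k_high:
                chosen.append(j)
        # Trisect the chosen intervals; pop from the back to keep indices valid.
        for j in sorted(chosen, reverse=True):
            c, h, fc = intervals.pop(j)
            third = h / 3
            intervals.append((c - 2 * third, third, f(c - 2 * third)))
            intervals.append((c, third, fc))     # old center is reused
            intervals.append((c + 2 * third, third, f(c + 2 * third)))
    f_best, x_best = min((fc, c) for c, _, fc in intervals)
    return x_best, f_best

# Assumed example: f(x) = (x - 0.3)^2 on [0, 1].
x_best, f_best = direct_1d(lambda x: (x - 0.3) ** 2, 0.0, 1.0)
```

Note that no Lipschitz constant is passed in: the only parameters are the bounds, $\epsilon$, and the iteration budget.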
Division of Hypercubes
1. Evaluate $f$ at $c \pm \delta e_i$, where $e_i$ is the $i$-th unit vector.
2. Subdivide along directions with best function values first.
This way the largest rectangles contain the best function values.
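The two steps can be sketched as follows, assuming $\delta$ is one third of the side length and that directions are ranked by the better of their two sampled values. The function name `divide_hypercube` and the test function are hypothetical.

```python
def divide_hypercube(f, center, side):
    """Return (center, side_lengths, f_value) for the 2d + 1 resulting rectangles."""
    d = len(center)
    delta = side / 3
    # Step 1: sample c + delta*e_i and c - delta*e_i for every dimension i.
    samples = {}
    for i in range(d):
        for sign in (+1, -1):
            p = list(center)
            p[i] += sign * delta
            samples[(i, sign)] = (tuple(p), f(p))
    # Step 2: order dimensions by the best value seen along each of them.
    order = sorted(range(d),
                   key=lambda i: min(samples[(i, 1)][1], samples[(i, -1)][1]))
    sides = [side] * d
    rects = []
    for i in order:
        sides[i] = delta                 # trisect the remaining center box along i
        for sign in (+1, -1):
            p, fp = samples[(i, sign)]
            rects.append((p, tuple(sides), fp))
    # In DIRECT f(center) is already known; it is re-evaluated here for brevity.
    rects.append((tuple(center), tuple(sides), f(center)))
    return rects

# Assumed example in 2D: the best sample lies along dimension 0.
g = lambda x: (x[0] - 0.9) ** 2 + (x[1] - 0.5) ** 2
rects = divide_hypercube(g, (0.5, 0.5), 1.0)
```

Here the first two rectangles returned, those split off along the best direction, keep full length in the other dimension and are therefore the largest.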
Division of Hyperrectangles
Only divide along the set of longest sides.
Rectangles have side lengths either $3^{-k}$ or $3^{-(k+1)}$, for $k \in \mathbb{N}$.
This fact is essential for the convergence of DIRECT.
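The side-length invariant can be observed in a tiny simulation. This sketch tracks each side as an integer exponent $k$ (length $3^{-k}$) and trisects one longest side at a time; the helper name `divide_longest` and the choice $d = 3$ are assumptions.

```python
def divide_longest(exponents):
    """exponents[n] = k means side n has length 3**-k. Trisect one longest side."""
    i = min(range(len(exponents)), key=lambda n: exponents[n])
    new = list(exponents)
    new[i] += 1
    return new

exps = [0, 0, 0]          # unit cube, d = 3
history = []
for r in range(1, 8):
    exps = divide_longest(exps)
    history.append(sorted(exps))
    # At most two distinct side lengths, 3**-k and 3**-(k+1), ever occur.
    assert max(exps) - min(exps) <= 1
```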
Convergence of DIRECT
Theorem (Jones, Perttunen, Stuckman, 1993)
DIRECT samples a dense subset of the unit cube, i.e. for any point $x$
in the unit hypercube and any $\delta > 0$, DIRECT will eventually sample a
point $y$ such that $\|x - y\|_2 \le \delta$.
Proof
Let $D$ be the $d$-dimensional unit hypercube.
A rectangle $R$ that has been involved in $r$ divisions will have
$j := r \bmod d$ sides of length $3^{-(k+1)}$ and $d - j$ sides of length
$3^{-k}$, where $k = (r - j)/d$.
The radius of $R$ is therefore $\sqrt{j\,3^{-2(k+1)} + (d - j)\,3^{-2k}}\,/\,2$, which
goes to zero as $r$ approaches infinity.
Let $t \in \mathbb{N}$ be the current iteration, and $r_t \in \mathbb{N}$ the fewest number
of divisions undergone by any rectangle.
Convergence of DIRECT
Proof (cont’d)
Claim: $\lim_{t \to \infty} r_t = \infty$.
Assume otherwise: there exists $t'$ after which $r_t$ never changes, i.e.
$\lim_{t \to \infty} r_t = r_{t'}$.
After iteration $t'$ there will be a finite number of rectangles (say
$N$) of maximal size. The one with the lowest function value will be
potentially optimal, and therefore subdivided.
This only leaves $N - 1$ maximal rectangles. After $N - 1$ further iterations
$r_t$ has increased by 1, a contradiction.
Corollary (Jones, Perttunen, Stuckman, 1993)
If the function $f$ is continuous in a neighborhood of a global minimizer, with $f^* := \min_{x \in D} f(x)$,
then DIRECT converges to $f^*$.
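The radius expression used in the proof above can be evaluated directly to see it shrink with the number of divisions $r$. This is a small sketch; the function name `radius` and the choice $d = 3$ are illustrative assumptions.

```python
import math

def radius(r, d):
    """Half the diagonal of a rectangle in the unit d-cube after r divisions."""
    k, j = divmod(r, d)          # j = r mod d, k = (r - j) / d
    return math.sqrt(j * 3.0 ** (-2 * (k + 1)) + (d - j) * 3.0 ** (-2 * k)) / 2

d = 3
radii = [radius(r, d) for r in range(31)]

# The radius decreases with every division and tends to zero.
assert all(x > y for x, y in zip(radii, radii[1:]))
```

Since $r_t \to \infty$, every rectangle eventually has a radius below any given $\delta$, which gives the density claim.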
Convergence of DIRECT
Finkel, Kelley (2004)
“Convergence Analysis of the DIRECT Algorithm”, application of
nonsmooth analysis.
Definition
The generalized directional derivative of $f$ at $x \in D$ in direction $v$ is
$$f^0(x, v) := \limsup_{\substack{y \to x,\ y \in D \\ t \downarrow 0,\ y + tv \in D}} \frac{f(y + tv) - f(y)}{t}.$$
Theorem (Finkel, Kelley, 2004)
If $f$ is Lipschitz-continuous on $D$ and $x^*$ is a cluster point of the
sequence of DIRECT’s “best points”, then $f^0(x^*, v) \ge 0$.
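As a worked example of the definition, consider the assumed test function $f(x) = |x|$ on $D = [-1, 1]$ (not taken from the slides), which is Lipschitz-continuous but not differentiable at its minimizer $x^* = 0$:

```latex
% Upper bound by the reverse triangle inequality, for any y and t > 0:
\frac{|y + tv| - |y|}{t} \;\le\; \frac{|tv|}{t} \;=\; |v| .
% The bound is attained along y = tv, t \downarrow 0 (then y \to 0 and y + tv = 2tv \in D):
\frac{|2tv| - |tv|}{t} \;=\; |v| .
% Hence
f^0(0, v) \;=\; |v| \;\ge\; 0 \qquad \text{for every direction } v .
```

So the necessary condition of the theorem holds at $x^* = 0$ even though the ordinary derivative does not exist there.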
Speed Up for Easy Functions
Gablonsky, Kelley (2001)
“A Locally-Biased Form of The DIRECT Algorithm”
If there are only a few local minima, DIRECT uses a lot of
unnecessary time exploring unvisited territory.
Idea: Group rectangles by L∞ norm, i.e. by their longest side
and not their diameter.
This reduces the number of groups, especially among the
large and unimportant rectangles.
There is little theoretical, but at least some experimental, evidence that
this scheme works.
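The reduction in the number of groups can be illustrated by counting distinct size values under both measures. The sketch assumes rectangles in the unit cube with the side-length pattern from the convergence proof, and $d = 4$; the helper name `sides` is hypothetical.

```python
import math

def sides(r, d):
    """Side lengths of a rectangle in the unit d-cube after r divisions."""
    k, j = divmod(r, d)
    return [3.0 ** -(k + 1)] * j + [3.0 ** -k] * (d - j)

d = 4
rects = [sides(r, d) for r in range(12)]

# Grouping by diameter (Euclidean norm): every r gives its own group.
by_diameter = {round(math.sqrt(sum(s * s for s in rect)), 12) for rect in rects}
# Grouping by longest side: d consecutive values of r share a group.
by_longest = {max(rect) for rect in rects}
```

With fewer groups there are fewer candidates for potential optimality in each iteration, which biases the search towards local refinement.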
Additive Scaling Problem
Finkel, Kelley (2006)
“Additive Scaling and the DIRECT Algorithm”
Theorem (Finkel, Kelley, 2006)
Let $R$ be a hypercube sampled by DIRECT, and $\alpha(R)$ its size.
Suppose that $R$ is in the set of the smallest rectangles and that
$f(c(R)) = f_{\min}$. If
$$\epsilon > \frac{2\alpha(R)K}{|f(c(R))|\,\bigl(\sqrt{d + 8} - \sqrt{d}\bigr)},$$
then $R$ will not be subdivided until all rectangles in its neighborhood
are of the same size as $R$.
Theorem (Finkel, Kelley, 2006)
Let $R$ be a hypercube sampled by DIRECT. Suppose there exists a
hyperrectangle $T$ such that $\alpha(T) > \alpha(R)$. If
$$f_{\min} > \frac{K\sqrt{d}}{\epsilon\,\bigl(\sqrt{1 + 8/d} - 1\bigr)},$$
then $R$ will not be subdivided.
The authors propose a variant of the definition of potential
optimality,
$$f(c_R) - \tilde{K}\,\alpha(R) \le f_{\min} - \epsilon\,|f_{\min} - f_{\text{median}}|.$$
Experimental results indicate that the modified DIRECT is stable under
additive scaling.
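The effect of an additive shift on the two right-hand sides can be seen in a few lines. This sketch assumes $f_{\text{median}}$ is the median of the function values sampled so far and uses made-up values; the helper names are hypothetical.

```python
import statistics

def original_slack(values, eps):
    """eps * |f_min|, the term from the original definition."""
    return eps * abs(min(values))

def modified_slack(values, eps):
    """eps * |f_min - f_median|, the term from the proposed variant."""
    return eps * abs(min(values) - statistics.median(values))

values = [0.2, 0.5, 0.6, 1.0, 1.5]        # hypothetical center values
shifted = [v + 1000.0 for v in values]    # same function plus a constant
eps = 1e-4

# The original term grows with the shift, the modified one does not.
print(original_slack(values, eps), original_slack(shifted, eps))
print(modified_slack(values, eps), modified_slack(shifted, eps))
```

A large $\epsilon |f_{\min}|$ makes condition (2) hard to satisfy for small rectangles, which is the stalling behaviour described by the two theorems.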
Aircraft Routing
Bartholomew-Biggs, Parkhurst, Wilson (2002)
“Using DIRECT to Solve an Aircraft Routing Problem”
Two-dimensional (Euclidean) shortest path problem subject to
obstacle regions, rendezvous time, speed and maneuverability
constraints, visibility, etc.
Optimization of a multivariate non-differentiable function.
Therefore, DIRECT seems to be a good approach.
Waypoints are the variables of the function to optimize.
Restarting slightly improves performance.
Component Design
Zhu, Bogy (2002)
“DIRECT Algorithm and Its Application to Slider Air-Bearing Surface
Optimization”, design of hard-drive heads.
Shape of the magnetic head is crucial to movement.
Height above the disk is referred to as “flying” height.
DIRECT outperforms Simulated Annealing. Better convergence rate and better result.
Last Slide
Open semester thesis:
“Global Function Optimization for Aircraft Routing”, Implementation of
DIRECT in C++.
http://www.ti.inf.ethz.ch/ew/teaching/dst.html
[email protected]
Thank you for your attention.