Assignment 1, 2nd ed.
Umeå University
Department of Computing Science
Fall 2014
5DA001
Assignment 1
Non-linear least squares
v. 2 2014-11-18
5DA001 — Non-linear optimization
The deadline for this assignment can be found at:
http://www8.cs.umu.se/kurser/5DA001/HT14/timetable.html
(Link Planning and Readings on the course homepage.)
• The submission should consist of:
– The complete report, including a front page with the following information:
1. Your name (and of your colleague if you work in pairs).
2. The course name.
3. The assignment number.
4. Your username(s) at the Department of Computing Science.
5. The version of the submission (in case of re-submissions).
– An appendix with the source code.
• To simplify feedback, the report, except appendices, must have page
numbers and each section should be numbered.
• If you submit a report with linked references (e.g. written in LaTeX), please
verify that the references are correct and not “figure ??”.
• It should be possible to understand your report without knowing
the specification in detail. Thus, it is recommended you start
your report with a short summary of the specification.
• Your report should be submitted as a pdf file uploaded via the
https://www8.cs.umu.se/~labres/py/handin.cgi page, also available via the
results link at the bottom left of the course home page.
• Furthermore, the source code should be available in a folder called edu/5da001/assN
in your home folder, where N is the assignment number. You will probably
have to create the folder yourself.
• The submitted code should be Matlab-compatible. If you develop your
code in Octave, test your code in Matlab before submitting it!
• Auxiliary code and data needed for this assignment will be placed at
http://www8.cs.umu.se/kurser/5DA001/HT14/assignment1/.
1 Introduction
This assignment involves implementing two or more non-linear optimization
algorithms and applying them to two or more application problems.
1.1 Algorithms
The optimization algorithms are
GN-A Gauss-Newton (Section 3.1.1) with Armijo line search (Section 3.1.2).
GN-W Gauss-Newton (Section 3.1.1) with Wolfe line search (Section 3.1.3).
LM-P Levenberg-Marquardt with Powell dogleg (Section 3.3).
BFGS-W BFGS (Section 3.2) with Wolfe line search (Section 3.1.3).
1.2 Application problems
The application problems are
ELLIPSE Fit an ellipse to a number of measured points. The ellipse problem
is described in Section 5.1.
HOMOGRAPHY Estimate the homography between two planar projections
of an object. Described in Section 5.2.
RELORIENT Relative orientation of two cameras. Described in Section 5.3.
Code for the RELORIENT problem is given.
2 Task
You may work alone or in pairs. There is a baseline task if you work alone and
additional work (one more algorithm or test problem) if you work in pairs. The
baseline task contains an implementation and an investigation part:
2.1 Implementation
• Implement the GN-A algorithm.
• Implement one additional algorithm of your choice.
• Implement either the ELLIPSE or the HOMOGRAPHY test problems.
This includes:
– Code for your model function.
– Code to plot the model corresponding to a parameter vector x, i.e.
∗ for the ellipse problem, plot both the ellipse points and an illustration of the ellipse,
∗ for the homography problem, plot both the transformed points
and an illustration of the homography.
This plotting function is central as a visual feedback to you (and me).
– Code for the residual/jacobian function.
– Code to find initial values from observations only.
2.2 Basic questions
For each test problem, answer the following questions:
• How many parameters n does the problem have, as a function of the
number of points k?
• How many parameters n0 are global, i.e. do not depend on the number
of points? In other words, what is n for k = 0?
• How many observations (elements of the residual vector) does each point
generate?
• What is the minimum number of points needed in order to obtain a unique
solution? That is, for what k is the total number of observations m equal
to the number of parameters n?
• The redundancy is defined as
r = m − n.
How many points are needed to have a redundancy r ≥ n0 ?
2.3 Investigation
2.3.1 Optimization
• Show that your code returns after a maximum of one iteration if x0 is a
minimizer for your test problem.
• Show the iteration trace for a nice test problem and nice starting approximation.
• Show the iteration trace for a difficult test problem. In particular, show
how the damping (linesearch or trust-region) works.
• Construct an example where the solution gives problems and show how
the damping struggles to solve the problem (and possibly fails). Hint:
Construct a degenerate problem.
• Construct an example where the solution is OK but the starting approximation gives problems. Hint: Pick a starting approximation that corresponds to a degenerate problem.
• Show the iteration trace for the RELORIENT test problem with the supplied starting approximation.
2.3.2 Analysis
• For the given test data for your problem(s), answer the following questions:
– What is the estimated coordinate measurement error, also known as
“standard deviation of unit weight” σ0 ?
– What are the estimated standard deviations of the global parameters?
– Which two parameters have the highest (by absolute value) correlation value? How high is it?
– Which point i has the highest redundancy number ri ?
– Which point j has the lowest redundancy number rj ?
– Plot the solutions for the following data sets:
∗ All points.
∗ All points except point i.
∗ All points except point j.
Did the solution change more when point i or j was removed? Is this
consistent with the redundancy numbers?
• For the ellipse problem, pick one of the ellipses given in Appendix B as
your given data.
2.4 Code compliance
2.4.1 Test problems
Your residual/Jacobian function should have the following parameter list:
function [r,J,JJ]=function_name(x,...)
where x is a column vector with the parameters and r is the residual vector, J is
the Jacobian, and JJ is a numeric approximation of the Jacobian. The ellipses
(...) indicate that other parameters may be necessary. See the example code in
Appendix A.
2.4.2 Optimization methods
Your optimization methods should all have the following parameter list
function [x,code,n,...]=method_name(fun,x0,convTol,maxIter,params,...)
where fun is a function handle to or a string with the name of your residual
function, x0 is a column vector with the starting approximation, convTol is the
convergence tolerance (scalar), and maxIter (scalar) is the maximum number
of iterations allowed. The minimizer (if found) is returned in the column vector
x. The status of the optimization is returned in code, where 0 indicates convergence, and -1 failure to converge in the allowed number of iterations. Other
failure codes are allowed. The number of iterations needed is returned in the
scalar n.
The cell array params contains any extra parameters to send to the residual
function. For the example in Appendix A, params={t,y}. The calling sequence
in your optimization code is r=feval(fun,x,params{:});.¹
It is furthermore strongly suggested that the iterates xi are returned as
columns of a matrix X. This will enable the optimization code to be kept simple
while allowing for later “playback” (plotting, printing) to analyze the iteration
sequence.
2.4.3 Result scripts
Any result that you refer to in your report should have a corresponding script
(i.e. a file with Matlab code that can be executed without any parameters) that is
referred to in the report. For instance, the test that your GN-A algorithm detects
that a given x0 is a minimizer might be called gna_verify_x0_is_minimizer or
test_4_1 (if the test is presented in Section 4.1 of your report).
2.4.4 Repeatability
It is important that any test that uses random errors should be repeatable. This
can be achieved by resetting the random number generator via the command
rng(n), where n is some integer, before calling randn to generate the random
numbers. You should probably test that your code works with different values
of n as well.
2.5 Hints
• Do not forget the 1/2 in the least squares linesearch algorithm!
• How do you expect the damping to behave, far from and near the solution,
respectively, for a nice problem? For instance, do you expect a small (α < 1)
or large (α = 1) step length near the solution?
• One way to generate a suitably difficult test case is to:
1. Pick an x∗ that is close to a degenerate solution.
¹You may choose to “hide” the extra parameters via a function declaration like
fun=@(x)ellipse(x)-d; before calling the optimization method, but your optimization code
should still be able to handle functions that do take extra parameters.
2. Generate k points pi that satisfy your model exactly (points on the
ellipse or generated from the homography),
3. Generate simulated measurement points qi by adding random errors.
The code
e=sigma*randn(size(p)); q=p+e;
will add independent, normally distributed errors with standard deviation sigma to each of your measurements. Furthermore, removing
the projection of e into the range space of J(x∗ ) will maintain the
same x∗ (for sanity checks).
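As a concrete illustration of these hints, the following script generates repeatable noisy measurements and, for sanity checks, a variant where the noise component in the range space of J(x∗) has been removed. The point set p and the Jacobian J below are placeholder example data, not part of the assignment.

```matlab
% Sketch: repeatable simulated measurements, plus removal of the noise
% component lying in the range space of J(xstar). p and J are
% placeholder example data standing in for your model points and the
% Jacobian at the exact solution xstar.
rng(1);                          % reset the generator: repeatable noise
p=[0 1 2; 0 1 4];                % exact model points (example only)
J=randn(numel(p),3);             % placeholder Jacobian at xstar
sigma=0.01;
e=sigma*randn(size(p));          % independent, normally distributed errors
q=p+e;                           % simulated measurements
% Sanity-check variant: subtract the orthogonal projection of e onto
% range(J) so that xstar remains the exact minimizer.
ev=e(:);                         % stack the errors as a vector
e0=ev-J*(J\ev);                  % e0 is orthogonal to range(J)
q0=p+reshape(e0,size(p));        % measurements with xstar preserved
```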
3 Optimization algorithms
3.1 Gauss-Newton
3.1.1 The Gauss-Newton method
Implement the Gauss-Newton minimization method for unconstrained non-linear
least squares problems. Use ‖Jp‖ ≤ ε(1 + ‖r‖) as the convergence criterion,
where ε is the convergence tolerance. See Section 4 for suggested constant values.
Suggested additional input parameters:
• Parameters (constants) needed by the line search.
Suggested additional output parameters:
• A matrix with all iterates xi as columns. Useful for analyzing the algorithm.
• A vector with all step lengths αi . Useful for analyzing the line search.
• The error code -2 could be used to indicate that the line search could not
find a suitable step length.
3.1.2 Armijo line search with backtracking
Implement a line search algorithm from a point xk along a search direction pk
that uses the Armijo condition with backtracking α = 1, 1/2, 1/4, . . ..
Suggested input parameters:
• The name of the function calculating the residual r.
• The current point xk .
• The current search direction pk .
• Shortest acceptable step length αmin . Necessary to guarantee termination
of the step length algorithm.
• Parameter(s) necessary to test the Armijo condition.
Suggested output parameters:
• The accepted step length αk . If no such step length exists, αk = 0 should
be returned.
3.1.3 The Bracket-Zoom Wolfe line search
Implement the line search algorithm described in the textbook, chapter 3.5,
algorithms 3.5-3.6.
Suggested input parameters:
• The name of the function calculating the residual r and Jacobian J.
• The current point xk .
• The current search direction pk .
• Acceptable step length interval αmin , αmax . Necessary to guarantee termination of the step length algorithm.
• Parameter(s) necessary to test the Wolfe condition.
Suggested output parameters:
• The accepted step length αk . If no such step length exists, αk = 0 should
be returned.
3.2 The BFGS method with Wolfe line search
Implement the BFGS method with Wolfe line search (textbook chapter 6.1,
algorithm 6.1).
Suggested additional input parameters:
• Parameters (constants) needed by the line search.
Suggested additional output parameters:
• A vector with all step lengths αi . For analyzing the line search.
3.3 Levenberg-Marquardt with Powell dogleg
Implement the Levenberg-Marquardt method. Use the dogleg algorithm to solve
the subproblem

    min_p ψk(p) = f(xk) + ∇f(xk)^T p + (1/2) p^T ∇²f(xk) p,  subject to ‖p‖ ≤ ∆k,

where ∇²f(xk) is approximated by J(xk)^T J(xk). Use ‖J pGN‖ ≤ ε(1 + ‖r‖) as
the convergence criterion, where pGN is the Gauss-Newton search direction and
ε is the convergence tolerance. See Section 4 for suggested constant values.
Suggested additional input parameters:
• The initial trust-region size ∆0.
Suggested additional output parameters:
• Vectors with search direction lengths (‖pk‖), gain ratios (ρk), and trust-region
sizes ∆k. Useful for analyzing the global strategy.
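As an illustration of the subproblem solver, here is a sketch of the dogleg step using the Gauss-Newton model B = J^T J; the name and interface are our choice, not part of the specification.

```matlab
% Sketch of the dogleg step for the trust-region subproblem with the
% Gauss-Newton approximation B=J'*J and gradient g=J'*r. Returns a step
% p with norm(p)<=Delta.
function p=dogleg_sketch(J,r,Delta)
  g=J'*r;                        % gradient of 0.5*r'*r
  pGN=-J\r;                      % Gauss-Newton (full) step
  if norm(pGN)<=Delta
    p=pGN;                       % interior: take the full GN step
    return;
  end
  pC=-(g'*g)/(norm(J*g)^2)*g;    % Cauchy point (minimizer along -g)
  if norm(pC)>=Delta
    p=-Delta/norm(g)*g;          % even the Cauchy point is outside
    return;
  end
  % Walk from pC toward pGN until the boundary norm(p)=Delta is hit.
  d=pGN-pC;
  a=d'*d; b=2*pC'*d; c=pC'*pC-Delta^2;
  t=(-b+sqrt(b^2-4*a*c))/(2*a);  % positive root of ||pC+t*d||=Delta
  p=pC+t*d;
end
```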
4 Common constants
Suggested constants: convergence tolerance ε = 10^−8, maxIter = 50, c1 = 10^−4,
c2 = 0.9, αmin = 10^−3, αmax = 10, ∆0 = 1 or ∆0 = 10^−6, η = 0.25.
5 Application problems
5.1 Ellipse fitting
Background: An oblique projection of a sphere becomes an ellipse. The problem of
determining the position of the sphere in 3D may be formulated to contain the
following subproblem:
Model function: A point p = (x, y)^T on an ellipse with center (cx, cy), semi-major
axis length a, semi-minor axis length b and inclination δ has coordinates

    [x; y] = h(θ) = [cx; cy] + [cos δ, −sin δ; sin δ, cos δ] [a cos θ; b sin θ]

for some value of the “phase angle” θ.
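The model function above translates directly into code; a sketch with illustrative names might be:

```matlab
% Sketch of the ellipse model function: maps phase angles th (any shape)
% to 2-by-k points on the ellipse with center (cx,cy), semi-axes a,b and
% inclination delta. Names are illustrative, not prescribed.
function p=ellipse_point_sketch(cx,cy,a,b,delta,th)
  R=[cos(delta) -sin(delta); sin(delta) cos(delta)];  % rotation by delta
  p=[cx;cy]+R*[a*cos(th(:)'); b*sin(th(:)')];         % 2-by-k points
end
```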
Problem: Given m measured points p̃ = (x̃, ỹ)^T, find the parameters of the
ellipse which is closest to all points, as measured by the Euclidean distance, i.e.
solve the problem

    min_x (1/2) r(x)^T r(x),  where  r(x) = [r1(x); …; rm(x)]  and  ri(x) = h(θi) − p̃i,

for the unknowns x = (cx, cy, a, b, δ, θ1, …, θm)^T.
One application that uses ellipse fitting is radiostereometry (RSA),
where the spherical head of a hip joint prosthesis is projected in two X-ray
images. See Figures 1 and 2.
5.1.1 Example data
See Appendix B for real data for this problem.
Figure 1: The projection (right) of the spherical head of the hip joint (left) by
two X-ray tubes generates two elliptical projections.

Figure 2: The parameters of the projected ellipse.
5.2 Planar projective transformation (homography)
If a plane is viewed from an oblique angle, the coordinates p = (x, y)^T in the
plane are transformed according to a homography:

    [x'; y'] = h(p) = [(a11 x + a12 y + a13)/(a31 x + a32 y + 1);
                       (a21 x + a22 y + a23)/(a31 x + a32 y + 1)].
Problem: Given a number of measured 2d coordinates p̃ = (x̃, ỹ)^T and
corresponding known coordinates p = (x, y)^T, determine the parameters aij of
the homography, i.e. solve the problem

    min_x (1/2) r(x)^T r(x),  where  r(x) = [r1(x); …; rm(x)]  and  ri(x) = h(pi) − p̃i,

for the unknowns x = (a11, a12, a13, a21, a22, a23, a31, a32)^T. The matrix

    A = [a11 a12 a13; a21 a22 a23; a31 a32 1]

describes the homography between the two images.
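The mapping h(p) can be evaluated for all points at once; here is a sketch with an illustrative name, taking the eight parameters in the order used for the unknown vector x above.

```matlab
% Sketch of the homography mapping h(p) for a 2-by-k matrix of points p,
% given the parameters x=[a11;a12;a13;a21;a22;a23;a31;a32].
function q=homography_sketch(x,p)
  A=[x(1) x(2) x(3); x(4) x(5) x(6); x(7) x(8) 1];  % homography matrix
  ph=A*[p; ones(1,size(p,2))];    % map in homogeneous coordinates
  q=ph(1:2,:)./ph([3;3],:);       % divide by the third coordinate
end
```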
Figure 3: The image coordinates p̃i are measured in the left image. The
corresponding “true” coordinates pi are here assumed to be the corners of
the unit square. By calculating the homography A between pi and p̃i, we can
“rectify” the image (right). The following Matlab code generates the rectified
image:

T=maketform('projective',A');
I2=imtransform(I1,T,'XData',[-1,2],'YData',[-1,2],'XYScale',0.01);
5.3 Relative orientation of two cameras
Code will be available for this problem. The text below is included to increase
your understanding of the problem.
5.3.1 Point projection
Assume we have a camera with known internal orientation, i.e. the focal length
f and the principal point (x'p, y'p)^T (optical center of the image) are known.
Ignoring the effects of lens distortion, the relationship between a 3d object point
p = (x, y, z)^T and its projected 2d image coordinates q = (x', y')^T is described
by the collinearity equations

    x' − x'p = −f (m11(x − xc) + m12(y − yc) + m13(z − zc)) / (m31(x − xc) + m32(y − yc) + m33(z − zc)),
    y' − y'p = −f (m21(x − xc) + m22(y − yc) + m23(z − zc)) / (m31(x − xc) + m32(y − yc) + m33(z − zc)),
where the optical center of the camera is placed at world coordinates (xc , yc , zc )T
and the rotation matrix

    M = [m11 m12 m13; m21 m22 m23; m31 m32 m33]
describes the orientation of the camera with respect to the world coordinate
system. The rotation matrix may be parameterized in many ways. For this assignment, assume M is parameterized by the x-y-z (roll-pitch-yaw) Euler angles,
i.e.
    M = Mκ Mφ Mω,

    Mω = [1, 0, 0; 0, cos ω, sin ω; 0, −sin ω, cos ω],
    Mφ = [cos φ, 0, −sin φ; 0, 1, 0; sin φ, 0, cos φ],
    Mκ = [cos κ, sin κ, 0; −sin κ, cos κ, 0; 0, 0, 1].
With the substitution

    [U; V; W] = M([x; y; z] − [xc; yc; zc]),

the collinearity equations become

    q = [x'; y'] = h(p) = [x'p − f U/W; y'p − f V/W].
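The compact form of the collinearity equations translates directly into code; a sketch with illustrative names:

```matlab
% Sketch of the collinearity projection q=h(p) for one camera: object
% point p (3-by-1), camera center c=(xc;yc;zc), rotation matrix M, focal
% length f and principal point (xp,yp). Names are illustrative.
function q=collinearity_sketch(p,c,M,f,xp,yp)
  UVW=M*(p-c);                   % camera coordinates [U;V;W]
  q=[xp-f*UVW(1)/UVW(3); yp-f*UVW(2)/UVW(3)];  % projected image point
end
```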
5.3.2 Relative orientation
Assume we have two cameras with known internal orientation and we have
measured corresponding points q̃¹ = (x'¹, y'¹)^T and q̃² = (x'², y'²)^T in two
images of the same object point p = (x, y, z)^T. The projection in each camera
satisfies the collinearity condition, with different camera centers and orientations,
i.e. we have twelve degrees of freedom. By locking seven degrees of freedom we can
determine the relative orientation of the two cameras and the three-dimensional
positions of the object points (with respect to the cameras). One solution is to
put camera one at the origin, aligned with the world coordinate system, i.e.
(xc1, yc1, zc1)^T = (0, 0, 0)^T, (ω1, φ1, κ1)^T = (0, 0, 0)^T, and camera two at a
fixed distance along the x-axis, i.e. xc2 = bX.
Problem: Given m pairs of points with unknown object coordinates pi =
(xi, yi, zi)^T and measured image coordinates q̃i¹ and q̃i², determine the relative
orientation of the cameras and the object positions pi, i.e. solve the problem

    min_x (1/2) r(x)^T r(x),  where  r(x) = [r1¹(x); …; rm¹(x); r1²(x); …; rm²(x)],

and ri¹(x) = h(pi) − q̃i¹ is the residual in the first image, and ri²(x) = h(pi) − q̃i²
is the residual in the second image.
The unknowns to solve for are

    x = (yc2, zc2, ω2, φ2, κ2, p1^T, …, pm^T)^T.

5.4 Application
A typical problem application is when you have a number of point measurements
in two or more images. See Figures 4 and 5.
Figure 4: Overview of calculated camera positions and object points for the
Zürich City Hall data set.

Figure 5: Result of relative orientation of cameras 5 and 14.
A Example of a problem function
function [r,J,JJ]=antelope_r(x,t,y)
%ANTELOPE_R Residual/jacobian function for the antelope problem.
%
%   R=ANTELOPE_R(X,T,Y) returns the residual vector for the exponential
%   antelope population model with parameters X=[K1;K2] and observations Y
%   taken at time T. If Y and T are M-by-1 vectors, R will be returned as
%   an M-by-1 vector.
%
%   The antelope population model and its residual are calculated as
%
%       M(X; T) = K1 * EXP( K2 * T ),
%
%       R(X) = M(X; T) - Y.
%
%   [R,J]=... also returns the analytical Jacobian J of R(X).
%
%   [R,J,JJ]=... returns JJ as the numerical approximation of J, as
%   calculated by JACAPPROX. Useful for debugging the implementation of J.
%
%See also: JACAPPROX.

% $Id: antelope_r.m 1226 2014-11-13 13:56:10Z niclas $

% Compute the residual.
r=x(1)*exp(x(2)*t)-y;

if nargout>1
    % Compute the analytical Jacobian only if asked to. Avoid unnecessary
    % calculations when only the residual is wanted.
    J=[exp(x(2)*t), x(1)*t.*exp(x(2)*t)];
end

if nargout>2
    % Compute the numerical jacobian only if asked to. IMPORTANT to avoid
    % an infinite recursive loop via JACAPPROX.
    JJ=jacapprox(mfilename,x,1e-6,{t,y});
end
(a) Image 2. (b) Image 1.
Figure 6: Example images with point sets on three different ellipses. The first
point set (blue) contains points on the surface of the femoral head. The second
point set (red) contains points on the surface of the hemispherical backshell of
the cup. The third point set (green) contains points on the opening of the cup.
B Ellipse data
The supplied function
http://www8.cs.umu.se/kurser/5DA001/HT14/assignments/assignment1/code/ellipse_data.m
can be called to get real ellipse points; see help ellipse_data. The image
and object numbers are illustrated in Figure 6. The image files are available at
http://www8.cs.umu.se/kurser/5DA001/HT14/assignments/assignment1/images.