Bad semidefinite programs: they all look the same
Gábor Pataki∗
Abstract
We call a conic linear system P badly behaved, if for some objective function c the
value sup { ⟨c, x⟩ | x ∈ P } is finite, but the dual program has no solution attaining the
same value. We give simple and exact characterizations of badly behaved conic linear
systems. Surprisingly, it turns out that the badly behaved semidefinite system

    x1 [ α 1 ; 1 0 ] ⪯ [ 1 0 ; 0 0 ],        (0.1)

where α is some real number, appears as a minor in all such systems in a well-defined
sense. We prove analogous results for second order conic systems: here the minor to
which all bad systems reduce is

    x1 (α; α; 1) ≤K (1; 1; 0),        (0.2)

where α is again some real number, and K is a second order cone.
Our characterizations provide NP ∩ co-NP certificates in the real number model of
computing to recognize whether semidefinite and second order conic systems are badly
behaved.
Our main tool is one of our recent results that characterizes when the linear image
of a closed convex cone is closed. While we use convex analysis, the results have a
combinatorial flavor.
Key words: duality; closedness of the linear image; badly behaved instances; excluded
minor characterizations
MSC 2010 subject classification: Primary: 90C46, 49N15; secondary: 52A40
OR/MS subject classification: Primary: convexity; secondary: programming/nonlinear/theory
1 Introduction. A sample of the main results
Conic linear programs, as introduced by Duffin [17] in the fifties, provide a natural generalization of linear programming. Until the late eighties the main aspect of conic LPs that
optimizers and convex analysts focused on was their duality theory. It is largely inherited
from linear programming; however, when the underlying cone is not polyhedral, pathological phenomena occur, such as nonattainment of the optimal value, positive duality gaps,
and infeasibility of the dual when the primal program is bounded.

∗ Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
Nesterov and Nemirovskii [26] in the late eighties developed a general theory of interior point methods for conic LPs: the two most prominent efficiently solvable classes are
semidefinite programs (SDPs), and second order conic programs (SOCPs). Today SDPs
and SOCPs are covered in several surveys and textbooks, and are a subject of intensive
research ([8, 5, 35, 1, 33, 14]). The website [21] is a good resource to keep track of new
developments.
This research was motivated by the curious similarity of pathological SDP instances
appearing in the literature. The main, surprising finding of this paper is that all badly
behaved SDPs “look the same,” and the trivial badly behaved semidefinite system (0.1)
appears as a minor in all such systems. We prove similar results for SOCPs.
We first present our general framework. Let K be a closed, convex cone, and write
s ≤K t to denote t − s ∈ K. Let X and Y be finite dimensional euclidean spaces, c ∈ X, b ∈
Y, and A : X → Y a linear map. The conic linear programming problem parametrized by
the objective function c is

    sup  ⟨c, x⟩        (Pc)
    s.t. Ax ≤K b.

The primal conic system is defined as

    P = { x | Ax ≤K b }.        (1.3)

To define the dual program, we let A∗ be the adjoint operator of A, and K ∗ the dual cone
of K, i.e.,
K∗ = { y | ⟨y, x⟩ ≥ 0 ∀x ∈ K }.
The dual problem is then

    inf  ⟨b, y⟩
    s.t. y ≥K∗ 0        (Dc)
         A∗y = c.

Definition 1. We say that the system P is well-behaved, if for all objective functions c for
which the value of (Pc) is finite, the dual has a feasible solution which attains the same
value. We say that P is badly behaved, if it is not well-behaved. A more formal terminology,
coined by Duffin, Jeroslow, and Karlovitz [16] for semi-infinite systems, is saying that P
yields, or does not yield, uniform LP-duality; we use our naming for simplicity.
The most important class of conic LPs we study in this paper is semidefinite programs.
We denote by S^n the set of n by n symmetric, and by S^n_+ the set of n by n symmetric
positive semidefinite matrices. For S, T ∈ S^n let us write S ⪯ T to denote T − S ∈ S^n_+,
and define their inner product as S • T = Σ_{i,j=1}^n s_ij t_ij. The primal-dual pair of semidefinite
programming problems is then

    sup  Σ_{i=1}^m c_i x_i               inf  B • Y
    s.t. Σ_{i=1}^m x_i A_i ⪯ B  (SDPc)   s.t. Y ⪰ 0                        (SDDc)
                                              A_i • Y = c_i (i = 1, . . . , m),

where A_1, . . . , A_m, and B are in S^n. Since (S^n_+)∗ = S^n_+ under the • inner product, the pair
(SDPc) − (SDDc) is indeed a pair of conic LPs.
The primal semidefinite system is

    PSD = { x | Σ_{i=1}^m x_i A_i ⪯ B }.        (1.4)

The two motivating examples of pathological SDPs follow:
Example 1. In the problem

    sup  x1
    s.t. x1 [ 0 1 ; 1 0 ] ⪯ [ 1 0 ; 0 0 ]        (1.5)

the only feasible solution is x1 = 0. In the dual program let us denote the components of Y
by yij. The only constraint other than semidefiniteness is 2y12 = 1, so it is equivalent to

    inf  y11
    s.t. [ y11 1/2 ; 1/2 y22 ] ⪰ 0,        (1.6)

which has a 0 infimum, but does not attain it.
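The nonattainment in (1.6) can be checked numerically; the following sketch (not from the paper; the eigenvalue tolerance is an arbitrary choice) traces feasible points whose objective tends to 0, and shows that y11 = 0 is never feasible:

```python
import numpy as np

# Feasible points of (1.6) satisfy y11 * y22 >= 1/4, so the objective y11 can be
# made arbitrarily small, but never 0.
for y11 in [1.0, 0.1, 0.001]:
    y22 = 1.0 / (4.0 * y11)                 # smallest y22 keeping the matrix PSD
    Y = np.array([[y11, 0.5], [0.5, y22]])
    assert np.all(np.linalg.eigvalsh(Y) >= -1e-9)   # feasible, objective y11 -> 0

# With y11 = 0 the determinant is -1/4 for every y22, so the matrix is never PSD.
Y0 = np.array([[0.0, 0.5], [0.5, 1e6]])
assert np.linalg.eigvalsh(Y0)[0] < 0
```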
Example 2. The problem

    sup  x2
    s.t. x1 [ 1 0 0 ; 0 0 0 ; 0 0 0 ] + x2 [ 0 0 1 ; 0 1 0 ; 1 0 0 ] ⪯ [ 1 0 0 ; 0 1 0 ; 0 0 0 ]        (1.7)

again has an attained 0 supremum. The reader can easily check that the value of the dual
program is 1 (and it is attained), so there is a finite, positive duality gap.
Curiously, while their “pathology” differs, Examples 1 and 2 still look similar: if we
delete the second row and second column in all matrices in Example 2, remove the first
matrix, and rename x2 as x1 , we get back Example 1! This raises the following questions:
Do all badly behaved semidefinite systems “look the same”? Does the system of Example 1
appear in all of them as a “minor”? In the following we make these questions more precise,
and prove that the answer is yes to both. The paper contains a number of results for general
conic LPs, but we state the ones for semidefinite systems first, since this can be done with
minimal notation.
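The deletion step relating the two examples is mechanical; a small numpy sketch (not from the paper):

```python
import numpy as np

# Matrices of Example 2.
A1 = np.array([[1, 0, 0], [0, 0, 0], [0, 0, 0]])
A2 = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
B  = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]])

keep = [0, 2]                              # delete the second row and column
A2m = A2[np.ix_(keep, keep)]               # remove A1, keep A2, rename x2 as x1
Bm  = B[np.ix_(keep, keep)]

assert np.array_equal(A2m, [[0, 1], [1, 0]])   # the constraint matrix of Example 1
assert np.array_equal(Bm,  [[1, 0], [0, 0]])   # the right hand side of Example 1
```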
Assumption 1. Theorems 1 and 2 are stated “modulo rotations”, i.e., we assume that we
can replace Ai by T^T Ai T for i = 1, . . . , m, and B by T^T B T for some invertible matrix T.
We also assume that the maximum rank slack in PSD is of the form

    Z = [ Ir 0 ; 0 0 ].        (1.8)

Theorem 1. The system PSD is badly behaved, if and only if there exists a matrix V which
is a linear combination of the Ai and B of the form

    V = [ V11  e1  … ; e1^T  0  … ; …  …  … ],        (1.9)

where V11 is r by r, e1 is the first unit vector, and the dots denote arbitrary components.
The Z and V matrices provide a certificate of the bad behavior of PSD. For example,
they are

    Z = [ 1 0 ; 0 0 ],   V = [ 0 1 ; 1 0 ]

in Example 1, and

    Z = [ 1 0 0 ; 0 1 0 ; 0 0 0 ],   V = [ 0 0 1 ; 0 1 0 ; 1 0 0 ]

in Example 2. The reader may find it interesting to spot the certificates in the badly
behaved systems appearing in the literature.
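For Example 2 the certificate can be verified mechanically. The sketch below (not from the paper) checks that V is a linear combination of the constraint matrices, that its lower right block V22 is positive semidefinite, and that the row space of V12 is not contained in that of V22 — the two conditions used in the proof of Theorem 1 in Section 3:

```python
import numpy as np

A1 = np.array([[1, 0, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
A2 = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]], dtype=float)
B  = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
V  = A2                       # the linear combination 0*A1 + 1*A2 + 0*B
r  = 2                        # rank of the maximum slack Z = diag(1, 1, 0)

V12, V22 = V[:r, r:], V[r:, r:]
assert np.all(np.linalg.eigvalsh(V22) >= -1e-9)      # V22 is positive semidefinite
# R(V12^T) is not contained in R(V22): appending V12^T raises the rank.
assert np.linalg.matrix_rank(np.hstack([V22, V12.T])) > np.linalg.matrix_rank(V22)
```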
Corollary 1. Consider the following four elementary operations performed on PSD:

(1) Rotation: for some invertible matrix T, replace A1, . . . , Am by T^T A1 T, . . . , T^T Am T,
and B by T^T B T.

(2) Contraction: replace a matrix Ai by Σ_{j=1}^m λ_j A_j, and replace B by B + Σ_{j=1}^m µ_j A_j,
where the λ_j and µ_j are scalars, and λ_i ≠ 0.

(3) Delete row ℓ and column ℓ from all matrices.

(4) Delete a matrix Ai.
If PSD is badly behaved, then using these operations we can arrive at the system

    x1 [ α 1 ; 1 0 ] ⪯ [ 1 0 ; 0 0 ],        (1.10)

where α is some real number.
Well-behaved semidefinite systems have a similarly simple characterization:
Theorem 2. The system PSD is well-behaved, if and only if conditions (1) and (2) below
hold:

(1) there exists a matrix U such that

    U = [ 0 0 ; 0 In−r ],        (1.11)

and

    A1 • U = . . . = Am • U = B • U = 0.        (1.12)

(2) For all matrices V, which are a linear combination of the Ai and B and are of the
form

    V = [ V11 V12 ; V12^T 0 ],        (1.13)

with V11 ∈ S^r, we must have V12 = 0.
Example 3. The semidefinite system

    x1 [ 0 0 0 ; 0 0 1 ; 0 1 0 ] ⪯ [ 1 0 0 ; 0 0 0 ; 0 0 0 ]        (1.14)

is well-behaved, despite its similarity to the system in Example 1. The maximum slack and
the U matrix of Theorem 2 are

    Z = [ 1 0 0 ; 0 0 0 ; 0 0 0 ],   U = [ 0 0 0 ; 0 1 0 ; 0 0 1 ],

and since the entries in positions (1, 2) and (1, 3) are zero in all constraint matrices, condition (2) trivially holds.
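Theorem 2's conditions can be verified directly for Example 3; a minimal numpy sketch (not from the paper):

```python
import numpy as np

A1 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
B  = np.array([[1, 0, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
U  = np.diag([0.0, 1.0, 1.0])                # the U of condition (1), with r = 1

# Condition (1): U is orthogonal to A1 and B in the trace inner product.
assert np.trace(A1 @ U) == 0 and np.trace(B @ U) == 0

# Condition (2): every combination v1*A1 + v0*B has zero (1,2) and (1,3) entries,
# so its V12 block vanishes.
for v1, v0 in [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]:
    V = v1 * A1 + v0 * B
    assert np.all(V[0, 1:] == 0)
```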
Corollary 2. The question

    “Is PSD well-behaved?”

is in NP ∩ co-NP in the real number model of computing.
Our main results build on three pillars. First, we describe characterizations (in conditions (2) and (3) in Theorem 5) of when P is well-behaved. These results involve the
closedness of the linear image of K ∗ and K ∗ × R+ , where K ∗ is the dual cone of K. The
closedness of these linear images is not easy to check in general, so we invoke a recent result
from [30] on the closedness of the linear image of a closed convex cone, which we recap in
Theorem 3. This accomplishes the following: with z a maximum slack in the system P, and
dir(z, K) the set of feasible directions at z in the cone K, the well behaved nature of P will
be related to how dir(z, K) intersects the linear subspace spanned by A and b. An equivalent
characterization is given using the concept of strict complementarity, and the tangent space
of K at x. Somewhat surprisingly, Theorem 5 unifies two seemingly unrelated conditions
for P to be well-behaved: Slater’s condition, and the polyhedrality of K.
Since dir(z, K) is a well structured set when K is the semidefinite or second order
cone, a third step, which is some linear algebra, allows us to put the characterization for
semidefinite and second order conic systems into a reasonably aesthetic shape, like Theorem
1 above. The closedness result from [30] also makes it possible to prove the polynomial time
verifiability of the good and bad behavior of semidefinite and second order conic systems.
Literature review Linear programs with conic constraints were first proposed by Duffin
in [17] mainly as a tool to better understand the duality theory of convex programs. For
textbook and survey treatments on conic duality we refer to [5, 8, 34, 31, 25, 35, 14].
Our main results hinge on Theorem 3, which gives conditions for the linear image of a
closed convex cone to be closed. While this question is fundamental, there is surprisingly
little recent literature on the subject. We refer to [36] for a sufficient condition; to [4] for
a necessary and sufficient condition on when the continuous image of a closed convex cone
is closed, and to [3] for a characterization of when the linear image of a closed convex set
is closed. More recently [7] studied a more general problem, whether the intersection of an
infinite sequence of nested sets is nonempty, and gave a sufficient condition in terms of the
sequence being retractive. The question whether a small perturbation of the linear operator
still leads to the closedness of the linear image has been recently studied in [11].
The study of constraint qualifications (CQs) is also naturally related to our work. CQs
are conditions under which a conic linear system, or a semi-infinite linear system (a system
with finitely many variables, and infinitely many constraints), has desirable properties from
the viewpoint of duality. Slater’s condition (the existence of an x such that b − Ax is in the
relative interior of K) is sufficient for P to be well-behaved, but not necessary. The same
holds for the polyhedrality of K. Another approach is to connect the desirable properties
of a constraint set to the closedness of a certain related set. In this vein [16] introduced the
notion of uniform LP-duality for a semi-infinite linear system, and showed it to be equivalent
to such a CQ. Our condition (2) in Theorem 5 is analogous to Theorem 3.2 in [16]. More
recently, [24, 13] and [23] studied such CQs, called closed cone constraint qualifications for
optimization problems of the form inf { f (x) | x ∈ C, g(x) ∈ K }, where K is a closed convex
cone, and C is a closed convex set. These CQs guarantee that the Lagrange dual of this
problem has the same optimal value, and attains it. In particular, Condition (2) in our
Theorem 5 specializes to Corollary 1 in [23] when K is the semidefinite cone, and Theorem
3.2 in [13] gives an equivalent condition, though stated in a different form. Closed cone
CQs are weaker than Slater type ones, however, the closedness of the related set is usually
not easy to check. A similar CQ for inequality systems in infinite dimensional spaces was
given in [12], and a minimal cone based approach, which replaces K by a suitable face of
K to obtain strong duality results was introduced in [9].
Where our results differ from the constraint qualifications listed above is that besides
characterizing exactly when P is well-behaved, they are polynomial time verifiable for
semidefinite and second order conic systems, and yield easily “viewable” certificates, as
the Z and V matrices and the excluded system (0.1): the key is the use of Theorem 3.
Organization The rest of the introduction contains notation and definitions. In Section
2 we describe characterizations of when P is well behaved in Theorem 5. In Section 3 we
prove Theorems 1 and 2, and Corollaries 1 and 2. In Section 4 we derive similar results for
second order conic systems, and systems over p-cones. Finally, in Section 5 we prove that
conic systems given in a different form (dual form, like the feasible set of (Dc), or the
subspace form used by Nesterov and Nemirovskii in [26]) are well-behaved, if and only if
the system we obtain by rewriting them in the standard form of P is.
Notation, and preliminaries The general references in convex analysis we used are:
Rockafellar [32], Hiriart-Urruty and Lemaréchal [22], and Borwein and Lewis [10].
The closure, interior, boundary, relative interior and relative boundary of a convex set
C are denoted by cl C, int C, bd C, ri C, and rb C, respectively. For a convex set C, and
x ∈ C we define the following sets:

    dir(x, C) = { y | x + εy ∈ C for some ε > 0 },        (1.15)
    ldir(x, C) = dir(x, C) ∩ −dir(x, C),        (1.16)
    tan(x, C) = cl dir(x, C) ∩ −cl dir(x, C).        (1.17)

Here dir(x, C) is the set of feasible directions at x in C, ldir(x, C) is the lineality space of
dir(x, C), and tan(x, C) is the tangent space at x in C.
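For the semidefinite cone these sets are concrete. The sketch below (not from the paper; testing one fixed ε is only a heuristic for membership in dir) exhibits a direction in cl dir(x, C) \ dir(x, C) at x = diag(1, 0) in S^2_+:

```python
import numpy as np

def psd_feasible_dir(Z, V, eps):
    # V is a feasible direction at Z in the PSD cone iff Z + eps*V is PSD for
    # some eps > 0; here we only test one fixed eps (a heuristic, not a proof).
    return bool(np.all(np.linalg.eigvalsh(Z + eps * V) >= -1e-12))

Z = np.diag([1.0, 0.0])
V = np.array([[0.0, 1.0], [1.0, 0.0]])
assert not psd_feasible_dir(Z, V, 1e-4)     # det(Z + eps*V) = -eps^2 < 0
Vd = np.array([[0.0, 1.0], [1.0, 0.1]])     # a nearby matrix...
assert psd_feasible_dir(Z, Vd, 1e-3)        # ...is a feasible direction, so V is
                                            # in the closure of dir(Z, S^2_+)
```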
The range space, and nullspace of an operator A are denoted by R(A), and N (A),
respectively. The linear span of set S is denoted by lin S, and the orthogonal complement
of lin S by S ⊥ .
If x and y are elements of a euclidean space, we write x∗y for ⟨x, y⟩.
Suppose that C is a closed convex cone, and E is a convex subset of C. We say that E
is a face of C, if x1, x2 ∈ C, and 1/2(x1 + x2) ∈ E implies that x1 and x2 are both in E.
For a face E of C we define its conjugate face as

    E△ = C∗ ∩ E⊥.        (1.18)

Similarly, the conjugate face of a face G of C∗ is

    G△ = C ∩ G⊥.        (1.19)

Notice that the ()△ operator is ambiguous, as it denotes two maps: one from the faces of
C to the faces of C∗, and one in the other direction.
If C is a closed convex cone, and x ∈ C, we say that u ∈ C∗ is strictly complementary
to x, if x ∈ ri E for some face E of C (i.e., E is the smallest face of C that contains x, cf.
Theorem 18.2 in [32]), and u ∈ ri E△. Notice that u may be strictly complementary to x,
but not the other way around. The reason is that (E△)△ is the smallest exposed face of C that
contains E, i.e., the smallest face of C that arises as the intersection of C with a supporting
hyperplane, and it equals E only when E itself is exposed. In the semidefinite and second
order cones all faces are exposed, so in these cones strict complementarity is a symmetric
concept.
We say that a closed convex cone C is nice, if

    C∗ + E⊥ is closed for all faces E of C.        (1.20)

While this condition appears technical, it suffices to note that most cones appearing in the
optimization literature, such as polyhedral, semidefinite, and second order cones are nice.
For details, we refer to [30].
The following question is fundamental in convex analysis: under what conditions is the
linear image of a closed convex cone closed? We state a slightly modified version of Theorem
1 from [30], which gives easily checkable conditions for the closedness of M ∗ C ∗ , which are
“almost” necessary and sufficient.
Theorem 3. Let M be a linear map, C a closed convex cone, and x ∈ ri(C ∩ R(M )).
Then conditions (1) and (2) are equivalent to each other, and necessary for the closedness
of M ∗ C ∗ .
(1) R(M ) ∩ (cl dir(x, C) \ dir(x, C)) = ∅.
(2) There is u ∈ N (M ∗ ) strictly complementary to x, and
R(M ) ∩ (tan(x, C) \ ldir(x, C)) = ∅.
Furthermore, let E be the smallest face of C that contains x. If C ∗ + E ⊥ is closed, then
conditions (1) and (2) are both sufficient for the closedness of M ∗ C ∗ .
We make three remarks to explain Theorem 3 better:
Remark 1. For a set C let us write fr(C) for the frontier of C, i.e., the difference between
the closure of C and C itself. Condition (1) in Theorem 3 relates the two questions:

    Is fr(M∗C∗) = ∅?        (1.21)

and

    Is R(M) ∩ fr(dir(x, C)) = ∅?        (1.22)

The usefulness of Theorem 3 comes from the fact that fr(dir(x, C)) is a well-described set,
when C is, and this well-described set helps us to understand the “messy” set M∗C∗ better.
Remark 2. Theorem 3 is quite general: it directly implies the sufficiency of two classical
conditions for the closedness of M ∗ C ∗ , and gives necessary and sufficient conditions for nice
cones:
Corollary 3. Let M and C be as in Theorem 3. Then the following hold:
(1) If C is polyhedral, then M ∗ C ∗ is closed.
(2) If R(M ) ∩ ri C 6= ∅, then M ∗ C ∗ is closed.
(3) If C is nice, then conditions (1) and (2) are both necessary and sufficient for the
closedness of M ∗ C ∗ .
Proof Let x and E be as in Theorem 3. If C is polyhedral, then so are the sets C∗ + E⊥, and
dir(x, C), which are hence closed. So the sufficiency of condition (1) in Theorem 3 implies
the closedness of M∗C∗.

If R(M) ∩ ri C ≠ ∅, then Theorem 6.5 in [32] implies x ∈ ri C, hence E = C. Therefore
C∗ + E⊥ = C∗, and dir(x, C) = lin C, and both of these sets are closed. Again, the
sufficiency of condition (1) in Theorem 3 implies the closedness of M∗C∗.

Finally, if C is nice, then C∗ + E⊥ is closed for all faces E of C, and this shows that
conditions (1) and (2) are necessary and sufficient.
Remark 3. Condition (2) of Theorem 3 is stated slightly differently in Theorem 1 in [30]:
there it reads

    R(M) ∩ ((E△)⊥ \ lin E) = ∅.

However, (E△)⊥ = tan(x, C), and lin E = ldir(x, C), as shown in Lemma 3.2.1 in [18].
In conic linear programs we frequently use the notion of solutions which are only “nearly”
feasible. Here it suffices to define these only for (Dc). We say that {yi} ⊆ K∗ is an asymptotically feasible (AF) solution to (Dc) if A∗yi → c, and the asymptotic value of (Dc) is

    aval(Dc) = inf{ lim inf_i b∗yi | {yi} is asymptotically feasible to (Dc) }.

Notice that if (Dc) is asymptotically feasible, then there is an AF solution {yi} ⊆ K∗ with
lim b∗yi = aval(Dc).

Theorem 4. (Duffin [17]) Problem (Pc) is feasible with val(Pc) < +∞, iff (Dc) is asymptotically feasible with aval(Dc) > −∞, and if these equivalent statements hold, then

    val(Pc) = aval(Dc).        (1.23)
We denote vectors with lower case, and matrices and operators with upper case letters. We distinguish vectors and matrices of similar type with lower indices, i.e., writing
x1, x2, . . . and Z1, Z2, . . . . The jth component of vector xi is denoted by xi,j, and the
(k, ℓ)th component of matrix Zi by zi,kℓ. This notation is somewhat ambiguous, as xi may
denote either a vector, or the ith component of the vector x, but the context will make it
clear which one is meant.
For matrices A and B with the same number of columns we write (A; B) for the matrix
obtained by stacking A on top of B.
2 Characterizations of when P is well behaved
To prove the characterization on the well behaved nature of P, we need the following

Definition 2. A maximum slack in P is a vector in

    ri { z | z = b − Ax, z ∈ K } = ri ( (R(A) + b) ∩ K ).

We will use the fact that for z ∈ K the sets dir(z, K), ldir(z, K), and tan(z, K) (as
defined in (1.15)) depend only on the smallest face of K that contains z, cf. Lemma 3.2.1 in
[18].
Theorem 5. Let z be a maximum slack in P . Consider the statements
(1) The system P is well-behaved.
(2) The set

    [ A∗ 0 ; b∗ 1 ] (K∗ × R+)        (2.24)

is closed.

(3) The set

    [ A∗ ; b∗ ] K∗        (2.25)

is closed.

(4) R(A, b) ∩ (cl dir(z, K) \ dir(z, K)) = ∅.

(5) There is u ∈ N((A, b)∗) strictly complementary to z, and

    R(A, b) ∩ (tan(z, K) \ ldir(z, K)) = ∅.

Among them the following relations hold:

    (1) ⇔ (2) ⇐ (3)
          ⇓
    (4) ⇔ (5)

In addition, let F be the smallest face of K that contains z. If the set K∗ + F⊥ is closed,
then (1) through (5) are all equivalent.
We first remark that if P is a homogeneous system, i.e., b = 0, then it is not difficult to
see that P is well-behaved, if and only if A∗ K ∗ is closed. So in this case Theorem 5 reduces
to Theorem 3.
Second, we show an implication of Theorem 5: it unifies two classical, seemingly unrelated sufficient conditions for P to be well behaved, and gives necessary and sufficient
conditions in the case when K is nice:
Corollary 4. The following hold:
(1) If K is polyhedral, then P is well behaved.
(2) If z ∈ ri K, i.e., Slater’s condition holds, then P is well behaved.
(3) If K is nice, then conditions (2) through (5) are all equivalent to each other, and with
the well behaved nature of P.
Proof First, suppose that K is polyhedral. Then so is K∗ + F⊥, which is hence closed.
The set dir(z, K) is also polyhedral, and hence closed. So the implication (4) ⇒ (1) shows
that P is well behaved. Next, assume z ∈ ri K. Then F, the minimal face that contains z,
is equal to K. So K∗ + F⊥ equals K∗, and is hence closed, and dir(z, K) = lin K, which
is also closed. Again, implication (4) ⇒ (1) proves that P is well behaved. (In these two
cases we do not use the opposite implication (4) ⇐ (1).)

If K is nice, then K∗ + F⊥ is closed for all faces F of K, and this shows that (1) through
(5) are all equivalent.
We will need a lemma, whose proof is somewhat technical, so it is given in the Appendix.

Lemma 1. The relative interiors of the sets R(A, b) ∩ K and (R(A) + b) ∩ K have a
nonempty intersection.
Proof of Theorem 5:

Proof of (3) ⇒ (1) and (2) ⇒ (1) Let c be an objective vector, such that c0 := val(Pc)
is finite. Then aval(Dc) = c0 holds by Theorem 4, i.e., there is {yi} ⊆ K∗ s.t. A∗yi → c,
and b∗yi → c0, in other words,

    (c; c0) ∈ cl [ A∗ ; b∗ ] K∗.        (2.26)

By the closedness of the set in (2.26), there exists y ∈ K∗ such that A∗y = c, and b∗y = c0.
This argument shows the implication (3) ⇒ (1).
To prove (2) ⇒ (1), note that

    cl [ A∗ ; b∗ ] K∗ ⊆ cl [ A∗ 0 ; b∗ 1 ] (K∗ × R+).        (2.27)

By the closedness of the second set in (2.27) there exists y ∈ K∗, and s ∈ R+ such that
A∗y = c, and b∗y + s = c0. We must have s = 0, since s > 0 would contradict Theorem 4.
Therefore y is a feasible solution to (Dc) with b∗y = c0, and this completes the proof.
Proof of ¬(2) ⇒ ¬(1) Now, let c and c0 be arbitrary, and suppose they satisfy

    (c; c0) ∈ cl [ A∗ 0 ; b∗ 1 ] (K∗ × R+),        (2.28)

and

    (c; c0) ∉ [ A∗ 0 ; b∗ 1 ] (K∗ × R+).        (2.29)

But (2.28) is equivalent to the existence of {(yi, si)} ⊆ K∗ × R+ s.t. A∗yi → c, and b∗yi +
si → c0. Hence aval(Dc) ≤ c0. Also, Theorem 4 implies val(Pc) = aval(Dc), so

    val(Pc) ≤ c0.        (2.30)

However, (2.29) implies that no feasible solution of (Dc) can have value ≤ c0. Hence either
val(Dc) > c0 (this includes the case val(Dc) = +∞, i.e., when (Dc) is infeasible), or val(Dc)
is not attained. In other words, c is a “bad” objective function.
Proof of (2) ⇒ (4): For brevity, let us write

    S = R( [ A b ; 0 1 ] ) ∩ (K × R+),        (2.31)

and in this proof we also use the shorthand

    frdir(x, C) = cl dir(x, C) \ dir(x, C),

where C is a convex set, and x ∈ C. Recall that (s; s0) stands for the vector obtained by
stacking s on top of s0.

Let (s; s0) ∈ ri S. We claim s0 > 0. Indeed, let s′ ∈ (R(A) + b) ∩ K. Then (s′; 1) ∈ S,
hence (s; s0) − ε(s′; 1) ∈ S for some ε > 0, so s0 > 0 follows.

Next, we claim

    frdir((s; s0), K × R+) = { (v; v0) | v ∈ frdir(s, K), v0 ∈ R }.        (2.32)

Indeed, by definition, the set on the left hand side of (2.32) is the union of the two sets

    { (v; v0) | v ∈ frdir(s, K), v0 ∈ cl dir(s0, R+) }

and

    { (v; v0) | v ∈ cl dir(s, K), v0 ∈ frdir(s0, R+) }.

However, s0 > 0 implies that the second of these sets is empty, and the first is equal to the
set on the right hand side of (2.32).

By Theorem 3,

    frdir((s; s0), K × R+) ∩ R( [ A b ; 0 1 ] ) = ∅        (2.33)

holds. Using (2.32), statement (2.33) is equivalent to

    frdir(s, K) ∩ R(A, b) = ∅,        (2.34)

which, again by Theorem 3, is equivalent to (4).
Proof that (4) and (5) are equivalent, and imply (3) when K∗ + F⊥ is closed:
For brevity, let us write

    S1 = R(A, b) ∩ K,    S2 = (R(A) + b) ∩ K.

Since ri S1 is relatively open, and convex, Theorem 18.2 in [32] implies that it is contained
in the relative interior of some face F1 of K, i.e. ri S1 ⊆ ri F1. Similarly, ri S2 ⊆ ri F2, where
F2 is also a face of K. By Lemma 1, the relative interiors of S1 and S2 intersect, so F1 = F2.
For brevity, let us write F = F1.

Let s ∈ ri S1, and recall that the maximum slack z is in ri S2. Theorem 3 with the choice
C = K, and M = (A, b) implies our claim, if we prove

    dir(s, K) = dir(z, K),  ldir(s, K) = ldir(z, K),  tan(s, K) = tan(z, K).        (2.35)

Since both s and z are in ri F, and the sets in (2.35) depend only on the smallest face of K
that contains s and z (see e.g. Lemma 3.2.1 in [18]), the proof is complete.
3 Characterizations of bad and good semidefinite systems
We specialize the results on general conic LPs in Theorem 5. We first note that (SDPc) −
(SDDc) arises as a pair of conic linear programs by defining the map A : Rm → S^n as

    Ax = Σ_{i=1}^m x_i A_i,        (3.36)

for x ∈ Rm, and

    A∗Y = (A1 • Y, . . . , Am • Y)^T,        (3.37)

for Y ∈ S^n, and noting that the set of positive semidefinite matrices is self-dual under
the • inner product. Also, a maximum slack as defined in Definition 2 is a slack matrix
with maximum rank in the semidefinite system PSD, and the cone of positive semidefinite
matrices is nice (see [30]).
Proof of Theorem 1: The equivalence (1) ⇔ (4) in Theorem 5 shows that PSD is badly
behaved, iff there is a matrix V, which is a linear combination of the Ai and of B, and also
satisfies

    V ∈ cl dir(Z, S^n_+) \ dir(Z, S^n_+).        (3.38)

Let us partition V as

    V = [ V11 V12 ; V12^T V22 ],        (3.39)

where V11 is r by r. It is straightforward to show that V ∈ cl dir(Z, S^n_+) \ dir(Z, S^n_+), if and
only if

    V22 ⪰ 0, and        (3.40)
    R(V12^T) ⊄ R(V22).        (3.41)

Suppose that V22 has s zero eigenvalues; we must have s > 0 by (3.41). Let Q be an
orthonormal matrix whose columns are suitably scaled eigenvectors of V22, with the eigenvectors corresponding to zero eigenvalues coming first, and

    T = [ I 0 ; 0 Q ].

Applying the rotation T^T()T to V, B, and all Ai matrices (notice that properties (3.40)
and (3.41) are invariant under this rotation), we may assume that V is of the form

    V = [ V11  V121  … ; V121^T  0  … ; …  …  Λ ],        (3.42)

where V121 has r rows, and s columns, Λ is a diagonal matrix of order n − r − s with strictly
positive components, and the dots denote arbitrary components.

By (3.41) the matrix V121 must have at least one nonzero element. With further transformations we can make sure that the nonzero element is v1,r+1, it is equal to 1, and
v2,r+1 = . . . = vr,r+1 = 0, i.e., that V is of the required form.
Proof of Corollary 1: Suppose that PSD is badly behaved. To find the minor system
(0.1) we first add multiples of the Ai matrices to B to make B equal to the maximum slack,
i.e.

    B = Z = [ Ir 0 ; 0 0 ].        (3.43)

By Theorem 1 there is a matrix V of the form

    V = Σ_{i=1}^m vi Ai + v0 B = [ V11  e1  … ; e1^T  0  … ; …  …  … ].        (3.44)

Since B is of the form described in (3.43), we may assume v0 = 0, and we can also assume
v1 ≠ 0 without loss of generality. We then replace A1 by V, delete A2, . . . , Am, and all rows
and columns except the first, and (r + 1)st to arrive at the system

    x1 [ v11 1 ; 1 0 ] ⪯ [ 1 0 ; 0 0 ].        (3.45)
Proof of Theorem 2: We will use the equivalence (1) ⇔ (5) in Theorem 5. If Z is of
the form (1.8), then a positive semidefinite matrix U is strictly complementary to Z if and
only if

    U = [ 0 0 ; 0 U22 ], with U22 ≻ 0.        (3.46)

So the first part of (5) in Theorem 5 translates into the existence of a U of the form (3.46)
which satisfies

    A1 • U = . . . = Am • U = B • U = 0.        (3.47)

Let T be an orthonormal matrix whose columns are the eigenvectors of U, with the eigenvectors corresponding to zero eigenvalues coming first. Applying the rotation T^T()T to
U, B, and all Ai matrices, we may assume that

    U = [ 0 0 ; 0 In−r ].        (3.48)
Also, the form of Z implies

    ldir(Z, S^n_+) = { [ V11 0 ; 0 0 ] | V11 free },
    tan(Z, S^n_+) = { [ V11 V12 ; V12^T 0 ] | V11, V12 free }.        (3.49)

So the second part of condition (5) in Theorem 5 is equivalent to requiring that all V
matrices, which are a linear combination of the Ai and B, and are of the form

    V = [ V11 V12 ; V12^T 0 ],        (3.50)

satisfy V12 = 0.
Proof of Corollary 2 The certificates of the bad behavior of PSD are the Z and V matrices
in Theorem 1. We can verify in polynomial time that Z is a maximum rank slack, using
the algorithm of [9] or [29]. The fact that V is of the form (1.9) is trivial to check.
The well-behaved nature of PSD is straightforward to confirm via Theorem 2.
4 A characterization of badly behaved second order conic systems
In this section we study badly behaved second order conic systems, and derive characterizations similar to Theorem 1 and Corollary 1. We first repeat the primal-dual pair of conic
LPs here, for convenience:

    sup  c∗x                 inf  b∗y
    s.t. Ax ≤K b   (Pc)      s.t. y ≥K∗ 0   (Dc)
                                  A∗y = c.

Assume that A has n columns. The second order cone in m-space is defined as

    SO(m) = { x ∈ Rm | x1 ≥ √(x2^2 + · · · + xm^2) }.        (4.51)

The cone SO(m) is also self-dual with respect to the usual inner product in Rm, and nice
(see [30]). We call (Pc) − (Dc) a primal-dual pair of second order conic programs (SOCPs),
if

    K = K1 × . . . × Kt, and Ki = SO(mi) (i = 1, . . . , t),        (4.52)
where mi is a natural number for i = 1, . . . , t. The primal second order conic system is

    PSO = { x | Ax ≤K b }.        (4.53)
We first review two pathological SOCP instances. Despite the difference in their pathology (in one the dual is unattained, but there is no duality gap; in the other there is a finite,
positive duality gap, and the dual attains), the underlying systems look quite similar.
Example 4. Consider the SOCP

    sup  x1
    s.t. x1 (0; 0; 1) ≤K (1; 1; 0),        (4.54)

with K = SO(3). Its optimal value is obviously zero. The dual problem is

    inf  y1 + y2
    s.t. y3 = 1        (4.55)
         y ≥K 0.

The optimal value of the dual is 0, but it is not attained: to see this, first let

    yi = (i + 1/i, −i, 1)^T;

then all yi are feasible to (4.55), and their objective values converge to zero. However,
y = (y1, −y1, 1) is not in K, no matter how y1 is chosen, so there is no feasible solution
with value zero.
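The sequence in Example 4 can be checked numerically; a small sketch (not from the paper; the membership tolerance is an assumption):

```python
import numpy as np

def in_so3(y):
    # Membership in SO(3): y1 >= sqrt(y2^2 + y3^2), with a small numerical slack.
    return y[0] >= np.hypot(y[1], y[2]) - 1e-12

# The asymptotically feasible sequence y_i = (i + 1/i, -i, 1): objective -> 0.
for i in [1, 10, 1000]:
    y = np.array([i + 1.0 / i, -float(i), 1.0])
    assert in_so3(y)
    assert abs(y[0] + y[1] - 1.0 / i) < 1e-12     # objective value is 1/i

# But (y1, -y1, 1) is never in SO(3), so the value 0 is not attained.
for y1 in [0.0, 1.0, 100.0]:
    assert not in_so3(np.array([y1, -y1, 1.0]))
```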
Example 5. In this more involved example, which is a modification of Example 2.2 from
[2], we show that the problem
sup x2
 
 
 
1
0
1
 
 
 
s.t. x1 1 + x2 0 ≤K1 1 ,
0
1
0
x1
!
1
+ x2
−1
!
0
≤K2
1
(4.56)
!
1
,
0
where K1 = SO(3), and K2 = SO(2), and its dual have a finite, positive duality gap. The
optimal value of (4.56) is clearly zero.
The dual is
inf y1 + y2 + w1
s.t. y1 + y2 + w1 − w2 = 0
y3
+ w2 = 1
y ∈ K1 , w ∈ K2
17
(4.57)
Since y and w are in second order cones, y1 ≥ −y2 and w1 ≥ w2 must hold. Together with
the first equation in (4.57) this implies
y2 = −y1 ,
(4.58)
w2 = w1 .
(4.59)
Using (4.59), and the second constraint in (4.57)
y3 = 1 − w2 = 1 − w1 ,
(4.60)
so we can rewrite problem (4.57) in terms of w_1 only. It becomes
inf  w_1
s.t. (y_1, −y_1, 1 − w_1)^T ∈ K_1,
     (w_1, w_1)^T ∈ K_2,   (4.61)
and this form displays that the optimal value is 1: membership of (y_1, −y_1, 1 − w_1)^T in K_1
forces 1 − w_1 = 0.
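The gap can also be checked numerically. The sketch below (with a hypothetical helper `in_SO`) samples the primal feasible set of (4.56) to confirm sup x_2 = 0, and checks that dual feasibility in (4.61) forces w_1 = 1, so the duality gap equals 1:

```python
import math

def in_SO(y, tol=1e-12):
    # membership in a second order cone of any dimension
    return y[0] >= math.sqrt(sum(v * v for v in y[1:])) - tol

# Primal (4.56): the SO(3) slack (1 - x1, 1 - x1, -x2) forces x2 = 0,
# so sup x2 = 0; sampling confirms no feasible point has x2 > 0:
for x1 in [k / 10.0 for k in range(-10, 11)]:
    for x2 in [0.1, 0.5, 1.0]:
        feas = in_SO((1 - x1, 1 - x1, -x2)) and in_SO((1 - x1, x1 - x2))
        assert not feas

# Dual, in the reduced form (4.61): (y1, -y1, 1 - w1) in SO(3)
# forces 1 - w1 = 0, so the dual value is w1 = 1:
assert in_SO((5.0, -5.0, 0.0))               # w1 = 1 is feasible
for w1 in [0.0, 0.5, 0.99]:
    assert not in_SO((5.0, -5.0, 1.0 - w1))  # any w1 < 1 is not
```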
Notice the similarity in the feasible sets of Examples 4 and 5: in particular, if we delete
variable x_1 and the second set of constraints in the latter, we obtain the feasible set of
Example 4!
We will use transformations of the second order cone to make our characterizations nicer.
It is well known that if C is a second order cone, then for all x_1 and x_2 in the interior of
C there is a symmetric matrix T such that T x_1 = x_2, and T C = C: such a rotation
was used by Nesterov and Todd in developing their theory of self-scaled barriers for SOCP
([27, 28]). One can easily see that a similar T exists if x_1 and x_2 are both nonzero and on
the boundary of C. So we are justified in making the following assumption.
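As a concrete illustration of such cone automorphisms (a sketch only; the hyperbolic rotation below is one symmetric T with T C = C for C = SO(3), not the general construction of [27, 28]), note that it maps the boundary point (1; e_1) to a positive multiple of itself:

```python
import math
import numpy as np

def hyperbolic_rotation(t):
    # Symmetric matrix preserving SO(3): it keeps x1^2 - x2^2 - x3^2
    # invariant and maps the cone onto itself.
    c, s = math.cosh(t), math.sinh(t)
    return np.array([[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def in_SO3(x, tol=1e-9):
    return x[0] >= math.hypot(x[1], x[2]) - tol

T = hyperbolic_rotation(0.7)
assert np.allclose(T, T.T)                 # T is symmetric

# T maps the boundary ray through (1, 1, 0) onto itself (scaled by e^t):
w = np.array([1.0, 1.0, 0.0])
assert np.allclose(T @ w, math.exp(0.7) * w)

# T maps cone points to cone points, and non-cone points stay outside:
for x in [np.array([2.0, 1.0, 0.5]), np.array([1.0, 0.0, 1.0])]:
    assert in_SO3(T @ x) == in_SO3(x)
assert not in_SO3(T @ np.array([0.0, 1.0, 0.0]))
```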
Assumption 2. In PSO we can replace A and b by T A and T b, respectively, where T is
a block-diagonal matrix with symmetric blocks Ti and Ti Ki = Ki for all i = 1, . . . , t. We
also assume that the maximum slack in PSO is of the form
z = ( 0, . . . , 0; (1; e_1), . . . , (1; e_1); e_1, . . . , e_1 ),   (4.62)
(note that the dimension of the e_1 vectors can differ), and we let
O = { i ∈ {1, . . . , t} | z_i = 0 },
R = { i ∈ {1, . . . , t} | z_i = (1; e_1) },
I = { i ∈ {1, . . . , t} | z_i = e_1 }.   (4.63)
The main result on badly behaved second order conic systems follows.
Theorem 6. The system P_SO is badly behaved, iff there is a vector v = (v_1; . . . ; v_t) which
is a linear combination of the columns of A and of b, and satisfies
v_i ∈ K_i for all i ∈ O,
v_{i,1} ≥ v_{i,2} for all i ∈ R,
v_j = (α, α, 1, . . . )^T for some j ∈ R,   (4.64)
where α is some real number, and the dots denote arbitrary components.
The slack z and the vector v in Theorem 6 give a certificate of the bad behavior of P_SO.
They are
z = (1, 1, 0)^T,   v = (0, 0, 1)^T
in Example 4, and
z = ( (1, 1, 0)^T; (1, 0)^T ),   v = ( (0, 0, 1)^T; (0, 1)^T )
in Example 5.
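A short numerical check (a Python sketch with numpy; the data is transcribed from the examples) that these pairs satisfy the conditions of Theorem 6:

```python
import numpy as np

# Example 4: A has the single column (0, 0, 1)^T, b = (1, 1, 0)^T.
A4 = np.array([[0.0], [0.0], [1.0]])
b4 = np.array([1.0, 1.0, 0.0])
v4 = np.array([0.0, 0.0, 1.0])

# v4 is a linear combination of the columns of A and of b:
M = np.column_stack([A4, b4])
coeff = np.linalg.lstsq(M, v4, rcond=None)[0]
assert np.allclose(M @ coeff, v4)
# and v4 matches the (alpha, alpha, 1) pattern of (4.64) with alpha = 0:
assert v4[0] == v4[1] and v4[2] == 1.0

# Example 5: columns a1 = (1,1,0; 1,-1), a2 = (0,0,1; 0,1), b = (1,1,0; 1,0);
# the certificate v = (0,0,1; 0,1) is exactly the column a2.
a2 = np.array([0.0, 0.0, 1.0, 0.0, 1.0])
v5 = a2.copy()
# its SO(3) block (0, 0, 1) again matches the (alpha, alpha, 1) pattern:
assert v5[0] == v5[1] and v5[2] == 1.0
```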
Let us denote the columns of A by ai (i = 1, . . . , n).
Corollary 5. Consider the following four elementary operations performed on P_SO:
(1) Rotation: for some invertible matrix T satisfying T K = K, replace A by T A, and b
by T b.
(2) Contraction: replace a column a_i of A with Σ_{j=1}^{n} λ_j a_j, and replace b by b + Σ_{j=1}^{n} µ_j a_j,
where the λ_j and µ_j are scalars, and λ_i ≠ 0.
(3) Delete rows, other than a row corresponding to the first component of a second order
cone, from all matrices.
(4) Delete a column a_i.
If P_SO is badly behaved, then using these operations we can arrive at the system
x_1 (α, α, 1)^T ≤_K (1, 1, 0)^T,   (4.65)
where α is again some real number, and K is a second order cone.
Corollary 6. The question
“Is P_SO well-behaved?”
is in NP ∩ co-NP in the real number model of computing.
Proof of Theorem 6: Since the second order cone, and a direct product of second order
cones, is nice, we can specialize Theorem 5. By the equivalence (1) ⇔ (4) we have that P_SO
is badly behaved, iff there is a vector v, which is a linear combination of the columns of A
and of b, and also satisfies
v ∈ cl dir(z, K) \ dir(z, K).   (4.66)
Clearly, (4.66) holds, iff
v_i ∈ cl dir(z_i, K_i) for all i ∈ {1, . . . , t}, and
v_j ∉ dir(z_j, K_j) for some j ∈ {1, . . . , t}.   (4.67)
The form of z implies
dir(z_i, K_i) = K_i for all i ∈ O,
dir(z_i, K_i) = R^{m_i} for all i ∈ I,   (4.68)
so in particular dir(z_i, K_i) is closed when i ∈ O ∪ I. So (4.67) is equivalent to
v_i ∈ K_i for all i ∈ O,
v_i ∈ cl dir(z_i, K_i) for all i ∈ R, and
v_j ∈ cl dir(z_j, K_j) \ dir(z_j, K_j) for some j ∈ R.   (4.69)
We now prove that if C is a second order cone, and w = (1; e_1), then
cl dir(w, C) = { v | v_1 ≥ v_2 },   (4.70)
cl dir(w, C) \ dir(w, C) = { v | v_1 = v_2, v_{3:} ≠ 0 }   (4.71)
holds, where v_{3:} denotes the subvector of v consisting of components 3, 4, . . . .
It is straightforward, but tedious, to prove (4.70) and (4.71) from the definition of feasible
directions. We give a shorter proof. Let F be the minimal face of C that contains w, and
F^△ its conjugate face. Then it is easy to see that
F^△ = cone { (1; −e_1) },   (4.72)
and Lemma 3.2.1 in [18] implies
cl dir(w, C) = (F^△)^*.   (4.73)
Since F^△ is generated by the single vector (1; −e_1), its dual cone is the set of vectors having
a nonnegative inner product with (1; −e_1), i.e., (4.70) follows.
Next we prove (4.71). First note that (4.70) implies
ri dir(w, C) = ri cl dir(w, C) = { v | v_1 > v_2 } ⊆ dir(w, C).   (4.74)
Suppose that v is in the left hand side set in equation (4.71). From (4.74) we get v_1 = v_2,
and v_{3:} ≠ 0 follows directly from v ∉ dir(w, C). Next suppose that v is in the set on the right
hand side of equation (4.71). Then by (4.70) we have v ∈ cl dir(w, C), and v ∉ dir(w, C)
follows directly from v_{3:} ≠ 0.
Combining (4.70) and (4.71), and doing a further rotation (fixing the first two components)
and rescaling to make sure v_{j,3} = 1, proves that (4.69) is equivalent to (4.64).
The proof of Corollary 5 is analogous to the proof of Corollary 1, and the proof of
Corollary 6 to the one of Corollary 2, so these are omitted.
The duality theory of conic programs stated over p-cones is by now well understood
[20, 2], and these optimization problems also find applications [15]. Hence it is of interest
to characterize badly behaved p-conic systems, and this is the subject of the remainder of
the section. Letting p be a real number with 1 < p < +∞, the p-norm of x ∈ R^k is
‖x‖_p = (|x_1|^p + · · · + |x_k|^p)^{1/p},   (4.75)
and the p-cone in dimension m is defined as
K_{p,m} = { x ∈ R^m | x_1 ≥ ‖x_{2:m}‖_p }.   (4.76)
The dual cone of K_{p,m} is K_{q,m}, where q satisfies 1/p + 1/q = 1.
It is straightforward to see that K_{p,m} does not contain a line, and is full-dimensional,
and w ∈ bd K_{p,m} iff
w = ( ‖x‖_p ; x )   (4.77)
for some x ∈ R^{m−1}. For such a w we define its conjugate vector as
w^△ = ( ‖x^{p−1}‖_q ; −x^{p−1} ),   (4.78)
where x^{p−1} denotes the vector with components sign(x_i)|x_i|^{p−1}. Then w^△ ∈ bd K_{q,m},
and ⟨w, w^△⟩ = 0 hold.
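A quick numerical check of these claims (a sketch; the componentwise power sign(x_i)|x_i|^{p−1} is the choice that makes Hölder's inequality tight, and it reduces to −x when p = 2):

```python
import math

def pnorm(x, p):
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def conjugate_vector(x, p):
    # Componentwise power sign(x_i)|x_i|^{p-1}: the choice that makes
    # Hoelder's inequality tight (for p = 2 it is just x itself).
    xp = [math.copysign(abs(v) ** (p - 1), v) for v in x]
    q = p / (p - 1.0)
    return [pnorm(xp, q)] + [-v for v in xp]

p = 3.0
q = p / (p - 1.0)
x = [1.0, -2.0, 0.5]
w = [pnorm(x, p)] + x           # w on the boundary of K_{p,4}
wc = conjugate_vector(x, p)

# wc lies on the boundary of the dual p-cone K_{q,4}:
assert abs(wc[0] - pnorm(wc[1:], q)) < 1e-9
# orthogonality <w, wc> = 0, as claimed in the text:
assert abs(sum(a * b for a, b in zip(w, wc))) < 1e-9
# for p = 2 the conjugate vector reduces to (||x||_2; -x):
wc2 = conjugate_vector(x, 2.0)
assert all(abs(a + b) < 1e-9 for a, b in zip(wc2[1:], x))
```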
Again recalling the conic programs (P_c) − (D_c), we call them a pair of primal-dual
p-conic programs, if
K = K_1 × · · · × K_t, and K_i = K_{p_i, m_i} (i = 1, . . . , t),   (4.79)
where m_i is a natural number for i = 1, . . . , t, and 1 < p_i < +∞. The p-conic system is
P_p = { x | Ax ≤_K b }.   (4.80)
When all p_i are 2, p-conic programs reduce to SOCPs.
Theorem 7. Let z be a maximum slack in P_p, and define the index sets
O = { i ∈ {1, . . . , t} | z_i = 0 },
R = { i ∈ {1, . . . , t} | z_i ∈ bd K_i },
I = { i ∈ {1, . . . , t} | z_i ∈ int K_i }.   (4.81)
The system P_p is badly behaved, iff there is a vector v = (v_1; . . . ; v_t) which is a linear
combination of the columns of A and of b, and satisfies
v_i ∈ K_i for all i ∈ O,
⟨v_i, z_i^△⟩ ≥ 0 for all i ∈ R,
⟨v_j, z_j^△⟩ = 0 and v_j ∉ lin{z_j} for some j ∈ R.   (4.82)
Theorem 7 reduces to Theorem 6 when all p_i are 2, and its proof is a straightforward
generalization.
We finally remark that a characterization of well-behaved second order and p-conic
systems, analogous to Theorem 2, can be derived as well; also, the geometric and dual
geometric cones of [19] have a similar structure to p-cones, so Theorem 5 can be specialized
to characterize badly and well-behaved systems over these cones.
5  Characterizing the well-behaved nature of conic systems in subspace, or dual form
In this section we look at conic systems stated in a form different from P, and examine when
they are well-behaved from the standpoint of duality. The short conclusion is that such a
system is well-behaved, iff the system obtained by rewriting it in the “standard form” of P
is. Most of the section’s material is straightforward, so we only outline some of it.
We first define a primal-dual pair of conic linear programs in subspace form, parametrized
by the primal objective y_0, as
(P′_{y_0})  inf { ⟨y_0, z⟩ | z ∈ (z_0 + L) ∩ K },    (D′_{y_0})  inf { ⟨z_0, y⟩ | y ∈ (y_0 + L^⊥) ∩ K^* },
and the primal conic system in subspace form as
P′ = (z_0 + L) ∩ K.   (5.83)
Feasible solutions, boundedness, and the values of (P′_{y_0}) and (D′_{y_0}) are defined in the obvious
way.
Definition 3. We say that the system P′ is well-behaved, if for all objective functions y_0 for
which the value of (P′_{y_0}) is finite, the problem (D′_{y_0}) has a feasible solution y that satisfies
⟨z_0, y⟩ + val(P′_{y_0}) = ⟨z_0, y_0⟩.   (5.84)
We say that P′ is badly behaved, if it is not well-behaved.
Let us recall the primal-dual pair (P_c) − (D_c) and the primal conic system P.
Definition 4. If
L = R(A) and z_0 = b,   (5.85)
then we say that P and P′ are equivalent. If in addition
A^* y_0 = c,   (5.86)
then we say that (P_c) and (P′_{y_0}), and (D_c) and (D′_{y_0}), are equivalent.
Clearly, for every conic system in the form of P there is an equivalent system in the
form of P′, and vice versa. Also, if A is injective, which can be assumed without loss of
generality, then for every conic LP in the form (P_c) there is an equivalent conic LP in the
form of (P′_{y_0}), and vice versa.
The following lemma is standard; we state and prove it for the sake of completeness.
Lemma 2. Suppose that (P_c) is equivalent with (P′_{y_0}), and (D_c) with (D′_{y_0}). If x is feasible
to (P_c) with slack z, then z is feasible to (P′_{y_0}), and
⟨c, x⟩ + ⟨y_0, z⟩ = ⟨y_0, z_0⟩.   (5.87)
Also,
val(P_c) + val(P′_{y_0}) = ⟨y_0, z_0⟩.   (5.88)
Proof  The first statement is obvious, and (5.87) follows from the chain of equalities
⟨y_0, z_0⟩ = ⟨y_0, Ax + z⟩ = ⟨y_0, Ax⟩ + ⟨y_0, z⟩ = ⟨A^* y_0, x⟩ + ⟨y_0, z⟩ = ⟨c, x⟩ + ⟨y_0, z⟩.   (5.89)
Suppose that x is feasible to (P_c) with slack z. Then z is feasible to (P′_{y_0}), and
(5.87) holds, therefore
⟨c, x⟩ + val(P′_{y_0}) ≤ ⟨y_0, z_0⟩,   (5.90)
and since (5.90) is true for any feasible solution x of (P_c), the inequality ≤ follows in (5.88).
In reverse, let z be feasible to (P′_{y_0}). Then there is an x that satisfies z = z_0 − Ax, so this
x with the slack z is feasible to (P_c), so (5.87) holds, therefore
val(P_c) + ⟨y_0, z⟩ ≥ ⟨y_0, z_0⟩.   (5.91)
Since (5.91) holds for any z feasible to (P′_{y_0}), the inequality ≥ follows in (5.88).
Combining Lemma 2 with Theorem 5 we can now characterize the well-behaved nature
of P′.
Theorem 8. Let z ∈ ri P′. Consider the statements
(1′) The system P′ is well-behaved.
(2′) The set
[ I ; −z_0^* ] L^⊥ + ( K^* × R_+ )   (5.92)
is closed.
(3′) The set
(L + z_0)^⊥ + K^*   (5.93)
is closed.
(4′) lin(L + z_0) ∩ (cl dir(z, K) \ dir(z, K)) = ∅.
(5′) There is u ∈ (L + z_0)^⊥ strictly complementary to z, and
lin(L + z_0) ∩ (tan(z, K) \ ldir(z, K)) = ∅.
Among them the following relations hold:
(1′) ⇔ (2′) ⇐ (3′)
  ⇓
(4′) ⇔ (5′)
In addition, let F be the smallest face of K that contains z. If the set K^* + F^⊥ is closed,
then (1′) through (5′) are all equivalent.
Proof  Define the conic system P to be equivalent with P′. Then z is a maximum slack in
P, and it suffices to show that statements (1), . . . , (5) in Theorem 5 are equivalent to (1′),
. . . , (5′), respectively.
Proof of (1) ⇔ (1′)  Consider (P′_{y_0}) with some objective function y_0, and let c = A^* y_0.
Then (P_c) and (P′_{y_0}) are equivalent, so (5.88) in Lemma 2 implies that val(P_c) is finite, if
and only if val(P′_{y_0}) is.
Also, (D_c) and (D′_{y_0}) are equivalent, with the same feasible set, and the same objective
value of a feasible solution. Combining this fact with (5.88), we see that there is a y feasible
to (D_c) with ⟨b, y⟩ = val(P_c), iff there is a y feasible to (D′_{y_0}) with ⟨z_0, y⟩ + val(P′_{y_0}) = ⟨y_0, z_0⟩,
and this completes the proof.
Proof of (2) ⇔ (2′)  Since R(A) = L, we have N(A^*) = L^⊥. By a theorem of Abrams
(see e.g. [6]), if C is an arbitrary set, and M an arbitrary linear map, then M C is closed,
if and only if N(M) + C is.
Applying this result we obtain
[ A^* 0 ; b^* 1 ] (K^* × R_+) is closed ⇔ N([ A^* 0 ; b^* 1 ]) + (K^* × R_+) is closed.   (5.94)
It is elementary to show
N([ A^* 0 ; b^* 1 ]) = [ I ; −z_0^* ] L^⊥,   (5.95)
and this completes the proof.
The remaining equivalences follow from
R(A, b) = lin(L + z_0);   (5.96)
to prove (3) ⇔ (3′) we also need to use Abrams' theorem again.
In the rest of the section we discuss how to define the well-behaved nature of a conic
system in dual form, and relate it to the well-behaved nature of the system obtained by
rewriting it in subspace, or in primal form.
Definition 5. We say that the system
D = { y ∈ K^* | A^* y = c }   (5.97)
is well-behaved, if for all b for which v := inf { ⟨b, y⟩ | y ∈ D } is finite, there is a feasible
solution x to (P_c) with ⟨c, x⟩ = v.
We can rewrite D in subspace form as
D′ = (y_0 + L^⊥) ∩ K^*,   (5.98)
where L^⊥ = N(A^*), and A^* y_0 = c, and also in primal form as
D′′ = { u | Bu ≤_{K^*} y_0 },   (5.99)
where B is an operator with R(B) = N(A^*). Actually, we have D = D′, but the well-behaved
nature of D and D′ is defined differently. Also note that y ∈ D iff y = y_0 − Bu
for some u ∈ D′′.
Theorem 9. The systems D, D′ and D′′ are all well- or badly behaved at the same time.
Proof  Obviously, if b = z_0, then val(D′_{y_0}) = inf { ⟨b, y⟩ | y ∈ D }. By definition D′ is
well-behaved, if for all z_0 for which val(D′_{y_0}) is finite, there is a feasible solution z to (P′_{y_0})
with
⟨z, y_0⟩ + val(D′_{y_0}) = ⟨y_0, z_0⟩.   (5.100)
Using the first statement in Lemma 2 with (5.87), it follows that this holds exactly when
there is a feasible solution x to (P_c) with ⟨c, x⟩ = inf { ⟨b, y⟩ | y ∈ D }. This argument proves
that D is well-behaved if and only if D′ is.
The argument used in the proof of the equivalence (1) ⇔ (1′) in Theorem 8 shows that
D′ is well-behaved, if and only if D′′ is, and this completes the proof.
We can state characterizations of badly behaved semidefinite or second order conic
systems given in a subspace, or dual form in terms of the original data. We give an example
which is stated “modulo rotations”, i.e. making Assumption 1. We omit the proof, since it
is straightforward from Theorems 1 and 9.
Theorem 10. Suppose that in the system
Y ⪰ 0,  A_i • Y = b_i (i = 1, . . . , m)   (5.101)
the maximum rank feasible matrix is of the form
Ȳ = [ I_r 0 ; 0 0 ].   (5.102)
Then (5.101) is badly behaved if and only if there is a matrix V and a real number λ with
A_i • V = λ b_i (i = 1, . . . , m),   (5.103)
and
V = [ V_{11} e_1 · · · ; e_1^T 0 · · · ; · · · ],   (5.104)
where V_{11} is r by r, e_1 is the first unit vector, and the dots denote arbitrary components.
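To connect this back to the minor (0.1), the sketch below takes (0.1) with α = 0 (so the single constraint matrix is [0 1; 1 0] and the right hand side is [1 0; 0 0]) and verifies numerically that the primal value is 0 while the dual infimum 0 is not attained:

```python
import numpy as np

def is_psd(Y, tol=1e-9):
    return np.min(np.linalg.eigvalsh(Y)) >= -tol

# Primal of (0.1) with alpha = 0: sup x1 s.t. x1*[[0,1],[1,0]] <= [[1,0],[0,0]];
# the slack [[1, -x1], [-x1, 0]] is PSD only for x1 = 0, so the value is 0.
for x1 in [-1.0, -0.1, 0.1, 1.0]:
    assert not is_psd(np.array([[1.0, -x1], [-x1, 0.0]]))
assert is_psd(np.array([[1.0, 0.0], [0.0, 0.0]]))

# Dual: inf Y11 s.t. 2*Y12 = 1, Y PSD.  The feasible sequence
# Y_k = [[1/k, 1/2], [1/2, k/4]] has objective 1/k -> 0 ...
for k in [1, 10, 100]:
    Yk = np.array([[1.0 / k, 0.5], [0.5, k / 4.0]])
    assert is_psd(Yk) and abs(2 * Yk[0, 1] - 1.0) < 1e-12
# ... but value 0 would need Y11 = 0, Y12 = 1/2, which is never PSD:
for y22 in [0.0, 1.0, 1e6]:
    assert not is_psd(np.array([[0.0, 0.5], [0.5, y22]]))
```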
A  Appendix
Proof of Lemma 1: For brevity, let us write
S_1 = R(A, b) ∩ K,   S_2 = (R(A) + b) ∩ K.
Let s_1 ∈ ri S_1, s_2 ∈ ri S_2, and let us write
s_1 = Ax_1 + by_1,   s_2 = Ax_2 + b.
We have s_2 ∈ S_1, so Theorem 6.1 in [32] implies that the half-open line segment [s_1, s_2) is
contained in ri S_1, i.e.
λs_1 + (1 − λ)s_2 = A[λx_1 + (1 − λ)x_2] + b[λy_1 + (1 − λ)] ∈ ri S_1,   (A.105)
whenever 0 < λ ≤ 1. When λ is small, λy_1 + (1 − λ) is positive, so
s(λ) := (1 / (λy_1 + (1 − λ))) (λs_1 + (1 − λ)s_2) ∈ S_2.   (A.106)
Since s(λ) → s_2 as λ → 0, and s_2 ∈ ri S_2, we can choose a sufficiently small λ > 0 such
that s(λ) ∈ ri S_2. By (A.105) we also have s(λ) ∈ ri S_1, and this completes the proof.
Acknowledgement My sincere thanks are due to Shu Lu for her careful reading of the
manuscript, and for pointing out several mistakes in it.
References
[1] Farid Alizadeh and Donald Goldfarb. Second-order cone programming. Math. Program., 95(1, Ser. B):3–51, 2003.
[2] Erling D. Andersen, Cees Roos, and Tamás Terlaky. Notes on duality in second order and p-order cone optimization. Optimization, 51(4):627–643, 2002.
[3] Alfred Auslender. Closedness criteria for the image of a closed set by a linear operator.
Numer. Funct. Anal. Optim., 17:503–515, 1996.
[4] Heinz Bauschke and Jonathan Borwein. Conical open mapping theorems and regularity.
In Proceedings of the Centre for Mathematics and its Applications 36, pages 1–10.
Australian National University, 1999.
[5] Aharon Ben-Tal and Arkadi Nemirovski. Lectures on Modern Convex Optimization. MPS/SIAM Series on Optimization. SIAM, Philadelphia, USA, 2001.
[6] Abraham Berman. Cones, Matrices and Mathematical Programming. Springer-Verlag, Berlin, New York, 1973.
[7] Dimitri Bertsekas and Paul Tseng. Set intersection theorems and existence of optimal solutions. Math. Program., 110:287–314, 2007.
[8] J. Frédéric Bonnans and Alexander Shapiro. Perturbation Analysis of Optimization Problems. Springer Series in Operations Research. Springer-Verlag, New York, 2000.
[9] Jonathan Borwein and Henry Wolkowicz. Regularizing the abstract convex program.
Journal of Mathematical Analysis and Applications, 83:495–530, 1981.
[10] Jonathan M. Borwein and Adrian Lewis. Convex Analysis and Nonlinear Optimization:
Theory and Examples. Springer, 2000.
[11] Jonathan M. Borwein and Warren B. Moors. Stability of closedness of convex cones
under linear mappings. Journal of Convex Analysis, 2009.
[12] Jonathan M. Borwein and Henry Wolkowicz. A simple constraint qualification in infinite dimensional programming. Mathematical Programming, 35:83–96, 1986.
[13] Radu-Ioan Bot and Gert Wanka. An alternative formulation for a new closed cone
constraint qualification. Nonlinear Analysis, 64:1367–1381, 2006.
[14] Stephen Boyd and Lieven Vandenberghe. Convex Optimization, volume 15. Cambridge
University Press, 2004.
[15] Samuel Burer and Jieqiu Chen. A p-cone sequential relaxation procedure for 0–1 integer
programs. Optimization Methods and Software, 24:523–548, 2010.
[16] Richard Duffin, Robert Jeroslow, and L. A. Karlovitz. Duality in semi-infinite linear
programming. In Semi-infinite programming and applications (Austin, Tex., 1981),
volume 215 of Lecture Notes in Econom. and Math. Systems, pages 50–62. Springer,
Berlin, 1983.
[17] Richard J. Duffin. Infinite programs. In A.W. Tucker, editor, Linear Equalities and
Related Systems, pages 157–170. Princeton University Press, Princeton, NJ, 1956.
[18] Gábor Pataki. The geometry of semidefinite programming. In Romesh Saigal, Lieven Vandenberghe, and Henry Wolkowicz, editors, Handbook of Semidefinite Programming. Kluwer Academic Publishers, 2000.
[19] François Glineur. Proving strong duality for geometric optimization using a conic formulation. Annals of Operations Research, 105(2):155–184, 2001.
[20] François Glineur and Tamás Terlaky. Conic formulation for lp-norm optimization. J. Optim. Theory Appl., 122(2):285–307, 2004.
[21] Christoph Helmberg. http://www-user.tu-chemnitz.de/~helmberg/semidef.html.
[22] Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal. Convex Analysis and Minimization Algorithms. Springer-Verlag, 1993.
[23] Vaithilingam Jeyakumar. A note on strong duality in convex semidefinite optimization: necessary and sufficient conditions. Optim. Lett., 2(1):15–25, 2008.
[24] Vaithilingam Jeyakumar, N. Dinh, and G. M. Lee. A new closed cone constraint qualification for convex optimization. Technical Report AMR04/8, University of New South Wales, Sydney, Australia, 2004.
[25] Zhi-Quan Luo, Jos Sturm, and Shuzhong Zhang. Duality results for conic convex
programming. Technical Report Report 9719/A, April, Erasmus University Rotterdam,
Econometric Institute EUR, P.O. Box 1738, 3000 DR, The Netherlands, 1997.
[26] Yurii Nesterov and Arkadii Nemirovskii. Interior Point Polynomial Algorithms in
Convex Programming. SIAM Publications. SIAM, Philadelphia, USA, 1994.
[27] Yurii Nesterov and Michael J. Todd. Self-scaled barriers and interior-point methods
for convex programming. Math. Oper. Res., 22(1):1–42, 1997.
[28] Yurii Nesterov and Michael J. Todd. Primal-dual interior-point methods for self-scaled
cones. SIAM Journal on Optimization, 8:324–364, 1998.
[29] Gábor Pataki. A simple derivation of a facial reduction algorithm and extended dual systems. Technical report, Columbia University, 2000.
[30] Gábor Pataki. On the closedness of the linear image of a closed convex cone. Math. Oper. Res., 32(2):395–412, 2007.
[31] James Renegar. A Mathematical View of Interior-Point Methods in Convex Optimization. MPS-SIAM Series on Optimization. SIAM, Philadelphia, USA, 2001.
[32] R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, Princeton, NJ, USA, 1970.
[33] Romesh Saigal, Lieven Vandenberghe, and Henry Wolkowicz, editors. Handbook of Semidefinite Programming. Kluwer Academic Publishers, 2000.
[34] Alexander Shapiro. Duality, optimality conditions, and perturbation analysis. In
Romesh Saigal, Lieven Vandenberghe, and Henry Wolkowicz, editors, Handbook of
Semidefinite Programming. Kluwer Academic Publishers, 2000.
[35] Michael J. Todd. Semidefinite programming. Acta Numerica, 10:515–560, 2001.
[36] Z. Waksman and M. Epelman. On point classification in convex sets. Math. Scand.,
38:83–96, 1976.