COMBINATORIAL OPTIMIZATION (Optimizare Combinatorie)
Module 1
C. Croitoru
FII
March 19, 2015
1 / 143
OUTLINE
POLYHEDRAL COMBINATORICS
1. Introduction
2. Background information on polyhedra
3. Background information on linear programming
4. Total Unimodularity
5. Total Dual Integrality
6. Blocking Polyhedra
7. Anti-blocking Polyhedra
8. Cutting Planes
9. Hard Problems – Complexity of the Integer Hull
10. Miscellanea
2 / 143
Introduction
Polyhedral Combinatorics
⇕
Linear Programming techniques ⟹ Combinatorial Problems
1950-1960: Dantzig, Ford, Fulkerson, Hoffman, Johnson, Kruskal
1960-1970: Edmonds, Giles, Fulkerson
1970 onward: Lovász, Grötschel, Chvátal, Schrijver, Padberg, etc.
3 / 143
Introduction
Basic Reference: Alexander Schrijver’s book Combinatorial
Optimization: Polyhedra and Efficiency
Quote from Schrijver’s preface: "Pioneered by the work of Jack
Edmonds, polyhedral combinatorics has proved to be a most powerful,
coherent, and unifying tool throughout combinatorial optimization."
Note: Most of the present notes follow Schrijver’s chapter 30 in the Handbook of
Combinatorics, Elsevier 1995.
4 / 143
Background information on polyhedra
Polyhedron
A set P ⊆ R^n is a polyhedron if there exist a matrix A and a
vector b such that
P = {x | Ax ≤ b}.
Polytope
A set P ⊆ R^n is a polytope if there exist x_1, ..., x_t ∈ R^n such
that
P = convhull{x_1, ..., x_t}.
(Reminder: convex sets, convhull.)
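The definition P = {x | Ax ≤ b} translates directly into a membership test. The following is a minimal sketch, assuming A is given as a list of rows and b as a list; the function name in_polyhedron and the unit-square data are illustrative and not part of the notes.

```python
from fractions import Fraction

def in_polyhedron(A, b, x):
    """Return True if x satisfies every inequality a_i^T x <= b_i of Ax <= b."""
    return all(
        sum(Fraction(a_ij) * Fraction(x_j) for a_ij, x_j in zip(row, x)) <= Fraction(b_i)
        for row, b_i in zip(A, b)
    )

# Illustrative data: the unit square 0 <= x1, x2 <= 1 written as Ax <= b.
A = [[1, 0], [0, 1], [-1, 0], [0, -1]]
b = [1, 1, 0, 0]
print(in_polyhedron(A, b, [Fraction(1, 2), Fraction(1, 3)]))  # a point inside the square
print(in_polyhedron(A, b, [2, 0]))                            # a point violating x1 <= 1
```

Exact rational arithmetic is used so that points on the boundary are classified without rounding issues.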
5 / 143
Background information on polyhedra
Finite Basis Theorem for Polytopes
Minkowski (1896), Steinitz (1916), Weyl (1935)
A set P is a polytope if and only if P is a bounded polyhedron.
Decomposition Theorem for Polyhedra
Motzkin (1936)
A set P ⊆ R^n is a polyhedron if and only if there exist
x_1, ..., x_t, y_1, ..., y_s ∈ R^n such that
P = {λ_1 x_1 + ... + λ_t x_t + μ_1 y_1 + ... + μ_s y_s | λ_1, ..., λ_t, μ_1, ..., μ_s ≥ 0,
λ_1 + ... + λ_t = 1}.
6 / 143
Background information on polyhedra
Supporting hyperplane
Polyhedron P = {x | Ax ≤ b} ⊆ R^n, where A ∈ R^{m×n} and b ∈ R^m.
If c ∈ R^n, c ≠ 0, and δ = max{c^T x | x ∈ P}, then
the set {x ∈ R^n | c^T x = δ} is a supporting hyperplane of P.
Face
P = {x | Ax ≤ b} is a polyhedron.
A subset F ⊆ P is a face of P if F = P or F = P ∩ H for some
supporting hyperplane H of P.
A face is again a polyhedron.
7 / 143
Background information on polyhedra
Minimal Face
For any face F of P = {x | Ax ≤ b}, there exists a subsystem
A′x ≤ b′ of Ax ≤ b such that F = {x ∈ P | A′x = b′}.
It follows that there are only finitely many faces.
Minimal faces are faces minimal w.r.t. inclusion.
Theorem (Hoffman, Kruskal 1956)
F is a minimal face of P ⊆ R^n if and only if ∅ ≠ F ⊆ P and
F = {x | A′x = b′}
for some subsystem A′x ≤ b′ of Ax ≤ b.
8 / 143
Background information on polyhedra
Vertex
If F is a minimal face then its dimension is n − rank(A). If
rank(A) = n then minimal faces correspond to vertices:
An element of P is a vertex if it is not a convex combination of
other elements of P.
P = {x | Ax ≤ b} has vertices only if rank(A) = n, and then the
vertices are exactly the minimal faces of P.
Theorem
A vector z ∈ P = {x | Ax ≤ b} is a vertex of P if and only if A′z = b′
for some subsystem A′x ≤ b′ of Ax ≤ b, with A′ nonsingular
of order n.
A′ (in general not unique) is a basis for z.
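The theorem suggests a brute-force way to list the vertices of a small polyhedron: try every choice of n rows, solve the resulting square system, and keep the solutions that lie in P. The sketch below assumes exact rational data; the helper names solve_square and vertices are hypothetical, and the method is exponential in general.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Solve M x = rhs exactly by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def vertices(A, b):
    """Enumerate the vertices of {x | Ax <= b} by trying every n-row subsystem."""
    n = len(A[0])
    found = set()
    for rows in combinations(range(len(A)), n):
        z = solve_square([A[i] for i in rows], [b[i] for i in rows])
        if z is None:
            continue  # the chosen rows do not form a nonsingular subsystem
        if all(sum(Fraction(a) * zj for a, zj in zip(A[i], z)) <= b[i] for i in range(len(A))):
            found.add(tuple(z))  # z is feasible, so the rows are a basis for the vertex z
    return found

# Illustrative data: the triangle x1 >= 0, x2 >= 0, x1 + x2 <= 1.
A = [[-1, 0], [0, -1], [1, 1]]
b = [0, 0, 1]
print(sorted(vertices(A, b)))
```

Several row subsets may produce the same point, which is exactly the non-uniqueness of bases mentioned above; the set removes such duplicates.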
9 / 143
Background information on polyhedra
Vertices
P is called pointed if it has vertices.
A polytope is always pointed and it is the convex hull of its
vertices.
Adjacent vertices
Vertices z′ and z″ are adjacent if convhull{z′, z″} is a face of P.
If P is a polytope, then z′ and z″ are adjacent vertices ⇔
½(z′ + z″) is not a convex combination of other vertices of P.
Theorem. Vertices z′ and z″ of the polyhedron P are adjacent if
and only if they have bases A′ and A″ (respectively) with n − 1
rows in common.
10 / 143
Background information on polyhedra
Hirsch’s Conjecture
Polyhedron P → graph G(P) whose nodes are the vertices of P and
whose edges are the pairs of adjacent vertices.
The diameter of P is the diameter of G(P).
Hirsch’s Conjecture (cf. Dantzig 1963): A polytope in R^n
determined by m inequalities has diameter at most m − n.
Naddef 1989: True for polytopes all of whose vertices are
{0, 1}-vectors.
Francisco Santos 2010: False in general.
11 / 143
Background information on polyhedra
Facets
A facet of P is a maximal (w.r.t. inclusion) face F of P with
F ≠ P.
A face F is a facet if and only if dim(F) = dim(P) − 1. (Reminder: dimension.)
An inequality c^T x ≤ δ is called a facet-inducing inequality if
P ⊆ {x | c^T x ≤ δ} and P ∩ {x | c^T x = δ} is a facet of P.
12 / 143
Background information on polyhedra
Facets
Ax ≤ b is an irredundant (or minimal) system determining P if
no inequality in Ax ≤ b is implied by the others.
Let A^+ x ≤ b^+ be those inequalities α^T x ≤ β from Ax ≤ b for
which α^T z < β for at least one z in P.
For an irredundant system, each inequality in A^+ x ≤ b^+ is a facet-inducing inequality.
Moreover, this defines a one-to-one correspondence between facets and
inequalities in A^+ x ≤ b^+.
13 / 143
Background information on polyhedra
Facets
If P is full-dimensional, then the irredundant system Ax ≤ b is
unique up to multiplication of inequalities by positive scalars.
Theorem. If P = {x | Ax ≤ b} is full-dimensional, then Ax ≤ b is
irredundant if and only if for each pair a_i^T x ≤ b_i and a_j^T x ≤ b_j of
constraints from Ax ≤ b there is a vector x′ in P such that
a_i^T x′ = b_i and a_j^T x′ < b_j.
14 / 143
Background information on polyhedra
Integral polyhedron
The polyhedron P = {x | Ax ≤ b} is rational if A and b are
rational-valued (hence we can take them integer-valued).
By the Decomposition Theorem, write
P = {λ_1 x_1 + ... + λ_t x_t + μ_1 y_1 + ... + μ_s y_s | λ_1, ..., λ_t, μ_1, ..., μ_s ≥ 0,
λ_1 + ... + λ_t = 1}.
P is rational if and only if x_1, ..., x_t, y_1, ..., y_s can be chosen in Q^n.
P is called integral if x_1, ..., x_t, y_1, ..., y_s can be chosen in Z^n.
Hence P is integral if and only if P is the convex hull of the integer
vectors in P, or, equivalently, if and only if every minimal face of P
contains integer vectors.
15 / 143
Background information on linear programming
Linear Programming (LP)
Problem of maximizing or minimizing a linear function c^T x over a
polyhedron P.
Examples:
(i) max{c^T x | Ax ≤ b}
(ii) max{c^T x | x ≥ 0, Ax ≤ b}
(iii) max{c^T x | x ≥ 0, Ax = b}
(iv) min{c^T x | x ≥ 0, Ax ≥ b}
It can be shown, for each of the above problems, that if the set
involved is a polyhedron with vertices, and if the optimum value is
finite, then it is attained by a vertex of the polyhedron.
16 / 143
Background information on linear programming
Dual Problem
Duality Theorem of LP. Let A ∈ R^{m×n}, b ∈ R^m, and c ∈ R^n.
(i) max{c^T x | Ax ≤ b} = min{y^T b | y ≥ 0, y^T A = c^T}
(ii) max{c^T x | x ≥ 0, Ax ≤ b} = min{y^T b | y ≥ 0, y^T A ≥ c^T}
(iii) max{c^T x | x ≥ 0, Ax = b} = min{y^T b | y^T A ≥ c^T}
(iv) min{c^T x | x ≥ 0, Ax ≥ b} = max{y^T b | y ≥ 0, y^T A ≤ c^T}
provided that these sets are nonempty.
17 / 143
Background information on linear programming
Farkas’s Lemma
A ∈ R^{m×n}, b ∈ R^m.
Ax = b has a solution x ≥ 0 if and only if y^T b ≥ 0 holds for each
y ∈ R^m with y^T A ≥ 0.
Principle of complementary slackness
Let x and y satisfy Ax ≤ b, y ≥ 0, and y^T A = c^T. Then x and y
are optimal solutions in
(i) max{c^T x | Ax ≤ b} = min{y^T b | y ≥ 0, y^T A = c^T}
if and only if y_i = 0 or a_i^T x = b_i for each i = 1, ..., m.
(a_i^T x = b_i denotes the i-th row in Ax = b; similar statements hold
for the pairs (ii)-(iv) of dual problems.)
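Complementary slackness gives a cheap optimality certificate for the pair (i): given candidate vectors x and y, it suffices to check feasibility and the condition "y_i = 0 or a_i^T x = b_i". A small sketch under the assumption of exact rational data; the function name is_optimal_pair and the numbers are illustrative only.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(c) for a, c in zip(u, v))

def is_optimal_pair(A, b, c, x, y):
    """Check feasibility of x and y for pair (i) and the complementary slackness condition."""
    m, n = len(A), len(c)
    primal_ok = all(dot(A[i], x) <= b[i] for i in range(m))
    dual_ok = all(yi >= 0 for yi in y) and all(
        dot([A[i][j] for i in range(m)], y) == c[j] for j in range(n)
    )
    slack_ok = all(y[i] == 0 or dot(A[i], x) == b[i] for i in range(m))
    return primal_ok and dual_ok and slack_ok

# Illustrative LP: max{x1 + x2 | x1 <= 2, x2 <= 3, x1 + x2 <= 4}.
A = [[1, 0], [0, 1], [1, 1]]
b = [2, 3, 4]
c = [1, 1]
print(is_optimal_pair(A, b, c, x=[2, 2], y=[0, 0, 1]))  # optimal pair, both objective values are 4
print(is_optimal_pair(A, b, c, x=[2, 2], y=[1, 1, 0]))  # y is dual feasible, but x2 <= 3 is not tight
```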
18 / 143
Background information on linear programming
The simplex method (Dantzig 1951)
A ∈ R^{m×n}, b ∈ R^m, and c ∈ R^n.
Solving max{c^T x | Ax ≤ b}, where the polyhedron P = {x | Ax ≤ b}
has vertices, i.e. rank(A) = n.
Idea: Make a trip, going from a vertex to a better adjacent vertex,
until an optimal vertex is reached.
By the theorem on slide 9, vertices can be described by bases,
while by the theorem on slide 10, adjacency can be described as
bases differing in exactly one constraint.
19 / 143
Background information on linear programming
The simplex method (1)
The process can be described as a sequence of bases
A_0 x ≤ b_0, A_1 x ≤ b_1, A_2 x ≤ b_2, ...
where x_k := A_k^{-1} b_k is a vertex of P, A_{k+1} x ≤ b_{k+1} differs by one
constraint from A_k x ≤ b_k, and c^T x_{k+1} ≥ c^T x_k.
A_0 x ≤ b_0 is an (arbitrary) basis corresponding to a vertex, and
A_{k+1} x ≤ b_{k+1} is obtained from A_k x ≤ b_k as follows:
20 / 143
Background information on linear programming
The simplex method (2)
• If c^T A_k^{-1} ≥ 0, then x_k is an optimal solution of max{c^T x | Ax ≤ b}.
Indeed, for every x such that Ax ≤ b, we have A_k x ≤ b_k and hence
c^T x = c^T (A_k^{-1} A_k) x = (c^T A_k^{-1})(A_k x) ≤ (c^T A_k^{-1}) b_k = c^T (A_k^{-1} b_k) = c^T x_k.
• Else choose i s.t. (c^T A_k^{-1})_i < 0 and let z := −A_k^{-1} e_i, where e_i is the i-th unit vector in R^n.
Note that for λ ≥ 0, x_k + λz traverses an edge or a ray of P (i.e. a face of
dim 1) or is outside of P for all λ > 0. Also, c^T z = −(c^T A_k^{-1})_i > 0.
21 / 143
Background information on linear programming
The simplex method (3)
• If Az ≤ 0, then x_k + λz ∈ P for all λ ≥ 0, whence
max{c^T x | Ax ≤ b} = ∞.
• Else let λ_0 be the largest λ s.t. x_k + λz ∈ P:
λ_0 = min{ (b_j − a_j^T x_k) / (a_j^T z) | j = 1, ..., m, a_j^T z > 0 }.
Choose j attaining this minimum and replace the i-th inequality
in A_k x ≤ b_k by the inequality a_j^T x ≤ b_j; this gives A_{k+1} x ≤ b_{k+1}.
Note that x_{k+1} = x_k + λ_0 z. Hence if x_{k+1} ≠ x_k then c^T x_{k+1} > c^T x_k.
The process terminates if c^T x_{k+1} > c^T x_k holds in every step (since P has finitely many vertices).
This is the case when each vertex has only one basis (the nondegenerate
case).
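The three simplex slides can be sketched in code as follows. This is a minimal illustration, not a production solver: it assumes exact rational data, a starting basis supplied by the caller (n row indices whose subsystem is nonsingular and gives a feasible vertex), and a smallest-index tie-breaking rule chosen here for definiteness. The names solve, dot and simplex are hypothetical.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the nonsingular square system M x = rhs exactly (Gauss-Jordan)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def dot(u, v):
    return sum(Fraction(a) * Fraction(c) for a, c in zip(u, v))

def simplex(A, b, c, basis):
    """Maximize c^T x over {x | Ax <= b}, starting from a feasible basis (n row indices)."""
    m, n = len(A), len(c)
    basis = list(basis)
    while True:
        Ak = [A[i] for i in basis]
        x = solve(Ak, [b[i] for i in basis])               # x_k = A_k^{-1} b_k
        Ak_T = [[Ak[r][j] for r in range(n)] for j in range(n)]
        u = solve(Ak_T, c)                                  # u^T = c^T A_k^{-1}
        negative = [p for p in range(n) if u[p] < 0]
        if not negative:
            return "optimal", x
        i = min(negative, key=lambda p: basis[p])           # smallest-index rule (assumption)
        z = solve(Ak, [-1 if p == i else 0 for p in range(n)])  # z = -A_k^{-1} e_i

        def ratio(j):
            return (b[j] - dot(A[j], x)) / dot(A[j], z)

        blocking = [j for j in range(m) if dot(A[j], z) > 0]
        if not blocking:
            return "unbounded", x                           # Az <= 0: the ray stays in P
        lam0 = min(ratio(j) for j in blocking)
        j = min(j for j in blocking if ratio(j) == lam0)
        basis[i] = j                                        # replace the i-th basis inequality

# Illustrative LP: max{x1 + x2 | x1 <= 2, x2 <= 3, x1 + x2 <= 4, x1 >= 0, x2 >= 0}.
A = [[1, 0], [0, 1], [1, 1], [-1, 0], [0, -1]]
b = [2, 3, 4, 0, 0]
c = [1, 1]
status, x = simplex(A, b, c, basis=[3, 4])  # start at the vertex (0, 0)
print(status, x)
```

On this example the method visits the vertices (0, 0), (2, 0) and (2, 2). Finding an initial feasible basis is a separate problem that this sketch does not address.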
22 / 143
Background information on linear programming
The simplex method. Discussion.
- Pivot selection rules, prescribing the choice of i and j above, have been
found that provably guarantee termination of the simplex
method.
- None of these rules has been proven to give a polynomial-time
method. Most of them have been shown to require an exponential number
of iterations in the worst case.
- This happens even when the diameter of P is small (by a result of Padberg
and Rao, asymmetric TSP polytopes have diameter 2).
- A vertex of P can be adjacent to an exponential number of vertices
(in the size of A and b), whereas for any basis A′ there are at most
n(m − n) bases differing from A′ in exactly one row.
- Desire: find pivoting rules that prevent us from going through many bases
corresponding to the same vertex.
23 / 143
Background information on linear programming
Primal-dual method (1)
- Dantzig, Ford, Fulkerson (1956): generalization of similar
methods for network flow and transportation problems.
- Idea: Starting with a dual feasible solution y, a primal feasible
solution x satisfying the complementary slackness condition with
respect to y is searched for. If such a primal solution is found, then x
and y form a pair of optimal (primal, dual) solutions. If no such
primal solution is found, the method prescribes a modification of
y, after which we start anew.
- The method requires strategies to find a primal solution satisfying
complementary slackness and to modify the dual solution if no
such primal solution is found.
24 / 143
Background information on linear programming
Primal-dual method. (2)
A ∈ R^{m×n}, columns of A: a_1, ..., a_n ∈ R^m, b ∈ R^m, and c ∈ R^n.
Solving
min{c^T x | x ≥ 0, Ax = b}.
The dual problem:
max{y^T b | y^T A ≤ c^T}.
Suppose we have a dual feasible solution y_0: y_0^T A ≤ c^T.
25 / 143
Background information on linear programming
Primal-dual method. (3)
A primal-dual iteration:
- Let A′ be the submatrix of A consisting of those columns a_j for
which y_0^T a_j = c_j.
- Solve the restricted linear program
min{λ | λ, x′ ≥ 0, A′x′ + λb = b} = max{y^T b | y^T A′ ≤ 0, y^T b ≤ 1}.
- If the optimum value is 0, there is a solution with λ = 0, i.e. x′_0 ≥ 0 with A′x′_0 = b.
Adding 0 components to x′_0 gives x_0 ≥ 0 s.t. Ax_0 = b and (x_0)_j = 0 if
y_0^T a_j < c_j. Complementary slackness is fulfilled, hence
x_0, y_0 are optimal primal/dual solutions.
- If the optimum value is not 0, then it is 1. Let u be an optimal solution
for the maximum. Let θ be the largest real number satisfying
(y_0 + θu)^T A ≤ c^T. (Note that θ > 0.)
Set y_0 := y_0 + θu and start the iteration anew.
26 / 143
Background information on linear programming
Primal-dual method as a gradient method
Let y_0 be a feasible solution of max{y^T b | y^T A ≤ c^T}.
y_0 is not optimal ⇔ we can find u such that u^T b > 0 and u is a
feasible direction at y_0: (y_0 + θu)^T A ≤ c^T for some θ > 0.
If A′ consists of the columns of A for which y_0^T A ≤ c^T holds with equality,
then u is a feasible direction ⇔ u^T A′ ≤ 0.
So u can be found by solving max{u^T b | u^T A′ ≤ 0, u^T b ≤ 1}.
27 / 143
Background information on linear programming
Primal-dual method as a gradient method – Maximum flow (1)
Given a digraph D = (V, A), arc capacities c_a ∈ [0, ∞] ∀a ∈ A, a source
node s and a sink node t. The maximum s-t flow problem: find the
maximum amount of flow that can be sent from s to t through the
network, without exceeding the capacities. Formulated as an LP:
maximise Σ_{a ∈ δ^+(s)} x_a − Σ_{a ∈ δ^−(s)} x_a
subject to
Σ_{a ∈ δ^+(v)} x_a − Σ_{a ∈ δ^−(v)} x_a = 0 ∀v ∈ V − {s, t}
0 ≤ x_a ≤ c_a ∀a ∈ A
If we have a feasible solution x_0, finding a feasible direction at x_0
means finding u : A → R satisfying:
28 / 143
Background information on linear programming
Primal-dual method as a gradient method – Maximum flow (2)
Σ_{a ∈ δ^+(s)} u(a) − Σ_{a ∈ δ^−(s)} u(a) > 0
Σ_{a ∈ δ^+(v)} u(a) − Σ_{a ∈ δ^−(v)} u(a) = 0 ∀v ∈ V − {s, t}
u(a) ≥ 0 ∀a ∈ A s.t. x_0(a) = 0
u(a) ≤ 0 ∀a ∈ A s.t. x_0(a) = c(a)
This means finding an undirected s-t path s.t. for every arc a in the path:
- a is traversed forward if x_0(a) = 0
- a is traversed backward if x_0(a) = c(a)
- a is traversed forward or backward if 0 < x_0(a) < c(a)
29 / 143
Background information on linear programming
Primal-dual method as a gradient method – Maximum flow (3)
If we have such a path, we take u(a) = 1 for the forward arcs and u(a) = −1 for the
backward arcs of the path, and u(a) = 0 for the other arcs.
Taking the highest value of θ such that x_0 + θu is a feasible flow
gives the new solution.
The path is called a flow-augmenting path since the new solution
has a higher objective value.
This is exactly the Ford-Fulkerson algorithm, which is hence an
example of a primal-dual method. Dinits (1970) and Edmonds and Karp (1972) showed polynomial-time versions of this algorithm.
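The augmenting-path scheme can be sketched as follows. The code picks a shortest augmenting path by breadth-first search, which is the choice made in the Edmonds and Karp version; the dictionary-based graph representation and the name max_flow are assumptions made for this illustration, and capacities are assumed finite.

```python
from collections import deque

def max_flow(capacity, s, t):
    """capacity: dict {(v, w): c}. Returns (value, flow dict) using shortest augmenting paths."""
    flow = {a: 0 for a in capacity}
    value = 0
    while True:
        # Breadth-first search for an undirected s-t path obeying the forward/backward rules.
        pred = {s: None}
        queue = deque([s])
        while queue and t not in pred:
            v = queue.popleft()
            for (p, q), cap in capacity.items():
                if p == v and q not in pred and flow[(p, q)] < cap:
                    pred[q] = ((p, q), +1)      # forward arc: u(a) = +1
                    queue.append(q)
                elif q == v and p not in pred and flow[(p, q)] > 0:
                    pred[p] = ((p, q), -1)      # backward arc: u(a) = -1
                    queue.append(p)
        if t not in pred:
            return value, flow                  # no augmenting path: the flow is maximum
        # Recover the path and the largest theta keeping x0 + theta*u feasible.
        path, v = [], t
        while pred[v] is not None:
            arc, sign = pred[v]
            path.append((arc, sign))
            v = arc[0] if sign == +1 else arc[1]
        theta = min(capacity[a] - flow[a] if sign == +1 else flow[a] for a, sign in path)
        for a, sign in path:
            flow[a] += sign * theta
        value += theta

# Illustrative network (not from the notes).
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
value, flow = max_flow(capacity, "s", "t")
print(value, flow)
```

Scanning all arcs at every step keeps the sketch short; an adjacency-list representation would be the usual choice for larger graphs.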
30 / 143
Background information on linear programming
The ellipsoid method – Khachiyan (1979)
Shor (1970), Yudin - Nemirovskiĭ (1976) for NLP; shown by
Khachiyan to solve LP in polynomial time.
Rough description: A ∈ Q^{m×n}, b ∈ Q^m, and c ∈ Q^n.
max{c^T x | Ax ≤ b}; the polyhedron P = {x | Ax ≤ b} is bounded.
- Find a real number R s.t. P ⊆ E_0 = {x ∈ R^n | ‖x‖ ≤ R}
- Construct a sequence of ellipsoids E_0, E_1, ..., starting with E_0, where
for each t, E_{t+1} is obtained from E_t as follows: let z be the center of E_t and define the half-space
H := {x | a_k^T x ≤ a_k^T z} if ∃k s.t. a_k^T z > b_k,
H := {x | c^T x ≥ c^T z} if Az ≤ b.
E_{t+1} is the ellipsoid of smallest volume containing E_t ∩ H.
31 / 143
Background information on linear programming
The ellipsoid method (2)
One can prove that E_{t+1} is unique and vol(E_{t+1}) < e^{-1/(2(n+1))} vol(E_t).
Since the optimum solution of the LP belongs to each E_t, the centers of the
ellipsoids may converge to an optimum solution.
Difficulty: Ellipsoids with very small volumes may have a large
diameter (hence the centers can remain far from an optimum solution).
Difficulty: The unique smallest-volume ellipsoid is determined by
irrational parameters, hence working in rational arithmetic requires
successive approximations.
These problems can be overcome and a polynomial running time
can be proved.
32 / 143
Background information on linear programming
The ellipsoid method - discussion (1)
Grötschel, Lovász, Schrijver (1981): in applying the ellipsoid
method it is not necessary that the system Ax ≤ b be explicitly
given. It is sufficient to have a subroutine to decide if a given
vector z belongs to the feasible region of the LP and to find a
separating hyperplane in case z is not feasible.
Useful for LPs coming from combinatorial optimization, where the
number of linear constraints is exponential in the size of the underlying
data structure.
Optimization Problem (OP): Given a graph G = (V, E),
c : E → Q, and a collection F_G of subsets of E,
find F_0 ∈ F_G s.t. c(F_0) = max_{F ∈ F_G} c(F),
where c(F) = Σ_{e ∈ F} c(e).
33 / 143
Background information on linear programming
The ellipsoid method - discussion (2)
If F_G is the collection of matchings (spanning trees, respectively
Hamiltonian circuits) in G, we obtain the problem of finding a
maximum-weight matching (spanning tree, respectively
Hamiltonian circuit, i.e. TSP).
The optimization problem is polynomially solvable if it is solvable by an
algorithm whose running time is bounded by a polynomial in the size of the
optimization problem, which is |V| + |E| + size(c), where
size(c) := Σ_{e ∈ E} size(c(e)), and size(p/q) = log_2[(1 + |p|)(1 + |q|)].
- Separation Problem (SP): Given G = (V, E), F_G, and x ∈ Q^E,
decide if x ∈ convhull({χ^F | F ∈ F_G}) and, if not, find a separating
hyperplane. (χ^F denotes the incidence vector in Q^E of F ⊆ E.)
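The size measure above is easy to evaluate. The sketch below assumes that p/q is taken in lowest terms (the notes do not say so explicitly); the function names size and instance_size and the sample weights are illustrative.

```python
from fractions import Fraction
from math import log2

def size(r):
    """size(p/q) = log2[(1 + |p|)(1 + |q|)], with r = p/q reduced to lowest terms."""
    r = Fraction(r)
    return log2((1 + abs(r.numerator)) * (1 + abs(r.denominator)))

def instance_size(num_vertices, weights):
    """|V| + |E| + size(c), where weights is a dict {edge: rational weight}."""
    return num_vertices + len(weights) + sum(size(w) for w in weights.values())

# Illustrative instance: a path on 3 vertices with weights 3 and 1/3.
weights = {("a", "b"): 3, ("b", "c"): Fraction(1, 3)}
print(size(3), size(Fraction(1, 3)), instance_size(3, weights))
```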
34 / 143
Background information on linear programming
The ellipsoid method - discussion (3)
Theorem The optimization problem OP is polynomially solvable if
and only if the separation problem SP is polynomially solvable.
The ellipsoid method does not give a practical method. The above
theorem has been used to prove that some combinatorial problems admit
polynomial-time algorithms, giving a motivation to find a practical one.
One drawback of the ellipsoid method is that the number of ellipsoids
constructed depends on the size of the objective vector c. This is not
very attractive in practice: it would be preferable for the size of c
to influence only the sizes of the numbers occurring during the algorithm,
but not the number of arithmetic operations to be performed.
35 / 143
Total Unimodularity
Totally unimodular matrices
Definition. A matrix A is totally unimodular if each square
submatrix M of A (obtained by deleting some rows and
columns) satisfies det(M) ∈ {−1, 0, 1}.
⇒ each entry of a totally unimodular matrix belongs to {−1, 0, 1}.
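For tiny matrices the definition can be checked directly by enumerating all square submatrices. This brute-force sketch is exponential and is meant only to make the definition concrete; the names det and is_totally_unimodular are hypothetical, and this is not the polynomial-time recognition algorithm that follows from Seymour's characterization.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign
        result *= M[col][col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return sign * result

def is_totally_unimodular(A):
    """Check every square submatrix; exponential time, for tiny examples only."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if det([[A[i][j] for j in cols] for i in rows]) not in (-1, 0, 1):
                    return False
    return True

print(is_totally_unimodular([[1, 1, 0], [0, 1, 1]]))  # every square submatrix has det in {-1, 0, 1}
print(is_totally_unimodular([[1, 1], [-1, 1]]))       # the full matrix has determinant 2
```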
Theorem – Hoffman and Kruskal 1956
Let A ∈ {−1, 0, 1}^{m×n} be a totally unimodular matrix and let b ∈ Z^m.
Then the polyhedron P = {x | Ax ≤ b} is integral.
Proof. Let F = {x | A′x = b′} be a minimal face of P, where A′x ≤ b′
is a subsystem of Ax ≤ b. W.l.o.g. we can suppose that A′ has full row rank and A′ = [A_1 A_2],
with A_1 nonsingular. Then A_1^{-1} is an integral matrix (by Cramer’s rule and since det(A_1) ∈ {−1, 1}).
Hence x = (A_1^{-1} b′, 0), i.e. A_1^{-1} b′ padded with zeros,
is an integral vector in F.
36 / 143
Total Unimodularity
TU characterization – Hoffman and Kruskal
An integral m × n matrix A is totally unimodular if and only if, for
each b ∈ Z^m, each vertex of the polyhedron {x | x ≥ 0, Ax ≤ b} is
integral.
An extension of the HK theorem
A polyhedron P in R^n has the integer decomposition property if
for each k ∈ N, k > 0, and for each integral vector z in
kP (= {kx | x ∈ P}), there exist integral vectors x_1, ..., x_k in P so that
z = x_1 + ... + x_k.
Each polyhedron with the integer decomposition property is integral.
37 / 143
Total Unimodularity
Theorem – Baum and Trotter 1977
Let A be a totally unimodular m × n matrix and b ∈ Z^m. Then the
polyhedron P := {x | Ax ≤ b} has the integer decomposition
property.
Proof. Let z ∈ kP ∩ Z^n. Induction on k ∈ N, k > 0. Trivial for
k = 1. In the inductive step, we show that z = x_1 + ... + x_k for
integral vectors x_1, ..., x_k in P. By the HK theorem (on slide 36), there is an
integral vector x_k in the polyhedron {x | Ax ≤ b, −Ax ≤ (k − 1)b − Az}
(since (i) the constraint matrix [A; −A] (A stacked on −A) is totally unimodular, (ii)
the right-hand side vector [b; (k − 1)b − Az] is integral, and (iii) the
polyhedron is not empty as it contains k^{-1} z). Then
z − x_k ∈ (k − 1)P, hence by induction, z = x_1 + ... + x_k.
38 / 143
Total Unimodularity
Total Unimodularity Characterizations
Theorem. Let A be a matrix with entries 0, 1 and −1. The
following are equivalent:
(i) A is totally unimodular, i.e. each square submatrix has
determinant in {−1, 0, 1}.
(ii) (Ghouila-Houri) Each set of columns of A can be split into two
parts s.t. the sum of the columns in one part minus the sum of the
columns in the other part is a vector with entries in {−1, 0, 1}.
(iii) (Camion) Each nonsingular submatrix has a row with an odd
number of nonzero components.
(iv) (Camion) The sum of the entries in any square submatrix of A
with even row and column sums is divisible by four.
(v) (Gomory) No square submatrix of A has determinant +2 or −2.
The deepest characterization is due to Seymour (1980), implying
a polynomial-time recognition algorithm.
39 / 143
Total Unimodularity
Application: Bipartite graphs
If G = (V, E) is a bipartite graph, then its incidence matrix
A ∈ {0, 1}^{|V|×|E|} is totally unimodular: any square submatrix B of
A either has a column with at most one 1 (in which case
det(B) ∈ {−1, 0, 1} by induction), or has two 1’s in each column (in
which case det(B) = 0, by the bipartiteness of G).
In fact, the incidence matrix of a graph G is totally unimodular if
and only if G is bipartite.
We will consider next some of the consequences of the total
unimodularity of the incidence matrix of a bipartite graph.
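The "only if" direction of the statement about incidence matrices can be seen on the smallest non-bipartite graph: the incidence matrix of a triangle has determinant ±2. The sketch below compares it with a 4-circuit; the helper names det and incidence_matrix are chosen for this illustration, and the Leibniz formula is used only because the matrices are very small.

```python
from itertools import permutations

def det(M):
    """Determinant by the Leibniz formula (fine for very small integer matrices)."""
    n, total = len(M), 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total

def incidence_matrix(num_vertices, edges):
    """Rows are the vertices 0..num_vertices-1, columns are the edges given as pairs."""
    return [[1 if v in e else 0 for e in edges] for v in range(num_vertices)]

triangle = incidence_matrix(3, [(0, 1), (1, 2), (0, 2)])
square = incidence_matrix(4, [(0, 1), (1, 2), (2, 3), (0, 3)])
print(abs(det(triangle)))  # 2: the odd circuit violates total unimodularity
print(det(square))         # 0: the columns of an even circuit are linearly dependent
```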
40 / 143
Total Unimodularity – Bipartite graphs
Matching polytope of a bipartite graph
G = (V , E ) bipartite graph.
convhull{χ^M | M matching in G} ⊆ R^E_+
This polytope is equal to the set of all vectors in R^E satisfying:
(i) x_e ≥ 0 (e ∈ E)
(ii) Σ_{e ∋ v} x_e ≤ 1 (v ∈ V)
Clearly, for a matching M, χ^M satisfies (i) and (ii) above, hence the
matching polytope is contained in the polyhedron described by (i) and
(ii). Conversely, by the HK theorem, the polyhedron determined by (i) and
(ii) is integral and, clearly, an integral vector satisfying (i) and (ii) must
be equal to χ^M for some matching M of G.
41 / 143
Total Unimodularity – Bipartite graphs
Matching polytope of a bipartite graph
The matching polytope of G = (V, E) has dimension |E|.
Each inequality in (ii) is facet-determining, except if G has a vertex
of degree at most 1.
χ^M and χ^{M′} are adjacent vertices in the matching polytope if and
only if M∆M′ is a path or a circuit. ⇒ the matching polytope has
diameter at most ν(G) (maximum cardinality of a matching in G).
G = (V, E) bipartite graph, c : E → R_+, A incidence matrix of G:
maximum weight of a matching = max{c^T x | x ≥ 0, Ax ≤ 1}
ν(G) = max{1^T x | x ≥ 0, Ax ≤ 1}
42 / 143
Total Unimodularity – Bipartite graphs
Node-cover polytope of a bipartite graph
Node-cover in G = (V, E): N ⊆ V s.t. N ∩ e ≠ ∅ ∀e ∈ E.
τ(G): minimum cardinality of a node-cover in G.
Node-cover polytope of the bipartite graph G = (V, E):
convhull{χ^N | N node-cover in G} ⊆ R^V_+
Applying the HK theorem, we obtain that the node-cover polytope of
the bipartite graph G = (V, E) is equal to the polyhedron:
(i) 0 ≤ y_v ≤ 1 (v ∈ V)
(ii) y_v + y_w ≥ 1 ({v, w} ∈ E)
G = (V, E) bipartite graph, w : V → R_+, A incidence matrix of G:
minimum weight of a node-cover = min{w^T y | y ≥ 0, y^T A ≥ 1}
τ(G) = min{1^T y | y ≥ 0, y^T A ≥ 1}
43 / 143
Total Unimodularity – Bipartite graphs
Node-cover polytope of a bipartite graph
By linear programming duality, the last problems on the previous
two slides have equal optimum values, hence we obtain
König’s Matching Theorem: ν(G) = τ(G) for bipartite G.
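The equality ν(G) = τ(G) can be checked by brute force on small graphs, and the triangle shows that bipartiteness is needed. The following sketch uses exhaustive search rather than linear programming; the function names and the example graphs are made up for illustration.

```python
from itertools import combinations

def max_matching_size(edges):
    """Largest k such that some k edges are pairwise vertex-disjoint (brute force)."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            endpoints = [v for e in subset for v in e]
            if len(endpoints) == len(set(endpoints)):
                return k
    return 0

def min_node_cover_size(vertices, edges):
    """Smallest k such that some k vertices meet every edge (brute force)."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return k

# Hypothetical bipartite graph with colour classes {a1, a2, a3} and {b1, b2}.
V = ["a1", "a2", "a3", "b1", "b2"]
E = [("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a3", "b2")]
print(max_matching_size(E), min_node_cover_size(V, E))  # equal, as the theorem states

# A triangle is not bipartite, and the two parameters differ.
T = [(0, 1), (1, 2), (0, 2)]
print(max_matching_size(T), min_node_cover_size([0, 1, 2], T))
```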
By the theorem on slide 38, the matching polytope P of G has the integer
decomposition property. If k := ∆(G) is the maximum degree of a vertex
in G, then 1 ∈ R^E belongs to kP, and hence is the sum of k integer
vectors in P. Each of these vectors is the incidence vector of a matching
in G. It follows that E can be partitioned into k matchings. So, we have:
König’s Edge-Coloring Theorem: The edge-coloring number
χ′(G) of a bipartite graph G is equal to its maximum degree ∆(G).
44 / 143
Total Unimodularity – Bipartite graphs
Perfect matching polytope of a bipartite graph
Perfect matching in G = (V , E ): matching saturating each vertex.
Perfect matching polytope of the bipartite graph G = (V , E ):
convhull{χ^M | M perfect matching in G} ⊆ R^E_+
It is a face of the matching polytope. ⇒ determined by:
(i) x_e ≥ 0 (e ∈ E)
(ii) Σ_{e ∋ v} x_e = 1 (v ∈ V)
This is equivalent to a theorem of Birkhoff (1946): each doubly
stochastic matrix is a convex combination of permutation matrices.
χ^M and χ^{M′} are adjacent vertices in the perfect matching polytope
if and only if M∆M′ is a circuit. ⇒ the perfect matching polytope
has diameter at most ½|V|.
45 / 143
Total Unimodularity – Bipartite graphs
The assignment polytope
The perfect matching polytope of Kn,n .
Equivalently: the polytope in R^{n×n} of all matrices (x_ij)_{i,j=1}^n s.t.
(i) x_ij ≥ 0 (i, j = 1, ..., n)
(ii) Σ_{i=1}^n x_ij = 1 (j = 1, ..., n)
(iii) Σ_{j=1}^n x_ij = 1 (i = 1, ..., n)
(such matrices are called doubly stochastic)
Balinski and Russakoff (1974): For n ≥ 4 assignment polytopes
have diameter 2.
46 / 143
Total Unimodularity – Bipartite graphs
The stable-set polytope of a bipartite graph
Stable-set in graph G : set of pairwise non-adjacent vertices.
The stable-set polytope of a graph G = (V , E ):
convhull{χ^C | C stable-set in G} ⊆ R^V_+
By the HK theorem, for a bipartite graph G = (V, E) this polytope is
equal to the set of all vectors in R^V satisfying:
(i) 0 ≤ y_v ≤ 1 (v ∈ V)
(ii) y_v + y_w ≤ 1 ({v, w} ∈ E)
G = (V, E) bipartite graph, w : V → R_+, A incidence matrix of G:
maximum weight of a stable-set = max{w^T y | y ≥ 0, y^T A ≤ 1}
α(G) = max{1^T y | y ≥ 0, y^T A ≤ 1}
47 / 143
Total Unimodularity – Bipartite graphs
The edge-cover polytope of a bipartite graph
Edge-cover in graph G = (V, E): set F ⊆ E s.t. ∪_{e ∈ F} e = V.
The edge-cover polytope of a graph G = (V, E):
convhull{χ^F | F edge-cover in G} ⊆ R^E_+
By the HK theorem, for a bipartite graph G = (V, E) with no isolated
vertex, this polytope is equal to the set of all vectors in R^E
satisfying:
(i) 0 ≤ x_e ≤ 1 (e ∈ E)
(ii) Σ_{e ∋ v} x_e ≥ 1 (v ∈ V)
G = (V, E) bipartite graph, w : E → R_+, A incidence matrix of G:
minimum weight of an edge-cover = min{w^T x | x ≥ 0, Ax ≥ 1}
ρ(G) = min{1^T x | x ≥ 0, Ax ≥ 1}
48 / 143
Total Unimodularity – Bipartite graphs
The edge-cover polytope of a bipartite graph
By linear programming duality, the last problems on the previous
two slides have equal optimum values, hence we obtain
König’s Covering Theorem: α(G) = ρ(G) for bipartite G (with no isolated vertex).
By the theorem on slide 38, the edge-cover polytope of a bipartite graph
G has the integer decomposition property. ⇒ Gupta’s Theorem (1967):
the maximum number of disjoint edge covers in a bipartite graph is equal
to its minimum degree.
G = (V, E) bipartite graph, w ∈ Z^E, b ∈ Z^V, A incidence matrix of G:
By LP duality we have:
(1) max{w^T x | x ≥ 0, Ax ≤ b} = min{y^T b | y ≥ 0, y^T A ≥ w^T}
(2) min{w^T x | x ≥ 0, Ax ≥ b} = max{y^T b | y ≥ 0, y^T A ≤ w^T}
49 / 143
Total Unimodularity – Bipartite graphs
The edge-cover polytope of a bipartite graph
By the HK theorem these programs have integral optimal solutions.
For b = 1 we obtain Egerváry’s min-max relations (1931):
- the maximum weight of a matching is equal to the minimum
value of Σ_{v ∈ V} y_v, where y ∈ Z^V_+ and
y_v + y_u ≥ w_e ∀e = {u, v} ∈ E.
- the minimum weight of an edge-cover is equal to the
maximum value of Σ_{v ∈ V} y_v, where y ∈ Z^V_+ and
y_v + y_u ≤ w_e ∀e = {u, v} ∈ E.
50 / 143
Total Unimodularity
Application: digraphs
Let M be the |V| × |A| incidence matrix of a digraph D = (V, A):
m_{va} = 1 if v is the tail of a,
m_{va} = −1 if v is the head of a,
m_{va} = 0 otherwise.
M is totally unimodular: any square submatrix B of M either has
a column with at most one non-zero (in which case
det(B) ∈ {−1, 0, 1} by induction), or has exactly one 1 and one −1 in
each column (in which case det(B) = 0, since its rows add up to zero).
We will consider next some of the consequences of the total
unimodularity of the incidence matrix of a digraph.
51 / 143
Total Unimodularity – Digraphs
The s-t-flow polytope
Given a digraph D = (V , A), arc capacities ca ∈ [0, ∞] ∀a ∈ A, a
source node s and a sink node t.
The s-t-flow polytope: the set of all vectors x in R^A satisfying
0 ≤ x_a ≤ c(a) ∀a ∈ A
Σ_{a ∈ δ^−(v)} x_a = Σ_{a ∈ δ^+(v)} x_a ∀v ∈ V − {s, t}
A vector x in this polytope is called an s-t-flow (under c). By the
total unimodularity of the incidence matrix of D, if c is integral,
the maximum value (:= Σ_{a ∈ δ^+(s)} x_a − Σ_{a ∈ δ^−(s)} x_a) of an s-t-flow
under c is attained by an integral vector (Dantzig, 1951).
52 / 143
Total Unimodularity – Digraphs
The s-t-flow polytope: Max-Flow Min-Cut Theorem
By LP duality, the maximum value of an s-t-flow under c is equal
to the minimum value of Σ_{a ∈ A} y_a c_a, where y ∈ R^A_+ is such that
there is z ∈ R^V satisfying:
y_a − z_v + z_u ≥ 0 ∀a = (v, u) ∈ A
z_s = 1, z_t = 0
By the total unimodularity of the incidence matrix of D, we may
take the minimizing y, z to be integral. Let W := {v ∈ V | z_v ≥ 1}.
Then for any a = (v, u) ∈ δ^+(W) we have y_a ≥ z_v − z_u ≥ 1, and hence
Σ_{a ∈ A} y_a c_a ≥ Σ_{a ∈ δ^+(W)} y_a c_a ≥ Σ_{a ∈ δ^+(W)} c_a.
So the maximum flow value is not less than the capacity of the cut δ^+(W).
Since it cannot be larger, we obtain the Ford and Fulkerson Max-Flow
Min-Cut Theorem.
53 / 143
Total Unimodularity – Digraphs
The shortest-path polytope
Given a digraph D = (V , A), a source node s and a sink node t.
The shortest-path polytope: the convex hull of all incidence
vectors χ^P of subsets P of A that are a disjoint union of an s-t path
and some circuits.
By the total unimodularity of the incidence matrix of D, this
polytope is equal to the set of all vectors x in R^A satisfying
0 ≤ x_a ≤ 1 ∀a ∈ A
Σ_{a ∈ δ^−(v)} x_a = Σ_{a ∈ δ^+(v)} x_a ∀v ∈ V − {s, t}
Σ_{a ∈ δ^+(s)} x_a − Σ_{a ∈ δ^−(s)} x_a = 1
So, it is the intersection of an s-t-flow polytope with the hyperplane
Σ_{a ∈ δ^+(s)} x_a − Σ_{a ∈ δ^−(s)} x_a = 1.
54 / 143
Total Unimodularity – Digraphs
The circulation polytope (1)
Given a digraph D = (V, A) and l, u ∈ R^A.
The circulation polytope: the set of all circulations between l
and u, that is, vectors x ∈ R^A satisfying
l_a ≤ x_a ≤ u_a ∀a ∈ A
Mx = 0
where M is the incidence matrix of D.
By the total unimodularity of M, if l and u are integral, then the
circulation polytope is integral. So if l and u are integral, and
there exists a circulation, there exists an integral circulation.
By Farkas’s lemma, the circulation polytope is non-empty if and
only if there are no vectors z, w ∈ R^A, y ∈ R^V satisfying:
55 / 143
Total Unimodularity – Digraphs
The circulation polytope (2)
z, w ≥ 0
z − w + M^T y = 0
u^T z − l^T w < 0
Suppose that l ≤ u and the above system has a solution. Then there is
also a solution with 0 ≤ y ≤ 1, and by the total unimodularity of M,
there is a solution z, w, y with y a {0, 1}-vector. Also we may assume
that z_a w_a = 0 for each a ∈ A. Then, for W := {v ∈ V | y_v = 1},
Σ_{a ∈ δ^−(W)} u_a − Σ_{a ∈ δ^+(W)} l_a = u^T z − l^T w < 0.
Thus we have Hoffman’s Circulation Theorem (1960): there
exists a circulation x satisfying l ≤ x ≤ u if and only if there is no
subset W of V with Σ_{a ∈ δ^−(W)} u_a < Σ_{a ∈ δ^+(W)} l_a.
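Hoffman's condition can be tested by brute force on a small digraph: look for a set W whose entering upper bounds sum to less than its leaving lower bounds. The sketch below assumes l ≤ u and arcs given as (tail, head) pairs; the function name violating_set and the two example networks are illustrative.

```python
from itertools import combinations

def violating_set(vertices, lower, upper):
    """Return a set W with u(delta^-(W)) < l(delta^+(W)), or None if no such set exists.

    lower and upper are dicts {(tail, head): bound} on the same arcs.
    """
    for k in range(1, len(vertices)):
        for subset in combinations(vertices, k):
            W = set(subset)
            entering = sum(upper[a] for a in upper if a[0] not in W and a[1] in W)
            leaving = sum(lower[a] for a in lower if a[0] in W and a[1] not in W)
            if entering < leaving:
                return W
    return None

# Directed triangle with bounds 1 <= x_a <= 2: x = 1 on every arc is a circulation.
cycle_l = {(1, 2): 1, (2, 3): 1, (3, 1): 1}
cycle_u = {(1, 2): 2, (2, 3): 2, (3, 1): 2}
print(violating_set([1, 2, 3], cycle_l, cycle_u))

# Two opposite arcs: at least 2 units must leave vertex 1 but at most 1 can return.
bad_l = {(1, 2): 2, (2, 1): 0}
bad_u = {(1, 2): 3, (2, 1): 1}
print(violating_set([1, 2], bad_l, bad_u))
```

The empty set and W = V are skipped because both sums are zero there. The search is exponential in |V|, so it only serves to illustrate the statement.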
56 / 143
Total Unimodularity – Digraphs
The circulation polytope (3)
Let D = (V, A) be a digraph and T ⊆ A be a spanning tree of D.
Consider the (A − T) × T matrix N defined for a′ = (v, w) ∈ A − T and
a ∈ T by:
N_{a′,a} = 0 if a does not occur in the v-w path in T,
N_{a′,a} = −1 if a occurs forward in the v-w path in T,
N_{a′,a} = +1 if a occurs backward in the v-w path in T.
Then N is totally unimodular (using, e.g., the Ghouila-Houri characterization
(ii), on slide 39). A vector x = (x′, x″) ∈ R^{A−T} × R^T satisfies Mx = 0
(where M is the incidence matrix of D) if and only if x″ = Nx′. Hence
the circulation polytope can be equivalently written
l_a ≤ x′_a ≤ u_a ∀a ∈ A − T
l_a ≤ (Nx′)_a ≤ u_a ∀a ∈ T
57 / 143
Total Unimodularity – Digraphs
The circulation polytope (4)
By the total unimodularity of N, the above polytope has integer vertices
if all l_a and u_a are integers.
A nice special case is given by the {0, 1}-matrices with the consecutive
ones property: in each column the 1’s form an interval (fixing
some ordering of the rows).
This special case arises when T is a directed path and each arc in
A − T forms a directed circuit with some subpath of T (Hoffman,
1979).
58 / 143
Total Dual Integrality (TDI)
TDI is a powerful technique in deriving min-max relations and the
integrality of polyhedra. It is based on the following result.
Theorem – Edmonds and Giles 1977
A rational polyhedron P is integral if and only if each rational
supporting hyperplane of P contains integral vectors.
Proof. Since the intersection of a supporting hyperplane with $P$ is a face
of $P$, necessity is trivial. To prove sufficiency, suppose that
each rational supporting hyperplane of $P$ contains integral vectors. Let
$P = \{x \mid Ax \le b\}$ with $A$ and $b$ integral. Let $F = \{x \mid A'x = b'\}$ be a
minimal face of $P$ with $A'x \le b'$ a subsystem of $Ax \le b$. If $F$ does not
contain an integral vector, it follows that there is $y$ such that $c^T = y^T A'$
is an integral vector, while $\delta := y^T b'$ is not integral (this follows, e.g.,
from Hermite's Normal Form Theorem). We may assume that all entries
in $y$ are nonnegative (we may replace each entry $y_i$ by $y_i - \lfloor y_i \rfloor$). Now,
$H := \{x \mid c^T x = \delta\}$ is a supporting hyperplane of $P$ without any integral
vector.
59 / 143
Total Dual Integrality (TDI)
LP problem: (1) $\max\{c^T x \mid Ax \le b\}$, with rational entries in $A$, $b$, $c$.
Corollary
The following are equivalent:
(i) The maximum value in (1) is an integer for each integral vector
$c$ for which the maximum is finite.
(ii) The maximum in (1) is attained by an integral optimum solution
for each rational vector $c$ for which the maximum is finite.
(iii) The polyhedron $\{x \mid Ax \le b\}$ is integral.
Now, consider the LP-duality equation
\[ \max\{c^T x \mid Ax \le b\} = \min\{y^T b \mid y \ge 0,\ y^T A = c^T\}. \]
We may derive that the maximum value is an integer if we know that the
minimum has an integral optimum solution and $b$ is integral.
60 / 143
Total Dual Integrality (TDI)
TDI’s Definition - Edmonds and Giles 1977
A system $Ax \le b$ of linear inequalities is totally dual integral (TDI) if
the minimum in $\min\{y^T b \mid y \ge 0,\ y^T A = c^T\}$ is attained by an integral
optimum solution $y$, for each integral vector $c$ such that the minimum is finite.
Corollary
Let $Ax \le b$ be a system of linear inequalities with $A$ rational and $b$
integral. If $Ax \le b$ is TDI then $\{x \mid Ax \le b\}$ is integral.
Edmonds’s photo, 2009
61 / 143
TDI Applications
Arborescences
Let $D = (V, A)$ be a digraph and $r \in V$ be a fixed vertex of $D$.
An r-arborescence is a set $A'$ of $|V| - 1$ arcs forming a spanning tree
such that each vertex $v \ne r$ is entered by exactly one arc in $A'$ (so for any
vertex $v$ there is a unique directed path in $A'$ from $r$ to $v$).
An r-cut is an arc set of the form $\delta^-(U)$ for some $U \subseteq V - \{r\}$, $U \ne \emptyset$.
r-arborescences are minimal (w.r.t. $\subseteq$) sets of arcs intersecting all r-cuts.
r-cuts are minimal (w.r.t. $\subseteq$) sets of arcs intersecting all r-arborescences.
Fulkerson's Optimum Arborescence Theorem
For any "length" function $l : A \to \mathbb{Z}_+$, the minimum "length" of
an r-arborescence is equal to the maximum number $t$ of r-cuts
$C_1, \ldots, C_t$ (repetition allowed) such that no arc $a \in A$ is in more
than $l(a)$ of these cuts.
62 / 143
TDI Applications
Fulkerson’s Optimum Arborescence Theorem
This result can be formulated in polyhedral terms as follows.
Let $C$ be the matrix whose rows are the incidence vectors of all
r-cuts. Hence the columns of $C$ are indexed by $A$ and the rows by
the collection $\mathcal{H} := \{U \mid U \ne \emptyset,\ U \subseteq V - \{r\}\}$.
Then, the theorem is equivalent to both optima in the LP-duality
equation
\[ \min\{l^T x \mid x \ge 0,\ Cx \ge 1\} = \max\{y^T 1 \mid y \ge 0,\ y^T C \le l^T\} \]
having integral solutions, for each $l \in \mathbb{Z}_+^A$.
So, in order to prove the theorem, it suffices to show that the
above maximum has an integral optimum solution, for each
$l \in \mathbb{Z}_+^A$, i.e., that the system $x \ge 0$, $Cx \ge 1$ is TDI.
63 / 143
TDI Applications
Fulkerson’s Optimum Arborescence Theorem
Proof of the Optimum Arborescence Theorem. Since $C$ is
generally not totally unimodular, we find a totally unimodular submatrix
$C'$ of $C$ (consisting of rows of $C$) such that
\[ \max\{y^T 1 \mid y \ge 0,\ y^T C \le l^T\} = \max\{z^T 1 \mid z \ge 0,\ z^T C' \le l^T\}. \]
By the total unimodularity of $C'$, the second maximum is attained by an
integral optimal solution $z$; $z$ can be extended to an integral optimal
solution $y$ for the first maximum (adding 0's in the appropriate positions).
Construction of $C'$: A subcollection $\mathcal{F}$ of $\mathcal{H}$ is called laminar if
for all $T, U \in \mathcal{F}$ we have $T \subseteq U$ or $U \subseteq T$ or $U \cap T = \emptyset$.
If $C'$ is the matrix consisting of the rows of $C$ indexed by some
laminar family $\mathcal{F}$ then $C'$ is totally unimodular.
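The laminarity condition is easy to test pairwise. A minimal sketch, where the function name `is_laminar` and the sample families are assumptions made for illustration:

```python
def is_laminar(family):
    """True if every two sets in the family are nested or disjoint."""
    sets = [frozenset(s) for s in family]
    for i, T in enumerate(sets):
        for U in sets[i + 1:]:
            if not (T <= U or U <= T or T.isdisjoint(U)):
                return False
    return True

nested_family = [{1}, {1, 2}, {3}, {1, 2, 3}]
crossing_family = [{1, 2}, {2, 3}]
```

The second family is not laminar because {1, 2} and {2, 3} intersect without one containing the other.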
64 / 143
TDI Applications
Proof of Fulkerson’s Optimum Arborescence Theorem
Let $\mathcal{F} \subseteq \mathcal{H}$ be a laminar family and $C'$ its corresponding submatrix of $C$.
Let $\mathcal{G}$ be a subcollection of $\mathcal{F}$, that is, a set of rows of $C'$. For each
$U \in \mathcal{G}$ let its "height" $h(U)$ be the number of $T$ in $\mathcal{G}$ such that $T \supseteq U$. Split $\mathcal{G}$ into
$\mathcal{G}_{even}$ and $\mathcal{G}_{odd}$ according as $h(U)$ is even or odd. Since $\mathcal{G}$ is laminar, for
any arc $a \in A$, the number of sets $U$ in $\mathcal{G}_{even}$ with $a \in \delta^-(U)$ and the number
of sets $U$ in $\mathcal{G}_{odd}$ with $a \in \delta^-(U)$ differ by at most 1. Hence we can split the
rows corresponding to $\mathcal{G}$ into two classes fulfilling Ghouila-Houri's
condition on slide 39. So $C'$ is totally unimodular.
Let $l \in \mathbb{Z}_+^A$. Let $y$ be an optimal solution of $\max\{y^T 1 \mid y \ge 0,\ y^T C \le l^T\}$
for which $\sum_{U \in \mathcal{H}} y_U \cdot |U| \cdot |V - U|$ is minimum (there is such a $y$
by a compactness argument).
Let $\mathcal{F} := \{U \mid U \in \mathcal{H},\ y_U > 0\}$. Then $\mathcal{F}$ is laminar.
65 / 143
TDI Applications
Proof of Fulkerson’s Optimum Arborescence Theorem
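Indeed, suppose that there exist $T, U \in \mathcal{F}$ with $T \not\subseteq U \not\subseteq T$ and $T \cap U \ne \emptyset$.
Let $\varepsilon := \min\{y_T, y_U\} > 0$ and reset
\[ y_T := y_T - \varepsilon, \quad y_U := y_U - \varepsilon, \quad
   y_{T \cap U} := y_{T \cap U} + \varepsilon, \quad y_{T \cup U} := y_{T \cup U} + \varepsilon, \]
while $y$ does not change in the other coordinates.
By this resetting, $y^T C$ does not increase in any coordinate (since
$\varepsilon \cdot \chi^{\delta^-(T)} + \varepsilon \cdot \chi^{\delta^-(U)} \ge \varepsilon \cdot \chi^{\delta^-(T \cap U)} + \varepsilon \cdot \chi^{\delta^-(T \cup U)}$), while $y^T 1$
does not change. However, $\sum_{U \in \mathcal{H}} y_U \cdot |U| \cdot |V - U|$ decreases,
contradicting the choice of $y$. Hence $\mathcal{F}$ is laminar,
and the Optimum Arborescence Theorem follows, since we have
proved that the system $x \ge 0$, $Cx \ge 1$ is TDI.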
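One uncrossing step can be traced on a toy instance. The sketch below is an illustration under assumptions: the helper names `potential` and `uncross_once`, the ground set and the weights are hypothetical, and only a single crossing pair is processed per call.

```python
from fractions import Fraction

def potential(y, ground):
    """The quantity sum_U y_U * |U| * |V - U| minimized in the proof."""
    return sum(val * len(U) * (len(ground) - len(U)) for U, val in y.items())

def uncross_once(y):
    """Apply one uncrossing step to a weight dict {frozenset: value}.

    Returns a new dict, or None when the support is already laminar.
    """
    support = [U for U, val in y.items() if val > 0]
    for i, T in enumerate(support):
        for U in support[i + 1:]:
            if T & U and not T <= U and not U <= T:
                eps = min(y[T], y[U])
                new_y = dict(y)
                for S, delta in ((T, -eps), (U, -eps), (T & U, eps), (T | U, eps)):
                    new_y[S] = new_y.get(S, 0) + delta
                return {S: val for S, val in new_y.items() if val != 0}
    return None

ground = {"r", 1, 2, 3}          # vertex set V with root "r"
y = {frozenset({1, 2}): Fraction(1), frozenset({2, 3}): Fraction(1)}
y_new = uncross_once(y)
```

Here the crossing pair {1, 2}, {2, 3} is replaced by {2} and {1, 2, 3}: the total weight stays 2 while the potential drops from 8 to 6.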
66 / 143
TDI Applications
r -arborescences polytope
From the above proof, we obtain that the r-arborescences
polytope of a digraph $D = (V, A)$, $r \in V$,
$\operatorname{convhull}\{\chi^P \mid P \text{ r-arborescence}\}$,
is described by
\[ 0 \le x_a \le 1 \quad \forall a \in A, \qquad
   \sum_{a \in \delta^-(U)} x_a \ge 1 \quad \forall U \subseteq V - \{r\},\ U \ne \emptyset. \]
This description gives, via the ellipsoid method, a polynomial
time algorithm to find a minimum length r-arborescence.
Indeed, given $x \in \mathbb{Q}^A$ we first test if $0 \le x_a \le 1$ for each $a$; if $x_a < 0$ or
$x_a > 1$ we have a separating hyperplane. Otherwise, consider $x$ as a
capacity function and find (with a maximum flow algorithm) a
minimum capacity r-cut $C$. If $C$ has capacity at least 1, then $x$ belongs to
the r-arborescences polytope, otherwise $C$ gives a separating hyperplane.
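The separation routine just described can be sketched as follows. To keep the example short, the minimum r-cut is found by enumerating all sets $U$ instead of by a maximum flow computation, so this version is exponential and only meant for tiny digraphs; the function name `separate`, the tolerance and the example data are assumptions.

```python
from itertools import combinations

def separate(vertices, root, arcs, x, tol=1e-9):
    """Return None if x satisfies the system, else a violated inequality.

    arcs: list of (tail, head); x: list of values aligned with arcs.
    A violated bound is reported as ("bound", arc_index), a violated
    r-cut as ("cut", U) with U the vertex set defining it.
    """
    for i, value in enumerate(x):
        if value < -tol or value > 1 + tol:
            return ("bound", i)
    others = [v for v in vertices if v != root]
    best_value, best_set = None, None
    # Brute-force stand-in for the max-flow based minimum r-cut computation.
    for size in range(1, len(others) + 1):
        for U in combinations(others, size):
            inside = set(U)
            value = sum(xa for (t, h), xa in zip(arcs, x)
                        if t not in inside and h in inside)
            if best_value is None or value < best_value:
                best_value, best_set = value, inside
    if best_value is not None and best_value < 1 - tol:
        return ("cut", best_set)
    return None

vertices = ["r", "a", "b"]
arcs = [("r", "a"), ("r", "b"), ("a", "b"), ("b", "a")]
arborescence = [1, 0, 1, 0]          # incidence vector of {(r,a), (a,b)}
fractional = [0.5, 0, 0.5, 0.5]      # only 0.5 enters the set {b}
```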
67 / 143
TDI Applications
Directed-cut polytope
One can similarly show that, for a digraph $D = (V, A)$, the following
system is TDI:
\[ 0 \le x_a \le 1 \quad \forall a \in A, \qquad
   \sum_{a \in \delta^-(U)} x_a \ge 1 \quad \forall U \subseteq V,\ \emptyset \ne U \ne V,\ \delta^+(U) = \emptyset. \]
This is equivalent to the Lucchesi and Younger Theorem.
A directed cut is a set of arcs of the form $\delta^-(U)$, with $\emptyset \ne U \ne V$ and
$\delta^+(U) = \emptyset$.
A directed-cut covering is a set of arcs intersecting each directed cut.
Equivalently, it is a set of arcs whose contraction makes the digraph
strongly connected.
68 / 143
TDI Applications
Lucchesi and Younger Theorem
The minimum size of a directed-cut covering in a digraph $D = (V, A)$
is equal to the maximum number of pairwise disjoint directed cuts.
This theorem has a nice self-refining nature: for any "length"
function $l : A \to \mathbb{Z}_+$, the minimum length of a directed-cut
covering is equal to the maximum number $t$ of directed cuts
$C_1, \ldots, C_t$ (repetition allowed), so that no arc $a$ is in more than $l(a)$
of these cuts (to derive this from the above theorem, replace each
arc $a$ by a path of length $l(a)$).
In the weighted form, the theorem is easily seen to be equivalent
to the total dual integrality of the directed-cut system.
69 / 143
TDI Applications
Polymatroid Intersections
Let $S$ be a finite set. A function $f : 2^S \to \mathbb{R}$ is called submodular
if
\[ f(T) + f(U) \ge f(T \cup U) + f(T \cap U) \quad \text{for all } T, U \subseteq S. \]
The rank function of any matroid is submodular.
A matroid can be defined as a pair $M = (S, \mathcal{I})$ where $\mathcal{I} \subseteq 2^S$, the
family of independent sets of $M$, satisfies (i) $\emptyset \in \mathcal{I}$, (ii) if $I \in \mathcal{I}$
then $I' \in \mathcal{I}$ for each $I' \subseteq I$ and (iii) the rank function of $M$,
$\rho : 2^S \to \mathbb{Z}_+$ with $\rho(A) = \max\{|I| \mid I \in \mathcal{I},\ I \subseteq A\}$, is submodular.
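Submodularity can be verified exhaustively on a small ground set. In this sketch the helper names and the choice of the uniform matroid as test case are assumptions; the check visits all pairs of subsets and is therefore only suitable for very small $S$.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = list(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Check f(T) + f(U) >= f(T | U) + f(T & U) for all T, U."""
    all_sets = list(subsets(ground))
    return all(f(T) + f(U) >= f(T | U) + f(T & U)
               for T in all_sets for U in all_sets)

def uniform_rank(k):
    """Rank function of the uniform matroid: rho(A) = min(|A|, k)."""
    return lambda A: min(len(A), k)

S = {1, 2, 3, 4}
rank_ok = is_submodular(uniform_rank(2), S)
square_ok = is_submodular(lambda A: len(A) ** 2, S)  # |A|^2 is not submodular
```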
70 / 143
TDI Applications
Polymatroid Intersections
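Let $f_1, f_2$ be two submodular functions on $S$ and consider the
following system in the variable $x \in \mathbb{R}^S$:
\[ (*) \qquad \begin{aligned}
x_s &\ge 0 && \forall s \in S, \\
\sum_{s \in U} x_s &\le f_1(U) && \forall U \subseteq S, \\
\sum_{s \in U} x_s &\le f_2(U) && \forall U \subseteq S.
\end{aligned} \]
Theorem (Edmonds). The above system $(*)$ is TDI.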
71 / 143
TDI Applications
Polymatroid Intersections
Proof. Let $c \in \mathbb{Z}^S$; the dual of the LP $\max\{c^T x \mid x \text{ satisfies } (*)\}$ is:
\[ \min\Big\{ \sum_{U \subseteq S} y_U f_1(U) + \sum_{U \subseteq S} z_U f_2(U) \ \Big|\
   y, z \in \mathbb{R}_+^{2^S},\ \sum_{U \subseteq S} (y_U + z_U) \chi^U \ge c \Big\}. \]
We use the uncrossing technique to show that this minimum has an integral
optimum solution: let $y, z$ attain this minimum with
$\sum_{U \subseteq S} (y_U + z_U) \cdot |U| \cdot |S - U|$ as small as possible. Then $\mathcal{F} := \{U \mid y_U > 0\}$
forms a chain with respect to inclusion (the proof, by resetting $y$, is
similar to that on slide 66). Similarly, $\mathcal{G} := \{U \mid z_U > 0\}$ forms a chain.
Since $y$ and $z$ attain the minimum above and $\mathcal{F}$ and $\mathcal{G}$ form chains with
respect to inclusion, it follows that this minimum is
\[ \min\Big\{ \sum_{U \in \mathcal{F}} y_U f_1(U) + \sum_{U \in \mathcal{G}} z_U f_2(U) \ \Big|\
   y \in \mathbb{R}_+^{\mathcal{F}},\ z \in \mathbb{R}_+^{\mathcal{G}},\
   \sum_{U \in \mathcal{F}} y_U \chi^U + \sum_{U \in \mathcal{G}} z_U \chi^U \ge c \Big\}. \]
The constraint matrix in this new problem is totally unimodular, by
Ghouila-Houri's criterion. Hence it has an integral optimal solution $y, z$
which can be extended with 0's to obtain an integral optimal solution for
the first minimum.
72 / 143
TDI Applications
Polymatroid Intersections
If $f_1$ and $f_2$ are integer-valued submodular functions, the total dual integrality of system
$(*)$ implies that it defines an integral polyhedron. In particular, if $f_1$ and $f_2$
are the rank functions of the matroids $M_1 = (S, \mathcal{I}_1)$ and $M_2 = (S, \mathcal{I}_2)$,
we obtain
Corollary (Edmonds 1970). The polytope $\operatorname{convhull}\{\chi^I \mid I \in \mathcal{I}_1 \cap \mathcal{I}_2\}$ is
determined by $(*)$.
Proof. Observe that an integral vector satisfies $(*)$ if and only if it is
equal to $\chi^I$ for some $I \in \mathcal{I}_1 \cap \mathcal{I}_2$.
A very special case, $M_1 = M_2$: the independence polytope of a matroid
$M = (S, \mathcal{I})$ with rank function $f$, $\operatorname{convhull}\{\chi^I \mid I \in \mathcal{I}\}$, is determined by:
\[ x_s \ge 0 \quad \forall s \in S, \qquad \sum_{s \in U} x_s \le f(U) \quad \forall U \subseteq S. \]
73 / 143
TDI Applications
Edmonds' Matroid Intersection Theorem
The maximum size of a common independent set of two matroids
$(S, \mathcal{I}_1)$ and $(S, \mathcal{I}_2)$ is equal to $\min_{U \subseteq S} [f_1(U) + f_2(S - U)]$, where $f_1$
and $f_2$ are the rank functions of these matroids.
Proof. By the above corollary, the maximum size of a common
independent set is $\max\{1^T x \mid x \text{ satisfies } (*)\}$, and hence, by the total
dual integrality of $(*)$, it is equal to
\[ \min\Big\{ \sum_{U \subseteq S} y_U f_1(U) + \sum_{U \subseteq S} z_U f_2(U) \ \Big|\
   y, z \in \mathbb{R}_+^{2^S},\ \sum_{U \subseteq S} (y_U + z_U) \chi^U \ge 1 \Big\}. \]
It is not difficult to show (using the nonnegativity, monotonicity and
submodularity of $f_1$ and $f_2$) that this last minimum is equal to that
given in the theorem.
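The min-max formula can be checked by brute force on a tiny instance. The sketch below assumes two partition matroids on the edge set of a small bipartite graph (so common independent sets are matchings); all names and the example graph are hypothetical, and both sides are computed by full enumeration.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a list of elements, as frozensets."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def partition_rank(block_of):
    """Rank of a partition matroid allowing one element per block."""
    return lambda A: len({block_of[e] for e in A})

def max_common_independent(ground, rank1, rank2):
    """Largest set that is independent (rank == size) in both matroids."""
    return max(len(I) for I in subsets(ground)
               if rank1(I) == len(I) and rank2(I) == len(I))

def min_rank_split(ground, rank1, rank2):
    """min over U of rank1(U) + rank2(S - U)."""
    full = frozenset(ground)
    return min(rank1(U) + rank2(full - U) for U in subsets(ground))

# Bipartite graph with left vertices a, b, c and right vertices x, y.
edges = [("a", "x"), ("a", "y"), ("b", "x"), ("c", "x")]
f1 = partition_rank({e: e[0] for e in edges})   # one edge per left vertex
f2 = partition_rank({e: e[1] for e in edges})   # one edge per right vertex
largest = max_common_independent(edges, f1, f2)
bound = min_rank_split(edges, f1, f2)
```

For this choice of matroids the identity is König's theorem: the maximum matching {a-y, b-x} has size 2, matched by the split U = {edges at a}.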
74 / 143
TDI Applications
Matching polytope
The matching polytope of a graph $G = (V, E)$ is
$\operatorname{convhull}\{\chi^M \mid M \text{ matching in } G\}$.
Edmonds showed that this can be described as the set of all vectors
$x \in \mathbb{R}^E$ satisfying
\[ (**) \qquad \begin{aligned}
x_e &\ge 0 && \forall e \in E, \\
\sum_{e \ni v} x_e &\le 1 && \forall v \in V, \\
\sum_{e \subseteq U} x_e &\le \lfloor \tfrac12 |U| \rfloor && \forall U \subseteq V.
\end{aligned} \]
Since the integral vectors satisfying $(**)$ are exactly the incidence
vectors $\chi^M$ of matchings $M$ of $G$, it suffices to show that $(**)$
determines an integral polyhedron.
Theorem (Cunningham and Marsh, 1978). System $(**)$ is TDI.
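To see why the third group of inequalities in $(**)$ is needed, the following sketch lists the violated constraints for a given vector on a triangle. The function name and the example are assumptions; the set constraints are enumerated exhaustively, which is only feasible for tiny graphs.

```python
from fractions import Fraction
from itertools import combinations

def violated_constraints(vertices, edges, x):
    """List the constraints of (**) violated by x (a dict edge -> value)."""
    violated = []
    for e in edges:
        if x[e] < 0:
            violated.append(("nonneg", e))
    for v in vertices:
        if sum(x[e] for e in edges if v in e) > 1:
            violated.append(("degree", v))
    for size in range(2, len(vertices) + 1):
        for U in combinations(vertices, size):
            inside = sum(x[e] for e in edges if set(e) <= set(U))
            if inside > size // 2:
                violated.append(("set", U))
    return violated

triangle_vertices = [1, 2, 3]
triangle_edges = [(1, 2), (2, 3), (1, 3)]
half = {e: Fraction(1, 2) for e in triangle_edges}
single_edge = {(1, 2): 1, (2, 3): 0, (1, 3): 0}
```

The all-halves vector satisfies nonnegativity and the degree constraints but has total value 3/2 on the odd set {1, 2, 3}, so it is cut off by the set constraint with right-hand side 1.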
75 / 143
TDI Applications
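What does it mean that $(**)$ is TDI?
For each $w \in \mathbb{Z}^E$ both optima in the LP-duality equation
\[ \max\{w^T x \mid x \text{ satisfies } (**)\} =
   \min\Big\{ \sum_{v \in V} y_v + \sum_{U \subseteq V} z_U \lfloor \tfrac12 |U| \rfloor \ \Big|\
   y \in \mathbb{Z}_+^V,\ z \in \mathbb{Z}_+^{2^V},\
   \forall e \in E: \sum_{v \in e} y_v + \sum_{U \supseteq e} z_U \ge w_e \Big\} \]
are attained by integral optimum solutions. This means that for every
undirected graph $G = (V, E)$ and every "weight" function $w : E \to \mathbb{Z}$ we
have: (1)
\[ \max\{w(M) \mid M \text{ matching}\} =
   \min\Big\{ \sum_{v \in V} y_v + \sum_{U \subseteq V} z_U \lfloor \tfrac12 |U| \rfloor \ \Big|\
   y \in \mathbb{Z}_+^V,\ z \in \mathbb{Z}_+^{2^V},\
   \forall e \in E: \sum_{v \in e} y_v + \sum_{U \supseteq e} z_U \ge w_e \Big\}. \]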
76 / 143
TDI Applications
Proof that (∗∗) is TDI (1)
From (1) it follows that we can suppose that $w$ is nonnegative. Let $\nu_w$ be
the maximum value in (1). Since the inequality $\le$ in (1) follows by
duality, we show only the inequality $\ge$. Suppose that it does not hold and
choose $G$ and $w$ violating (1) with $|V| + |E| + w(E)$ as small as possible.
Then $G$ is connected and $w_e \ge 1$ for each edge $e$.
Case 1: there is a vertex $v \in V$ covered by every maximum-weight matching. Let $w'$
be the weight obtained from $w$ by decreasing the weights of the edges
incident to $v$ by 1. Then $\nu_{w'} = \nu_w - 1$. Since $w'(E) < w(E)$, (1) holds
for $w'$. Increasing by 1 the component $y_v$ of an optimal $y$ for $w'$, we obtain
that (1) holds for $w$.
Case 2: no vertex $v \in V$ is covered by every maximum-weight matching. Let $w'$
be the weight obtained from $w$ by decreasing the weights of all edges by
1. We will show that $\nu_w \ge \nu_{w'} + \lfloor \tfrac12 |V| \rfloor$. Since $w'(E) < w(E)$, (1) holds
for $w'$. Increasing by 1 the component $z_V$ of an optimal $z$ for $w'$, we obtain
that (1) holds for $w$.
77 / 143
TDI Applications
Proof that (∗∗) is TDI (2)
Suppose that $\nu_w < \nu_{w'} + \lfloor \tfrac12 |V| \rfloor$ and let $M$ be a matching with
$w'(M) = \nu_{w'}$ and such that $w(M)$ is as large as possible. $M$ leaves two
vertices uncovered, since otherwise $w(M) = w'(M) + \lfloor \tfrac12 |V| \rfloor$, implying
$\nu_w \ge w(M) = w'(M) + \lfloor \tfrac12 |V| \rfloor = \nu_{w'} + \lfloor \tfrac12 |V| \rfloor$.
Choose such an $M$ and vertices $u, v$ uncovered by $M$ such that the distance $d(u, v)$ in $G$ is as
small as possible. Then $d(u, v) > 1$, else we could add the edge $\{u, v\}$ to
$M$, increasing $w(M)$. Let $t$ be an internal vertex on a shortest path
between $u$ and $v$. Let $M'$ be a matching with $w(M') = \nu_w$ not covering $t$.
Let $P$ be the edge set of the component of the graph $(V, M \Delta M')$ containing $t$. $P$ is a
path not covering both $u$ and $v$. Suppose $u$ is not covered by $P$. $M \Delta P$
is a matching with $|M \Delta P| \le |M|$ and
\[ w'(M \Delta P) - w'(M) = w(M \Delta P) - |M \Delta P| - w(M) + |M|
   \ge w(M \Delta P) - w(M) = w(M') - w(M' \Delta P) \ge 0. \]
Hence $\nu_{w'} = w'(M \Delta P)$ and $w(M \Delta P) \ge w(M)$. But $M \Delta P$ does not
cover $t$ and $u$, and $d(u, t) < d(u, v)$, contradicting the choice of $M$, $u$,
and $v$.
78 / 143
TDI Applications
Since $(**)$ is TDI it follows:
Edmonds' Matching Polyhedron Theorem
The matching polyhedron of a graph is equal to the polyhedron
determined by $(**)$.
Perfect matching polytope of a graph $G$
$\operatorname{convhull}\{\chi^M \mid M \text{ perfect matching in } G\}$
is determined by
\[ \begin{aligned}
x_e &\ge 0 && \forall e \in E, \\
\sum_{e \ni v} x_e &= 1 && \forall v \in V, \\
\sum_{e \in \delta(U)} x_e &\ge 1 && \forall U \subseteq V,\ |U| \text{ odd.}
\end{aligned} \]
Note that the last two groups of restrictions imply the last group of
restrictions in $(**)$.
79 / 143
Blocking Polyhedra
Fulkerson (1970)
One polyhedral characterization (or min-max relation) can be
derived from another one, and conversely.
Basic Idea: Let $d_1, \ldots, d_t, c_1, \ldots, c_m \in \mathbb{R}^n_+$ satisfy
\[ (1) \qquad \operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid d_j^T x \ge 1,\ j = 1, \ldots, t\}. \]
Then the same holds after interchanging the $c_i$ and $d_j$:
\[ (2) \qquad \operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid c_i^T x \ge 1,\ i = 1, \ldots, m\}. \]
In a sense, in (2) the ideas of "vertex" and "facet" are interchanged
as compared to (1).
80 / 143
Blocking Polyhedra
Theorem. For any $c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}^n_+$, (1) $\Leftrightarrow$ (2).
Proof. We show (1) $\Rightarrow$ (2) ($\Leftarrow$ follows by symmetry).
Suppose (1) holds. Then, in (2), $\subseteq$ holds. Indeed, from (1) we have that
$c_i^T d_j\ (= d_j^T c_i) \ge 1$ for all $i, j$. If $x \in \operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_+$, then
$x = \lambda_1 d_1 + \cdots + \lambda_t d_t + u$ with $\lambda_k \ge 0$, $\sum_{k=1}^t \lambda_k = 1$ and $u \ge 0$. Then
\[ c_i^T x = \sum_{k=1}^t \lambda_k c_i^T d_k + c_i^T u \ge \sum_{k=1}^t \lambda_k \cdot 1 + c_i^T u \ge 1 + 0 = 1, \]
that is, $x \in \{x \in \mathbb{R}^n_+ \mid c_i^T x \ge 1,\ i = 1, \ldots, m\}$.
To show that in (2) $\supseteq$ holds, suppose $x \notin \operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_+$.
Then there is a separating hyperplane: there exists $y \in \mathbb{R}^n$ such that
\[ (3) \qquad y^T x < \min\{y^T z \mid z \in \operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_+\}. \]
We may assume
$t \ge 1$ [for $t = 0$ it follows from (1) that $0 \in \{c_1, \ldots, c_m\}$, so $x$ does not belong
to the right side of (2)]. By scaling $y$, we can suppose that the minimum
in (3) is 1. Hence $y$ belongs to the right side of (1) and also to the left
side: $y \ge \sum_{i=1}^m \lambda_i c_i$ for certain $\lambda_i \ge 0$ with $\sum_{i=1}^m \lambda_i = 1$. Since $y^T x < 1$
we obtain that $c_i^T x < 1$ for at least one $i$. Hence $x$ does not belong to the right
side of (2).
81 / 143
Blocking Polyhedra
Blocking pair of polyhedra
For $X \subseteq \mathbb{R}^n$ the blocker $B(X)$ of $X$ is
\[ B(X) := \{x \in \mathbb{R}^n_+ \mid y^T x \ge 1 \text{ for each } y \in X\}. \]
For $c_1, \ldots, c_m \in \mathbb{R}^n_+$, the blocker $B(P)$ of the polyhedron
$P = \operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^n_+$ (4) can be expressed as
\[ B(P) = \{x \in \mathbb{R}^n_+ \mid c_i^T x \ge 1,\ i = 1, \ldots, m\}. \]
So $B(P)$ is also a polyhedron, called the blocking polyhedron of $P$. If
$R = B(P)$ then the pair $P, R$ is called a blocking pair of polyhedra.
Corollary 1. For any polyhedron $P$ of type (4), $B(B(P)) = P$.
Both relations (1) and (2) are equivalent to: the pair
$\operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^n_+$ and $\operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_+$ forms a
blocking pair of polyhedra.
82 / 143
Blocking Polyhedra
Blocking pair of polyhedra
Corollary 2. Let $d_1, \ldots, d_t, c_1, \ldots, c_m \in \mathbb{R}^n_+$. The following are
equivalent:
(i) for all $l \in \mathbb{R}^n_+$:
\[ \min\{l^T c_1, \ldots, l^T c_m\} = \max\Big\{\lambda_1 + \cdots + \lambda_t \ \Big|\ \lambda_1, \ldots, \lambda_t \in \mathbb{R}_+,\ \sum_{j=1}^t \lambda_j d_j \le l\Big\}; \]
(ii) for all $w \in \mathbb{R}^n_+$:
\[ \min\{w^T d_1, \ldots, w^T d_t\} = \max\Big\{\mu_1 + \cdots + \mu_m \ \Big|\ \mu_1, \ldots, \mu_m \in \mathbb{R}_+,\ \sum_{i=1}^m \mu_i c_i \le w\Big\}. \]
Proof. By LP duality, the maximum in (i) is equal to
$\min\{l^T x \mid x \in \mathbb{R}^n_+,\ d_j^T x \ge 1,\ j = 1, \ldots, t\}$. Hence (i) is equivalent to (1).
Similarly, (ii) is equivalent to (2). Hence the theorem on slide 81 implies
Corollary 2.
83 / 143
Blocking Polyhedra
Lehman’s Length-Width inequality
Corollary 2. Let $d_1, \ldots, d_t, c_1, \ldots, c_m \in \mathbb{R}^n_+$. Then (1) [equivalently (2),
(i), and (ii)] holds if and only if
\[ (*) \qquad d_j^T c_i \ge 1 \quad \forall i = 1, \ldots, m;\ \forall j = 1, \ldots, t, \]
\[ (**) \qquad \min\{l^T c_1, \ldots, l^T c_m\} \cdot \min\{w^T d_1, \ldots, w^T d_t\} \le l^T w \quad \forall l, w \in \mathbb{Z}^n_+. \]
Proof. Suppose that $(*)$ and $(**)$ hold. We derive (i).
Let $l \in \mathbb{R}^n_+$. By LP duality, the maximum in (i) is equal to
$\min\{l^T x \mid x \in \mathbb{R}^n_+,\ d_j^T x \ge 1,\ j = 1, \ldots, t\}$. Let this minimum be attained by a
vector $w$. Then by $(*)$ and $(**)$ we have
\[ l^T w \ge (\min_i l^T c_i)(\min_j w^T d_j) \ge \min_i l^T c_i \ge l^T w. \]
Hence the minimum in (i) is equal to $l^T w$.
Conversely, suppose that (1) holds. Then (i) and (ii) hold. Now $(*)$
follows by taking $l = d_j$ in (i).
84 / 143
Blocking Polyhedra
Lehman’s Length-Width inequality
To show $(**)$, let $\lambda_1, \ldots, \lambda_t$ and $\mu_1, \ldots, \mu_m$ attain the maxima in (i) and
(ii), respectively.
Then
\[ \Big(\sum_j \lambda_j\Big)\Big(\sum_i \mu_i\Big) = \sum_j \sum_i \lambda_j \mu_i
   \le \sum_j \sum_i \lambda_j \mu_i\, d_j^T c_i
   = \Big(\sum_j \lambda_j d_j\Big)^T \Big(\sum_i \mu_i c_i\Big) \le l^T w. \]
This implies $(**)$.
Remark.
It follows from the ellipsoid method that if $d_1, \ldots, d_t, c_1, \ldots, c_m \in \mathbb{R}^n_+$
satisfy (1) [equivalently (2), (i), and (ii)], then
for all $l \in \mathbb{R}^n_+$: $\min\{l^T c_1, \ldots, l^T c_m\}$ can be found in polynomial time
if and only if
for all $w \in \mathbb{R}^n_+$: $\min\{w^T d_1, \ldots, w^T d_t\}$ can be found in polynomial time.
This is particularly interesting when one of $t$ or $m$ is exponentially large.
85 / 143
Blocking Polyhedra – Applications
Shortest Paths and Network Flows
Given a digraph $D = (V, A)$, a source node $s$ and a sink node $t$.
Let $c_1, \ldots, c_m \in \mathbb{R}^A_+$ be the incidence vectors of the s-t-paths in $D$.
Let $d_1, \ldots, d_t \in \mathbb{R}^A_+$ be the incidence vectors of the s-t-cuts.
Given a "length" function $l : A \to \mathbb{Z}_+$, the minimum length of an
s-t-path is equal to the maximum number of s-t-cuts (repetition allowed)
so that no arc $a \in A$ is in more than $l(a)$ of these cuts
[indeed, the inequality min $\ge$ max is easy;
for the converse inequality: if $p$ is the minimum length of an s-t-path, then let
$V_i = \{v \in V \mid \text{the length of a shortest s-v path is} \ge i\}$ for $i = 1, \ldots, p$.
Then $\delta^-(V_1), \ldots, \delta^-(V_p)$ are the required s-t-cuts].
This implies (i). Hence (ii) holds (slide 83).
But this is equivalent to the max-flow-min-cut theorem:
But this is equivalent to the max-flow-min-cut theorem:
86 / 143
Blocking Polyhedra – Applications
Shortest Paths and Network Flows
the maximum amount of an s-t-flow subject to a capacity function
$w$ is
equal to the minimum capacity of an s-t-cut (observe that $\sum_i \mu_i c_i$ is an
s-t-flow).
This means that the polyhedra $\operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^A_+$ and
$\operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^A_+$ form a blocking pair of polyhedra.
By the remark on slide 85, the polynomial-time solvability of the
minimum capacitated cut problem is equivalent to that of the
shortest-path problem. The latter problem is much easier.
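The packing of s-t-cuts built from distance levels can be made concrete. The sketch below assumes nonnegative integer lengths and that $t$ is reachable from $s$; the function names and the example digraph are hypothetical, and Dijkstra's algorithm is used here simply as one standard way to get the distances.

```python
import heapq

def shortest_distances(vertices, arcs, length, source):
    """Dijkstra on nonnegative integer lengths; arcs is a list of (tail, head)."""
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for (t, h), l in zip(arcs, length):
            if t == v and d + l < dist[h]:
                dist[h] = d + l
                heapq.heappush(heap, (d + l, h))
    return dist

def distance_cuts(vertices, arcs, length, s, t):
    """Return the cuts delta^-(V_i), i = 1..p, as lists of arc indices."""
    dist = shortest_distances(vertices, arcs, length, s)
    p = dist[t]                      # assumes t is reachable from s
    cuts = []
    for i in range(1, p + 1):
        V_i = {v for v in vertices if dist[v] >= i}
        cuts.append([k for k, (a, b) in enumerate(arcs)
                     if a not in V_i and b in V_i])
    return cuts

vertices = ["s", "a", "b", "t"]
arcs = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]
length = [1, 2, 2, 2]
cuts = distance_cuts(vertices, arcs, length, "s", "t")
usage = [sum(k in cut for cut in cuts) for k in range(len(arcs))]
```

The shortest s-t-path has length 3 and the construction returns 3 cuts; arc k lies in at most `length[k]` of them because an arc (x, y) enters $V_i$ only for dist(x) < i ≤ dist(y).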
87 / 143
Blocking Polyhedra – Applications
r-arborescences
We know (see slide 67) that the r-arborescences polytope of a digraph
$D = (V, A)$, $r \in V$, $\operatorname{convhull}\{\chi^P \mid P \text{ r-arborescence}\}$, is described by
\[ 0 \le x_a \le 1 \quad \forall a \in A, \qquad
   \sum_{a \in \delta^-(U)} x_a \ge 1 \quad \forall U \subseteq V - \{r\},\ U \ne \emptyset. \]
It follows that (1) (slide 80) holds. Since (2) is equivalent to (1), for
any "capacity" function $w \in \mathbb{R}^A_+$, the minimum capacity of an r-cut is
equal to the maximum value of $\mu_1 + \cdots + \mu_k$, where $\mu_1, \ldots, \mu_k \ge 0$ are
such that there exist r-arborescences $T_1, \ldots, T_k$ with the property that
for any arc $a$, the sum of the $\mu_j$ for which $a \in T_j$ is at most $w_a$.
Hence the convex hull of the incidence vectors of sets containing an r-cut
as a subset is determined by the system (in $x \in \mathbb{R}^A$):
\[ 0 \le x_a \le 1 \quad \forall a \in A, \qquad
   \sum_{a \in T} x_a \ge 1 \quad \forall T \text{ r-arborescence.} \]
88 / 143
Anti-blocking Polyhedra
Fulkerson (1971)
Results analogous to those for blocking polyhedra, proofs omitted.
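Let $d_1, \ldots, d_t, c_1, \ldots, c_m \in \mathbb{R}^n_+$ be such that
$\dim\langle c_1, \ldots, c_m \rangle = \dim\langle d_1, \ldots, d_t \rangle = n$.
Then the following are equivalent:
(1) $(\operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid d_j^T x \le 1,\ j = 1, \ldots, t\}$
(2) $(\operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid c_i^T x \le 1,\ i = 1, \ldots, m\}$
If $X \subseteq \mathbb{R}^n$, then the anti-blocker $A(X)$ of $X$ is
\[ A(X) := \{x \in \mathbb{R}^n_+ \mid y^T x \le 1 \ \forall y \in X\}. \]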
89 / 143
Anti-blocking Polyhedra
Anti-blocking pair of polyhedra
If $P = (\operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$ then
\[ A(P) = \{x \in \mathbb{R}^n_+ \mid c_i^T x \le 1,\ i = 1, \ldots, m\}. \]
$A(P)$ is called the anti-blocking polyhedron of $P$. If $R = A(P)$ then
the pair $P, R$ is called an anti-blocking pair of polyhedra.
Clearly $A(A(P)) = P$ for any polyhedron $P$ of the above type.
90 / 143
Anti-blocking Polyhedra
Theorem
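Each of the following is equivalent to (1) and (2):
(a) $(\operatorname{convhull}\{c_1, \ldots, c_m\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$ and $(\operatorname{convhull}\{d_1, \ldots, d_t\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$
form an anti-blocking pair of polyhedra.
(b) for all $l \in \mathbb{R}^n_+$: $\max\{l^T c_1, \ldots, l^T c_m\} =
\min\{\lambda_1 + \ldots + \lambda_t \mid \lambda_1, \ldots, \lambda_t \in \mathbb{R}_+;\ \sum_j \lambda_j d_j \ge l\}$
(c) for all $w \in \mathbb{R}^n_+$: $\max\{w^T d_1, \ldots, w^T d_t\} =
\min\{\mu_1 + \ldots + \mu_m \mid \mu_1, \ldots, \mu_m \in \mathbb{R}_+;\ \sum_i \mu_i c_i \ge w\}$
(d) (i) $d_j^T c_i \le 1$ for all $i = 1, \ldots, m$ and $j = 1, \ldots, t$;
(ii) $\max\{l^T c_1, \ldots, l^T c_m\} \cdot \max\{w^T d_1, \ldots, w^T d_t\} \ge l^T w$ for all $l, w \in \mathbb{Z}^n_+$.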
91 / 143
Anti-blocking Polyhedra – Applications
Perfect Graphs
$G = (V, E)$ is a perfect graph (Berge 1960) if the chromatic number
of any induced subgraph $G'$ of $G$ is equal to the maximum cardinality of
a clique in $G'$. If $C$ is the matrix having as rows the incidence vectors of
the cliques in $G$, then $G$ is perfect if and only if for each $\{0,1\}$-vector $w$ the
(dual) linear programs
\[ (*) \qquad \max\{w^T x \mid x \ge 0,\ Cx \le 1\} = \min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\} \]
have integer optimal solutions.
The stable-set polytope of the graph $G = (V, E)$ is
$\operatorname{STAB}(G) = \operatorname{convhull}\{\chi^S \mid S \text{ stable set in } G\}$.
Let $\operatorname{Ch}(G)$ be the polytope
\[ x_v \ge 0 \quad \forall v \in V, \qquad \sum_{v \in K} x_v \le 1 \quad \forall K \text{ clique in } G. \]
Note that $\operatorname{Ch}(G) = A(\operatorname{STAB}(\overline{G}))$, where $\overline{G}$ is the complement of $G$.
Also, $\operatorname{STAB}(C_5) \subsetneq \operatorname{Ch}(C_5)$, where $C_5$ is the circuit on 5 vertices.
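The strict inclusion for $C_5$ can be witnessed by the all-halves vector. In the sketch below the helper names are assumptions and everything is computed by brute force: the vector satisfies every clique inequality, yet its coordinate sum 5/2 exceeds the stability number 2, which no point of the stable-set polytope can do.

```python
from fractions import Fraction
from itertools import combinations

def is_stable(subset, edges):
    """True if no edge has both endpoints in subset."""
    return not any(u in subset and v in subset for u, v in edges)

def stability_number(n, edges):
    """Size of a largest stable set, by enumeration."""
    return max(len(S) for size in range(n + 1)
               for S in combinations(range(n), size) if is_stable(S, edges))

def cliques(n, edges):
    """All non-empty cliques of the graph, by brute force."""
    adjacent = {frozenset(e) for e in edges}
    return [K for size in range(1, n + 1) for K in combinations(range(n), size)
            if all(frozenset(p) in adjacent for p in combinations(K, 2))]

c5_edges = [(i, (i + 1) % 5) for i in range(5)]
x = [Fraction(1, 2)] * 5
in_ch = all(sum(x[v] for v in K) <= 1 for K in cliques(5, c5_edges))
alpha = stability_number(5, c5_edges)
```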
92 / 143
Anti-blocking Polyhedra – Applications
Chvátal's Theorem
$G = (V, E)$ is a perfect graph if and only if $\operatorname{STAB}(G) = \operatorname{Ch}(G)$.
This is equivalent to
$G = (V, E)$ is perfect $\Leftrightarrow$ $\operatorname{STAB}(G) = A(\operatorname{STAB}(\overline{G}))$.
Lovász's Perfect Graph Theorem
The complement of a perfect graph is a perfect graph.
Proof. If $G$ is perfect then $\operatorname{STAB}(G) = A(\operatorname{STAB}(\overline{G}))$. Hence
$\operatorname{STAB}(\overline{G}) = A(A(\operatorname{STAB}(\overline{G}))) = A(\operatorname{STAB}(G))$. Therefore $\overline{G}$ is perfect.
It follows (ellipsoid method) that a maximum-weight stable set in a
perfect graph can be found in polynomial time iff a maximum-weight
clique can be found in polynomial time.
Since the complement of a perfect graph is again perfect, this does not
give any reduction of one problem to the other!
93 / 143
Anti-blocking Polyhedra
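Proof of Chvátal's Theorem: $G$ perfect $\Rightarrow$ $\operatorname{STAB}(G) = \operatorname{Ch}(G)$
Proof. Suppose $G$ is perfect. Let $\alpha_w$ be the maximum weight of a stable set
of $G$ for $w : V \to \mathbb{Z}_+$. We prove
\[ (\dagger) \qquad \alpha_w = \max\{w^T x \mid x \ge 0,\ Cx \le 1\} \]
for each $w \in \mathbb{Z}^V_+$, by induction on $\sum_{v \in V} w_v$.
If $w$ is a $\{0,1\}$-vector, this follows from the characterization $(*)$ of perfect graphs.
Let $w_u \ge 2$ for some $u \in V$. Let $e$ be the vector with $e_u = 1$ and $e_v = 0$
for $v \in V - \{u\}$. Replacing $w$ by $w - e$ in $(*)$, we obtain, by induction, a
vector $y \ge 0$ so that $y^T C \ge (w - e)^T$ and $y^T 1 = \alpha_{w-e}$. Since
$(w - e)_u \ge 1$, there is a clique $K$ such that $y_K > 0$ and $u \in K$. We may
assume that $\chi^K \le w - e$. Then $\alpha_{w - \chi^K} < \alpha_w$ $(\circ)$. Hence,
\[ \alpha_w \ge 1 + \alpha_{w - \chi^K} = 1 + \max\{(w - \chi^K)^T x \mid x \ge 0,\ Cx \le 1\}
   \ge \max\{w^T x \mid x \ge 0,\ Cx \le 1\}, \]
implying $(\dagger)$, since $\alpha_w \le \max\{w^T x \mid x \ge 0,\ Cx \le 1\}$ always holds.
Proof of $(\circ)$: suppose $\alpha_{w - \chi^K} = \alpha_w$, and let $S$ be a stable set
with $\sum_{v \in S} (w - \chi^K)_v = \alpha_{w - \chi^K}$. Since $\alpha_{w - \chi^K} = \alpha_w$, we have $K \cap S = \emptyset$. But,
since $w - \chi^K \le w - e \le w$, we know that $\sum_{v \in S} (w - e)_v = \alpha_{w-e}$, and
by complementary slackness, $|K \cap S| = 1$, a contradiction.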
94 / 143
Anti-blocking Polyhedra
Proof of Chvátal's Theorem: $\operatorname{STAB}(G) = \operatorname{Ch}(G)$ $\Rightarrow$ $G$ perfect
Proof. Suppose $\operatorname{STAB}(G) = \operatorname{Ch}(G)$. Then $\max\{w^T x \mid x \ge 0,\ Cx \le 1\}$ is
attained by the incidence vector of a stable set, for each $w \in \mathbb{Z}^V_+$. We
show, by induction on $\sum_{v \in V} w_v$, that $\min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$ has
an integer optimal solution for each $\{0,1\}$-vector $w$ (see the characterization $(*)$ of
perfect graphs).
Let $w$ be a $\{0,1\}$-vector, and let $y$ be a not necessarily integer optimal
solution of $\min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$. Let $K$ be a clique with $y_K > 0$.
Then the common optimal value of
\[ \max\{(w - \chi^K)^T x \mid x \ge 0,\ Cx \le 1\} = \min\{y^T 1 \mid y \ge 0,\ y^T C \ge (w - \chi^K)^T\} \]
is less than $\max\{w^T x \mid x \ge 0,\ Cx \le 1\}$, since by complementary slackness,
each optimal solution of this last problem must satisfy $(\chi^K)^T x = 1$. As
these optimal values are integers, they differ by exactly 1. By induction,
the above minimum has an optimal integer solution $y$. Increasing $y_K$ by
1 gives an integral optimal solution of $\min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$.
95 / 143
Anti-blocking Polyhedra
A polynomial time algorithm for finding a maximum-weight
stable set in a perfect graph.
Let $G = (V, E)$ be a graph with $V = \{1, \ldots, n\}$. Let
$M(G) \subset \mathbb{R}^{(n+1) \times (n+1)}$ be the set of all matrices $Y = (y_{ij})_{i,j=0}^n$ satisfying:
(i) $Y$ is symmetric and positive semi-definite;
(ii) $y_{00} = 1$, $y_{0i} = y_{ii}$, $i = 1, \ldots, n$;
(iii) $y_{ij} = 0$ if $i \ne j$ and $\{i, j\} \in E$.
It follows that $M(G)$ is convex (not necessarily a polyhedron).
Let $\operatorname{TH}(G)$ be the set of vectors $x \in \mathbb{R}^n$ for which there is $Y \in M(G)$ so
that $x_i = y_{ii}$ for $i = 1, \ldots, n$.
$\operatorname{TH}(G)$ is an approximation of $\operatorname{STAB}(G)$ at least as good as $A(\operatorname{STAB}(\overline{G}))$:
96 / 143
Anti-blocking Polyhedra
A polynomial time algorithm for finding a maximum-weight
stable set in a perfect graph.
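Theorem. $\operatorname{STAB}(G) \subseteq \operatorname{TH}(G) \subseteq A(\operatorname{STAB}(\overline{G}))$.
Proof. Let $S$ be a stable set in $G$. Let $Y \in M(G)$ be the matrix with
\[ y_{ij} = \begin{cases} 1 & \text{if } i, j \in S \cup \{0\}, \\ 0 & \text{otherwise.} \end{cases} \]
Clearly, $\chi^S_i = y_{ii}$, so $\chi^S \in \operatorname{TH}(G)$, hence
$\operatorname{STAB}(G) \subseteq \operatorname{TH}(G)$.
The second inclusion. If $x \in \operatorname{TH}(G)$ then (since the elements on the
diagonal of a positive semi-definite matrix are non-negative) $x \ge 0$. It
suffices to show that if $x \in \operatorname{TH}(G)$ and $u = \chi^S$ for $S$ a stable set in $\overline{G}$,
then $u^T x \le 1$. Let $x$ be obtained by taking the last $n$ elements of
$\operatorname{diag}(Y)$ for some $Y \in M(G)$. Since $Y$ is positive semi-definite,
\[ (1, -u^T)\, Y \begin{pmatrix} 1 \\ -u \end{pmatrix} \ge 0. \]
As $y_{ij} = 0$ if $\{i, j\} \in E$ and $S$ is a clique in $G$, it
follows from the above inequality that $1 - 2 \sum_{i \in S} y_{i0} + \sum_{i \in S} y_{ii} \ge 0$.
Since $x_i = y_{i0} = y_{ii}$, this implies $u^T x \le 1$.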
97 / 143
Anti-blocking Polyhedra
A polynomial time algorithm for finding a maximum-weight
stable set in a perfect graph.
Theorem (Grötschel, Lovász, Schrijver, 1981). There is a polynomial
time algorithm to find a maximum-weight stable set in a perfect graph.
Proof. If $G$ is perfect then $\operatorname{STAB}(G) = A(\operatorname{STAB}(\overline{G}))$, hence by the
above theorem $\operatorname{STAB}(G) = \operatorname{TH}(G)$. The theorem follows since any
linear objective function $w^T x$ can be maximized over $\operatorname{TH}(G)$ in
polynomial time.
[Linear objective function maximization over $M(G)$ can be done in
polynomial time, using the ellipsoid method, since we can solve the
separation problem for $M(G)$ in polynomial time: for any given $Y \in \mathbb{R}^{(n+1) \times (n+1)}$
we test the conditions (i)-(iii) in polynomial time, in such a way that we
find a separating hyperplane in $\mathbb{R}^{(n+1) \times (n+1)}$ if $Y \notin M(G)$.]
98 / 143
Anti-blocking Polyhedra
Matchings and Edge-colorings
The matching polytope $P_{mat}(G)$ of a graph $G = (V, E)$ is described
(see slides 75 - 78) as the set of all vectors $x \in \mathbb{R}^E$ satisfying
\[ (**) \qquad \begin{aligned}
x_e &\ge 0 && \forall e \in E, \\
\sum_{e \ni v} x_e &\le 1 && \forall v \in V, \\
\sum_{e \subseteq U} x_e &\le \lfloor \tfrac12 |U| \rfloor && \forall U \subseteq V.
\end{aligned} \]
By scalar multiplication, system $(**)$ can be normalized to
$x \ge 0$, $Cx \le 1$ for a certain matrix $C$ (deleting in $(**)$ the rows
corresponding to $U \subseteq V$ with $|U| \le 1$).
The anti-blocking polyhedron $A(P_{mat}(G))$ is equal to $\{z \in \mathbb{R}^E_+ \mid Dz \le 1\}$,
where the rows of $D$ are the incidence vectors of all matchings in $G$.
99 / 143
Anti-blocking Polyhedra
Matchings and Edge-colorings
Taking $l = 1$ in relation (b) on slide 91, we obtain
\[ \max\Big\{\Delta(G),\ \max_{U \subseteq V,\ |U| \ge 2} \frac{|\langle U \rangle|}{\lfloor \frac12 |U| \rfloor}\Big\}
   = \min\{y^T 1 \mid y \ge 0,\ y^T D \ge 1^T\}. \]
Here $\langle U \rangle$ denotes the set of edges contained in $U$.
The minimum above is called the fractional edge coloring number $\chi^*(G)$
of $G$. If the minimum is attained by an integral optimum solution, it is
equal to the edge coloring number $\chi'(G)$ of $G$, since
\[ \chi'(G) = \min\{y^T 1 \mid y \ge 0,\ y^T D \ge 1^T,\ y \text{ integral}\}. \]
By Vizing's theorem $\chi'(G) \in \{\Delta(G), \Delta(G) + 1\}$. For $G$ the Petersen
graph, we have $\chi^*(G) = \Delta(G) = 3$, while $\chi'(G) = 4$.
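The Petersen graph figures can be reproduced by brute force. The sketch below is an illustration under assumptions (vertex labelling, function names): it evaluates the left-hand side of the max formula over all vertex subsets and tests proper edge colorability by backtracking.

```python
from fractions import Fraction
from itertools import combinations

# Petersen graph: outer 5-cycle 0..4, inner pentagram 5..9, spokes i -- i+5.
edges = ([(i, (i + 1) % 5) for i in range(5)]
         + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
         + [(i, i + 5) for i in range(5)])

def lower_bound(n, edges):
    """max{Delta(G), max over |U| >= 2 of |<U>| / floor(|U|/2)}."""
    degree = [sum(v in e for e in edges) for v in range(n)]
    best = Fraction(max(degree))
    for size in range(2, n + 1):
        for U in combinations(range(n), size):
            inside = sum(1 for u, v in edges if u in U and v in U)
            best = max(best, Fraction(inside, size // 2))
    return best

def edge_colorable(edges, k):
    """Backtracking test for a proper edge coloring with k colors."""
    colors = {}

    def extend(i):
        if i == len(edges):
            return True
        u, v = edges[i]
        used = {c for e, c in colors.items() if u in e or v in e}
        for c in range(k):
            if c not in used:
                colors[edges[i]] = c
                if extend(i + 1):
                    return True
                del colors[edges[i]]
        return False

    return extend(0)

bound = lower_bound(10, edges)
three_ok = edge_colorable(edges, 3)
four_ok = edge_colorable(edges, 4)
```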
100 / 143
Cutting Planes
Integer Hull
If $P \subseteq \mathbb{R}^n$, then $P_I$, the integer hull of $P$, is
\[ P_I = \operatorname{convhull}\{x \mid x \in P,\ x \text{ integral}\}. \]
Clearly, if $P$ is bounded then $P_I$ is a polytope.
For most combinatorial optimization problems it is not difficult to find a
set of linear inequalities, determining a polyhedron $P$, in which the
integral vectors are exactly the incidence vectors of the sets corresponding
to the feasible solutions of the problem.
Challenge: describing $P_I$ by linear inequalities!
Gomory, 1960: Cutting-plane method for integer linear programming.
Chvátal (1973) derived from this method an interesting iterative process
to characterize $P_I$.
101 / 143
Cutting Planes
Integer Hull
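A rational affine halfspace (rah) is a set $H = \{x \mid c^T x \le \delta\}$, where
$c \in \mathbb{Q}^n$, $c \ne 0$, and $\delta \in \mathbb{Q}$. We may assume that the components of $c$
are relatively prime integers, which implies $H_I = \{x \mid c^T x \le \lfloor \delta \rfloor\}$.
If $P$ is a polyhedron then
\[ P' = \bigcap_{H \text{ rah},\ H \supseteq P} H_I. \]
For $A \in \mathbb{Q}^{m \times n}$, $b \in \mathbb{Q}^m$:
\[ \{x \mid Ax \le b\}' = \{x \mid (u^T A)x \le \lfloor u^T b \rfloor \ \ \forall u \in \mathbb{Q}^m_+ \text{ with } u^T A \text{ integral}\}, \]
\[ \{x \mid x \ge 0,\ Ax \le b\}' = \{x \mid x \ge 0;\ \lfloor u^T A \rfloor x \le \lfloor u^T b \rfloor \ \ \forall u \in \mathbb{Q}^m_+\}. \]
Note that if $P \subseteq H$ then $P_I \subseteq H_I$. Hence $P_I \subseteq P'$.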
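A single cut of the second form, for a system $x \ge 0$, $Ax \le b$, can be computed directly from a multiplier vector $u$. The sketch below is illustrative only: the function name `chvatal_gomory_cut` and the triangle example are assumptions, and exact rational arithmetic is used to avoid rounding problems in the floors.

```python
from fractions import Fraction
from math import floor

def chvatal_gomory_cut(A, b, u):
    """Cut floor(u^T A) x <= floor(u^T b) for the system x >= 0, Ax <= b."""
    assert all(weight >= 0 for weight in u)
    n = len(A[0])
    lhs = [floor(sum(u[i] * A[i][j] for i in range(len(A)))) for j in range(n)]
    rhs = floor(sum(u[i] * b[i] for i in range(len(A))))
    return lhs, rhs

# Degree constraints of a triangle: rows = vertices 1, 2, 3,
# columns = edges 12, 23, 13.
A = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
b = [1, 1, 1]
cut = chvatal_gomory_cut(A, b, [Fraction(1, 2)] * 3)
```

With all multipliers equal to 1/2 the three degree constraints combine to $x_{12} + x_{23} + x_{13} \le 3/2$, and rounding down the right-hand side yields the odd-set inequality with bound 1.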
102 / 143
Cutting Planes
Integer Hull
For $t \in \mathbb{N}$, define
\[ P^{(t)} = \begin{cases} P & \text{if } t = 0, \\ (P^{(t-1)})' & \text{if } t \ge 1. \end{cases} \]
Then
\[ P = P^{(0)} \supseteq P^{(1)} \supseteq \ldots \supseteq P_I. \]
Theorem (Blair, Jeroslow (1982), Cook, Lovász, Schrijver (1986))
For each rational matrix $A$ there exists a number $t$ such that for each
column vector $b$ one has: $\{x \mid Ax \le b\}^{(t)} = \{x \mid Ax \le b\}_I$.
Corollary. For each polytope $P$ there is a number $t$ such that $P^{(t)} = P_I$.
The Chvátal rank of the rational matrix $A \in \mathbb{Q}^{m \times n}$ is the smallest $t$ for
which $\{x \mid Ax \le b\}^{(t)} = \{x \mid Ax \le b\}_I$ for each $b \in \mathbb{R}^m$.
The strong Chvátal rank of the rational matrix $A \in \mathbb{Q}^{m \times n}$ is the
Chvátal rank of the matrix
\[ \begin{pmatrix} I \\ -I \\ A \\ -A \end{pmatrix}. \]
103 / 143
Cutting Planes
Integer Hull
From HK theorem (slide 37), an integral matrix has strong Chvátal rank
0 if and only if it is totally unimodular.
No similar characterizations are known for higher Chvátal ranks.
Example. For any graph $G = (V, E)$ let $P$ be the polytope determined by
\[ x_e \ge 0 \quad \forall e \in E, \qquad \sum_{e \ni v} x_e \le 1 \quad \forall v \in V. \]
Clearly, $P_I$ is the matching polytope of $G$. It can be proved that $P'$ is
\[ \begin{aligned}
x_e &\ge 0 && \forall e \in E, \\
\sum_{e \ni v} x_e &\le 1 && \forall v \in V, \\
\sum_{e \subseteq U} x_e &\le \lfloor \tfrac12 |U| \rfloor && \forall U \subseteq V.
\end{aligned} \]
Edmonds' Matching Polyhedron Theorem: $P' = P_I$ ($P_I$ arises from
$P$ by one "round" of cutting planes).
104 / 143
Hard Problems – Complexity of the Integer Hull
Karp and Papadimitriou (1982)
For any class $(\mathcal{F}_G \mid G \text{ graph})$, if the problem
OP: Given a graph $G = (V, E)$, $c : E \to \mathbb{Q}$, and $\mathcal{F}_G$ a collection of subsets of
$E$, find $F_0 \in \mathcal{F}_G$ such that $c(F_0) = \max_{F \in \mathcal{F}_G} c(F)$ (where $c(F) = \sum_{e \in F} c(e)$)
is NP-hard and NP $\ne$ co-NP, then the class of polytopes
$\operatorname{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$ has difficult facets, i.e.,
there is no polynomial $\Phi$ such that for any graph $G$, $c \in \mathbb{Z}^E$ and $\delta \in \mathbb{Q}$ with
the property that $c^T x \le \delta$ defines a facet of $\operatorname{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$, the
fact that $c^T x \le \delta$ is valid for each $\chi^F$ with $F \in \mathcal{F}_G$ has a proof of length
at most $\Phi(|V| + |E| + \operatorname{size}(c) + \operatorname{size}(\delta))$.
Note that for the matching polytope $P$ (see previous slide) $P_I = P'$
has exponentially many inequalities, but each facet-defining inequality is
of the form described in $P'$, and for these it is easy to prove that they are valid
for the matching polytope.
105 / 143
Hard Problems – Complexity of the Integer Hull
Boyd and Pulleyblank (1984)
Suppose that for a given class $(\mathcal{F}_G \mid G \text{ graph})$, for each G = (V, E) the
polytope $P_G$ in $\mathbb{R}^E$ satisfies $(P_G)_I = \mathrm{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$ and has the
property that the problem
given a graph G = (V, E) and $c \in \mathbb{R}^E$, find $\max\{c^T x \mid x \in P_G\}$
is polynomially solvable.
Then, if
OP: Given a graph G = (V, E), $c : E \to \mathbb{Q}$, and a collection $\mathcal{F}_G$ of subsets of
E, find $F_0 \in \mathcal{F}_G$ s.t. $c(F_0) = \max_{F \in \mathcal{F}_G} c(F)$
is NP-hard and NP ≠ co-NP, then there is no fixed t such that for each
graph G
$(P_G)^{(t)} = \mathrm{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$.
106 / 143
Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
Let $\mathrm{STAB}(G) = \mathrm{convhull}\{\chi^S \mid S \text{ stable set in } G\}$ be the stable-set
polytope of the graph G = (V, E). Let Ch(G) be the polytope
\[ x_v \ge 0 \ \ \forall v \in V, \qquad \sum_{v \in K} x_v \le 1 \ \ \forall K \text{ clique in } G. \]
So, $\mathrm{Ch}(G) = A(\mathrm{STAB}(\overline{G}))$, the anti-blocker of the stable-set polytope of the complement of G (see slide 92).
Clearly, $\mathrm{STAB}(G) \subseteq \mathrm{Ch}(G)$. Since the integral vectors in Ch(G) are
exactly the incidence vectors of stable sets, we have
$\mathrm{STAB}(G) = \mathrm{Ch}(G)_I$.
Chvátal (1984): there is no fixed t such that $(\mathrm{Ch}(G))^{(t)} = \mathrm{Ch}(G)_I$ for all graphs G,
even if we restrict G to graphs with stability number α(G) = 2.
By Chvátal's theorem (see slide 93), the class of graphs G with
$\mathrm{Ch}(G)_I = \mathrm{Ch}(G)$
is exactly the class of perfect graphs.
107 / 143
Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
Chvátal raised the question of whether there exists, for each fixed t, a
polynomial-time algorithm determining α(G) for graphs G with
$(\mathrm{Ch}(G))^{(t)} = \mathrm{Ch}(G)_I$.
This is true for t = 0, that is, for perfect graphs (via the ellipsoid method, see
slide 98).
Minty (1980) and Sbihi (1980) extended Edmonds' maximum weighted
matching algorithm to obtain a polynomial algorithm to find a maximum
weighted stable set in a $K_{1,3}$-free graph. By the theorem on slide 35, the
separation problem for the stable-set polytope of $K_{1,3}$-free graphs can be
solved in polynomial time. However, no explicit description of a linear
inequality system for the stable-set polytope of $K_{1,3}$-free graphs is known.
This would extend Edmonds' description of the matching polytope!
Note that, by Chvátal's result on slide 107, there is no fixed t such that
$(\mathrm{Ch}(G))^{(t)} = \mathrm{Ch}(G)_I$ for all $K_{1,3}$-free graphs.
108 / 143
Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
The most natural "relaxation" of STAB(G), for the graph G = (V, E), is
the polytope Q(G):
\[ x_v \ge 0 \ \ \forall v \in V, \qquad x_v + x_w \le 1 \ \ \forall \{v, w\} \in E. \]
Clearly, $Q(G)_I = \mathrm{STAB}(G)$.
Since $\mathrm{Ch}(G) \subseteq Q(G)$, there is no t with $Q(G)^{(t)} = Q(G)_I$ for all graphs G.
It is not difficult to see that $Q(G)'$ is:
\[ x_v \ge 0 \ \ \forall v \in V, \qquad x_v + x_w \le 1 \ \ \forall \{v, w\} \in E, \qquad \sum_{v \in C} x_v \le \frac{|C| - 1}{2} \ \ \forall C \text{ vertex set of an odd circuit in } G. \]
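A quick numerical check shows why the odd-circuit inequalities are needed. The sketch below uses the 5-cycle as an assumed example; the helper names `in_Q` and `satisfies_odd_circuit` are illustrative only.

```python
from fractions import Fraction

def in_Q(edges, x):
    """Check x >= 0 and x_v + x_w <= 1 for every edge {v, w}."""
    return all(val >= 0 for val in x.values()) and all(x[v] + x[w] <= 1 for v, w in edges)

def satisfies_odd_circuit(circuit, x):
    """Check sum_{v in C} x_v <= (|C| - 1) / 2 for the vertex set C of an odd circuit."""
    return sum(x[v] for v in circuit) <= Fraction(len(circuit) - 1, 2)

# Assumed example: the 5-cycle on vertices 0..4.
c5_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
half = {v: Fraction(1, 2) for v in range(5)}
stable = {0: 1, 1: 0, 2: 1, 3: 0, 4: 0}  # incidence vector of the stable set {0, 2}

print(in_Q(c5_edges, half), satisfies_odd_circuit(range(5), half))      # True False
print(in_Q(c5_edges, stable), satisfies_odd_circuit(range(5), stable))  # True True
```

The all-halves point lies in Q(G) but is cut off by the odd-circuit inequality, while the incidence vector of a stable set satisfies both systems.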
109 / 143
Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
Gerards and Schrijver (1986): If G has no subgraph H which arises
from $K_4$ by replacing edges by paths such that each triangle in $K_4$
becomes an odd circuit, then $Q(G)' = \mathrm{STAB}(G)$.
Graphs G with $Q(G)' = \mathrm{STAB}(G)$ are called t-perfect by Chvátal.
Let $A \in \mathbb{Z}^{m \times n}$ satisfy $\sum_{j=1}^{n} |a_{ij}| \le 2$ for i = 1, . . . , m.
Gerards and Schrijver (1986): A has strong Chvátal rank at most 1 if
and only if it cannot be transformed to the matrix
\[ \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix} \]
by a sequence of the following operations: deleting or permuting rows or
columns, or multiplying them by −1; replacing $\begin{pmatrix} 1 & c^T \\ b & D \end{pmatrix}$ by $D - bc^T$, where
D is a matrix and b, c are column vectors.
110 / 143
Hard Problems – Complexity of the Integer Hull
Example: The traveling-salesman polytope
The traveling-salesman polytope of a graph G = (V, E) is equal to
$\mathrm{convhull}\{\chi^H \mid H \subseteq E, H \text{ Hamiltonian circuit}\}$.
As the traveling salesman problem is NP-hard, if NP ≠ co-NP, then the
traveling-salesman polytope will have "difficult" facets (Karp and
Papadimitriou, slide 105). Let $P \subseteq \mathbb{R}^E$ be defined by:
\[ 0 \le x_e \le 1 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e = 2 \ \ \forall v \in V, \qquad \sum_{e \in \delta(U)} x_e \ge 2 \ \ \forall U \subseteq V,\ 3 \le |U| \le |V| - 1. \]
The integral vectors in P are exactly the incidence vectors of Hamiltonian
circuits. Hence, $P_I$ is the traveling-salesman polytope.
111 / 143
Hard Problems – Complexity of the Integer Hull
Example: The traveling-salesman polytope
The problem of minimizing a linear function $c^T x$ over P is polynomially
solvable with the ellipsoid method, since the above system of inequalities
can be checked in polynomial time (the last group can be checked by
reduction to a minimum-cut problem).
So, if NP ≠ co-NP, by Boyd and Pulleyblank's result (slide 106), there is
no fixed t such that $P^{(t)} = P_I$ for each graph G.
The above system has been useful in solving large instances of the
traveling salesman problem: for every $c \in \mathbb{Q}^E$, the minimum of $c^T x$ over P is a
lower bound for TSP, which can be computed with the simplex method
using a row-generating technique.
This lower bound can be used in a branch-and-bound procedure for
solving TSP.
112 / 143
Miscellanea
The matching polytope of a bipartite graph – revised
If G = (V, E) is a bipartite graph, $c : E \to \mathbb{R}_+$, and A is the incidence
matrix of G, we proved (slide 42), using the HK theorem, the (simple) fact:
maximum weight of a matching = $\max\{c^T x \mid x \ge 0, Ax \le \mathbf{1}\}$
Solving the right-hand problem via linear programming yields a
polynomial-time algorithm to solve the left-hand one.
Indeed, let x be an optimal solution to the above LP.
If x is integral, then we are done.
If not, we describe a procedure that yields another optimal solution with
strictly more integer coordinates than x. We then reach an integral
optimal solution by at most |E | repetitions of this procedure.
113 / 143
Miscellanea
The matching polytope of a bipartite graph – revised
Let H be the subgraph of G spanned by $\{e \in E \mid x_e \notin \{0, 1\}\}$.
Case 1. H contains a cycle $C = (v_1, v_2, \dots, v_k, v_1)$. Since H is bipartite,
C is even. Let $\varepsilon = \min_{e \in E(C)} \min\{x_e, 1 - x_e\}$. Let $x'$ and $x''$ be:
\[ x'_e = \begin{cases} x_e - \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k - 1,\ i \text{ odd}, \\ x_e + \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k,\ i \text{ even}, \\ x_e & \text{if } e \in E - E(C) \end{cases} \]
and
\[ x''_e = \begin{cases} x_e + \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k - 1,\ i \text{ odd}, \\ x_e - \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k,\ i \text{ even}, \\ x_e & \text{if } e \in E - E(C) \end{cases} \]
where the indices are modulo k. These are two feasible solutions to the LP. Moreover,
\[ \sum_{e \in E} x_e c_e = \frac{1}{2} \left( \sum_{e \in E} x'_e c_e + \sum_{e \in E} x''_e c_e \right). \]
Thus $x'$ and $x''$ are also optimal solutions and, by the choice of $\varepsilon$, one of
these two solutions has more integer coordinates than x.
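A small worked instance of Case 1 may help. The 4-cycle and the all-halves point below are assumptions chosen for illustration; ε denotes the step size $\min_{e \in E(C)} \min\{x_e, 1 - x_e\}$.

```latex
% Assumed example: G = C_4 with edges e_1 = v_1v_2, e_2 = v_2v_3, e_3 = v_3v_4, e_4 = v_4v_1.
\begin{align*}
x   &= \left(\tfrac12, \tfrac12, \tfrac12, \tfrac12\right), \qquad \varepsilon = \tfrac12, \\
x'  &= (0, 1, 0, 1) \quad \text{(subtract } \varepsilon \text{ on } e_1, e_3;\ \text{add } \varepsilon \text{ on } e_2, e_4\text{)}, \\
x'' &= (1, 0, 1, 0), \qquad x = \tfrac12 x' + \tfrac12 x''.
\end{align*}
```

Every vertex still has total value 1 on its incident edges in both $x'$ and $x''$, so both are feasible, and here both are already incidence vectors of perfect matchings.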
114 / 143
Miscellanea
The matching polytope of a bipartite graph – revised
Case 2. H has no cycle. Consider a longest path $P = (v_1, v_2, \dots, v_k)$ in
H. Observe that if e is an edge incident to $v_1$ (resp. $v_k$) and different from
$v_1 v_2$ (resp. $v_{k-1} v_k$), then $x_e = 0$, for otherwise H would contain either a
cycle or a longer path.
Let $\varepsilon = \min_{e \in E(P)} \min\{x_e, 1 - x_e\}$. Defining $x'$ and $x''$ similarly as above,
we obtain two admissible solutions to the LP.
If P is odd, then the value of $c^T x'$ is greater than $c^T x$, which
contradicts the optimality of x.
If P is even, both $x'$ and $x''$ are also optimal solutions and, by the choice
of $\varepsilon$, one of these two solutions has more integer coordinates than x.
115 / 143
Miscellanea
The fuzzy polytope of a digraph
Let D = (V, E) be a digraph with V = {1, . . . , n}. The fuzzy polytope
of D, F(D), is the polytope described by
\[ 0 \le x_v \le 1 \ \ \forall v \in V, \qquad x_v - x_w \ge 0 \ \ \forall (v, w) \in E. \]
$A \subseteq V$, $A \ne \emptyset$, is called comprehensive from above (cfa) in D if
$\delta^-(A) = \emptyset$ (if $w \in A$ and $(v, w) \in E$ then $v \in A$).
$\mathcal{C}(D) := \{A \mid A \subseteq V, A \text{ cfa in } D\}$.
Theorem (Brink, Laan, Vasil'ev, 2004). $\mathrm{convhull}\{\chi^A \mid A \in \mathcal{C}(D)\} = F(D)$.
Proof. Clearly, the integer vectors in F(D) are exactly the incidence
vectors of cfa sets. Hence, if we prove that the vertices of F(D) are
integer vectors, the theorem holds.
Let a be a vertex of F(D). Suppose that
$\mathrm{Frac}(a) = \{v \in V \mid 0 < a_v < 1\} \ne \emptyset$.
116 / 143
Miscellanea
The fuzzy polytope of a digraph
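Take $\mu = \max\{a_v \mid v \in \mathrm{Frac}(a)\}$, $\mathrm{Frac}_\mu(a) = \{v \in \mathrm{Frac}(a) \mid a_v = \mu\}$, and
\[ \nu = \begin{cases} \max\{a_v \mid v \in \mathrm{Frac}(a) - \mathrm{Frac}_\mu(a)\} & \text{if } \mathrm{Frac}(a) \ne \mathrm{Frac}_\mu(a), \\ 0 & \text{if } \mathrm{Frac}(a) = \mathrm{Frac}_\mu(a). \end{cases} \]
Fix some $\delta \in (0, 1)$ s.t. $\mu + \delta, \mu - \delta \in (\nu, 1)$, and let $a', a'' \in \mathbb{R}^n$ be:
\[ a'_v = \begin{cases} \mu + \delta & \text{if } v \in \mathrm{Frac}_\mu(a), \\ a_v & \text{if } v \in V - \mathrm{Frac}_\mu(a), \end{cases} \qquad a''_v = \begin{cases} \mu - \delta & \text{if } v \in \mathrm{Frac}_\mu(a), \\ a_v & \text{if } v \in V - \mathrm{Frac}_\mu(a). \end{cases} \]
By construction, $a'$ and $a''$ satisfy the first group of inequalities
describing F(D). Since $a'_v \le a'_w \Leftrightarrow a''_v \le a''_w \Leftrightarrow a_v \le a_w$, and $a \in F(D)$,
it follows that the second group of inequalities is satisfied by $a'$ and $a''$.
Hence, $a', a'' \in F(D)$, $a' \ne a''$, and $a = \frac{1}{2} a' + \frac{1}{2} a''$, contradicting the
hypothesis that a is a vertex of F(D).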
117 / 143
Miscellanea
The fuzzy polytope of a digraph
Example. Let $D = (\{0, 1, \dots, n\}, \{(i, 0) \mid i = 1, \dots, n\})$. Then
$\mathcal{C}(D) = \{A \mid \emptyset \ne A \subseteq \{1, \dots, n\}\} \cup \{\{0, 1, \dots, n\}\}$. Hence $|\mathcal{C}| = 2^n$.
Figure: n = 4: Digraph with n + 1 vertices and $2^n$ cfa sets.
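The count in this example can be confirmed by brute force. The sketch below enumerates all nonempty vertex subsets and keeps those with no entering arc; the function names are illustrative, and the enumeration is exponential, so it is meant for tiny digraphs only.

```python
from itertools import combinations

def is_cfa(A, arcs):
    """A is cfa if it is nonempty and (v, w) an arc with w in A implies v in A."""
    return bool(A) and all(v in A for (v, w) in arcs if w in A)

def cfa_sets(vertices, arcs):
    """Enumerate all cfa subsets of the vertex set by brute force."""
    vertices = list(vertices)
    return [frozenset(S)
            for r in range(1, len(vertices) + 1)
            for S in combinations(vertices, r)
            if is_cfa(set(S), arcs)]

n = 4
star_arcs = [(i, 0) for i in range(1, n + 1)]  # arcs (i, 0), as in the example
found = cfa_sets(range(n + 1), star_arcs)
print(len(found))  # 16, i.e. 2**n
```

By the theorem, these 16 incidence vectors are exactly the vertices of F(D) for this digraph.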
118 / 143
Miscellanea
The sharing polytope of a digraph
Let D = (V, E) be a digraph with V = {1, . . . , n}. The sharing
polytope of D, Sh(D), is the polytope described by
\[ x_v \ge 0 \ \ \forall v \in V, \qquad \sum_{v \in V} x_v = 1, \qquad x_v - x_w \ge 0 \ \ \forall (v, w) \in E. \]
$A \subseteq V$ is called connected comprehensive from above (ccfa) in D if
$A \ne \emptyset$, $\delta^-(A) = \emptyset$, and the subdigraph induced by A in D is connected.
$\mathcal{CC}(D) := \{A \mid A \subseteq V, A \text{ ccfa in } D\}$.
Thm. (Brink, Laan, Vasil'ev, '04) $\mathrm{convhull}\{\frac{1}{|A|} \chi^A \mid A \in \mathcal{CC}(D)\} = \mathrm{Sh}(D)$.
Proof. (1) If $A \in \mathcal{C}(D)$ then $x^A = \frac{1}{|A|} \chi^A \in \mathrm{Sh}(D)$. Indeed, $x^A_v \ge 0$ for
each $v \in V$, and $\sum_{v \in V} x^A_v = \frac{1}{|A|} \sum_{v \in A} 1 = 1$. Let $(v, w) \in E$. If
$|\{v, w\} \cap A| \ne 1$, then $x^A_v = x^A_w$ and the third restriction is satisfied for
(v, w). If $v \in A$ and $w \notin A$, $x^A_v = \frac{1}{|A|} > 0 = x^A_w$. Since A is cfa, these are
all possible cases.
119 / 143
Miscellanea
The sharing polytope of a digraph
(2) If A induces in D a connected subdigraph and $x^A = \frac{1}{|A|} \chi^A \in \mathrm{Sh}(D)$,
then $x^A$ is a vertex of Sh(D). Indeed, suppose that there are
$y, z \in \mathrm{Sh}(D)$ such that $x^A = \frac{1}{2} y + \frac{1}{2} z$. We show that $y = z = x^A$.
For $v \in V - A$, we have $x^A_v = 0$ and, since $y_v, z_v \ge 0$, from
$x^A_v = \frac{1}{2} y_v + \frac{1}{2} z_v = 0$ we deduce $x^A_v = y_v = z_v = 0$ for all $v \in V - A$.
For $v, w \in A$ and $(v, w) \in E$, since y and z are from Sh(D) we have
$y_v \ge y_w$ and $z_v \ge z_w$. As $\frac{1}{|A|} = x^A_v = \frac{1}{2} y_v + \frac{1}{2} z_v \ge \frac{1}{2} y_w + \frac{1}{2} z_w = x^A_w = \frac{1}{|A|}$,
it follows that $y_v = y_w$ and $z_v = z_w$. By the connectivity of the
subdigraph induced by A in D, we have $y_v = y_w$ and $z_v = z_w$ for all
$v, w \in A$, and hence $x^A_v = y_v = z_v = \frac{1}{|A|}$ for all $v \in A$.
(3) If x is a vertex of Sh(D), then there exists $A \in \mathcal{CC}(D)$ such that $x = \frac{1}{|A|} \chi^A$.
Let $A^x = \{v \in V \mid x_v > 0\}$. Since $\sum_{v \in V} x_v = 1$, it follows that $A^x \ne \emptyset$.
120 / 143
Miscellanea
The sharing polytope of a digraph
(3.1) $A^x \in \mathcal{CC}(D)$. Indeed, $A^x \in \mathcal{C}(D)$ is simply: if $w \in A^x$ and
$(v, w) \in E$ then $x_v \ge x_w > 0$, hence $v \in A^x$. To prove that $A^x$ induces a
connected subdigraph in D, suppose that there are at least 2 connected
components in $[A^x]_D$: $A^x_1, \dots, A^x_m$ ($m \ge 2$). Clearly, each $A^x_k$ is cfa,
because $A^x$ is cfa. For k = 1, . . . , m, let $x^k$ be given by
\[ x^k_v = \begin{cases} x_v & \text{if } v \in A^x_k, \\ 0 & \text{otherwise,} \end{cases} \]
and define $\lambda_k = \sum_{v \in A^x_k} x^k_v$. We have $\lambda_k > 0$ for all $k \in \{1, \dots, m\}$ and
$\sum_{k=1}^{m} \lambda_k = 1$.
Let $\bar{x}^k = \frac{1}{\lambda_k} x^k$, for k = 1, . . . , m. Clearly $\bar{x}^k \ne \bar{x}^l$ for all $k \ne l$ in {1, . . . , m}
and $x = \sum_{k=1}^{m} \lambda_k \bar{x}^k$. If we prove that $\bar{x}^k \in \mathrm{Sh}(D)$ for all k, we contradict
that x is a vertex of Sh(D). We verify the last group of inequalities in the
definition of Sh(D). Let $(v, w) \in E$ and $k \in \{1, \dots, m\}$. If $w \in A^x_k$ then,
since $A^x_k$ is cfa, $v \in A^x_k$, hence $\bar{x}^k_v = \frac{1}{\lambda_k} x^k_v \ge \frac{1}{\lambda_k} x^k_w = \bar{x}^k_w$. If $w \notin A^x_k$ then
$\bar{x}^k_v \ge 0 = \bar{x}^k_w$.
121 / 143
Miscellanea
The sharing polytope of a digraph
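(3.2) $x_v = x_w$ for all $v, w \in A^x$.
Suppose $\alpha < \beta$, where $\alpha = \min\{x_v \mid v \in A^x\}$ and $\beta = \max\{x_v \mid v \in A^x\}$.
Let $x'$ and $x''$ be defined by
\[ x'_v = \begin{cases} \alpha & \text{if } v \in A^x, \\ 0 & \text{if } v \in V - A^x, \end{cases} \qquad x''_v = \begin{cases} x_v - \alpha & \text{if } v \in A^x, \\ 0 & \text{if } v \in V - A^x. \end{cases} \]
Clearly, $x', x'' \ge 0$ and $x = x' + x''$. Taking $\lambda' = \sum_{v \in A^x} x'_v$ and
$\lambda'' = \sum_{v \in A^x} x''_v$, we obtain from $x \in \mathrm{Sh}(D)$ that $\lambda', \lambda'' \in (0, 1)$ and
$\lambda' + \lambda'' = 1$.
We have $x = \lambda' \bar{x}' + \lambda'' \bar{x}''$, where $\bar{x}' = \frac{1}{\lambda'} x'$ and $\bar{x}'' = \frac{1}{\lambda''} x''$. It is easy to
verify that $\bar{x}' \ne \bar{x}''$ and that $\bar{x}', \bar{x}'' \in \mathrm{Sh}(D)$, contradicting the hypothesis
that x is a vertex of Sh(D).
From (3.2) and $\sum_{v \in V} x_v = 1$ it follows that $x_v = \frac{1}{|A^x|}$ for all $v \in A^x$. So,
$x = \frac{1}{|A^x|} \chi^{A^x}$ and $A^x \in \mathcal{CC}(D)$.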
122 / 143
Miscellanea
The sharing polytope of a digraph
Example. Let $D = (\{0, 1, \dots, n\}, \{(i, 0) \mid i = 1, \dots, n\})$. Then
$\mathcal{CC}(D) = \{\{i\} \mid i \in \{1, \dots, n\}\} \cup \{\{0, 1, \dots, n\}\}$. Hence $|\mathcal{CC}| = n + 1$.
Figure: n = 4: Digraph with n + 1 vertices and n + 1 ccfa sets.
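The vertices of the sharing polytope in this example can also be listed by brute force. The sketch below assumes that "connected" means weakly connected (connected when arc directions are ignored); the function names are illustrative and the enumeration is exponential.

```python
from fractions import Fraction
from itertools import combinations

def is_cfa(A, arcs):
    """Nonempty, and (v, w) an arc with w in A implies v in A."""
    return bool(A) and all(v in A for (v, w) in arcs if w in A)

def is_weakly_connected(A, arcs):
    """Depth-first search on the subdigraph induced by A, ignoring arc directions."""
    A = set(A)
    start = next(iter(A))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v, w in arcs:
            if v in A and w in A:
                for a, b in ((v, w), (w, v)):
                    if a == u and b not in seen:
                        seen.add(b)
                        stack.append(b)
    return seen == A

def sharing_vertices(vertices, arcs):
    """Return the points (1/|A|) chi^A for all ccfa sets A, as dicts vertex -> Fraction."""
    vertices = list(vertices)
    points = []
    for r in range(1, len(vertices) + 1):
        for S in combinations(vertices, r):
            A = set(S)
            if is_cfa(A, arcs) and is_weakly_connected(A, arcs):
                points.append({v: Fraction(1, len(A)) if v in A else Fraction(0)
                               for v in vertices})
    return points

n = 4
star_arcs = [(i, 0) for i in range(1, n + 1)]
points = sharing_vertices(range(n + 1), star_arcs)
print(len(points))  # 5, i.e. n + 1
```

Each returned point sums to 1 and satisfies $x_v \ge x_w$ on every arc, as the theorem requires.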
123 / 143
Miscellanea
Covering a strongly connected digraph with directed cycles
Gallai (1964) conjectured an analogue of the Gallai-Milgram Theorem (every
digraph D can be covered by at most α(D) directed paths) for covering
strongly connected (sc) digraphs with (directed) circuits. This was
proved in 2004 by Bessy and Thomassé:
Theorem 1. The vertex set of any sc digraph D can be covered by α(D)
circuits.
We present a min-max theorem established by Bondy, Charbit and
Sebő, from which the above theorem follows easily.
If D = (V, E) is a digraph, a cyclic order of D is a cyclic order
$O = (v_1, v_2, \dots, v_n, v_1)$ of its vertex set. The length of an arc $(v_i, v_j)$ of D (w.r.t. a cyclic
order O) is j − i if i < j and n + j − i if i > j. Informally, the length of
an arc is just the length of the segment of O "jumped" by the arc. If C
is a circuit of D, the sum of the lengths of its arcs is a multiple of n, and the index of C (w.r.t. O) is that multiple:
$i(C) = \frac{1}{n} \sum_{a \text{ arc of } C} \mathrm{length}(a)$. The index of a family
$\mathcal{C}$ of circuits, denoted $i(\mathcal{C})$, is $i(\mathcal{C}) = \sum_{C \in \mathcal{C}} i(C)$.
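The arc lengths and the index are easy to compute directly. The sketch below takes the index to be the number of times a circuit winds around O, that is, the total arc length divided by n, which is the convention under which "a circuit of index one" makes sense; vertices are represented by their positions 1..n in O, and the function names are illustrative.

```python
def arc_length(i, j, n):
    """Length of the arc (v_i, v_j) w.r.t. the cyclic order (v_1, ..., v_n, v_1)."""
    return j - i if i < j else n + j - i

def circuit_arcs(circuit):
    """Consecutive pairs of a circuit given as a list of positions, closing the loop."""
    return list(zip(circuit, circuit[1:] + circuit[:1]))

def circuit_index(circuit, n):
    """Total arc length divided by n (the total is always a multiple of n)."""
    total = sum(arc_length(i, j, n) for i, j in circuit_arcs(circuit))
    assert total % n == 0
    return total // n

def reverse_arc_count(circuit):
    """Number of arcs (v_i, v_j) of the circuit with j < i."""
    return sum(1 for i, j in circuit_arcs(circuit) if j < i)

print(circuit_index([1, 2, 3], 3), reverse_arc_count([1, 2, 3]))  # 1 1
print(circuit_index([1, 3, 2], 3), reverse_arc_count([1, 3, 2]))  # 2 2
```

In both cases the index equals the number of reverse arcs, since each reverse arc corresponds to one wrap around O.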
124 / 143
Miscellanea
Covering a strongly connected digraph with directed cycles
A weighting of the vertices of D is a function $w : V \to \mathbb{N}$. The weight
of a subgraph H of D is $w(H) = \sum_{v \in V(H)} w(v)$. The weighting w is
index-bounded (w.r.t. O) if $w(C) \le i(C)$ for every circuit C of D.
For any circuit covering $\mathcal{C}$ of D and any index-bounded weighting w:
\[ i(\mathcal{C}) \ge \sum_{C \in \mathcal{C}} w(C) \ge w(D). \]
Theorem 2. Let D be a digraph in which each vertex lies in a circuit,
and let O be a cyclic order of D. Then:
\[ \min i(\mathcal{C}) = \max w(D) \]
where the minimum is taken over all circuit coverings $\mathcal{C}$ of D and the
maximum over all index-bounded weightings w of D.
125 / 143
Miscellanea
Covering a strongly connected digraph with directed cycles
In order to deduce Theorem 1 from Theorem 2, it suffices to apply it to a
coherent cyclic order O of D. A cyclic order is coherent if every arc lies
in a circuit of index one.
Every s.c. digraph admits a coherent cyclic order. A fast algorithm to
find a coherent cyclic order was given by Iwata and Matsuda (2008).
We then observe that:
• for every family $\mathcal{C}$ of directed cycles of D, we have $|\mathcal{C}| \le i(\mathcal{C})$, since each circuit has index at least one,
• because each vertex lies in a circuit and O is coherent, an
index-bounded weighting of D is necessarily {0, 1}-valued,
• since each arc lies in a circuit of index one, in an index-bounded weighting w no arc
can join two vertices of weight 1, so the support of w is a stable set and
$w(D) \le \alpha(D)$.
126 / 143
Miscellanea
Proof of Theorem 2
Let D = (V, E) be a digraph, $V = \{v_1, \dots, v_n\}$ and $E = \{a_1, \dots, a_m\}$. It
suffices to show that equality in Theorem 2 holds for some circuit covering
$\mathcal{C}$ and some index-bounded weighting w.
An arc $(v_i, v_j)$ is called a forward arc of D if i < j, and a reverse arc if
j < i. Consider the matrix $Q := \begin{pmatrix} M \\ N \end{pmatrix}$, where $M = (m_{ij})$ is the
incidence matrix of D and $N = (n_{ij})$ is the n × m matrix defined by
\[ n_{ij} = \begin{cases} 1 & \text{if } v_i \text{ is the tail of } a_j, \\ 0 & \text{otherwise.} \end{cases} \]
Q is totally unimodular. Indeed, let $\tilde{Q}$ be the matrix obtained from Q by
subtracting each row of N from the corresponding row of M. Each
column of $\tilde{Q}$ contains one 1 and one −1, the remaining entries being 0.
Thus, $\tilde{Q}$ is totally unimodular. Because $\tilde{Q}$ was derived from Q by
elementary row operations, the matrix Q is totally unimodular too.
127 / 143
Miscellanea
Proof of Theorem 2
Let $b \in \mathbb{R}^{2n}$ with
\[ b_i := \begin{cases} 0 & \text{if } 1 \le i \le n, \\ 1 & \text{otherwise,} \end{cases} \]
and $c \in \mathbb{R}^m$ with
\[ c_j := \begin{cases} 1 & \text{if } a_j \text{ is a reverse arc,} \\ 0 & \text{otherwise.} \end{cases} \]
Note that:
• If $x := f_C$ is the circulation associated with a circuit C, then
$c^T x = i(C)$, the index of C.
• If $Nx \ge \mathbf{1}$, where $x := \sum \{\lambda_C f_C : C \in \mathcal{C}\}$ is a linear combination of
circulations, then the family $\mathcal{C}$ of circuits of D is a covering of D.
Consider the linear programme (LP) $\min\{c^T x \mid x \ge 0, Qx \ge b\}$.
The system $Qx \ge b$ is equivalent to $Mx \ge 0$ and $Nx \ge \mathbf{1}$. Because the
rows of M sum to 0, the rows of Mx sum to 0, so $Mx \ge 0$ implies $Mx = 0$.
128 / 143
Miscellanea
Proof of Theorem 2
It follows that every feasible solution of the above (LP) is a
non-negative circulation in D and hence a non-negative linear combination
$\sum \lambda_C f_C$ of circulations associated with circuits of D. Since $Nx \ge \mathbf{1}$, the
circuits of positive weight in this sum form a covering of D.
Conversely, every circuit covering of D yields a feasible solution to (LP).
The linear programme (LP) is feasible because, by assumption, D has at
least one circuit covering, and it is bounded because c is non-negative.
Thus (LP) has an optimal solution.
Hence (LP) has an integral optimal solution, because Q is totally
unimodular and the constraints are integral. This solution corresponds to
a circuit covering $\mathcal{C}$ of minimum index, the optimal value being $i(\mathcal{C})$.
129 / 143
Miscellanea
Proof of Theorem 2
Consider the dual of (LP): $\max\{b^T y \mid y \ge 0, y^T Q \le c^T\}$. If
$y^T := (z_1, \dots, z_n, w_1, \dots, w_n)$, then this is (DLP)
\[ \max\left\{ \sum_{i=1}^{n} w_i \ \middle|\ y \ge 0,\ z_i - z_k + w_i \le \begin{cases} 1 & \text{if } a_j = (v_i, v_k) \text{ is a reverse arc,} \\ 0 & \text{if } a_j = (v_i, v_k) \text{ is a forward arc} \end{cases} \right\}. \]
Consider an integral optimal solution to (DLP). If we sum the
constraints over the arc set of a circuit C of D, we obtain the inequality
\[ \sum_{v_i \in V(C)} w_i \le i(C). \]
In other words, the function w defined by $w(v_i) := w_i$, 1 ≤ i ≤ n, is an
index-bounded weighting, and the optimal value is the weight w(D) of
D. By the Duality Theorem, we have $i(\mathcal{C}) = w(D)$.
130 / 143
Miscellanea – Two-player Games
Preliminaries
A two-player normal-form game is specified via a pair (R, C ) of
m × n payoff matrices.
The row player has m pure strategies, which are in one-to-one
correspondence with the rows of the payoff matrices.
The column player has n pure strategies, which are in one-to-one
correspondence with the columns of the payoff matrices.
If the row player plays strategy i and the column player strategy j,
then their respective payoffs are given by R[i, j] and C [i, j].
131 / 143
Miscellanea – Two-player Games
Mixed strategies
Players may also randomize over their strategies, leading to
mixed – as opposed to pure – strategies.
Notation: $\Delta_p = \{x \in \mathbb{R}^p \mid x_i \ge 0 \ \forall i = 1, \dots, p, \text{ and } \sum_{i=1}^{p} x_i = 1\}$ is
the p-dimensional simplex.
The set of mixed strategies available to the row player is $\Delta_m$ and
those available to the column player are from $\Delta_n$.
For $x \in \Delta_m$ and $y \in \Delta_n$, the expected payoffs of the row and
column player are respectively $x^T R y$ and $x^T C y$.
132 / 143
Miscellanea – Two-player Games
Nash Equilibrium
$(x, y) \in \Delta_m \times \Delta_n$ is said to be a Nash equilibrium if neither player
can increase her expected payoff by unilaterally deviating from her
strategy:
\[ x^T R y \ge x'^T R y \ \ \forall x' \in \Delta_m, \qquad x^T C y \ge x^T C y' \ \ \forall y' \in \Delta_n. \]
Equivalently, the support of a player's strategy contains only pure
strategies that maximize her payoff given the other player's mixed
strategy:
\[ x_i > 0 \Rightarrow e_i^T R y \ge e_j^T R y \ \ \forall j \in \{1, \dots, m\}, \qquad y_j > 0 \Rightarrow x^T C e_j \ge x^T C e_i \ \ \forall i \in \{1, \dots, n\}. \]
Equivalently, a player cannot improve her payoff by unilaterally
switching to a pure strategy:
\[ x^T R y \ge e_i^T R y \ \ \forall i \in \{1, \dots, m\}, \qquad x^T C y \ge x^T C e_j \ \ \forall j \in \{1, \dots, n\}. \]
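The last characterization gives a finite test for equilibrium: it is enough to compare the current payoffs with those of the m + n pure deviations. A minimal sketch follows, assuming payoff matrices stored as nested lists and a small numerical tolerance; the name `is_nash` and the matching-pennies matrices are illustrative.

```python
def is_nash(R, C, x, y, tol=1e-9):
    """Check x^T R y >= e_i^T R y for all i and x^T C y >= x^T C e_j for all j."""
    m, n = len(R), len(R[0])
    Ry = [sum(R[i][j] * y[j] for j in range(n)) for i in range(m)]  # e_i^T R y
    xC = [sum(x[i] * C[i][j] for i in range(m)) for j in range(n)]  # x^T C e_j
    row_payoff = sum(x[i] * Ry[i] for i in range(m))
    col_payoff = sum(xC[j] * y[j] for j in range(n))
    return row_payoff >= max(Ry) - tol and col_payoff >= max(xC) - tol

# Assumed example: matching pennies (zero-sum, C = -R).
R = [[1, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
print(is_nash(R, C, [0.5, 0.5], [0.5, 0.5]))  # True
print(is_nash(R, C, [1, 0], [1, 0]))          # False: the column player deviates
```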
133 / 143
Miscellanea – Two-player Zero-sum Games
LP Formulation
Definition. A two-player game is said to be zero-sum if R + C = 0, i.e.
R[i, j] + C[i, j] = 0 for all i ∈ {1, . . . , m} and j ∈ {1, . . . , n}.
If the row player is forced to announce her strategy in advance, she
solves the following linear program, whose value equals the
row player's payoff after the column player's optimal response to her
strategy:
LP1: $\max\{z \mid x^T R \ge z \mathbf{1}^T,\ x \in \Delta_m\}$.
The maximum of z is actually $\max_x \min_y x^T R y$.
Dual of LP1:
LP2: $\min\{z' \mid -y^T R^T \ge -z' \mathbf{1}^T,\ y \in \Delta_n\}$.
Taking $z'' = -z'$ and using the fact that C = −R, we can change
LP2 into:
LP3: $\max\{z'' \mid Cy \ge z'' \mathbf{1},\ y \in \Delta_n\}$.
The maximum of $z''$ is actually
$\max_y \min_x x^T C y = -\min_y \max_x x^T R y$.
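As a small worked instance, LP1 and LP3 can be solved by hand for matching pennies. The matrix below is an assumed example, not taken from the slides.

```latex
% Assumed example: R = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \quad C = -R.
\begin{align*}
\text{LP1:}\quad & \max z \quad \text{s.t.} \quad x_1 - x_2 \ge z,\quad -x_1 + x_2 \ge z,\quad x_1 + x_2 = 1,\quad x \ge 0 \\
& \Rightarrow\ z \le \min\{x_1 - x_2,\ x_2 - x_1\} = -|x_1 - x_2| \le 0, \\
& \text{with equality exactly at } x = \left(\tfrac12, \tfrac12\right), \text{ so } z = 0. \\
\text{LP3:}\quad & \max z'' \quad \text{s.t.} \quad -y_1 + y_2 \ge z'',\quad y_1 - y_2 \ge z'',\quad y \in \Delta_2 \\
& \Rightarrow\ y = \left(\tfrac12, \tfrac12\right),\quad z'' = 0 = -z.
\end{align*}
```

Both programs have value 0, and the pair of optimal strategies is the uniform mixed equilibrium of the game.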
134 / 143
Miscellanea – Two-player Zero-sum Games
LP Duality
Theorem 1. If (x, z) is optimal for LP1, and $(y, z'')$ is optimal for LP3,
then (x, y) is a Nash equilibrium of (R, C). Moreover, the payoffs of the
row/column player in this Nash equilibrium are z and $z'' = -z$
respectively.
Corollary 1. There exists a Nash equilibrium in every two-player
zero-sum game.
Corollary 2. (The Minmax Theorem)
\[ \max_x \min_y x^T R y = \min_y \max_x x^T R y. \]
135 / 143
Miscellanea – Two-player Zero-sum Games
LP Duality
Theorem 2. If (x, y) is a Nash equilibrium of (R, C), then $(x, x^T R y)$ is
an optimal solution of LP1, and $(y, -x^T C y)$ is an optimal solution of
LP2.
Corollary 3. The equilibrium set
{(x, y) | (x, y) is a Nash equilibrium of (R, C)}
of a zero-sum game is convex.
Corollary 4. The payoff of the row player is equal in all Nash
equilibria of a zero-sum game. Ditto for the column player.
Definition (Value of a Zero-Sum Game). If (R, C) is a zero-sum game,
then the value of the game is the unique payoff of the row player in all
Nash equilibria of the game.
136 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Basic Idea (Brown 1951):
The two players play the game in their heads, going through several
rounds of speculation and counterspeculation as to how their opponents
might react and how they would react in turn.
FP proceeds in rounds.
In the first round, each player arbitrarily chooses one of his actions
(row/column).
In subsequent rounds, each player looks at the empirical frequency of
play of their respective opponents in previous rounds, interprets it as a
probability distribution, and myopically plays a pure best response
against this distribution.
137 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. Let (R, C) be a two-player game, $x \in \Delta_m$ a mixed strategy
of the row player and $y \in \Delta_n$ a mixed strategy of the column player.
The players' pure-strategy best responses to y and x are:
$BR(y) = \{i \in \{1, \dots, m\} \mid e_i^T R y \ge e_j^T R y \ \ \forall j \in \{1, \dots, m\}\}$,
$BR(x) = \{j \in \{1, \dots, n\} \mid x^T C e_j \ge x^T C e_i \ \ \forall i \in \{1, \dots, n\}\}$.
138 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. The sequence $(i_t, j_t)_{t \in \mathbb{N}}$ is a simultaneous fictitious play
process (SFP process) for the game (R, C), if
$(i_1, j_1) \in \{1, \dots, m\} \times \{1, \dots, n\}$ and for all $t \in \mathbb{N}$,
$i_{t+1} \in BR(y_t)$ and $j_{t+1} \in BR(x_t)$,
where the mixed strategies $x_t$ and $y_t$ (called beliefs) are given by
\[ x_t = \frac{1}{t} \sum_{s=1}^{t} e_{i_s} \quad \text{and} \quad y_t = \frac{1}{t} \sum_{s=1}^{t} e_{j_s}. \]
139 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. The sequence $(i_t, j_t)_{t \in \mathbb{N}}$ is an alternating fictitious play
process (AFP process) for the game (R, C), if $i_1 \in \{1, \dots, m\}$ and for
all $t \in \mathbb{N}$,
$i_{t+1} \in BR(y_t)$ and $j_t \in BR(x_t)$,
where the mixed strategies $x_t$ and $y_t$ (called beliefs) are given by
\[ x_t = \frac{1}{t} \sum_{s=1}^{t} e_{i_s} \quad \text{and} \quad y_t = \frac{1}{t} \sum_{s=1}^{t} e_{j_s}. \]
140 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Remarks.
Beliefs can be updated recursively. The belief of a player in round
t + 1 is a convex combination of his belief in round t and his pure-strategy
best response to the opponent's play up to round t:
\[ x_{t+1} = \frac{t}{t+1} x_t + \frac{1}{t+1} e_{i_{t+1}} \quad \text{and} \quad y_{t+1} = \frac{t}{t+1} y_t + \frac{1}{t+1} e_{j_{t+1}}. \]
If a fictitious play process (AFP or SFP) converges, it must be
constant from some stage on, implying that the limit is a pure Nash
equilibrium.
Even if the process does not converge, if the beliefs converge, then
the limit must be a Nash equilibrium (which need not be pure,
however).
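The recursive update follows in one line from the definition of the beliefs, assuming they are the empirical frequencies $x_t = \frac{1}{t} \sum_{s=1}^{t} e_{i_s}$ of the moves played so far (the same computation applies to $y_t$):

```latex
\begin{align*}
x_{t+1} &= \frac{1}{t+1} \sum_{s=1}^{t+1} e_{i_s}
         = \frac{1}{t+1} \left( \sum_{s=1}^{t} e_{i_s} + e_{i_{t+1}} \right)
         = \frac{1}{t+1} \left( t\, x_t + e_{i_{t+1}} \right)
         = \frac{t}{t+1}\, x_t + \frac{1}{t+1}\, e_{i_{t+1}}.
\end{align*}
```

The weights $\frac{t}{t+1}$ and $\frac{1}{t+1}$ are non-negative and sum to 1, so each new belief is indeed a convex combination of the previous belief and a pure strategy.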
141 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. The sequence $(x_t, y_t)_{t \in \mathbb{N}^*}$ is called a learning sequence. A
learning sequence of a game is said to converge if for some Nash
equilibrium (x, y) of the game,
\[ \lim_{t \to \infty} (x_t, y_t) = (x, y). \]
We then say that FP converges if every learning sequence converges to
a Nash equilibrium.
Theorem (J. Robinson, 1951). FP converges for every two-player
zero-sum game.
142 / 143
Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. A two-player zero-sum game is called symmetric if R and C are
square skew-symmetric matrices with $R^T = -R = C$.
In symmetric games, both players have the same set of actions. Example:
\[ R = \begin{pmatrix} 0 & -1 & -\varepsilon \\ 1 & 0 & -\varepsilon \\ \varepsilon & \varepsilon & 0 \end{pmatrix} \]
Theorem (F. Brandt, F. Fischer, P. Harrenstein, 2011).
In symmetric two-player constant-sum games, FP may require
exponentially many rounds (in the size of the representation of the game)
before an equilibrium action is eventually played.
Proof. Taking $\varepsilon = 2^{-k}$ (for $k \in \mathbb{N}^*$) in the matrix above, it is proved
that FP may take $2^k$ rounds before either player plays the pure strategy
$e_3$ ($(e_3, e_3)$ is the unique Nash equilibrium of the game).
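The exponential delay can be observed by simulating simultaneous fictitious play. The sketch below makes several assumptions that are not in the slides: the example matrix is taken to be the skew-symmetric matrix with rows (0, −1, −ε), (1, 0, −ε), (ε, ε, 0); play starts at $(e_1, e_1)$; ties are broken in favour of the lowest index; and actions are numbered from 0 in the code. Best responses are computed against cumulative counts, which give the same maximizers as the empirical frequencies.

```python
from fractions import Fraction

def best_response_row(R, col_counts):
    """Lowest-index row maximizing e_i^T R y, with y proportional to col_counts."""
    payoffs = [sum(r * c for r, c in zip(row, col_counts)) for row in R]
    return payoffs.index(max(payoffs))

def best_response_col(C, row_counts):
    """Lowest-index column maximizing x^T C e_j, with x proportional to row_counts."""
    n = len(C[0])
    payoffs = [sum(row_counts[i] * C[i][j] for i in range(len(C))) for j in range(n)]
    return payoffs.index(max(payoffs))

def rounds_until_played(R, C, target, start=(0, 0), max_rounds=100000):
    """First round (1-based) of simultaneous FP in which either player plays `target`."""
    i, j = start
    row_counts = [0] * len(R)
    col_counts = [0] * len(R[0])
    for t in range(1, max_rounds + 1):
        if i == target or j == target:
            return t
        row_counts[i] += 1
        col_counts[j] += 1
        i = best_response_row(R, col_counts)
        j = best_response_col(C, row_counts)
    return None

def example_game(k):
    """The assumed 3 x 3 symmetric zero-sum game with epsilon = 2**(-k)."""
    eps = Fraction(1, 2 ** k)
    R = [[0, -1, -eps], [1, 0, -eps], [eps, eps, 0]]
    C = [[-entry for entry in row] for row in R]
    return R, C

for k in range(1, 6):
    R, C = example_game(k)
    print(k, rounds_until_played(R, C, target=2))
```

Under these assumptions both players repeat the second action until the accumulated ε-payoffs of the third action exceed 1, which takes more than $2^k$ rounds; exact rational arithmetic is used so that ties are detected reliably.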
143 / 143