COMBINATORIAL OPTIMIZATION, Module 1
C. Croitoru, FII
March 19, 2015
1 / 143

OUTLINE: POLYHEDRAL COMBINATORICS
1. Introduction
2. Background information on polyhedra
3. Background information on linear programming
4. Total Unimodularity
5. Total Dual Integrality
6. Blocking Polyhedra
7. Anti-blocking Polyhedra
8. Cutting Planes
9. Hard Problems – Complexity of the Integer Hull
10. Miscellanea
2 / 143

Introduction
Polyhedral Combinatorics $\Updownarrow$ Linear Programming techniques $\Longrightarrow$ Combinatorial Problems
1950-1960: Dantzig, Ford, Fulkerson, Hoffman, Johnson, Kruskal
1960-1970: Edmonds, Giles, Fulkerson
1970 onward: Lovász, Grötschel, Chvátal, Schrijver, Padberg, etc.
3 / 143

Basic Reference: Alexander Schrijver's book Combinatorial Optimization: Polyhedra and Efficiency.
Quote from Schrijver's preface: "Pioneered by the work of Jack Edmonds, polyhedral combinatorics has proved to be a most powerful, coherent, and unifying tool throughout combinatorial optimization."
Footnote: most of the present notes follow Schrijver's chapter 30 in Handbook of Combinatorial Optimization, Elsevier 1995.
4 / 143

Background information on polyhedra
Polyhedron: a set $P \subseteq \mathbb{R}^n$ is a polyhedron if there exist a matrix $A$ and a vector $b$ such that $P = \{x \mid Ax \le b\}$.
Polytope: a set $P \subseteq \mathbb{R}^n$ is a polytope if there exist $x_1, \dots, x_t \in \mathbb{R}^n$ such that $P = \mathrm{convhull}\{x_1, \dots, x_t\}$. (Reminder: convex sets, convhull.)
5 / 143

Finite Basis Theorem for Polytopes (Minkowski 1896, Steinitz 1916, Weyl 1935): a set $P$ is a polytope if and only if $P$ is a bounded polyhedron.
Decomposition Theorem for Polyhedra (Motzkin 1936): a set $P \subseteq \mathbb{R}^n$ is a polyhedron if and only if there exist $x_1, \dots, x_t, y_1, \dots, y_s \in \mathbb{R}^n$ such that
$$P = \{\lambda_1 x_1 + \dots + \lambda_t x_t + \mu_1 y_1 + \dots + \mu_s y_s \mid \lambda_1, \dots, \lambda_t, \mu_1, \dots, \mu_s \ge 0,\ \lambda_1 + \dots + \lambda_t = 1\}.$$
6 / 143

Supporting hyperplane
Let $P = \{x \mid Ax \le b\} \subseteq \mathbb{R}^n$ be a polyhedron, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. If $c \in \mathbb{R}^n$, $c \ne 0$, and $\delta = \max\{c^T x \mid x \in P\}$ is finite, then the set $\{x \in \mathbb{R}^n \mid c^T x = \delta\}$ is a supporting hyperplane of $P$.
Face
Let $P = \{x \mid Ax \le b\}$ be a polyhedron. A subset $F \subseteq P$ is a face of $P$ if $F = P$ or $F = P \cap H$ for some supporting hyperplane $H$ of $P$. A face is again a polyhedron.
7 / 143

Minimal Face
For any face $F$ of $P = \{x \mid Ax \le b\}$, there exists a subsystem $A'x \le b'$ of $Ax \le b$ such that $F = \{x \in P \mid A'x = b'\}$. It follows that there are only finitely many faces. Minimal faces are the faces that are minimal w.r.t. inclusion.
Theorem (Hoffman, Kruskal 1956): $F$ is a minimal face of $P \subseteq \mathbb{R}^n$ if and only if $\emptyset \ne F \subseteq P$ and $F = \{x \mid A'x = b'\}$ for some subsystem $A'x \le b'$ of $Ax \le b$.
8 / 143

Vertex
If $F$ is a minimal face then its dimension is $n - \mathrm{rank}(A)$. If $\mathrm{rank}(A) = n$ then minimal faces correspond to vertices: an element of $P$ is a vertex if it is not a convex combination of other elements of $P$.
$P = \{x \mid Ax \le b\}$ has vertices only if $\mathrm{rank}(A) = n$, and then those vertices are exactly the minimal faces of $P$.
Theorem: a vector $z \in P = \{x \mid Ax \le b\}$ is a vertex of $P$ if and only if $A'z = b'$ for some subsystem $A'x \le b'$ of $Ax \le b$, with $A'$ nonsingular of order $n$. Such an $A'$ (in general not unique) is a basis for $z$.
9 / 143

Vertices
$P$ is called pointed if it has vertices. A polytope is always pointed and it is the convex hull of its vertices.
Adjacent vertices
Vertices $z'$ and $z''$ are adjacent if $\mathrm{convhull}\{z', z''\}$ is a face of $P$.
If $P$ is a polytope, then $z'$ and $z''$ are adjacent vertices $\Leftrightarrow$ $\frac{1}{2}(z' + z'')$ is not a convex combination of other vertices of $P$.
Theorem: vertices $z'$ and $z''$ of the polyhedron $P$ are adjacent if and only if they have bases $A'$ and $A''$ (respectively) with $n - 1$ rows in common.
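The basis characterization of vertices on slide 9 can be illustrated with a small brute-force sketch. It is only an illustration, not a practical method: the helper names `solve_square` and `vertices` are hypothetical, exact rational arithmetic is assumed so that feasibility tests are not affected by rounding, and the number of subsystems tried grows exponentially.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(rows, rhs):
    """Solve the n x n system rows * x = rhs exactly; return None if singular."""
    n = len(rows)
    # Augmented matrix over the rationals.
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None  # singular subsystem: not a basis
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[r][n] for r in range(n))

def vertices(A, b):
    """Vertices of {x | Ax <= b}: feasible solutions of nonsingular n x n subsystems."""
    n = len(A[0])
    found = set()
    for idx in combinations(range(len(A)), n):
        z = solve_square([A[i] for i in idx], [b[i] for i in idx])
        if z is None:
            continue
        # Keep z only if it satisfies every inequality of the full system.
        if all(sum(a * x for a, x in zip(row, z)) <= bi for row, bi in zip(A, b)):
            found.add(z)
    return found

# Unit square: x <= 1, y <= 1, -x <= 0, -y <= 0.
A = [[1, 0], [0, 1], [-1, 0], [0, -1]]
b = [1, 1, 0, 0]
square_vertices = vertices(A, b)
```

For the unit square, the four nonsingular pairs of constraints give the four corners, while the two pairs of parallel constraints are rejected as singular.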
10 / 143

Hirsch's Conjecture
To a polyhedron $P$ we associate the graph $G(P)$ whose nodes are the vertices of $P$ and whose edges are the pairs of adjacent vertices. The diameter of $P$ is the diameter of $G(P)$.
Hirsch's Conjecture (cf. Dantzig 1963): a polytope in $\mathbb{R}^n$ determined by $m$ inequalities has diameter at most $m - n$.
Naddef 1989: true for polytopes all of whose vertices are $\{0, 1\}$-vectors.
Francisco Santos 2010: false in general.
11 / 143

Facets
A facet of $P$ is a maximal (w.r.t. inclusion) face $F$ of $P$ with $F \ne P$.
A face $F$ is a facet if and only if $\dim(F) = \dim(P) - 1$. (Reminder: dimension.)
An inequality $c^T x \le \delta$ is called a facet-inducing inequality if $P \subseteq \{x \mid c^T x \le \delta\}$ and $P \cap \{x \mid c^T x = \delta\}$ is a facet of $P$.
12 / 143

Facets
$Ax \le b$ is an irredundant (or minimal) system determining $P$ if no inequality in $Ax \le b$ is implied by the others.
Let $A^+ x \le b^+$ be those inequalities $\alpha^T x \le \beta$ from $Ax \le b$ for which $\alpha^T z < \beta$ for at least one $z$ in $P$.
If $Ax \le b$ is irredundant, each inequality in $A^+ x \le b^+$ is a facet-inducing inequality. Moreover, this defines a one-to-one correspondence between facets and inequalities in $A^+ x \le b^+$.
13 / 143

Facets
If $P$ is full-dimensional, then the irredundant system $Ax \le b$ is unique up to multiplication of inequalities by positive scalars.
Theorem: if $P = \{x \mid Ax \le b\}$ is full-dimensional, then $Ax \le b$ is irredundant if and only if for each pair $a_i^T x \le b_i$ and $a_j^T x \le b_j$ of distinct constraints from $Ax \le b$ there is a vector $x'$ in $P$ such that $a_i^T x' = b_i$ and $a_j^T x' < b_j$.
14 / 143

Integral polyhedron
The polyhedron $P = \{x \mid Ax \le b\}$ is rational if $A$ and $b$ are rational-valued (hence we can take them integer-valued).
Write $P = \{\lambda_1 x_1 + \dots + \lambda_t x_t + \mu_1 y_1 + \dots + \mu_s y_s \mid \lambda_1, \dots, \lambda_t, \mu_1, \dots, \mu_s \ge 0,\ \lambda_1 + \dots + \lambda_t = 1\}$.
$P$ is rational if and only if we can take $x_1, \dots, x_t, y_1, \dots, y_s \in \mathbb{Q}^n$.
$P$ is called integral if we can take $x_1, \dots, x_t, y_1, \dots$
, $y_s \in \mathbb{Z}^n$.
Hence $P$ is integral if and only if $P$ is the convex hull of the integer vectors in $P$ or, equivalently, if and only if every minimal face of $P$ contains integer vectors.
15 / 143

Background information on linear programming
Linear Programming (LP)
The problem of maximizing or minimizing a linear function $c^T x$ over a polyhedron $P$. Examples:
(i) $\max\{c^T x \mid Ax \le b\}$
(ii) $\max\{c^T x \mid x \ge 0, Ax \le b\}$
(iii) $\max\{c^T x \mid x \ge 0, Ax = b\}$
(iv) $\min\{c^T x \mid x \ge 0, Ax \ge b\}$
It can be shown, for each of the above problems, that if the set involved is a polyhedron with vertices, and if the optimum value is finite, then it is attained by a vertex of the polyhedron.
16 / 143

Dual Problem
Duality Theorem of LP. Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$. Then
(i) $\max\{c^T x \mid Ax \le b\} = \min\{y^T b \mid y \ge 0, y^T A = c^T\}$
(ii) $\max\{c^T x \mid x \ge 0, Ax \le b\} = \min\{y^T b \mid y \ge 0, y^T A \ge c^T\}$
(iii) $\max\{c^T x \mid x \ge 0, Ax = b\} = \min\{y^T b \mid y^T A \ge c^T\}$
(iv) $\min\{c^T x \mid x \ge 0, Ax \ge b\} = \max\{y^T b \mid y \ge 0, y^T A \le c^T\}$
provided that these sets are nonempty.
17 / 143

Farkas's Lemma
Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$. $Ax = b$ has a solution $x \ge 0$ if and only if $y^T b \ge 0$ holds for each $y \in \mathbb{R}^m$ with $y^T A \ge 0$.
Principle of complementary slackness
Let $x$ and $y$ satisfy $Ax \le b$, $y \ge 0$, and $y^T A = c^T$. Then $x$ and $y$ are optimal solutions in
(i) $\max\{c^T x \mid Ax \le b\} = \min\{y^T b \mid y \ge 0, y^T A = c^T\}$
if and only if $y_i = 0$ or $a_i^T x = b_i$ for each $i = 1, \dots, m$.
($a_i^T x = b_i$ denotes the $i$th row in $Ax = b$; similar statements hold for the pairs (ii)-(iv) of dual problems.)
18 / 143

The simplex method (Dantzig 1951)
Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$.
We solve $\max\{c^T x \mid Ax \le b\}$, where the polyhedron $P = \{x \mid Ax \le b\}$ has vertices, i.e. $\mathrm{rank}(A) = n$.
Idea: make a trip, going from a vertex to a better adjacent vertex, until an optimal vertex is reached.
By the theorem on slide 9, vertices can be described by bases, while by the theorem on slide 10, adjacency can be described as bases differing in exactly one constraint.
19 / 143

The simplex method (1)
The process can be described as a sequence of bases
$$A_0 x \le b_0,\quad A_1 x \le b_1,\quad A_2 x \le b_2,\ \dots$$
where $x_k := A_k^{-1} b_k$ is a vertex of $P$, $A_{k+1} x \le b_{k+1}$ differs by one constraint from $A_k x \le b_k$, and $c^T x_{k+1} \ge c^T x_k$.
$A_0 x \le b_0$ is an (arbitrary) basis corresponding to a vertex, and $A_{k+1} x \le b_{k+1}$ is obtained from $A_k x \le b_k$ as follows:
20 / 143

The simplex method (2)
• If $c^T A_k^{-1} \ge 0$ then $x_k$ is an optimal solution of $\max\{c^T x \mid Ax \le b\}$. Indeed, for every $x$ such that $Ax \le b$ we have $A_k x \le b_k$ and hence
$$c^T x = c^T (A_k^{-1} A_k) x = (c^T A_k^{-1})(A_k x) \le (c^T A_k^{-1}) b_k = c^T (A_k^{-1} b_k) = c^T x_k.$$
• Else choose $i$ s.t. $(c^T A_k^{-1})_i < 0$ and let $z := -A_k^{-1} e_i$, where $e_i$ is the $i$th unit vector in $\mathbb{R}^n$.
Note that for $\lambda \ge 0$, $x_k + \lambda z$ traverses an edge or a ray of $P$ (i.e. a face of dimension 1) or is outside of $P$ for all $\lambda > 0$. Also, $c^T z = -(c^T A_k^{-1})_i > 0$.
21 / 143

The simplex method (3)
• If $Az \le 0$ then $x_k + \lambda z \in P$ for all $\lambda \ge 0$, whence $\max\{c^T x \mid Ax \le b\} = \infty$.
• Else let $\lambda_0$ be the largest $\lambda$ s.t. $x_k + \lambda z \in P$:
$$\lambda_0 = \min\left\{ \frac{b_j - a_j^T x_k}{a_j^T z} \;\middle|\; j = 1, \dots, m,\ a_j^T z > 0 \right\}.$$
Choose $j$ attaining this minimum and replace the $i$th inequality in $A_k x \le b_k$ by the inequality $a_j^T x \le b_j$; this gives $A_{k+1} x \le b_{k+1}$.
Note that $x_{k+1} = x_k + \lambda_0 z$. Hence if $x_{k+1} \ne x_k$ then $c^T x_{k+1} > c^T x_k$.
The process terminates if $c^T x_{k+1} > c^T x_k$ holds at every step (since $P$ has finitely many vertices). This is the case when each vertex has only one basis (the nondegenerate case).
22 / 143

The simplex method. Discussion.
- Pivot selection rules, prescribing the choice of $i$ and $j$ above, have been found which could be proven to assure termination of the simplex method.
- None of these rules could be proven to give a polynomial-time method. Most of them could be shown to require an exponential number of iterations in the worst case.
- This happens even when the diameter of $P$ is small (TSP polytopes have diameter 2).
- A vertex of $P$ can be adjacent to an exponential number of vertices (in the size of $A$ and $b$), whereas for any basis $A'$ there are at most $n(m - n)$ bases differing from $A'$ in exactly one row.
- Desire: finding pivoting rules that prevent us from going through many bases corresponding to the same vertex.
23 / 143

Primal-dual method (1)
- Dantzig, Ford, Fulkerson (1956): a generalization of similar methods for network flow and transportation problems.
- Idea: starting with a dual feasible solution $y$, a primal feasible solution $x$ satisfying the complementary slackness condition with respect to $y$ is searched for. If such a primal solution is found, then $x$ and $y$ form a pair of optimal (primal, dual) solutions. If no such primal solution is found, the method prescribes a modification of $y$, after which we start anew.
- The method requires strategies to find a primal solution satisfying complementary slackness and to modify the dual solution if no such primal solution is found.
24 / 143

Primal-dual method (2)
Let $A \in \mathbb{R}^{m \times n}$ with columns $a_1, \dots, a_n \in \mathbb{R}^m$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$.
We solve $\min\{c^T x \mid x \ge 0, Ax = b\}$.
The dual problem: $\max\{y^T b \mid y^T A \le c^T\}$.
Suppose we have a dual feasible solution $y_0$: $y_0^T A \le c^T$.
25 / 143

Primal-dual method (3)
A primal-dual iteration:
- Let $A'$ be the submatrix of $A$ consisting of those columns $a_j$ for which $y_0^T a_j = c_j$.
- Solve the restricted linear program
$$\min\{\lambda \mid \lambda, x' \ge 0,\ A'x' + \lambda b = b\} = \max\{y^T b \mid y^T A' \le 0,\ y^T b \le 1\}.$$
- If the optimum value is 0, we get a solution $\lambda = 0$, $x'_0 \ge 0$ with $A' x'_0 = b$. Adding zero components to $x'_0$ gives $x_0 \ge 0$ s.t.
$Ax_0 = b$ and $(x_0)_j = 0$ if $y_0^T a_j < c_j$. Complementary slackness is fulfilled, hence $x_0$, $y_0$ are optimal primal/dual solutions.
- If the optimum value is not 0, then it is 1. Let $u$ be an optimal solution for the maximum. Let $\theta$ be the largest real number satisfying $(y_0 + \theta u)^T A \le c^T$ (note that $\theta > 0$). Set $y_0 := y_0 + \theta u$ and start the iteration anew.
26 / 143

Primal-dual method as a gradient method
Let $y_0$ be a feasible solution of $\max\{y^T b \mid y^T A \le c^T\}$.
$y_0$ is not optimal $\Leftrightarrow$ we can find $u$ such that $u^T b > 0$ and $u$ is a feasible direction in $y_0$: $(y_0 + \theta u)^T A \le c^T$ for some $\theta > 0$.
If $A'$ consists of the columns of $A$ in which $y_0^T A \le c^T$ holds with equality, then $u$ is a feasible direction $\Leftrightarrow$ $u^T A' \le 0$.
So $u$ can be found by solving $\max\{u^T b \mid u^T A' \le 0,\ u^T b \le 1\}$.
27 / 143

Primal-dual method as a gradient method: Maximum flow (1)
Given a digraph $D = (V, A)$, arc capacities $c_a \in [0, \infty]$ $\forall a \in A$, a source node $s$ and a sink node $t$.
The maximum $s$-$t$ flow problem: find the maximum amount of flow that can be sent from $s$ to $t$ through the network, without exceeding the capacities. Formulated as an LP:
$$\text{maximise } \sum_{a \in \delta^+(s)} x_a - \sum_{a \in \delta^-(s)} x_a$$
$$\text{subject to } \sum_{a \in \delta^+(v)} x_a - \sum_{a \in \delta^-(v)} x_a = 0 \quad \forall v \in V - \{s, t\}$$
$$0 \le x_a \le c(a) \quad \forall a \in A$$
If we have a feasible solution $x_0$, finding a feasible direction in $x_0$ means finding $u : A \to \mathbb{R}$ satisfying:
28 / 143

Primal-dual method as a gradient method: Maximum flow (2)
$$\sum_{a \in \delta^+(s)} u(a) - \sum_{a \in \delta^-(s)} u(a) > 0$$
$$\sum_{a \in \delta^+(v)} u(a) - \sum_{a \in \delta^-(v)} u(a) = 0 \quad \forall v \in V - \{s, t\}$$
$$u(a) \ge 0 \quad \forall a \in A \text{ s.t. } x_0(a) = 0$$
$$u(a) \le 0 \quad \forall a \in A \text{ s.t. } x_0(a) = c(a)$$
This means finding an undirected $s$-$t$ path in which arcs are traversed forward, backward, or in either direction such that,
for every arc $a$ in the path: $a$ is traversed forward if $x_0(a) = 0$, backward if $x_0(a) = c(a)$, and forward or backward if $0 < x_0(a) < c(a)$.
29 / 143

Primal-dual method as a gradient method: Maximum flow (3)
If we have such a path, we take $u(a)$ equal to 1 or $-1$ on the forward/backward arcs of the path, and 0 on the other arcs.
Taking the highest value of $\theta$ such that $x_0 + \theta u$ is a feasible flow gives the new solution.
The path is called a flow-augmenting path since the new solution has a higher objective value.
This is exactly the Ford-Fulkerson algorithm, which is hence an example of a primal-dual method.
Dinits (1970) and Edmonds-Karp (1972) gave polynomial-time versions of this algorithm.
30 / 143

The ellipsoid method: Khachiyan (1979)
Developed by Shor (1970) and Yudin-Nemirovskiĭ (1976) for nonlinear programming; shown by Khachiyan to solve LP in polynomial time.
Rough description: $A \in \mathbb{Q}^{m \times n}$, $b \in \mathbb{Q}^m$, and $c \in \mathbb{Q}^n$. We solve $\max\{c^T x \mid Ax \le b\}$, where the polyhedron $P = \{x \mid Ax \le b\}$ is bounded.
- Find $R \in \mathbb{R}$ s.t. $P \subseteq E_0 = \{x \in \mathbb{R}^n \mid \|x\| \le R\}$.
- Construct a sequence of ellipsoids $E_0, E_1, \dots$, starting with $E_0$, where for each $t$, $E_{t+1}$ is obtained from $E_t$ as follows. Let $z$ be the center of $E_t$ and define the half-space
$$H := \begin{cases} \{x \mid a_k^T x \le a_k^T z\} & \text{if } \exists k \text{ s.t. } a_k^T z > b_k, \\ \{x \mid c^T x \ge c^T z\} & \text{if } Az \le b. \end{cases}$$
$E_{t+1}$ is the ellipsoid of smallest volume containing $E_t \cap H$.
31 / 143

The ellipsoid method (2)
One can prove that $E_{t+1}$ is unique and $\mathrm{vol}(E_{t+1}) \le e^{-1/(2(n+1))}\, \mathrm{vol}(E_t)$.
Since an optimum solution of the LP belongs to each $E_t$, the centers of the ellipsoids may converge to an optimum solution.
Difficulty: ellipsoids with very small volumes may have a large diameter (hence centers can remain far from an optimum solution).
Difficulty: the unique smallest-volume ellipsoid is determined by irrational parameters, hence working in rational arithmetic must allow successive approximations.
These problems can be overcome and a polynomial running time can be proved.
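The flow-augmenting path method from the maximum-flow slides can be sketched in Python. This is a minimal sketch, assuming nodes numbered $0, \dots, n-1$ and finite integer capacities; the function name `max_flow` and the paired-arc residual layout are choices made for this illustration. Breadth-first search is used to pick the path, which is the Edmonds-Karp variant mentioned on slide 30. A forward residual arc corresponds to traversing an arc forward ($x_0(a) < c(a)$), and its paired arc to traversing it backward ($x_0(a) > 0$).

```python
from collections import deque

def max_flow(n, arcs, s, t):
    """Maximum s-t flow value by shortest augmenting paths.

    arcs: list of (tail, head, capacity) with nodes numbered 0..n-1.
    """
    # Residual arc 2k is the forward copy of arc k, arc 2k+1 its backward copy.
    head, residual, out = [], [], [[] for _ in range(n)]
    for v, w, cap in arcs:
        out[v].append(len(head)); head.append(w); residual.append(cap)
        out[w].append(len(head)); head.append(v); residual.append(0)
    value = 0
    while True:
        # Breadth-first search for an s-t path with positive residual capacity.
        pred = [None] * n  # residual arc used to reach each node
        queue = deque([s])
        while queue and pred[t] is None:
            v = queue.popleft()
            for e in out[v]:
                w = head[e]
                if residual[e] > 0 and pred[w] is None and w != s:
                    pred[w] = e
                    queue.append(w)
        if pred[t] is None:
            return value  # no augmenting path: the current flow is maximum
        # theta = largest step that keeps the flow feasible along the path.
        theta, v = float("inf"), t
        while v != s:
            e = pred[v]
            theta = min(theta, residual[e])
            v = head[e ^ 1]  # the tail of e is the head of its paired arc
        # Augment: u(a) = +1 on forward arcs, -1 on backward arcs of the path.
        v = t
        while v != s:
            e = pred[v]
            residual[e] -= theta
            residual[e ^ 1] += theta
            v = head[e ^ 1]
        value += theta

example_arcs = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
example_value = max_flow(4, example_arcs, 0, 3)
```

In the example the arcs leaving node 0 have total capacity 5, and three augmenting paths saturate them, so the returned value is 5.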
32 / 143

The ellipsoid method: discussion (1)
Grötschel, Lovász, Schrijver (1981): in applying the ellipsoid method it is not necessary that the system $Ax \le b$ be explicitly given. It is sufficient to have a subroutine that decides whether a given vector $z$ belongs to the feasible region of the LP and finds a separating hyperplane in case $z$ is not feasible.
This is useful for LPs coming from combinatorial optimization, where the number of linear constraints is exponential in the size of the underlying data structure.
Optimization Problem (OP): given a graph $G = (V, E)$, $c : E \to \mathbb{Q}$, and a collection $\mathcal{F}_G$ of subsets of $E$, find $F_0 \in \mathcal{F}_G$ s.t. $c(F_0) = \max_{F \in \mathcal{F}_G} c(F)$, where $c(F) = \sum_{e \in F} c(e)$.
33 / 143

The ellipsoid method: discussion (2)
If $\mathcal{F}_G$ is the collection of matchings (spanning trees, respectively Hamiltonian circuits) in $G$, we obtain the problem of finding the maximum weighted matching (spanning tree, respectively Hamiltonian circuit, i.e. TSP).
The optimization problem is polynomially solvable if it is solvable by an algorithm whose running time is bounded by a polynomial in the size of the optimization problem, which is $|V| + |E| + \mathrm{size}(c)$, where $\mathrm{size}(c) := \sum_{e \in E} \mathrm{size}(c(e))$ and $\mathrm{size}(p/q) = \log_2[(1 + |p|)(1 + |q|)]$.
Separation Problem (SP): given $G = (V, E)$, $\mathcal{F}_G$, and $x \in \mathbb{Q}^E$, decide if $x \in \mathrm{convhull}(\{\chi^F \mid F \in \mathcal{F}_G\})$ and, if not, find a separating hyperplane. ($\chi^F$ denotes the incidence vector in $\mathbb{Q}^E$ of $F \subseteq E$.)
34 / 143

The ellipsoid method: discussion (3)
Theorem: the optimization problem OP is polynomially solvable if and only if the separation problem SP is polynomially solvable.
The ellipsoid method does not give a practical method. The above theorem has been used to prove that some combinatorial problems admit polynomial-time algorithms, giving a motivation to find a practical one.
One drawback of the ellipsoid method is that the number of ellipsoids constructed depends on the size of the objective vector $c$. This is not very attractive in practice: it would be preferable for the size of $c$ to influence only the sizes of the numbers occurring during the algorithm, but not the number of arithmetic operations to be performed.
35 / 143

Total Unimodularity
Totally unimodular matrices
Definition: a matrix $A$ is totally unimodular if each square submatrix $M$ of $A$ (obtained by deleting some rows and columns) satisfies $\det(M) \in \{-1, 0, 1\}$.
$\Rightarrow$ each entry of a totally unimodular matrix belongs to $\{-1, 0, 1\}$.
Theorem (Hoffman and Kruskal 1956): let $A \in \{-1, 0, 1\}^{m \times n}$ be a totally unimodular matrix and let $b \in \mathbb{Z}^m$. Then the polyhedron $P = \{x \mid Ax \le b\}$ is integral.
Proof. Let $F = \{x \mid A'x = b'\}$ be a minimal face of $P$, where $A'x \le b'$ is a subsystem of $Ax \le b$ with $A'$ of full row rank. W.l.o.g. we can suppose $A' = [A_1\ A_2]$, with $A_1$ nonsingular. Then $A_1^{-1}$ is an integral matrix (by Cramer's rule and since $\det(A_1) \in \{-1, 1\}$). Hence
$$x = \begin{pmatrix} A_1^{-1} b' \\ 0 \end{pmatrix}$$
is an integral vector in $F$.
36 / 143

TU characterization (Hoffman and Kruskal)
An integral $m \times n$ matrix $A$ is totally unimodular if and only if, for each $b \in \mathbb{Z}^m$, each vertex of the polyhedron $\{x \mid x \ge 0, Ax \le b\}$ is integral.
An extension of the HK theorem
A polyhedron $P$ in $\mathbb{R}^n$ has the integer decomposition property if for each $k \in \mathbb{N}$, $k > 0$, and for each integral vector $z$ in $kP$ ($= \{kx \mid x \in P\}$), there exist integral vectors $x_1, \dots, x_k$ in $P$ so that $z = x_1 + \dots + x_k$.
Each polyhedron with the integer decomposition property is integral.
37 / 143

Theorem (Baum and Trotter 1977)
Let $A$ be a totally unimodular $m \times n$ matrix and $b \in \mathbb{Z}^m$. Then the polyhedron $P := \{x \mid Ax \le b\}$ has the integer decomposition property.
Proof. Let $z \in kP \cap \mathbb{Z}^n$. Induction on $k \in \mathbb{N}$, $k > 0$. Trivial for $k = 1$. In the inductive step, we show that $z = x_1 + \dots + x_k$ for integral vectors $x_1, \dots, x_k$ in $P$.
By the HK theorem (on slide 36), there exists an integral vector $x_k$ in the polyhedron $\{x \mid Ax \le b,\ -Ax \le (k - 1)b - Az\}$, since (i) the constraint matrix $\begin{pmatrix} A \\ -A \end{pmatrix}$ is totally unimodular, (ii) the right-hand side vector $\begin{pmatrix} b \\ (k - 1)b - Az \end{pmatrix}$ is integral, and (iii) the polyhedron is not empty as it contains $k^{-1} z$.
Then $z - x_k \in (k - 1)P$, hence by induction, $z = x_1 + \dots + x_k$.
38 / 143

Total Unimodularity Characterizations
Theorem: let $A$ be a matrix with entries 0, 1 and $-1$. The following are equivalent:
(i) $A$ is totally unimodular, i.e. each square submatrix has determinant in $\{-1, 0, 1\}$.
(ii) (Ghouila-Houri) Each set of columns of $A$ can be split into two parts s.t. the sum of the columns in one part minus the sum of the columns in the other part is a vector with entries in $\{-1, 0, 1\}$.
(iii) (Camion) Each nonsingular submatrix has a row with an odd number of nonzero components.
(iv) (Camion) The sum of the entries in any square submatrix of $A$ with even row and column sums is divisible by four.
(v) (Gomory) No square submatrix of $A$ has determinant $+2$ or $-2$.
The deepest characterization is due to Seymour (1980), implying a polynomial-time recognition algorithm.
39 / 143

Application: Bipartite graphs
If $G = (V, E)$ is a bipartite graph, then its incidence matrix $A \in \{0, 1\}^{|V| \times |E|}$ is totally unimodular: any square submatrix $B$ of $A$ either has a column with at most one 1 (in which case $\det(B) \in \{-1, 0, 1\}$ by induction), or has two 1's in each column (in which case $\det(B) = 0$, by the bipartiteness of $G$).
In fact the incidence matrix of a graph $G$ is totally unimodular if and only if $G$ is bipartite.
We will consider next some of the consequences of the total unimodularity of the incidence matrix of a bipartite graph.
40 / 143

Total Unimodularity – Bipartite graphs
Matching polytope of a bipartite graph
Let $G = (V, E)$ be a bipartite graph.
Its matching polytope is $\mathrm{convhull}\{\chi^M \mid M \text{ matching in } G\} \subseteq \mathbb{R}^E_+$.
This polytope is equal to the set of all vectors in $\mathbb{R}^E$ satisfying:
(i) $x_e \ge 0$ for $e \in E$
(ii) $\sum_{e \ni v} x_e \le 1$ for $v \in V$
Clearly, for a matching $M$, $\chi^M$ satisfies (i) and (ii) above, hence the matching polytope is contained in the polyhedron described by (i) and (ii).
Conversely, by the HK theorem, the polyhedron determined by (i) and (ii) is integral and, clearly, an integral vector satisfying (i) and (ii) must be equal to $\chi^M$ for some matching $M$ of $G$.
41 / 143

Matching polytope of a bipartite graph
The matching polytope of $G = (V, E)$ has dimension $|E|$.
Each inequality in (ii) is facet-determining, except if $G$ has a vertex of degree at most 1.
$\chi^M$ and $\chi^{M'}$ are adjacent vertices in the matching polytope if and only if $M \Delta M'$ is a path or a circuit.
$\Rightarrow$ the matching polytope has diameter at most $\nu(G)$ (the maximum cardinality of a matching in $G$).
For a bipartite graph $G = (V, E)$, $c : E \to \mathbb{R}_+$, and $A$ the incidence matrix of $G$:
maximum weight of a matching $= \max\{c^T x \mid x \ge 0, Ax \le \mathbf{1}\}$
$\nu(G) = \max\{\mathbf{1}^T x \mid x \ge 0, Ax \le \mathbf{1}\}$
42 / 143

Node-cover polytope of a bipartite graph
Node-cover in $G = (V, E)$: $N \subseteq V$ s.t. $N \cap e \ne \emptyset$ $\forall e \in E$.
$\tau(G)$: the minimum cardinality of a node-cover in $G$.
Node-cover polytope of the bipartite graph $G = (V, E)$: $\mathrm{convhull}\{\chi^N \mid N \text{ node-cover in } G\} \subseteq \mathbb{R}^V_+$.
Applying the HK theorem, we obtain that the node-cover polytope of the bipartite graph $G = (V, E)$ is equal to the polyhedron:
(i) $0 \le y_v \le 1$ for $v \in V$
(ii) $y_v + y_w \ge 1$ for $\{v, w\} \in E$
For a bipartite graph $G = (V, E)$, $w : V \to \mathbb{R}_+$, and $A$ the incidence matrix of $G$:
minimum weight of a node-cover $= \min\{w^T y \mid y \ge 0, y^T A \ge \mathbf{1}^T\}$
$\tau(G) = \min\{\mathbf{1}^T y \mid y \ge 0, y^T A \ge \mathbf{1}^T\}$
43 / 143

Node-cover polytope of a bipartite graph
By linear programming duality, the last problems on the previous two slides have equal optimum values, hence we obtain
König's Matching Theorem: $\nu(G) = \tau(G)$ for bipartite $G$.
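König's Matching Theorem can be checked on a small instance. The following is a minimal sketch: the helper names `max_matching` and `min_node_cover` and the example graph are made up for this illustration, the matching is computed by augmenting paths, and the node cover is found by brute force, which is only reasonable for tiny graphs.

```python
from itertools import combinations

def max_matching(left, edges):
    """Size of a maximum matching in a bipartite graph via augmenting paths."""
    adj = {u: [w for (v, w) in edges if v == u] for u in left}
    mate = {}  # right vertex -> left vertex it is currently matched to

    def augment(u, seen):
        # Try to match u, re-matching earlier left vertices if necessary.
        for w in adj[u]:
            if w in seen:
                continue
            seen.add(w)
            if w not in mate or augment(mate[w], seen):
                mate[w] = u
                return True
        return False

    return sum(augment(u, set()) for u in left)

def min_node_cover(left, right, edges):
    """Size of a minimum node cover, by brute force over vertex subsets."""
    nodes = list(left) + list(right)
    for k in range(len(nodes) + 1):
        for cover in combinations(nodes, k):
            chosen = set(cover)
            if all(v in chosen or w in chosen for v, w in edges):
                return k

left, right = ["a", "b", "c"], ["x", "y", "z"]
edges = [("a", "x"), ("a", "y"), ("b", "x"), ("c", "x")]
nu = max_matching(left, edges)
tau = min_node_cover(left, right, edges)
```

Here `nu` and `tau` are both 2: for instance the matching {a-y, b-x} and the node cover {a, x}.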
By the theorem on slide 38, the matching polytope $P$ of $G$ has the integer decomposition property. If $k := \Delta(G)$ is the maximum degree of a vertex in $G$, then $\mathbf{1} \in \mathbb{R}^E$ belongs to $kP$, and hence is the sum of $k$ integer vectors in $P$. Each of these vectors is the incidence vector of a matching in $G$. It follows that $E$ can be partitioned into $\Delta(G)$ matchings. So, we have:
König's Edge-Coloring Theorem: the edge-coloring number $\chi'(G)$ of a bipartite graph $G$ is equal to its maximum degree $\Delta(G)$.
44 / 143

Perfect matching polytope of a bipartite graph
Perfect matching in $G = (V, E)$: a matching saturating each vertex.
Perfect matching polytope of the bipartite graph $G = (V, E)$: $\mathrm{convhull}\{\chi^M \mid M \text{ perfect matching in } G\} \subseteq \mathbb{R}^E_+$.
It is a face of the matching polytope. $\Rightarrow$ it is determined by:
(i) $x_e \ge 0$ for $e \in E$
(ii) $\sum_{e \ni v} x_e = 1$ for $v \in V$
This is equivalent to a theorem of Birkhoff (1946): each doubly stochastic matrix is a convex combination of permutation matrices.
$\chi^M$ and $\chi^{M'}$ are adjacent vertices in the perfect matching polytope if and only if $M \Delta M'$ is a circuit.
$\Rightarrow$ the perfect matching polytope has diameter at most $\frac{1}{2}|V|$.
45 / 143

The assignment polytope
The perfect matching polytope of $K_{n,n}$. Equivalently: the polytope in $\mathbb{R}^{n \times n}$ of all matrices $(x_{ij})_{i,j=1}^n$ s.t.
(i) $x_{ij} \ge 0$ for $i, j = 1, \dots, n$
(ii) $\sum_{i=1}^n x_{ij} = 1$ for $j = 1, \dots, n$
(iii) $\sum_{j=1}^n x_{ij} = 1$ for $i = 1, \dots, n$
(such matrices are called doubly stochastic).
Balinski and Russakoff (1974): for $n \ge 4$ assignment polytopes have diameter 2.
46 / 143

The stable-set polytope of a bipartite graph
Stable-set in a graph $G$: a set of pairwise non-adjacent vertices.
The stable-set polytope of a graph $G = (V, E)$: $\mathrm{convhull}\{\chi^C \mid C \text{ stable-set in } G\} \subseteq \mathbb{R}^V_+$.
By the HK theorem, for a bipartite graph $G = (V, E)$ this polytope is equal to the set of all vectors in $\mathbb{R}^V$ satisfying:
(i) $0 \le y_v \le 1$ for $v \in V$
(ii) $y_v + y_w \le 1$ for $\{v, w\} \in E$
For a bipartite graph $G = (V, E)$, $w : V \to \mathbb{R}_+$, and $A$ the incidence matrix of $G$:
maximum weight of a stable-set $= \max\{w^T y \mid y \ge 0, y^T A \le \mathbf{1}^T\}$
$\alpha(G) = \max\{\mathbf{1}^T y \mid y \ge 0, y^T A \le \mathbf{1}^T\}$
47 / 143

The edge-cover polytope of a bipartite graph
Edge-cover in a graph $G = (V, E)$: a set $F \subseteq E$ s.t. $\bigcup_{e \in F} e = V$.
The edge-cover polytope of a graph $G = (V, E)$: $\mathrm{convhull}\{\chi^F \mid F \text{ edge-cover in } G\} \subseteq \mathbb{R}^E_+$.
By the HK theorem, for a bipartite graph $G = (V, E)$ with no isolated vertex, this polytope is equal to the set of all vectors in $\mathbb{R}^E$ satisfying:
(i) $0 \le x_e \le 1$ for $e \in E$
(ii) $\sum_{e \ni v} x_e \ge 1$ for $v \in V$
For a bipartite graph $G = (V, E)$, $w : E \to \mathbb{R}_+$, and $A$ the incidence matrix of $G$:
minimum weight of an edge-cover $= \min\{w^T x \mid x \ge 0, Ax \ge \mathbf{1}\}$
$\rho(G) = \min\{\mathbf{1}^T x \mid x \ge 0, Ax \ge \mathbf{1}\}$
48 / 143

The edge-cover polytope of a bipartite graph
By linear programming duality, the last problems on the previous two slides have equal optimum values, hence we obtain
König's Covering Theorem: $\alpha(G) = \rho(G)$ for bipartite $G$.
By the theorem on slide 38, the edge-cover polytope of a bipartite graph $G$ has the integer decomposition property.
$\Rightarrow$ Gupta's Theorem (1967): the maximum number of disjoint edge covers in a bipartite graph is equal to its minimum degree.
For a bipartite graph $G = (V, E)$, $w \in \mathbb{Z}^E$, $b \in \mathbb{Z}^V$, and $A$ the incidence matrix of $G$, by LP duality we have:
(1) $\max\{w^T x \mid x \ge 0, Ax \le b\} = \min\{y^T b \mid y \ge 0, y^T A \ge w^T\}$
(2) $\min\{w^T x \mid x \ge 0, Ax \ge b\} = \max\{y^T b \mid y \ge 0, y^T A \le w^T\}$
49 / 143

The edge-cover polytope of a bipartite graph
By the HK theorem these programs have integral optimal solutions.
For $b = \mathbf{1}$ we obtain Egerváry's min-max relations (1931):
- the maximum weight of a matching is equal to the minimum value of $\sum_{v \in V} y_v$, where $y \in \mathbb{Z}^V_+$ and $y_v + y_u \ge w_e$ $\forall e = \{u, v\} \in E$;
- the minimum weight of an edge-cover is equal to the maximum value of $\sum_{v \in V} y_v$, where $y \in \mathbb{Z}^V_+$ and $y_v + y_u \le w_e$ $\forall e = \{u, v\} \in E$.
50 / 143

Application: digraphs
Let $M$ be the $|V| \times |A|$ incidence matrix of a digraph $D = (V, A)$:
$$m_{va} = \begin{cases} 1 & \text{if } v \text{ is the tail of } a, \\ -1 & \text{if } v \text{ is the head of } a, \\ 0 & \text{otherwise.} \end{cases}$$
$M$ is totally unimodular: any square submatrix $B$ of $M$ either has a column with at most one nonzero entry (in which case $\det(B) \in \{-1, 0, 1\}$ by induction), or has exactly one 1 and one $-1$ in each column (in which case $\det(B) = 0$, by adding its rows).
We will consider next some of the consequences of the total unimodularity of the incidence matrix of a digraph.
51 / 143

Total Unimodularity – Digraphs
The s-t-flow polytope
Given a digraph $D = (V, A)$, arc capacities $c_a \in [0, \infty]$ $\forall a \in A$, a source node $s$ and a sink node $t$.
The $s$-$t$-flow polytope: the set of all vectors $x$ in $\mathbb{R}^A$ satisfying
$$0 \le x_a \le c(a) \quad \forall a \in A$$
$$\sum_{a \in \delta^-(v)} x_a = \sum_{a \in \delta^+(v)} x_a \quad \forall v \in V - \{s, t\}$$
A vector $x$ in this polytope is called an $s$-$t$-flow (under $c$).
By the total unimodularity of the incidence matrix of $D$, if $c$ is integral, the maximum value ($:= \sum_{a \in \delta^+(s)} x_a - \sum_{a \in \delta^-(s)} x_a$) of an $s$-$t$-flow under $c$ is attained by an integral vector (Dantzig, 1951).
52 / 143

The s-t-flow polytope: Max-Flow Min-Cut Theorem
By LP duality, the maximum value of an $s$-$t$-flow under $c$ is equal to the minimum value of $\sum_{a \in A} y_a c_a$, where $y \in \mathbb{R}^A_+$ is such that there is $z \in \mathbb{R}^V$ satisfying:
$$y_a - z_v + z_u \ge 0 \quad \forall a = (v, u) \in A$$
$$z_s = 1,\quad z_t = 0$$
By the total unimodularity of the incidence matrix of $D$, we may take the minimizing $y$, $z$ to be integral.
Let $W := \{v \in V \mid z_v \ge 1\}$. Then for any $a = (v, u) \in \delta^+(W)$ we have $y_a \ge z_v - z_u \ge 1$, and hence
$$\sum_{a \in A} y_a c_a \ge \sum_{a \in \delta^+(W)} y_a c_a \ge \sum_{a \in \delta^+(W)} c_a.$$
So the maximum flow value is not less than the capacity of the cut $\delta^+(W)$. Since it cannot be larger, we obtain the Ford and Fulkerson Max-Flow Min-Cut Theorem.
53 / 143

The shortest-path polytope
Given a digraph $D = (V, A)$, a source node $s$ and a sink node $t$.
The shortest-path polytope: the convex hull of all incidence vectors $\chi^P$ of subsets $P$ of $A$ that are a disjoint union of an $s$-$t$ path and some circuits.
By the total unimodularity of the incidence matrix of $D$, this polytope is equal to the set of all vectors $x$ in $\mathbb{R}^A$ satisfying
$$0 \le x_a \le 1 \quad \forall a \in A$$
$$\sum_{a \in \delta^-(v)} x_a = \sum_{a \in \delta^+(v)} x_a \quad \forall v \in V - \{s, t\}$$
$$\sum_{a \in \delta^+(s)} x_a - \sum_{a \in \delta^-(s)} x_a = 1$$
So, it is the intersection of an $s$-$t$-flow polytope with the hyperplane $\sum_{a \in \delta^+(s)} x_a - \sum_{a \in \delta^-(s)} x_a = 1$.
54 / 143

The circulation polytope (1)
Given a digraph $D = (V, A)$ and $l, u \in \mathbb{R}^A$.
The circulation polytope: the set of all circulations between $l$ and $u$, that is, vectors $x \in \mathbb{R}^A$ satisfying
$$l_a \le x_a \le u_a \quad \forall a \in A$$
$$Mx = 0$$
where $M$ is the incidence matrix of $D$.
By the total unimodularity of $M$, if $l$ and $u$ are integral, then the circulation polytope is integral. So if $l$ and $u$ are integral, and there exists a circulation, then there exists an integral circulation.
By Farkas's lemma, the circulation polytope is non-empty if and only if there are no vectors $z, w \in \mathbb{R}^A$, $y \in \mathbb{R}^V$ satisfying:
55 / 143

The circulation polytope (2)
$$z, w \ge 0$$
$$z - w + M^T y = 0$$
$$u^T z - l^T w < 0$$
Suppose that $l \le u$ and the above system has a solution. Then there is also a solution with $0 \le y \le 1$, and by the total unimodularity of $M$, there is a solution $z, w, y$ with $y$ a $\{0, 1\}$-vector. Also we may assume that $z_a w_a = 0$ for each $a \in A$. Then, for $W := \{v \in V \mid y_v = 1\}$,
$$\sum_{a \in \delta^-(W)} u_a - \sum_{a \in \delta^+(W)} l_a = u^T z - l^T w < 0.$$
Thus we have Hoffman's Circulation Theorem (1960): there exists a circulation $x$ satisfying $l \le x \le u$ if and only if there is no subset $W$ of $V$ with $\sum_{a \in \delta^-(W)} u_a < \sum_{a \in \delta^+(W)} l_a$.
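The cut condition in Hoffman's Circulation Theorem can be tested directly on small digraphs. The sketch below is a brute-force illustration only: the function name `violated_cut` and the arc format `(tail, head, lower, upper)` are assumptions of this example, $l \le u$ is assumed on every arc, and the enumeration of all subsets $W$ is exponential in $|V|$.

```python
from itertools import combinations

def violated_cut(nodes, arcs):
    """Search for W with (sum of u over arcs entering W) < (sum of l over arcs leaving W).

    arcs: list of (tail, head, lower, upper), assuming lower <= upper.
    Returns such a set W, or None if every W satisfies the cut condition,
    in which case Hoffman's theorem says a circulation with l <= x <= u exists.
    """
    for k in range(1, len(nodes)):
        for subset in combinations(nodes, k):
            W = set(subset)
            # Upper bounds on arcs entering W, i.e. arcs in delta^-(W).
            upper_in = sum(u for (v, w, l, u) in arcs if v not in W and w in W)
            # Lower bounds on arcs leaving W, i.e. arcs in delta^+(W).
            lower_out = sum(l for (v, w, l, u) in arcs if v in W and w not in W)
            if upper_in < lower_out:
                return W
    return None

# Directed triangle with 1 <= x_a <= 2 on every arc: x = (1, 1, 1) is a circulation.
feasible = violated_cut([0, 1, 2], [(0, 1, 1, 2), (1, 2, 1, 2), (2, 0, 1, 2)])
# Arc 0->1 needs at least 2 units, but arc 1->2 carries at most 1.
infeasible = violated_cut([0, 1, 2], [(0, 1, 2, 3), (1, 2, 0, 1), (2, 0, 0, 3)])
```

In the second instance the set $W = \{0, 2\}$ is returned: the only arc entering $W$ has upper bound 1, while the only arc leaving $W$ has lower bound 2.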
56 / 143

The circulation polytope (3)
Let $D = (V, A)$ be a digraph and $T \subseteq A$ be a spanning tree of $D$.
Consider the $(A - T) \times T$ matrix $N$ defined for $a' = (v, w) \in A - T$ and $a \in T$ by:
$$N_{a', a} = \begin{cases} 0 & \text{if } a \text{ does not occur in the } v\text{-}w \text{ path in } T, \\ -1 & \text{if } a \text{ occurs forward in the } v\text{-}w \text{ path in } T, \\ +1 & \text{if } a \text{ occurs backward in the } v\text{-}w \text{ path in } T. \end{cases}$$
Then $N$ is totally unimodular (using, e.g., the Ghouila-Houri characterization (ii) on slide 39).
A vector $x = \begin{pmatrix} x' \\ x'' \end{pmatrix} \in \mathbb{R}^{A - T} \times \mathbb{R}^T$ satisfies $Mx = 0$ (where $M$ is the incidence matrix of $D$) if and only if $x'' = N^T x'$.
Hence the circulation polytope can be equivalently written as
$$l_a \le x'_a \le u_a \quad \forall a \in A - T$$
$$l_a \le (N^T x')_a \le u_a \quad \forall a \in T$$
57 / 143

The circulation polytope (4)
By the total unimodularity of $N$, the above polytope has integer vertices if all $l_a$ and $u_a$ are integers.
A nice special case is given by the $\{0, 1\}$-matrices with the consecutive ones property: in each column the 1's form an interval (fixing some ordering of the rows).
This special case arises when $T$ is a directed path and each arc in $A - T$ forms a directed circuit with some subpath of $T$ (Hoffman, 1979).
58 / 143

Total Dual Integrality (TDI)
TDI is a powerful technique for deriving min-max relations and the integrality of polyhedra. It is based on the following result.
Theorem (Edmonds and Giles 1977): a rational polyhedron $P$ is integral if and only if each rational supporting hyperplane of P contains integral vectors.
Proof. Since the intersection of a supporting hyperplane with $P$ is a face of $P$, necessity is trivial. To prove sufficiency, suppose that each rational supporting hyperplane of $P$ contains integral vectors. Let $P = \{x \mid Ax \le b\}$ with $A$ and $b$ integral. Let $F = \{x \mid A'x = b'\}$ be a minimal face of $P$ with $A'x \le b'$ a subsystem of $Ax \le b$.
If $F$ does not contain an integral vector, it follows that there is a vector $y$ such that $c^T = y^T A'$ is an integral vector, while $\delta := y^T b'$ is not an integer (this follows, e.g., from the Hermite Normal Form Theorem). We may assume that all entries of $y$ are nonnegative (we may replace each entry $y_i$ by $y_i - \lfloor y_i \rfloor$). Now $H := \{x \mid c^T x = \delta\}$ is a supporting hyperplane of $P$ without any integral vector.
59 / 143
Total Dual Integrality (TDI)
LP problem: (1) $\max\{c^T x \mid Ax \le b\}$, with rational entries in $A$, $b$, $c$.
Corollary
The following are equivalent:
(i) The maximum value in (1) is an integer for each integral vector $c$ for which the maximum is finite.
(ii) The maximum in (1) is attained by an integral optimum solution for each rational vector $c$ for which the maximum is finite.
(iii) The polyhedron $\{x \mid Ax \le b\}$ is integral.
Now, consider the LP-duality equation
$$\max\{c^T x \mid Ax \le b\} = \min\{y^T b \mid y \ge 0,\ y^T A = c^T\}.$$
We may derive that the maximum value is an integer if we know that the minimum has an integral optimum solution and $b$ is integral.
60 / 143
Total Dual Integrality (TDI)
Definition of TDI – Edmonds and Giles 1977
A system $Ax \le b$ of linear inequalities is totally dual integral (TDI) if the minimum in $\min\{y^T b \mid y \ge 0,\ y^T A = c^T\}$ is attained by an integral optimum solution $y$, for each integral vector $c$ such that the minimum is finite.
Corollary
Let $Ax \le b$ be a system of linear inequalities with $A$ rational and $b$ integral. If $Ax \le b$ is TDI then $\{x \mid Ax \le b\}$ is integral.
[Photo: Jack Edmonds, 2009]
61 / 143
TDI Applications
Arborescences
Let $D = (V, A)$ be a digraph and $r \in V$ be a fixed vertex of $D$. An $r$-arborescence is a set $A'$ of $|V| - 1$ arcs forming a spanning tree such that each vertex $v \ne r$ is entered by exactly one arc in $A'$ (so for any vertex $v$ there is a unique directed path in $A'$ from $r$ to $v$). An $r$-cut is an arc set of the form $\delta^-(U)$ for some $U \subseteq V - \{r\}$, $U \ne \emptyset$. $r$-arborescences are minimal (w.r.t. $\subseteq$) sets of arcs intersecting all $r$-cuts. $r$-cuts are minimal (w.r.t.
⊆) sets of arcs intersecting all $r$-arborescences.
Fulkerson's Optimum Arborescence Theorem
For any "length" function $l : A \to \mathbb{Z}_+$, the minimum length of an $r$-arborescence is equal to the maximum number $t$ of $r$-cuts $C_1, \dots, C_t$ (repetition allowed) such that no arc $a \in A$ is in more than $l(a)$ of these cuts.
62 / 143
TDI Applications
Fulkerson's Optimum Arborescence Theorem
This result can be formulated in polyhedral terms as follows. Let $C$ be the matrix whose rows are the incidence vectors of all $r$-cuts. Hence the columns of $C$ are indexed by $A$ and the rows by the collection $\mathcal{H} := \{U \mid U \ne \emptyset,\ U \subseteq V - \{r\}\}$. Then the theorem is equivalent to both optima in the LP-duality equation
$$\min\{l^T x \mid x \ge 0,\ Cx \ge 1\} = \max\{y^T 1 \mid y \ge 0,\ y^T C \le l^T\}$$
having integral solutions, for each $l \in \mathbb{Z}^A_+$. (The dual constraint is an inequality because the primal variables are sign-constrained.) So, in order to prove the theorem, it suffices to show that the above maximum has an integral optimum solution for each $l \in \mathbb{Z}^A_+$, i.e., that the system $x \ge 0$, $Cx \ge 1$ is TDI.
63 / 143
TDI Applications
Fulkerson's Optimum Arborescence Theorem
Proof of the Optimum Arborescence Theorem. Since $C$ is generally not totally unimodular, we find a totally unimodular submatrix $C'$ of $C$ (consisting of rows of $C$) such that
$$\max\{y^T 1 \mid y \ge 0,\ y^T C \le l^T\} = \max\{z^T 1 \mid z \ge 0,\ z^T C' \le l^T\}.$$
By the total unimodularity of $C'$, the second maximum is attained by an integral optimal solution $z$; $z$ can be extended to an integral optimal solution $y$ for the first maximum (adding 0's in the appropriate positions).
Construction of $C'$: A subcollection $\mathcal{F}$ of $\mathcal{H}$ is called laminar if for all $T, U \in \mathcal{F}$ we have $T \subseteq U$ or $U \subseteq T$ or $U \cap T = \emptyset$. If $C'$ is the matrix consisting of the rows of $C$ indexed by some laminar family $\mathcal{F}$ then $C'$ is totally unimodular.
64 / 143
TDI Applications
Proof of Fulkerson's Optimum Arborescence Theorem
Let $\mathcal{F} \subseteq \mathcal{H}$ be a laminar family and $C'$ its corresponding submatrix of $C$. Let $\mathcal{G}$ be a subcollection of $\mathcal{F}$, that is, a set of rows of $C'$. For each $U \in \mathcal{G}$ let its "height" $h(U)$ be the number of $T$ in $\mathcal{G}$ s.t.
$T \subseteq U$. Split $\mathcal{G}$ into $\mathcal{G}_{even}$ and $\mathcal{G}_{odd}$ according as the height $h(U)$ is even or odd. Since $\mathcal{G}$ is laminar, for any arc $a \in A$, the number of sets $U$ in $\mathcal{G}_{even}$ with $a \in \delta^-(U)$ and the number of sets $U$ in $\mathcal{G}_{odd}$ with $a \in \delta^-(U)$ differ by at most 1. Hence we can split the rows corresponding to $\mathcal{G}$ into two classes fulfilling Ghouila-Houri's condition on slide 39. So $C'$ is totally unimodular.
Let $l \in \mathbb{Z}^A_+$. Let $y$ be an optimal solution of $\max\{y^T 1 \mid y \ge 0,\ y^T C \le l^T\}$ for which $\sum_{U \in \mathcal{H}} y_U \cdot |U| \cdot |V - U|$ is minimum (there is such a $y$ by a compactness argument). Let $\mathcal{F} := \{U \mid U \in \mathcal{H},\ y_U > 0\}$. Then $\mathcal{F}$ is laminar.
65 / 143
TDI Applications
Proof of Fulkerson's Optimum Arborescence Theorem
Indeed, suppose that there exist $T, U \in \mathcal{F}$ with $T \not\subseteq U \not\subseteq T$ and $T \cap U \ne \emptyset$. Let $\varepsilon := \min\{y_T, y_U\} > 0$ and reset
$$y_T := y_T - \varepsilon, \quad y_U := y_U - \varepsilon, \quad y_{T \cap U} := y_{T \cap U} + \varepsilon, \quad y_{T \cup U} := y_{T \cup U} + \varepsilon,$$
while $y$ does not change in the other coordinates. By this resetting, $y^T C$ does not increase in any coordinate (since $\varepsilon \cdot \chi^{\delta^-(T)} + \varepsilon \cdot \chi^{\delta^-(U)} \ge \varepsilon \cdot \chi^{\delta^-(T \cap U)} + \varepsilon \cdot \chi^{\delta^-(T \cup U)}$), while $y^T 1$ does not change. However, $\sum_{U \in \mathcal{H}} y_U \cdot |U| \cdot |V - U|$ decreases, contradicting the choice of $y$. Hence $\mathcal{F}$ is laminar, and the Optimum Arborescence Theorem follows, since we have proved that the system $x \ge 0$, $Cx \ge 1$ is TDI.
66 / 143
TDI Applications
r-arborescences polytope
From the above proof, we obtain that the $r$-arborescences polytope of a digraph $D = (V, A)$, $r \in V$, $\mathrm{convhull}(\{\chi^P \mid P \text{ } r\text{-arborescence}\})$, is described by
$$0 \le x_a \le 1 \quad \forall a \in A,$$
$$\sum_{a \in \delta^-(U)} x_a \ge 1 \quad \forall U \subseteq V - \{r\},\ U \ne \emptyset.$$
This description gives, via the ellipsoid method, a polynomial time algorithm to find a minimum length $r$-arborescence. Indeed, given $x \in \mathbb{Q}^A$ we first test whether $0 \le x_a \le 1$ for each $a$; if $x_a < 0$ or $x_a > 1$ for some $a$, we have a separating hyperplane. Otherwise, consider $x$ as a capacity function and find (with a maximum flow algorithm) a minimum capacity $r$-cut $C$. If $C$ has capacity at least 1, then $x$ belongs to the $r$-arborescences polytope; otherwise $C$ gives a separating hyperplane.
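The separation test just described can be sketched as follows. This is an illustrative sketch, not the ellipsoid method itself: it only tests the inequality system above, finding a minimum $r$-cut as the smallest of the minimum $r$-$v$ cuts over all $v \ne r$, each computed with a plain Edmonds-Karp maximum flow. The function names `min_cut` and `separate`, the dictionary representation of $x$ keyed by `(tail, head)` pairs, and the three-node digraph are all assumptions made for the example.

```python
from collections import deque
from fractions import Fraction


def min_cut(nodes, cap, s, t):
    """Edmonds-Karp max flow; returns (value, S), S = source side of a min s-t cut."""
    res = {}
    for (u, v), c in cap.items():
        res[(u, v)] = res.get((u, v), 0) + c
        res.setdefault((v, u), 0)          # reverse residual arc
    adj = {v: [] for v in nodes}
    for (u, v) in res:
        adj[u].append(v)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:   # BFS in the residual graph
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                # no augmenting path: flow is maximum
            return value, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= push
            res[(v, u)] += push
        value += push


def separate(nodes, x, r):
    """Return None if x satisfies the system, else a violated inequality."""
    for a, xa in x.items():
        if xa < 0:
            return ("x_a >= 0 violated", a)
        if xa > 1:
            return ("x_a <= 1 violated", a)
    best_value, best_U = None, None
    for v in nodes:
        if v == r:
            continue
        value, S = min_cut(nodes, x, r, v)
        if best_value is None or value < best_value:
            best_value, best_U = value, set(nodes) - S
    if best_value is not None and best_value < 1:
        return ("r-cut violated", best_U)  # sum over delta^-(U) of x_a is below 1
    return None


# Hypothetical digraph with arcs r->a, r->b, a->b, b->a.
nodes = ["r", "a", "b"]
h, q = Fraction(1, 2), Fraction(1, 4)
arborescence = {("r", "a"): 1, ("r", "b"): 0, ("a", "b"): 1, ("b", "a"): 0}
violating = {("r", "a"): h, ("r", "b"): q, ("a", "b"): h, ("b", "a"): h}
```

Exact `Fraction` arithmetic is used so that the comparison with 1 is not affected by rounding; for the point `violating`, the set $U = \{a, b\}$ is entered by total weight $3/4$.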
67 / 143
TDI Applications
Directed-cut polytope
One can similarly show that, for a digraph $D = (V, A)$, the following system is TDI:
$$0 \le x_a \le 1 \quad \forall a \in A,$$
$$\sum_{a \in \delta^-(U)} x_a \ge 1 \quad \forall U \subsetneq V,\ U \ne \emptyset, \text{ with } \delta^+(U) = \emptyset.$$
This is equivalent to the Lucchesi and Younger Theorem. A directed cut is a set of arcs of the form $\delta^-(U)$, with $\emptyset \ne U \ne V$ and $\delta^+(U) = \emptyset$. A directed-cut covering is a set of arcs intersecting each directed cut. Equivalently, it is a set of arcs whose contraction makes the digraph strongly connected.
68 / 143
TDI Applications
Lucchesi and Younger Theorem
The minimum size of a directed-cut covering in a digraph $D = (V, A)$ is equal to the maximum number of pairwise disjoint directed cuts.
This theorem has a nice self-refining nature: for any "length" function $l : A \to \mathbb{Z}_+$, the minimum length of a directed-cut covering is equal to the maximum number $t$ of directed cuts $C_1, \dots, C_t$ (repetition allowed) such that no arc $a$ is in more than $l(a)$ of these cuts (to derive this from the above theorem, replace each arc $a$ by a path of length $l(a)$). In the weighted form, the theorem is easily seen to be equivalent to the TDI property of the directed-cut system above.
69 / 143
TDI Applications
Polymatroid Intersections
Let $S$ be a finite set. A function $f : 2^S \to \mathbb{R}$ is called submodular if
$$f(T) + f(U) \ge f(T \cup U) + f(T \cap U) \quad \text{for all } T, U \subseteq S.$$
The rank function of any matroid is submodular. A matroid can be defined as a pair $M = (S, \mathcal{I})$ where $\mathcal{I} \subseteq 2^S$, the family of independent sets of $M$, satisfies
(i) $\emptyset \in \mathcal{I}$,
(ii) if $I \in \mathcal{I}$ then $I' \in \mathcal{I}$ for each $I' \subseteq I$, and
(iii) the rank function of $M$, $\rho : 2^S \to \mathbb{Z}_+$ with $\rho(A) = \max\{|I| \mid I \in \mathcal{I},\ I \subseteq A\}$, is submodular.
70 / 143
TDI Applications
Polymatroid Intersections
Let $f_1, f_2$ be two submodular functions on $S$ and consider the following system in the variable $x \in \mathbb{R}^S$:
$$(*) \qquad x_s \ge 0 \ \ \forall s \in S, \qquad \sum_{s \in U} x_s \le f_1(U) \ \ \forall U \subseteq S, \qquad \sum_{s \in U} x_s \le f_2(U) \ \ \forall U \subseteq S.$$
Theorem (Edmonds). The above system $(*)$ is TDI.
71 / 143
TDI Applications
Polymatroid Intersections
Proof.
Let $c \in \mathbb{Z}^S$; the dual of the LP $\max\{c^T x \mid x \text{ satisfies } (*)\}$ is:
$$\min\Big\{\sum_{U \subseteq S} y_U f_1(U) + \sum_{U \subseteq S} z_U f_2(U) \ \Big|\ y, z \in \mathbb{R}^{2^S}_+,\ \sum_{U \subseteq S} (y_U + z_U)\chi^U \ge c\Big\}.$$
We use the uncrossing technique to show that this minimum has an integral solution: let $y, z$ attain this minimum such that $\sum_{U \subseteq S} (y_U + z_U) \cdot |U| \cdot |S - U|$ is as small as possible. Then $\mathcal{F} := \{U \mid y_U > 0\}$ forms a chain with respect to inclusion (the proof, by resetting $y$, is similar to that on slide 66). Similarly, $\mathcal{G} := \{U \mid z_U > 0\}$ forms a chain. Since $y$ and $z$ attain the minimum above and $\mathcal{F}$ and $\mathcal{G}$ form chains with respect to inclusion, it follows that this minimum is
$$\min\Big\{\sum_{U \in \mathcal{F}} y_U f_1(U) + \sum_{U \in \mathcal{G}} z_U f_2(U) \ \Big|\ y \in \mathbb{R}^{\mathcal{F}}_+,\ z \in \mathbb{R}^{\mathcal{G}}_+;\ \sum_{U \in \mathcal{F}} y_U \chi^U + \sum_{U \in \mathcal{G}} z_U \chi^U \ge c\Big\}.$$
The constraint matrix in this new problem is totally unimodular, by Ghouila-Houri's criterion. Hence it has an integral optimal solution $y, z$, which can be extended with 0's to obtain an integral optimal solution for the first minimum.
72 / 143
TDI Applications
Polymatroid Intersections
If $f_1$ and $f_2$ are integer-valued submodular functions, the TDI property of system $(*)$ implies that it defines an integral polyhedron. In particular, if $f_1$ and $f_2$ are the rank functions of the matroids $M_1 = (S, \mathcal{I}_1)$ and $M_2 = (S, \mathcal{I}_2)$, we obtain
Corollary (Edmonds 1970). The polytope $\mathrm{convhull}\{\chi^I \mid I \in \mathcal{I}_1 \cap \mathcal{I}_2\}$ is determined by $(*)$.
Proof. Observe that an integral vector satisfies $(*)$ if and only if it is equal to $\chi^I$ for some $I \in \mathcal{I}_1 \cap \mathcal{I}_2$.
A very special case, $M_1 = M_2$: the independence polytope of a matroid $M = (S, \mathcal{I})$ with rank function $f$, $\mathrm{convhull}\{\chi^I \mid I \in \mathcal{I}\}$, is determined by:
$$x_s \ge 0 \ \ \forall s \in S, \qquad \sum_{s \in U} x_s \le f(U) \ \ \forall U \subseteq S.$$
73 / 143
TDI Applications
Edmonds' Matroid Intersection Theorem
The maximum size of a common independent set of two matroids $(S, \mathcal{I}_1)$ and $(S, \mathcal{I}_2)$ is equal to $\min_{U \subseteq S} [f_1(U) + f_2(S - U)]$, where $f_1$ and $f_2$ are the rank functions of these matroids.
Proof.
By the above corollary, the maximum size of a common independent set is $\max\{1^T x \mid x \text{ satisfies } (*)\}$, and hence, by LP duality and the TDI property of $(*)$, it is equal to
$$\min\Big\{\sum_{U \subseteq S} y_U f_1(U) + \sum_{U \subseteq S} z_U f_2(U) \ \Big|\ y, z \in \mathbb{R}^{2^S}_+,\ \sum_{U \subseteq S} (y_U + z_U)\chi^U \ge 1\Big\},$$
where the minimum is attained by integral $y, z$. It is not difficult to show (using the nonnegativity, monotonicity and submodularity of $f_1$ and $f_2$) that this last minimum is equal to that given in the theorem.
74 / 143
TDI Applications
Matching polytope
The matching polytope of a graph $G = (V, E)$ is $\mathrm{convhull}\{\chi^M \mid M \text{ matching in } G\}$. Edmonds showed that this can be described as the set of all vectors $x \in \mathbb{R}^E$ satisfying
$$(**) \qquad x_e \ge 0 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e \le 1 \ \ \forall v \in V, \qquad \sum_{e \subseteq U} x_e \le \Big\lfloor \tfrac{1}{2}|U| \Big\rfloor \ \ \forall U \subseteq V.$$
Since the integral vectors satisfying $(**)$ are exactly the incidence vectors $\chi^M$ of matchings $M$ of $G$, it suffices to show that $(**)$ determines an integral polyhedron.
Theorem (Cunningham and Marsh, 1978). System $(**)$ is TDI.
75 / 143
TDI Applications
What does it mean that $(**)$ is TDI?
For each $w \in \mathbb{Z}^E$ both optima in the LP-duality equation
$$\max\{w^T x \mid x \text{ satisfies } (**)\} = \min\Big\{\sum_{v \in V} y_v + \sum_{U \subseteq V} z_U \Big\lfloor \tfrac{1}{2}|U| \Big\rfloor \ \Big|\ y \in \mathbb{Z}^V_+,\ z \in \mathbb{Z}^{2^V}_+,\ \forall e \in E: \sum_{v \in e} y_v + \sum_{U \supseteq e} z_U \ge w_e\Big\}$$
are attained by integral optimum solutions. This means that for every undirected graph $G = (V, E)$ and every "weight" function $w : E \to \mathbb{Z}$ we have:
$$(1) \qquad \max\{w(M) \mid M \text{ matching}\} = \min\Big\{\sum_{v \in V} y_v + \sum_{U \subseteq V} z_U \Big\lfloor \tfrac{1}{2}|U| \Big\rfloor \ \Big|\ y \in \mathbb{Z}^V_+,\ z \in \mathbb{Z}^{2^V}_+,\ \forall e \in E: \sum_{v \in e} y_v + \sum_{U \supseteq e} z_U \ge w_e\Big\}.$$
76 / 143
TDI Applications
Proof that $(**)$ is TDI (1)
From (1) it follows that we may suppose that $w$ is nonnegative. Let $\nu_w$ be the maximum value in (1). Since the inequality $\le$ in (1) follows by duality, we show only the inequality $\ge$. Suppose that it does not hold and choose $G$ and $w$ violating (1) with $|V| + |E| + w(E)$ as small as possible. Then $G$ is connected and $w_e \ge 1$ for each edge $e$.
Case 1: there is a vertex $v \in V$ covered by every maximum-weight matching. Let $w'$ be the weight function obtained from $w$ by decreasing the weights of the edges incident to $v$ by 1. Then $\nu_{w'} = \nu_w - 1$. Since $w'(E) < w(E)$, (1) holds for $w'$.
Increasing by 1 the component $y_v$ of an optimal $y$ for $w'$, we obtain that (1) holds for $w$.
Case 2: there is no vertex $v \in V$ covered by every maximum-weight matching. Let $w'$ be the weight function obtained from $w$ by decreasing the weights of all edges by 1. We will show that $\nu_w \ge \nu_{w'} + \lfloor \tfrac{1}{2}|V| \rfloor$. Since $w'(E) < w(E)$, (1) holds for $w'$. Increasing by 1 the component $z_V$ of an optimal $z$ for $w'$, we obtain that (1) holds for $w$.
77 / 143
TDI Applications
Proof that $(**)$ is TDI (2)
Suppose that $\nu_w < \nu_{w'} + \lfloor \tfrac{1}{2}|V| \rfloor$ and let $M$ be a matching with $w'(M) = \nu_{w'}$ such that $w(M)$ is as large as possible. $M$ leaves at least two vertices uncovered, since otherwise $w(M) = w'(M) + \lfloor \tfrac{1}{2}|V| \rfloor$, implying $\nu_w \ge w(M) = w'(M) + \lfloor \tfrac{1}{2}|V| \rfloor = \nu_{w'} + \lfloor \tfrac{1}{2}|V| \rfloor$. Choose $M$ and vertices $u, v$ uncovered by $M$ such that the distance $d(u, v)$ in $G$ is as small as possible. Then $d(u, v) > 1$, since otherwise we could add the edge $\{u, v\}$ to $M$, increasing $w(M)$. Let $t$ be an internal vertex on a shortest path between $u$ and $v$. Let $M'$ be a matching with $w(M') = \nu_w$ not covering $t$. Let $P$ be the edge set of the component of the graph $(V, M \Delta M')$ containing $t$. $P$ is a path not covering both $u$ and $v$. Suppose $u$ is not covered by $P$. Then $M \Delta P$ is a matching with $|M \Delta P| \le |M|$ and
$$w'(M \Delta P) - w'(M) = w(M \Delta P) - |M \Delta P| - w(M) + |M| \ge w(M \Delta P) - w(M) = w(M') - w(M' \Delta P) \ge 0.$$
Hence $\nu_{w'} = w'(M \Delta P)$ and $w(M \Delta P) \ge w(M)$. But $M \Delta P$ does not cover $t$ and $u$, and $d(u, t) < d(u, v)$, contradicting the choice of $M$, $u$, and $v$.
78 / 143
TDI Applications
Since $(**)$ is TDI it follows:
Edmonds' Matching Polyhedron Theorem
The matching polytope of a graph is equal to the polyhedron determined by $(**)$.
Perfect matching polytope of a graph $G$
$\mathrm{convhull}(\{\chi^M \mid M \text{ perfect matching in } G\})$ is determined by
$$x_e \ge 0 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e = 1 \ \ \forall v \in V, \qquad \sum_{e \in \delta(U)} x_e \ge 1 \ \ \forall U \subseteq V,\ |U| \text{ odd}.$$
Note that the last two groups of restrictions imply the last group of restrictions in $(**)$.
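The implication in the last note can be made explicit by a short counting argument. The derivation below is a sketch added for illustration (it is not taken from the slides); the even case also uses the nonnegativity constraints $x_e \ge 0$.

```latex
% Summing the degree equations over the vertices of U counts every edge
% inside U twice and every edge of \delta(U) once.
\[
  |U| \;=\; \sum_{v \in U} \sum_{e \ni v} x_e
      \;=\; 2 \sum_{e \subseteq U} x_e \;+\; \sum_{e \in \delta(U)} x_e .
\]
% |U| odd: use the odd-cut inequality \sum_{e \in \delta(U)} x_e \ge 1.
\[
  \sum_{e \subseteq U} x_e \;\le\; \tfrac{1}{2}\bigl(|U| - 1\bigr)
      \;=\; \Bigl\lfloor \tfrac{1}{2}|U| \Bigr\rfloor .
\]
% |U| even: use only \sum_{e \in \delta(U)} x_e \ge 0, which follows from x \ge 0.
\[
  \sum_{e \subseteq U} x_e \;\le\; \tfrac{1}{2}|U|
      \;=\; \Bigl\lfloor \tfrac{1}{2}|U| \Bigr\rfloor .
\]
```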
79 / 143
Blocking Polyhedra
Fulkerson (1970)
One polyhedral characterization (or min-max relation) can be derived from another one, and conversely.
Basic idea: Let $d_1, \dots, d_t, c_1, \dots, c_m \in \mathbb{R}^n_+$ satisfy
$$(1) \qquad \mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid d_j^T x \ge 1,\ j = 1, \dots, t\}.$$
Then the same holds after interchanging the $c_i$ and $d_j$:
$$(2) \qquad \mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid c_i^T x \ge 1,\ i = 1, \dots, m\}.$$
In a sense, in (2) the notions of "vertex" and "facet" are interchanged as compared to (1).
80 / 143
Blocking Polyhedra
Theorem. For any $c_1, \dots, c_m, d_1, \dots, d_t \in \mathbb{R}^n_+$, (1) $\Leftrightarrow$ (2).
Proof. We show (1) $\Rightarrow$ (2) ($\Leftarrow$ follows by symmetry). Suppose (1) holds. Then in (2) the inclusion $\subseteq$ holds. Indeed, from (1) we have that $c_i^T d_j\ (= d_j^T c_i) \ge 1$ for all $i, j$. If $x \in \mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_+$, then $x = \lambda_1 d_1 + \dots + \lambda_t d_t + u$ with $\lambda_k \ge 0$, $\sum_{k=1}^t \lambda_k = 1$ and $u \ge 0$. Then
$$c_i^T x = \sum_{k=1}^t \lambda_k c_i^T d_k + c_i^T u \ge \sum_{k=1}^t \lambda_k \cdot 1 + c_i^T u \ge 1 + 0 = 1,$$
that is, $x \in \{x \in \mathbb{R}^n_+ \mid c_i^T x \ge 1,\ i = 1, \dots, m\}$.
To show that in (2) the inclusion $\supseteq$ holds, suppose $x \ge 0$ and $x \notin \mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_+$. Then there is a separating hyperplane: there exists $y \in \mathbb{R}^n$ s.t.
$$(3) \qquad y^T x < \min\{y^T z \mid z \in \mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_+\}.$$
We may assume $t \ge 1$ [for $t = 0$ it follows from (1) that $0 \in \{c_1, \dots, c_m\}$ and $x$ is not in the right-hand side of (2)]. By scaling $y$, we can suppose that the minimum in (3) is 1. Hence $y$ belongs to the right-hand side of (1) and also to the left-hand side: $y \ge \sum_{i=1}^m \lambda_i c_i$ for certain $\lambda_i \ge 0$ with $\sum_{i=1}^m \lambda_i = 1$. Since $y^T x < 1$ we obtain that $c_i^T x < 1$ for at least one $i$. Hence $x$ is not in the right-hand side of (2).
81 / 143
Blocking Polyhedra
Blocking pair of polyhedra
For $X \subseteq \mathbb{R}^n$ the blocker $B(X)$ of $X$ is
$$B(X) := \{x \in \mathbb{R}^n_+ \mid y^T x \ge 1 \text{ for each } y \in X\}.$$
For $c_1, \dots, c_m \in \mathbb{R}^n_+$, the blocker $B(P)$ of the polyhedron
$$(4) \qquad P = \mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^n_+$$
can be expressed as $B(P) = \{x \in \mathbb{R}^n_+ \mid c_i^T x \ge 1,\ i = 1, \dots, m\}$. So $B(P)$ is also a polyhedron, called the blocking polyhedron of $P$.
If $R = B(P)$ then the pair $P, R$ is called a blocking pair of polyhedra.
Corollary 1. For any polyhedron $P$ of type (4), $B(B(P)) = P$.
Both relations (1) and (2) are equivalent to: the pair $\mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^n_+$ and $\mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_+$ forms a blocking pair of polyhedra.
82 / 143
Blocking Polyhedra
Blocking pair of polyhedra
Corollary 2. Let $d_1, \dots, d_t, c_1, \dots, c_m \in \mathbb{R}^n_+$. The following are equivalent:
(i) $\forall l \in \mathbb{R}^n_+$: $\min\{l^T c_1, \dots, l^T c_m\} = \max\{\lambda_1 + \dots + \lambda_t \mid \lambda_1, \dots, \lambda_t \in \mathbb{R}_+,\ \sum_{j=1}^t \lambda_j d_j \le l\}$
(ii) $\forall w \in \mathbb{R}^n_+$: $\min\{w^T d_1, \dots, w^T d_t\} = \max\{\mu_1 + \dots + \mu_m \mid \mu_1, \dots, \mu_m \in \mathbb{R}_+,\ \sum_{i=1}^m \mu_i c_i \le w\}$
Proof. By LP duality, the maximum in (i) is equal to $\min\{l^T x \mid x \in \mathbb{R}^n_+,\ d_j^T x \ge 1,\ j = 1, \dots, t\}$. Hence (i) is equivalent to (1). Similarly, (ii) is equivalent to (2). Hence the theorem on slide 81 implies Corollary 2.
83 / 143
Blocking Polyhedra
Lehman's Length-Width inequality
Corollary 2. Let $d_1, \dots, d_t, c_1, \dots, c_m \in \mathbb{R}^n_+$. Then (1) [equivalently (2), (i), and (ii)] holds if and only if
$$(*) \qquad d_j^T c_i \ge 1 \quad \forall i = 1, \dots, m;\ \forall j = 1, \dots, t,$$
$$(**) \qquad \min\{l^T c_1, \dots, l^T c_m\} \cdot \min\{w^T d_1, \dots, w^T d_t\} \le l^T w \quad \forall l, w \in \mathbb{Z}^n_+.$$
Proof. Suppose that $(*)$ and $(**)$ hold. We derive (i). Let $l \in \mathbb{R}^n_+$. By LP duality, the maximum in (i) is equal to $\min\{l^T x \mid x \in \mathbb{R}^n_+,\ d_j^T x \ge 1,\ j = 1, \dots, t\}$. Let this minimum be attained by a vector $w$. Then by $(*)$ and $(**)$ we have
$$l^T w \ge \big(\min_i l^T c_i\big)\big(\min_j w^T d_j\big) \ge \min_i l^T c_i \ge l^T w.$$
(The second inequality uses $w^T d_j \ge 1$ for all $j$; the third uses that, by $(*)$, each $c_i$ is feasible for the minimization problem.) Hence the minimum in (i) is equal to $l^T w$.
Conversely, suppose that (1) holds. Then (i) and (ii) hold. Now $(*)$ follows by taking $l = d_j$ in (i).
84 / 143
Blocking Polyhedra
Lehman's Length-Width inequality
To show $(**)$, let $\lambda_1, \dots, \lambda_t$ and $\mu_1, \dots, \mu_m$ attain the maxima in (i) and (ii), respectively. Then
$$\Big(\sum_j \lambda_j\Big)\Big(\sum_i \mu_i\Big) = \sum_j \sum_i \lambda_j \mu_i \le \sum_j \sum_i \lambda_j \mu_i\, d_j^T c_i = \Big(\sum_j \lambda_j d_j\Big)^T \Big(\sum_i \mu_i c_i\Big) \le l^T w.$$
This implies $(**)$.
Remark. It follows from the ellipsoid method that if $d_1, \dots, d_t, c_1, \dots$
, $c_m \in \mathbb{R}^n_+$ satisfy (1) [equivalently (2), (i), and (ii)], then
$\forall l \in \mathbb{R}^n_+$: $\min\{l^T c_1, \dots, l^T c_m\}$ can be found in polynomial time
if and only if
$\forall w \in \mathbb{R}^n_+$: $\min\{w^T d_1, \dots, w^T d_t\}$ can be found in polynomial time.
This is particularly interesting when one of $t$ or $m$ is exponentially large.
85 / 143
Blocking Polyhedra – Applications
Shortest Paths and Network Flows
Given a digraph $D = (V, A)$, a source node $s$ and a sink node $t$. Let $c_1, \dots, c_m \in \mathbb{R}^A_+$ be the incidence vectors of the $s$-$t$-paths in $D$. Let $d_1, \dots, d_t \in \mathbb{R}^A_+$ be the incidence vectors of the $s$-$t$-cuts.
Given a "length" function $l : A \to \mathbb{Z}_+$, the minimum length of an $s$-$t$-path is equal to the maximum number of $s$-$t$-cuts (repetition allowed) such that no arc $a \in A$ is in more than $l(a)$ of these cuts.
[Indeed, the inequality min $\ge$ max is easy. For the converse inequality: if $p$ is the minimum length of an $s$-$t$-path, let $V_i = \{v \in V \mid \text{the length of a shortest } s\text{-}v \text{ path is} \ge i\}$ for $i = 1, \dots, p$. Then $\delta^-(V_1), \dots, \delta^-(V_p)$ are the required $s$-$t$-cuts.]
This implies (i). Hence (ii) holds (slide 83). But this is equivalent to the max-flow min-cut theorem:
86 / 143
Blocking Polyhedra – Applications
Shortest Paths and Network Flows
the maximum amount of an $s$-$t$-flow subject to a capacity function $w$ is equal to the minimum capacity of an $s$-$t$-cut (observe that $\sum_i \mu_i c_i$ is an $s$-$t$-flow).
This means that the polyhedra $\mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^A_+$ and $\mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^A_+$ form a blocking pair of polyhedra.
By the remark on slide 85, the polynomial-time solvability of the minimum capacity cut problem is equivalent to that of the shortest-path problem. The latter problem is much easier.
87 / 143
Blocking Polyhedra – Applications
r-arborescences
We know (see slide 67) that the $r$-arborescences polytope of a digraph $D = (V, A)$, $r \in V$, $\mathrm{convhull}(\{\chi^P \mid P \text{ } r\text{-arborescence}\})$, is described by
$$0 \le x_a \le 1 \quad \forall a \in A,$$
$$\sum_{a \in \delta^-(U)} x_a \ge 1 \quad \forall U \subseteq V - \{r\},\ U \ne \emptyset.$$
It follows that (1) (slide 80) holds.
Since (2) is equivalent to (1), it follows that for any "capacity" function $w \in \mathbb{R}^A_+$, the minimum capacity of an $r$-cut is equal to the maximum value of $\mu_1 + \dots + \mu_k$, where $\mu_1, \dots, \mu_k \ge 0$ are such that there exist $r$-arborescences $T_1, \dots, T_k$ with the property that for any arc $a$, the sum of the $\mu_j$ for which $a \in T_j$ is at most $w_a$.
Hence the convex hull of the incidence vectors of sets containing an $r$-cut as a subset is determined by the system (in $x \in \mathbb{R}^A$):
$$0 \le x_a \le 1 \quad \forall a \in A,$$
$$\sum_{a \in T} x_a \ge 1 \quad \forall T \text{ } r\text{-arborescence}.$$
88 / 143
Anti-blocking Polyhedra
Fulkerson (1971)
The results are analogous to those for blocking polyhedra; proofs are omitted.
Let $d_1, \dots, d_t, c_1, \dots, c_m \in \mathbb{R}^n_+$ be such that $\dim(\langle c_1, \dots, c_m \rangle) = \dim(\langle d_1, \dots, d_t \rangle) = n$. Then the following are equivalent:
$$(1) \qquad (\mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid d_j^T x \le 1,\ j = 1, \dots, t\}$$
$$(2) \qquad (\mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+ = \{x \in \mathbb{R}^n_+ \mid c_i^T x \le 1,\ i = 1, \dots, m\}$$
If $X \subseteq \mathbb{R}^n$, then the anti-blocker $A(X)$ of $X$ is
$$A(X) := \{x \in \mathbb{R}^n_+ \mid y^T x \le 1 \ \forall y \in X\}.$$
89 / 143
Anti-blocking Polyhedra
Anti-blocking pair of polyhedra
If $P = (\mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$ then
$$A(P) = \{x \in \mathbb{R}^n_+ \mid c_i^T x \le 1,\ i = 1, \dots, m\}.$$
$A(P)$ is called the anti-blocking polyhedron of $P$. If $R = A(P)$ then the pair $P, R$ is called an anti-blocking pair of polyhedra. Clearly $A(A(P)) = P$ for any polyhedron $P$ of the above type.
90 / 143
Anti-blocking Polyhedra
Theorem
Each of the following is equivalent to (1) and (2):
(a) $(\mathrm{convhull}\{c_1, \dots, c_m\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$ and $(\mathrm{convhull}\{d_1, \dots, d_t\} + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$ form an anti-blocking pair of polyhedra.
(b) $\forall l \in \mathbb{R}^n_+$: $\max\{l^T c_1, \dots, l^T c_m\} = \min\{\lambda_1 + \dots + \lambda_t \mid \lambda_1, \dots, \lambda_t \in \mathbb{R}_+;\ \sum_j \lambda_j d_j \ge l\}$
(c) $\forall w \in \mathbb{R}^n_+$: $\max\{w^T d_1, \dots, w^T d_t\} = \min\{\mu_1 + \dots + \mu_m \mid \mu_1, \dots, \mu_m \in \mathbb{R}_+;\ \sum_i \mu_i c_i \ge w\}$
(d) (i) $d_j^T c_i \le 1$, $\forall i = 1, \dots, m$; $\forall j = 1, \dots, t$
(ii) $\max\{l^T c_1, \dots, l^T c_m\} \cdot \max\{w^T d_1, \dots, w^T d_t\} \ge l^T w \quad \forall l, w \in \mathbb{Z}^n_+$
91 / 143
Anti-blocking Polyhedra – Applications
Perfect Graphs
$G = (V, E)$ is a perfect graph (Berge 1960) if the chromatic number of any induced subgraph $G'$ of $G$ is equal to the maximum cardinality of a clique in $G'$.
If $C$ is the matrix having as rows the incidence vectors of the cliques in $G$, then $G$ is perfect if and only if for each $\{0,1\}$-vector $w$ the (dual) linear programs
$$(*) \qquad \max\{w^T x \mid x \ge 0,\ Cx \le 1\} = \min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$$
have integer optimal solutions.
The stable-set polytope of the graph $G = (V, E)$ is $\mathrm{STAB}(G) = \mathrm{convhull}(\{\chi^S \mid S \text{ stable set in } G\})$. Let $\mathrm{Ch}(G)$ be the polytope
$$x_v \ge 0 \quad \forall v \in V, \qquad \sum_{v \in K} x_v \le 1 \quad \forall K \text{ clique in } G.$$
Note that $\mathrm{Ch}(G) = A(\mathrm{STAB}(\overline{G}))$, where $\overline{G}$ is the complement of $G$. Also, $\mathrm{STAB}(C_5) \subsetneq \mathrm{Ch}(C_5)$, where $C_5$ is the circuit on 5 vertices.
92 / 143
Anti-blocking Polyhedra – Applications
Chvátal's Theorem
$G = (V, E)$ is a perfect graph if and only if $\mathrm{STAB}(G) = \mathrm{Ch}(G)$.
This is equivalent to: $G = (V, E)$ is perfect $\Leftrightarrow$ $\mathrm{STAB}(G) = A(\mathrm{STAB}(\overline{G}))$.
Lovász's Perfect Graph Theorem
The complement of a perfect graph is a perfect graph.
Proof. If $G$ is perfect then $\mathrm{STAB}(G) = A(\mathrm{STAB}(\overline{G}))$. Hence $\mathrm{STAB}(\overline{G}) = A(A(\mathrm{STAB}(\overline{G}))) = A(\mathrm{STAB}(G))$. Therefore $\overline{G}$ is perfect.
It follows (ellipsoid method) that a maximum-weight stable set in a perfect graph can be found in polynomial time if and only if a maximum-weight clique can be found in polynomial time. Since the complement of a perfect graph is again perfect, this does not give any reduction of one problem to another!
93 / 143
Anti-blocking Polyhedra
Proof of Chvátal's Theorem: G perfect ⇒ STAB(G) = Ch(G)
Proof. Suppose $G$ is perfect. Let $\alpha_w$ be the maximum weight of a stable set of $G$ for $w : V \to \mathbb{Z}_+$. We prove that $\alpha_w = \max\{w^T x \mid x \ge 0,\ Cx \le 1\}$ for each $w \in \mathbb{Z}^V_+$, by induction on $\sum_{v \in V} w_v$. If $w$ is a $\{0,1\}$-vector, this follows from the characterization $(*)$ of perfect graphs. Let $w_u \ge 2$ for some $u \in V$. Let $e$ be the vector with $e_u = 1$ and $e_v = 0$ for $v \in V - \{u\}$.
Replacing $w$ by $w - e$ in $(*)$, we obtain, by induction, a vector $y \ge 0$ such that $y^T C \ge (w - e)^T$ and $y^T 1 = \alpha_{w-e}$. Since $(w - e)_u \ge 1$, there is a clique $K$ such that $y_K > 0$ and $u \in K$. We may assume that $\chi^K \le w - e$. Then $\alpha_{w - \chi^K} < \alpha_w$ (◦). Hence,
$$\alpha_w = 1 + \alpha_{w-e} = 1 + \max\{(w - e)^T x \mid x \ge 0,\ Cx \le 1\} \ge \max\{w^T x \mid x \ge 0,\ Cx \le 1\},$$
implying the claimed identity for $w$.
Proof of $\alpha_{w - \chi^K} < \alpha_w$ (◦): suppose $\alpha_{w - \chi^K} = \alpha_w$. Let $S$ be a stable set such that $\sum_{v \in S} (w - \chi^K)_v = \alpha_{w - \chi^K}$. Since $\alpha_{w - \chi^K} = \alpha_w$, we have $K \cap S = \emptyset$. But, since $w - \chi^K \le w - e \le w$, we know that $\sum_{v \in S} (w - e)_v = \alpha_{w-e}$, and by complementary slackness $|K \cap S| = 1$, a contradiction.
94 / 143
Anti-blocking Polyhedra
Proof of Chvátal's Theorem: STAB(G) = Ch(G) ⇒ G perfect
Proof. Suppose $\mathrm{STAB}(G) = \mathrm{Ch}(G)$. Then $\max\{w^T x \mid x \ge 0,\ Cx \le 1\}$ is attained by the incidence vector of a stable set, for each $w \in \mathbb{Z}^V_+$. We show, by induction on $\sum_{v \in V} w_v$, that $\min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$ has an integer optimal solution for each $\{0,1\}$-vector $w$ (see the characterization $(*)$ of perfect graphs).
Let $w$ be a $\{0,1\}$-vector, and let $y$ be a not necessarily integral optimal solution of $\min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$. Let $K$ be a clique with $y_K > 0$. Then the common optimal value of
$$\max\{(w - \chi^K)^T x \mid x \ge 0,\ Cx \le 1\} = \min\{y^T 1 \mid y \ge 0,\ y^T C \ge (w - \chi^K)^T\}$$
is less than $\max\{w^T x \mid x \ge 0,\ Cx \le 1\}$, since by complementary slackness each optimal solution $x$ of this last problem must satisfy $(\chi^K)^T x = 1$. As these optimal values are integers, they differ by exactly 1. By induction, the above minimum has an optimal integer solution $y$. Increasing $y_K$ by 1 gives an integral optimal solution of $\min\{y^T 1 \mid y \ge 0,\ y^T C \ge w^T\}$.
95 / 143
Anti-blocking Polyhedra
A polynomial time algorithm for finding a maximum-weight stable set in a perfect graph.
Let $G = (V, E)$ be a graph with $V = \{1, \dots, n\}$. Let $M(G) \subset \mathbb{R}^{(n+1) \times (n+1)}$ be the set of all matrices $Y = (y_{ij})_{i,j=0}^n$ satisfying:
(i) $Y$ is symmetric and positive semi-definite;
(ii) $y_{00} = 1$, $y_{0i} = y_{ii}$, $i = 1, \dots, n$;
(iii) $y_{ij} = 0$ if $i \ne j$ and $\{i, j\} \in E$.
It follows that $M(G)$ is convex (but not necessarily a polyhedron). Let $\mathrm{TH}(G)$ be the set of vectors $x \in \mathbb{R}^n$ for which there is a $Y \in M(G)$ such that $x_i = y_{ii}$ for $i = 1, \dots, n$.
$\mathrm{TH}(G)$ is an approximation of $\mathrm{STAB}(G)$ at least as good as $A(\mathrm{STAB}(\overline{G}))$:
96 / 143
Anti-blocking Polyhedra
A polynomial time algorithm for finding a maximum-weight stable set in a perfect graph.
Theorem. $\mathrm{STAB}(G) \subseteq \mathrm{TH}(G) \subseteq A(\mathrm{STAB}(\overline{G}))$.
Proof. Let $S$ be a stable set in $G$. Let $Y \in M(G)$ be the matrix with
$$y_{ij} = \begin{cases} 1 & \text{if } i, j \in S \cup \{0\},\\ 0 & \text{otherwise.} \end{cases}$$
Clearly, $\chi^S_i = y_{ii}$, so $\chi^S \in \mathrm{TH}(G)$; since $\mathrm{TH}(G)$ is convex, $\mathrm{STAB}(G) \subseteq \mathrm{TH}(G)$.
The second inclusion. If $x \in \mathrm{TH}(G)$ then $x \ge 0$ (since the elements on the diagonal of a positive semi-definite matrix are non-negative). It suffices to show that if $x \in \mathrm{TH}(G)$ and $u = \chi^S$ for a stable set $S$ in $\overline{G}$, then $u^T x \le 1$. Let $x$ be obtained by taking the last $n$ elements of the diagonal of some $Y \in M(G)$. Since $Y$ is positive semi-definite,
$$(1, -u^T)\, Y \binom{1}{-u} \ge 0.$$
As $y_{ij} = 0$ if $\{i, j\} \in E$ and $S$ is a clique in $G$, it follows from the above inequality that $1 - 2\sum_{i \in S} y_{i0} + \sum_{i \in S} y_{ii} \ge 0$. Since $x_i = y_{i0} = y_{ii}$, this implies $u^T x \le 1$.
97 / 143
Anti-blocking Polyhedra
A polynomial time algorithm for finding a maximum-weight stable set in a perfect graph.
Theorem (Grötschel, Lovász, Schrijver, 1981)
There is a polynomial time algorithm to find a maximum-weight stable set in a perfect graph.
Proof. If $G$ is perfect then $\mathrm{STAB}(G) = A(\mathrm{STAB}(\overline{G}))$, hence by the above theorem $\mathrm{STAB}(G) = \mathrm{TH}(G)$. The theorem follows since any linear objective function $w^T x$ can be maximized over $\mathrm{TH}(G)$ in polynomial time.
[Linear objective function maximization over $M(G)$ can be done in polynomial time using the ellipsoid method, since we can solve the separation problem for $M(G)$ in polynomial time: for any given $Y \in \mathbb{R}^{(n+1) \times (n+1)}$ we test the conditions (i)-(iii) in polynomial time, in such a way that we find a separating hyperplane in $\mathbb{R}^{(n+1) \times (n+1)}$ if $Y \notin M(G)$.]
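The matrix used in the first inclusion is the rank-one matrix $Y = vv^T$, where $v$ is the 0/1 indicator of $S \cup \{0\}$. The sketch below builds it for a small example and checks conditions (ii) and (iii) directly; positive semi-definiteness holds because $z^T Y z = (v^T z)^2 \ge 0$ for every $z$. The function names and the choice of $C_5$ with $S = \{1, 3\}$ are illustrative assumptions.

```python
def certificate(n, S):
    """Return Y = v v^T for v the 0/1 indicator of S ∪ {0} (index 0 is the extra row)."""
    v = [1] + [1 if i in S else 0 for i in range(1, n + 1)]
    return [[v[i] * v[j] for j in range(n + 1)] for i in range(n + 1)]


def satisfies_ii_iii(n, edges, Y):
    """Check y_00 = 1, y_0i = y_ii, and y_ij = 0 on the edges."""
    ok_ii = Y[0][0] == 1 and all(Y[0][i] == Y[i][i] for i in range(1, n + 1))
    ok_iii = all(Y[i][j] == 0 and Y[j][i] == 0 for (i, j) in edges)
    return ok_ii and ok_iii


def quadratic_form(Y, z):
    """Compute z^T Y z; for Y = v v^T this equals (v . z)^2, hence is never negative."""
    return sum(z[i] * Y[i][j] * z[j] for i in range(len(z)) for j in range(len(z)))


# Hypothetical example: the 5-circuit on vertices 1..5 and the stable set {1, 3}.
n = 5
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
Y = certificate(n, {1, 3})
x = [Y[i][i] for i in range(1, n + 1)]   # the point of TH(G) read off the diagonal
```

If $S$ is not stable, condition (iii) fails for this construction, since $y_{ij} = 1$ for an edge $\{i, j\} \subseteq S$.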
98 / 143
Anti-blocking Polyhedra
Matchings and Edge-colorings
The matching polytope $P_{mat}(G)$ of a graph $G = (V, E)$ is described (see slides 75 - 78) as the set of all vectors $x \in \mathbb{R}^E$ satisfying
$$(**) \qquad x_e \ge 0 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e \le 1 \ \ \forall v \in V, \qquad \sum_{e \subseteq U} x_e \le \Big\lfloor \tfrac{1}{2}|U| \Big\rfloor \ \ \forall U \subseteq V.$$
By scalar multiplication, the system $(**)$ determining $P_{mat}(G)$ can be normalized to $x \ge 0$, $Cx \le 1$ for a certain matrix $C$ (deleting in $(**)$ the rows corresponding to $U \subseteq V$ with $|U| \le 1$).
The anti-blocking polyhedron $A(P_{mat}(G))$ is equal to $\{z \in \mathbb{R}^E_+ \mid Dz \le 1\}$, where the rows of $D$ are the incidence vectors of all matchings in $G$.
99 / 143
Anti-blocking Polyhedra
Matchings and Edge-colorings
Taking $l = 1$ in relation (b) on slide 91, we obtain
$$\max\Big\{\Delta(G),\ \max_{U \subseteq V,\ |U| \ge 2} \frac{|\langle U \rangle|}{\lfloor \frac{1}{2}|U| \rfloor}\Big\} = \min\{y^T 1 \mid y \ge 0,\ y^T D \ge 1^T\}.$$
Here $\langle U \rangle$ denotes the set of edges contained in $U$. The minimum above is called the fractional edge coloring number $\chi^*(G)$ of $G$. If the minimum is attained by an integral optimum solution, it is equal to the edge coloring number $\chi'(G)$ of $G$, since
$$\chi'(G) = \min\{y^T 1 \mid y \ge 0,\ y^T D \ge 1^T,\ y \text{ integral}\}.$$
By Vizing's theorem, $\chi'(G) \in \{\Delta(G), \Delta(G) + 1\}$. For $G$ the Petersen graph, we have $\chi^*(G) = \Delta(G) = 3$, while $\chi'(G) = 4$.
100 / 143
Cutting Planes
Integer Hull
If $P \subseteq \mathbb{R}^n$, then $P_I$, the integer hull of $P$, is $P_I = \mathrm{convhull}\{x \mid x \in P,\ x \text{ integral}\}$. Clearly, if $P$ is bounded then $P_I$ is a polytope.
For most combinatorial optimization problems it is not difficult to find a set of linear inequalities, determining a polyhedron $P$, in which the integral vectors are exactly the incidence vectors of the feasible sets of the problem.
Challenge: describing $P_I$ by linear inequalities!
Gomory, 1960: cutting-plane method for linear integer programming. Chvátal, 1973, derived from this method an interesting iterative process to characterize $P_I$.
101 / 143
Cutting Planes
Integer Hull
A rational affine halfspace (rah) is a set $H = \{x \mid c^T x \le \delta\}$, where $c \in \mathbb{Q}^n$, $c \ne 0$, and $\delta \in \mathbb{Q}$.
We may assume that the components of $c$ are relatively prime integers, so that $H_I = \{x \mid c^T x \le \lfloor \delta \rfloor\}$. If $P$ is a polyhedron then
$$P' = \bigcap_{H \text{ rah},\ H \supseteq P} H_I.$$
For $A \in \mathbb{Q}^{m \times n}$, $b \in \mathbb{Q}^m$:
$$\{x \mid Ax \le b\}' = \{x \mid (u^T A)x \le \lfloor u^T b \rfloor \ \ \forall u \in \mathbb{Q}^m_+ \text{ with } u^T A \text{ integral}\},$$
$$\{x \mid x \ge 0,\ Ax \le b\}' = \{x \mid x \ge 0;\ \lfloor u^T A \rfloor x \le \lfloor u^T b \rfloor \ \ \forall u \in \mathbb{Q}^m_+\}.$$
Note that if $P \subseteq H$ then $P_I \subseteq H_I$. Hence $P_I \subseteq P'$.
102 / 143
Cutting Planes
Integer Hull
For $t \in \mathbb{N}$, define
$$P^{(t)} = \begin{cases} P & \text{if } t = 0,\\ (P^{(t-1)})' & \text{if } t \ge 1. \end{cases}$$
Then $P = P^{(0)} \supseteq P^{(1)} \supseteq \dots \supseteq P_I$.
Theorem (Blair, Jeroslow (1982), Cook, Lovász, Schrijver (1986))
For each rational matrix $A$ there exists a number $t$ such that for each column vector $b$ one has: $\{x \mid Ax \le b\}^{(t)} = \{x \mid Ax \le b\}_I$.
Corollary. For each polytope $P$ there is a number $t$ such that $P^{(t)} = P_I$.
The Chvátal rank of the rational matrix $A \in \mathbb{Q}^{m \times n}$ is the smallest $t$ for which $\{x \mid Ax \le b\}^{(t)} = \{x \mid Ax \le b\}_I$ for each $b \in \mathbb{R}^m$.
The strong Chvátal rank of the rational matrix $A \in \mathbb{Q}^{m \times n}$ is the Chvátal rank of the matrix
$$\begin{pmatrix} I \\ -I \\ A \\ -A \end{pmatrix}.$$
103 / 143
Cutting Planes
Integer Hull
From the HK theorem (slide 37), an integral matrix has strong Chvátal rank 0 if and only if it is totally unimodular. No similar characterizations are known for higher Chvátal rank.
Example. For any graph $G = (V, E)$ let $P$ be the polytope determined by
$$x_e \ge 0 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e \le 1 \ \ \forall v \in V.$$
Clearly, $P_I$ is the matching polytope of $G$. It can be proved that $P'$ is
$$x_e \ge 0 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e \le 1 \ \ \forall v \in V, \qquad \sum_{e \subseteq U} x_e \le \Big\lfloor \tfrac{1}{2}|U| \Big\rfloor \ \ \forall U \subseteq V.$$
Edmonds' Matching Polyhedron Theorem: $P' = P_I$ ($P_I$ arises from $P$ by one "round" of cutting planes).
104 / 143
Hard Problems – Complexity of the Integer Hull
Karp and Papadimitriou (1982)
For any class $(\mathcal{F}_G \mid G \text{ graph})$, if the problem
OP: Given a graph $G = (V, E)$, $c : E \to \mathbb{Q}$, and $\mathcal{F}_G$ a collection of subsets of $E$, find $F_0 \in \mathcal{F}_G$ s.t. $c(F_0) = \max_{F \in \mathcal{F}_G} c(F)$ (where $c(F) = \sum_{e \in F} c(e)$)
is NP-hard and NP ≠ co-NP, then the class of polytopes $\mathrm{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$ has difficult facets, i.e., there is no polynomial $\Phi$ s.t.
for any graph $G$, $c \in \mathbb{Z}^E$ and $\delta \in \mathbb{Q}$ with the property that $c^T x \le \delta$ defines a facet of $\mathrm{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$, the fact that $c^T x \le \delta$ is valid for each $\chi^F$ with $F \in \mathcal{F}_G$ has a proof of length at most $\Phi(|V| + |E| + \mathrm{size}(c) + \mathrm{size}(\delta))$.
Note that for the matching polyhedron $P$ (see the previous slide), $P_I = P'$ has exponentially many inequalities, but each facet-defining inequality is of the form described in $P'$, and for these it is easy to prove that they are valid for the matching polytope.
105 / 143
Hard Problems – Complexity of the Integer Hull
Boyd and Pulleyblank (1984)
Suppose that for a given class $(\mathcal{F}_G \mid G \text{ graph})$, for each $G = (V, E)$ the polytope $P_G$ in $\mathbb{R}^E$ satisfies $(P_G)_I = \mathrm{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$ and has the property that the problem
given a graph $G = (V, E)$ and $c \in \mathbb{R}^E$, find $\max\{c^T x \mid x \in P_G\}$
is polynomially solvable. Then, if
OP: Given a graph $G = (V, E)$, $c : E \to \mathbb{Q}$, and $\mathcal{F}_G$ a collection of subsets of $E$, find $F_0 \in \mathcal{F}_G$ s.t. $c(F_0) = \max_{F \in \mathcal{F}_G} c(F)$
is NP-hard and NP ≠ co-NP, then there is no fixed $t$ such that for each graph $G$, $(P_G)^{(t)} = \mathrm{convhull}\{\chi^F \mid F \in \mathcal{F}_G\}$.
106 / 143
Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
Let $\mathrm{STAB}(G) = \mathrm{convhull}\{\chi^S \mid S \text{ stable set in } G\}$ be the stable-set polytope of the graph $G = (V, E)$. Let $\mathrm{Ch}(G)$ be the polytope
$$x_v \ge 0 \quad \forall v \in V, \qquad \sum_{v \in K} x_v \le 1 \quad \forall K \text{ clique in } G.$$
So, $\mathrm{Ch}(G) = A(\mathrm{STAB}(\overline{G}))$ (see slide 92). Clearly, $\mathrm{STAB}(G) \subseteq \mathrm{Ch}(G)$. Since the integral vectors in $\mathrm{Ch}(G)$ are exactly the incidence vectors of stable sets, we have $\mathrm{STAB}(G) = \mathrm{Ch}(G)_I$.
Chvátal (1984): there is no fixed $t$ such that $(\mathrm{Ch}(G))^{(t)} = \mathrm{Ch}(G)_I$, even if we restrict $G$ to graphs with stability number $\alpha(G) = 2$.
By Chvátal's theorem (see slide 93), the class of graphs $G$ with $\mathrm{Ch}(G)_I = \mathrm{Ch}(G)$ is exactly the class of perfect graphs.
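For a non-perfect graph the inclusion $\mathrm{STAB}(G) \subseteq \mathrm{Ch}(G)$ is strict, and the 5-circuit gives the smallest example. The brute-force sketch below (variable and function names are illustrative assumptions) checks that the point $x = (\tfrac12, \dots, \tfrac12)$ satisfies all clique inequalities of $C_5$, while its total weight $\tfrac52$ exceeds the stability number 2; every point of $\mathrm{STAB}(C_5)$ has total weight at most 2, so $x$ is not in it.

```python
from fractions import Fraction
from itertools import combinations


def subsets(V):
    """All non-empty subsets of V, as tuples."""
    for k in range(1, len(V) + 1):
        yield from combinations(V, k)


def is_clique(K, E):
    return all(frozenset(p) in E for p in combinations(K, 2))


def is_stable(S, E):
    return all(frozenset(p) not in E for p in combinations(S, 2))


V = list(range(5))
E = {frozenset((i, (i + 1) % 5)) for i in V}        # the 5-circuit C_5
x = {v: Fraction(1, 2) for v in V}

# Membership in Ch(C_5): nonnegativity and one inequality per clique.
in_Ch = all(xv >= 0 for xv in x.values()) and all(
    sum(x[v] for v in K) <= 1 for K in subsets(V) if is_clique(K, E)
)
alpha = max(len(S) for S in subsets(V) if is_stable(S, E))   # stability number
total = sum(x.values())                                       # 1^T x
```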
107 / 143

Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
Chvátal raised the question of whether there exists, for each fixed $t$, a polynomial-time algorithm determining $\alpha(G)$ for graphs $G$ with $(Ch(G))^{(t)} = Ch(G)_I$. This is true for $t = 0$, that is, for perfect graphs (via the ellipsoid method, see slide 98).
Minty (1980) and Sbihi (1980) extended Edmonds' maximum weighted matching algorithm to obtain a polynomial algorithm for finding a maximum weighted stable set in a $K_{1,3}$-free graph. By the theorem on slide 35, the separation problem for the stable-set polytope of $K_{1,3}$-free graphs can be solved in polynomial time. However, no explicit description of a linear inequality system for the stable-set polytope of $K_{1,3}$-free graphs is known. This would extend Edmonds' description of the matching polytope!
Note that, by Chvátal's result on slide 107, there is no fixed $t$ such that $(Ch(G))^{(t)} = Ch(G)_I$ for all $K_{1,3}$-free graphs.
108 / 143

Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
The most natural "relaxation" of $STAB(G)$, for the graph $G = (V, E)$, is the polytope $Q(G)$:
$$x_v \ge 0 \ \ \forall v \in V, \qquad x_v + x_w \le 1 \ \ \forall \{v, w\} \in E.$$
Clearly, $Q(G)_I = STAB(G)$. Since $Ch(G) \subseteq Q(G)$, there is no $t$ with $Q(G)^{(t)} = Q(G)_I$ for all graphs $G$.
It is not difficult to see that $Q(G)'$ is:
$$x_v \ge 0 \ \ \forall v \in V, \qquad x_v + x_w \le 1 \ \ \forall \{v, w\} \in E, \qquad \sum_{v \in C} x_v \le \frac{|C| - 1}{2} \ \ \forall C \text{ vertex set of an odd circuit in } G.$$
109 / 143

Hard Problems – Complexity of the Integer Hull
Example: The stable-set polytope
Gerards and Schrijver (1986): If $G$ has no subgraph $H$ which arises from $K_4$ by replacing edges by paths such that each triangle in $K_4$ becomes an odd circuit, then $Q(G)' = STAB(G)$.
Graphs $G$ with $Q(G)' = STAB(G)$ are called t-perfect by Chvátal.
Let $A \in \mathbb{Z}^{m \times n}$ satisfy $\sum_{j=1}^{n} |a_{ij}| \le 2$ for $i = 1, \ldots, m$.
Gerards and Schrijver (1986): $A$ has strong Chvátal rank at most 1 if and only if it cannot be transformed to the matrix
$$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$
by a sequence of the following operations: deleting or permuting rows or columns, or multiplying them by $-1$; replacing $\begin{pmatrix} 1 & c^T \\ b & D \end{pmatrix}$ by $D - bc^T$, where $D$ is a matrix and $b$, $c$ are column vectors.
110 / 143

Hard Problems – Complexity of the Integer Hull
Example: The traveling-salesman polytope
The traveling-salesman polytope of a graph $G = (V, E)$ is equal to $\mathrm{convhull}\{\chi^H \mid H \subseteq E,\ H \text{ Hamiltonian circuit}\}$.
As the traveling salesman problem is NP-hard, if NP $\ne$ co-NP, then the traveling-salesman polytope will have "difficult" facets (Karp and Papadimitriou, slide 105).
Let $P \subseteq \mathbb{R}^E$ be defined by:
$$0 \le x_e \le 1 \ \ \forall e \in E, \qquad \sum_{e \ni v} x_e = 2 \ \ \forall v \in V, \qquad \sum_{e \in \delta(U)} x_e \ge 2 \ \ \forall U \subseteq V,\ 3 \le |U| \le |V| - 1.$$
The integral vectors in $P$ are exactly the incidence vectors of Hamiltonian circuits. Hence, $P_I$ is the traveling-salesman polytope.
111 / 143

Hard Problems – Complexity of the Integer Hull
Example: The traveling-salesman polytope
The problem of minimizing a linear function $c^T x$ over $P$ is polynomially solvable with the ellipsoid method, since the above system of inequalities can be checked in polynomial time (the last group can be checked by reduction to a minimum-cut problem).
So, if NP $\ne$ co-NP, by Boyd and Pulleyblank's result (slide 106), there is no fixed $t$ such that $P^{(t)} = P_I$ for each graph $G$.
The above system has been useful in solving large instances of the traveling salesman problem: for every $c \in \mathbb{Q}^E$, the minimum of $c^T x$ over $P$ is a lower bound for TSP, which can be computed with the simplex method using a row-generating technique.
This lower bound can be used in a branch-and-bound procedure for solving TSP.
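To make the three constraint groups of $P$ concrete, here is a minimal Python sketch of a membership test. It is an illustration only: the function name `violated_constraint` is hypothetical, and the cut constraints are checked by enumerating all sets $U$ (exponential), not by the minimum-cut reduction mentioned above.

```python
from itertools import combinations


def violated_constraint(n, x, tol=1e-9):
    """Return a constraint of P violated by x, or None if x lies in P.

    Vertices are 0..n-1 and x maps frozenset({u, v}) to the value x_e.
    """
    # Bounds 0 <= x_e <= 1.
    for e, val in x.items():
        if val < -tol or val > 1 + tol:
            return ("bound", tuple(sorted(e)))
    # Degree equations: the edges at each vertex sum to 2.
    for v in range(n):
        if abs(sum(val for e, val in x.items() if v in e) - 2) > tol:
            return ("degree", v)
    # Cut constraints for 3 <= |U| <= n - 1 (brute force over all U).
    for size in range(3, n):
        for U in combinations(range(n), size):
            Uset = set(U)
            cut = sum(val for e, val in x.items() if len(e & Uset) == 1)
            if cut < 2 - tol:
                return ("cut", U)
    return None


def edge_vector(edges):
    return {frozenset(e): 1 for e in edges}


# Two disjoint triangles satisfy the bounds and degree equations only.
two_triangles = edge_vector([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])
# A Hamiltonian circuit on six vertices satisfies every constraint.
hamiltonian = edge_vector([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])

print(violated_constraint(6, two_triangles))   # a violated cut constraint
print(violated_constraint(6, hamiltonian))     # None
```

In a row-generating scheme one would start from the bounds and degree equations only and add a cut constraint each time such a test reports a violation.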
112 / 143

Miscellanea
The matching polytope of a bipartite graph – revised
If $G = (V, E)$ is a bipartite graph, $c : E \to \mathbb{R}_+$, and $A$ is the incidence matrix of $G$, we proved (slide 42), using the HK theorem, the (simple) fact:
$$\text{maximum weight of a matching} = \max\{c^T x \mid x \ge 0,\ Ax \le \mathbf{1}\}.$$
Solving the right-hand problem via linear programming yields a polynomial-time algorithm for the left-hand one.
Indeed, let $x$ be an optimal solution to the above LP. If $x$ is integral, then we are done. If not, we describe a procedure that yields another optimal solution with strictly more integer coordinates than $x$. We then reach an integral optimal solution after at most $|E|$ repetitions of this procedure.
113 / 143

Miscellanea
The matching polytope of a bipartite graph – revised
Let $H$ be the subgraph of $G$ spanned by $\{e \in E \mid x_e \notin \{0, 1\}\}$.
Case 1. $H$ contains a cycle $C = (v_1, v_2, \ldots, v_k, v_1)$. Since $H$ is bipartite, $C$ is even.
Let $\varepsilon = \min_{e \in E(C)} \min\{x_e, 1 - x_e\}$. Let $x'$ and $x''$ be:
$$x'_e = \begin{cases} x_e - \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k - 1,\ i \text{ odd}, \\ x_e + \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k,\ i \text{ even}, \\ x_e & \text{if } e \in E - E(C), \end{cases}
\qquad
x''_e = \begin{cases} x_e + \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k - 1,\ i \text{ odd}, \\ x_e - \varepsilon & \text{if } e = v_i v_{i+1},\ 1 \le i \le k,\ i \text{ even}, \\ x_e & \text{if } e \in E - E(C), \end{cases}$$
where the indices are taken modulo $k$. These are two feasible solutions to the LP. Moreover,
$$\sum_{e \in E} x_e c_e = \frac{1}{2}\left(\sum_{e \in E} x'_e c_e + \sum_{e \in E} x''_e c_e\right).$$
Thus $x'$ and $x''$ are also optimal solutions and, by the choice of $\varepsilon$, one of these two solutions has more integer coordinates than $x$.
114 / 143

Miscellanea
The matching polytope of a bipartite graph – revised
Case 2. $H$ has no cycle. Consider a longest path $P = (v_1, v_2, \ldots, v_k)$ in $H$.
Observe that if $e$ is an edge incident to $v_1$ (resp. $v_k$) and different from $v_1 v_2$ (resp. $v_{k-1} v_k$), then $x_e = 0$: a fractional $x_e$ would put $e$ in $H$, so that $H$ would contain either a cycle or a longer path, and $x_e = 1$ would violate the constraint at $v_1$ (resp. $v_k$).
Let $\varepsilon = \min_{e \in E(P)} \min\{x_e, 1 - x_e\}$. Defining $x'$ and $x''$ similarly as above, we obtain two admissible solutions to the LP.
If $P$ is odd, then the value of $c^T x'$ is greater than $c^T x$, which contradicts the optimality of $x$.
If $P$ is even, both $x'$ and $x''$ are also optimal solutions and, by the choice of $\varepsilon$, one of these two solutions has more integer coordinates than $x$.
115 / 143

Miscellanea
The fuzzy polytope of a digraph
Let $D = (V, E)$ be a digraph with $V = \{1, \ldots, n\}$. The fuzzy polytope of $D$, $F(D)$, is the polytope described by
$$0 \le x_v \le 1 \ \ \forall v \in V, \qquad x_v - x_w \ge 0 \ \ \forall (v, w) \in E.$$
$A \subseteq V$, $A \ne \emptyset$, is called comprehensive from above (cfa) in $D$ if $\delta^-(A) = \emptyset$ (if $w \in A$ and $(v, w) \in E$ then $v \in A$).
$\mathcal{C}(D) := \{A \mid A \subseteq V,\ A \text{ cfa in } D\}$.
Theorem (Brink, Laan, Vasil'ev, 2004). $\mathrm{convhull}\{\chi^A \mid A \in \mathcal{C}(D)\} = F(D)$.
Proof. Clearly, the integer vectors in $F(D)$ are exactly the incidence vectors of cfa sets. Hence, if we prove that the vertices of $F(D)$ are integer vectors, the theorem holds. Let $a$ be a vertex of $F(D)$. Suppose that $\mathrm{Frac}(a) = \{v \in V \mid 0 < a_v < 1\} \ne \emptyset$.
116 / 143

Miscellanea
The fuzzy polytope of a digraph
Take $\mu = \max\{a_v \mid v \in \mathrm{Frac}(a)\}$, $\mathrm{Frac}_\mu(a) = \{v \in \mathrm{Frac}(a) \mid a_v = \mu\}$, and
$$\nu = \begin{cases} \max\{a_v \mid v \in \mathrm{Frac}(a) - \mathrm{Frac}_\mu(a)\} & \text{if } \mathrm{Frac}(a) \ne \mathrm{Frac}_\mu(a), \\ 0 & \text{if } \mathrm{Frac}(a) = \mathrm{Frac}_\mu(a). \end{cases}$$
Fix some $\delta \in (0, 1)$ s.t. $\mu + \delta,\ \mu - \delta \in (\nu, 1)$, and let $a', a'' \in \mathbb{R}^n$ be:
$$a'_v = \begin{cases} \mu + \delta & \text{if } v \in \mathrm{Frac}_\mu(a), \\ a_v & \text{if } v \in V - \mathrm{Frac}_\mu(a), \end{cases}
\qquad
a''_v = \begin{cases} \mu - \delta & \text{if } v \in \mathrm{Frac}_\mu(a), \\ a_v & \text{if } v \in V - \mathrm{Frac}_\mu(a). \end{cases}$$
By construction, $a'$ and $a''$ satisfy the first group of inequalities describing $F(D)$. Since $a'_v \le a'_w \Leftrightarrow a''_v \le a''_w \Leftrightarrow a_v \le a_w$, and $a \in F(D)$, it follows that the second group of inequalities is satisfied by $a'$ and $a''$.
Hence, $a', a'' \in F(D)$, $a' \ne a''$, and $a = \frac{1}{2} a' + \frac{1}{2} a''$, contradicting the hypothesis that $a$ is a vertex of $F(D)$.
117 / 143

Miscellanea
The fuzzy polytope of a digraph
Example. Let $D = (\{0, 1, \ldots, n\}, \{(i, 0) \mid i = 1, \ldots, n\})$. Then $\mathcal{C}(D) = \{A \mid \emptyset \ne A \subseteq \{1, \ldots, n\}\} \cup \{\{0, 1, \ldots, n\}\}$. Hence $|\mathcal{C}| = 2^n$.
Figure: $n = 4$: digraph on the vertices 0, 1, 2, 3, 4 with $n + 1$ vertices and $2^n$ cfa sets.
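The count $|\mathcal{C}(D)| = 2^n$ in this example can be confirmed by direct enumeration. The Python sketch below is an illustration added here, not taken from the slides; `cfa_sets` is a hypothetical helper that tests the defining condition "if $w \in A$ and $(v, w) \in E$ then $v \in A$" on every nonempty subset.

```python
from itertools import combinations


def cfa_sets(vertices, arcs):
    """All nonempty sets A that no arc enters (comprehensive from above)."""
    result = []
    for r in range(1, len(vertices) + 1):
        for A in combinations(vertices, r):
            S = set(A)
            # Every arc (v, w) whose head w is in S must have its tail v in S.
            if all(v in S for (v, w) in arcs if w in S):
                result.append(frozenset(S))
    return result


n = 4
vertices = list(range(n + 1))
arcs = [(i, 0) for i in range(1, n + 1)]   # the digraph of the example
family = cfa_sets(vertices, arcs)
print(len(family))   # 16 = 2**4
```

Only the full vertex set contains vertex 0, because every arc points into 0; the remaining $2^n - 1$ sets are the nonempty subsets of $\{1, \ldots, n\}$.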
118 / 143

Miscellanea
The sharing polytope of a digraph
Let $D = (V, E)$ be a digraph with $V = \{1, \ldots, n\}$. The sharing polytope of $D$, $Sh(D)$, is the polytope described by
$$x_v \ge 0 \ \ \forall v \in V, \qquad \sum_{v \in V} x_v = 1, \qquad x_v - x_w \ge 0 \ \ \forall (v, w) \in E.$$
$A \subseteq V$ is called connected comprehensive from above (ccfa) in $D$ if $A \ne \emptyset$, $\delta^-(A) = \emptyset$, and the subdigraph induced by $A$ in $D$ is connected.
$\mathcal{CC}(D) := \{A \mid A \subseteq V,\ A \text{ ccfa in } D\}$.
Thm. (Brink, Laan, Vasil'ev, 2004) $\mathrm{convhull}\{\frac{1}{|A|}\chi^A \mid A \in \mathcal{CC}(D)\} = Sh(D)$.
Proof. (1) If $A \in \mathcal{C}(D)$ then $x^A = \frac{1}{|A|}\chi^A \in Sh(D)$. Indeed, $x^A_v \ge 0$ for each $v \in V$, and $\sum_{v \in V} x^A_v = \frac{1}{|A|}\sum_{v \in A} 1 = 1$. Let $(v, w) \in E$. If $|\{v, w\} \cap A| \ne 1$, then $x^A_v = x^A_w$ and the third restriction is satisfied for $(v, w)$. If $v \in A$ and $w \notin A$, then $x^A_v = \frac{1}{|A|} > 0 = x^A_w$. Since $A$ is cfa, these are all possible cases.
119 / 143

Miscellanea
The sharing polytope of a digraph
(2) If $A$ induces a connected subdigraph in $D$ and $x^A = \frac{1}{|A|}\chi^A \in Sh(D)$, then $x^A$ is a vertex of $Sh(D)$.
Indeed, suppose that there are $y, z \in Sh(D)$ such that $x^A = \frac{1}{2} y + \frac{1}{2} z$. We show that $y = z = x^A$.
For $v \in V - A$, we have $x^A_v = 0$ and, since $y_v, z_v \ge 0$, from $x^A_v = \frac{1}{2} y_v + \frac{1}{2} z_v = 0$ we deduce $x^A_v = y_v = z_v = 0$ for all $v \in V - A$.
For $v, w \in A$ and $(v, w) \in E$, since $y$ and $z$ are from $Sh(D)$ we have $y_v \ge y_w$ and $z_v \ge z_w$. As
$$\frac{1}{|A|} = x^A_v = \frac{1}{2} y_v + \frac{1}{2} z_v \ge \frac{1}{2} y_w + \frac{1}{2} z_w = x^A_w = \frac{1}{|A|},$$
it follows that $y_v = y_w$ and $z_v = z_w$. By the connectivity of the subdigraph induced by $A$ in $D$, we have $y_v = y_w$ and $z_v = z_w$ for all $v, w \in A$, and hence $x^A_v = y_v = z_v = \frac{1}{|A|}$ for all $v \in A$.
(3) If $x$ is a vertex of $Sh(D)$, then there exists $A \in \mathcal{CC}(D)$ such that $x = \frac{1}{|A|}\chi^A$.
Let $A^x = \{v \in V \mid x_v > 0\}$. Since $\sum_{v \in V} x_v = 1$, it follows that $A^x \ne \emptyset$.
120 / 143

Miscellanea
The sharing polytope of a digraph
(3.1) $A^x \in \mathcal{CC}(D)$. Indeed, $A^x \in \mathcal{C}(D)$ is simply: if $w \in A^x$ and $(v, w) \in E$ then $x_v \ge x_w > 0$, hence $v \in A^x$. To prove that $A^x$ induces a connected subdigraph in $D$, suppose that there are at least 2 connected components in $[A^x]_D$: $A^x_1, \ldots, A^x_m$ ($m \ge 2$).
Clearly, each $A^x_k$ is cfa, because $A^x$ is cfa. For $k = 1, \ldots, m$, let $x^k$ be given by
$$x^k_v = \begin{cases} x_v & \text{if } v \in A^x_k, \\ 0 & \text{otherwise}, \end{cases}$$
and define $\lambda_k = \sum_{v \in A^x_k} x^k_v$. We have $\lambda_k > 0$ for all $k \in \{1, \ldots, m\}$ and $\sum_{k=1}^{m} \lambda_k = 1$.
Let $\bar{x}^k = \frac{1}{\lambda_k} x^k$, for $k = 1, \ldots, m$. Clearly $\bar{x}^k \ne \bar{x}^l$ for all $k \ne l$ in $\{1, \ldots, m\}$ and $x = \sum_{k=1}^{m} \lambda_k \bar{x}^k$. If we prove that $\bar{x}^k \in Sh(D)$ for every $k$, we contradict the assumption that $x$ is a vertex of $Sh(D)$.
We verify the last group of inequalities in the definition of $Sh(D)$. Let $(v, w) \in E$ and $k \in \{1, \ldots, m\}$. If $w \in A^x_k$ then, since $A^x_k$ is cfa, $v \in A^x_k$, hence $\bar{x}^k_v = \frac{1}{\lambda_k} x^k_v \ge \frac{1}{\lambda_k} x^k_w = \bar{x}^k_w$. If $w \notin A^x_k$ then $\bar{x}^k_v \ge 0 = \bar{x}^k_w$.
121 / 143

Miscellanea
The sharing polytope of a digraph
(3.2) $x_v = x_w$ for all $v, w \in A^x$.
Suppose $\alpha < \beta$, where $\alpha = \min\{x_v \mid v \in A^x\}$ and $\beta = \max\{x_v \mid v \in A^x\}$. Let $x'$ and $x''$ be defined by
$$x'_v = \begin{cases} \alpha & \text{if } v \in A^x, \\ 0 & \text{if } v \in V - A^x, \end{cases}
\qquad
x''_v = \begin{cases} x_v - \alpha & \text{if } v \in A^x, \\ 0 & \text{if } v \in V - A^x. \end{cases}$$
Clearly, $x', x'' \ge 0$ and $x = x' + x''$. Taking $\lambda' = \sum_{v \in A^x} x'_v$ and $\lambda'' = \sum_{v \in A^x} x''_v$, we obtain from $x \in Sh(D)$ that $\lambda', \lambda'' \in (0, 1)$ and $\lambda' + \lambda'' = 1$.
We have $x = \lambda' \bar{x}' + \lambda'' \bar{x}''$, where $\bar{x}' = \frac{1}{\lambda'} x'$ and $\bar{x}'' = \frac{1}{\lambda''} x''$. It is easy to verify that $\bar{x}' \ne \bar{x}''$ and that $\bar{x}', \bar{x}'' \in Sh(D)$, contradicting the hypothesis that $x$ is a vertex of $Sh(D)$.
From (3.2) and $\sum_{v \in V} x_v = 1$ it follows that $x_v = \frac{1}{|A^x|}$ for all $v \in A^x$. So, $x = \frac{1}{|A^x|}\chi^{A^x}$ and $A^x \in \mathcal{CC}(D)$.
122 / 143

Miscellanea
The sharing polytope of a digraph
Example. Let $D = (\{0, 1, \ldots, n\}, \{(i, 0) \mid i = 1, \ldots, n\})$. Then $\mathcal{CC}(D) = \{\{i\} \mid i \in \{1, \ldots, n\}\} \cup \{\{0, 1, \ldots, n\}\}$. Hence $|\mathcal{CC}| = n + 1$.
Figure: $n = 4$: digraph on the vertices 0, 1, 2, 3, 4 with $n + 1$ vertices and $n + 1$ ccfa sets.
123 / 143

Miscellanea
Covering a strongly connected digraph with directed cycles
Gallai (1964) conjectured an analogue of the Gallai-Milgram Theorem (every digraph $D$ can be covered by at most $\alpha(D)$ directed paths) for covering strongly connected (sc) digraphs with (directed) circuits. This was proved in 2004 by Bessy and Thomassé:
Theorem 1.
The vertex set of any sc digraph $D$ can be covered by $\alpha(D)$ circuits.
We present a min-max theorem established by Bondy, Charbit and Sebő, from which the above theorem follows easily.
If $D = (V, E)$ is a digraph, a cyclic order of $D$ is a cyclic order $O = (v_1, v_2, \ldots, v_n, v_1)$ of its vertex set.
The length of an arc $(v_i, v_j)$ of $D$ (w.r.t. a cyclic order $O$) is $j - i$ if $i < j$ and $n + j - i$ if $i > j$. Informally, the length of an arc is just the length of the segment of $O$ "jumped" by the arc.
If $C$ is a circuit of $D$, the sum $\sum_{a \text{ arc of } C} \mathrm{length}(a)$ is a multiple of $n$; the index of $C$ (w.r.t. $O$) is this multiple, $i(C) = \frac{1}{n}\sum_{a \text{ arc of } C} \mathrm{length}(a)$.
The index of a family $\mathcal{C}$ of circuits, denoted $i(\mathcal{C})$, is $i(\mathcal{C}) = \sum_{C \in \mathcal{C}} i(C)$.
124 / 143

Miscellanea
Covering a strongly connected digraph with directed cycles
A weighting of the vertices of $D$ is a function $w : V \to \mathbb{N}$. The weight of a subgraph $H$ of $D$ is $w(H) = \sum_{v \in V(H)} w(v)$.
The weighting $w$ is index-bounded (w.r.t. $O$) if $w(C) \le i(C)$ for every circuit $C$ of $D$. For any circuit covering $\mathcal{C}$ of $D$ and any index-bounded weighting $w$:
$$i(\mathcal{C}) \ge \sum_{C \in \mathcal{C}} w(C) \ge w(D).$$
Theorem 2. Let $D$ be a digraph in which each vertex lies in a circuit, and let $O$ be a cyclic order of $D$. Then:
$$\min i(\mathcal{C}) = \max w(D),$$
where the minimum is taken over all circuit coverings $\mathcal{C}$ of $D$ and the maximum over all index-bounded weightings $w$ of $D$.
125 / 143

Miscellanea
Covering a strongly connected digraph with directed cycles
In order to deduce Theorem 1 from Theorem 2, it suffices to apply it to a coherent cyclic order $O$ of $D$.
A cyclic order is coherent if every arc lies in a circuit of index one. Every sc digraph admits a coherent cyclic order. A fast algorithm to find a coherent cyclic order was given by Iwata and Matsuda (2008).
We then observe that:
• for every family $\mathcal{C}$ of directed cycles of $D$, we have $|\mathcal{C}| \le i(\mathcal{C})$, since every circuit has index at least one,
• because each vertex lies in a circuit and $O$ is coherent, an index-bounded weighting of $D$ is necessarily $\{0, 1\}$-valued,
• since each arc lies in a circuit of index one, in an index-bounded weighting $w$ no arc can join two vertices of weight 1, ⇒ the support of $w$ is a stable set, ⇒ $w(D) \le \alpha(D)$.
126 / 143

Miscellanea
Proof of Theorem 2
Let $D = (V, E)$ be a digraph, $V = \{v_1, \ldots, v_n\}$ and $E = \{a_1, \ldots, a_m\}$. It suffices to show that equality in Theorem 2 holds for some circuit covering $\mathcal{C}$ and some index-bounded weighting $w$.
An arc $(v_i, v_j)$ is called a forward arc of $D$ if $i < j$, and a reverse arc if $j < i$. Consider the matrix
$$Q := \begin{pmatrix} M \\ N \end{pmatrix},$$
where $M = (m_{ij})$ is the incidence matrix of $D$ and $N = (n_{ij})$ is the $n \times m$ matrix defined by
$$n_{ij} = \begin{cases} 1 & \text{if } v_i \text{ is the tail of } a_j, \\ 0 & \text{otherwise}. \end{cases}$$
$Q$ is totally unimodular. Indeed, let $\tilde{Q}$ be the matrix obtained from $Q$ by subtracting each row of $N$ from the corresponding row of $M$. Each column of $\tilde{Q}$ contains one 1 and one $-1$, the remaining entries being 0. Thus, $\tilde{Q}$ is totally unimodular. Because $\tilde{Q}$ was derived from $Q$ by elementary row operations, the matrix $Q$ is totally unimodular too.
127 / 143

Miscellanea
Proof of Theorem 2
Let $b \in \mathbb{R}^{2n}$ with
$$b_i := \begin{cases} 0 & \text{if } 1 \le i \le n, \\ 1 & \text{otherwise}, \end{cases}$$
and $c \in \mathbb{R}^m$ with
$$c_j := \begin{cases} 1 & \text{if } a_j \text{ is a reverse arc}, \\ 0 & \text{otherwise}. \end{cases}$$
Note that:
• If $x := f_C$ is the circulation associated with a circuit $C$, then $c^T x = i(C)$, the index of $C$.
• If $Nx \ge \mathbf{1}$, where $x := \sum\{\lambda_C f_C : C \in \mathcal{C}\}$ is a linear combination of circulations, then the family $\mathcal{C}$ of circuits of $D$ is a covering of $D$.
Consider the linear programme
(LP) $\min\{c^T x \mid x \ge 0,\ Qx \ge b\}$.
The system $Qx \ge b$ is equivalent to $Mx \ge 0$ and $Nx \ge \mathbf{1}$. Because the rows of $M$ sum to 0, the rows of $Mx$ sum to 0, ⇒ $Mx = 0$.
128 / 143

Miscellanea
Proof of Theorem 2
It follows that every feasible solution of the above (LP) is a non-negative circulation in $D$.
Hence, it is a non-negative linear combination $\sum \lambda_C f_C$ of circulations associated with circuits of $D$. Since $Nx \ge \mathbf{1}$, the circuits of positive weight in this sum form a covering of $D$.
Conversely, every circuit covering of $D$ yields a feasible solution to (LP). The linear programme (LP) is feasible because, by assumption, $D$ has at least one circuit covering, and it is bounded because $c$ is non-negative. Thus (LP) has an optimal solution.
Hence (LP) has an integral optimal solution, because $Q$ is totally unimodular and the constraints are integral. This solution corresponds to a circuit covering $\mathcal{C}$ of minimum index, the optimal value being $i(\mathcal{C})$.
129 / 143

Miscellanea
Proof of Theorem 2
Consider the dual of (LP): $\max\{b^T y \mid y \ge 0,\ y^T Q \le c^T\}$.
If $y^T := (z_1, \ldots, z_n, w_1, \ldots, w_n)$, then this is
(DLP) $\max \sum_{i=1}^{n} w_i$ subject to $y \ge 0$ and
$$z_i - z_k + w_i \le \begin{cases} 1 & \text{if } a_j = (v_i, v_k) \text{ is a reverse arc}, \\ 0 & \text{if } a_j = (v_i, v_k) \text{ is a forward arc}. \end{cases}$$
Consider an integral optimal solution to (DLP). If we sum the constraints over the arc set of a circuit $C$ of $D$, we obtain the inequality
$$\sum_{v_i \in V(C)} w_i \le i(C).$$
In other words, the function $w$ defined by $w(v_i) := w_i$, $1 \le i \le n$, is an index-bounded weighting, and the optimal value is the weight $w(D)$ of $D$. By the Duality Theorem, we have $i(\mathcal{C}) = w(D)$.
130 / 143

Miscellanea – Two-player Games
Preliminaries
A two-player normal-form game is specified via a pair $(R, C)$ of $m \times n$ payoff matrices.
The row player has $m$ pure strategies, which are in one-to-one correspondence with the rows of the payoff matrices.
The column player has $n$ pure strategies, which are in one-to-one correspondence with the columns of the payoff matrices.
If the row player plays strategy $i$ and the column player strategy $j$, then their respective payoffs are given by $R[i, j]$ and $C[i, j]$.
131 / 143

Miscellanea – Two-player Games
Mixed strategies
Players may also randomize over their strategies, leading to mixed (as opposed to pure) strategies.
Notation: $\Delta_p = \{x \in \mathbb{R}^p \mid x_i \ge 0\ \forall i = 1, \ldots, p, \text{ and } \sum_{i=1}^{p} x_i = 1\}$, the standard simplex in $\mathbb{R}^p$.
The set of mixed strategies available to the row player is $\Delta_m$ and the set available to the column player is $\Delta_n$.
For $x \in \Delta_m$ and $y \in \Delta_n$, the expected payoffs of the row and column player are respectively $x^T R y$ and $x^T C y$.
132 / 143

Miscellanea – Two-player Games
Nash Equilibrium
$(x, y) \in \Delta_m \times \Delta_n$ is said to be a Nash equilibrium if neither player can increase her expected payoff by unilaterally deviating from her strategy:
$$x^T R y \ge x'^T R y \ \ \forall x' \in \Delta_m, \qquad x^T C y \ge x^T C y' \ \ \forall y' \in \Delta_n.$$
Equivalently, the support of a player's strategy contains only pure strategies that maximize her payoff given the other player's mixed strategy:
$$x_i > 0 \Rightarrow e_i^T R y \ge e_j^T R y \ \ \forall j \in \{1, \ldots, m\}, \qquad y_j > 0 \Rightarrow x^T C e_j \ge x^T C e_i \ \ \forall i \in \{1, \ldots, n\}.$$
Equivalently, a player cannot improve her payoff by unilaterally switching to a pure strategy:
$$x^T R y \ge e_i^T R y \ \ \forall i \in \{1, \ldots, m\}, \qquad x^T C y \ge x^T C e_j \ \ \forall j \in \{1, \ldots, n\}.$$
133 / 143

Miscellanea – Two-player Zero-sum Games
LP Formulation
Definition. A two-player game is said to be zero-sum if $R + C = 0$, i.e. $R[i, j] + C[i, j] = 0$ for all $i \in \{1, \ldots, m\}$ and $j \in \{1, \ldots, n\}$.
If the row player is forced to announce her strategy in advance, she solves the following linear program, whose value equals the row player's payoff after the column player's optimal response to her strategy:
LP1: $\max\{z \mid x^T R \ge z \mathbf{1}^T,\ x \in \Delta_m\}$.
The maximum of $z$ is actually $\max_x \min_y x^T R y$.
Dual of LP1:
LP2: $\min\{z' \mid -y^T R^T \ge -z' \mathbf{1}^T,\ y \in \Delta_n\}$.
Taking $z'' = -z'$ and using the fact that $C = -R$, we can change LP2 into:
LP3: $\max\{z'' \mid Cy \ge z'' \mathbf{1},\ y \in \Delta_n\}$.
The maximum of $z''$ is actually $\max_y \min_x x^T C y = -\min_y \max_x x^T R y$.
134 / 143

Miscellanea – Two-player Zero-sum Games
LP Duality
Theorem 1. If $(x, z)$ is optimal for LP1, and $(y, z'')$ is optimal for LP3, then $(x, y)$ is a Nash equilibrium of $(R, C)$.
Moreover, the payoffs of the row/column player in this Nash equilibrium are $z$ and $z'' = -z$ respectively.
Corollary 1. There exists a Nash equilibrium in every two-player zero-sum game.
Corollary 2. (The Minmax Theorem) $\max_x \min_y x^T R y = \min_y \max_x x^T R y$.
135 / 143

Miscellanea – Two-player Zero-sum Games
LP Duality
Theorem 2. If $(x, y)$ is a Nash equilibrium of $(R, C)$, then $(x, x^T R y)$ is an optimal solution of LP1, and $(y, -x^T C y)$ is an optimal solution of LP2.
Corollary 3. The equilibrium set $\{(x, y) \mid (x, y) \text{ is a Nash equilibrium of } (R, C)\}$ of a zero-sum game is convex.
Corollary 4. The payoff of the row player is equal in all Nash equilibria of a zero-sum game. Ditto for the column player.
Definition (Value of a Zero-Sum Game). If $(R, C)$ is a zero-sum game, then the value of the game is the unique payoff of the row player in all Nash equilibria of the game.
136 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Basic Idea (Brown 1951): The two players play the game in their heads, going through several rounds of speculation and counterspeculation as to how their opponents might react and how they would react in turn.
FP proceeds in rounds. In the first round, each player arbitrarily chooses one of her actions (row/column). In subsequent rounds, each player looks at the empirical frequency of play of her opponent in previous rounds, interprets it as a probability distribution, and myopically plays a pure best response against this distribution.
137 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. Let $(R, C)$ be a two-player game, $x \in \Delta_m$ a mixed strategy of the row player and $y \in \Delta_n$ a mixed strategy of the column player. The players' pure-strategy best responses to $y$ and $x$ are:
$$BR(y) = \{i \in \{1, \ldots, m\} \mid e_i^T R y \ge e_j^T R y \ \ \forall j \in \{1, \ldots, m\}\},$$
$$BR(x) = \{j \in \{1, \ldots, n\} \mid x^T C e_j \ge x^T C e_i \ \ \forall i \in \{1, \ldots, n\}\}.$$
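The best-response sets and the support characterization of a Nash equilibrium translate directly into a few lines of Python. The sketch below is an added illustration under stated assumptions: the function names are hypothetical, strategies are 0-indexed, and exact rational arithmetic is used so that ties are detected reliably.

```python
from fractions import Fraction


def row_payoffs(R, y):
    """The vector (e_i^T R y)_i of row payoffs against the mixed strategy y."""
    return [sum(R[i][j] * y[j] for j in range(len(y))) for i in range(len(R))]


def col_payoffs(C, x):
    """The vector (x^T C e_j)_j of column payoffs against the mixed strategy x."""
    return [sum(x[i] * C[i][j] for i in range(len(x))) for j in range(len(C[0]))]


def best_responses(payoffs):
    """Indices of the pure strategies attaining the maximum payoff."""
    best = max(payoffs)
    return {k for k, p in enumerate(payoffs) if p == best}


def is_nash(R, C, x, y):
    """Support test: every pure strategy played with positive probability
    must be a best response to the opponent's mixed strategy."""
    br_row = best_responses(row_payoffs(R, y))
    br_col = best_responses(col_payoffs(C, x))
    return (all(i in br_row for i, p in enumerate(x) if p > 0)
            and all(j in br_col for j, p in enumerate(y) if p > 0))


# Matching pennies, a zero-sum game (C = -R), used here as a test case.
R = [[1, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
half = Fraction(1, 2)

print(is_nash(R, C, [half, half], [half, half]))   # True: uniform play
print(is_nash(R, C, [1, 0], [1, 0]))               # False: column deviates
```

Matching pennies is chosen because its only equilibrium is mixed, so the pure profile fails the test while the uniform profile passes it.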
138 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. The sequence $(i_t, j_t)_{t \in \mathbb{N}}$ is a simultaneous fictitious play process (SFP process) for the game $(R, C)$ if $(i_1, j_1) \in \{1, \ldots, m\} \times \{1, \ldots, n\}$ and for all $t \in \mathbb{N}$, $i_{t+1} \in BR(y_t)$ and $j_{t+1} \in BR(x_t)$, where the mixed strategies $x_t$ and $y_t$ (called beliefs) are given by
$$x_t = \frac{1}{t}\sum_{s=1}^{t} e_{i_s} \quad \text{and} \quad y_t = \frac{1}{t}\sum_{s=1}^{t} e_{j_s}.$$
139 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. The sequence $(i_t, j_t)_{t \in \mathbb{N}}$ is an alternating fictitious play process (AFP process) for the game $(R, C)$ if $i_1 \in \{1, \ldots, m\}$ and for all $t \in \mathbb{N}$, $i_{t+1} \in BR(y_t)$ and $j_t \in BR(x_t)$, where the mixed strategies $x_t$ and $y_t$ (called beliefs) are given by
$$x_t = \frac{1}{t}\sum_{s=1}^{t} e_{i_s} \quad \text{and} \quad y_t = \frac{1}{t}\sum_{s=1}^{t} e_{j_s}.$$
140 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Remarks.
Beliefs can be updated recursively. The belief of a player in round $t + 1$ is a convex combination of her belief in round $t$ and the pure strategy played in round $t + 1$:
$$x_{t+1} = \frac{t}{t+1} x_t + \frac{1}{t+1} e_{i_{t+1}} \quad \text{and} \quad y_{t+1} = \frac{t}{t+1} y_t + \frac{1}{t+1} e_{j_{t+1}}.$$
If a fictitious play process (AFP or SFP) converges, it must be constant from some stage on, implying that the limit is a pure Nash equilibrium.
Even if the process does not converge, if the beliefs converge, then the limit must be a Nash equilibrium (which need not be pure, however).
141 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition. The sequence $(x_t, y_t)_{t \in \mathbb{N}^*}$ is called a learning sequence.
A learning sequence of a game is said to converge if for some Nash equilibrium $(x, y)$ of the game, $\lim_{t \to \infty} (x_t, y_t) = (x, y)$.
We then say that FP converges if every learning sequence converges to a Nash equilibrium.
Theorem (J. Robinson, 1951). FP converges for every two-player zero-sum game.
142 / 143

Miscellanea – Two-player Zero-sum Games
Fictitious Play (FP)
Definition.
A two-player zero-sum game is called symmetric if $R$ and $C$ are square skew-symmetric matrices with $R^T = -R = C$.
In symmetric games, both players have the same set of actions.
Example:
$$R = \begin{pmatrix} 0 & -1 & -\varepsilon \\ 1 & 0 & -\varepsilon \\ \varepsilon & \varepsilon & 0 \end{pmatrix}$$
Theorem (F. Brandt, F. Fischer, P. Harrenstein, 2011). In symmetric two-player constant-sum games, FP may require exponentially many rounds (in the size of the representation of the game) before an equilibrium action is eventually played.
Proof. Taking $\varepsilon = 2^{-k}$ (for $k \in \mathbb{N}^*$) in the matrix above, it is proved that FP may take $2^k$ rounds before either player plays the pure strategy $e_3$ ($(e_3, e_3)$ is the unique Nash equilibrium of the game).
143 / 143
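The exponential delay can be observed in a small simulation. The Python sketch below is an added illustration under explicit assumptions, not the authors' construction: it takes the $3 \times 3$ matrix $R$ with rows $(0, -1, -\varepsilon)$, $(1, 0, -\varepsilon)$, $(\varepsilon, \varepsilon, 0)$ as the example game, runs simultaneous FP with both players starting from the first action, and breaks ties in favour of the lowest index. Because the game is symmetric ($C = R^T$), both players then play identical sequences, so a single vector of counts suffices. Actions are 0-indexed, so $e_3$ is action 2.

```python
from fractions import Fraction


def game(k):
    """Assumed example matrix with epsilon = 2**-k (exact rationals)."""
    eps = Fraction(1, 2 ** k)
    return [[0, -1, -eps],
            [1, 0, -eps],
            [eps, eps, 0]]


def first_round_playing(R, action, start, max_rounds):
    """Round (1-based) in which `action` is first played, or None.

    Simultaneous FP on a symmetric zero-sum game where both players start
    with `start` and use the same tie-breaking rule (lowest index wins).
    """
    m = len(R)
    counts = [0] * m      # how often the opponent has played each action
    current = start
    for rnd in range(1, max_rounds + 1):
        if current == action:
            return rnd
        counts[current] += 1
        # A best response to the empirical frequencies counts / rnd is a
        # best response to the raw counts; scaling does not change argmax.
        payoffs = [sum(R[i][j] * counts[j] for j in range(m)) for i in range(m)]
        current = max(range(m), key=lambda i: (payoffs[i], -i))
    return None


for k in range(1, 7):
    print(k, first_round_playing(game(k), action=2, start=0, max_rounds=10 ** 4))
```

Under these particular choices the equilibrium action first appears in round $2^k + 2$; other starting actions or tie-breaking rules may give different counts.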