Exact and Approximate Algorithms for New Variants of Some Classic
Tel Aviv University
The Raymond and Beverly Sackler Faculty of Exact Sciences
The Blavatnik School of Computer Science

Exact and Approximate Algorithms for New Variants of Some Classic Graph Problems

Thesis submitted for the degree of Doctor of Philosophy by Amitai Armon
Under the supervision of Prof. Uri Zwick
Submitted to the Senate of Tel-Aviv University
September 2008

To my mother and to my grandmother Belina

Acknowledgment

To my advisor Uri Zwick, for his successful guidance of this research, and for sharing with me his optimistic approach to settling algorithmic problems. To Amos Fiat, Micha Sharir, Vera Asodi, Eyal Even-Dar and Ben Sandbank, for long collaborations in fundamental course teaching, which were very pleasant. To Adi Avidor and Oded Schwartz, for a fruitful research collaboration and a joyful joint work. To Amir Epstein and Ido Tzameret, for being great partners to share an office with. And to my mother and my grandmother Belina, for all the love and support, and for making me who I am.

Abstract

Graphs are probably the most studied object in theoretical computer-science, and many graph problems, such as Shortest-Paths, Max-Flow and the Traveling Salesperson Problem, have been studied for many decades. Efficient algorithms have been developed for some of these problems, while others were proven to be hard to solve. For many of the latter problems, efficient algorithms which find an approximate solution have been developed. In this work we consider new variants of three well-studied fundamental graph problems: Global Min-Cut, The Traveling Salesperson Problem and Facility Location. For each of these problems, the variants we consider generalize or extend the respective problem, by considering additional goals, options or constraints, arising either from real-life scenarios or from previous theoretic studies.
In our study of these new variants, we also answer some open questions posed regarding previously introduced special cases, and improve some previous results.

In Chapter 2 we consider multicriteria versions of the Global Min-Cut problem. In the k-criteria setting of Global Min-Cut, each edge of the input graph has k non-negative costs associated with it. These costs are measured in separate, non-interchangeable units. In the AND-version of the problem, purchasing an edge requires the payment of all the k costs associated with it. In the OR-version, an edge can be purchased by paying any one of the k costs associated with it. Given k bounds $b_1, b_2, \ldots, b_k$, the basic multicriteria decision problem is whether there exists a cut C of the graph that can be purchased using a budget of $b_i$ units of the i-th criterion, for $1 \le i \le k$. We show that the AND-version can be solved in polynomial time for any fixed number k of criteria, and that it is NP-hard for non-fixed k. Our results may be somewhat surprising, since Papadimitriou and Yannakakis [PY00] proved that bicriteria s-t-Min-Cut is strongly NP-hard. Our work resolves an open question of Bruglieri et al. [BEH00, BME04], who asked whether a cardinality-constrained variant of Global Min-Cut, in which the number of edges connecting the two subsets is limited by some input number, can be solved in polynomial time. We answer their question in the affirmative. Regarding the OR-version of the problem, on the other hand, we show NP-hardness even for k = 2, and prove that the problem can be solved in pseudo-polynomial time for any fixed number k of criteria. It also admits an FPTAS (a fully-polynomial-time approximation scheme). Further extensions and applications, as well as multicriteria OR-versions of two other optimization problems, are also considered. As far as we know, we are the first to consider OR-versions of multicriteria problems. This chapter is based on the paper [AZ06].
In Chapter 3 we consider cooperative variants of The Traveling Salesperson Problem (TSP). In these problems a salesperson has to make deliveries to customers who are willing to help in the process. The basic motivation for these variants is that in many realistic scenarios the “customers” are actually other members of the same organization/company the salesperson belongs to, and thus can be asked to help. The customers may be able to cooperate in several modes: They may assist by approaching the salesperson to receive the goods, by delivering goods that they received to other customers, or by doing both. Several objectives may be of interest: Minimizing the total distance traveled by all the participants, minimizing the maximal distance traveled by a participant, or minimizing the total time until all the deliveries are made. All the combinations of cooperation-modes and objective functions are considered in our study, both in weighted undirected graphs and in Euclidean space. We show that most of the problems we consider have a constant approximation algorithm, many of the others admit a PTAS, and a few are solvable in polynomial time. On the intractability side, we provide NP-hardness proofs and inapproximability factors, some of which are tight. All our algorithms are purely combinatorial, and our hardness proofs use reductions from well-known NP-hard problems, without requiring the use of the PCP theorem. This chapter is based on the paper [AAS06]. Chapter 4 considers a min-max version of the previously studied r-gathering problem with unit-demands. The problem we consider is a metric facility-location problem, in which each open facility must serve at least r customers, and the maximum of all the facility and connection costs should be minimized (rather than their sum). 
This problem is motivated by scenarios in which r customers are required for a facility to be worth opening, and the costs represent the time until the facility/connection will be available (i.e., we want to have the complete solution ready as soon as possible). We present a 3-approximation algorithm for this problem, and prove that it cannot be approximated better (assuming $P \neq NP$). Next we consider this problem with the additional natural requirement that each customer will be assigned to a nearest open facility, and present a 9-approximation algorithm. We further consider previously introduced special cases and variants, and obtain improved algorithmic and hardness results. The results of this chapter are based on the paper [Arm08].

Contents

1 Introduction
  1.1 Multicriteria Global Min-Cut
    1.1.1 Our Contribution
  1.2 Cooperative TSP
    1.2.1 Our Contribution
  1.3 Min-Max r-Gatherings
    1.3.1 Our contribution
  1.4 General Preliminaries
    1.4.1 Approximations
    1.4.2 Gap-Problems
2 Multicriteria Global Min-Cut
  2.1 Introduction
  2.2 Multicriteria global minimum cut: the AND-version
    2.2.1 The Min-Max problem
    2.2.2 The decision problem
    2.2.3 The optimization problem
    2.2.4 Two applications
    2.2.5 The Pareto Set
    2.2.6 Hardness results
  2.3 Multicriteria global minimum cut: the OR-version
    2.3.1 Relation to scheduling on unrelated machines
    2.3.2 The min-max version
    2.3.3 A case that can be solved in polynomial time
  2.4 OR-versions of other multicriteria problems
    2.4.1 Shortest paths
    2.4.2 Minimum spanning trees
  2.5 Concluding remarks
3 Cooperative TSP
  3.1 Introduction
    3.1.1 Related Studies
    3.1.2 Our Contribution
  3.2 Euclidean cTSP
    3.2.1 Min-Sum Euclidean-cTSP
    3.2.2 Min-Max Euclidean-cTSP
    3.2.3 Min-Makespan Euclidean-cTSP
  3.3 cTSP in Graphs
    3.3.1 Min-Sum cTSP
    3.3.2 Min-Max cTSP
    3.3.3 Min-Makespan cTSP
  3.4 Discussion and Open Problems
4 On unweighted r-Gatherings
  4.1 Introduction
    4.1.1 Our results
  4.2 Problem Definitions and Notations
  4.3 Approximating Min-Max r-Gathering
  4.4 Assigning to a Nearest Open Facility
    4.4.1 Improved Results for r = 2
  4.5 Hardness Results
  4.6 Concluding Remarks and Open Problems
5 Concluding Remarks
Bibliography

Chapter 1 Introduction

Graphs are apparently the most studied object in theoretical Computer-Science, and many of the graph problems have been studied for many decades. A partial list of such problems includes the well-known Shortest-Paths problem, Minimum Spanning Tree, Max-Flow, Global Min-Cut, The Traveling Salesperson Problem, Minimum Steiner Tree, k-Center, k-Median, Facility-Location, and many more. A basic survey can be found in the fundamental textbook of Cormen et al. [CLRS01]. Efficient algorithms have been developed for some of these classical problems, while other problems were proven to be hard to solve. For many of the latter problems, efficient algorithms which find an approximate solution have been developed (see, e.g., [Vaz03] for a survey of many of these results). In parallel to the on-going study of classical graph problems, there are many studies on their special cases, generalizations and other variants.
Special cases often relate to special graphs or to limiting some input values (e.g., limiting them to integers or to a certain range). Generalizations sometimes involve, for example, considering directed rather than undirected graphs, or considering matroids rather than graphs. Other types of variants involve changing the goal function (sum/max/min/etc.), adding/removing a constraint, and/or considering a different space (e.g., the plane or another Euclidean space). Such variants were sometimes motivated by theoretic questions and sometimes evolved from real-life applications.

In this work we consider new variants of three well-studied fundamental graph problems: Min-Cut, The Traveling Salesperson Problem and Facility Location. For each of these problems, the variants we consider seem to provide a new insight on the respective problem, and include as special cases previously introduced variants. Our work answers some open questions posed regarding these special cases, improves some previous results, and expands the scope of the previous research.

1.1 Multicriteria Global Min-Cut

In Chapter 2 we consider multicriteria versions of the well-known Global Min-Cut problem. In Global Min-Cut, the vertices of an edge-weighted undirected graph should be split into two nonempty subsets. The goal is that the total weight (cost) of the edges connecting the two subsets will be minimal. Karger provided a near-linear ($O(m \log^3 n)$) Monte-Carlo algorithm for this problem [Kar00]. The fastest deterministic algorithms require $O(mn \log n)$ time [HO94, NI92, SW97], similarly to the fastest max-flow algorithms which solve the s-t-Min-Cut problem, the variant in which two pre-specified vertices must be in different subsets (see, e.g., [GT88]). Bruglieri et al. [BEH00, BME04] asked whether a cardinality-constrained variant of this problem, in which the number of edges connecting the two subsets is limited by some input number, can be solved in polynomial time.
This problem can be viewed as a special case of a bicriteria variant of Global Min-Cut, in which each edge has an additional type of weight (equal to 1 for all the edges in this special case), and the minimization of the first weight (criterion) should be done without exceeding the limit for the second weight (which is not interchangeable with the first one). There is a large body of works, mostly by the Operations Research community, on cardinality-constrained and multicriteria problems, which are usually NP-hard and require approximations (see, e.g., [EG00, Ehr00, Cli97]). Specifically related to this problem is the result of Papadimitriou and Yannakakis [PY00], who proved that bicriteria s-t-Min-Cut is strongly NP-hard.

1.1.1 Our Contribution

Quite surprisingly, we show that bicriteria Global Min-Cut can be solved in polynomial time. Thus we also resolve the question asked by Bruglieri et al. We extend our study by considering two more general multicriteria versions of Global Min-Cut. In the k-criteria setting, each edge of the input graph has k non-negative costs associated with it. These costs are measured in separate, non-interchangeable units. In the AND-version of the problem, purchasing an edge requires the payment of all the k costs associated with it. In the OR-version, an edge can be purchased by paying any one of the k costs associated with it. Given k bounds $b_1, b_2, \ldots, b_k$, the basic multicriteria decision problem is whether there exists a cut C of the graph that can be purchased using a budget of $b_i$ units of the i-th criterion, for $1 \le i \le k$. We show that the AND-version is in P for any fixed number k of criteria, and is NP-hard for non-fixed k. The OR-version of the problem, on the other hand, is NP-hard even for k = 2, but can be solved in pseudo-polynomial time for any fixed number k of criteria. It also admits
We provide similar results for the optimization versions of these two problems (minimizing the cost in one criterion given bounds on the others). Further extensions and applications, as well as multicriteria OR-versions of two other optimization problems, are also considered. As far as we know, we are the first to consider OR-versions of multicriteria problems. This chapter is based on the paper [AZ06]. 1.2 Cooperative TSP In Chapter 3 we consider cooperative variants of The Traveling Salesperson Problem (TSP). The input to TSP consists of a complete edge-weighted undirected graph, and the goal is to find a cycle in which each vertex appears exactly once, such that its total weight is minimal. The classical motivation is planning a tour for a salesperson, wishing to visit a set of cities/customers and return home, while minimizing the travel. This problem is known to be NP-hard [GGJ76], and cannot be approximated within any polynomial factor [Vaz03]. For the metric version, in which the weights satisfy the triangle inequality, Christofides presented a 3/2 approximation algorithm [Chr76], and this variant cannot be approximated within less than a factor of 131 130 , assuming P 6= N P [EK01]. In any fixed-dimension Euclidean space, the problem has a PTAS (a polynomial-time approximation scheme) [Aro98, Mit99], and is still NP-hard [Pap77]. There is also a 3/2 approximation algorithm for path-TSP [Hoo91], the variant in which the salesperson does not need to return home (i.e., one needs to find a path instead of a cycle, and one endpoint of the path is specified in the input). TSP has been studied in many forms over the decades. One of the interesting recent variants of TSP is the Freeze-Tag problem [ABF+ 02, SABM02, ABG+ 03, KLS04], presented by Arkin et al. [ABF+ 02], which is motivated by a swarm of robots scenario. 
In that scenario, an awake robot has to wake up a set of sleeping robots, and when a robot is awakened it can be instructed how to move in order to wake up other robots. The goal is to finish waking up all the robots as fast as possible (i.e., minimizing the “makespan”). Arkin et al. presented constant factor approximation algorithms for this problem on some special graphs, and proved that it cannot be approximated within less than a factor of 5/3 (assuming $P \neq NP$). Konemann et al. later presented an $O(\sqrt{\log n})$ approximation algorithm for the problem on general graphs [KLS04].

The Freeze-Tag problem can be viewed as a variant of TSP, in which the “customers” cooperate with the salesperson, by helping in the “sales” after they are reached. Such cooperativeness can occur in various other realistic scenarios, in which the customers are in fact “agents” of the same organization as the salesperson, and can be instructed to move in order to assist in “delivering the goods”. Clearly, there can be other forms of such a cooperation. For example, the customers might be able to move before they receive the goods, in order to approach the salesperson. In some scenarios, this type of cooperation might be relevant instead of (or in addition to) moving after receiving the goods. Also, various goals other than minimizing the makespan might be of interest. For example, in the sleeping robots scenario, if each robot has limited battery, then it might be interesting to minimize the maximal travel of any of the robots. Another interesting goal in some scenarios might be minimizing the total travel of all the participants (e.g., if their company pays all their travel costs). Clearly, just like in classical TSP, it might be appropriate to give up the requirement that the salesperson and customers will “return home” (their original location might be arbitrary). Also, such “cooperative” variants can be considered in a Euclidean space rather than a graph.
1.2.1 Our Contribution

As described in Chapter 3, we studied all the above mentioned combinations of cooperation-mode, goal-function, space and “roundtrip requirement”. We show that most of these problems have a constant-factor approximation algorithm, many of the others admit a PTAS, and a few are solvable in polynomial time. On the intractability side we provide NP-hardness proofs and inapproximability factors, some of which are tight. All our algorithms are purely combinatorial, and our hardness proofs use reductions from well-known NP-hard problems, without requiring the use of the PCP theorem. This chapter is based on the paper [AAS06].

1.3 Min-Max r-Gatherings

Chapter 4 considers the r-gathering problem, a recently introduced variant of the Facility Location problem. In the classical Facility Location, the input consists of a set of customers C and a set of potential facility locations F. Each potential location is associated with a cost for opening a facility there. Each pair of customer location and facility location is associated with a service-cost (for serving that customer by that facility). The problem is to choose locations for opening facilities, such that the sum of opening costs and service costs is minimal (each customer is served by the open facility that minimizes his service cost). This problem is often described by a bipartite graph, whose sides correspond to C and F (associated with costs) and whose edge weights are the respective service costs. Hochbaum [Hoc82] presented an $O(\log n)$ approximation for this problem. The problem cannot be approximated within an $o(\log n)$ factor [Arc00], unless $NP \subseteq DTIME(n^{O(\log \log n)})$. If the service costs are assumed to satisfy the triangle-inequality, then the problem can be approximated within a factor of 1.5 [Byr07], and cannot be approximated within less than a factor of 1.463, unless $NP \subseteq DTIME(n^{O(\log \log n)})$ [GK99].
Most of the research deals with the scenario in which the service-costs satisfy the triangle-inequality, called metric facility location. The costs are sometimes called distances in this scenario, and thus each customer is assigned to the nearest open facility. This problem has been extensively studied for many decades and has many well-studied variants (see, e.g., [Dre95, Vyg05] for surveys).

Some of the more recent variants of Facility Location consider an additional constraint on the number of customers in each facility. In the capacitated variant of the problem (see, e.g., [Vyg05]), there is an upper bound on the number of customers that each facility may serve. In the soft-capacities version, more than one facility may be opened at the same location (the location’s opening cost is thus multiplied by the number of facilities opened there). In the hard-capacities version, only one facility may be opened in each location. For soft capacities, there is a 2-approximation algorithm, which matches the integrality-gap of the linear relaxation of the problem [MYZ03]. For hard capacities, there is a 5.83-approximation algorithm [ZCY04]. The best lower bound is the lower bound of 1.463 for the uncapacitated version (unless $NP \subseteq DTIME(n^{O(\log \log n)})$) [ZCY04, GK99]. Note that unlike the classical facility location problem, a customer might not be served by the facility with minimal service cost, due to the constraint posed by the capacities.

One of the more recent variants of Facility Location that consider the number of customers served by each facility is the r-gathering problem [GMM00, KM00, Svi08]. This problem is an analogue of capacitated Facility Location, in which there is a lower bound of r on the number of customers served by each facility (rather than an upper bound). The basic motivation is that a certain number of customers is usually required to make a facility worth opening.
This problem can also be regarded as a clustering problem (without opening costs), in which small clusters are not desired (see [AFK+06]). Previous works also considered the case in which the lower bounds differ between different locations [GMM00, KM00]. Again, a customer might not be served by the facility with minimal service cost, due to the constraint posed by the lower bounds.

Karger and Minkoff [KM00] and Guha et al. [GMM00], who introduced the r-gathering problem, also provided a bicriteria approximation algorithm for it. Their algorithm guarantees that each facility will serve at least $\alpha r$ customers, at a cost of at most $\frac{1+\alpha}{1-\alpha}\beta$ times the optimal cost for r-gathering, where $\beta$ is the approximation ratio for Facility Location. This implies a $1.5(2r - 1 + \epsilon)$ single-criterion approximation (choosing $\alpha = \frac{r-1}{r} + \epsilon$). Svitkina [Svi08] has recently provided a constant (558) single-criterion approximation for the problem. The recent work of [AK07] proves that the problem is polynomial for r = 2 if there are no facility costs.

Aggarwal et al. [AFK+06] considered a special case of a min-max version of this problem, in which there are no opening costs, C = F (i.e., the customer locations are the potential facility locations), and the maximal service cost should be minimized. They called this special case, motivated by a clustering application, r-gather clustering. They provided a 2-approximation algorithm for this special case, and proved that it cannot be approximated better for $r \ge 7$ (assuming $P \neq NP$). For the generalization in which a certain fraction of the customers may be ignored (“outlier points”), they state that there is a 3-approximation algorithm.

1.3.1 Our contribution

We provide a 3-approximation algorithm for the general min-max version of r-gathering, and prove that it cannot be approximated better for any r > 2 (the recent work of [AK07] proves that the problem is polynomial for r = 2).
We also prove that r-gather clustering cannot be approximated within less than a factor of 2 for any r > 2, improving the result of [AFK+06]. Our algorithm has several extensions, among them a 3-approximation for the generalization in which a certain fraction of the customers may be ignored (thus solving a more general problem than [AFK+06], with the same approximation ratio). We also consider a variant in which each customer in the solution has to be assigned to the nearest open facility, which is a quite natural requirement. We provide a 9-approximation algorithm for this variant. For the original min-sum version of r-gathering, we provide a 2r-approximation for the special case in which there are no opening costs. This approximation ratio was also obtained by Lim et al. [LWX06] in parallel to our work, using a different algorithm. The results of this chapter are based on the paper [Arm08].

1.4 General Preliminaries

1.4.1 Approximations

Let O be a minimization problem, let ALG be an approximation algorithm for this problem, and let I be an instance of O. Denote by OPT(I) the optimal solution for I, and by ALG(I) the solution found by ALG applied to I. Denote by r the ratio of their values: $r = \frac{|ALG(I)|}{|OPT(I)|}$. Then we say that ALG is a c-approximation for O if for every input I, $r \le c$.

A Polynomial Time Approximation Scheme (PTAS) for a minimization problem O is an approximation algorithm ALG that, given an input instance I and a parameter $\varepsilon > 0$, computes a c-approximation for I in time t, where $c \le 1 + \varepsilon$ and $t = \mathrm{Poly}_{\varepsilon}(|I|)$. That is, for every fixed $\varepsilon > 0$ the running time of ALG is polynomial, but the dependency of the running time on $\varepsilon$ is not necessarily polynomial. If $t = \mathrm{Poly}(|I|, \frac{1}{\varepsilon})$ then we say that ALG is a Fully Polynomial Time Approximation Scheme (FPTAS). This work focuses on minimization problems; similar definitions can be made for maximization problems.
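To make the difference between the two kinds of schemes concrete, the following worked comparison uses two hypothetical running-time bounds. Both bounds are illustrative assumptions chosen for this example and do not describe any algorithm in this work.

```latex
% Two hypothetical running times for a (1 + \varepsilon)-approximation on an instance I:
t_1 = O\!\left(|I|^{1/\varepsilon}\right),
\qquad
t_2 = O\!\left(|I|^{2} \cdot \left(\tfrac{1}{\varepsilon}\right)^{3}\right).

% For a fixed value such as \varepsilon = 0.1 both are polynomial in |I|:
t_1 = O\!\left(|I|^{10}\right),
\qquad
t_2 = O\!\left(1000\,|I|^{2}\right).

% t_1 = \mathrm{Poly}_{\varepsilon}(|I|), but it is not polynomial in 1/\varepsilon,
% so it is the running time of a PTAS and does not qualify for an FPTAS.
% t_2 = \mathrm{Poly}(|I|, 1/\varepsilon), so it qualifies for an FPTAS.
```

Under these assumptions, halving $\varepsilon$ doubles the exponent of $|I|$ in $t_1$, whereas it only multiplies $t_2$ by a factor of 8.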
1.4.2 Gap-Problems

In order to prove inapproximability of an optimization problem, one usually defines a corresponding gap problem. Recall the following definition:

Definition 1.1 (Gap Problem - Minimization). Let O be a minimization problem. Gap-O-$[\alpha, \beta]$ is the following decision problem: Given an input instance, decide whether
• There exists a solution of cost at most $\alpha$, or
• Every solution of the given instance is of cost larger than $\beta$.
If the cost of the solution resides between these values, then any output suffices.

Clearly, if Gap-O-$[\alpha, \beta]$ is NP-hard, then it is NP-hard to approximate O to within any factor smaller than $\frac{\beta}{\alpha}$. Indeed, a c-approximation algorithm with $c < \frac{\beta}{\alpha}$ returns a solution of cost less than $\beta$ whenever a solution of cost at most $\alpha$ exists, and so it distinguishes between the two cases.

Chapter 2 Multicriteria Global Min-Cut

2.1 Introduction

We consider two multicriteria versions of the global minimum cut problem in undirected graphs. Let G = (V, E) be an undirected graph, and let $w_1, \ldots, w_k : E \to \mathbb{R}^+$ be k nonnegative cost (or weight) functions defined on its edges. A cut C of G is a subset $C \subseteq V$ such that $C \neq \emptyset$ and $C \neq V$. The edges cut by this cut are $E(C) = \{(u, v) \in E \mid u \in C, v \notin C\}$. (As the graph is undirected, C and $V - C$ define the same cut.) In the AND-version of the k-criteria problem, the i-th weight (or cost) of the cut is

i-th cost in the AND-version: $w_i(C) = \sum_{e \in E(C)} w_i(e), \quad 1 \le i \le k.$

In the OR-version of the problem we pay only one of the costs associated with each edge $e \in E(C)$ of the cut. More specifically, we choose a function $\alpha : E(C) \to \{1, 2, \ldots, k\}$ which specifies which cost is paid for each edge of the cut. The i-th cost of the cut C, with respect to the choice function $\alpha$, is then

i-th cost in the OR-version: $w_i(C, \alpha) = \sum_{e \in E(C) \,\wedge\, \alpha(e) = i} w_i(e), \quad 1 \le i \le k.$

For the AND-version, the basic multicriteria global minimum cut decision problem asks, given k cost bounds $b_1, b_2, \ldots, b_k$, is there a cut C such that $w_i(C) \le b_i$, for $1 \le i \le k$? In the optimization problem we are given $k - 1$ bounds $b_1, b_2, \ldots, b_{k-1}$ and asked to find a cut C for which $w_k(C)$ is minimized, subject to the constraints $w_i(C) \le b_i$, for $1 \le i \le k - 1$. The min-max version of this problem asks for a cut C for which $\max_{i=1}^{k} w_i(C)$ is minimized, i.e., a cut whose largest cost is as small as possible. The Pareto set $P(G, w_1, \ldots, w_k) \subseteq \mathbb{R}^k$ of an instance $\langle G, w_1, \ldots, w_k \rangle$ is the set of cost vectors of cuts that are not dominated by the cost vector of any other cut. It follows, therefore, that if $(c'_1, c'_2, \ldots, c'_k)$ is the cost vector of a cut $C'$ of the graph, then there exists a vector $(c_1, c_2, \ldots, c_k) \in P(G, w_1, \ldots, w_k)$ such that $c_i \le c'_i$ for $1 \le i \le k$. Corresponding definitions can be made for the OR-version of the problem, where $\alpha$ should then also be chosen.

Figure 2.1: A simple illustration for bicriteria min-cut. In the AND-version, the cost vector of the depicted cut is (4,9). In the OR-version, its cost with the chosen $\alpha$ is (1,2).

Multicriteria optimization is an active field of research (see, e.g., the books of Climacao [Cli97] and Ehrgott [Ehr00]). (Most research is focused, in our terminology, on AND-versions of various optimization problems. All results cited below refer to the AND-versions of the problems, unless stated otherwise.) Papadimitriou and Yannakakis [PY00] investigated the complexity of several multicriteria optimization problems. In particular, they considered the multicriteria s-t minimum cut problem, in which the cut must separate two specified vertices, s and t. They proved that this problem is strongly NP-complete, even for just two criteria. We show here that (the AND-version of) the multicriteria global minimum cut decision problem can be solved in polynomial time for any fixed number of criteria, making it strictly easier than its s-t variant.
The running time of our algorithm is $O(mn^{2k})$, where $m = |E|$ is the number of edges in the graph, $n = |V|$ is the number of vertices, and k is the number of criteria. This easily implies tractability of the optimization problem and also yields a pseudo-polynomial algorithm for constructing the Pareto set. The problem, however, becomes strongly NP-hard when the number of criteria is not fixed. We also show that the directed version of the problem is strongly NP-hard even for just two criteria.

The single-criterion minimum cut problem has been studied for more than four decades as a fundamental graph optimization problem (see, e.g., [GH94, NI92, KS96, SW97, Kar99, Kar00]). Minimum cuts are used in solving a large variety of problems, including VLSI design, network design and reliability, clustering, and more (see [Kar00] and references therein). The best known deterministic algorithms for this problem run in $O(mn + n^2 \log n)$ time (Nagamochi and Ibaraki [NI92] and Stoer and Wagner [SW97]). The best known randomized algorithm runs in $O(m \log^3 n)$ time (Karger [Kar00]). (As can be seen, there is a huge gap between the complexities of the deterministic and randomized algorithms!) These algorithms are faster than the best known algorithms for the s-t minimum cut problem, which are based on network flow.

The polynomial time algorithm for the multicriteria problem relies on the fact that the standard single criterion global minimum cut problem has only a polynomial number of almost optimal solutions. More specifically, Karger and Stein [KS96] showed that for every $\alpha \ge 1$, not necessarily integral, the number of $\alpha$-approximate solutions is only $O(n^{2\alpha})$. Karger [Kar00] improved this bound to $O(n^{\lfloor 2\alpha \rfloor})$. Nagamochi et al. [NNI97] gave a deterministic $O(m^2 n + mn^{2\alpha})$ time algorithm for finding all the $\alpha$-approximate cuts. Our algorithms for the multicriteria problem use their algorithm.
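The definitions of the AND-version cost vector, the decision problem and the Pareto set can be illustrated with a small brute-force sketch. It enumerates all $2^{n-1} - 1$ cuts of a tiny graph, so it runs in exponential time and is not the polynomial-time algorithm of this chapter. The function names and the 4-vertex example graph are hypothetical, introduced only for this illustration.

```python
from itertools import combinations


def cut_cost_vectors(n, edges):
    """Return {C: (w_1(C), ..., w_k(C))} for every cut C of an n-vertex graph.

    edges is a list of (u, v, costs), with vertices 0..n-1 and costs a k-tuple.
    Since C and V - C define the same cut, only sides containing vertex 0 are listed.
    """
    k = len(edges[0][2])
    vectors = {}
    for size in range(n - 1):  # at most n-2 extra vertices, so C != V
        for rest in combinations(range(1, n), size):
            side = frozenset((0,) + rest)
            cost = [0] * k
            for u, v, w in edges:
                if (u in side) != (v in side):  # the edge crosses the cut
                    for i in range(k):
                        cost[i] += w[i]  # AND-version: pay all k costs
            vectors[side] = tuple(cost)
    return vectors


def feasible_cut_exists(vectors, bounds):
    """Decision problem: is there a cut with w_i(C) <= b_i for every i?"""
    return any(all(c <= b for c, b in zip(vec, bounds))
               for vec in vectors.values())


def pareto_set(vectors):
    """Cost vectors that are not dominated by the cost vector of another cut."""
    vecs = set(vectors.values())

    def dominated(v):
        return any(u != v and all(a <= b for a, b in zip(u, v)) for u in vecs)

    return sorted(v for v in vecs if not dominated(v))


# Hypothetical bicriteria instance: a 4-cycle with two costs per edge.
edges = [(0, 1, (1, 4)), (1, 2, (3, 1)), (2, 3, (1, 5)), (3, 0, (2, 2))]
vectors = cut_cost_vectors(4, edges)
print(pareto_set(vectors))
print(feasible_cut_exists(vectors, (4, 5)), feasible_cut_exists(vectors, (3, 5)))
```

On this instance the seven cuts have cost vectors (3,6), (5,3), (7,12), (2,9), (3,7), (4,6) and (4,5), of which four are undominated.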
Apart from the theoretical interest in minimum cuts in the multicriteria setting, there are some applications in which this problem is of interest. (The multicriteria global minimum cut problem is of interest in almost any application of the single criterion global minimum cut problem.) A multicriteria minimum balanced-partition is required, for example, in the situations described in [SKK99].

A special case of the bicriteria global minimum cut problem, called the ≤ r-cardinality min-cut, was considered by Bruglieri et al. [BEH00, BME04]. The input to this problem is an undirected graph G = (V, E) with a single weight function $w : E \to \mathbb{R}^+$ defined on its edges. The goal is to find a cut of minimum cost that contains at most r edges. This is exactly the optimization version of the bicriteria minimum cut problem, where $w_1(e) = 1$, $w_2(e) = w(e)$, for every e ∈ E, and $w_1(C)$ must not exceed r. Bruglieri et al. [BEH00, BME04] ask whether this problem can be solved in polynomial time. We answer their question in the affirmative. We also obtain a polynomial time algorithm for finding a minimum cut which contains at most r vertices on the smaller side of the cut.

As mentioned, most research on multicriteria optimization focused, in our terminology, on AND-versions of various multicriteria optimization problems. We also consider here the OR-versions of the global minimum cut problem, the shortest path problem, and the minimum spanning tree problem.

OR-versions of multicriteria optimization problems may be seen as generalizations of the scheduling problem on unrelated machines (see [HS76, LST90, JP99]). The input to such a scheduling problem is a set of n jobs that should be scheduled on m machines. The i-th job has a cost vector $(c_{i1}, \ldots, c_{im})$ associated with it, where $c_{ij}$ is the processing time of the i-th job on the j-th machine.
The goal is to allocate the jobs to the machines so as to minimize the makespan, i.e., the completion time of the last job. (Jobs allocated to the same machine are processed sequentially.) This is precisely the OR-version of the min-max m-criteria minimum cut problem on a graph with two vertices and n parallel edges.

As another example where the OR-version of the multicriteria minimum-cut problem is of interest, consider a cyber-attacker wishing to disconnect a computer network, where there is more than one option for damaging each link. For example, assume that each link can either be disconnected by an electronic attack, which requires a certain amount of work hours (that may differ for different links), or by physically disconnecting it, e.g., by creating a strong electromagnetic field near the underground cable (the required power may again differ from link to link). Assuming an upper bound on the available power for electromagnetic fields, what is the minimum electronic-attack time which enables disconnecting the network?

It follows immediately from the simple reduction given above that the OR-version of the multicriteria global minimum cut problem is NP-hard even for just two criteria. We show, however, that the problem can be solved in pseudo-polynomial time for any fixed number of criteria. We also show that the problem can be solved in polynomial time when k, the number of criteria, is fixed and at least k − 1 of the weight functions assume only a fixed number of values. We also obtain some results on the complexity of the OR-versions of the shortest path and minimum spanning tree problems.

The rest of this chapter is organized as follows. In the next section we consider the AND-version of the global minimum cut problem. In Section 2.3 we then consider the OR-version of the problem. In Section 2.4 we consider the OR-version of the multicriteria shortest path and minimum spanning tree problems.
Finally, we conclude in Section 2.5 with some remarks and open problems.

2.2 Multicriteria global minimum cut: the AND-version

We first present a polynomial time algorithm for the min-max version of the multicriteria global minimum cut. The algorithm for solving the min-max version of the problem is then used to solve the decision and optimization problems.

Algorithm Min-Max($G(V, E, w_1, \ldots, w_k)$):
1. Let $w_0(e) = \sum_{i=1}^{k} w_i(e)$, for every e ∈ E.
2. Find all the k-approximate minimum cuts in G with respect to $w_0$.
3. Among all the cuts C found in the previous step find the one for which $\max_{i=1}^{k} w_i(C)$ is minimized.

Figure 2.2: A strongly polynomial time algorithm for the min-max version of the k-criteria global minimum cut problem.

2.2.1 The Min-Max problem

An optimal min-max cut is a cut C for which $\max_{i=1}^{k} w_i(C)$ is minimized. We show that the simple algorithm given in Figure 2.2 solves the min-max version of the k-criteria global minimum cut problem in polynomial time, for every fixed k. A k-approximate cut in a graph G with respect to a single weight function $w_0$ is a cut whose weight is at most k times the weight of the minimum cut.

Theorem 2.1. Algorithm Min-Max solves the min-max version of the k-criteria global minimum cut problem. For any fixed k, it can be implemented to run, deterministically, in $O(mn^{2k})$ time.

Proof. We begin by proving the correctness of the algorithm. We show that if C is an optimal min-max cut, and D is any other cut in the graph, then $w_0(C) \le k \cdot w_0(D)$, where $w_0(e) = \sum_{i=1}^{k} w_i(e)$, for every e ∈ E. This follows as

$$w_0(C) = \sum_{i=1}^{k} w_i(C) \le k \cdot \max_{i=1}^{k} w_i(C) \le k \cdot \max_{i=1}^{k} w_i(D) \le k \cdot \sum_{i=1}^{k} w_i(D) = k \cdot w_0(D).$$

The inequality $k \cdot \max_{i=1}^{k} w_i(C) \le k \cdot \max_{i=1}^{k} w_i(D)$ follows from the assumption that C is an optimal min-max cut.
In particular, if D is a minimum cut with respect to the single weight function $w_0$, then $w_0(C) \le k \cdot w_0(D)$, and it follows that C is a k-approximate cut of G with respect to $w_0$. This proves the correctness of the algorithm.

We next consider the complexity of the algorithm. Karger and Stein [KS96] showed that every graph has at most $O(n^{2k})$ k-approximate cuts and gave a randomized algorithm for finding an implicit representation of them all in $\tilde{O}(n^{2k})$ time. A deterministic algorithm of Nagamochi et al. [NNI97] explicitly finds all the k-approximate cuts in $O(mn^{2k})$ time. Choosing the best min-max cut among all the k-approximate cuts also takes only $O(mn^{2k})$ time.

It is also easy to see that for any 1 < α ≤ k, we can find an α-approximate solution to the min-max problem in $O(mn^{2k/\alpha})$ time, by checking all the k/α-approximate cuts in $G_0 = (V, E, w_0)$.

The randomized algorithm of Karger and Stein [KS96] extends to finding all the k-approximate minimum r-cuts in $\tilde{O}(n^{2k(r-1)})$ time. (An r-cut is a partition of the graph vertices into r sets, instead of 2.) Thus, it is easy to see that the min-max multicriteria problem can also be solved for r-cuts, in $\tilde{O}(mn^{2k(r-1)})$ time using this randomized (Monte-Carlo) algorithm.

2.2.2 The decision problem

We next show that the algorithm for the min-max version of the k-criteria problem can be used to solve the decision version of the problem: Given k bounds $b_1, b_2, \ldots, b_k$, is there a cut C such that $w_i(C) \le b_i$, for $1 \le i \le k$?

Theorem 2.2. For any fixed k, the decision version of the k-criteria global minimum cut problem can be solved, deterministically, in $O(mn^{2k})$ time.

Proof. The decision problem can be easily reduced to the min-max problem. Given k weight functions $w_1, \ldots, w_k : E \to \mathbb{R}^+$ and k bounds $b_1, b_2, \ldots, b_k$, we simply produce scaled versions $w'_i(e) = w_i(e)/b_i$, for every e ∈ E and $1 \le i \le k$, of the weight functions.
Clearly the answer to the decision problem is "yes" if and only if there is a cut C for which $\max_{i=1}^{k} w'_i(C) \le 1$.

2.2.3 The optimization problem

We next tackle the optimization problem: Given k − 1 bounds $b_1, b_2, \ldots, b_{k-1}$, find a cut C for which $w_k(C)$ is minimized, subject to the constraints $w_i(C) \le b_i$, for $1 \le i \le k-1$.

Theorem 2.3. For any fixed k, the optimization version of the k-criteria global minimum cut problem for graphs with integer edge weights can be solved, deterministically, in $O(mn^{2k} \log M)$ time, where $M = \sum_{e \in E} w_k(e)$.

Proof. If the k-th weight function assumes only integral values, we can easily use binary search to solve the optimization problem. Given the k − 1 bounds $b_1, b_2, \ldots, b_{k-1}$, we conduct a binary search for the minimal value $b_k$ for which there is a cut C such that $w_i(C) \le b_i$, for $1 \le i \le k$. As the minimal $b_k$ is an integer in the range [0, M], this requires the solution of only O(log M) decision problems.

The algorithm given above is not completely satisfactory as it is not strongly polynomial and does not work with non-integral weights. These problems can be fixed, however, as we show below.

Theorem 2.4. For any fixed k, the optimization version of the k-criteria global minimum cut problem for graphs with arbitrary real edge weights can be solved, deterministically, in $O(mn^{2k} \log n)$ time.

Proof. We first isolate a small interval that contains the minimal value of $b_k$. Let $S = \{w_k(e) \mid e \in E\}$ be the set of values assumed by the k-th weight function. The minimum $b_k$ of the optimization problem lies in an interval [s, ms], for some s ∈ S. (If C is the cut that attains the optimum, let s be the weight of the heaviest edge, with respect to $w_k$, in the cut.) Using a binary search on the values in S, we can find such an interval that contains the minimum. This requires the solution of only O(log m) = O(log n) decision problems.
Next, we conduct a binary search in the interval [s, ms] until we narrow it down to an interval of the form $[s', (1 + \frac{1}{n})s']$ which is guaranteed to contain the right answer. This again requires the solution of only O(log(mn)) = O(log n) decision problems.

Next, we run a modified version of the Min-Max algorithm given in Figure 2.2 on the following scaled versions of the weights: $w'_i(e) = w_i(e)/b_i$, for $1 \le i \le k-1$, and $w'_k(e) = w_k(e)/s'$. It is easy to see that if C is an optimal solution of the optimization problem, then C is also a $(1 + \frac{1}{n})$-approximate solution of the min-max problem. This in turn implies, as in the proof of Theorem 2.1, that C is also a $k(1 + \frac{1}{n})$-approximate minimum cut with respect to the weight function $w_0(e) = \sum_{i=1}^{k} w'_i(e)$, for every e ∈ E. Instead of finding all the k-approximate minimum cuts with respect to $w_0$, as done by algorithm Min-Max, we find all the $k(1 + \frac{1}{n})$-approximate minimum cuts. Among all these cuts we find a cut C for which $w'_i(C) \le 1$, i.e., $w_i(C) \le b_i$, for $1 \le i \le k-1$, and for which $w_k(C)$ is minimized. This cut is the optimal solution to the optimization problem.

We next analyze the complexity of the algorithm. The O(log n) decision problems can be solved in $O(mn^{2k} \log n)$ time. All the $k(1 + \frac{1}{n})$-approximate cuts can then be found in $O(mn^{2k(1 + \frac{1}{n})}) = O(mn^{2k})$ time using the algorithm of Nagamochi et al. [NNI97]. (Note that $n^{1/n} = O(1)$.) Checking all these cuts also takes only $O(mn^{2k})$ time. This completes the proof of the theorem.

2.2.4 Two applications

Theorem 2.5. Let G = (V, E) be an undirected graph and let $w : E \to \mathbb{R}^+$ be a weight function defined on its edges. Let 1 ≤ r ≤ m. Then, there is a deterministic $O(mn^4 \log n)$ time algorithm for finding a cut of minimum weight that contains at most r edges.

Proof. We simply let $w_1(e) = 1$ and $w_2(e) = w(e)$, for every e ∈ E, and solve the optimization problem with $b_1 = r$.
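The three steps of Algorithm Min-Max (Figure 2.2) and the reduction used in the proof of Theorem 2.5 can be tried out on small instances. The sketch below is an assumption-laden stand-in: it enumerates all cuts by brute force where the text uses the approximate-cut enumeration of Nagamochi et al. [NNI97], so it runs in exponential time and only illustrates the logic. All function names and the example graph are hypothetical.

```python
from itertools import combinations

def cuts(n):
    """Yield every global cut of vertices 0..n-1 once, as the side containing vertex 0."""
    for size in range(n - 1):
        for rest in combinations(range(1, n), size):
            yield frozenset((0,) + rest)

def cut_costs(side, edges, k):
    """Cost vector (w_1(C), ..., w_k(C)); each edge is (u, v, (w_1(e), ..., w_k(e)))."""
    total = [0] * k
    for u, v, weights in edges:
        if (u in side) != (v in side):
            for i in range(k):
                total[i] += weights[i]
    return tuple(total)

def min_max_cut(n, edges, k):
    """Steps 1-3 of Algorithm Min-Max, with brute force replacing [NNI97]."""
    scored = [(side, cut_costs(side, edges, k)) for side in cuts(n)]
    # Step 1: w_0(C) is the sum of the k cost coordinates of C.
    w0_min = min(sum(costs) for _, costs in scored)
    # Step 2: keep only the k-approximate minimum cuts with respect to w_0.
    candidates = [(s, c) for s, c in scored if sum(c) <= k * w0_min]
    # Step 3: among them, pick a cut minimizing the largest coordinate.
    return min(candidates, key=lambda sc: max(sc[1]))

def min_cut_with_at_most_r_edges(n, weighted_edges, r):
    """Theorem 2.5 reduction: w_1(e) = 1, w_2(e) = w(e), bound b_1 = r."""
    edges = [(u, v, (1, w)) for u, v, w in weighted_edges]
    scored = [(side, cut_costs(side, edges, 2)) for side in cuts(n)]
    feasible = [(s, c) for s, c in scored if c[0] <= r]
    return min(feasible, key=lambda sc: sc[1][1]) if feasible else None

# Hypothetical example: the lightest cut has 2 edges; the lightest 1-edge cut is heavier.
weighted_edges = [(0, 1, 1), (0, 2, 1), (1, 2, 10), (2, 3, 3)]
print(min_cut_with_at_most_r_edges(4, weighted_edges, 2)[1])   # (2, 2)
print(min_cut_with_at_most_r_edges(4, weighted_edges, 1)[1])   # (1, 3)
```

In the example, restricting the cut to a single edge raises the optimal weight from 2 to 3, which is the trade-off the cardinality bound $b_1 = r$ expresses.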
As mentioned in the introduction, Theorem 2.5 solves an open problem raised by Bruglieri et al. [BEH00, BME04]. We also have:

Theorem 2.6. Let G = (V, E) be an undirected graph and let $w : E \to \mathbb{R}^+$ be a weight function defined on its edges. Let 1 ≤ r ≤ n. Then, there is a deterministic $O(n^6 \log n)$ time algorithm for finding a cut of minimum weight with at most r vertices on its smaller side.

Proof. We set up two weight functions over a complete graph on n = |V| vertices: $w_1(u, v) = 1$, for every u, v ∈ V, and $w_2(u, v) = w(u, v)$ if (u, v) ∈ E, and $w_2(u, v) = 0$ otherwise. We then find a cut C that minimizes $w_2(C)$ subject to the constraint $w_1(C) \le r(n - r)$. (A cut with j ≤ n/2 vertices on its smaller side has $w_1(C) = j(n - j)$, which is increasing in j.)

2.2.5 The Pareto Set

Suppose all weight functions are integral. Let $M_i = \sum_{e \in E} w_i(e)$, for $1 \le i \le k$. The Pareto set can be trivially found by invoking the basic decision algorithm $\prod_{i=1}^{k} M_i$ times, or the optimization algorithm $\prod_{i=1}^{k-1} M_i$ times. These naive algorithms are pseudo-polynomial for every fixed k.

Using very similar ideas we can also obtain an FPTAS for finding an approximate Pareto set, a notion defined by Papadimitriou and Yannakakis [PY00]. It is defined as a set of feasible k-tuples, such that for every solution there is a k-tuple in the set within a factor of $(1 - \varepsilon)$ in all coordinates. More formally, the set $P_\varepsilon(G, w_1, \ldots, w_k)$ is a set of cost vectors of cuts in the graph such that for every cut C there exists $(c_1, \ldots, c_k) \in P_\varepsilon(G, w_1, \ldots, w_k)$ such that $(1 - \varepsilon) c_i \le w_i(C)$ for $1 \le i \le k$. It is easy to see that we can find this set in polynomial time, by invoking the basic algorithm for the decision problem only for powers of $(1 - \varepsilon)$, instead of checking all the possible values.

2.2.6 Hardness results

Theorem 2.7. The multicriteria minimum cut problem with a non-fixed number of criteria is strongly NP-complete.

Proof.
We use a reduction from the bisection width problem (see [GJ79], problem ND17): Given an unweighted input graph G = (V, E) on n = 2r vertices and a bound b, is there a bisection of the graph that cuts at most b edges? We transform such an instance in the following way: Assume that V = {1, 2, . . . , n}. We add two vertices, s and t, and add edges connecting them to each of the vertices in V. Let $G' = (V', E')$ be the resulting graph. Each edge of $G'$ is now assigned n + 3 weights. For $1 \le i \le n$, we let $w_i(s, i) = w_i(t, i) = 1$, and $w_i(e) = 0$ for all other edges. We assign $w_{n+1}(e) = 1$ for the edges of the form (s, i), i ∈ V, and $w_{n+1}(e) = 0$, otherwise. Similarly, we assign $w_{n+2}(e) = 1$ for the edges of the form (t, i), i ∈ V, and $w_{n+2}(e) = 0$, otherwise. Finally, $w_{n+3}(e) = 1$ for e ∈ E, and $w_{n+3}(e) = 0$, otherwise. It is now easy to see that G has a bisection of width at most b if and only if $G'$ has a cut C for which $w_i(C) \le 1$, for $1 \le i \le n$, $w_{n+1}(C), w_{n+2}(C) \le r$, and $w_{n+3}(C) \le b$.

Figure 2.3: A simple illustration (with n = 4) for the construction in the proof of Theorem 2.7. All the original edges of the graph (solid lines) are assigned the costs (0, 0, 0, 0, 0, 0, 1). The edges connected to s are assigned the costs (1, 1, 1, 1, 1, 0, 0), and the edges connected to t are assigned the costs (1, 1, 1, 1, 0, 1, 0).

It is also not difficult to show that the directed multicriteria global minimum cut problem is strongly NP-complete, even for two criteria. In this problem, we are given a directed graph G = (V, E) with weight functions $w_1, \ldots, w_k : E \to \mathbb{R}^+$. A solution consists of a cut C, and of a labelling of the two vertex sets it separates by S and T. The weights of each cut are the sums of the weights of the cut edges directed from S to T.
Each of the above mentioned variants for the undirected multicriteria minimum cut problem can be considered here as well: the decision, optimization, min-max and Pareto-set problems. In the directed multicriteria s-t min-cut problem, two vertices, s and t, are specified with the input, and the solution must satisfy: s ∈ S and t ∈ T.

Theorem 2.8. The directed multicriteria global minimum cut problem is strongly NP-complete, even for just two criteria.

Proof. We show this by a reduction from undirected multicriteria s-t-min-cut, which is strongly NP-hard, even for k = 2 (Papadimitriou and Yannakakis [PY00]). An instance of the undirected bicriteria s-t-min-cut decision problem can be reduced to an instance of the directed bicriteria s-t-min-cut problem simply by replacing each edge by two antiparallel directed edges with the same weight. Recall that in a directed s-t-min-cut only edges directed from S to T contribute to the cut weights (s ∈ S and t ∈ T), so having edges in the opposite direction does not influence the solution.

This instance can then be reduced to an instance of the directed bicriteria global min-cut decision problem. We simply connect each vertex to s with edges having weight m · M + 1 in both criteria (where M is the maximal weight), and do the same from t to all the other vertices (if some of these edges already exist then we replace them). We assume that at least one input bound satisfies $b_i < m \cdot M + 1$, otherwise the answer is trivially "yes". So a solution to this problem will necessarily have s and t on different sides, s ∈ S and t ∈ T. Therefore a solution to this problem also solves the original problem, and it has the same weights. Thus, the directed multicriteria global min-cut decision problem is strongly NP-hard, even for k = 2.
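The construction in the proof of Theorem 2.7 can be checked by brute force on a small instance. The sketch below builds the (n + 3)-criteria graph from an unweighted input graph and compares the answer with a direct computation of the bisection width. It is only a sanity check under simplifying assumptions: vertices are numbered from 0 rather than 1, s and t are given the labels n and n + 1, and all function names are hypothetical.

```python
from itertools import combinations

def build_instance(n, edges):
    """Graph G' of Theorem 2.7. Vertices 0..n-1 are V; s = n and t = n + 1.
    Weight index i (0-based) stands for the criterion w_{i+1} of the text."""
    s, t = n, n + 1
    k = n + 3
    out = []
    for i in range(n):
        for hub, hub_index in ((s, n), (t, n + 1)):
            w = [0] * k
            w[i] = 1              # w_i(s, i) = w_i(t, i) = 1
            w[hub_index] = 1      # w_{n+1} marks s-edges, w_{n+2} marks t-edges
            out.append((hub, i, tuple(w)))
    for u, v in edges:
        w = [0] * k
        w[n + 2] = 1              # w_{n+3} marks the original edges
        out.append((u, v, tuple(w)))
    return n + 2, out

def has_feasible_cut(num_vertices, edges, bounds):
    """Brute-force AND-version decision problem: is there a cut C with w_i(C) <= b_i for all i?"""
    k = len(bounds)
    for size in range(num_vertices - 1):
        for rest in combinations(range(1, num_vertices), size):
            side = set((0,) + rest)
            cost = [0] * k
            for u, v, w in edges:
                if (u in side) != (v in side):
                    for i in range(k):
                        cost[i] += w[i]
            if all(c <= b for c, b in zip(cost, bounds)):
                return True
    return False

def bisection_width(n, edges):
    """Minimum number of edges cut by a partition into two sides of n/2 vertices each."""
    widths = []
    for rest in combinations(range(1, n), n // 2 - 1):
        side = set((0,) + rest)
        widths.append(sum((u in side) != (v in side) for u, v in edges))
    return min(widths)

# Hypothetical example: a 4-cycle (n = 4, r = 2), whose bisection width is 2.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
num_vertices, instance = build_instance(4, cycle)
for b in (1, 2):
    bounds = [1] * 4 + [2, 2, b]
    print(b, bisection_width(4, cycle) <= b, has_feasible_cut(num_vertices, instance, bounds))
```

For both values of b the two answers agree, as the equivalence stated in the proof requires.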
2.3 Multicriteria global minimum cut: the OR-version

2.3.1 Relation to scheduling on unrelated machines

As mentioned in the introduction, there is a trivial reduction from the scheduling on unrelated machines problem to the OR-version of the min-max multicriteria global minimum cut problem. Known hardness results for the scheduling problem (see Lenstra et al. [LST90]) then imply the following:

Theorem 2.9. The OR-version of the min-max multicriteria global minimum cut problem is NP-hard even for just two criteria. The problem with a non-fixed number of criteria cannot be approximated to within a ratio better than 3/2, unless P=NP.

The scheduling problem on a fixed number of unrelated machines can however be solved in pseudo-polynomial time. Horowitz and Sahni [HS76] present a simple branch-and-bound pseudo-polynomial algorithm for that problem which runs in $O(m^2 (kM)^{k-1})$ time, where m is the number of jobs, k is the number of machines, and M is the optimal makespan. This immediately implies:

Theorem 2.10. Let G = (V, E) be an undirected graph with k integral weight functions $w_1, \ldots, w_k : E \to \mathbb{N}$ defined on its edges. Let C be a cut in G. Then, a choice function $\alpha : E(C) \to \{1, 2, \ldots, k\}$ which minimizes $\max_{i=1}^{k} w_i(C, \alpha)$ can be found in pseudo-polynomial time.

Algorithm Min-Max-Or($G(V, E, w_1, \ldots, w_k)$):
1. Let $w_0(e) = \min_{i=1}^{k} w_i(e)$, for every e ∈ E.
2. Find all the k-approximate minimum cuts in G with respect to $w_0$.
3. For each of the cuts C found in the previous step, find the best choice function $\alpha : E(C) \to \{1, 2, \ldots, k\}$.
4. Output the best cut and choice function found.

Figure 2.4: A pseudo-polynomial time algorithm for the min-max version of the k-criteria global minimum cut problem.

Jansen and Porkolab [JP99] obtained an FPTAS for the unrelated machines scheduling problem, which runs in $O(m(k/\varepsilon)^{O(k)})$ time (for any fixed number k of machines).
It can be used instead of the exact algorithm of [HS76] when approximate solutions are acceptable.

2.3.2 The min-max version

We show that the simple algorithm given in Figure 2.4, which is a variant of the algorithm given in Figure 2.2, solves the OR-version of the min-max problem in pseudo-polynomial time, for any fixed number of criteria.

Theorem 2.11. The OR-version of the min-max k-criteria global minimum cut problem with integer edge weights can be solved in $O(m^2 n^{2k} (kM)^{k-1})$ time, where M is the optimal min-max value.

Proof. We begin again with the correctness proof. Let C be an optimal min-max cut and let α be the corresponding optimal choice function. Let D be any other cut. We show that $w_0(C) \le k \cdot w_0(D)$, where $w_0(e) = \min_{i=1}^{k} w_i(e)$, for every e ∈ E. To see that, we let $\beta : E(D) \to \{1, 2, \ldots, k\}$ be a choice function for which β(e) = i if $w_i(e) \le w_j(e)$, for every $1 \le j \le k$. Then,

$$w_0(C) \le k \cdot \max_{i=1}^{k} w_i(C, \alpha) \le k \cdot \max_{i=1}^{k} w_i(D, \beta) \le k \cdot w_0(D).$$

The first inequality holds because $w_0(e) \le w_{\alpha(e)}(e)$ for every edge, so $w_0(C) \le \sum_{i=1}^{k} w_i(C, \alpha)$. The second inequality follows as (C, α) is an optimal solution of the min-max problem. The last inequality holds because $\sum_{i=1}^{k} w_i(D, \beta) = w_0(D)$.

We next consider the complexity of the algorithm. The k-approximate cuts with respect to $w_0$ can be found again in $O(mn^{2k})$ time using the algorithm of Nagamochi et al. [NNI97]. For each one of the $O(n^{2k})$ approximate cuts produced, we find an optimal choice function using the algorithm of Horowitz and Sahni [HS76]. The total running time is then $O(mn^{2k} + n^{2k} \cdot m^2 (kM)^{k-1}) = O(mn^{2k} + m^2 n^{2k} (kM)^{k-1})$, where M is the value of the optimal solution.

Theorem 2.12. The OR-version of the min-max k-criteria global minimum cut problem, with k fixed, admits an FPTAS.

Proof. The proof is identical to the proof of Theorem 2.11 with the exact algorithm of Horowitz and Sahni [HS76] replaced by the FPTAS of Jansen and Porkolab [JP99].

As in Section 2.2, we can use the algorithm for the min-max version of the problem to solve the decision and optimization versions of the problem.
We omit the obvious details.

2.3.3 A case that can be solved in polynomial time

We now discuss a restriction of the min-max problem that can be solved in strongly polynomial time. For simplicity, we consider the bicriteria problem.

Theorem 2.13. Instances of the OR-version of the min-max bicriteria global minimum cut problem in which one of the weight functions assumes only r different values can be solved in $O(m^{r+1} n^4)$ time.

Proof. Assume, without loss of generality, that $w_2$ assumes only r different real values $a_1, a_2, \ldots, a_r$. Let $E_i = w_2^{-1}(a_i) = \{e \in E \mid w_2(e) = a_i\}$, for $1 \le i \le r$. Consider an optimal min-max cut C and an optimal choice function $\alpha : E(C) \to \{1, 2\}$ for it. It is easy to see that for every $1 \le i \le r$ there is a threshold $t_i$ such that if $e \in E_i$, then α(e) = 1 if and only if $w_1(e) \le t_i$. (Indeed, if there are two edges $e_1, e_2 \in E_i$ such that $w_1(e_1) < w_1(e_2)$, $\alpha(e_1) = 2$ and $\alpha(e_2) = 1$, then the choice function α' which reverses the choices of α on $e_1$ and $e_2$ is a better choice function. We assume here, for simplicity, that all weights are distinct.) As there are at most m + 1 essentially different thresholds for each set $E_i$, the total number of choice functions that should be considered is only $O(m^r)$. With a given choice function $\alpha : E \to \{1, 2\}$, the problem reduces to an AND-version of the problem with the weights $w'_i(e) = w_i(e)$, if α(e) = i, and $w'_i(e) = 0$, otherwise, for i = 1, 2. As each such problem can be solved in $O(mn^4)$ time, the total running time of the resulting algorithm is $O(m^{r+1} n^4)$.

2.4 OR-versions of other multicriteria problems

In this section we consider the OR-versions of the bicriteria shortest path and minimum spanning tree problems. Our results can probably be extended to any fixed number of criteria.
2.4.1 Shortest paths

The input to the problem is a directed graph G = (V, E) with two weight functions $w_1, w_2 : E \to \mathbb{R}^+$ defined on its edges, two vertices s, t ∈ V, and two bounds $b_1, b_2$. The question is whether there is a path P from s to t in the graph and a choice function $\alpha : P \to \{1, 2\}$ such that $w_1(\alpha^{-1}(1)) \le b_1$ and $w_2(\alpha^{-1}(2)) \le b_2$.

The graph G = (V, E) may, for example, represent the map of a city. Each edge e ∈ E of the graph can be traversed either by bus or by subway. The weight $w_1(e)$ is the number of bus tokens needed for traversing the edge e by bus, while the weight $w_2(e)$ is the number of subway tokens needed to traverse e by subway. The question then is whether it is possible to get from s to t using given amounts of subway tokens and bus tokens.

It is easy to see, using a simple reduction from the scheduling on unrelated machines problem, that the OR-version of the bicriteria shortest path problem is NP-hard. We show, however, that it can be solved in pseudo-polynomial time. An FPTAS for the problem is easily obtained by scaling.

Theorem 2.14. The OR-version of the bicriteria shortest path decision problem with integer edge lengths can be solved in $O(nmW \log(nW))$ time, where $W = \max_{e \in E} w_1(e)$.

Proof. The OR-version of the problem can be easily reduced to the AND-version of the problem by replacing each edge e having a weight vector $(w_1(e), w_2(e))$ by two parallel edges e' and e'' having weight vectors $(w_1(e), 0)$ and $(0, w_2(e))$. The standard, AND-version, of the problem can be solved using an algorithm of Hansen [Han80] within the claimed time bound.

2.4.2 Minimum spanning trees

Next we consider the OR-version of the bicriteria minimum spanning tree problem. The input is an undirected graph G = (V, E), two weight functions $w_1, w_2 : E \to \mathbb{R}$, and two bounds $b_1$ and $b_2$. The question is whether there exist a spanning tree T and a choice function $\alpha : T \to \{1, 2\}$ such that $w_1(\alpha^{-1}(1)) \le b_1$ and $w_2(\alpha^{-1}(2)) \le b_2$.
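For a fixed edge set (a path P in Section 2.4.1, or a spanning tree T here), deciding whether a suitable choice function α exists is a knapsack-type question. The following minimal sketch, which assumes non-negative integer weights and integer bounds, answers it with a simple dynamic program over the first budget. It does not search over paths or trees, and it is not the algorithm of Hansen [Han80]; the function name and the example numbers are hypothetical.

```python
def choice_function_exists(pairs, b1, b2):
    """Decide whether edges can be split between the two criteria within both bounds.

    pairs: list of (w_1(e), w_2(e)) for the edges of a fixed path or tree,
           with non-negative integer entries.
    Returns True iff some alpha has w_1(alpha^{-1}(1)) <= b1 and w_2(alpha^{-1}(2)) <= b2.
    """
    # best[j] = least total w_2 cost when the edges paid under criterion 1
    #           have total w_1 cost at most j.
    best = [0] * (b1 + 1)
    for a, b in pairs:
        best = [
            min(
                best[j] + b,                                # pay w_2(e) for this edge
                best[j - a] if j >= a else float("inf"),    # pay w_1(e) for this edge
            )
            for j in range(b1 + 1)
        ]
    return best[b1] <= b2

# Hypothetical three-edge path with (bus tokens, subway tokens) per edge.
path = [(3, 1), (2, 5), (4, 2)]
print(choice_function_exists(path, 5, 3))   # True: bus on the first two edges, subway on the third
print(choice_function_exists(path, 4, 2))   # False
```

The table has $b_1 + 1$ entries and is updated once per edge, so the check is pseudo-polynomial, in line with the NP-hardness of the underlying scheduling-type question.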
The OR-version of the bicriteria minimum spanning tree problem is again easily seen to be NP-hard. We provide a polynomial time algorithm for a special case of the problem, and a pseudo-polynomial time algorithm for the general case.

Theorem 2.15. The OR-version of the minimum spanning tree problem in which one of the weight functions is constant, i.e., $w_2(e) = c$, for every e ∈ E, can be solved by solving a single standard minimum spanning tree problem.

Proof. We simply solve the standard minimum spanning tree problem with respect to the weight function $w_1$ and obtain a minimum spanning tree T. For the $\lfloor b_2/c \rfloor$ heaviest edges of T we choose to pay the $w_2$ cost, and for all the others we pay the $w_1$ cost. The correctness of this procedure follows from the well known fact that if the weights of the edges of T are $a_1 \le a_2 \le \cdots \le a_{n-1}$, and if T' is any other spanning tree of the graph G with edge weights $a'_1 \le a'_2 \le \cdots \le a'_{n-1}$, then $a_i \le a'_i$, for $1 \le i \le n-1$.

Theorem 2.16. The OR-version of the bicriteria minimum spanning tree decision problem with integer edge lengths can be solved in $O(n^4 b_1 b_2 \log(b_1 b_2))$ time.

Proof. The OR-version of the problem can be easily reduced to the AND-version of the problem by replacing each edge e having a weight vector $(w_1(e), w_2(e))$ by two parallel edges e' and e'' having weight vectors $(w_1(e), 0)$ and $(0, w_2(e))$. The standard, AND-version, of the problem can be solved using an algorithm of Hong et al. [HCP04] within the claimed time bound.

The AND-version of the multicriteria minimum spanning tree problem can be solved in polynomial time using matroid intersection algorithms.

2.5 Concluding remarks

We showed that the standard (i.e., the AND-version) multicriteria global minimum cut problem can be solved in polynomial time for any fixed number k of criteria. The running time of our algorithm, which is $O(mn^{2k})$, is fairly high, even for a small number of criteria.
Improving this running time is an interesting open problem. We also considered the OR-version of the problem and showed that it is NP-hard even for just two criteria. It can be solved, however, in pseudo-polynomial time, and it also admits an FPTAS, for any fixed number of criteria. Finally, we considered the OR-versions of the bicriteria shortest path and minimum spanning tree problems, and showed that both of them are NP-hard but can be solved in pseudo-polynomial time. It will also be interesting to study OR-versions of other multicriteria optimization problems.

Chapter 3

Cooperative TSP

3.1 Introduction

The Traveling Salesperson Problem (TSP) is a classical problem in combinatorial optimization, which has been studied extensively in many forms. Cooperative TSP is a set of variants of TSP in which the customers are able to move in order to assist the selling process. They may move in order to expedite the deliveries, and may also move after meeting the salesperson in order to help the distribution of the goods. The basic motivation for these variants is that the "salesperson" and "customers" are often part of the same organization, and can be instructed by the "headquarters" to cooperate.

For example, consider a secret message that has to be distributed to several spies, but is only allowed to be passed in person. Every spy can be instructed when and where to receive it, and a spy who receives the message may then assist by passing it forward. We may want to devise a scheme for delivering the secret to all the spies as fast as possible.

A further illustration is the problem of sleeping robots which need to be woken up. Once a robot is awakened, it can be instructed to assist in waking up others, but it cannot move before that. As a robot's battery is limited, we may wish to minimize the maximal travel of any of them. This problem is related to the previously studied "Freeze-Tag" problem [ABF+02, SABM02, ABG+03, KLS04] (which we later describe in more detail).
Formally, an instance of Cooperative TSP (cTSP) is a set of agents and a salesperson, located in a finite metric space or a Euclidean space. A solution is a synchronized series of move instructions to all participants (i.e., the salesperson and the agents), such that all the agents eventually receive the delivery. We next elaborate on the various cooperation modes, the cost of solutions and other parameters affecting the cTSP.

Cooperation Modes. We consider three modes of cooperation. In the Purchase-Cooperation mode the salesperson has to meet all agents, and the agents are allowed to move towards the salesperson. In the Sales-Cooperation mode, each agent receiving a delivery becomes capable of making deliveries (exactly like the salesperson). However, an agent is not allowed to move before receiving a delivery. In the Full-Cooperation mode, an agent may cooperate in both the purchase and sales phases. That is, an agent may move before receiving the delivery and may make deliveries after receiving it.

Goal Functions. Three objectives are considered for Cooperative TSP: Minimizing the total travel of all the participants (Min-Sum), minimizing the maximal travel of any participant (Min-Max), and minimizing the total time until the sales process ends (Min-Makespan). Naturally, the Min-Sum goal is motivated by scenarios in which the travel of all the participants is covered by the same entity (e.g., the delivery-service company), which is therefore interested in minimizing the total travel. The Min-Max objective is required, for example, when there is a bound on the amount of fuel/battery of each participant, and each of them should spend as little energy as possible on this delivery process. Min-Makespan is motivated by cases in which the completion of the deliveries is urgent.

Metric Space. We consider Cooperative TSP in any fixed-dimension Euclidean space and in non-negative weighted undirected graphs.
Note that w.l.o.g., we may assume that the graph is complete and that the weights of all edges satisfy the triangle inequality, hence this is a finite metric space.

Roundtrip vs. Path. We consider both roundtrip versions, in which all participants are required to return to their initial location, and path versions in which there is no such requirement (the starting points may be arbitrary, or we may only be interested in what happens until the deliveries are made).

In this study we consider all the problems arising from combining cooperation-modes, goal functions, graph/Euclidean space and path/roundtrip versions. We refer to each of the problems we study using the format: Goal-Function - Cooperation-Mode - [Euclidean] - cTSP (e.g., "Min-Sum Purchase Euclidean cTSP"). Unless explicitly stated otherwise, a problem name indicates its path version (rather than its roundtrip version).

[Table 3.1, cTSP in Graphs. Rows: the path and roundtrip versions of the Min-Sum, Min-Max and Min-Makespan goals. Columns: approximation (Approx.) and inapproximability (Inapprox.) ratios under Purchase Cooperation, Sales Cooperation and Full Cooperation.]

Table 3.1: Approximation ratios vs. inapproximability ratios for cTSP in weighted Graphs. (∗) is by [KLS04] and (∗∗) is by [ABF+02]. The parameter ε stands for an arbitrarily small positive constant, or for a positive function that tends to zero as the input size increases.

3.1.1 Related Studies

The classical TSP problem remains NP-hard even in planar graphs [GGJ76, Pap77]. However, there is a PTAS for any fixed-dimension Euclidean space [Aro98, Mit99]. When only metric space is assumed, the best known approximation algorithm yields a $\frac{3}{2}$-approximation ratio [Chr76] and an inapproximability factor of $\frac{131}{130} - \varepsilon$ was shown [EK01].
The Freeze-Tag Problem. The Freeze-Tag problem was first suggested and studied by Arkin et al. in [ABF+02]. The problem arises in the context of swarm robotics: how to wake up a set of slumbering robots, when initially only one robot is awake (waking up a robot requires reaching its location). Once a robot is woken up it can assist in waking up other slumbering robots. The objective is to have all robots awake as early as possible. In our terminology, this is the path version of Min-Makespan Sales cTSP. Arkin et al. [ABF+02] provided an NP-hardness proof, a PTAS for the Euclidean variant, and a constant approximation for some graph families. A series of studies followed (e.g., [SABM02, ABG+03, KLS04]), culminating in an $O(\sqrt{\log n})$-approximation for the general weighted graph case [KLS04].

TSP with Neighborhoods. TSP with Neighborhoods is a proximity-related variant of TSP. In this problem each customer is willing to meet the salesperson anywhere within some neighborhood. The problem was first studied by Arkin and Hassin [AH94], followed by quite a few papers (e.g., [MM95, GL99, DM01, dBGK+05, SS05, Mit07]). An instance of TSP with Neighborhoods may reside in a weighted graph or in a Euclidean space. The problem seems quite related to Purchase cTSP, as in both problems customers are willing to approach the salesperson. However, in TSP with Neighborhoods the customers’ travel is not counted in the goal function, while in Cooperative TSP their moves do cost, and are part of the optimization task.

Other Cooperative Multi-Agent Routing Problems. As noted in [ABF+02], the Freeze-Tag Problem (and thus the Cooperative TSP problems) bears features of broadcasting, routing, scheduling and network design. The minimum broadcast time, the multicast problem and the minimum gossip time problem are all closely related to Cooperative TSP (see [HHL88] for a survey and [Rav94, BNGNS98] for approximation results).
Controlling swarms of robots in order to perform a certain task has also been studied in various algorithmic aspects, including environment exploration, robot formation, searching and recruitment (see [ABF+02] for a list of relevant papers). Other studies address similar scenarios, but with no central control, where each agent has to make decisions with limited knowledge regarding the environment and the other agents (for example, the problem of routing autonomous agents in a wireless sensor network, and algorithms inspired by ant behavior; see [ABF+02] for a list of relevant papers).

As cTSP generalizes the Freeze-Tag problem and is closely related to the TSP with Neighborhoods problem, the algorithms (and intractability results) obtained for cTSP apply to similar scenarios, e.g., cooperative robot tasks (for additional relevant scenarios see [AH94, ABF+02]).

3.1.2 Our Contribution

We consider all combinations of cooperation modes, goal functions, path/roundtrip and graph/Euclidean versions. The results for cTSP in weighted graphs are summarized in Table 3.1 and the results for cTSP in a fixed-dimension Euclidean space are summarized in Table 3.2. We obtain constant approximations for most of the problems, PTAS for many of them, and polynomial-time exact solutions for a few. On the intractability side we obtain NP-hardness proofs and inapproximability factors for all the NP-hard graph problems and for some of the Euclidean problems.

Euclidean cTSP             Purchase      Sales         Full
            Goal           Cooperation   Cooperation   Cooperation
path        Min-Sum        PTAS          5/3 + ε       2 + ε
            Min-Max        PTAS          3             4
            Min-Makespan   Polynomial    PTAS (∗)      PTAS
roundtrip   Min-Sum        PTAS          PTAS          PTAS
            Min-Max        PTAS          3             2
            Min-Makespan   Polynomial    PTAS          PTAS

Table 3.2: Approximation ratios for cTSP in any fixed-dimension Euclidean space. (∗) is by [ABF+02]. The parameter ε stands for an arbitrarily small positive constant, or for a positive function that tends to zero as the input size increases.

Chapter Organization.
We start with our results for Euclidean cTSP, which we present in Section 3.2. The results for cTSP in weighted graphs are presented in Section 3.3. Each section is divided into three subsections, one for each of the goal functions (Min-Sum, Min-Max and Min-Makespan, in this order). Furthermore, each subsection is divided according to cooperation modes (Purchase, Sales and Full-Cooperation, in this order).

3.2 Euclidean cTSP

This section presents the results we obtained for the various Euclidean cTSP problems.

3.2.1 Min-Sum Euclidean-cTSP

In this subsection we consider the various cooperation modes for the path versions of Min-Sum Euclidean-cTSP. It is not hard to see that all the roundtrip versions, with any of the three cooperation modes, are identical to the classical TSP problem. Specifically,

Claim 3.1. For any metric space M, the roundtrip versions of Min-Sum Purchase cTSP in M, Min-Sum Sales cTSP in M, and Min-Sum Full-Cooperation cTSP in M are all equivalent to TSP in M.

Proof. Consider a solution for any of the above cTSP problems in M. In addition, consider the first meeting between the salesperson and an agent who moves in this solution. Let x be the point at which this meeting occurs, and let y be the initial location of that agent. We observe that since each agent’s moves form a cycle, there is a solution with the same cost in which that agent does not move. This holds since the salesperson can travel from x to y along the path traveled by the agent, meet the agent at y, then follow the rest of the cycle traveled by that agent (in reverse order), and return to x. Thus, exactly the same points are visited and the cost of travel remains the same. Therefore, w.l.o.g., in any solution of the above mentioned cTSP problems, no participant moves except the salesperson. Hence, all the above mentioned cTSP problems in M are equivalent to TSP in M.
Thus, like the classic TSP problem, these roundtrip problems are all NP-hard [GGJ76, Pap77], and have a PTAS for any fixed-dimension Euclidean space [Aro98, Mit99]. We therefore consider the path version of these problems.

3.2.1.1 Min-Sum Purchase Euclidean-cTSP

We next provide a PTAS for Min-Sum Purchase Euclidean-cTSP. Note that the problem is NP-hard even for the planar case. This follows, since an instance of the classical planar TSP can be reduced to an instance of Min-Sum Purchase Euclidean-cTSP by simply replacing each customer with three agents. This makes the salesperson the only participant who moves in an optimal solution (moving such three agents to meet her at another location clearly costs more than moving the salesperson from the other location to their initial location and back).

The algorithm and analysis below use Arora’s technique for the PTAS of Euclidean TSP [Aro98]. Our algorithm differs from Arora’s algorithm in that it has to consider all the agents’ paths and not only the salesperson’s path. We show how this can be done while keeping the dynamic programming polynomial. We get:

Theorem 3.2. Min-Sum Purchase Euclidean-cTSP admits a PTAS.

We describe the PTAS for the 2-dimensional case. The extension to any fixed dimension is straightforward. Roughly speaking, we prove the existence of a coarse solution, which is called a minimal cost portal-limited-solution, that has a cost of at most (1 + ε) times the cost of an optimal solution. We then show how to find a minimal cost portal-limited-solution in polynomial time, using dynamic programming. We start by introducing the terminology. Readers familiar with Arora’s PTAS for Euclidean TSP may want to skip to the (slightly altered) definition of portal-limited-solutions.

Let ε > 0 be an arbitrarily small constant. Denote by n the number of participants and by OPT the cost of the optimal solution. Let $L = 2^{3 + \lceil 2 \log n \rceil}$ (the smallest power of 2 which is at least $8n^2$).
By stretching and shifting the input points we may assume, without loss of generality, that all the participants are located inside the bounding box $[0, L/2]^2$ and that $OPT > L/4$.

Super-pixels. We call each square $[j, j+2] \times [j', j'+2]$, where $j, j' \in \{0, 2, 4, \ldots, L-2\}$, a pixel. We name the point $(j+1, j'+1)$ the center of the pixel $[j, j+2] \times [j', j'+2]$. For every $i = 0, \ldots, \log L - 1$, we call each square $[j, j + L/2^i] \times [j', j' + L/2^i]$, where $j, j' \in \{0, L/2^i, 2 \cdot L/2^i, \ldots, L - L/2^i\}$, a super-pixel of level i. Thus, each super-pixel of level $\log L - 1$ is a pixel and the super-pixel of level 0 is the entire bounding box. Additionally, note that different super-pixels of the same level may overlap only at their boundaries, and that each super-pixel of level i contains four super-pixels of level i + 1, for $i = 0, \ldots, \log L - 2$. Clearly, the total number of super-pixels is polynomial in n.

From now on we consider, without loss of generality, only instances for which all the participants are located at pixel centers. This is possible since any optimal solution of a general instance can be modified by instructing each participant to initially move to its pixel’s center. This increases the cost of the solution by at most $n \cdot \sqrt{2}$. As $OPT > L/4 \ge 2n^2$, the increase is at most $OPT/n$, which is less than $\varepsilon/2 \cdot OPT$, for a sufficiently large n.

An (a, b)-shifting. Let $0 \le a, b < L/2$ be two even integers. For a set $A \subseteq [0, L/2]^2$ we define the (a, b)-shift of A to be the set $\{(x + a, y + b) \mid (x, y) \in A\}$. In particular, we are interested in the (a, b)-shift of the original instance, called the (a, b)-shifted instance, which by our choice of parameters lies inside the bounding box $[0, L]^2$.

Portals. Let $m \in \left[\frac{8\sqrt{2} \log L}{\varepsilon}, \frac{16\sqrt{2} \log L}{\varepsilon}\right)$ be a power of 2. Note that $m = O(\frac{\log n}{\varepsilon})$. For each super-pixel we mark each one of its four boundaries with m equidistant points that we refer to as portals.
In particular, the portals include the four corners of the super-pixel. Note that, as m is a power of 2, each portal of a super-pixel of level i is also a portal of a smaller super-pixel of level i + 1, for $i = 0, \ldots, \log L - 2$. This is illustrated in Figure 3.1.

Portal-limited-solutions. We define a portal-limited-solution as a solution that satisfies the following four conditions:
1. Each participant may cross the boundary of a super-pixel only at its portals.
2. The salesperson does not cross her own route except at portals, each of which she may visit at most twice.
3. A meeting between an agent and the salesperson occurs only at a pixel center.
4. If two (or more) agents happen to reside at a pixel, then they all travel to (or stay at) the pixel’s center and cease to move.

Figure 3.1: This figure illustrates the partition of each super-pixel into four smaller super-pixels, and the definition of portals (m on each super-pixel boundary). Note that a portal of the bigger super-pixel (black) is also a portal of the super-pixels contained in it (grey).

Therefore, in a portal-limited-solution, the tour of each participant is a collection of segments which connect portals to portals, and centers of pixels to portals. Additionally, in an optimal portal-limited-solution the tours of two agents do not cross. Using the above notations, our PTAS relies on the following two Lemmata:

Lemma 3.3. A minimal cost portal-limited-solution can be found in time polynomial in n.

Lemma 3.4. Let a, b be two even integers chosen uniformly at random from the set $\{0, 2, \ldots, L/2 - 2\}$. Then, the expected cost of a minimal cost portal-limited-solution of the (a, b)-shifted instance is at most $(1 + \varepsilon) \cdot OPT$.

The PTAS enumerates over all $O(L^2)$ values of (a, b) pairs. For each pair it applies Lemma 3.3 to find a minimal cost portal-limited-solution. Finally, it outputs the cheapest solution found, which according to Lemma 3.4 must have a cost of at most $(1 + \varepsilon) \cdot OPT$.
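The grid parameters defined above can be computed directly. The following is a minimal sketch, assuming base-2 logarithms; the function names (`grid_parameters`, `superpixel_side`, `count_superpixels`, `count_shift_pairs`) are hypothetical and introduced only for illustration.

```python
import math


def grid_parameters(n: int, eps: float):
    """Return (L, levels, m) for n participants and accuracy eps.

    Assumption: logarithms are base 2, as in the definition of L.
    """
    # L = 2^(3 + ceil(2 log n)): the smallest power of 2 that is >= 8 n^2.
    L = 8
    while L < 8 * n * n:
        L *= 2
    levels = L.bit_length() - 1  # log L; super-pixel levels are 0 .. levels - 1
    # m is the power of 2 in [lo, 2 * lo), where lo = 8 * sqrt(2) * log L / eps.
    lo = 8 * math.sqrt(2) * levels / eps
    m = 1
    while m < lo:
        m *= 2
    return L, levels, m


def superpixel_side(L: int, i: int) -> int:
    """Side length of a level-i super-pixel (level 0 is the bounding box)."""
    return L // 2**i


def count_superpixels(levels: int) -> int:
    """Total number of super-pixels: 4^i of them at each level i."""
    return sum(4**i for i in range(levels))


def count_shift_pairs(L: int) -> int:
    """Number of (a, b) pairs with a, b even and 0 <= a, b < L/2."""
    return (L // 4) ** 2
```

For example, with n = 4 and ε = 0.5 this gives L = 128, seven levels of super-pixels (pixels of side 2 at the deepest level) and m = 256 portals per boundary.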
Clearly, the $O(n^4)$ factor in running time, caused by the enumeration over all (a, b) pairs, can be avoided if only an expected $(1 + \varepsilon) \cdot OPT$ cost is desired. The proof of Lemma 3.3 explains how to consider both the salesperson’s and the other agents’ paths, while keeping the time polynomially bounded.

Proof. (of Lemma 3.3) We use dynamic programming to build a polynomial-size table. For each super-pixel, the table contains $6^{4m} = n^{O(1/\varepsilon)}$ entries. For each entry we store portions of some portal-limited-solutions (the portions of solutions limited to that super-pixel) together with their contribution to the overall cost. The construction of the table is conducted in a bottom-up manner, starting from the pixels. A minimal value portal-limited-solution for the whole instance is obtained at the bounding-box super-pixel.

The entries of the table for each super-pixel are represented by a list of 4m elements, one element for each portal of the super-pixel. Each element takes one of the following six values:
1. The salesperson enters the super-pixel at this portal
2. The salesperson leaves the super-pixel at this portal
3. The salesperson enters and leaves the super-pixel at this portal
4. One agent enters the super-pixel at this portal
5. One agent leaves the super-pixel at this portal
6. None of the participants use this portal

Note that the conditions defining a portal-limited-solution assure that these six cases cover all possible tour portions induced by all portal-limited-solutions (here we use the fact that two agents do not happen to reach the same portal, as they start at pixel centers, their tours do not cross and they end up at pixel centers). Also note that not all the 4m-size lists represent a valid portion of some portal-limited-solution (they may represent non-matching numbers of entrances and exits of the salesperson/agents, agents staying at a super-pixel which the salesperson does not visit, or two agents leaving the same pixel).
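The bound on the number of table entries per super-pixel follows from the choice of m. A short derivation is sketched below, assuming base-2 logarithms and n ≥ 2; the explicit constants are only one convenient choice and are not taken from the text.

```latex
% Assumptions: logarithms are base 2 and n >= 2.
\begin{align*}
\log L &= 3 + \lceil 2\log n \rceil \;\le\; 4 + 2\log n \;\le\; 6\log n, \\
m &< \frac{16\sqrt{2}\,\log L}{\varepsilon} \;\le\; \frac{96\sqrt{2}\,\log n}{\varepsilon}, \\
6^{4m} &= 2^{4m\log 6}
  \;\le\; 2^{(384\sqrt{2}\,\log 6)\,\log n/\varepsilon}
  \;=\; n^{(384\sqrt{2}\,\log 6)/\varepsilon}
  \;=\; n^{O(1/\varepsilon)}.
\end{align*}
```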
We use the term valid-list for a list that represents a collection of tours that can be extended to some portal-limited-solution (and this validity can easily be checked in O(m) time). Clearly, there are at most $6^{4m} = n^{O(1/\varepsilon)}$ (valid-)lists. Finally, note that the salesperson’s paths can intersect only at her entrance or exit points. Hence, given a valid-list, pairings of the participants’ entrance and exit points can be found as in the algorithm of Arora, and there are $n^{O(1/\varepsilon)}$ options for such pairings [Aro98].

We now describe the construction in a bottom-up manner. Consider a pixel. Each valid-list of the pixel may fall into one of the following three categories:
1. There is no agent in the pixel and the salesperson may visit the pixel one or more times.
2. There is one agent in the pixel. If the salesperson visits the pixel they meet at the pixel’s center.
3. Two or more agents visit the pixel. The salesperson also visits the pixel. In one of the visits she arrives at the center of the pixel. In this case, each agent travels along a straight line from a portal of the pixel to the center of the pixel. Alternatively, an agent’s route may be an empty route if the agent is already located at the center of the pixel.

In each case, the computation of the minimal cost for each valid-list of the pixel can be done in polynomial time. For each valid-list, the minimal cost is kept in the table, along with the pairing for which this minimum is obtained.

We now turn to the computation of the table’s entries for the super-pixels of level i, assuming all valid-lists of super-pixels of level i + 1 were computed. Let S be a level i super-pixel and consider a list of values from {1, ..., 6} for its portals. The list already fixes the entrances and exits on the boundary of S. The super-pixel S contains four level i + 1 super-pixels, which have four boundaries internal to S, with a total of at most 4m more portals.
Each of these portals may be used in one out of the six ways, giving rise again to $n^{O(1/\varepsilon)}$ possibilities. The cost for each possibility can be computed by using the values for the four level i + 1 super-pixels previously obtained. Thus, we can find the minimal cost that corresponds to each list in $n^{O(1/\varepsilon)}$ time. For each valid-list, we keep in the table which combination of uses of the internal boundaries provided the minimal cost.

For the top-level super-pixel (the bounding box) we may only consider the list for which neither the salesperson nor an agent visits a portal. The last table update of level 0 produces the cost of a minimal portal-limited-solution. We can reconstruct the solution itself, since for each valid-list we kept the pairings and internal-boundary uses that had minimal cost.

The proof of Lemma 3.4 mainly follows arguments from the PTAS of Euclidean TSP [Aro98], and is given here for completeness.

Proof. (of Lemma 3.4) Let π be an optimal solution. For every $a, b \in \{0, 2, \ldots, L/2 - 2\}$ denote by $\pi_{ab}$ the (a, b)-shift of π. We have to show, given a randomly chosen a and b, how to change $\pi_{ab}$ to a portal-limited-solution such that the expected increase in cost would be at most $\varepsilon \cdot OPT$. We refer to the axis-parallel lines of the form x = 2k or y = 2k, where k is an integer, as even grid lines. Note that all portals are located on even grid lines.

Suppose that in π, a participant travels along a segment that crosses an even grid line $\ell$. Let a and b be two even numbers chosen uniformly at random from $\{0, 2, \ldots, L/2 - 2\}$. Denote by $\ell_{ab}$ the (a, b)-shift of $\ell$. Note that the probability (over the choices of a and b) that $\ell_{ab}$ contains a boundary of a level i super-pixel is $2^i/(L/4)$. Following the choice of a and b, we replace the segment traveled by the participant by two segments, so that the crossing of $\ell_{ab}$ is at the closest portal on $\ell_{ab}$.
The corresponding increase in cost is bounded by the inter-portal distance on $\ell_{ab}$, which is $(L/2^i)/m$. Thus, we may bound the expected increase in cost due to this crossing by

$$\sum_{i=1}^{\log L} \frac{2^i}{L/4} \cdot \frac{L}{2^i m} = \frac{4 \log L}{m} \le \frac{\varepsilon}{2\sqrt{2}}.$$

The last inequality holds as $m \in \left[\frac{8\sqrt{2} \log L}{\varepsilon}, \frac{16\sqrt{2} \log L}{\varepsilon}\right)$.

Now, consider a solution $\pi'_{ab}$ which is obtained by replacing each segment of $\pi_{ab}$ by two axis-parallel segments. Clearly, the number of even grid line crossings in $\pi_{ab}$ is at most the number of even grid line crossings in $\pi'_{ab}$, which is at most $\sqrt{2} \cdot OPT$.

By combining the last two arguments we obtain that the total expected increase of cost is at most $\varepsilon/2 \cdot OPT$. Thus, we showed how to obtain a solution with an expected total cost of at most $(1 + \varepsilon/2) \cdot OPT$, which satisfies condition (1) of the portal-limited-solution definition.

Now we may remove self-intersections by “short-cutting”. In addition, if a portal is used more than twice, we can keep “short-cutting” on the two sides of the portal until the portal is used at most twice. (If this introduces additional self-intersections, they can also be removed.) The obtained solution has an expected total cost of at most $(1 + \varepsilon/2) \cdot OPT$ and it satisfies conditions (1) and (2) of the portal-limited-solution definition.

Note that changing the solution by moving each meeting point between an agent and the salesperson to the nearest pixel center increases the cost by at most $O(n) = O(OPT/n)$. Additionally, note that if in our solution two (or more) agents happen to meet, then they may cease to move. This holds, since in such a case the salesperson may come to meet the agents (and return) without increasing the total cost. Combined with the previous argument, we obtain that we can change the solution to also satisfy conditions (3) and (4) of the portal-limited-solution definition, without increasing the total cost by more than $O(1/n) \cdot OPT$.
The latter cost is less than $\varepsilon/2 \cdot OPT$, for a sufficiently large n. Thus, we obtained a portal-limited-solution which has an expected total cost of at most $(1 + \varepsilon) \cdot OPT$. Therefore, the proof is complete.

3.2.1.2 Min-Sum Sales Euclidean-cTSP

For this problem we obtain a (5/3 + ε)-approximation, and improve the ratio to 3/2 + ε for its planar version (for an arbitrarily small ε > 0). We also prove NP-hardness, even for the planar case.

Lemma 3.5. Consider an instance of Min-Sum Sales Euclidean cTSP where there is no more than one participant in any single point, and there are no three participants on the same straight line. Solving it is equivalent to finding a bounded-degree minimum-spanning-tree, spanning the initial locations of the participants, where the degree-bound is 1 for the salesperson’s tree-node and 3 for all the other nodes.

Proof. Consider an optimal solution for Min-Sum Sales Euclidean cTSP. Clearly, the participants move in straight lines between the initial locations of agents, to sell them the goods. We assume w.l.o.g. that all the intersections between the participants’ routes occur at initial locations of agents (if the routes of two agents intersect at another location, we can switch between them and thus lower the cost of the solution). We can also assume w.l.o.g. that any initial location of an agent is only visited once in an optimal solution. Thus, in an optimal solution, the collection of the routes used by the participants forms a spanning tree of their initial locations. The degrees of the spanning tree are bounded by 3, since at most one participant enters a tree-node and at most two leave it. The node corresponding to the salesperson must have degree 1. On the other hand, any such bounded-degree spanning-tree produces a solution for our problem.
Such a solution can be obtained by simply directing the edges of the spanning-tree from parent to child and letting the participants follow these directed edges, starting with the salesperson (such that a single participant traverses each tree-edge). Therefore, finding a minimum-spanning-tree which satisfies these degree-constraints is equivalent to solving our problem in this case.

Corollary 3.6. Min-Sum Sales Euclidean cTSP can be approximated within 5/3 + ε, for any ε > 0.

Proof. We slightly perturb the input locations, such that they satisfy the conditions of Lemma 3.5. Khuller et al. [KRY96] showed that a minimum-spanning tree in any fixed-dimension Euclidean space can be modified to satisfy the degree constraints we require (1 for a pre-specified node and 3 for the others), while increasing its weight by a factor of at most 5/3. Thus, the Corollary follows.

For the planar case, we manage to improve the approximation ratio to 3/2 + ε:

Theorem 3.7. Min-Sum Sales Planar-cTSP can be approximated within 3/2 + ε, for any ε > 0.

Algorithm Sales-Bounded-MST:
1. Compute a minimum-spanning tree, spanning all the agents’ initial locations (not including the salesperson), such that its degrees are bounded by 5.
2. Let p be the location of the salesperson, and let q be the location of the agent closest to her. Let r be the location of one of the agents connected to q in the above tree. Transform the subtree rooted at r into a tree with degree at most 3, such that the degree of r is 1.
3. Transform the subtree rooted at q, without the subtree rooted at r, into a tree with degree at most 3, such that the degree of q is 1.
4. Connect q to r and connect q to p. Output the resulting tree.

Figure 3.2: A 3/2-approximation algorithm for a 3-bounded-degree tree that spans the participants’ locations and has the salesperson’s node of degree 1.

Proof.
By slightly perturbing the initial locations of the participants, we can assume that the assumptions of Lemma 3.5 hold (this increases the optimal cost by a factor of 1 + ε, for an arbitrarily small ε > 0). We look for the optimal solution of this slightly perturbed input, i.e., we look for a minimum spanning tree satisfying the degree-constraints stated in Lemma 3.5. We find an approximate solution using the algorithm Sales-Bounded-MST of Figure 3.2.

The first stage of the algorithm can easily be performed in polynomial time [MS92]. Note that connecting p and q right after this stage would have yielded a minimum-spanning-tree which spans all the initial locations. Stages 2 and 3 are performed using the 3/2-approximation algorithm of Khuller et al. [KRY96], which requires that the degree of each node be at most 5 and the degree of the root be at most 4. Thus, each of these stages increases the weight of the transformed subtree by a factor of at most 3/2 [KRY96]. So all in all, we obtain a spanning tree which satisfies the degree bounds and has a weight of at most 3/2 times that of a minimum-spanning-tree, which also means at most 3/2 times the cost of an optimal solution. As was explained in Lemma 3.5, the participants can now follow the edges of this tree (rooted at p with edges directed from parent to child), and form a solution whose cost is the cost of the tree. Thus, the Theorem follows.

Claim 3.8. Min-Sum Sales Planar-cTSP is NP-hard.

Proof. Finding the minimum-spanning-tree whose degree is bounded by 3 is NP-hard [PV84]. We note that the proof of [PV84] holds even if it is guaranteed that no three points of the input lie on a straight line (since the input points can be slightly perturbed in their construction). Requiring that a certain node will be a leaf only makes the problem harder (by solving the problem for all the possible locations of a leaf one can find the solution for the problem without this requirement).
Since according to Lemma 3.5 this bounded-MST problem can be easily reduced to our problem (with the same input, where the salesperson is at the point which should be a leaf), our problem is NP-hard.

3.2.1.3 Min-Sum Full-Cooperation Euclidean-cTSP

A (1 + ε)-approximate minimum Steiner tree, spanning all the participants’ initial locations, can be computed by using the PTAS of Arora [Aro98]. Clearly, by letting the salesperson tour the Steiner tree we obtain a (2 + 2ε)-approximate solution for our problem. Thus,

Corollary 3.9. Min-Sum Full-Cooperation Euclidean-cTSP can be approximated within 2 + ε, for any ε > 0.

3.2.2 Min-Max Euclidean-cTSP

3.2.2.1 Min-Max Purchase Euclidean-cTSP

We first show that both the path and the roundtrip versions of Min-Max Purchase Euclidean-cTSP have a PTAS. We do so by manipulating the input instance such that it fits the PTAS for the graph version of the problem (see Subsection 3.3.2.1).

Claim 3.10. The roundtrip and path versions of Min-Max Purchase Euclidean-cTSP admit a PTAS.

Proof. Consider an instance of the path version. We assume, w.l.o.g., that the instance lies inside $[0, 1]^2$ and has an optimal cost of at least 1/2. Let $m = \lceil n/\varepsilon' \rceil$, where n is the number of participants and $\varepsilon'$ is a parameter to be determined later. We divide the unit square $[0, 1]^2$ into $m^2$ pixels, i.e., a pixel is a square of the form $[\frac{j}{m}, \frac{j+1}{m}] \times [\frac{j'}{m}, \frac{j'+1}{m}]$, where $j, j' = 0, 1, \ldots, m-1$. We consider a slightly changed input, where each participant is located in the center of its pixel. This instance can be approximated using Coarse-Path$(G, W, v, \varepsilon'')$, the PTAS for the corresponding graph variant of the problem (see Subsection 3.3.2.1), as follows. Let G be a complete graph, with the $m^2$ pixels as vertices. Let W(e), the weight of each edge, be the Euclidean distance between the corresponding pixels’ centers.
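The construction of G and W described so far can be sketched as follows. This is only an illustration under stated assumptions: the function name `pixel_graph`, the row-major numbering of pixels, and the dense distance matrix are choices made here, not part of the original construction, and Coarse-Path itself is not implemented.

```python
import math


def pixel_graph(points, m):
    """Snap participants in [0, 1]^2 to pixel centers and build (centers, W, pixel_of).

    points   : list of (x, y) participant locations in the unit square
    m        : number of pixels per side (m = ceil(n / eps') in the text)
    centers  : centers of the m*m pixels, numbered j * m + j' (hypothetical convention)
    W        : dense matrix of Euclidean distances between pixel centers
    pixel_of : for each participant, the index of the pixel that contains it
    """
    def index(t):
        # A coordinate equal to 1.0 is assigned to the last pixel.
        return min(int(t * m), m - 1)

    centers = [((j + 0.5) / m, (k + 0.5) / m) for j in range(m) for k in range(m)]
    pixel_of = [index(x) * m + index(y) for x, y in points]
    W = [[math.dist(c, d) for d in centers] for c in centers]
    return centers, W, pixel_of
```

Each participant is moved by at most half a pixel diagonal, i.e. by at most √2/(2m), which is the quantity used in the error analysis of the proof.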
Let v be the pixel which contains the salesperson and let $\varepsilon''$ be an arbitrarily small constant (to be determined shortly). The solution for the altered instance is amended into a solution for the original instance by connecting the original location of a participant and the center of its pixel (each such connection is of length at most $\frac{\sqrt{2}}{2m} \le \frac{\sqrt{2}\,\varepsilon'}{2n}$).

Denote by OPT the optimal solution for the original instance, by OPT′ the optimal solution for G, by ALG′ the output of the PTAS for G, and by ALG the output of the whole algorithm. Both OPT and OPT′ consist of at most 2n segments (one for each agent and at most n for the salesperson). Therefore, OPT′ is at most $2n \cdot \frac{\sqrt{2}\,\varepsilon'}{2n} = \sqrt{2}\,\varepsilon'$ larger than OPT. Similarly, ALG is at most $2\sqrt{2}\,\varepsilon'$ larger than ALG′. Thus,

$$ALG \le ALG' + 2\sqrt{2}\,\varepsilon' \le (1 + \varepsilon'')\,OPT' + 2\sqrt{2}\,\varepsilon' \le (1 + \varepsilon'')(OPT + \sqrt{2}\,\varepsilon') + 2\sqrt{2}\,\varepsilon' \le (1 + \varepsilon'')\,OPT + 4\sqrt{2}\,\varepsilon'.$$

The running time of the algorithm is dominated by the running time of Coarse-Path. Thus, it is $O\left((n/\varepsilon')^{(2/\varepsilon'') + 6}\right)$. Choosing $\varepsilon' = 1/\lg n$ and $\varepsilon'' = \varepsilon - 12/\lg n$, as $1/2 \le OPT$ we obtain that $ALG \le (1 + \varepsilon)\,OPT$ and the running time is $O\left((n \lg n)^{2/(\varepsilon - 12/\lg n) + 6}\right)$. Exactly the same arguments yield a PTAS for the roundtrip version, with the same running time.

3.2.2.2 Min-Max Sales Euclidean-cTSP

Claim 3.11. Min-Max Sales Euclidean-cTSP can be approximated within a factor of 3 both for the path and the roundtrip versions of the problem.

Proof. We consider the complete graph whose vertices are the initial locations of the participants and whose edge-weights are the distances. We solve the problem for that graph using algorithm Hop-visit, described for graphs in Subsection 3.3.2.2. The same analysis holds here as well.

3.2.2.3 Min-Max Full-Cooperation Euclidean-cTSP

Similarly to Claim 3.11, we can obtain the approximation by considering the complete graph whose vertices are the initial locations of the participants and whose edge-weights are the distances.
We use here the algorithm described in Subsection 3.3.2.3, and the same analysis holds here as well. We thus have the following Corollaries:

Corollary 3.12. Min-Max Full-Cooperation Euclidean-cTSP can be approximated within a factor of 4.

Corollary 3.13. The roundtrip version of Min-Max Full-Cooperation Euclidean-cTSP can be approximated within a factor of 2.

3.2.3 Min-Makespan Euclidean-cTSP

In this Subsection we mainly present a simple PTAS for the roundtrip version of Min-Makespan Sales Euclidean-cTSP. A PTAS for the corresponding Full-Cooperation problem, in both the path and roundtrip versions, can be obtained by similar means.

In addition, we note that both the path and the roundtrip versions of the corresponding Purchase problem, namely Min-Makespan Purchase Euclidean-cTSP, are polynomial-time solvable. This can be observed as follows. For both the path and the roundtrip versions any optimal solution can be modified to an optimal solution in which all participants meet at a single point. For the path version, the modification can be done by letting all the participants meet the salesperson at the last point she visits. For the roundtrip version, denote the value of an optimal solution by OPT. Then, the modification of the optimal solution can be done by letting all the participants meet at the point where the salesperson resides at time OPT/2 (afterwards, all participants return to their initial location). Hence, for both the path and the roundtrip cases, the single meeting point is the center of the enclosing sphere, and can thus be found in polynomial time (see for example Megiddo [Meg83], Welzl [Wel91] and de Berg et al. [dBvKOS00]).

A PTAS for the path version of Min-Makespan Sales Euclidean-cTSP has been obtained by Arkin et al. [ABF+02]. We next present a PTAS for the roundtrip version of this problem.
3.2.3.1 The Roundtrip version of Min-Makespan Sales Euclidean-cTSP

This problem seems quite close in nature to the corresponding path version, and thus calls for a similar PTAS. However, note that converting an optimal solution for the path version into a solution for the roundtrip version (by simply letting all the participants return) only guarantees a (2 + ε)-approximation for this problem. This is true since a participant’s way back may double the makespan, while the roundtrip version may have a solution with the same makespan as the path version. Thus, constructing a PTAS for the roundtrip version requires considering in advance that the participants should return to their initial locations. We therefore use a different approach from the one used by [ABF+02] for the path version of the problem.

We show the PTAS for the two-dimensional case (see Figure 3.3). The generalization to any fixed dimension is straightforward.

Theorem 3.14. The roundtrip version of Min-Makespan Sales Euclidean-cTSP admits a PTAS. The running time of the PTAS is O(n + f(ε)), where ε > 0 is an arbitrarily small constant, f(ε) depends only on ε, and n is the number of participants.

Proving this theorem requires some preparations and preliminary observations. A constant approximation algorithm for the path version of this problem appears in [ABF+02]. The solution found by their algorithm is also O(1) times the diameter (the maximal distance between any two points) of the input. One can adapt this approximation to the roundtrip version simply by returning each participant to its origin. The cost of the resulting solution is at most twice the cost of the original solution. Since an optimal solution to the path version costs no more than an optimal solution for the corresponding roundtrip version, this heuristic is a constant approximation for the roundtrip version.

We assume, w.l.o.g., that the instance lies inside $[0, 1]^2$ and has an optimal cost of at least 1/2. Let $m = \lceil 1/\varepsilon \rceil$.
We divide the unit square $[0,1]^2$ into $m^2$ pixels. I.e., a pixel is a square of the form $[\frac{j}{m}, \frac{j+1}{m}] \times [\frac{j'}{m}, \frac{j'+1}{m}]$, where $j, j' = 0, 1, \ldots, m-1$. The PTAS for the Sales version relies on the next lemma:

Lemma 3.15. Let I be an instance of n participants with an optimal makespan of OPT. Then, there exists an instance S ⊆ I with at most $3m^4$ participants, in which each non-empty pixel in I is also non-empty in S and the optimal makespan of S is at most (1 + O(ε))OPT.

Proof. We may assume, w.l.o.g., that no two participants in I are located at the same point and that no three participants lie on a straight line. Otherwise, we can perturb each participant's location by at most ε/n. An optimal solution to the perturbed instance has a cost of at most OPT + O(ε) (as the cost increase per participant is at most $2\varepsilon/n$). Since OPT ≥ 1/2, this cost is at most (1 + O(ε))OPT.

We show how we can remove participants from I while keeping the cost of an optimal solution at most OPT + O(ε). Let π be an optimal solution to I. We define the sales-tree of π to be a directed graph in which the nodes are the locations of the participants and there is a directed edge from u to v if a participant traveled from u to v in π. Since no two participants are located at the same point and no three participants lie on a straight line, the in-degree of every node is one and the out-degree is at most two.

We prune the sales-tree of π by iteratively removing leaves: we remove a leaf u if there exists another node in the sales-tree which resides in the same pixel as u. At the end of the process we are left with at most $m^2/2$ leaves, and at most $m^2/2$ nodes of degree 3 (in-degree plus out-degree). Note that the makespan of an optimal solution for the new instance, denoted π0, is at most OPT.

Makespan-Sales PTAS
1. For each pixel which contains more than $3m^4$ agents, arbitrarily select a subset of $3m^4$ agents. Let P be the subset of the participants which contains all these selected agents, as well as all the agents of the other pixels (that contained at most $3m^4$ agents), and the salesperson.
2. Enumerate over subsets S ⊆ P, of size up to $3m^4$, which include the salesperson and contain a representative from each non-empty pixel. For each such subset S:
   (a) Find an optimal solution for S by conducting an exhaustive search.
   (b) In each non-empty pixel apply a constant-approximation to all original participants of the pixel, where the salesperson is a representative of the pixel.
   (c) Extend the partial solution of S to a solution for the original instance: when all the participants in S return to their pixels, simultaneously perform the solution found in step 2(b).
3. Return the minimal cost solution found.
Figure 3.3: A PTAS for the roundtrip version of Min-Makespan Sales Euclidean-cTSP. The parameter m is assumed to be $\lceil 1/\varepsilon \rceil$.

We now further decrease the number of participants by pruning some of the degree-2 vertices. We call a maximal set of participants along a path in which all the nodes are of degree 2 a chain. Clearly, each chain ends with a degree 3 node or a leaf. Hence, there are at most $m^2$ chains. For each chain, and each pixel it intersects, we intend to keep at most two nodes (participants). All the other nodes are removed from the chain. For a given pixel and a chain, the two participants that we keep are the first and the last (of this chain, inside the pixel) who receive the goods. We call such nodes a beginner node and an ender node, respectively. Note that we are left with at most $2 \cdot m^2$ participants per chain, giving rise to at most $2m^4$ nodes of degree 2. The new instance constructed, denoted S, has at most $3m^4$ participants. We next show that

Claim 3.16. There exists a solution $\pi_S$ for S of cost at most OPT(1 + O(ε)).

Proof. Recall that π0 (an optimal solution after pruning the leaves) is of cost at most OPT. We construct the solution $\pi_S$ from π0 as follows: Each participant of a beginner node travels along the corresponding original chain until it reaches the corresponding ender node, and then travels back to its starting location. All other participants travel along the same route they travel in π0. Thus, they arrive at their original locations by time OPT. Beginner participants may be delayed by the time it takes to travel from the corresponding ender node back to their original location. This is at most the time it takes to cross a pixel, which is at most $\sqrt{2}\varepsilon$. Thus, the cost of an optimal solution to S is at most (1 + O(ε))OPT.

This concludes the proof of Lemma 3.15. The correctness of the PTAS algorithm for the roundtrip version of Min-Makespan Sales Euclidean-cTSP can now be deduced:

Proof (of Theorem 3.14). Let π be an optimal solution for the instance I and let $S^* \subseteq I$ be an instance that satisfies the condition of Lemma 3.15. The cost of an optimal solution to $S^*$ is at most (1 + O(ε))OPT. The subset of participants $S^*$ is not necessarily included in the enumeration of our algorithm. However, our enumeration does include a subset S such that $|S| = |S^*|$ and S and $S^*$ have exactly the same number of agents in each of the pixels. Clearly, the agents in S can simultaneously move to the positions of the agents in $S^*$, in O(ε) time. Thus, the cost of an optimal solution for S, which is computed at step 2(a) of our algorithm, is at most (1 + O(ε))OPT. The additional cost produced at step 2(c) is at most a constant times the diameter of the pixel, which is O(ε). Note that this is an additive O(ε) increase of the makespan, as after all the participants in S return to their pixels the delivery to the other participants is done in parallel.
Hence, the total cost of the solution produced by our algorithm is at most (1 + O(ε)) times the cost of π.

Finally, note that there are fewer than $O(m^4)^{O(m^4)} = O(1/\varepsilon)^{O(1/\varepsilon^4)}$ sets of participants to enumerate on. For each such subset S, a solution is a sequence of at most 2|S| − 1 moves. This follows as in each move either a participant receives the delivery or a participant returns to its original location. In any case, each move can be represented as a pair of two of the original input locations. Hence, for a given subset S, the number of solutions the algorithm enumerates on is at most

$$\binom{|S|}{2}^{2|S|-1} = O\left(\binom{m^4}{2}\right)^{O(m^4)} = \left(\frac{1}{\varepsilon}\right)^{O(1/\varepsilon^4)}.$$

Thus, the algorithm is a PTAS and runs in time $O\left(n + (1/\varepsilon)^{O(1/\varepsilon^4)}\right)$.

3.3 cTSP in Graphs

In this section we present the algorithmic and hardness results for cTSP in graphs.

3.3.1 Min-Sum cTSP

We start by considering cTSP with the Min-Sum objective. For the path versions, we provide simple approximation algorithms and hardness results for each cooperation mode. For the roundtrip versions, the corresponding Purchase, Sales and Full-Cooperation problems are all equivalent to the classical metric-TSP, as explained in Claim 3.1, and thus have the same approximation and hardness results. We therefore have the following corollary:

Corollary 3.17. The roundtrip versions of Min-Sum Purchase cTSP, Min-Sum Sales cTSP and Min-Sum Full-Cooperation cTSP can all be approximated within 3/2, and cannot be approximated within 131/130, unless P = NP.

These observations do not hold for the other objectives, in which there is also a significant difference between the different cooperation-modes. We begin with the approximability results for the various path versions:

Claim 3.18. Under the Min-Sum objective, Purchase and Full-Cooperation cTSP can be approximated within 2 + ln 3. If each vertex contains a participant, the approximation ratio improves to 2. Min-Sum Sales cTSP can be approximated within 2.

Proof.
For the first two problems, we simply find an approximate minimum Steiner-tree that spans the vertices which contain participants, and the salesperson visits all the agents by touring this tree (e.g., in an "infix-order"). The total distance traveled is twice the tree's weight. We use the approximation algorithm of [RZ00] for the minimum Steiner-Tree problem, which has an approximation ratio of 1 + (ln 3)/2 (≈ 1.55). Therefore, the distance traveled is at most (2 + ln 3) times the weight of the minimum Steiner-tree. On the other hand, the edges used by any solution to these problems must form a connected subgraph which spans the vertices that contain participants (since all the agents receive the goods). This means that the total distance traveled is at least the minimum Steiner-Tree weight. Therefore, the simple algorithm described has an approximation ratio of 2 + ln 3.

If each vertex contains a participant, then a minimum-spanning-tree can be computed exactly. Thus, the approximation ratio in this case is 2. For Sales cTSP it is again sufficient to compute a minimum-spanning-tree, since the goods can be delivered to an agent only in the original vertex of that agent.

We next provide hardness results for each cooperation-mode. As in the Euclidean case, the Purchase version is NP-hard, since the classical path-TSP [Hoo91] can be reduced to an instance of Min-Sum Purchase cTSP. The reduction is again done simply by replacing each customer with three agents. Thus, the salesperson is the only participant who moves. Like TSP, the path-TSP problem has a 3/2 approximation when the triangle inequality holds [Hoo91]. Therefore, improving the approximation ratio for our problem strictly below 3/2 will also improve the approximation for path-TSP.

We next address the NP-hardness of Min-Sum Sales cTSP.

Claim 3.19. Min-Sum Sales cTSP is NP-hard.

Proof. We use a reduction from path-TSP [Hoo91].
Recall that an instance of path-TSP consists of a complete weighted undirected graph, G = (V, E), in which the weight function satisfies the triangle inequality, and a vertex v ∈ V, in which the salesperson is located. A solution is a Hamiltonian path which has v as one of its endpoints. The goal is to find a solution of minimal weight.

Given an instance of path-TSP, we construct an instance of Min-Sum Sales cTSP as follows. For each vertex u ≠ v, we add a vertex u′, and connect it to u by an M-weighted edge, where M is twice the sum of the edge weights of G plus 1. We denote this new graph by G′ = (V′, E′). Each vertex of G′ contains a participant, and the participant in v is defined to be the salesperson.

Clearly, if the optimal path-TSP solution is of length C, then there is a solution for the new problem with total length (n − 1)M + C. On the other hand, assume there is an optimal solution for the new problem with a total cost of (n − 1)M + C. Clearly, the agents in new vertices don't move in such a solution (since it already costs at least (n − 1)M to reach them, and if such an agent moves the cost is increased by at least M). We prove that there is an optimal solution for that problem in which agents adjacent to new vertices only move to the new vertex adjacent to them, and therefore the salesperson visits all the vertices of G by traversing a path of length C (this path is simple since the triangle inequality holds in the original graph G).

Assume this is not true. Hence, there is an agent who travels to a vertex which is not the new vertex adjacent to it. Let the agent who started at vertex u be the first such agent which the salesperson meets. Clearly, some agent must visit u′. We can assume w.l.o.g. that this agent receives the goods through the agent of vertex u (not necessarily directly from him), since otherwise we can simply "switch names" between the agent of vertex u and the salesperson when they meet.
It is easy to see that by switching names between agents when they meet, we can obtain a solution with the same cost, in which the agent of vertex u is the agent who returns to u and moves to u′. Therefore, the salesperson could have done the tour of that agent by himself and returned to u, and the agent of vertex u could have gone immediately to u′, without affecting the cost of the solution or the visited agents. This argument can also be applied to each of the next agents which the salesperson meets in the given optimal solution. Therefore, there is an optimal solution in which these agents only move to new vertices, as required. Thus, there is a Hamiltonian path in G which starts at v and has total length C, and the proof is completed.

For Min-Sum Full-Cooperation we have:

Theorem 3.20. Min-Sum Full-Cooperation cTSP is APX-hard.

Proof. We use a reduction from a variant of Set-Cover, in which each element appears in exactly k sets and each set is of size d. We call this variant (k, d)-Set-Cover. We rely on the following theorem of [DGKR05]:

Theorem 3.21. [DGKR05] For every k > 2, ε > 0 and a sufficiently large n, there exists a positive integer d, such that the following holds: Given an instance of (k, d)-Set-Cover with n elements and m sets, it is NP-hard to decide whether there exists a solution of size $\frac{m}{k-1-\varepsilon}$ or every solution is of size at least (1 − ε)m.

Let C be an instance of (k, d)-Set-Cover, with k and d values for which the last theorem holds. C is a collection of m subsets of size d of a finite set S (|S| = n), such that each element of S appears in exactly k subsets. We construct the following instance of our problem. Let the graph G = (V, E) be constructed as follows. We have a vertex v_c for every set c ∈ C, a vertex v_s for every element s ∈ S, and two other vertices u, v. Namely, V = {u, v} ∪ V_C ∪ V_S, where V_C = {v_c | c ∈ C}, V_S = {v_s | s ∈ S}.
The edge-set E is defined as follows: every vertex v_c ∈ V_C is connected to v, there is an edge between v_s ∈ V_S and v_c ∈ V_C iff s ∈ c, and u is connected to v. Namely, E = {(u, v)} ∪ {(v, v_c) | c ∈ C} ∪ {(v_c, v_s) | s ∈ c, c ∈ C}.

Figure 3.4: An illustration of the graph construction described in the proof of Theorem 3.20.

The weight of each edge is 1, except for the edge (u, v), whose weight is 0. Each vertex v_s ∈ V_S contains one agent, v contains $\lceil \frac{m}{k-1-\varepsilon} \rceil - 1$ agents, and u contains the salesperson. Let B be the instance constructed for our problem.

Claim 3.22 (Completeness). If there is a solution for C of size at most $\frac{m}{k-1-\varepsilon}$, then there is a solution for B of cost at most $\lceil \frac{m}{k-1-\varepsilon} \rceil + n$.

Proof. Let C′ be the solution for C of size at most $\lceil \frac{m}{k-1-\varepsilon} \rceil$. The solution for B is as follows. The salesperson moves to v, and then the $\lceil \frac{m}{k-1-\varepsilon} \rceil$ participants now at v traverse from v to V_{C′}, where V_{C′} = {v_c | c ∈ C′}. Each of the n agents at V_S moves to a neighbor in V_{C′} (as C′ is a cover, every vertex in V_S has such a neighbor). Thus, the total cost of the solution for B is at most $\lceil \frac{m}{k-1-\varepsilon} \rceil + n$.

Claim 3.23 (Soundness). If every solution for C is of size at least (1 − ε)m then every solution for B is of cost at least n + (1 − ε)m.

Proof. Note that there is an optimal solution in which every agent in V_S makes at least one step. This holds, since otherwise another participant has to traverse an edge adjacent to that agent, so the solution can only be cheaper if that agent from V_S moves towards the other participant. As every solution for C is of size at least (1 − ε)m, at least (1 − ε)m of the vertices of V_C are populated after the agents in V_S make one step. Therefore, at least (1 − ε)m more steps are needed for that optimal solution. Thus, every solution to B is of cost at least n + (1 − ε)m.

By the completeness and soundness claims we obtain that it is NP-hard to approximate the problem to within $\frac{n + (1-\varepsilon)m}{n + \frac{m}{k-1-\varepsilon} + 1}$, which is about $\frac{d + (1-\varepsilon)k}{d+1}$ (as $m = \frac{kn}{d}$). As k > 2, the problem is APX-hard.

3.3.2 Min-Max cTSP

We first present a simple PTAS for the Purchase version of this problem, and then prove that it has no FPTAS, assuming P ≠ NP. For the other cooperation-modes, we present constant lower bounds on the approximation-ratio, assuming P ≠ NP. We also provide algorithms that find constant factor approximations for these problems, which are tight in one case, and are at most twice the lower bounds in the other cases. The results for the roundtrip versions resemble the results for the corresponding path versions.

3.3.2.1 Min-Max Purchase cTSP

We start by presenting the PTAS for the Purchase version, which appears in Figure 3.5 (algorithm Coarse-Path).

Coarse-Path(G = (V, E), W, v, ε):
1. For each ordered subset V′ ⊆ V of size 1 + ⌊1/ε⌋ or less, which starts with v:
   (a) For each u ∉ V′ that contains an agent, find its distance to a closest vertex in V′. Denote the maximal distance found by MaxDist(V′).
   (b) Compute the sum of distances between pairs of consecutive vertices in V′, and denote it by Length(V′).
   (c) Let Cost(V′) be the maximum of Length(V′) and MaxDist(V′).
2. Pick the ordered subset V′ for which Cost(V′) is minimal.
3. Return the following solution: The salesperson follows the shortest paths between the consecutive vertices of V′. Each of the agents meets the salesperson at a closest vertex to that agent in V′. The salesperson waits for all the agents who come to a certain vertex before moving to the next vertex.
Figure 3.5: A PTAS for Min-Max Purchase cTSP.

Theorem 3.24. Algorithm Coarse-Path is a PTAS for Min-Max Purchase cTSP.

Proof. Clearly, the Min-Max cost of the solution returned by the algorithm is the minimal Cost(V′) of the subsets it considers. We show that one of these subsets has Cost(V′) of at most (1 + ε) times the optimum.
Consider an optimal solution to the problem π, in which the cost is OPT. Choose a subset of the vertices of the path traveled by the salesperson in the following way. Start with vertex v, and then choose a vertex iff its distance from the previous vertex chosen is at least ε · OPT. Clearly, at most 1/ε vertices are selected. Denote this subset by V′. Note that Length(V′) ≤ OPT. For each vertex u ∉ V′ that contains an agent, there is a vertex in V′ at a distance of at most (1 + ε) · OPT. This holds, since for each vertex w visited by the salesperson in π, V′ contains a vertex at a distance of at most ε · OPT from w. Thus, MaxDist(V′) ≤ (1 + ε)OPT, and Cost(V′) ≤ (1 + ε)OPT. Therefore, Algorithm Coarse-Path indeed finds a (1 + ε)-approximate solution.

The running-time of the algorithm is $O(n^{\lfloor 1/\varepsilon \rfloor + 3})$, since it enumerates over ordered subsets of vertices of size at most ⌊1/ε⌋, and the required computation for each ordered subset takes at most $O(n^3)$ time (if we assume that the graph is complete then this is O(n/ε)). Thus, Coarse-Path is a PTAS.

We similarly have a PTAS for the roundtrip version of the problem. Simply let all the participants return to their initial vertex at the end of Algorithm Coarse-Path, and compute the costs accordingly. It is easy to see that the arguments used for the path version hold here as well. Thus, we have:

Corollary 3.25. The roundtrip version of Min-Max Purchase cTSP admits a PTAS.

We next present the tight hardness results for these versions.

Claim 3.26. Min-Max Purchase cTSP has no FPTAS, unless P = NP.

Proof. We show a reduction from the Hamiltonian Path problem, where a given vertex v ∈ V must be an endpoint of the path. Given an input to that problem, G = (V, E), v ∈ V, we construct an instance of our problem in the following way. For each u ∈ V, we add a vertex u′ and an edge (u, u′), with a weight of n − 1 (the weights of the original edges remain 1).
We locate the salesperson at v, and we locate an agent at each of the newly added vertices. It is easy to see that the instance of the Hamiltonian Path decision problem is a "yes" instance iff the value of the optimal solution of the new instance is n − 1. Thus, our problem is strongly NP-hard (the n − 1 weight used in the reduction is obviously polynomial in the input-size), and therefore has no FPTAS.

Hop-visit(G = (V, E), W, v):
1. Let G′ = (V′, E′) be a weighted complete graph, where V′ ⊆ V is the set of vertices which contain participants, and the edge-weights are the corresponding distances in G.
2. Compute a minimum-spanning-tree T of G′, and root it at the salesperson's vertex v.
3. The salesperson visits an arbitrary child, and doesn't move any further.
4. When an agent receives a delivery:
   (a) If the agent has a sibling in T who has not received the delivery, then the agent visits such a sibling and one of that sibling's children.
   (b) Otherwise, the agent visits a child of the sibling which was visited first (a child of the "eldest" sibling of that agent), if such a child exists.
Figure 3.6: A 3-approximation algorithm for Min-Max Sales cTSP.

Claim 3.27. The roundtrip version of Min-Max Purchase cTSP has no FPTAS, unless P = NP.

Proof. Similarly to the proof of Claim 3.26, we apply a reduction from Hamiltonian Cycle. Given an input G = (V, E), we locate the salesperson in one of the vertices, v. Additionally, for each u ∈ V, we add a vertex u′, connected by an edge (u, u′) with a weight of n/2 (the weights of the original edges remain 1). All the newly added vertices contain an agent. It is easy to see that the instance of the Hamiltonian Cycle problem is a "yes" instance iff the value of the optimal solution of the new instance is n. Thus, our problem is strongly NP-hard and has no FPTAS.
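The enumeration performed by Algorithm Coarse-Path (Figure 3.5) can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the function names, the representation of the graph as an edge list over vertices 0, ..., n − 1, and the use of the Floyd-Warshall recurrence for shortest-path distances are all assumptions made for the example. The sketch returns only the chosen ordered subset V′ and its Cost(V′), from which the solution of step 3 can be read off.

```python
import itertools
import math


def all_pairs_shortest_paths(n, edges):
    """Floyd-Warshall distances for an undirected graph given as (u, w, weight)."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, w, weight in edges:
        dist[u][w] = min(dist[u][w], weight)
        dist[w][u] = min(dist[w][u], weight)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def coarse_path(n, edges, agent_vertices, v, eps):
    """Return (Cost(V'), V') for the best ordered subset V' that starts with v.

    agent_vertices plays the role of W (the vertices containing agents).
    """
    dist = all_pairs_shortest_paths(n, edges)
    max_size = 1 + math.floor(1 / eps)  # subsets of size 1 + floor(1/eps) or less
    others = [u for u in range(n) if u != v]
    best_cost, best_order = math.inf, None
    for extra in range(min(max_size - 1, len(others)) + 1):
        for tail in itertools.permutations(others, extra):
            order = (v,) + tail
            chosen = set(order)
            # Step 1(a): farthest agent from its closest vertex in V'.
            max_dist = max(
                (min(dist[u][x] for x in order)
                 for u in agent_vertices if u not in chosen),
                default=0.0,
            )
            # Step 1(b): length of the salesperson's route through V'.
            length = sum(dist[a][b] for a, b in zip(order, order[1:]))
            # Step 1(c): Cost(V') as defined in Figure 3.5.
            cost = max(length, max_dist)
            if cost < best_cost:
                best_cost, best_order = cost, order
    return best_cost, best_order
```

On the path graph 0 - 1 - 2 with unit weights, the salesperson at vertex 0, a single agent at vertex 2 and ε = 1, the call `coarse_path(3, [(0, 1, 1.0), (1, 2, 1.0)], {2}, 0, 1.0)` selects V′ = (0, 1) with cost 1: the salesperson and the agent each travel one edge and meet at vertex 1.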
3.3.2.2 Min-Max Sales cTSP

In this Subsection we present a 3-approximation algorithm for both the path and roundtrip versions of the problem. We later prove that these problems cannot be approximated within a factor of less than 2 and 3/2, respectively (unless P = NP). The simple constant approximation algorithm for Min-Max Sales cTSP is presented in Figure 3.6.

Theorem 3.28. Min-Max Sales cTSP is 3-approximable.

Proof. We prove that Algorithm Hop-visit is a 3-approximation algorithm for this problem. Clearly, all the agents are visited. Each participant traverses at most three edges of the MST, which means that the cost of the solution is at most thrice the weight of the heaviest edge of the MST. On the other hand, consider an optimal solution, and define G″ = (V′, E″), such that (u1, u2) ∈ E″ iff the participant from u1 sold the goods to the participant from u2, or vice versa. Let the weight of (u1, u2) ∈ E″ in G″ be the distance between u1 and u2 in G. The optimal cost is clearly at least the weight of the heaviest edge in E″, since selling to an agent requires traveling to this agent's vertex. Note that G″ is a connected subgraph of G′. It is well-known that an MST is lexicographically minimal, i.e., its heaviest edge is not heavier than that of any other spanning-tree or spanning-connected-subgraph. Therefore, the cost of the solution found by the above algorithm is at most thrice the cost of an optimal solution.

A similar argument holds for the roundtrip version. We use algorithm Hop-visit, and then let each participant return to its original vertex (using the shortest path). Clearly, the Min-Max value is at most 6 times the weight of the heaviest edge of the MST of G′. On the other hand, the optimal cost is at least twice the weight of the heaviest edge of G″ (since selling to an agent requires reaching him and then returning back). Thus, we have:

Corollary 3.29.
The roundtrip version of Min-Max Sales cTSP is 3-approximable.

We next show lower bounds on the approximability of both the path and the roundtrip versions of Min-Max Sales cTSP.

Claim 3.30. Min-Max Sales cTSP cannot be approximated better than 2, unless P = NP.

Proof. The reduction is from the Hamiltonian-Path problem where one endpoint of the path, vertex u, is specified in the input. Given an instance of that Hamiltonian-Path problem, an unweighted undirected graph G = (V, E) and a vertex u, we construct an instance for our problem by simply locating the salesperson at u and putting an agent at each of the other vertices.

If there is a Hamiltonian path in G which starts at u, then there is a solution for the new problem where the Min-Max value is 1, as follows. The salesperson moves to the next vertex in the path, sells to the agent there, and doesn't move any further. Then, each agent moves to the next vertex on the path, sells the goods to the agent there, and also doesn't move any further. Thus, all the agents are visited, and the maximal distance traveled is 1.

On the other hand, if there is a solution with Min-Max value 1, then the salesperson and the agents each make at most one step and stop (i.e., they traverse at most one edge each). Since all the agents are visited in a solution, each of the first |V| − 1 steps must have visited a vertex which hadn't been visited before. Thus, following the steps in this sequence gives a Hamiltonian path which starts at u.

It is therefore NP-hard to distinguish between an instance which has a solution with Min-Max value 1 and an instance which only has solutions with Min-Max values 2 or more. Thus, it is NP-hard to approximate the value of the optimal solution within a factor lower than 2.

The reduction for the roundtrip version of the problem is identical.
However, here a "yes" instance implies a cost of 2, and it is NP-hard to distinguish between an instance which has a solution of cost 2 and an instance which only has solutions of cost 3 or more. We thus have:

Claim 3.31. The roundtrip version of Min-Max Sales cTSP cannot be approximated better than 3/2, unless P = NP.

3.3.2.3 Min-Max Full-Cooperation cTSP

The Min-Max Full-Cooperation cTSP problem allows only constant-factor approximations. We prove a lower bound of 2 on the approximation ratio for both the path and roundtrip versions of the problem. Additionally, we provide a simple algorithm which obtains a 4-approximate solution for the path version and a tight 2-approximate solution for the roundtrip version. We start by considering the special case in which each vertex contains at least one participant.

Claim 3.32. Min-Max Full-Cooperation cTSP is 2-approximable if each vertex contains at least one participant.

Proof. Compute an MST rooted at the salesperson's vertex, and let one agent from each vertex move to her parent's vertex, and return after receiving the goods (the agents initially located at the leaves don't need to return). The maximum distance traveled by any of the agents is at most twice the weight of the heaviest edge in the MST. On the other hand, any solution to the problem forms a spanning-connected-subgraph, and its Min-Max value is at least the weight of the heaviest edge in that subgraph. As we noted before, since the MST is lexicographically minimal, its heaviest edge is not heavier than that of any other spanning-connected-subgraph. Hence, the algorithm is a 2-approximation algorithm.

Claim 3.33. Min-Max Full-Cooperation cTSP is 4-approximable.

Proof. The proof is similar to the proof of the approximation for the Min-Max Sales version.
We define the weighted complete graph G′ = (V′, E′), where V′ is the set of vertices which contain participants, and the edge-weights are the distances between these vertices in the original graph. We now perform the same algorithm as in the last proof: We compute an MST rooted at the salesperson's vertex, one agent from each vertex moves to the vertex of her parent, and she returns after receiving the goods. The maximum distance traveled by any of the agents is again at most twice the weight of the heaviest edge in the MST.

On the other hand, consider an optimal solution, and define G″ = (V′, E″), s.t. (u, v) ∈ E″ iff participants from u and v meet during that solution (the weight is again the distance between them). The optimal Min-Max value is clearly at least half the weight of the heaviest edge in E″, since a meeting of two participants requires that at least one of them traversed half the distance between them. Also, G″ is clearly a spanning-connected-subgraph of G′, and its heaviest edge has at least the cost of the heaviest edge of the MST of G′ found by the above algorithm. Therefore, this algorithm achieves an approximation ratio of 4.

Note that the simple algorithm described in the last proof also solves the roundtrip version of the problem (with the same cost). On the other hand, the bound on the cost of the optimum is doubled for the roundtrip version (if two participants meet and return then one of them must travel at least the distance between them). Thus:

Corollary 3.34. The roundtrip version of Min-Max Full-Cooperation cTSP is 2-approximable.

We now turn to presenting hardness results.

Theorem 3.35. Min-Max Full-Cooperation cTSP cannot be approximated better than 2, unless P = NP.

Proof. We prove this by a reduction from the Set-Cover problem.

The Reduction: Let (C, k) be an instance of Set-Cover, where C is a collection of subsets of a finite set S, and k is an integer.
It is NP-hard to decide whether there is a set cover for S of size at most k, i.e., a subset C′ ⊆ C such that every element in S belongs to at least one member of C′. We use the same construction as in the hardness proof for the Min-Sum objective (Theorem 3.20), except that v contains only k − 1 agents.

Claim 3.36 (Completeness). If (C, k) is a "yes" instance of Set-Cover, then there is a solution for our problem with maximum-distance 1.

Proof. Let C′ ⊆ C be a set cover of S of size |C′| ≤ k. Let V_{C′} = {v_c | c ∈ C′}. Then the salesperson moves to v, and the k participants now populating v go to the vertices of V_{C′} (at least one participant to each vertex); the agents populating the vertices V_S move to V_{C′} as well (each to a closest vertex in V_{C′}). As C′ is a set cover, each of the agents at V_S has a vertex v_c ∈ V_{C′} at distance 1. Hence this scheme ends in one step.

Claim 3.37 (Soundness). If there is a solution with maximum-distance lower than 2, then (C, k) is a "yes" instance of Set-Cover.

Proof. As all edges are of length 1, every agent ends up at some vertex in V_C. Let V_{C′} be the set of those vertices. Clearly |V_{C′}| ≤ k, as there are originally only k − 1 agents at v and a salesperson in u, and every other agent has to meet one of them. Thus, C′ = {c | v_c ∈ V_{C′}} is a set cover for S of size at most k, hence (C, k) is a "yes" instance of Set-Cover.

Corollary 3.38 (Hardness of Approximation). It is NP-hard to distinguish between an instance of Min-Max Full-Cooperation cTSP with value 1 and an instance with value 2. Hence, this problem is NP-hard to approximate to within any factor smaller than 2.

The reduction for the roundtrip version is identical. By using the same considerations, a set-cover of size k or less exists iff there is a solution with Min-Max value 2 to the new problem. Note that there can be no solution with Min-Max value 3, since there are no triangles in the constructed graph. We thus have:

Corollary 3.39.
The roundtrip version of Min-Max Full-Cooperation cTSP cannot be approximated better than 2, unless P = NP.

3.3.3 Min-Makespan cTSP

The Min-Makespan objective is the most diverse out of the three. The Purchase problem has a polynomial-time solution for both the path and the roundtrip versions. The Full-Cooperation version can be approximated within a ratio of 2, and this cannot be improved, unless P = NP. For the Sales version, only an $O(\sqrt{\log n})$ approximation is known [KLS04], while the lower bounds for that version are smaller than 2 (5/3 for the path version, shown by [ABF+02], and 5/4 for the roundtrip version, which we show below).

3.3.3.1 Min-Makespan Purchase cTSP

Claim 3.40. Min-Makespan Purchase cTSP can be solved in $O(mn + n^2 \log n)$ time.

Proof. We observe that there is an optimal solution in which all the agents meet the salesperson at a single vertex. This holds, since the value of a solution does not change if each agent that meets the salesperson joins her in her journey. Thus, they could have all met at the last vertex visited by the salesperson in that solution, without increasing the completion-time. Specifically, this argument is true for the optimal solution. Hence, an optimal solution can be found simply by computing all-pairs-shortest-paths in the graph and finding the vertex whose maximal distance from any of the participants is minimal. This takes the above stated time using Dijkstra's algorithm (e.g. [CLRS01]).

Claim 3.41. The roundtrip version of Min-Makespan Purchase cTSP can be solved in $O(mn + n^2 \log n)$ time.

Proof. The idea of this proof is similar to the idea of the previous one, but it is slightly more involved. We first show that there exists an optimal solution in which all the agents meet the salesperson either in a single vertex or in two adjacent vertices. Consider an optimal solution of cost OPT.
Let u be the last vertex visited by the salesperson until time OPT/2 in that solution, and let v be the first vertex she visited after that time. Clearly, all the agents that the salesperson met before leaving u can join her on her way to u and then return, without increasing the makespan. We observe that all the agents which the salesperson met after leaving u can come to meet her at v and return to their initial vertex before time OPT. Let the meeting of such an agent with the salesperson occur at vertex w. Then clearly that agent’s travel to w plus the salesperson’s travel from v to w take less than OPT/2 time. Thus, all these agents can reach v before the salesperson (in less than OPT/2 time). They can return to their initial vertices before time OPT, since they can join the salesperson’s tour until the vertex where they originally met, and then return to their initial vertex just like in the original optimal solution.

This means that an optimal solution can be found by computing all-pairs-shortest-paths and enumerating over single vertices and pairs of adjacent vertices where the meetings may take place. The makespan for each suggestion for meeting-place(s) is computed in O(n) time (according to the distances from the participants). Again, computing all-pairs-shortest-paths requires an overall time of $O(mn + n^2 \log n)$ using Johnson’s algorithm [CLRS01], which means that the total time required for solving the problem is $O(mn + n^2 \log n)$.

3.3.3.2 Min-Makespan Sales cTSP

As noted before, the path version has an $O(\sqrt{\log n})$ approximation algorithm [KLS04], and there is a lower bound of 5/3 on its approximation ratio (assuming P ≠ NP) [ABF+02]. The same algorithm can clearly be used for the roundtrip version:

Claim 3.42. The roundtrip version of Min-Makespan Sales cTSP is $O(\sqrt{\log n})$-approximable.

Proof. The known algorithm for the path version finds an $O(\sqrt{\log n})$ approximate solution [KLS04].
Requiring that all the participants return to their original vertex at the end may increase the cost of the solution found by the algorithm by a factor of at most 2. Clearly, the optimal cost for the roundtrip version is at least the optimal cost for the path version. Therefore, this problem also has an $O(\sqrt{\log n})$ approximation algorithm.

We now turn to providing a hardness of approximation result.

Claim 3.43. The roundtrip version of Min-Makespan Sales cTSP cannot be approximated better than 5/4, unless P = NP.

Proof. We use a reduction from Set-Cover, similar to the reduction used in Theorem 3.35 for Min-Max Full-Cooperation cTSP. We also use the same notation as in the proof of Theorem 3.35. There are two differences in the construction of the reduction. First, we add another vertex w which contains m agents (the number of sets) and is connected to all the vertices in $V_C$. Second, each of the vertices of $V_C$ contains a number of agents equal to its degree (which is the size of the corresponding set). As in the above-mentioned reduction, all the edges have weight 1, the vertices of $V_S$ contain one agent each, and v contains k − 1 agents.

A set-cover of size k (or less) provides a solution of cost 4 to our problem: The salesperson and the agents in v visit the vertices in $V_C$ corresponding to the cover. The agents in these vertices visit all the vertices in $V_S$, and one of the agents who came from v visits w. Then the agents from w visit all the non-visited vertices in $V_C$. It is easy to verify that all the participants can return from these visits without exceeding a makespan of 4.

On the other hand, assume there is a solution to the new problem with a makespan of 4. It takes at least 2 time units to visit an agent in $V_S$ or w, so clearly these agents could not visit other agents in $V_S$, and the same is true for agents in $V_C$ which were first visited by agents from w or $V_S$.
Hence, agents in $V_S$ could only be visited by the salesperson, by the k − 1 agents from v, or by agents in $V_C$ which these participants visited in the first time-unit. Since the salesperson and the agents from v could visit at most k vertices of $V_C$ in the first time unit, there is a set-cover of size at most k. Therefore, it is NP-hard to distinguish between an instance with minimum makespan of 5 and an instance with minimum makespan of 4, which yields the required result.

3.3.3.3 Min-Makespan Full-Cooperation cTSP

Here we have tight upper and lower bounds of 2 on the approximation ratio. The upper bound for the path version is obtained by simply letting all the agents go to the salesperson. Since delivering the goods to the agent which is farthest from the salesperson takes at least half the travel-duration between them, this yields a 2-approximation. The same can be done for the roundtrip version, followed by a return of all the agents to their initial vertices. Here the optimum clearly requires at least the travel-duration to the farthest agent (at least the time until she receives the goods plus the time for returning to her initial vertex if she moved). We thus have:

Corollary 3.44. Min-Makespan Full-Cooperation cTSP is 2-approximable for both the path and roundtrip versions.

The hardness proofs for both the path and roundtrip versions are identical to those of the corresponding Min-Max problems (see Theorem 3.35 and Corollary 3.39). Therefore, we have:

Corollary 3.45. Both the path and the roundtrip versions of Min-Makespan Full-Cooperation cTSP cannot be approximated better than 2, assuming P ≠ NP.

3.4 Discussion and Open Problems

We obtained quite tight approximation and intractability results for most of the cTSP problems. Some of the cTSP problems turn out to be easier (in the sense of approximation) than the classical TSP, while others are strictly harder.
The status of Min-Makespan Sales cTSP is not settled, as there is an $O(\sqrt{\log n})$ approximation and a constant inapproximability factor. Improving the factors of this problem as well as tightening the factors for some others is yet to be achieved. It is also likely that the running time of some of the PTASs can be improved.

There are some disturbing asymmetries in the Euclidean results (see Table 3.2). For example, while the roundtrip versions of Min-Sum Sales and Full-Cooperation cTSP have a PTAS, the best approximations for the corresponding path-cTSP problems only guarantee some constant factors. We conjecture that these two path-cTSP versions indeed have a PTAS, but we suspect that this may not be very easy to prove. This follows since it can be shown that a PTAS for the first problem implies a (currently unknown) PTAS for the well-studied 3-bounded-degree-planar-MST (e.g., [PV84, KRY96, FKK+97, Cha03, AC04]).

Chapter 4

On unweighted r-Gatherings

4.1 Introduction

Facility-location has been studied in many forms over the past decades (see, e.g., [AGK+04, Byr07, Dre95, Gon85, GK99, GMM00, KM00, MYZ03, Svi02, Vyg05, ZCY04]). In the classic metric facility-location problem, we are given a set of customer locations S and a set of potential locations of facilities F (which may intersect S). Each location $f_i \in F$ is associated with a cost $p(f_i)$ for opening a facility there. For every $s_i \in S$ and $f_j \in F$, there is a cost $d(s_i, f_j)$ for connecting a customer in $s_i$ to a facility in $f_j$. These costs are equivalent to distances, and thus satisfy the symmetry and triangle-inequality requirements. The goal is to open facilities and assign each customer to a facility, such that the total cost is minimized (i.e., the sum of the facility opening-costs and the connection-costs should be minimal). The metric facility-location problem models many realistic scenarios, in which service-posts of a certain type should be opened to serve a set of customers.
Applications range from classic power-plants or warehouse location problems to locating servers in computer-networks (see, e.g., [Dre95, Vyg05] for surveys). The current best approximation algorithm for metric facility-location achieves an approximation-ratio of 1.5 [Byr07]. On the other hand, this problem cannot be approximated within a factor of less than 1.463, assuming P ≠ NP [GK99].

One of the interesting recent variants of metric facility-location is the r-gathering problem, introduced in parallel by Karger and Minkoff [KM00] and by Guha et al. [GMM00] (who called it load-balanced facility-location). The basic additional requirement in the r-gathering problem is that each facility will be assigned at least r customers (customers are not necessarily assigned to the nearest open facility in this problem). This variant captures the idea that opening a facility is economically justified only when it serves at least a certain amount of demand (and this constraint may even be more natural than facility costs in some settings). Furthermore, in various settings there is an inherent lower bound on the number of customers in each facility. For example, in secret-sharing schemes (see [Sch96]), at least r shares are needed to uncover a secret. We may need to locate servers in the network, to which clients will connect in order to uncover the secret, and we may want this process to be as fast or as cheap as possible.

Both papers [GMM00, KM00] considered the generalization of r-gathering in which customers have different demands, the connection-costs are the product of the demand and distance, and each facility must serve customers having a total of at least r demand.
They both presented a $\left(\frac{1+\alpha}{1-\alpha}\beta, \alpha\right)$ bicriteria approximation, for any α < 1, where β is the approximation-ratio of the metric facility-location problem (currently 1.5 for the classic problem [Byr07] and 1.582 for the generalization in which customers may have different demands [Svi02]). Namely, their algorithm guarantees that each open facility in the solution will serve at least αr demand, and the cost will be at most $\frac{1+\alpha}{1-\alpha}\beta$ times the optimal cost of the r-gathering problem. Choosing $\alpha = \frac{r-1}{r} + \epsilon$ for the case of unit-demands provides a $1.5(2r - 1 + \epsilon)$-approximate feasible solution. Note that we cannot hope for a significant improvement in the approximation-ratio due to improvement of β, since β is lower-bounded by 1.463 [GK99]. Very recently, Svitkina [Svi08] obtained a constant-factor (single-criterion) approximation algorithm for this problem with unit-demands, providing a 558-approximate solution.

Although the first papers considered minimizing the sum of costs [GMM00, KM00], a natural variant is to minimize the maximal cost (in the spirit of the k-center problem [Gon85]). This may model, for example, the time until all the facilities and connections will be available (if each cost represents the time until the corresponding facility/connection will be ready). A special case of the min-max version of this problem with unit-demands, called “r-gather clustering”, has been recently considered by Aggarwal et al. [AFK+06]. In their special case, motivated by a clustering application, all the facility costs are zero and all the locations of customers are included in the set of optional facility locations (S ⊆ F). Their paper presented a 2-approximation algorithm for this case, and proved that it cannot be approximated better, for any r ≥ 7 (assuming P ≠ NP).
They also considered a generalization called (r, ε)-gather clustering, in which the solution can ignore εn of the customers (“outlier points”), and stated that this problem can be approximated within a factor of 3 if facilities (cluster-centers) can only be located at customer (input points) locations [AFK+06]. We note that unlike the algorithm of [GMM00, KM00], the algorithm of [AFK+06] does not guarantee that each customer will be assigned to a nearest open facility.

For the basic special case of r = 2, a recent paper of Anshelevich and Karagiozova [AK07] proves that both min-sum 2-gathering without facility-costs and min-max 2-gathering can be solved in polynomial time.

Demaine et al. [DHM+07] have recently introduced another problem related to min-max 2-gathering, which they called min-max minimum-movement facility location. In our terminology, there are two types of customers in that problem: Customers of type A (“clients”) must be assigned to a facility having at least one customer of type B (“server”) assigned to it, while customers of type B do not have to be assigned. Also, S ⊆ F and there are no facility costs. Demaine et al. [DHM+07] asked whether this problem can be approximated within a factor of less than 2. We prove that the answer is negative, assuming P ≠ NP.

In this chapter we focus on min-max r-gathering in the basic case of unit-demands; our results refer to this problem unless stated otherwise. In addition to the basic r-gathering problem, we consider the version in which there is an additional proximity requirement: Each customer in the solution must be assigned to the nearest open facility. This is clearly a desirable property of a solution in many facility-location settings, and also in clustering scenarios (e.g., in geographic data-mining, see [GvKN06]). We manage to obtain a constant-factor approximation for this problem as well.
4.1.1 Our results

We start by presenting a simple 3-approximation algorithm for min-max r-gathering. On the other hand, we prove that this problem cannot be approximated within a factor of less than 3 (assuming P ≠ NP), for any r ≥ 3. By using a similar reduction, we also show that r-gather clustering cannot be approximated within a factor of less than 2 for any r ≥ 3, thus improving the hardness result of [AFK+06].

The same approximation algorithm extends to provide a 3-approximate solution for a generalization considered by [GMM00, KM00], in which each $f_i \in F$ has a different lower-bound $r_i$ on the number of customers required. Furthermore, it extends to provide the same approximation-ratio for the generalization in which there are several types of customers, and each open facility $f_i$ must have at least $r_{ij}$ customers of type j (this may be useful, for example, for achieving “p-Sensitive k-Anonymity” [TB06] in publishing information from databases, similarly to the use of r-gather clustering for achieving “k-Anonymity” [AFK+06]).

By using another extension of this algorithm, we provide a 3-approximation for the generalization of min-max r-gathering in which an ε-fraction of the customers can be ignored. We thus match the approximation-ratio stated in [AFK+06] for the special case of (r, ε)-gather clustering.

Interestingly, practically the same algorithm also provides a 2r approximation for the min-sum version of the problem, if there are no facility costs. For this case, this improves upon the $1.5(2r - 1) + \epsilon$ approximation implied by the bicriteria algorithm of [GMM00, KM00]. In parallel to our work, Lim et al. [LWX06] also obtained a 2r-approximation factor for this problem, using a different algorithm.

Next we consider the proximity requirement and present a 9-approximation algorithm for min-max r-gathering which satisfies it (i.e., each customer is assigned to a nearest open facility).
For the special case of r-gather clustering, our technique provides a 6-approximation algorithm. In addition, we provide a 2-approximation algorithm for 2-gather clustering which satisfies the proximity requirement. We show that this approximation factor cannot be improved: An algorithm for r-gather clustering which guarantees the proximity requirement cannot guarantee an approximation-ratio smaller than 2.

Finally, we show that although min-max 2-gathering is polynomial [AK07], the related min-max minimum-movement facility-location [DHM+07] is NP-hard and cannot be approximated within a factor of less than 2 (assuming P ≠ NP). This resolves the open question recently posed by Demaine et al. [DHM+07].

All our algorithms are based on discrete combinatorial techniques. Our hardness results use reductions from Exact-k-cover and SAT.

The rest of this chapter is organized as follows. In Section 4.2 we present formal problem definitions and notations. Section 4.3 presents our simple approximation algorithm for min-max r-gathering, and analyzes its use for other versions. Section 4.4 considers the requirement of assigning each customer to a nearest open facility. The hardness results are provided in Section 4.5. We end with some concluding remarks and open problems.

4.2 Problem Definitions and Notations

We now formally state the basic problems we consider and introduce some of the notations we use. (We use slightly different notations from those of [GMM00, KM00].)

The input for an r-gathering problem consists of a set of customer-locations $S = \{s_1, ..., s_n\}$, a set of potential facility-locations $F = \{f_1, ..., f_m\}$ with opening costs $p : F \to \mathbb{R}^+ \cup \{0\}$, and distances (connection-costs) $d : (S \cup F) \times (S \cup F) \to \mathbb{R}^+ \cup \{0\}$. The input also includes a positive integer r > 1.
A solution is an assignment of the n customers to (not necessarily distinct) facilities, $t_1, ..., t_n$, which are considered open, such that customer i is assigned to facility $t_i \in F$, and the number of customers assigned to each open facility is at least r. In the min-max version of the problem, the goal is to minimize $\max_{1 \le i \le n} \{\max(d(s_i, t_i), p(t_i))\}$ (we refer to this as the cost of the solution). In the min-sum version, the goal is to minimize $\sum_{i=1}^{n} d(s_i, t_i) + \sum_{f \in \{t_1, ..., t_n\}} p(f)$ (each cost of an open facility is considered once in this sum).

A special case of min-max r-gathering is r-gather clustering [AFK+06], where S ⊆ F, and there are no facility costs ($p(f_i) = 0$, for 1 ≤ i ≤ m).

4.3 Approximating Min-Max r-Gathering

Definition 4.1. The “min-cost” of customer i, denoted c(i), is the minimum cost of assigning r customers, including customer i, to a single facility (considering both the facility-cost and the customers’ connection-costs). The location of this min-cost assignment, $g_i \in F$, is called “the best facility” of customer i. The “partners” of customer i are the r − 1 customers, other than customer i, who participate in this min-cost assignment. (If there are several options we arbitrarily prefer locations and customers with smaller indices).

Algorithm Best-or-Rest
1. For each customer, find his min-cost, best facility and partners.
2. Sort the customers in non-decreasing min-cost order.
3. For each customer i in this sorted order: If customer i and all his partners have not been assigned yet, assign them to the best facility of customer i (open this facility if it is not open yet). Otherwise, do nothing and continue to the next customer.
4. Assign any unassigned customer to the nearest open facility. (In case of a tie, arbitrarily choose the location with smallest index).
Figure 4.1: A 3-approximation algorithm for min-max r-gathering.
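The four stages of Best-or-Rest (Figure 4.1), together with Definition 4.1, can be sketched in Python as follows. This is only an illustrative sketch, not the thesis's implementation: the function name `best_or_rest`, the matrix representation `d[i][t]` of connection-costs, and the use of sorting in stage 1 are assumptions made here for brevity, so the sketch does not aim at the best possible running time.

```python
from typing import List, Sequence, Tuple


def best_or_rest(d: Sequence[Sequence[float]], p: Sequence[float],
                 r: int) -> Tuple[List[int], float]:
    """Illustrative sketch of Best-or-Rest for min-max r-gathering.

    d[i][t] is the connection-cost of customer i to facility t (assumed
    metric), p[t] is the opening cost of facility t. Returns the pair
    (assignment, cost), where assignment[i] is the facility of customer i.
    """
    n, m = len(d), len(p)
    if m == 0 or r < 2 or r > n:
        raise ValueError("need at least one facility and 2 <= r <= n")

    # Customers ordered by distance to each facility (ties: smaller index).
    by_dist = [sorted(range(n), key=lambda j, t=t: (d[j][t], j))
               for t in range(m)]

    # Stage 1: min-cost c(i), best facility g(i) and partners (Definition 4.1).
    c, g = [0.0] * n, [0] * n
    partners: List[List[int]] = [[] for _ in range(n)]
    for i in range(n):
        best = None
        for t in range(m):
            # The r - 1 other customers nearest to t are the cheapest company.
            mates = [j for j in by_dist[t][:r] if j != i][:r - 1]
            cost = max([p[t], d[i][t]] + [d[j][t] for j in mates])
            if best is None or cost < best[0]:  # strict: smaller index wins ties
                best = (cost, t, mates)
        c[i], g[i], partners[i] = best

    # Stage 2: non-decreasing min-cost order.
    order = sorted(range(n), key=lambda i: (c[i], i))

    # Stage 3: assign a customer with all his partners if none is assigned yet.
    assignment = [-1] * n
    open_facilities = set()
    for i in order:
        group = [i] + partners[i]
        if all(assignment[j] == -1 for j in group):
            for j in group:
                assignment[j] = g[i]
            open_facilities.add(g[i])

    # Stage 4: the rest go to a nearest open facility (ties: smaller index).
    for i in range(n):
        if assignment[i] == -1:
            assignment[i] = min(open_facilities, key=lambda t: (d[i][t], t))

    cost = max(max(d[i][assignment[i]], p[assignment[i]]) for i in range(n))
    return assignment, cost
```

For instance, with customers at positions 0, 1, 2, 10, 11, 12, 6 on a line, zero-cost facilities at positions 1 and 11, and r = 3, the two natural triples are formed at stage 3 and the customer at position 6 is attached at stage 4.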
We now provide a simple approximation algorithm for the problem, Best-or-Rest (see Figure 4.1).

Lemma 4.2. The cost of the solution found by algorithm Best-or-Rest for min-max r-gathering is at most thrice the maximal min-cost.

Proof. First, observe that the cost of the customers’ assignment at stage (3) is the min-cost of one of the customers assigned at this stage (customer i), which is at most the maximal min-cost of any of the n customers.

Now consider a customer i assigned at stage (4). This customer was not assigned at stage (3), which means that when customer i was considered at stage (3), at least one of his partners, say customer j, had already been assigned to another facility, $t_j = g_k$ (the best facility of some customer k ≠ i). Customer i can also be assigned to $t_j$, with a cost of $d(s_i, t_j)$. Clearly, $d(s_i, t_j) \le d(s_i, s_j) + d(s_j, t_j)$. Observe that $d(s_i, s_j) \le 2c(i)$, since $d(s_i, g_i) \le c(i)$ and $d(g_i, s_j) \le c(i)$ (as j is one of the partners of customer i and $g_i$ is the best-facility of customer i). Also, $d(s_j, t_j) = d(s_j, g_k) \le c(k)$, since customer j is one of the partners of customer k. Since we performed stage (3) in a non-decreasing order of min-cost, $c(k) \le c(i)$. So taken together, for each customer assigned at stage (4), $d(s_i, t_i) \le 3c(i)$ (the customer is assigned to a nearest open facility, and we saw that there exists an open facility which satisfies this). This yields the required result.

Theorem 4.3. Algorithm Best-or-Rest finds a 3-approximate solution for min-max r-gathering, and can be implemented to run in O(n(m + r + log n)) time.

Proof. The cost of an optimal solution for the problem is clearly at least the maximal min-cost (since there is a customer whose assignment requires at least that cost in any solution). Therefore, the previous lemma proves that the algorithm finds a 3-approximate solution.
For implementing the first stage efficiently, we can first find for each t ∈ F the set of r customers closest to t. This can easily be done in O(n) time for each facility (using selection). Let $D_t$ denote the distance from t to its r-th closest customer. Thus, for each customer i, the minimal cost of assigning him along with r − 1 other customers to location t is $\max(D_t, d(s_i, t), p(t))$. So computing these costs for each customer and for each t ∈ F takes an overall time of O(mn). We now find the best facility of each customer according to these costs (in an overall time of O(mn)). The partners of customer i are clearly the r − 1 customers (other than customer i) that are closest to his best facility. Given the sets of r closest customers that we computed for each facility, noting the partners of each customer requires a total of O(rn) time (rn may be higher than mn). Thus, stage (1) can be implemented to run in O(n(m + r)) time. Stage (2) clearly requires O(n log n) time. The next stages are less time-consuming than the first one, and thus the total running time is as stated.

Algorithm Best-or-Rest can also be used for the generalization in which an ε-fraction of the customers may be ignored (ε is specified in the input). We can simply ignore the εn customers with highest min-costs (in case of ties we ignore only those whose min-cost is strictly higher than the min-cost of (1 − ε)n other customers), and then run this algorithm. This guarantees an approximation-ratio of 3 for this generalization of the problem, since the optimal cost must be at least the highest min-cost of the customers we considered (note that the customers we ignored are not partners of customers we haven’t ignored, since their min-cost is higher). As mentioned in the Introduction, this matches the approximation-ratio stated in [AFK+06] for a special case.
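The tie rule for ignoring customers can be made concrete with a small Python sketch. The function name `customers_to_keep` and the parameter `max_ignored` are hypothetical; `max_ignored` stands for εn, which is assumed here to be an integer, and the quadratic counting loop is chosen for clarity rather than speed.

```python
from typing import List, Sequence


def customers_to_keep(min_costs: Sequence[float], max_ignored: int) -> List[int]:
    """Return the indices of the customers that are not ignored.

    A customer is ignored only if at least n - max_ignored other customers
    have a strictly lower min-cost, so ties at the threshold are all kept.
    """
    n = len(min_costs)
    if max_ignored < 0 or max_ignored > n:
        raise ValueError("max_ignored must be between 0 and n")
    must_keep = n - max_ignored  # plays the role of (1 - eps) * n
    kept = []
    for i, ci in enumerate(min_costs):
        strictly_cheaper = sum(1 for cj in min_costs if cj < ci)
        if strictly_cheaper < must_keep:
            kept.append(i)
    return kept
```

Under this rule at most `max_ignored` customers are dropped, and possibly fewer when several customers share the threshold min-cost.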
It is also easy to see that algorithm Best-or-Rest can be used to achieve the same approximation ratio even if there is a different lower-bound $r_i$ on the number of customers for each facility $f_i \in F$ (a generalization considered by [GMM00, KM00] with the min-sum objective). This should simply be taken into account in the definitions of min-cost, best-facility and partners, and the first stage of the algorithm will change accordingly (and will be similarly implemented). Furthermore, it can be used to achieve the same approximation-ratio for the generalization in which there are several types of customers, and each open facility $f_i$ must have at least $r_{ij}$ customers of type j (again, this should simply be taken into account in Definition 4.1, changing the first stage of the algorithm accordingly).

We next prove that algorithm Best-or-Rest can be used to provide a 2r approximation for min-sum r-gathering (with unit demands), in the basic case introduced by [KM00] where there are no facility costs. We call this case basic min-sum r-gathering. This improves upon the ratio of $1.5(2r - 1) + \epsilon$ implied by the algorithm of [GMM00, KM00] for this case of the problem. For small values of r, it is also better than the recent 558 approximation of [Svi08]. We define the min-cost, best-facility and partners in the corresponding way for the min-sum problem (the cost of an assignment to a facility is the sum of the connection-costs of the customers rather than their maximum).

Lemma 4.4. The cost of the solution found by algorithm Best-or-Rest for basic min-sum r-gathering is at most twice the sum of the min-costs of all the customers.

Proof. The proof is very similar to the proof of Lemma 4.2. First, observe that assigning customers at a certain iteration of stage (3) costs exactly the min-cost of one of the customers being assigned (customer i), which is clearly at most the sum of the min-costs of all the customers assigned. Now consider a customer i assigned at stage (4).
This customer was not assigned at stage (3), which means that when customer i was considered at stage (3), at least one of his partners, say customer j, had already been assigned to another facility, $t_j = g_k$ (the best facility of some customer k ≠ i). Customer i can also be assigned to $t_j$, with a cost of $d(s_i, t_j)$. Clearly, $d(s_i, t_j) \le d(s_i, s_j) + d(s_j, t_j)$. Observe that $d(s_i, s_j) \le c(i)$, since $d(s_i, g_i) + d(g_i, s_j) \le c(i)$ (as j is one of the partners of customer i). Also, $d(s_j, t_j) = d(s_j, g_k) \le c(k)$, since customer j was one of the partners of customer k. Since we performed stage (3) in a non-decreasing order of min-cost, $c(k) \le c(i)$. So taken together, for each customer i assigned at stage (4), $d(s_i, t_i) \le 2c(i)$ (since the customer is assigned to the nearest open facility, and we saw that there exists a facility which satisfies this). This yields the required result.

Lemma 4.5. The cost of an optimal solution for basic min-sum r-gathering is at least a (1/r)-fraction of the sum of the min-costs of all the customers.

Proof. Consider a facility t ∈ F opened by an optimal solution OPT. Let x = yr + z be the number of customers assigned to t in this solution (where y, z are integers such that y ≥ 1, r > z ≥ 0). Now divide these customers into (y + 1) sets in the following way. For each customer a assigned to t, calculate $d(s_a, t)/c(a)$. The first set, $B_0$, will contain the z customers for which the above calculated value was maximal. The other customers are arbitrarily divided into y sets of r customers, $B_1, ..., B_y$.

Consider a set $B_i$, 1 ≤ i ≤ y. For each customer $a \in B_i$, $c(a) \le \sum_{b \in B_i} d(s_b, t)$ (since this is the cost of assigning r customers, including customer a, to facility t). Summing this over all the customers in $B_i$, we get $\sum_{a \in B_i} c(a) \le r \cdot \sum_{a \in B_i} d(s_a, t)$.
This is true for every 1 ≤ i ≤ y, which means that the cost of assigning the customers of $\bigcup_{i=1}^{y} B_i$ in OPT is at least a (1/r)-fraction of the sum of their min-costs.

Now consider a customer $a \in B_0$. If we replace one of the customers of $B_1$ by customer a, then the previous argument still holds for this modified set of r customers. So the total cost of assigning the customers in this modified set to t is at least a (1/r)-fraction of the sum of their min-costs. From the way $B_0$ has been selected, it follows that $d(s_a, t)/c(a) \ge 1/r$ (otherwise this ratio must have been smaller than 1/r for all the customers in this set, and thus also for the sums). Since this is true for any customer in $B_0$, it is true for the whole $B_0$, i.e., the cost of assigning these customers to t is at least a (1/r)-fraction of the sum of their min-costs.

All the above is true for any facility t opened by an optimal solution, which means that the cost of an optimal solution is at least a (1/r)-fraction of the sum of min-costs, as required.

Theorem 4.6. Algorithm Best-or-Rest finds a 2r-approximate solution for basic min-sum r-gathering, and can be implemented to run in O(n(m + r + log n)) time for this problem.

Proof. The approximation-ratio follows from combining the last two lemmas. It is easy to see that the running-time is the same as in Theorem 4.3, since we can similarly implement the first stage of the algorithm (summing the costs in the min-cost computations instead of taking their maximum), and the next stages are the same.

4.4 Assigning to a Nearest Open Facility

In this section we consider the min-max r-gathering problem with the additional constraint that each customer should be assigned to the nearest open facility (or to one of the nearest open facilities in case of a tie). We start by presenting a 9-approximation algorithm which satisfies this constraint.

In the following we say that a customer prefers a facility if there is no other open facility nearer to his input location. We use the term unsatisfied for a customer who is not assigned to a nearest open facility. We use algorithm Move-to-Solid, described in Figure 4.2, for finding an approximate solution.

Algorithm Move-to-Solid
1. Run algorithm Best-or-Rest. If there are no unsatisfied customers, we are done. Otherwise, reassign customers according to the following stages (initially no customer is considered reassigned).
2. For each customer who has not been reassigned yet, check which of the currently open facilities he prefers (in case of a tie choose the facility with smallest index). If a facility is preferred by at least r such customers, we say that it became solid.
3. Move to each solid facility all the customers who prefer it that have not been reassigned yet. All the customers in solid facilities are now considered reassigned (and will not be considered at the next executions of stage (2)).
4. If there are non-solid facilities which contain fewer than r customers now, reassign their remaining customers to the facilities they most prefer out of the solid ones (and close these empty facilities).
5. If there are any non-solid facilities left, return to (2).
6. If there are unsatisfied customers, move them to the facilities they prefer out of the remaining (solid) facilities.
Figure 4.2: A 9-approximation algorithm for min-max r-gathering, in which each customer is assigned to a nearest open facility.

Theorem 4.7. Algorithm Move-to-Solid finds a 9-approximate solution for min-max r-gathering, in which each customer is assigned to a nearest open facility. It requires $O(n^3/r + mn)$ time.

Proof. We first observe that the algorithm runs in the stated polynomial time. We call an execution of stages (2)-(5) an iteration.
Clearly, there are at most n/r open facilities after stage (1), so there can be at most n/r iterations in which facilities become solid. Note that since there are at least r customers in each facility after stage (1), there must be at least one solid facility. If at a certain iteration no facility becomes solid, it means that at least one customer assigned to a non-solid facility preferred one of the solid facilities at that iteration, and was therefore reassigned to it (the customers in non-solid facilities have not been reassigned yet, and if they all prefer non-solid facilities in (2) then at least one of these facilities must be preferred by at least r such customers). Since customers reassigned to solid facilities are not reassigned again until stage (6), there can be at most n such iterations. Thus the number of iterations is at most n + n/r. Clearly, each iteration requires $O(n^2/r)$ time (this is what stage (2) may require in the worst case). Stage (1) requires O(n(m + r + log n)) time according to Theorem 4.3, and stage (6) can clearly be implemented in $O(n^2/r)$ time. Summing these bounds yields the time bound stated in the theorem (as r ≤ n).

We next explain why the algorithm indeed finds a solution for the problem. Since each solid facility has at least r customers who preferred it over all the other remaining facilities, at least r customers are left at each of the open facilities at the end (note that facilities are only closed and not opened, so a cheaper assignment option cannot appear later). Since each customer is assigned at stage (6) to a facility that he most prefers out of the remaining open facilities, each is assigned to a nearest open facility (by definition).

We thus turn to considering the cost. We proved that algorithm Best-or-Rest finds a 3-approximate solution. We denote its cost by C. We now prove that the reassignments of Move-to-Solid increase the cost of the solution by a factor of at most 3.
Note that the cost of open facilities does not increase (since we only close facilities), so we only need to consider the increase in the customers’ connection-costs (distances). Clearly, moving unsatisfied customers to a facility they prefer can only decrease their connection-cost. A customer’s connection-cost can increase only when he is moved from a canceled facility (a facility found at stage (1) which was left with fewer than r customers) to the solid facility that he most prefers (at stage (4)). Let u be such a canceled facility. If u was canceled, then one of the customers assigned to it at stage (1) must have preferred one of the solid facilities at that iteration, v, and was moved to it. Let customer i be the first such customer. By the triangle inequality, $d(u, v) \le d(u, s_i) + d(s_i, v) \le 2C$, since $d(u, s_i) \le C$ and $d(s_i, v) \le d(s_i, u)$ (as customer i preferred v). Thus, moving any customer assigned to u at stage (1) to the solid facility that he most prefers adds at most 2C to his connection-cost, which is therefore at most 3C. After reaching a solid facility, the cost of a customer does not increase again (he is reassigned again only if he is unsatisfied at the end, which may only decrease his cost). Therefore, the maximum connection-cost of any customer in this solution is at most 3C, i.e., at most 9 times the optimum.
Figure 4.3: The instance constructed in the proof of Claim 4.8.
We note that the procedure described in the last proof can be used to transform any solution into a solution in which each customer is assigned to a nearest open facility, while increasing the total cost by a factor of at most 3. Thus, by applying it to a 2-approximate solution found by the algorithm of [AFK+06] for r-gather clustering, we can obtain a 6-approximate solution for r-gather clustering which satisfies the proximity requirement.
In the context of [AFK+06], it is a clustering solution in which each object is assigned to a nearest cluster center (which is clearly a desirable property of a clustering solution).
4.4.1 Improved Results for r = 2
Recall that min-max r-gathering is polynomial for r = 2 [AK07]. However, the solution found by [AK07] does not necessarily satisfy the proximity requirement. We start by showing that for any r ≥ 2, there are problem instances of r-gather clustering for which the minimum-cost solution that satisfies the proximity requirement costs almost twice the optimum. We then provide algorithm Nearest-Neighbor, which indeed finds a 2-approximate solution which satisfies the proximity requirement for 2-gather clustering.
Claim 4.8. For every r ≥ 2 and ε > 0, there are instances of r-gather clustering such that the minimum cost of a solution that satisfies the proximity requirement is at least (2 − ε) times the cost of an optimal solution that does not satisfy the requirement.
Proof. Consider a graph which is a simple path of four vertices: $v_1, v_2, v_3, v_4$. Assume that $v_1$ contains r customers, $v_2$ and $v_4$ contain one customer each, and $v_3$ contains r − 2 customers. Let edge $(v_1, v_2)$ cost M, let the two other edges, $(v_2, v_3)$ and $(v_3, v_4)$, cost M + 1 each, and let the connection-costs be those implied by the distances in this graph (all the vertices are potential facility locations). This example is illustrated in Figure 4.3. The optimal solution costs M + 1: the single customers are assigned to a facility in $v_3$, along with the r − 2 customers in that vertex (and the customers in $v_1$ are assigned to a facility in $v_1$). However, the cheapest solution which satisfies the proximity requirement is achieved when all the customers are assigned to a facility in $v_3$, with a cost of 2M + 1. The required ratio follows, since M can be arbitrarily large and (2M + 1)/(M + 1) approaches 2 as M grows.
Algorithm Nearest-Neighbor
1.
For each customer i, find the customer j closest to him (his nearest-neighbor), and let $c(i) = d(s_i, s_j)$. (In case of a tie, pick the customer with the smallest index.)
2. Consider the customers’ c(i) values in non-increasing order, and do the following for each such value x:
(a) Build a graph G = (V, E), where V contains a vertex for each customer i with c(i) = x that was not assigned yet. For every u, v ∈ V, (u, v) ∈ E iff d(u, v) = x.
(b) Remove isolated vertices from G. Repeatedly remove edges both of whose endpoints have degree > 1, as long as there are such edges, i.e., until the graph becomes a set of vertex-disjoint stars.
(c) Open facilities in the star centers, and assign the customers in the remaining vertices of G to their star’s center (in case of a single edge, arbitrarily pick one of its endpoints to be the center).
(d) For each customer in V which was not assigned so far, open a facility at the input location of his nearest-neighbor, and assign that customer and his nearest neighbor to that facility.
Figure 4.4: Finding a 2-approximate solution for 2-gather clustering, in which each customer is assigned to a nearest open facility.
Note that it is also easy to have an example in which $s_i \ne s_j$ if $i \ne j$. We can easily adjust the above graph to comply with this requirement. Vertices $v_1$, $v_3$ will be the centers of stars of r or r − 2 other vertices, respectively, and will not contain a customer. Each of these new vertices will contain one customer, and the edges of the stars will have a very small weight δ, with $1 \gg \delta > 0$. It is easy to see again that the previous argument still holds for this modified graph, and the claim follows.
For the approximation we use algorithm Nearest-Neighbor, described in Figure 4.4.
Theorem 4.9. Algorithm Nearest-Neighbor finds a 2-approximate solution for 2-gather clustering, in which each customer is assigned to a nearest open facility. It requires $O(n^2)$ time.
Proof.
We start by showing that the algorithm finds a solution for the problem, which costs at most $\max_i c(i)$. The cost of assigning a customer i at stage 2(c) is clearly at most c(i), since the assignment described uses at most one edge of E for each customer. Each open facility is assigned at least two customers at this stage (those who are in the same star). At stage 2(d), an unassigned customer i is assigned to the input location $s_j$ of his nearest neighbor j. We observe that if customer j is the nearest neighbor of customer i then c(j) ≤ c(i) (since customer i is at a distance of c(i) from customer j). If c(j) < c(i), then clearly customer j was not assigned yet, and he is assigned to the same location $s_j$ by the algorithm (with zero cost). So this is a valid assignment, and the cost of assigning customer i is exactly c(i). If c(j) = c(i), then customer j must have been previously assigned to his own location $s_j$, when another customer, k (satisfying c(k) > c(j)), was assigned to it (otherwise $s_i$ would not have been isolated in G, and customer i would have already been assigned at stage 2(c)). So this is again a valid assignment, which costs c(i). Thus all the customers are assigned, and each facility contains at least 2 customers. All this is true for each of the c(i) values and for each of the customers. Therefore, the total cost of the assignment is at most the maximum of the customers’ c(i) values. Clearly, the optimal solution costs at least half of this (the customers might be able to meet at the middle of a shortest path between them).
Finally, we explain why each customer is indeed assigned by the algorithm to a nearest open facility. Facilities are only opened by the algorithm in locations of customers, and each customer is either assigned to his own location or to the location of one of his nearest neighbors (in which case there is no facility at his own location).
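To make the cost argument above concrete, the following Python sketch computes the c(i) values of stage 1 and the two bounds used in the proof. The upper bound is the maximum of the c(i) values. The lower bound is half of it, because in any feasible 2-gathering customer i shares a facility f with some other customer j, so c(i) ≤ d(s_i, s_j) ≤ d(s_i, f) + d(f, s_j). The function names and the customer-to-customer distance matrix d are assumptions made for this illustration, and the sketch does not implement stages 2(a)-(d).

```python
def nearest_neighbor_costs(d):
    """c(i): distance from customer i to his nearest other customer.
    Assumes d is a symmetric matrix of customer distances with n >= 2."""
    n = len(d)
    return [min(d[i][j] for j in range(n) if j != i) for i in range(n)]


def cost_bounds(d):
    """Return (lower, upper) bounds on the optimal 2-gather clustering cost."""
    upper = max(nearest_neighbor_costs(d))  # cost shown for Nearest-Neighbor
    lower = upper / 2                       # no feasible solution is cheaper
    return lower, upper
```

For customers at positions 0, 1, 5 and 8 on a line, the c(i) values are 1, 1, 3 and 3, so the optimum lies between 1.5 and 3.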
As the algorithm progresses, there can only be fewer assignment options (since some of the customers are already assigned to locations of other customers). Therefore, at the end there can be no nearer open facility for any of the customers. It is easy to see that each stage of the algorithm requires a total of at most $O(n^2)$ time.
4.5 Hardness Results
We match the approximation-ratio for min-max r-gathering with the following hardness result.
Theorem 4.10. For any r ≥ 3, it is NP-hard to approximate min-max r-gathering within a factor of less than 3, even if there are no facility costs.
Proof. We prove the theorem by a reduction from the Exact-k-Cover problem (also called Exact-Cover by k-Sets), which is known to be strongly NP-hard for any k ≥ 3 [EKR99, GJ79]. The input consists of a set of elements $S = \{x_1, \ldots, x_{kn}\}$, and m subsets of this set of elements, $S_1, \ldots, S_m$, where $|S_i| = k$ for every 1 ≤ i ≤ m. The question is whether there exists a collection of n subsets $S_{i_1}, \ldots, S_{i_n}$, such that each element is included in exactly one of them. Our reduction first proves that min-max r-gathering is NP-hard, and we later see that this implies that it is NP-hard to approximate within a factor of less than 3.
We construct the following input for min-max r-gathering. The set of customer locations is $S = \{x_1, \ldots, x_{kn}\}$, i.e., there is one customer for each element $x_i$, 1 ≤ i ≤ kn. There is one potential facility location $f_i \in F$ for each subset $S_i$ (1 ≤ i ≤ m), with $p(f_i) = 0$. For every $x_i \in S_j$, $d(x_i, f_j) = 1$. The other distances are those implied by this definition (i.e., the distances in the graph $G = (S \cup F, E)$, where (u, v) ∈ E iff d(u, v) = 1 and the weight of each edge is 1). We set r = k.
We now prove that the cost of an optimal solution for this problem is 1 iff the answer to the Exact-k-Cover problem is “yes”. Assume the answer to the Exact-k-Cover problem is “yes”.
Opening facilities in the locations corresponding to the cover subsets $S_{i_1}, \ldots, S_{i_n}$, and assigning each customer to the facility corresponding to the subset which covers his corresponding element, provides a solution in which each facility is assigned r customers and the cost is 1 for each customer. Thus, the optimal cost is indeed 1. On the other hand, if the optimal cost is 1, we show that the answer to the Exact-k-Cover problem is “yes”. A solution with a cost of 1 can only exist if each customer is assigned to a facility which corresponds to a subset containing his corresponding element. Thus, there are exactly r such customers assigned to each open facility in that solution, since each facility has only r customers at a distance of 1. Therefore there must be n such facilities, since all the kn customers are assigned. These facilities correspond to n subsets, each of them containing r different elements. Thus these subsets form an exact cover. So both directions of the reduction are proven. Since Exact-k-Cover is NP-hard for any k ≥ 3, we get that our problem is NP-hard for any r ≥ 3. Clearly, the cost is at least 3 iff the answer is “no”, since there is no potential facility location at distance 2 from a customer. Thus, the theorem is proven.
The problem remains hard to approximate even for the following special case.
Theorem 4.11. For any r ≥ 3, the special case of min-max r-gathering in which S = F and there are no facility costs is NP-hard to approximate within a factor of less than 2.
Proof. Proving NP-hardness for the special case where S = F requires a change in the reduction described in the previous proof. Instead of having only one location $f_i$ corresponding to each subset $S_i$, we now have r locations, $f_{i1}, \ldots, f_{ir}$, corresponding to each subset $S_i$. For each $x_j \in S_i$ we define $d(x_j, f_{i1}) = 1$. Also, for every 1 ≤ j < r, we define $d(f_{ij}, f_{ir}) = 1$. An example can be seen in Figure 4.5. Again, the other distances are those implied by those we defined.
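The construction just described can be sketched in Python as follows. This is only an illustration under assumptions: the node labels ("x", j) and ("f", i, t), and the helper names build_reduction_graph and bfs_distances, are hypothetical and not taken from the thesis. Distances are those of the unit-weight graph, computed here by breadth-first search.

```python
from collections import deque


def build_reduction_graph(subsets, k):
    """Adjacency sets of the unit-weight graph of the S = F reduction.
    subsets[i] holds the element indices of the (i+1)-th subset; r = k."""
    r = k
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for i, subset in enumerate(subsets):
        for x in subset:
            add_edge(("x", x), ("f", i, 1))      # d(x_j, f_i1) = 1
        for j in range(1, r):
            add_edge(("f", i, j), ("f", i, r))   # d(f_ij, f_ir) = 1 for 1 <= j < r
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances from source in the unit-weight graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```

With k = 3 and the subsets {1, 2, 3}, {3, 4, 5}, {4, 5, 6}, an element location is at distance 1 from the first location of each subset containing it and at distance 2 from that subset's last location, while the last location of a subset has exactly r locations (itself included) within distance 1.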
Each location both contains a customer and is a potential location of a facility (S = F). Again, r = k.
Figure 4.5: An example illustrating the reduction in the proof of Theorem 4.11. The figure shows the subgraph constructed for a subset $S_i = \{x_1, x_3, x_4, x_6\}$.
It is not difficult to see that a solution has cost 1 iff the customers corresponding to each subset $S_i$ are assigned to $f_{ir}$, and customers who correspond to elements are assigned to neighboring locations of type $f_{i1}$ (as in the proof of the previous theorem, despite the additional locations and customers). Otherwise the cost is at least 2. Therefore the reduction holds due to the same arguments, and the problem cannot be approximated within a factor of less than 2, assuming P ≠ NP.
Since r-gather clustering is a generalization of the problem mentioned in the last theorem, this hardness result also holds for r-gather clustering, thus matching the approximation-ratio obtained by [AFK+06]. Previously this was known for r-gather clustering only for r ≥ 7 [AFK+06].
Corollary 4.12. For any r ≥ 3, it is NP-hard to approximate the r-gather clustering problem within a factor of less than 2.
We next prove the hardness of a related problem described in the Introduction, min-max minimum-movement facility-location, which was introduced by [DHM+07]. They observed that this problem is approximable within a factor of 2. We provide a matching lower-bound on the approximability, thus resolving an open question that they presented [DHM+07].
Theorem 4.13. It is NP-hard to approximate the min-max minimum-movement facility-location problem within a factor of less than 2.
Figure 4.6: An example illustrating the reduction in the proof of Theorem 4.13. The letter “c” represents a client and “s” represents a server.
Proof. The reduction is from SAT.
We build an unweighted graph with the following vertices: a “server” for each variable, a “client” for each clause, and an empty vertex for each literal, connected to the clauses which contain it and to its variable (see Figure 4.6 for an example illustrating this construction). A facility may be located at any vertex. The connection-costs are defined according to the distances in this graph. Thus, there is a satisfying assignment to the formula iff there is a solution of cost 1 to the minimum-movement facility-location problem (facilities are located in vertices corresponding to true literals). Otherwise, the cost is at least 2. Thus, the theorem follows.
4.6 Concluding Remarks and Open Problems
We considered the min-max version of the r-gathering problem, and provided constant-approximation algorithms and hardness-of-approximation results for several variants, some of which are tight. Some of our results improve previous results for special cases or related problems, including an improved approximation for min-sum r-gathering without facility costs and improved results for r-gather clustering and min-max minimum-movement facility-location. Obvious remaining open problems are providing improved approximation algorithms or hardness results for min-max r-gathering with the proximity requirement and for min-sum r-gathering. Other problems which remain for future research are the generalizations in which each customer may have a different demand and each facility must serve a total demand of at least r, while the connection-costs are the product of distance and demand (previously considered by [GMM00, KM00] for the min-sum version).
Chapter 5 Concluding Remarks
In this research we attempted to “explore new roads in an ancient land”, and show that meaningful findings can still be obtained even without using “heavy tools”.
As is often the case in theoretical computer-science, it seems that the problems we introduced and the solutions we obtained raise more open questions for future research than we resolved. We hope that we succeeded in providing some new insights into the problems we considered, and that future research will resolve the questions we left open, as well as other questions arising from considering multicriteria, cooperative or other non-standard variants of combinatorial problems. Contributing to the research in this field has been a greatly enriching experience, which I hope many other prospective researchers will share.
Bibliography
[AAS06] A. Armon, A. Avidor, and O. Schwartz. Cooperative TSP. In Proceedings of the 14th Annual European Symposium on Algorithms (ESA), Lecture Notes in Computer-Science, volume 4168, pages 41–50. Springer, 2006.
[ABF+02] E. M. Arkin, M. A. Bender, S. P. Fekete, J. S. B. Mitchell, and M. Skutella. The Freeze-Tag problem: How to wake up a swarm of robots. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 568–577, 2002.
[ABG+03] E. M. Arkin, M. A. Bender, D. Ge, S. He, and J. S. B. Mitchell. Improved approximation algorithms for the Freeze-Tag problem. In Proceedings of the 15th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 295–303, 2003.
[AC04] S. Arora and K. L. Chang. Approximation schemes for degree-restricted MST and Red-Blue Separation problems. Algorithmica, 40(3):189–210, 2004.
[AFK+06] G. Aggarwal, T. Feder, K. Kenthapadi, S. Khuller, R. Panigrahy, D. Thomas, and A. Zhu. Achieving anonymity via clustering. In Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 153–162, 2006.
[AGK+04] V. Arya, N. Garg, R. Khandekar, A. Meyerson, K. Munagala, and V. Pandit. Local search heuristics for k-median and facility location problems.
SIAM Journal on Computing, 33(3):544–562, 2004.
[AH94] E. Arkin and R. Hassin. Approximation algorithms for the geometric Covering Salesman Problem. Discrete Applied Math., 55:197–218, 1994.
[AK07] E. Anshelevich and A. Karagiozova. Terminal backup, 3D matching, and covering cubic graphs. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC), pages 391–400, 2007.
[Arc00] A. Archer. Inapproximability of the asymmetric facility location and k-median problems. citeseer.ist.psu.edu/archer00inapproximability.html, 2000.
[Arm08] A. Armon. On min-max r-gatherings. In Approximation and Online Algorithms, 5th International Workshop (WAOA 2007), Lecture Notes in Computer-Science, volume 4927, pages 128–141. Springer, 2008.
[Aro98] S. Arora. Polynomial-time approximation schemes for Euclidean TSP and other geometric problems. Journal of the ACM, 45(5):753–782, 1998.
[AZ06] A. Armon and U. Zwick. Multicriteria global minimum cuts. Algorithmica, 46(1):15–26, 2006.
[BEH00] M. Bruglieri, M. Ehrgott, and H.W. Hamacher. Some complexity results for k-cardinality minimum cut problems. Technical report. In Wirtschaftsmathematik, University of Kaiserslautern, (69/2000), 2000.
[BME04] M. Bruglieri, F. Maffioli, and M. Ehrgott. Cardinality constrained minimum cut problems: Complexity and algorithms. Discrete Applied Mathematics, 137(3):311–341, 2004.
[BNGNS98] A. Bar-Noy, S. Guha, J. Naor, and B. Schieber. Multicasting in heterogeneous networks. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC), pages 448–453, 1998.
[Byr07] J. Byrka. An optimal bifactor approximation algorithm for the metric uncapacitated facility location problem. In Proceedings of the 10th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), Lecture Notes in Computer Science, volume 4627, pages 29–43. Springer, 2007.
[Cha03] T. M. Chan. Euclidean bounded-degree spanning tree ratios.
In Proceedings of the 19th ACM Symposium on Computational Geometry (SoCG), pages 11–19, 2003.
[Chr76] N. Christofides. Worst-case analysis of a new heuristic for the traveling salesman problem. Technical report, Graduate School of Industrial Administration, Carnegie-Mellon University, 1976.
[Cli97] J. Climacao. Multicriteria Analysis. Springer-Verlag, 1997.
[CLRS01] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2001.
[dBGK+05] M. de Berg, J. Gudmundsson, M. J. Katz, C. Levcopoulos, M. H. Overmars, and A. F. van der Stappen. TSP with neighborhoods of varying size. Journal of Algorithms, 57:22–36, 2005.
[dBvKOS00] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer, 2nd edition, 2000.
[DGKR05] I. Dinur, V. Guruswami, S. Khot, and O. Regev. A new multilayered PCP and the hardness of hypergraph vertex cover. SIAM Journal on Computing, 34(5):1129–1146, 2005.
[DHM+07] E. D. Demaine, M. Hajiaghayi, H. Mahini, A. S. Sayedi-Roshkhar, S. Oveisgharan, and M. Zadimoghaddam. Minimizing movement. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 731–740, 2007.
[DM01] A. Dumitrescu and J. S. B. Mitchell. Approximation algorithms for TSP with neighborhoods in the plane. In Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 38–46, 2001.
[Dre95] Z. Drezner. Facility Location. Springer-Verlag, 1995.
[EG00] M. Ehrgott and X. Gandibleux. A survey and annotated bibliography of multicriteria combinatorial optimization. Operation Research Spectrum, 22:425–460, 2000.
[Ehr00] M. Ehrgott. Multicriteria Optimization. Lecture Notes in Economics and Mathematical Systems. Springer-Verlag, 2000.
[EK01] L. Engebretsen and M. Karpinski. Approximation hardness of TSP with bounded metrics.
In Proceedings of the 28th Annual International Colloquium on Automata, Languages and Programming (ICALP), Lecture Notes in Computer-Science, volume 2076, pages 201–212. Springer, 2001.
[EKR99] F. Ergun, R. Kumar, and R. Rubinfeld. Fast approximate PCPs. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing (STOC), pages 41–50, 1999.
[FKK+97] S. P. Fekete, S. Khuller, M. Klemmstein, B. Raghavachari, and N. Young. A network flow technique for finding low-weight bounded-degree trees. Journal of Algorithms, 24:310–324, 1997.
[GGJ76] M. R. Garey, R. L. Graham, and D. S. Johnson. Some NP-complete geometric problems. In Proceedings of the 8th Annual ACM Symposium on Theory of Computing (STOC), pages 10–22, 1976.
[GH94] O. Goldschmidt and Dorit S. Hochbaum. A polynomial algorithm for the k-cut problem for fixed k. Mathematics of Operations Research, 19(1):24–37, 1994.
[GJ79] M. R. Garey and D. S. Johnson. Computers and Intractability – A Guide to the Theory of NP-Completeness. Freeman publishing, 1979.
[GK99] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248, 1999.
[GL99] J. Gudmundsson and C. Levcopoulos. A fast approximation algorithm for TSP with neighborhoods. Nordic Journal of Computing, 6(4):469–488, 1999.
[GMM00] S. Guha, A. Meyerson, and K. Munagala. Hierarchical placement and network design problems. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pages 603–612, 2000.
[Gon85] T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Comput. Sci., 38:293–306, 1985.
[GT88] A. V. Goldberg and R. E. Tarjan. A new approach to the maximum-flow problem. Journal of the ACM, 35(4):921–940, 1988.
[GvKN06] J. Gudmundsson, M. J. van Kreveld, and G. Narasimhan. Region-restricted clustering for geographic data mining.
In Proceedings of the 14th Annual European Symposium on Algorithms (ESA), Lecture Notes in Computer-Science, volume 4168, pages 399–410. Springer, 2006.
[Han80] P. Hansen. Bicriterion path problems. In G. Fandel and T. Gal, editors, Multiple Criteria Decision Making: Theory and Applications, LNEMS 177, pages 109–127. Springer-Verlag, Berlin, 1980.
[HCP04] S.P. Hong, S.J. Chung, and B.H. Park. A fully-polynomial bicriteria approximation scheme for the constrained minimum spanning tree problem. Operations Research Letters, 32(3):233–239, 2004.
[HHL88] S. M. Hedetniemi, S. T. Hedetniemi, and A.L. Liestman. A survey of gossiping and broadcasting in communication networks. Networks, 18(4):319–359, 1988.
[HO94] J. Hao and J. B. Orlin. A faster algorithm for finding the minimum cut in a directed graph. Journal of Algorithms, 17(3):424–446, 1994.
[Hoc82] D. Hochbaum. Heuristics for the fixed cost median problem. Mathematical Programming, 22(1):148–162, 1982.
[Hoo91] J. A. Hoogeveen. Analysis of Christofides’ heuristic: Some paths are more difficult than cycles. Operations Research Letters, 10(5):291–295, 1991.
[HS76] E. Horowitz and S. Sahni. Exact and approximate algorithms for scheduling nonidentical processors. Journal of the Association for Computing Machinery, 23:317–327, 1976.
[JP99] K. Jansen and L. Porkolab. Improved approximation schemes for scheduling unrelated parallel machines. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing (STOC), pages 408–417, 1999.
[Kar99] D. R. Karger. Random sampling in cut, flow, and network design problems. Mathematics of Operations Research, 24(2):383–413, 1999.
[Kar00] D. R. Karger. Minimum cuts in near-linear time. Journal of the ACM, 47(1):46–76, 2000.
[KLS04] J. Konemann, A. Levin, and A. Sinha. Approximating the degree-bounded minimum diameter spanning tree problem. Algorithmica, 41(2):117–129, 2004.
[KM00] D. R. Karger and M. Minkoff. Building Steiner trees with incomplete global knowledge.
In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pages 613–623, 2000.
[KRY96] S. Khuller, B. Raghavachari, and N. Young. Low degree spanning trees of small weight. SIAM Journal on Computing, 25(2):355–368, 1996.
[KS96] D. R. Karger and C. Stein. A new approach to the minimum cut problem. Journal of the ACM, 43(4):601–640, 1996.
[LST90] J.K. Lenstra, D.B. Shmoys, and E. Tardos. Approximation algorithms for scheduling unrelated parallel machines. Mathematical Programming, 46:259–271, 1990.
[LWX06] A. Lim, F. Wang, and Z. Xu. A transportation problem with minimum quantity commitment. Transportation Science, 40(1):117–129, 2006.
[Meg83] N. Megiddo. Linear-time algorithms for linear programming in $R^3$ and related problems. SIAM Journal on Computing, 12(4):759–776, 1983.
[Mit99] J. S. B. Mitchell. Guillotine Subdivisions approximate polygonal subdivisions: Part II – A simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems. SIAM Journal on Computing, 28(4):1298–1309, 1999.
[Mit07] J. S. B. Mitchell. A PTAS for TSP with neighborhoods among fat regions in the plane. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 11–18, 2007.
[MM95] C. Mata and J. S. B. Mitchell. Approximation algorithms for geometric tour and network design problems. In Proceedings of the 11th Annual ACM Symposium on Computational Geometry (SoCG), pages 360–369, 1995.
[MS92] C. L. Monma and S. Suri. Transitions in geometric minimum spanning trees. Discrete & Computational Geometry, 8:265–293, 1992.
[MYZ03] M. Mahdian, Y. Ye, and J. Zhang. A 2-approximation algorithm for the soft-capacitated facility location problem. In Proceedings of the 10th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), Lecture Notes in Computer Science, volume 2764, pages 129–140. Springer, 2003.
[NI92] H. Nagamochi and T. Ibaraki.
Computing edge-connectivity in multigraphs and capacitated graphs. SIAM Journal on Discrete Mathematics, 5(1):54–66, 1992.
[NNI97] H. Nagamochi, K. Nishimura, and T. Ibaraki. Computing all small cuts in an undirected network. SIAM Journal on Discrete Mathematics, 10(3):469–481, 1997.
[Pap77] C. H. Papadimitriou. Euclidean TSP is NP-complete. Theoretical Computer Science, 4:237–244, 1977.
[PV84] C. H. Papadimitriou and U. V. Vazirani. On two geometric problems related to the Traveling Salesman Problem. Journal of Algorithms, 5:231–246, 1984.
[PY00] C. H. Papadimitriou and M. Yannakakis. On the approximability of trade-offs and optimal access of web sources. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pages 86–92, 2000.
[Rav94] R. Ravi. Rapid rumor ramification: Approximating the minimum broadcast time. In Proceedings of the 35th Symposium on Foundations of Computer Science (FOCS), pages 202–213, 1994.
[RZ00] G. Robins and A. Zelikovsky. Improved Steiner tree approximation in graphs. In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 770–779, 2000.
[SABM02] M. O. Sztainberg, E. M. Arkin, M. A. Bender, and J. S. B. Mitchell. Analysis of heuristics for the Freeze-Tag problem. In Proceedings of the 8th Scandinavian Workshop on Algorithm Theory (SWAT), Lecture Notes in Computer-Science, volume 2368, pages 270–279. Springer, 2002.
[Sch96] B. Schneier. Applied Cryptography, pages 71–73. John Wiley and Sons, 1996.
[SKK99] K. Schloegel, G. Karypis, and V. Kumar. A new algorithm for multi-objective graph partitioning. In Proceedings of the 5th European Conference on Parallel Processing (Euro-Par), Lecture Notes in Computer Science, volume 1685, pages 322–331. Springer, 1999.
[SS05] S. Safra and O. Schwartz. On the complexity of approximating TSP with Neighborhoods and related problems. Computational Complexity, 14:281–307, 2005.
[Svi02] M. Sviridenko.
An improved approximation algorithm for the metric uncapacitated facility location problem. In Proceedings of the 9th International Conference on Integer Programming and Combinatorial Optimization (IPCO), Lecture Notes in Computer-Science, volume 2337, pages 240–257. Springer, 2002.
[Svi08] Z. Svitkina. Lower-bounded facility location. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1154–1163, 2008.
[SW97] M. Stoer and F. Wagner. A simple min-cut algorithm. Journal of the ACM, 44(4):585–591, 1997.
[TB06] T.M. Truta and V. Bindu. Privacy protection: p-sensitive k-anonymity property. In Proceedings of the Workshop on Privacy Data Management, in conjunction with the 22nd IEEE International Conference of Data Engineering (ICDE), pages 94–103, 2006.
[Vaz03] V. V. Vazirani. Approximation Algorithms. Springer-Verlag, 2003.
[Vyg05] J. Vygen. Approximation algorithms for facility location problems. citeseer.ist.psu.edu/vygen05approximation.html, 2005.
[Wel91] E. Welzl. Smallest enclosing disks (balls and ellipsoids). In New Results and New Trends in Computer Science, Lecture Notes in Computer Science, volume 555, pages 359–370. Springer, 1991.
[ZCY04] J. Zhang, B. Chen, and Y. Ye. Multi-exchange local search algorithm for the capacitated facility location problem. In Proceedings of the 10th International Conference on Integer Programming and Combinatorial Optimization (IPCO), Lecture Notes in Computer-Science, volume 3064, pages 219–233. Springer, 2004.
[Hebrew title page, translated] Tel Aviv University, The Raymond and Beverly Sackler Faculty of Exact Sciences, The Blavatnik School of Computer Science. Exact and Approximate Algorithms for New Variants of Some Classic Graph Problems. Thesis submitted for the degree of Doctor of Philosophy by Amitai Armon, under the supervision of Prof. Uri Zwick. Submitted to the Senate of Tel Aviv University, September 2008 (Tishrei 5769). Dedicated to my mother and to my grandmother Belina.
Acknowledgments
To my advisor, Uri Zwick, for his successful guidance of this research, and for sharing with me his optimistic approach to solving algorithmic problems.
To Amos Fiat, Micha Sharir, Vera Asodi, Eyal Even-Dar and Ben Sandbank, for long and enjoyable collaborations in teaching undergraduate courses. To Adi Avidor and Oded Schwartz, for a fruitful research collaboration and enjoyable joint work. To Amir Epstein and Ido Tzameret, with whom I enjoyed sharing an office. And to my mother and my grandmother Belina, for all the love and support, and for making me who I am.

Abstract
Graphs are probably the most studied object in theoretical computer science, and many problems related to them, such as Shortest Paths, Max-Flow and the Traveling Salesperson Problem, have been studied for decades. Efficient algorithms have been developed for some graph problems, while others were proven to be hard to solve. For many of the hard problems, efficient algorithms have been developed that find an approximate solution, i.e., a solution that is close to optimal.

In this work we consider new variants of three central graph problems: Global Min-Cut, the Traveling Salesperson Problem, and Facility Location. For each of these problems, the variants we consider generalize or extend the problem by adding goals, constraints, or options that arise from real-life scenarios or from previous theoretical research. In studying these variants we also answer several open questions that were posed about previously studied special cases, and we improve several previous results.

In Chapter 2 we consider multi-criteria variants of the global minimum cut problem. In the generalization of this problem to k criteria, each edge has k nonnegative costs. These costs are measured in different units, which cannot be converted into one another. In the AND-version of the problem, choosing an edge requires paying all k of its costs. In the OR-version, one may choose which of its costs to pay. Given k bounds, one for each criterion, the basic decision problem is whether there exists a cut for which the cost of choosing its edges does not exceed any of the bounds.

We show that the AND-version is solvable in polynomial time for any fixed number of criteria (one that is not part of the input), and that it is NP-hard when the number of criteria is not fixed. Our results may seem surprising, since Papadimitriou and Yannakakis [PY00] proved that the minimum s-t cut problem with two criteria is NP-hard. Our work settles an open question posed by Bruglieri et al. [BEH00, BME04]. They asked whether a bound on the number of cut edges makes the global minimum cut problem NP-hard, or whether it remains polynomial. We show that this is a special case of the AND-version we studied, and hence polynomial. For the OR-version, we show NP-hardness even for two criteria, and prove that the problem is solvable in pseudo-polynomial time for any fixed number of criteria (one that is not part of the input). We show that this problem also has an FPTAS. We then consider extensions and applications, as well as multi-criteria OR-versions of two additional optimization problems. To the best of our knowledge, we are the first to consider OR-versions of multi-criteria problems. This chapter is based on the paper [AZ06].

In Chapter 3 we consider cooperative variants of the Traveling Salesperson Problem (TSP). In these problems a traveling salesperson must deliver goods to customers who are willing to assist in the distribution process. The basic motivation for such variants is that in many real-life cases the "customers" are in fact people who belong to the same organization or company as the salesperson, so they can be asked to help. The customers may be able to cooperate in several ways: by moving toward the salesperson in order to receive the goods, by delivering the goods they received to other customers, or by doing both. Several objective functions may be relevant in such cases: minimizing the total distance traveled by all participants, minimizing the maximum distance traveled by any participant, or minimizing the time until the entire distribution process is complete.

Our study covers all combinations of cooperation modes and objective functions, in weighted undirected graphs and also in Euclidean space of fixed dimension. We show that most of these problems have a constant-ratio approximation algorithm, many of the others have a PTAS, and some of the problems are solvable in polynomial time. On the hardness side, we give NP-hardness proofs and inapproximability proofs for certain constants, some of which are tight. All our algorithms are purely combinatorial, and our hardness proofs use reductions from known NP-hard problems, without resorting to the PCP theorem. This chapter is based on the paper [AAS06].

In Chapter 4 we consider a min-max version of the previously studied r-gathering problem, with unit weights.
The problem we consider is a metric facility location problem in which each open facility must serve at least r customers, and the maximum of the facility and connection costs is to be minimized (instead of their sum). The motivation for this problem is scenarios in which r customers are required for opening a facility to be worthwhile, and the costs represent the time until the facility or connection is ready (i.e., we want the whole solution to be ready as early as possible).

We present an algorithm that achieves an approximation ratio of 3 for this problem, and prove that obtaining a better approximation is NP-hard. We then consider this problem with the additional natural constraint that each customer be connected to its nearest open facility. For this version we present an algorithm that achieves an approximation ratio of 9. We also consider previously studied variants and special cases, and obtain improved algorithmic and hardness results. The results of this chapter are based on the paper [Arm08].

Summary
Graphs are probably the most studied object in theoretical computer science, and many graph problems have been studied for many decades. A partial list of such problems includes shortest paths, minimum spanning tree, maximum flow, global minimum cut, the traveling salesperson, minimum Steiner tree, k-center, k-median, and more. A basic overview can be found in the textbook of Cormen et al. [CLRS01]. Efficient algorithms have been developed for some of these classic problems, while others were proven to be hard to solve. For many of the hard problems, efficient algorithms that find an approximate solution have been developed (see, e.g., the survey in the book [Vaz03]).

Alongside the ongoing study of the classic graph problems, there are many studies of their special cases, generalizations, and other variants. Special cases often deal with special types of graphs, or with restrictions on certain input values (such as restricting them to integers, or to a certain numeric range). Generalizations deal, for example, with directed graphs instead of undirected graphs, or with matroids. Other types of variants involve changing the objective function (sum/maximum/minimum/etc.), adding or removing a constraint, and/or considering a different space (e.g., the plane or another Euclidean space). The motivation for the various variants came sometimes from theoretical questions and sometimes from real-life applications.
In this work we consider new variants of three classic problems: minimum cut, the traveling salesperson, and facility location. For each of these problems, the variant we consider appears to shed new light on the problem, and it generalizes previously studied variants. Our work answers several open questions that were asked about these variants, improves several previous results, and broadens the scope of the previous research.

Multi-criteria minimum cuts
In Chapter 2 we consider multi-criteria variants of the global minimum cut problem. In this problem, the vertices of an undirected graph with weighted edges are to be partitioned into two nonempty sets. The goal is to minimize the sum of the weights (costs) of the edges connecting the two subsets. Karger presented a near-linear Monte Carlo algorithm for this problem, running in O(m log^3 n) time [Kar00]. The fastest deterministic algorithms for this problem require O(mn log n) time [HO94, NI92, SW97]. This is similar to the time required by the fastest algorithms for finding a minimum s-t cut, a cut that separates two specific vertices, s and t, which are part of the input (this problem is solved by maximum flow algorithms, see e.g. [GT88]).

The works of Bruglieri et al. [BEH00, BME04] introduced the problem of finding a "cardinality-constrained" minimum cut, i.e., a cut of minimum weight among the cuts in which the number of cut edges is smaller than a value given in the input. The authors asked whether this problem is solvable in polynomial time, and left this as an open question. Their problem can be viewed as a special case of a bicriteria version of the global minimum cut problem, in which each edge has an additional cost (equal to 1 for all edges), and the first cost (criterion) is to be minimized without exceeding a given bound on the second cost (assuming that the two costs cannot be converted into one another).

There are many works, mainly in the operations research community, on cardinality-constrained problems and multi-criteria problems, which are usually NP-hard and require approximations (see e.g. [EG00, Ehr00, Cli97]). A specific result related to this problem is that of Papadimitriou and Yannakakis [PY00], who proved that the bicriteria version of the minimum s-t cut problem is NP-hard.
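To make the cardinality-constrained minimum cut concrete, the following is a minimal brute-force sketch in Python. It is only an illustration of the problem definition on tiny graphs (it enumerates all bipartitions, so it is exponential) and is not the polynomial-time method of the thesis. The function name, the example graph, and the choice to treat the bound on the number of cut edges as inclusive are assumptions made for this example.

```python
from itertools import combinations


def cardinality_constrained_min_cut(n, edges, max_cut_edges):
    """Brute-force illustration (exponential time, tiny graphs only).

    n: number of vertices, labelled 0..n-1.
    edges: list of (u, v, weight) triples of an undirected graph.
    max_cut_edges: inclusive bound on the number of crossing edges
                   (a hypothetical parameter name for this sketch).
    Returns (weight, side) for a lightest cut that respects the bound,
    where side is the vertex set containing vertex 0, or None if no
    cut respects the bound.
    """
    best = None
    others = range(1, n)
    # Vertex 0 is fixed on one side so each bipartition is seen once;
    # taking at most n-2 of the others keeps the other side nonempty.
    for size in range(0, n - 1):
        for subset in combinations(others, size):
            side = {0, *subset}
            crossing = [w for u, v, w in edges if (u in side) != (v in side)]
            if len(crossing) > max_cut_edges:
                continue  # violates the second (cardinality) criterion
            weight = sum(crossing)
            if best is None or weight < best[0]:
                best = (weight, frozenset(side))
    return best


# Hypothetical example: the lightest cut isolates vertex 0 (three edges of
# weight 1), but with at most two cut edges only the single edge (3, 4) works.
EXAMPLE_EDGES = [
    (0, 1, 1), (0, 2, 1), (0, 3, 1),
    (1, 2, 5), (2, 3, 5), (1, 3, 5),
    (3, 4, 4),
]
```

On this example the unconstrained optimum has weight 3, while limiting the cut to two edges forces the heavier single-edge cut of weight 4, which shows how the second criterion can change the optimal cut.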
Our contribution
Quite surprisingly, we show that the bicriteria version of the global minimum cut problem is solvable in polynomial time. This also settles the open question posed by Bruglieri et al. We extend our study and consider two more general multi-criteria versions of the global minimum cut problem. In the k-criteria setting, each edge has k (nonnegative) costs. These costs are measured in different units, which cannot be converted into one another. In the AND-version of the problem, including an edge in the cut requires paying all of its costs. In the OR-version of the problem, one of the edge's costs may be chosen, and only that cost is paid (in the appropriate units). Given k bounds, the basic question is whether the graph has a cut whose cost does not exceed the given bound in any criterion.

We show that the AND-version of the problem is polynomial for any fixed number of criteria, and NP-hard when the number of criteria is not fixed (i.e., when it is part of the input). In contrast, the OR-version of the problem is NP-hard even for k=2, but is solvable in pseudo-polynomial time for any fixed number of criteria. This problem also has an FPTAS. We present similar results for the optimization versions of these two problems (minimizing the cost in one criterion without exceeding the bounds in the other criteria). We also consider further extensions and applications, including multi-criteria OR-versions of two additional optimization problems. To the best of our knowledge, this work is the first to consider OR-versions of multi-criteria problems. This chapter is based on the paper [AZ06].

Cooperative traveling salesperson problems
In Chapter 3 we consider versions of the Traveling Salesperson Problem in which the salesperson and the customers cooperate.
The input to the classic Traveling Salesperson Problem consists of a complete undirected graph with edge weights, and the goal is to find a simple cycle of minimum total weight that passes through all the vertices of the graph. The classic motivation for this problem is planning a route for a traveling salesperson who has to visit a collection of cities or customers and return home, so that the total travel distance is minimized. This problem is known to be NP-hard [GGJ76], and for general edge weights it cannot be approximated even within a ratio that is polynomial in the input size [Vaz03]. For the metric version, in which the edge weights are symmetric and satisfy the triangle inequality, Christofides presented an approximation algorithm that achieves a ratio of 3/2 [Chr76]. For this version it was proven that no approximation ratio better than 131/130 can be achieved (assuming P ≠ NP) [EK01]. In contrast, in any Euclidean space of fixed dimension the problem has a PTAS [Aro98, Mit99], although it is still NP-hard [Pap77]. An approximation algorithm achieving a ratio of 3/2 was also presented for the path version of the problem [Hoo91], i.e., the version in which the salesperson need not return to the starting point (a path rather than a cycle is sought, where one endpoint of the path is given in the input).

The Traveling Salesperson Problem has been studied over the years in many variants. One of the interesting recent variants is the Freeze-Tag problem [ABF+02, SABM02, ABG+03, KLS04], introduced by Arkin et al. [ABF+02]. The motivation for this problem comes from a scenario that arises in operating robot swarms. In this scenario a single active robot has to activate a collection of other robots scattered over an area. To activate a robot one has to reach it, and once a robot is activated it can be given movement instructions so that it activates other robots. The goal is to complete the activation of all the robots as quickly as possible. Arkin et al. presented algorithms that achieve a constant approximation ratio for this problem on several special types of graphs, and proved that no approximation ratio better than 5/3 can be achieved for the problem (assuming P ≠ NP). Konemann et al. subsequently presented an algorithm that achieves an O(√(log n)) approximation in general graphs [KLS04].

The Freeze-Tag problem can be viewed as a variant of the Traveling Salesperson Problem in which the "customers" cooperate with the salesperson by helping with the "sales" after they have been reached. Such cooperation can occur in many other real-life scenarios, in which the "customers" actually belong to the same organization as the salesperson and can be instructed to move in order to help "deliver the goods". Other types of cooperation are of course possible. For example, it may be possible to instruct the "customers" to move even before receiving the goods, in order to meet the salesperson. In some scenarios such cooperation may be relevant instead of, or in addition to, movement after receiving the goods. Likewise, objectives other than minimizing the time may be of interest. For example, in the robot activation scenario, if each robot has a limited battery then it may be important to minimize the maximum distance traveled by any robot. Another interesting objective in some scenarios may be minimizing the total distance traveled by all participants (for example, if their company pays the total travel expenses of everyone). As in the classic Traveling Salesperson Problem, it may also be relevant to consider the problem without the requirement that the salesperson and the customers "return home" (their starting point may be arbitrary, for example when a swarm of robots is scattered over some area). All of these problems are also of interest in Euclidean space rather than in a graph.
Our contribution
As described in Chapter 3, we studied all combinations of the variants mentioned above, i.e., all combinations of cooperation types, objective functions and metric space, with or without the "return home" requirement. We showed that most of these problems have a constant-ratio approximation algorithm, many of them have a PTAS, and a few of them are solvable in polynomial time. On the hardness side, we presented NP-hardness proofs and constants below which the problems cannot be approximated. Some of our hardness results are tight, i.e., they match our algorithmic results. All our algorithms are purely combinatorial, and our hardness proofs use reductions from known hard problems, without resorting to the PCP theorem. This chapter is based on the paper [AAS06].

The min-max version of the r-gathering problem
In Chapter 4 we consider the r-gathering problem, a relatively new variant of the facility location problem. In the classic facility location problem, the input includes a set of customers C and a set of potential facility locations F. For each potential location, the cost of opening a facility there is given. For each customer and potential facility location, a service cost is given, i.e., the cost of serving the customer from a facility located there. The problem is to choose locations for opening facilities so that the sum of the opening costs and the service costs is minimized (each customer is served by the open facility whose service cost is minimal for that customer). This problem is often presented as a complete bipartite graph, in which one side represents the set of customers C and the other side represents the set of locations F (each with its cost), and the edge weights are the service costs.
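The objective of the classic facility location problem described above can be written down directly. The following Python sketch evaluates the cost of a chosen set of open facilities and, purely for illustration, finds an optimum by exhaustive search on a tiny instance. The function names and the example numbers are assumptions made for this example; the exhaustive search is exponential and is not one of the approximation algorithms cited in the text.

```python
from itertools import combinations


def facility_location_cost(open_facilities, opening_cost, service_cost):
    """Opening costs of the chosen facilities plus, for every customer,
    the cheapest service cost among the open facilities.

    opening_cost[f]    : cost of opening a facility at location f.
    service_cost[c][f] : cost of serving customer c from location f.
    """
    total_open = sum(opening_cost[f] for f in open_facilities)
    total_service = sum(
        min(row[f] for f in open_facilities) for row in service_cost
    )
    return total_open + total_service


def brute_force_facility_location(opening_cost, service_cost):
    """Try every nonempty subset of locations (tiny instances only)."""
    locations = range(len(opening_cost))
    best = None
    for k in range(1, len(opening_cost) + 1):
        for chosen in combinations(locations, k):
            cost = facility_location_cost(chosen, opening_cost, service_cost)
            if best is None or cost < best[0]:
                best = (cost, chosen)
    return best


# Hypothetical instance: two locations, three customers.
EXAMPLE_OPENING = [3, 4]
EXAMPLE_SERVICE = [
    [1, 5],  # customer 0
    [2, 2],  # customer 1
    [6, 1],  # customer 2
]
```

In this instance opening either location alone costs 12, while opening both costs 7 + 1 + 2 + 1 = 11, so the extra opening cost is repaid by cheaper service.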
Hochbaum presented an approximation algorithm for this problem that achieves a ratio of O(log n) [Hoc82]. No approximation ratio of o(log n) can be achieved for this problem [Arc], unless NP ⊆ DTIME(n^{O(log log n)}). If the service costs satisfy the triangle inequality, then there is an algorithm that gives a 3/2-approximation for the problem [Byr07]. On the other hand, the problem cannot be approximated better than 1.463, unless NP ⊆ DTIME(n^{O(log log n)}) [GK99]. Most of the research deals with the metric version, in which the service costs indeed satisfy the triangle inequality. In this setting the term "distances" is sometimes used for the service costs, and each customer is said to be connected to its nearest open facility. The metric facility location problem has been studied for many decades, initially mainly in the operations research community, and it has many extensively studied variants (see e.g. the surveys [Dre95, Vyg05]).

Some of the newer variants of the facility location problem deal with an additional constraint on the number of customers that each facility serves. In the capacitated facility location problem (see e.g. [Vyg05]), there is an upper bound on the number of customers that a facility can serve. In the version with "soft capacities", more than one facility may be opened at the same location (so the opening cost is multiplied by the number of facilities opened at that location). In the version with "hard capacities", only one facility may be opened at each potential location. For soft capacities, there is an algorithm that achieves an approximation ratio of 2, exactly matching the integrality gap of the linear relaxation of the problem [MYZ03]. For hard capacities, there is an algorithm that achieves an approximation ratio of 5.83 [ZCY04]. The highest known lower bound on the approximation ratio is the lower bound of 1.463 for the uncapacitated version [SCY04, GK99]. It is worth noting that, unlike in the classic facility location problem, a customer is not necessarily served by the facility whose service cost is lowest for it, because of the capacity constraints.
One of the newer variants of the facility location problem that takes into account the number of customers each facility serves is the r-gathering problem [GMM00, KM00, Svi08]. This problem is analogous to the capacitated facility location problem, but it has a lower bound of r on the number of customers that each facility serves (instead of an upper bound). The basic motivation is that a certain number of customers is usually needed for opening a facility to be economically worthwhile. The special case of this problem in which facilities have no opening costs can also be viewed as a clustering problem in which small clusters are undesirable (see [AFK+06]). Previous works also considered the case in which each location has a different lower bound on the number of customers required for opening a facility there [GMM00, KM00]. Note that in the r-gathering problem, too, a customer is not necessarily served by the facility whose service cost is lowest for it, because of the constraints imposed by the lower bound.

The works of Karger and Minkoff [KM00] and of Guha et al. [GMM00], which introduced the r-gathering problem, also presented a bicriteria approximation algorithm for it. Their algorithm guarantees that each facility serves at least αr customers, at a cost that is at most (1+α)β/(1-α) times the optimal cost for the r-gathering problem, where β is the approximation ratio for the facility location problem. Choosing α = (r-1)/r + ε, this algorithm gives an approximation ratio of 1.5(2r-1+ε) (in a single criterion, as usual).

Recently, Svitkina [Svi08] presented an algorithm that achieves a constant approximation ratio for this problem (a ratio of 558). In addition, for r=2, the work of [AK07] showed that the problem is polynomial if the facilities have no costs.
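The single-criterion ratio quoted above for the bicriteria algorithm can be checked by direct substitution. The short derivation below is a sketch that takes β = 3/2, which is an assumption about the facility location ratio intended in the text (it matches the metric ratio attributed to [Byr07] and the factor 1.5 in the quoted bound), and it drops ε in the first step:

```latex
\[
\alpha = \frac{r-1}{r}
\;\Longrightarrow\;
\frac{1+\alpha}{1-\alpha}
= \frac{(2r-1)/r}{1/r}
= 2r-1,
\qquad
\frac{(1+\alpha)\,\beta}{1-\alpha}
= \tfrac{3}{2}\,(2r-1).
\]
```

Taking α slightly larger than (r-1)/r gives αr > r-1, so every open facility serves at least r customers because the number of customers is an integer; the price of this slack is the additive ε inside 1.5(2r-1+ε).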
The work of Aggarwal et al. [AFK+06] considered a special case of a min-max version of this problem, in which facilities have no opening costs, C=F (i.e., the customer locations are the same as the potential facility locations), and the goal is to minimize the maximum service cost (rather than the sum of the costs). They called this special case, which was motivated by a clustering application, r-gather clustering. In their work they presented an algorithm that achieves an approximation ratio of 2 for this special case, and proved that no better approximation ratio can be achieved for r>6 (assuming P ≠ NP). For a generalization in which a given percentage of the customers may be ignored ("sampling errors" in clustering terms), they found an algorithm that achieves an approximation ratio of 3.

Our contribution
We present an algorithm that achieves an approximation ratio of 3 for the general min-max version of the r-gathering problem. We prove that it cannot be approximated better for r>2 (for r=2, the work of [AK07] proves that the problem is solvable in polynomial time). In addition, we prove that for every r>2 the r-gather clustering problem cannot be approximated within a ratio smaller than 2. This improves the hardness result of [AFK+06]. Our algorithm has several extensions, including a 3-approximation for the generalization in which a given percentage of the customers may be ignored (i.e., we achieve the same approximation ratio for a more general problem than the one considered in [AFK+06]). We also consider the version in which each customer must be served by the open facility whose service cost is cheapest for it ("the nearest facility"), which is a rather natural requirement. We present an algorithm that achieves an approximation ratio of 9 for this version.

For the original version of the problem, in which the sum of the costs is to be minimized, we achieve an approximation ratio of 2r in the case where facilities have no opening costs. Such a ratio was obtained concurrently with our work by Lim et al. [LWX06], using a different algorithm.

The results described in this chapter are based on the paper [Arm08].