68. Workshop über Algorithmen und Komplexität
68. Workshop über Algorithmen und Komplexität (Theorietag)
Friedrich-Schiller-Universität Jena
13 November 2014

Abstracts of the talks (compiled by Martin Mundhenk)

Preliminary programme

9:40–10:05 Markus L. Schmid (Trier): Pattern matching with variables
10:05–10:30 Florin Manea (Kiel): Pattern matching with variables: fast algorithms and new hardness results
10:45–11:10 Katrin Casel (Trier): Mathematical models for the anonymization of microdata
11:10–11:35 André Nichterlein (TU Berlin): On combinatorial anonymization
11:35–12:00 Manuel Malatyali (Paderborn): Online top-k-position monitoring of distributed data streams
13:00–14:00 Martin Dietzfelbinger (Ilmenau): Randomness in hash functions: constructions and applications
14:15–14:40 Christian Komusiewicz (TU Berlin): Polynomial-time data reduction for the subset interconnection design problem
14:40–15:05 Manuel Sorge (TU Berlin): The minimum feasible tileset problem
15:05–15:30 Arne Meier (Hannover): Parameterized complexity of CTL: a generalization of Courcelle's theorem
15:50–16:15 Moritz Gobbert (Trier): The complexity of Latrunculi
16:15–16:40 Pascal Lenzner (Jena): Selfish network creation – dynamics and structure
16:40–17:05 Stefan Kratsch (TU Berlin): Losing weight is easy

Mathematical Models for the Anonymization of Microdata
(original title: Mathematische Modelle zur Anonymisierung von Mikrodaten)
Katrin Casel
Universität Trier

Electronic records of confidential personal data exist in many areas, for example as digitized medical files. On the one hand, this information is a valuable resource for research; on the other hand, in the wrong hands it constitutes a serious violation of privacy. In Germany, such data may therefore only be published if individual details "can only be attributed with a disproportionately large expenditure of time, cost and labour" (§16 Abs. 6 BStatG).

Such a degree of anonymization requires processing the individual original data (microdata) in ways that go beyond merely deleting unique attributes such as name or tax identification number. Over the last decades, many different anonymization methods have been developed, mostly with heuristic solution procedures. Abstract models make it possible to compare different approaches and open up new ways of solving the problem. Concretely, many generalization methods (coarsening of the microdata) can be modelled as special clustering problems on graphs, and certain access restrictions as combinatorial problems on matrices. Studying the parallels and differences to known problems such as k-center, k-colorability, set-cover, etc. yields new solution approaches and permits a classification of the anonymization methods with respect to approximability and (parameterized) complexity.

Randomness in Hash Functions: Constructions and Applications
(original title: Zufälligkeit in Hashfunktionen: Konstruktionen und Anwendungen)
Martin Dietzfelbinger
TU Ilmenau

The talk gives an overview of recent constructions and analyses of hash functions (for data structures, not cryptographic hash functions!) with good and very good randomness properties, and presents some applications that benefit from such strong hash functions. These are dictionaries, the representation of functions with constant evaluation time, extremely space-efficient perfect hash functions, and the simulation of fully random, that is, ideal hash functions. The mix of methods is interesting: probability theory, linear algebra, and the theory of random (hyper)graphs are combined.

The Complexity of Latrunculi
(original title: Die Komplexität von Latrunculi)
Moritz Gobbert
Universität Trier

Ludus Latrunculorum, also called Latrunculi, is an ancient Roman game whose exact rules have not been handed down completely. Because of these discrepancies, there are different reconstructions of the rule set. The talk addresses the complexity of the game. First, the historical background of the game is briefly discussed. Then a particular rule set is presented which corresponds to many modern descriptions of the game. It is then shown that even the question of whether player A can move a given piece such that it captures a given piece of player B is NP-complete. This contrasts with many other games such as chess or Go, for which analogous questions are decidable in polynomial time. The talk closes with an outlook on some open questions concerning Latrunculi.

Keywords: complexity, Latrunculi, Ludus Latrunculorum, Roman game.

Polynomial-time Data Reduction for the Subset Interconnection Design Problem
Jiehua Chen, Christian Komusiewicz, Rolf Niedermeier, Manuel Sorge, Ondřej Suchý, Mathias Weller
Fakultät für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany

The NP-hard Subset Interconnection Design problem, also known as Minimum Topic-Connected Overlay, is motivated by numerous applications including the design of scalable overlay networks and vacuum systems. Its input is a finite set V and a collection of subsets V_1, V_2, ..., V_m ⊆ V, and it asks for a minimum-cardinality edge set E such that for the graph G = (V, E) all induced subgraphs G[V_1], G[V_2], ..., G[V_m] are connected. We study Subset Interconnection Design in the context of polynomial-time data reduction rules that preserve the possibility to construct optimal solutions. Our contribution is threefold: First, we show the incorrectness of earlier polynomial-time data reduction rules. Second, we show linear-time solvability in case of a constant number m of subsets, implying fixed-parameter tractability for the parameter m.
Third, we provide a fixed-parameter tractability result for small subset sizes and tree-like output graphs. To achieve our results, we elaborate on polynomial-time data reduction rules which may also be of practical use in solving Subset Interconnection Design.

Losing weight is easy
Michael Etscheid, Matthias Mnich, Heiko Röglin (Universität Bonn)
Stefan Kratsch (TU Berlin)

The talk discusses some aspects of having large numbers and weights in the inputs of combinatorial problems. Pivotal problems in this regard are Subset Sum and Knapsack, but also weighted versions of classical NP-hard problems like Vertex Cover, as well as problems related to integer linear programs. Shrinking numbers to small size is an important task in kernelization, where one studies provable bounds on efficient simplification of NP-hard problems. We recall and briefly discuss some fairly recent attempts at coping with large numbers in Subset Sum and Knapsack. Further progress beyond these results was the topic of several open problems in kernelization. The last part of the talk shows how an almost 30-year-old theorem single-handedly settles these open problems.

Selfish Network Creation – Dynamics and Structure
Pascal Lenzner
Department of Computer Science, Friedrich-Schiller-University Jena

Many important networks, most prominently the Internet, are not designed and administered by a central authority. Instead, such networks have evolved over time by (repeated) uncoordinated interaction of selfish agents which control and modify parts of the network. The Network Creation Game [Fabrikant et al. PODC'03] and its variants attempt to model this scenario. In these games, agents correspond to nodes in a network and each agent may create costly links to other nodes. The goal of each agent is to obtain a connected network having maximum service quality, i.e. small distances to all other agents, at low cost.
The key questions are: What do the equilibrium networks of these games look like, and how can selfish agents actually find them? For the latter, recent results on the dynamic properties of the sequential version of these games will be surveyed. For the former, ongoing work focussing on structural properties is presented.

Online Top-k-Position Monitoring of Distributed Data Streams
Manuel Malatyali
Heinz Nixdorf Institute, University of Paderborn

In this talk we consider a model in which there is one coordinator and a set of n distributed nodes directly connected to the coordinator. Each node continuously receives data from an input stream only known to the respective node or, in other words, observes a private function whose value changes over time. At any time, the coordinator has to know the k nodes currently observing the k largest values. In order to inform the coordinator about its current value, a node can exchange messages with the coordinator. Additionally, the coordinator can send broadcast messages received by all nodes. The goal in designing an algorithm for this setting, which we call Top-k-Position Monitoring, is to find a solution that, on the one hand, keeps the coordinator informed as much as necessary for solving the problem and, on the other hand, minimizes the communication, i.e., the number of messages, between the coordinator and the distributed nodes. For the considered problem, we present an algorithm that combines the notion of filters with a kind of random sampling of nodes. The basic idea of assigning filters to the distributed nodes is to reduce the number of exchanged messages by providing the nodes with constraints that define when they can safely refrain from sending observed changes in their input streams to the coordinator.
However, whenever it becomes necessary to communicate observed changes and update filters, we make extensive use of a new randomized protocol for determining the maximum (or minimum) value currently observed by (a certain subset of) the nodes. Since the values observed by the nodes change over time and are not known in advance, our problem is an online problem. In our analysis we therefore compare the number of messages exchanged by our online algorithm to that of an offline algorithm that sets filters in an optimal way. We show that they differ by a factor of at most O((log Δ + k) · log n) in expectation, where Δ is the largest difference of the values observed at the nodes holding the k-th and (k+1)-st largest value at any time.

Pattern Matching with Variables: Fast Algorithms and New Hardness Results
Henning Fernau (1), Florin Manea (2), Robert Mercaş (2), and Markus L. Schmid (1)
(1) Fachbereich IV – Abteilung Informatikwissenschaften, Universität Trier, D-54286 Trier, Germany, {Fernau, MSchmid}@uni-trier.de
(2) Department of Computer Science, Kiel University, D-24098 Kiel, Germany, {flm, rgm}@informatik.uni-kiel.de

A pattern is a string that consists of terminal symbols (e.g., a, b, c) and variables (e.g., x_1, x_2, x_3). The terminal symbols are constants, while the variables are uniformly replaced by strings over the set of terminals; thus, a pattern is mapped to a terminal word. For example, x_1 ab x_1 x_2 c x_2 x_1 can be mapped to acabaccaaccaaac by the replacement (x_1 → ac, x_2 → caa).
Owing to their simple definition, patterns appear in various areas of theoretical computer science, such as language theory (pattern languages), learning theory (inductive inference, PAC-learning), combinatorics on words (word equations, unavoidable patterns, etc.), pattern matching (generalised function matching), and database theory (extended conjunctive regular path queries). We also find them in practice as extended regular expressions with backreferences, used in programming languages such as Perl, Java, and Python. In all these different applications, the main purpose of patterns is to express combinatorial pattern matching questions. Unfortunately, deciding whether a given general pattern can be mapped to a given word is NP-complete. On the other hand, some subclasses of patterns are known for which the matching problem is in P; however, the existing polynomial-time algorithms for these classes are fairly basic and cannot be considered efficient in a practical sense. Therefore, we present several efficient algorithms for the known polynomial variants of the matching problem. While we consider our algorithms to be non-trivial, their running times still have an exponential dependency on certain parameters of patterns (necessary under common complexity-theoretical assumptions) and, thus, are acceptable only for strongly restricted classes of patterns. In some applications of patterns it is necessary to require the mapping of variables to be injective. To this end, we show the NP-completeness of the following natural combinatorial factorisation problem: given a number k and a word w, can w be factorised into k distinct factors? It follows that even for the trivial patterns x_1 ··· x_k the matching problem is NP-complete if we require injectivity. In terms of complexity, a clear borderline between the injective and the non-injective versions of the matching problem is thus established.
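
The factorisation problem stated above can be illustrated by a small exhaustive search. This is only a sketch for experimenting with tiny inputs, not an algorithm from the talk: the function name distinct_factorisation is hypothetical, factors are assumed to be non-empty, and the running time is exponential in general, as is to be expected for an NP-complete problem.

```python
def distinct_factorisation(w, k):
    """Return k pairwise distinct, non-empty factors whose concatenation
    is w, or None if no such factorisation exists (exhaustive search)."""

    def search(start, remaining, used):
        # All k factors chosen: succeed only if the whole word is consumed.
        if remaining == 0:
            return [] if start == len(w) else None
        # Leave at least one letter for each factor still to be chosen.
        for end in range(start + 1, len(w) - remaining + 2):
            factor = w[start:end]
            if factor in used:
                continue  # factors must be pairwise distinct
            used.add(factor)
            rest = search(end, remaining - 1, used)
            used.discard(factor)
            if rest is not None:
                return [factor] + rest
        return None

    return search(0, k, set())


print(distinct_factorisation("abab", 3))  # a | b | ab
print(distinct_factorisation("aaa", 3))   # None: only a | a | a is possible
```

Over a one-letter alphabet distinct factors must have distinct lengths, which is why "aaa" admits two distinct factors (a | aa) but not three.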
Parameterized Complexity of CTL: A Generalization of Courcelle's Theorem
Martin Lück, Arne Meier, Irina Schindler*
Institut für Theoretische Informatik, Leibniz Universität Hannover
{lueck, meier, schindler}@thi.uni-hannover.de

We present an almost complete classification of the parameterized complexity of all operator fragments of the satisfiability problem in computation tree logic CTL. The investigated parameterization is temporal depth and pathwidth. The classification shows a dichotomy between W[1]-hard and fixed-parameter tractable fragments. The only real operator fragment that is in FPT is the fragment containing solely AX. We also prove a generalization of Courcelle's theorem to infinite signatures, which is used to prove the FPT-membership cases.

* Supported in part by DFG ME 4279/1-1.

On Combinatorial Anonymization
André Nichterlein
Fakultät für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany

We review our recent and ongoing work on analyzing the computational complexity of combinatorial data anonymization, mainly discussing degree-based network anonymization. Roughly speaking, an object is called k-anonymous if there are at least k − 1 other objects in the data that "look the same". In the case of graphs, a vertex is called k-anonymous if there are at least k − 1 other vertices having the same degree. The goal of making a graph k-anonymous (that is, all its vertices shall be k-anonymous) leads to a number of algorithmic graph modification problems. These problems are mostly intractable; in particular, we exclude o(√n)-approximation algorithms with running time f(s) · n^O(1), where s denotes the number of allowed modifications. On the positive side, we show efficiently solvable cases when restricting to edge insertion as the allowed modification.
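
The degree-based definition of k-anonymity above can be checked directly. The following is a minimal sketch under stated assumptions: the function name is_k_anonymous is hypothetical, vertices are assumed to be numbered 0 to n − 1, and the graph is assumed to be simple and undirected. It only tests the property; it does not solve any of the modification problems discussed in the talk.

```python
from collections import Counter


def is_k_anonymous(n, edges, k):
    """Check whether every vertex shares its degree with at least
    k - 1 other vertices. Vertices are 0..n-1; edges are undirected pairs."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Each degree value that occurs must occur at least k times.
    return all(count >= k for count in Counter(degree).values())


path = [(0, 1), (1, 2)]       # degrees 1, 2, 1
triangle = path + [(0, 2)]    # one edge insertion: degrees 2, 2, 2
print(is_k_anonymous(3, path, 2))      # False: the middle vertex is unique
print(is_k_anonymous(3, triangle, 3))  # True
```

In this toy instance a single edge insertion turns a path that is not 2-anonymous into a 3-anonymous triangle.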
This talk is based on joint work with Cristina Bazgan, Robert Bredereck, Vincent Froese, Sepp Hartung, Clemens Hoffmann, Rolf Niedermeier, Ondřej Suchý, Nimrod Talmon, and Gerhard Woeginger.

Pattern Matching with Variables
Markus L. Schmid
Universität Trier, FB IV – Abteilung Informatikwissenschaften

Let Σ be an arbitrary alphabet of terminals and let X = {x_1, x_2, x_3, ...} be an enumerable set of variables. Any string α ∈ (Σ ∪ X)^+ is a pattern and every string w ∈ Σ^* is a word. A substitution is a mapping h : (X ∪ Σ) → Σ^* with h(a) = a for every a ∈ Σ. For a pattern α = z_1 z_2 ... z_n with z_i ∈ Σ ∪ X, 1 ≤ i ≤ n, we denote by h(α) the word h(z_1) h(z_2) ... h(z_n). The problem of pattern matching with variables is defined as follows:

Pattern Matching with Variables (VPatMatch)
Instance: A pattern α and a word w.
Question: Does there exist a substitution h with h(α) = w?

As an example, we consider the pattern α = x_1 a x_1 b x_2 x_2, where a, b, c ∈ Σ, and the word w = bacaabacabbaba. We note that (α, w) is a positive instance of VPatMatch since for h(x_1) = baca and h(x_2) = ba we have h(α) = w. On the other hand, there exists no substitution h with h(α) = cbcabbcbbccbc.

Owing to their natural and simple definition, patterns (and the way they map to words) appear in various areas of theoretical computer science, such as language theory, learning theory, combinatorics on words, pattern matching, and database theory. We also find them in practice in the form of extended regular expressions with backreferences, used in programming languages like Perl, Java, Python, etc.

The problem VPatMatch, as defined above, is NP-complete (which is easy to show), but in the literature different variants of VPatMatch are investigated: the nonerasing version (i.e., variables must be substituted by non-empty words), the terminal-free version (i.e., the patterns contain only variables), the injective version (i.e., different variables must be substituted by different words), and any combination of these. In addition, there are many numerical parameters: number of variables, number of terminals, length of w, number of occurrences per variable, and length of the images h(x). By combining the different VPatMatch variants with all possibilities of bounding some of the numerical parameters by constants, we obtain a fairly large class of different pattern matching problems with variables.

In this talk, we present some of the main results of a systematic multivariate complexity analysis (see [1, 2, 3]) of this rich class of pattern matching problems with variables. It turns out that surprisingly strongly restricted versions of VPatMatch are still NP-complete, while all polynomial-time solvable cases are such that the brute-force algorithm already has polynomial running time.

References

[1] H. Fernau and M. L. Schmid. Pattern matching with variables: A multivariate complexity analysis. In Proceedings of the 24th CPM, volume 7922 of LNCS, pages 83–94, 2013.
[2] H. Fernau, M. L. Schmid, and Y. Villanger. On the parameterised complexity of string morphism problems. In Proceedings of the 33rd FSTTCS, volume 24 of Leibniz International Proceedings in Informatics (LIPIcs), pages 55–66, 2013.
[3] D. Reidenbach and M. L. Schmid. Patterns with bounded treewidth. Information and Computation, 2014. http://dx.doi.org/10.1016/j.ic.2014.08.010.
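
To make the definition of VPatMatch concrete, here is a minimal brute-force sketch for the basic (erasing, non-injective) version. It is an illustration under stated assumptions rather than the algorithm analysed in the cited papers: the names substitute and match are hypothetical, a pattern is represented as a list of symbols, and every symbol not listed in terminals is treated as a variable. The search tries every possible image for each variable at its first occurrence, so it is exponential in the number of variables.

```python
def substitute(pattern, h):
    """Apply substitution h to a pattern; terminals map to themselves."""
    return "".join(h.get(symbol, symbol) for symbol in pattern)


def match(pattern, word, terminals):
    """Return a substitution h (as a dict) with h(pattern) == word,
    or None if none exists. Variables may be mapped to the empty word."""

    def search(i, pos, h):
        if i == len(pattern):
            return dict(h) if pos == len(word) else None
        symbol = pattern[i]
        # A terminal, or a variable that already has an image: must fit here.
        fixed = symbol if symbol in terminals else h.get(symbol)
        if fixed is not None:
            if word.startswith(fixed, pos):
                return search(i + 1, pos + len(fixed), h)
            return None
        # First occurrence of a variable: try every possible image.
        for end in range(pos, len(word) + 1):
            h[symbol] = word[pos:end]
            result = search(i + 1, end, h)
            if result is not None:
                return result
        del h[symbol]
        return None

    return search(0, 0, {})


alpha = ["x1", "a", "x1", "b", "x2", "x2"]
sigma = {"a", "b", "c"}
print(match(alpha, "bacaabacabbaba", sigma))  # x1 -> baca, x2 -> ba
print(match(alpha, "cbcabbcbbccbc", sigma))   # None
```

The two calls reproduce the positive and the negative instance from the example in the abstract.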
The Minimum Feasible Tileset problem
Yann Disser
Institut für Mathematik, TU Berlin, Germany
Stefan Kratsch
Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany
Manuel Sorge
Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany

We consider the Minimum Feasible Tileset problem: Given a set of symbols and subsets of these symbols (scenarios), find a smallest possible number of pairs of symbols (tiles) such that each scenario can be formed by selecting at most one symbol from each tile. We show that this problem is NP-complete even if each scenario contains at most three symbols. Our main result is a 4/3-approximation algorithm for the general case. In addition, we show that the Minimum Feasible Tileset problem is fixed-parameter tractable both when parameterized by the number of scenarios and when parameterized by the number of symbols.
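
The feasibility condition in the problem definition can be checked with a bipartite matching between the symbols of a scenario and the tiles. The sketch below is an illustration only and assumes nothing beyond the definition above: the names scenario_feasible and tileset_feasible are hypothetical, tiles are given as pairs of symbols, and a simple augmenting-path matching is used. It verifies a given tileset; it does not compute a minimum one and is not the 4/3-approximation algorithm from the talk.

```python
def scenario_feasible(scenario, tiles):
    """Check whether every symbol of the scenario can be taken from its own
    tile, i.e. at most one symbol is selected from each tile."""
    owner = {}  # tile index -> scenario symbol currently taken from that tile

    def assign(symbol, seen):
        # Augmenting-path step: find a tile for `symbol`, re-assigning
        # symbols that were placed earlier if necessary.
        for t, tile in enumerate(tiles):
            if symbol in tile and t not in seen:
                seen.add(t)
                if t not in owner or assign(owner[t], seen):
                    owner[t] = symbol
                    return True
        return False

    return all(assign(symbol, set()) for symbol in scenario)


def tileset_feasible(scenarios, tiles):
    """A tileset is feasible if every scenario can be formed from it."""
    return all(scenario_feasible(s, tiles) for s in scenarios)


scenarios = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
print(tileset_feasible(scenarios, [("a", "b"), ("b", "c")]))  # True
print(tileset_feasible(scenarios, [("a", "b")]))              # False
```

In the second call a single tile can contribute only one symbol, so no two-symbol scenario can be formed.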