Albert-Ludwigs-Universität, Inst. für Informatik, Prof. Dr. Fabian Kuhn
S. Daum, H. Ghodselahi, O. Saukh
November 14, 2013

Algorithm Theory, Wintersemester 2013
Problem set 2 – Sample solution

Exercise 1: Polynomial Multiplication using FFT

Compute the coefficient representation of $(p(x))^2$ by using the FFT algorithm.

\[ p(x) = x^3 - 3x^2 + 2x + 1 \]

Remark: Instead of using exact numbers for the point-wise evaluations (which involve irrational numbers) you can also round numbers to, say, 3 or more digits after the decimal point. This also reflects what happens in an implemented version of FFT, as algebraic representations do not lead to an $O(n \log n)$ running time. Those unfamiliar with complex numbers should ask fellow students for some help: calculating roots of unity and multiplying two complex numbers is all you need for this exercise.

Solution

To multiply two polynomials $p$ and $p'$ of degree $d_p$ and $d_{p'}$, respectively, by using interpolation, we must evaluate those polynomials at $d_p + d_{p'} + 1$ distinct points. In our case we have two identical polynomials of degree 3, i.e., 7 points are needed. To use FFT, rounding up to the next power of 2 is advised, as it simplifies the computation by more than the additional amount of work we need to do. Thus, from now on, $N = 8$.

The coefficient vector for our polynomial $p$ is $a = (1, 2, -3, 1)$, listed from the constant term upwards. For the upcoming multiplication of $p$ with itself we will calculate $\mathrm{DFT}_N(a)$, which is the abbreviation for $(p(\omega_8^0), \dots, p(\omega_8^7))$, i.e., the evaluation of $p$ at the 8 roots of unity. The polynomial $p^2$ has a descriptive vector of length 7, but the FFT algorithm is much clearer if all polynomials use the same length for their descriptive vector, and it should be of length $N$. Thus we redefine $a := (1, 2, -3, 1, 0, 0, 0, 0)$.

FFT

Divide: First we need to split $p$ into $p_0$ (even-indexed coefficients) and $p_1$ (odd-indexed coefficients), so that $p(x) = p_0(x^2) + x \, p_1(x^2)$. Their respective vectors are $a_0 := (1, -3, 0, 0)$ and $a_1 := (2, 1, 0, 0)$.
Those we split further until we reach descriptive vectors of length 1: $a_{00} := (1, 0)$, $a_{01} := (-3, 0)$, $a_{10} := (2, 0)$, $a_{11} := (1, 0)$ and $a_{000} := (1)$, $a_{001} := (0)$, $a_{010} := (-3)$, $a_{011} := (0)$, $a_{100} := (2)$, $a_{101} := (0)$, $a_{110} := (1)$, $a_{111} := (0)$. The recursion tree of the splits is:

(1, 2, -3, 1, 0, 0, 0, 0)
    (1, -3, 0, 0)
        (1, 0)
            (1)
            (0)
        (-3, 0)
            (-3)
            (0)
    (2, 1, 0, 0)
        (2, 0)
            (2)
            (0)
        (1, 0)
            (1)
            (0)

Now, a polynomial of degree 0 is easy to evaluate, as it represents a constant function: $\mathrm{DFT}_1(a_{000}) = p_{000}(\omega_1^0) = 1$. As you can see, we could already have stopped at the polynomials $p_{00}, \dots, p_{11}$, because they already represent polynomials of degree 0. But let's be meticulous:

\[
\begin{array}{c|c|cccccccc}
k & \omega_1^k & p_{000} & p_{001} & p_{010} & p_{011} & p_{100} & p_{101} & p_{110} & p_{111} \\ \hline
0 & 1 & 1 & 0 & -3 & 0 & 2 & 0 & 1 & 0
\end{array}
\]

Conquer: Remember that our goal in each step is to combine two polynomials of degree $n/2 - 1$ into one polynomial of degree $n - 1$, but in the point-value representation, i.e., we need to get the values of that polynomial at $n$ different roots of unity. The formula for the combination is

\[ q(\omega_n^k) = q_0\big(\omega_{n/2}^{k \bmod (n/2)}\big) + \omega_n^k \, q_1\big(\omega_{n/2}^{k \bmod (n/2)}\big). \]

E.g., $p_{10}(\omega_2^1) = p_{10}(-1) = p_{100}(\omega_1^{1 \bmod 1}) + \omega_2^1 \, p_{101}(\omega_1^{1 \bmod 1}) = 2 + (-1) \cdot 0 = 2$.

\[
\begin{array}{c|c|cccc}
k & \omega_2^k & p_{00} & p_{01} & p_{10} & p_{11} \\ \hline
0 & 1 & 1 & -3 & 2 & 1 \\
1 & -1 & 1 & -3 & 2 & 1
\end{array}
\]

The next step is slightly more complicated. E.g., $p_1(\omega_4^3) = p_1(-i) = p_{10}(\omega_2^{3 \bmod 2}) + \omega_4^3 \, p_{11}(\omega_2^{3 \bmod 2}) = p_{10}(-1) + (-i) \, p_{11}(-1) = 2 - i$.

\[
\begin{array}{c|c|cc}
k & \omega_4^k & p_0 & p_1 \\ \hline
0 & 1 & 1 + 1 \cdot (-3) & 2 + 1 \cdot 1 \\
1 & i & 1 + i \cdot (-3) & 2 + i \cdot 1 \\
2 & -1 & 1 - 1 \cdot (-3) & 2 - 1 \cdot 1 \\
3 & -i & 1 - i \cdot (-3) & 2 - i \cdot 1
\end{array}
\]

Or more nicely:

\[
\begin{array}{c|c|cc}
k & \omega_4^k & p_0 & p_1 \\ \hline
0 & 1 & -2 & 3 \\
1 & i & 1 - 3i & 2 + i \\
2 & -1 & 4 & 1 \\
3 & -i & 1 + 3i & 2 - i
\end{array}
\]

The last step involves 4 new roots of unity of the form $\frac{\pm 1 \pm i}{\sqrt{2}}$, which makes the calculation a bit more difficult. Example calculation:

\[
\begin{aligned}
p(\omega_8^5) = p\Big(\tfrac{-1-i}{\sqrt{2}}\Big)
&= p_0(\omega_4^{5 \bmod 4}) + \omega_8^5 \, p_1(\omega_4^{5 \bmod 4})
 = p_0(i) + \tfrac{-1-i}{\sqrt{2}} \, p_1(i) \\
&= 1 - 3i + \tfrac{1}{\sqrt{2}} (-1 - i)(2 + i)
 = 1 - 3i + \tfrac{1}{\sqrt{2}} (-2 - i - 2i + 1) \\
&= 1 - \tfrac{1}{\sqrt{2}} + i \cdot \Big(-3 - \tfrac{3}{\sqrt{2}}\Big)
 \approx 0.293 - 5.121i
\end{aligned}
\]

\[
\begin{array}{c|c|c|c}
k & \omega_8^k & p & p \text{ (rounded)} \\ \hline
0 & 1 & -2 + 1 \cdot 3 & 1 \\
1 & \frac{1+i}{\sqrt{2}} & 1 - 3i + \frac{1+i}{\sqrt{2}} (2 + i) & 1.707 - 0.879i \\
2 & i & 4 + i \cdot 1 & 4 + i \\
3 & \frac{-1+i}{\sqrt{2}} & 1 + 3i + \frac{-1+i}{\sqrt{2}} (2 - i) & 0.293 + 5.121i \\
4 & -1 & -2 - 1 \cdot 3 & -5 \\
5 & \frac{-1-i}{\sqrt{2}} & 1 - 3i + \frac{-1-i}{\sqrt{2}} (2 + i) & 0.293 - 5.121i \\
6 & -i & 4 - i \cdot 1 & 4 - i \\
7 & \frac{1-i}{\sqrt{2}} & 1 + 3i + \frac{1-i}{\sqrt{2}} (2 - i) & 1.707 + 0.879i
\end{array}
\]

Notice that each table contains about $N$ entries, and there are $\log_2 N + 1$ tables. Each entry can be calculated in constant time as long as we are using floating point arithmetic.

Multiplying the polynomials

Now comes the easiest part. Understand that we just changed the representation of the polynomial $p$ from its coefficient vector $a = (1, 2, -3, 1)$ to the representation

\[ \mathrm{DFT}_8(a) = (1,\ 1.707 - 0.879i,\ 4 + i,\ 0.293 + 5.121i,\ -5,\ 0.293 - 5.121i,\ 4 - i,\ 1.707 + 0.879i), \]

which is an abbreviation of the point-value representation

\[
\begin{aligned}
p \equiv \{ & (\omega_8^0, 1),\ (\omega_8^1, 1.707 - 0.879i),\ (\omega_8^2, 4 + i),\ (\omega_8^3, 0.293 + 5.121i), \\
& (\omega_8^4, -5),\ (\omega_8^5, 0.293 - 5.121i),\ (\omega_8^6, 4 - i),\ (\omega_8^7, 1.707 + 0.879i) \}
\end{aligned}
\tag{1}
\]

(the "$\mathrm{DFT}_N$" part only says in a short way which points $x_0, \dots, x_{N-1}$ are used for the evaluation). We learned in the lecture that one can multiply polynomials quickly in this representation by just multiplying the evaluated points with each other. So let's do that:

\[
\begin{array}{c|c|c}
x & p(x) & p^2(x) =: q(x) \\ \hline
\omega_8^0 & 1 & 1 \\
\omega_8^1 & 1.707 - 0.879i & 2.141 - 3i \\
\omega_8^2 & 4 + i & 15 + 8i \\
\omega_8^3 & 0.293 + 5.121i & -26.139 + 3i \\
\omega_8^4 & -5 & 25 \\
\omega_8^5 & 0.293 - 5.121i & -26.139 - 3i \\
\omega_8^6 & 4 - i & 15 - 8i \\
\omega_8^7 & 1.707 + 0.879i & 2.141 + 3i
\end{array}
\]

and in direct point-value representation:

\[
\begin{aligned}
p^2 \equiv \{ & (\omega_8^0, 1),\ (\omega_8^1, 2.141 - 3i),\ (\omega_8^2, 15 + 8i),\ (\omega_8^3, -26.139 + 3i), \\
& (\omega_8^4, 25),\ (\omega_8^5, -26.139 - 3i),\ (\omega_8^6, 15 - 8i),\ (\omega_8^7, 2.141 + 3i) \}
\end{aligned}
\tag{2}
\]

Unfortunately for us, the point-value representation of $p^2$ is hard to grasp and not very handy for most tasks (other than multiplying polynomials). That is why we want it transformed back into the vector form.

Inverse DFT

To get the descriptive vector $b$ of $p^2 =: q$ we have to solve a linear equation of the form $W \cdot b = y$, where $y_i = q(\omega_N^i)$. We saw what $W^{-1}$ looks like and we realized that there is a polynomial

\[ r(x) := y_0 + y_1 x + \dots + y_{N-1} x^{N-1}, \]

which we simply have to evaluate at the same points $\omega_8^0, \dots, \omega_8^7$ to get the values $b'_0, b'_1, \dots, b'_{N-1}$, which we can transform into the vector $b$. We again use FFT to get $b' := \mathrm{DFT}_N(y)$, which we can easily transform into $b$ by changing the order a bit and by dividing all results by $N$.

Divide:

(1, 2.141 - 3i, 15 + 8i, -26.139 + 3i, 25, -26.139 - 3i, 15 - 8i, 2.141 + 3i)
    (1, 15 + 8i, 25, 15 - 8i)
        (1, 25)
            (1)
            (25)
        (15 + 8i, 15 - 8i)
            (15 + 8i)
            (15 - 8i)
    (2.141 - 3i, -26.139 + 3i, -26.139 - 3i, 2.141 + 3i)
        (2.141 - 3i, -26.139 - 3i)
            (2.141 - 3i)
            (-26.139 - 3i)
        (-26.139 + 3i, 2.141 + 3i)
            (-26.139 + 3i)
            (2.141 + 3i)

Conquer: And again, the formula for the combination is

\[ q(\omega_n^k) = q_0\big(\omega_{n/2}^{k \bmod (n/2)}\big) + \omega_n^k \, q_1\big(\omega_{n/2}^{k \bmod (n/2)}\big). \]

E.g., $r_{10}(\omega_2^1) = r_{10}(-1) = r_{100}(\omega_1^{1 \bmod 1}) + \omega_2^1 \, r_{101}(\omega_1^{1 \bmod 1}) = 2.141 - 3i + (-1) \cdot (-26.139 - 3i) = 28.28$.

\[
\begin{array}{c|c|cccc}
k & \omega_2^k & r_{00} & r_{01} & r_{10} & r_{11} \\ \hline
0 & 1 & 26 & 30 & -24 - 6i & -24 + 6i \\
1 & -1 & -24 & 16i & 28.28 & -28.28
\end{array}
\]

Example for step 2: $r_1(\omega_4^3) = r_1(-i) = r_{10}(\omega_2^{3 \bmod 2}) + \omega_4^3 \, r_{11}(\omega_2^{3 \bmod 2}) = r_{10}(-1) + (-i) \, r_{11}(-1) = 28.28 + 28.28i$.

\[
\begin{array}{c|c|cc}
k & \omega_4^k & r_0 & r_1 \\ \hline
0 & 1 & 26 + 30 & -24 - 6i + (-24 + 6i) \\
1 & i & -24 + i \cdot 16i & 28.28 + i \cdot (-28.28) \\
2 & -1 & 26 - 30 & -24 - 6i - (-24 + 6i) \\
3 & -i & -24 - i \cdot 16i & 28.28 - i \cdot (-28.28)
\end{array}
\]

Or more nicely:

\[
\begin{array}{c|c|cc}
k & \omega_4^k & r_0 & r_1 \\ \hline
0 & 1 & 56 & -48 \\
1 & i & -40 & 28.28 - 28.28i \\
2 & -1 & -4 & -12i \\
3 & -i & -8 & 28.28 + 28.28i
\end{array}
\]

And the last step:

\[
\begin{array}{c|c|c|c}
k & \omega_8^k & r \text{ (computation)} & r \\ \hline
0 & 1 & 56 - 48 & 8 \\
1 & \frac{1+i}{\sqrt{2}} & -40 + \frac{1+i}{\sqrt{2}} (28.28 - 28.28i) & 0 \\
2 & i & -4 + i \cdot (-12i) & 8 \\
3 & \frac{-1+i}{\sqrt{2}} & -8 + \frac{-1+i}{\sqrt{2}} (28.28 + 28.28i) & -48 \\
4 & -1 & 56 - (-48) & 104 \\
5 & \frac{-1-i}{\sqrt{2}} & -40 - \frac{1+i}{\sqrt{2}} (28.28 - 28.28i) & -80 \\
6 & -i & -4 - i \cdot (-12i) & -16 \\
7 & \frac{1-i}{\sqrt{2}} & -8 - \frac{-1+i}{\sqrt{2}} (28.28 + 28.28i) & 32
\end{array}
\]

Thus, $b' = \mathrm{DFT}_8(y) = (8, 0, 8, -48, 104, -80, -16, 32)$. To get $b$ we have to divide all results by $N = 8$ and reverse the order of all elements, except the first one:

\[ b = \tfrac{1}{N} (b'_0, b'_{N-1}, b'_{N-2}, \dots, b'_2, b'_1) = (1, 4, -2, -10, 13, -6, 1, 0). \]
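The hand computation above can be cross-checked with a short script. The following is a minimal sketch, not the lecture's reference implementation: the function names `fft` and `square_polynomial` are chosen here for illustration, and the recursion uses the same combination formula and the same roots of unity $\omega_n^k = e^{2\pi i k / n}$ as the tables.

```python
import cmath

def fft(a):
    """Evaluate the polynomial with coefficient list a (constant term first)
    at the roots of unity omega_n^k = exp(2*pi*i*k/n), k = 0..n-1.
    Assumes len(a) is a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])  # values of p_0 at the (n/2)-th roots of unity
    odd = fft(a[1::2])   # values of p_1 at the (n/2)-th roots of unity
    half = n // 2
    values = []
    for k in range(n):
        w = cmath.exp(2j * cmath.pi * k / n)
        # q(w_n^k) = q_0(w_{n/2}^{k mod n/2}) + w_n^k * q_1(w_{n/2}^{k mod n/2})
        values.append(even[k % half] + w * odd[k % half])
    return values

def square_polynomial(coeffs, n):
    """Square a polynomial with integer coefficients via DFT, point-wise
    multiplication and inverse DFT; n is a power of two >= 2*deg + 1."""
    a = list(coeffs) + [0] * (n - len(coeffs))
    y = [v * v for v in fft(a)]          # point-value representation of p^2
    b_prime = fft(y)                     # b' = DFT_N(y)
    b = [b_prime[0]] + b_prime[:0:-1]    # (b'_0, b'_{N-1}, ..., b'_1)
    return [round((x / n).real) for x in b]

dft_a = fft([1, 2, -3, 1, 0, 0, 0, 0])
q = square_polynomial([1, 2, -3, 1], 8)
print(q)  # [1, 4, -2, -10, 13, -6, 1, 0]
```

The loop recomputes each root of unity with `cmath.exp` for readability; a tuned implementation would reuse the symmetry $\omega_n^{k + n/2} = -\omega_n^k$ instead.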
The final result is:

\[ q(x) = p^2(x) = x^6 - 6x^5 + 13x^4 - 10x^3 - 2x^2 + 4x + 1 \]

Exercise 2: Distribution of Sum

Let $A$ and $B$ be two sets of integers between $0$ and $n - 1$, i.e., $A, B \subseteq \{0, \dots, n - 1\}$. We define two random variables $X$ and $Y$, where $X$ is obtained by choosing a number uniformly at random from $A$ and $Y$ is obtained by choosing a number uniformly at random from $B$. We further define the random variable $Z = X + Y$. Note that the random variable $Z$ can take values from the range $\{0, \dots, 2n - 2\}$.

(a) Give a simple $O(n^2)$ algorithm to compute the distribution of $Z$. Hence, the algorithm should compute the probability $\Pr(Z = z)$ for all $z \in \{0, \dots, 2n - 2\}$. How fast can your algorithm be in case $A$ and $B$ have a low cardinality?

(b) Can you get a more efficient algorithm to compute the distribution of $Z$? You can use algorithms discussed in the lecture as a black box. What is the time complexity of your algorithm?

Hint: Try to represent $A$ and $B$ using polynomials.

Solution

(a) One solution would be:

    Initialize: Z[i] ← 0 for all i ∈ {0, ..., 2n − 2}
    for all a ∈ A do
        for all b ∈ B do
            Z[a + b] ← Z[a + b] + 1
    N ← |A||B|
    for all i ∈ {0, ..., 2n − 2} do
        Z[i] ← Z[i]/N

While $A$ and $B$ might not be stored as arrays, the distribution $Z$ is probably best accessed as an array, and thus we cannot trivially speed up the initialization part, which needs $O(n)$ steps. The two nested for-loops need $O(|A||B|)$ time, which can be less than $O(n)$; since $|A|, |B| \le n$, this is at most $O(n^2)$, which satisfies the requirement. The whole algorithm runs in $O(n + |A||B|)$ time.

(b) Consider the indicator function $1_A(x) : \{0, \dots, n - 1\} \to \{0, 1\}$, which evaluates to 1 if $x \in A$ and to 0 otherwise. For $i = 0, \dots, n - 1$ define $a_i := 1_A(i)$, the 0-1 indicator vector of $A$. Similarly define $b_i := 1_B(i)$. Define the polynomial $\alpha(x) := \sum_{i=0}^{n-1} a_i x^i$, and $\beta(x)$ similarly.

Then $\gamma(x) = \alpha(x) \beta(x) = \sum_{k=0}^{2n-2} c_k x^k$ is a polynomial such that $c_k$ is the number of pairs $(a_i, b_j)$ with $a_i = b_j = 1$ for which $i + j = k$.
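To make the meaning of the coefficients $c_k$ concrete, here is a small sketch comparing the double loop from (a) with the product of the indicator polynomials from (b). The function names and the example sets are illustrative assumptions, not part of the exercise; the FFT is a plain recursive version standing in for the lecture's black box.

```python
import cmath

def distribution_naive(A, B, n):
    """Double loop from part (a): O(n + |A||B|) time."""
    Z = [0] * (2 * n - 1)
    for a in A:
        for b in B:
            Z[a + b] += 1
    total = len(A) * len(B)
    return [z / total for z in Z]

def fft(a, invert=False):
    """Recursive FFT; with invert=True it uses the conjugate roots of unity
    (the caller still has to divide by len(a))."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def distribution_fft(A, B, n):
    """Part (b): multiply the indicator polynomials alpha and beta."""
    size = 1
    while size < 2 * n - 1:      # next power of two that holds gamma
        size *= 2
    alpha = [0] * size
    beta = [0] * size
    for a in A:
        alpha[a] = 1
    for b in B:
        beta[b] = 1
    product = [x * y for x, y in zip(fft(alpha), fft(beta))]
    c = [v.real / size for v in fft(product, invert=True)]
    total = len(A) * len(B)
    # c_k is an integer count, so rounding removes floating point noise
    return [round(c[k]) / total for k in range(2 * n - 1)]

A, B, n = {0, 1, 3}, {1, 2}, 4
print(distribution_naive(A, B, n))
print(distribution_fft(A, B, n))
```

For these example sets the six possible sums are 1, 2, 2, 3, 4, 5, so both functions should report probability 2/6 for $z = 2$, probability 1/6 for $z \in \{1, 3, 4, 5\}$, and 0 otherwise.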
We can calculate $\gamma$ quickly using FFT (DFT, multiplication, inverse DFT), i.e., in $O(n \log n)$ time. To get the distribution of $Z$ we only have to divide all coefficients $c_k$ by $|A||B|$.

Exercise 3: Scheduling

The wildly popular Spanish search engine "El Goog" needs to do a serious amount of computation every time it recompiles its index. Fortunately, the company has a single supercomputer (SC) at its disposal, together with an essentially unlimited supply of high-end PCs. The computation can be broken into $n$ distinct jobs $J_1, J_2, \dots, J_n$. Each job $J_i$ needs to be preprocessed for $p_i$ time units on the SC, before it can (and should) be finished within $f_i$ time units on one of the PCs. Due to the large number of PCs, all the finishing of the jobs can be done in parallel; however, the SC has to run the jobs sequentially. El Goog needs a scheduling of the jobs on the SC that minimizes the completion time of the computation, which is the earliest point in time at which all jobs have finished processing on the PCs. Find a fast algorithm that computes an optimal scheduling and prove its correctness.

Solution

First let's talk notation. Let $S = (s_1, \dots, s_n)$ be any scheduling, i.e., $s_k$ is the job that is preprocessed in position $k$. We define $t_S(k) := \sum_{i=1}^{k} p_{s_i} + f_{s_k}$, the time at which the job in position $k$ is finished. Then the completion time $T$ of scheduling $S$ equals $T(S) = \max_{k \in [n]} t_S(k)$, where $[n] := \{1, \dots, n\}$.

The sought-after assignment is of the greedy type. We name the greedy solution $G = (g_1, \dots, g_n)$ and we acquire it by sorting the jobs by descending finishing time, i.e., $f_{g_1} \ge f_{g_2} \ge \dots \ge f_{g_n}$, and feeding them in this order to the supercomputer. Sorting takes $O(n \log n)$ time.

To prove the optimality of this solution $G$ we compare it to any solution $S$, and in case they differ, we change $S$ to $S'$ such that $S'$ is "closer" to $G$ than $S$, but $T(S') \le T(S)$. If $S$ is optimal, then after a finite number of steps we reach $G$ without increasing the completion time, thus proving optimality.

Let $w$ be the highest index such that $s_w \ne g_w$. Since $s_k = g_k$ for all $k > w$, the positions $1, \dots, w$ hold the same set of jobs in $S$ and in $G$, and by construction $g_w$ has the smallest finishing time among them. In particular $f_{g_w} \le f_{s_w}$. Let $v$ be the index such that $s_v = g_w$ and note that $v < w$ (otherwise $v$ would be the highest index for which $S$ and $G$ differ, a contradiction to our choice of $w$). We now move job $s_v$ to position $w$ and shift the jobs $s_{v+1}, \dots, s_w$ one position to the front. This gives the scheduling

\[ S' = (s_1, \dots, s_{v-1}, s_{v+1}, \dots, s_w, s_v, s_{w+1}, \dots, s_n) =: (s'_1, \dots, s'_n). \]

(Simply swapping $s_v$ and $s_w$ is not enough: if $p_{s_w} > p_{s_v}$, the jobs between the two positions would start later than before.)

For the completion time we have $T(S') = \max_{k \in [n]} t_{S'}(k)$. For $k < v$ and for $k > w$ the same jobs precede position $k$ in both schedulings, so $t_{S'}(k) = t_S(k)$. For $v \le k < w$ we have $s'_k = s_{k+1}$, and this job now starts $p_{s_v}$ time units earlier:

\[ t_{S'}(k) = t_S(k+1) - p_{s_v} \le t_S(k+1). \]

For position $w$ we use $s'_w = s_v = g_w$ and $f_{g_w} \le f_{s_w}$:

\[ t_{S'}(w) = \sum_{i=1}^{w} p_{s_i} + f_{s_v} \le \sum_{i=1}^{w} p_{s_i} + f_{s_w} = t_S(w). \]

Hence every value $t_{S'}(k)$ is at most $T(S)$, i.e., $T(S') \le T(S)$. This concludes the proof, as with each such transformation the value $w := \max \{k : s_k \ne g_k\}$ decreases until (after a finite number of steps) we have $S' = G$.

Exercise 4: Matroids

We have defined matroids in the lecture. For a matroid $(E, I)$, a maximal independent set $S \in I$ is an independent set that cannot be extended. Thus, for every element $e \in E \setminus S$, the set $S \cup \{e\} \notin I$.

a) Show that all maximal independent sets of a matroid $(E, I)$ have the same size. (This size is called the rank of a matroid.)

b) Consider the following greedy algorithm: The algorithm starts with an empty independent set $S = \emptyset$. Then, in each step the algorithm extends $S$ by the minimum weight element $e \in E \setminus S$ such that $S \cup \{e\} \in I$, until $S$ is a maximal independent set. Show that the algorithm computes a maximal independent set of minimum weight.

c) For a graph $G = (V, E)$, a subset $F \subseteq E$ of the edges is called a forest iff (if and only if) it does not contain a cycle. Let $\mathcal{F}$ be the set of all forests of $G$. Show that $(E, \mathcal{F})$ is a matroid. What are the maximal independent sets of this matroid?

Solution

a) Assume that there are two maximal independent sets $S$ and $T$ with $|S| \ne |T|$. Without loss of generality (w.l.o.g.) we assume that $|S| < |T|$.
The exchange property tells us that there exists an element $x \in T \setminus S$ such that $S \cup \{x\}$ is still independent, which is a contradiction to the maximality of $S$.

b) Let $S = (e_1, \dots, e_r)$ be the greedy solution and $T = (f_1, \dots, f_r)$ be any other solution, where $r$ is the rank of the matroid. Note that by a) both $S$ and $T$ need to have cardinality $r$ to be maximal. For the sake of contradiction assume that $w(T) < w(S)$, i.e., $T$ has a lower weight than $S$. We also assume that the $e_i$ and the $f_i$ are already ordered by increasing weight. We now let $k$ be the smallest index such that $w(f_k) < w(e_k)$. Such an index exists because $w(T) < w(S)$; for smaller indices $i < k$ we only know $w(e_i) \le w(f_i)$, equality need not hold. Consider the sets $S_{k-1} := \{e_1, \dots, e_{k-1}\}$ and $T_k := \{f_1, \dots, f_k\}$. Both are independent, being subsets of independent sets, and $|T_k| > |S_{k-1}|$. Using the exchange property we get that there is a $j \in [k]$ such that $f_j \notin S_{k-1}$ and $S_{k-1} \cup \{f_j\}$ is independent. Since the $f_i$ are ordered we know that $w(f_j) \le w(f_k) < w(e_k)$. But that means that in step $k$ the greedy algorithm skipped element $f_j$ even though $f_j$ has less weight than $e_k$ and it would not have violated the independence of $S_{k-1}$, a contradiction to the way the algorithm works.

c) We need to show 3 properties to prove that $(E, \mathcal{F})$ is a matroid.

1) The empty set $\emptyset \subseteq E$ does not contain any cycles and therefore is a forest and thus in $\mathcal{F}$.

2) Let $F$ be a forest and $F' \subseteq F$. If $F'$ contained a cycle, then clearly $F$ would contain the same cycle, which contradicts the assumption that $F$ is a forest. Hence $F' \in \mathcal{F}$.

3) Showing the exchange property is not trivial.

Notation: for any subset $F$ of $E$ let $V[F]$ be the set of vertices that occur in $F$, i.e., $V[F] = \bigcup_{e \in F} e$ (note that for undirected graphs this is indeed a set of vertices and not a set of edges).

Assume we have two forests $F$ and $F'$ with $|F'| > |F|$, but the exchange property does not hold, i.e., adding any edge $e \in F' \setminus F$ to $F$ would close a cycle. In graphs, a tree has no cycle and thus is a forest. Conversely, a forest is a collection of trees. Now let's split $F$ into its disjoint trees $T_1, \dots, T_m$, such that $V[T_i] \cap V[T_j] = \emptyset$ for all $i \ne j \in [m]$.

Assume that there is an edge $e = \{v_1, v_2\} \in F' \setminus F$ such that $v_1 \notin V[F]$. Then we could safely add $e$, a contradiction. Thus assume that both $v_1$ and $v_2$ are in $V[F]$, but let them lie in different trees, i.e., $v_1 \in V[T_i]$ and $v_2 \in V[T_j]$ for some $i \ne j$. But then again adding $e$ just combines two trees into one without closing a cycle. Thus, for all edges $e \in F' \setminus F$, both vertices must lie in the same vertex set $V[T_{i_e}]$ for some $i_e \in [m]$.

Let $E'[V[T_i]] := \{e \in F' : e \subseteq V[T_i]\}$, i.e., the set of edges of $F'$ that belong to the vertex set that is spanned by $T_i$. Since $\bigcup_{i=1}^{m} E'[V[T_i]] = F'$ and $|F'| > |F| = \sum_{i=1}^{m} |T_i|$, there must be an $i$ for which $|E'[V[T_i]]| > |T_i| = |V[T_i]| - 1$. But then the edges of $F'$ in that component are more than $|V[T_i]| - 1$, i.e., they exceed the maximum size of a tree in that component; thus, there exists a cycle in $F'$. This contradicts the forest property of $F'$ and finishes our proof.

c) - Alternative solution

We again prove the exchange property by contradiction. A tree is uniquely defined via its set of edges $T \subseteq E$, but for this solution let's also root the tree, i.e., define a vertex $v$ as the unique root of a tree. In this scenario it makes sense to look at trees without edges, only consisting of a single root node. Let $T = (E_T, r_T)$ be such a representation, with $E_T$ the edges and $r_T$ the root of the tree $T$. Furthermore define $|T| := |E_T| + 1$, i.e., the number of vertices in $T$.

Consider a forest $F$ and split it into all its disjoint trees, with every vertex of $V$ that is not incident to any edge of $F$ being an empty tree ($E_T = \emptyset$) and itself being the root. Let $\mathcal{T}$ be the set of those trees.

Claim: $|\mathcal{T}| = |V| - |F|$.

The proof is simply done via induction, starting with $F = \emptyset$, implying $|\mathcal{T}| = |V|$, and then realizing that adding an edge without closing a cycle connects exactly two trees which have different roots, of which one has to give up its status.
Thus, if we assume that $|F| < |F'|$, then analogously $|\mathcal{T}'| < |\mathcal{T}|$, i.e., there are more trees in $F$. But then there must be at least one tree $T'$ in $F'$ such that at least two of its vertices $u$ and $v$ belong to two different trees $T_u$ and $T_v$ in $F$. Within $T'$ there exists a path from $u$ to $v$ and thus there must be an edge $\{u_1, u_2\}$ on this path such that $u_1$ is a vertex that belongs to $T_u$ and $u_2$ does not. This edge lies in $F' \setminus F$ and can be safely added to $F$ without closing a cycle.

The maximal independent sets are the spanning forests, i.e., the unions of spanning trees of all connected components of $G$. The rank of the corresponding matroid is $n - k$, where $n = |V|$ and $k$ is the number of connected components of $G$ (for connected $G$ we have $k = 1$).
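The rank formula $n - k$ can be observed on a small example. The sketch below is only an illustration under assumed names (`maximal_forest`, the example graph): it grows a forest greedily, using a union-find structure to test whether an edge would close a cycle, which is the independence test of the forest matroid.

```python
def maximal_forest(n, edges):
    """Greedily extend the empty forest by every edge that keeps it acyclic.
    Vertices are 0..n-1, edges is a list of pairs (u, v)."""
    parent = list(range(n))

    def find(v):
        # Root of the tree containing v, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = []
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # u and v lie in different trees: no cycle
            parent[root_u] = root_v   # the two trees merge, one root gives up its status
            forest.append((u, v))
    return forest

# Example graph: a triangle {0, 1, 2}, an edge {3, 4}, and the isolated vertex 5.
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
forest = maximal_forest(6, edges)
print(forest)       # [(0, 1), (1, 2), (3, 4)]
print(len(forest))  # 3 = n - k with n = 6 vertices and k = 3 components
```

Processing the edges in a different order may select different edges, but by part a) the resulting forest always has the same size. Sorting the edges by weight first turns the loop into the greedy algorithm from part b) for this matroid.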