Albert-Ludwigs-Universität, Inst. für Informatik, Prof. Dr. Fabian Kuhn

S. Daum, H. Ghodselahi, O. Saukh
November 14, 2013
Algorithm Theory
Winter Semester 2013
Problem set 2 – Sample solution
Exercise 1: Polynomial Multiplication using FFT
Compute the coefficient representation of $(p(x))^2$ by using the FFT algorithm.
$p(x) = x^3 - 3x^2 + 2x + 1$
Remark: Instead of using exact numbers for the point-wise evaluations (which involve irrational
numbers) you can also round numbers to, say, 3+ digits after the decimal point. This also reflects
what happens in an implemented version of FFT, as algebraic representations do not lead to an
$O(n \log n)$ running time. Those unfamiliar with complex numbers should ask fellow students for some
help: calculating roots of unity and multiplying two complex numbers is all you need for this exercise.
Solution
To multiply two polynomials $p$ and $p'$ of degree $d_p$ and $d_{p'}$, respectively, by using interpolation, we
must evaluate those polynomials at $d_p + d_{p'} + 1$ distinct points. In our case we have two identical
polynomials of degree 3, i.e., 7 points are needed. To use FFT, rounding up to the next power of 2 is
advised, as it simplifies the computation by more than the additional amount of work we need to do.
Thus, from now on, $N = 8$.
The coefficient vector for our polynomial $p$ is $a = (1, 2, -3, 1)$. For the upcoming multiplication of
$p$ with itself we will calculate $\mathrm{DFT}_N(a)$, which is the abbreviation for $(p(\omega_8^0), \dots, p(\omega_8^7))$, i.e., the
evaluation of $p$ at the 8 roots of unity. The polynomial $p^2$ has a descriptive vector of length 7, but the
FFT algorithm is much clearer if all polynomials use the same length of their descriptive vector, and
it should be of length $N$, thus we redefine $a := (1, 2, -3, 1, 0, 0, 0, 0)$.
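To make the meaning of $\mathrm{DFT}_N(a)$ concrete, the following minimal Python sketch evaluates $p$ directly at the 8 roots of unity, without any divide-and-conquer. The helper names `roots_of_unity` and `evaluate` are chosen for this illustration only; the direct evaluation takes $O(N^2)$ time and merely serves as a reference for the values that the FFT computes faster.

```python
import cmath

def roots_of_unity(n):
    """Return the n complex n-th roots of unity, ordered by exponent k = 0..n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x via Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

a = [1, 2, -3, 1, 0, 0, 0, 0]   # p(x) = x^3 - 3x^2 + 2x + 1, padded to length N = 8
naive_dft = [evaluate(a, w) for w in roots_of_unity(len(a))]
for k, value in enumerate(naive_dft):
    print(k, f"{value.real:.3f} {value.imag:+.3f}i")
```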
FFT
Divide: First we need to split $p$ into $p_0$ and $p_1$. Their respective vectors are $a_0 := (1, -3, 0, 0)$
and $a_1 := (2, 1, 0, 0)$. Those we split further until we reach a descriptive vector of length 1:
$a_{00} := (1, 0)$, $a_{01} := (-3, 0)$, $a_{10} := (2, 0)$, $a_{11} := (1, 0)$ and $a_{000} := (1)$, $a_{001} := (0)$,
$a_{010} := (-3)$, $a_{011} := (0)$, $a_{100} := (2)$, $a_{101} := (0)$, $a_{110} := (1)$, $a_{111} := (0)$. Now, a polynomial of degree 0 is easy to
calculate, as it represents a constant function: $\mathrm{DFT}_1(a_{000}) = p_{000}(\omega_1^0) = 1$. As you can see, we could
already have stopped at polynomials $p_{00}, \dots, p_{11}$, because they already represented polynomials of
degree 0. But let's be meticulous:
(1, 2, −3, 1, 0, 0, 0, 0)
    (1, −3, 0, 0)
        (1, 0)
            (1)
            (0)
        (−3, 0)
            (−3)
            (0)
    (2, 1, 0, 0)
        (2, 0)
            (2)
            (0)
        (1, 0)
            (1)
            (0)

k   $\omega_1^k$   $p_{000}$   $p_{001}$   $p_{010}$   $p_{011}$   $p_{100}$   $p_{101}$   $p_{110}$   $p_{111}$
0   1              1           0           −3          0           2           0           1           0
Conquer: Remember that our goal in each step is to combine two polynomials of degree $n/2 - 1$ to
one polynomial of degree $n - 1$, but in the point-value representation, i.e., we need to get the values
of that polynomial at $n$ different roots of unity. The formula for combination is
$$q(\omega_n^k) = q_0\left(\omega_{n/2}^{k \bmod n/2}\right) + \omega_n^k \, q_1\left(\omega_{n/2}^{k \bmod n/2}\right).$$
E.g., $p_{10}(\omega_2^1) = p_{10}(-1) = p_{100}(\omega_1^{1 \bmod 1}) + \omega_2^1 \, p_{101}(\omega_1^{1 \bmod 1}) = 2 + (-1) \cdot 0 = 2$.
k   $\omega_2^k$   $p_{00}$   $p_{01}$   $p_{10}$   $p_{11}$
0   1              1          −3         2          1
1   −1             1          −3         2          1
The next step is slightly more complicated. E.g.,
$$p_1(\omega_4^3) = p_1(-i) = p_{10}(\omega_2^{3 \bmod 2}) + \omega_4^3 \, p_{11}(\omega_2^{3 \bmod 2}) = p_{10}(-1) + (-i) \, p_{11}(-1) = 2 - i$$
k   $\omega_4^k$   $p_0$            $p_1$
0   1              1 + 1 · (−3)     2 + 1 · 1
1   i              1 + i · (−3)     2 + i · 1
2   −1             1 − 1 · (−3)     2 − 1 · 1
3   −i             1 − i · (−3)     2 − i · 1
Or more nicely:
k   $\omega_4^k$   $p_0$     $p_1$
0   1              −2        3
1   i              1 − 3i    2 + i
2   −1             4         1
3   −i             1 + 3i    2 − i
The last step involves 4 new roots of unity of the form $\frac{\pm 1 \pm i}{\sqrt{2}}$, which makes calculation a bit more
difficult.
Example calculation:
$$p(\omega_8^5) = p\left(\tfrac{-1-i}{\sqrt{2}}\right) = p_0(\omega_4^{5 \bmod 4}) + \omega_8^5 \, p_1(\omega_4^{5 \bmod 4}) = p_0(i) + \tfrac{-1-i}{\sqrt{2}} \, p_1(i)$$
$$= 1 - 3i + \tfrac{1}{\sqrt{2}}(-1-i)(2+i) = 1 - 3i + \tfrac{1}{\sqrt{2}}(-2 - i - 2i + 1) = 1 - \tfrac{1}{\sqrt{2}} + i \cdot \left(-3 - \tfrac{3}{\sqrt{2}}\right) \approx 0.293 - 5.121i$$
k   $\omega_8^k$   $p$                                $p$ (rounded)
0   1              −2 + 1 · 3                         1
1   (1+i)/√2       1 − 3i + (1+i)/√2 · (2 + i)        1.707 − 0.879i
2   i              4 + i · 1                          4 + i
3   (−1+i)/√2      1 + 3i + (−1+i)/√2 · (2 − i)       0.293 + 5.121i
4   −1             −2 − 1 · 3                         −5
5   (−1−i)/√2      1 − 3i + (−1−i)/√2 · (2 + i)       0.293 − 5.121i
6   −i             4 − i · 1                          4 − i
7   (1−i)/√2       1 + 3i + (1−i)/√2 · (2 − i)        1.707 + 0.879i
Notice that each table contains about $N$ entries. Each of those entries can be calculated in constant
time as long as we are using floating point arithmetic, so each table costs $O(N)$ time.
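The divide and conquer steps above can be sketched as a short recursive function. This is a minimal illustration, assuming the input length is a power of two and using floating point arithmetic; the function name `fft` is chosen here and is not taken from the lecture.

```python
import cmath

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    Returns [p(w^0), ..., p(w^(n-1))] for w = exp(2*pi*i/n), where a holds
    the coefficients of p, lowest degree first.
    """
    n = len(a)
    if n == 1:
        return list(a)              # a constant polynomial evaluates to itself
    even = fft(a[0::2])             # divide: p_0 from the even-indexed coefficients
    odd = fft(a[1::2])              # divide: p_1 from the odd-indexed coefficients
    half = n // 2
    result = []
    for k in range(n):
        w = cmath.exp(2j * cmath.pi * k / n)
        # conquer: q(w_n^k) = q_0(w_{n/2}^(k mod n/2)) + w_n^k * q_1(w_{n/2}^(k mod n/2))
        result.append(even[k % half] + w * odd[k % half])
    return result

dft_a = fft([1, 2, -3, 1, 0, 0, 0, 0])
for k, value in enumerate(dft_a):
    print(k, f"{value.real:.3f} {value.imag:+.3f}i")
```

Each level of the recursion does a constant amount of work per entry, which mirrors the observation that each table has about $N$ entries.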
Multiplying the polynomials
Now comes the easiest part. Understand that we just changed the representation of the polynomial
$p$ from its coefficient vector $a = (1, 2, -3, 1)$ to the representation
$\mathrm{DFT}_8(a) = (1,\ 1.707 - 0.879i,\ 4 + i,\ 0.293 + 5.121i,\ -5,\ 0.293 - 5.121i,\ 4 - i,\ 1.707 + 0.879i)$,
which is an abbreviation of the point-value representation
$$p \equiv \{(\omega_8^0, 1),\ (\omega_8^1, 1.707 - 0.879i),\ (\omega_8^2, 4 + i),\ (\omega_8^3, 0.293 + 5.121i),$$
$$(\omega_8^4, -5),\ (\omega_8^5, 0.293 - 5.121i),\ (\omega_8^6, 4 - i),\ (\omega_8^7, 1.707 + 0.879i)\} \tag{1}$$
(the "$\mathrm{DFT}_N$" part only says in a short way which points $x_0, \dots, x_{N-1}$ are used for the evaluation).
We learned in the lecture that one can multiply polynomials quickly in this representation by just
multiplying the values at corresponding points with each other. So let's do that:
$x$            $p(x)$            $p^2(x) =: q$
$\omega_8^0$   1                 1
$\omega_8^1$   1.707 − 0.879i    2.141 − 3i
$\omega_8^2$   4 + i             15 + 8i
$\omega_8^3$   0.293 + 5.121i    −26.139 + 3i
$\omega_8^4$   −5                25
$\omega_8^5$   0.293 − 5.121i    −26.139 − 3i
$\omega_8^6$   4 − i             15 − 8i
$\omega_8^7$   1.707 + 0.879i    2.141 + 3i
and in direct point-value representation:
$$p^2 \equiv \{(\omega_8^0, 1),\ (\omega_8^1, 2.141 - 3i),\ (\omega_8^2, 15 + 8i),\ (\omega_8^3, -26.139 + 3i),$$
$$(\omega_8^4, 25),\ (\omega_8^5, -26.139 - 3i),\ (\omega_8^6, 15 - 8i),\ (\omega_8^7, 2.141 + 3i)\} \tag{2}$$
Unfortunately for us, the point-value representation of $p^2$ is hard to grasp and not very handy for various
tasks (other than multiplying polynomials). That is why we want it transformed back into the vector
form.
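The point-wise multiplication is a single pass over the evaluated points. The sketch below squares the rounded values of $\mathrm{DFT}_8(a)$ from the table; the variable names `dft_a` and `y` are chosen for this illustration, and because the inputs are already rounded, the results agree with the table only up to rounding.

```python
# Rounded values of DFT_8(a), in the order w^0, ..., w^7.
dft_a = [1, 1.707 - 0.879j, 4 + 1j, 0.293 + 5.121j,
         -5, 0.293 - 5.121j, 4 - 1j, 1.707 + 0.879j]

# Point-wise product: q(w^k) = p(w^k) * p(w^k), one multiplication per point.
y = [v * v for v in dft_a]
for k, value in enumerate(y):
    print(k, f"{value.real:.3f} {value.imag:+.3f}i")
```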
Inverse DFT
To get the descriptive vector $b$ of $p^2 =: q$ we have to solve a linear equation of the form $W \cdot b = y$,
where $y_i = q(\omega_N^i)$. We saw what $W^{-1}$ looks like and we realized that there is a polynomial
$r(x) := y_0 + y_1 x + \dots + y_{N-1} x^{N-1}$, which we simply have to evaluate at the same points $\omega_8^0, \dots, \omega_8^7$ to
get the values $b'_0, b'_1, \dots, b'_{N-1}$, which we can transform into the vector $b$.
We again use FFT to get $b' := \mathrm{DFT}_N(y)$, which we can easily transform into $b$ by changing the order
a bit and by dividing all results by $N$.
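Why a reordering and a division by $N$ suffice can be read off from the inverse matrix. The short derivation below is a sketch that assumes the standard form of the DFT matrix, $W_{ij} = \omega_N^{ij}$ with inverse $(W^{-1})_{ij} = \omega_N^{-ij}/N$, for indices $i, j \in \{0, \dots, N-1\}$:

```latex
\begin{align*}
b_i &= \sum_{j=0}^{N-1} (W^{-1})_{ij}\, y_j
     = \frac{1}{N} \sum_{j=0}^{N-1} y_j \left(\omega_N^{-i}\right)^{j}
     = \frac{1}{N}\, r\!\left(\omega_N^{-i}\right) \\
    &= \frac{1}{N}\, r\!\left(\omega_N^{(N-i) \bmod N}\right)
     = \frac{1}{N}\, b'_{(N-i) \bmod N}.
\end{align*}
```

So $b_0 = b'_0 / N$, and for $i \ge 1$ the entry $b_i$ is found at position $N - i$ of $b'$.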
Divide:
(1, 2.141 − 3i, 15 + 8i, −26.139 + 3i, 25, −26.139 − 3i, 15 − 8i, 2.141 + 3i)
    (1, 15 + 8i, 25, 15 − 8i)
        (1, 25)
            (1)
            (25)
        (15 + 8i, 15 − 8i)
            (15 + 8i)
            (15 − 8i)
    (2.141 − 3i, −26.139 + 3i, −26.139 − 3i, 2.141 + 3i)
        (2.141 − 3i, −26.139 − 3i)
            (2.141 − 3i)
            (−26.139 − 3i)
        (−26.139 + 3i, 2.141 + 3i)
            (−26.139 + 3i)
            (2.141 + 3i)
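Conquer: And again, the formula for combination is
$$q(\omega_n^k) = q_0\left(\omega_{n/2}^{k \bmod n/2}\right) + \omega_n^k \, q_1\left(\omega_{n/2}^{k \bmod n/2}\right).$$
E.g., $r_{10}(\omega_2^1) = r_{10}(-1) = r_{100}(\omega_1^{1 \bmod 1}) + \omega_2^1 \, r_{101}(\omega_1^{1 \bmod 1}) = 2.141 - 3i + (-1) \cdot (-26.139 - 3i) = 28.28$.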
k   $\omega_2^k$   $r_{00}$   $r_{01}$   $r_{10}$     $r_{11}$
0   1              26         30         −24 − 6i     −24 + 6i
1   −1             −24        16i        28.28        −28.28
Example for step 2:
$$r_1(\omega_4^3) = r_1(-i) = r_{10}(\omega_2^{3 \bmod 2}) + \omega_4^3 \, r_{11}(\omega_2^{3 \bmod 2}) = r_{10}(-1) + (-i) \, r_{11}(-1) = 28.28 + 28.28i$$
k   $\omega_4^k$   $r_0$              $r_1$
0   1              26 + 30            −24 − 6i − 24 + 6i
1   i              −24 + i · 16i      28.28 + i · (−28.28)
2   −1             26 − 30            −24 − 6i − (−24 + 6i)
3   −i             −24 − i · 16i      28.28 − i · (−28.28)
Or more nicely:
k   $\omega_4^k$   $r_0$   $r_1$
0   1              56      −48
1   i              −40     28.28 − 28.28i
2   −1             −4      −12i
3   −i             −8      28.28 + 28.28i
And the last step:
k   $\omega_8^k$   $r$ (computation)                           $r$
0   1              56 − 48                                     8
1   (1+i)/√2       −40 + (1+i)/√2 · (28.28 − 28.28i)           0
2   i              −4 + i · (−12i)                             8
3   (−1+i)/√2      −8 + (−1+i)/√2 · (28.28 + 28.28i)           −48
4   −1             56 − (−48)                                  104
5   (−1−i)/√2      −40 − (1+i)/√2 · (28.28 − 28.28i)           −80
6   −i             −4 − i · (−12i)                             −16
7   (1−i)/√2       −8 − (−1+i)/√2 · (28.28 + 28.28i)           32
Thus, $b' = \mathrm{DFT}_8(y) = (8, 0, 8, -48, 104, -80, -16, 32)$. To get $b$ we have to divide all results by
$N = 8$ and reverse the order of all elements, except the first one:
$b = \frac{1}{N}(b'_0, b'_{N-1}, b'_{N-2}, \dots, b'_2, b'_1) = (1, 4, -2, -10, 13, -6, 1, 0)$.
The final result is:
$$q(x) = p^2(x) = x^6 - 6x^5 + 13x^4 - 10x^3 - 2x^2 + 4x + 1$$
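The reordering step and the final result can be double-checked with integer arithmetic. The sketch below applies the reversal and the division by $N$ to the values of $b'$ stated above and compares the outcome with a plain schoolbook multiplication of $p$ with itself; `schoolbook_multiply` is a helper written for this check only.

```python
def schoolbook_multiply(a, b):
    """Multiply two coefficient vectors (lowest degree first) in O(len(a) * len(b))."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

N = 8
b_prime = [8, 0, 8, -48, 104, -80, -16, 32]   # DFT_8(y) as stated in the solution

# Keep the first entry, reverse the rest, divide by N (all entries are multiples of N).
b = [b_prime[(N - i) % N] // N for i in range(N)]

p = [1, 2, -3, 1]                             # p(x) = x^3 - 3x^2 + 2x + 1
expected = schoolbook_multiply(p, p)          # the 7 coefficients of p^2
print(b)
print(expected)
```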
Exercise 2: Distribution of Sum
Let $A$ and $B$ be two sets of integers between 0 and $n - 1$, i.e., $A, B \subseteq \{0, \dots, n - 1\}$. We define two
random variables $X$ and $Y$, where $X$ is obtained by choosing a number uniformly at random from $A$
and $Y$ is obtained by choosing a number uniformly at random from $B$. We further define the random
variable $Z = X + Y$. Note that the random variable $Z$ can take values from the range $\{0, \dots, 2n - 2\}$.
(a) Give a simple $O(n^2)$ algorithm to compute the distribution of $Z$. Hence, the algorithm should
compute the probability $\Pr(Z = z)$ for all $z \in \{0, \dots, 2n - 2\}$. How fast can your algorithm be in
case $A$ and $B$ have a low cardinality?
(b) Can you get a more efficient algorithm to compute the distribution of Z? You can use algorithms
discussed in the lecture as a black box. What is the time complexity of your algorithm?
Hint: Try to represent A and B using polynomials.
Solution
(a) One solution would be:
    Initialize: Z[i] ← 0 for all i ∈ {0, ..., 2n − 2}
    for all a ∈ A do
        for all b ∈ B do
            Z[a + b] ← Z[a + b] + 1
    N ← |A||B|
    for all i ∈ {0, ..., 2n − 2} do
        Z[i] ← Z[i]/N
While $A$ and $B$ might not be stored as arrays, the distribution $Z$ is probably best accessed as an
array, and thus we cannot trivially speed up the initialization part, which needs $O(n)$ steps. The two
nested for-loops need $O(|A||B|)$ run time, which can be less than $O(n)$ if $A$ and $B$ have low
cardinality; since $|A|, |B| \le n$, they need at most $O(n^2)$ run time, which satisfies the requirement.
The whole algorithm runs in $O(n + |A||B|)$ time.
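The pseudocode translates almost line by line into Python. This is a minimal sketch; the function name `sum_distribution_naive` and the small example sets are made up for illustration, and $A$ and $B$ are assumed to be non-empty.

```python
def sum_distribution_naive(A, B, n):
    """Return [Pr(Z = z) for z in 0..2n-2] for Z = X + Y in O(n + |A||B|) time.

    A and B are non-empty collections of distinct integers from {0, ..., n-1}.
    """
    Z = [0.0] * (2 * n - 1)        # one slot per possible sum 0..2n-2
    for a in A:
        for b in B:
            Z[a + b] += 1          # count the pair (a, b)
    total = len(A) * len(B)        # number of equally likely pairs
    return [count / total for count in Z]

# Hypothetical example with n = 3.
dist = sum_distribution_naive({0, 1}, {0, 2}, n=3)
print(dist)
```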
(b) Consider the indicator function $1_A(x) : \{0, \dots, n - 1\} \to \{0, 1\}$, which evaluates to 1 if $x \in A$ and
to 0 otherwise. For $i = 0, \dots, n - 1$ define $a_i := 1_A(i)$, so that $(a_0, \dots, a_{n-1})$ is the 0-1 indicator
vector of $A$. Similarly define $b_i := 1_B(i)$. Define the polynomial $\alpha(x) := \sum_{i=0}^{n-1} a_i x^i$, and $\beta(x)$ similarly.
Then $\gamma(x) = \alpha(x)\beta(x) = \sum_{k=0}^{2n-2} c_k x^k$ is a polynomial such that $c_k$ is the number of pairs $(a_i, b_j)$
with $a_i = b_j = 1$ for which $i + j = k$. We can calculate $\gamma$ quickly using FFT (DFT, multiplication,
inverse DFT), which takes $O(n \log n)$ time. To get the distribution of $Z$ we only have to divide all
coefficients $c_k$ by $|A||B|$.
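A self-contained sketch of this approach is given below. It uses its own small recursive FFT rather than a library routine; the names `fft` and `sum_distribution_fft` are chosen for this illustration, and $A$ and $B$ are assumed to be non-empty Python sets so that membership tests are cheap.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT of a list whose length is a power of two.

    With invert=True the conjugate roots are used; the caller divides by len(a).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    half = n // 2
    out = [0j] * n
    for k in range(half):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + half] = even[k] - w * odd[k]   # uses w^(k + n/2) = -w^k
    return out

def sum_distribution_fft(A, B, n):
    """Return [Pr(Z = z) for z in 0..2n-2] via polynomial multiplication."""
    size = 1
    while size < 2 * n - 1:        # need at least 2n-1 evaluation points
        size *= 2
    alpha = [1 if i in A else 0 for i in range(size)]   # indicator vector of A, zero-padded
    beta = [1 if i in B else 0 for i in range(size)]    # indicator vector of B, zero-padded
    fa, fb = fft(alpha), fft(beta)
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # The coefficients c_k are integers, so rounding removes floating point noise.
    counts = [round(v.real / size) for v in product[:2 * n - 1]]
    total = len(A) * len(B)
    return [c / total for c in counts]

dist = sum_distribution_fft({0, 1}, {0, 2}, n=3)
print(dist)
```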
Exercise 3: Scheduling
The wildly popular Spanish search engine “El Goog” needs to do a serious amount of computation
every time it recompiles its index. Fortunately, the company has a single supercomputer (SC) at its
disposal, together with an essentially unlimited supply of high-end PCs.
The computation can be broken into $n$ distinct jobs $J_1, J_2, \dots, J_n$. Each job $J_i$ needs to be preprocessed
for $p_i$ time units on the SC, before it can (and should be) finished within $f_i$ time units on one of the
PCs.
Due to the large number of PCs all the finishing of the jobs can be done in parallel; however, the SC
has to run the jobs sequentially. El Goog needs a scheduling of the jobs on the SC that minimizes the
completion time of the computation, which is the earliest point in time at which all jobs have
finished processing on the PCs.
Find a fast algorithm that computes an optimal scheduling and prove its correctness.
Solution
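First let's talk notation. Let $S = (s_1, \dots, s_n)$ be any scheduling. If we define $t(s_j) := f_{s_j} + \sum_{i=1}^{j} p_{s_i}$,
i.e., the time at which the job in position $j$ of $S$ is finished on its PC,
then the completion time $T$ of scheduling $S$ equals $T(S) = \max_{k \in [n]} t(s_k)$, where $[n] := \{1, \dots, n\}$.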
The sought-after assignment is of the greedy type. We name the greedy solution $G = (g_1, \dots, g_n)$ and
we acquire it by sorting the jobs by descending finishing time, i.e., $f_{g_1} \ge f_{g_2} \ge \dots \ge f_{g_n}$, and feeding
them in this order to the supercomputer.
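The greedy rule and the completion time are easy to state in code. The following sketch uses 0-based job indices and made-up values for the preprocessing and finishing times; the brute-force search over all orders is only there to compare against the greedy result on this tiny instance.

```python
from itertools import permutations

def completion_time(order, p, f):
    """Completion time T(S) when the SC processes the jobs in the given order."""
    elapsed = 0                    # time spent on the supercomputer so far
    latest = 0
    for job in order:
        elapsed += p[job]          # the job leaves the SC at this time
        latest = max(latest, elapsed + f[job])   # then it needs f[job] on a PC
    return latest

def greedy_schedule(p, f):
    """Order the jobs by descending finishing time f; the sort costs O(n log n)."""
    return sorted(range(len(p)), key=lambda job: -f[job])

p = [2, 3, 1]                      # hypothetical preprocessing times
f = [1, 4, 6]                      # hypothetical finishing times
order = greedy_schedule(p, f)
best = min(completion_time(s, p, f) for s in permutations(range(len(p))))
print(order, completion_time(order, p, f), best)
```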
To prove the optimality of this solution $G$ we compare it to any solution $S$, and in case they differ,
we change $S$ to $S'$, such that $S'$ is "closer" to $G$ than $S$, but $T(S') \le T(S)$. If $S$ is optimal, then after
a finite number of steps we reach $G$ without increasing the completion time, thus proving optimality.
Let $w$ be the highest index such that $s_w \ne g_w$, which implies by construction that $f_{g_w} \le f_{s_w}$: $S$ and $G$
agree on all positions after $w$, so positions $1, \dots, w$ hold the same jobs in both, and $g_w$ has the
smallest finishing time among them. Let $v$ be
the index such that $s_v = g_w$ and note that $v < w$ (otherwise $v$ would be the highest index for which
$S$ and $G$ differ, a contradiction to our choice of $w$).
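We now move the job $s_v$ to position $w$ and shift the jobs $s_{v+1}, \dots, s_w$ one position forward, which
gives a scheduling
$$S' = (s_1, \dots, s_{v-1}, s_{v+1}, \dots, s_w, s_v, s_{w+1}, \dots, s_n) =: (s'_1, \dots, s'_n).$$
For the completion time we have $T(S') = \max_{k \in [n]} t(s'_k)$, where $t(s'_k)$ is taken with respect to $S'$.
For $k < v$ and for $k > w$ the job in position $k$ and the set of jobs in front of it are unchanged, so
$t(s'_k) = t(s_k)$. Thus let us look at the positions $v \le k < w$ and at position $w$:
$$t(s'_k) = \sum_{i=1}^{k+1} p_{s_i} - p_{s_v} + f_{s_{k+1}} \le \sum_{i=1}^{k+1} p_{s_i} + f_{s_{k+1}} = t(s_{k+1}) \qquad (v \le k < w)$$
$$t(s'_w) = \sum_{i=1}^{w} p_{s_i} + f_{s_v} \overset{s_v = g_w}{=} \sum_{i=1}^{w} p_{s_i} + f_{g_w} \overset{f_{g_w} \le f_{s_w}}{\le} \sum_{i=1}^{w} p_{s_i} + f_{s_w} = t(s_w)$$
All of these values are at most $T(S)$, hence $T(S') \le T(S)$.
This concludes the proof, as with each such transformation the value $w := \max\{k : s_k \ne g_k\}$
decreases until (after a finite number of steps) we have $S' = G$.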
Exercise 4: Matroids
We have defined matroids in the lecture. For a matroid $(E, I)$, a maximal independent set $S \in I$ is
an independent set that cannot be extended. Thus, for every element $e \in E \setminus S$, the set $S \cup \{e\} \notin I$.
a) Show that all maximal independent sets of a matroid (E, I) have the same size. (This size is called
the rank of a matroid.)
b) Consider the following greedy algorithm: The algorithm starts with an empty independent set
S = ∅. Then, in each step the algorithm extends S by the minimum weight element e ∈ E \ S
such that S ∪ {e} ∈ I, until S is a maximal independent set. Show that the algorithm computes a
maximal independent set of minimum weight.
c) For a graph $G = (V, E)$, a subset $F \subseteq E$ of the edges is called a forest iff (if and only if) it does
not contain a cycle. Let $\mathcal{F}$ be the set of all forests of $G$. Show that $(E, \mathcal{F})$ is a matroid. What are
the maximal independent sets of this matroid?
Solution
a) Assume that there are two maximal independent sets $S$ and $T$ with $|S| \ne |T|$. Without loss
of generality (w.l.o.g.) we assume that $|S| < |T|$. The exchange property tells us that there exists
an element $x \in T \setminus S$ which can be added to $S$ such that $S \cup \{x\}$ is still independent, which is a
contradiction to the maximality of $S$.
b) Let $S = (e_1, \dots, e_r)$ be the greedy solution and $T = (f_1, \dots, f_r)$ be any other solution, where
$r$ is the rank of the matroid. Note that both $S$ and $T$ need to have cardinality $r$ to be maximal.
For the sake of contradiction assume that $w(T) < w(S)$, i.e., $T$ has a lower weight than $S$. We
also assume that the $e_i$ and the $f_i$ are already ordered by increasing weight. We now let $k$ be
the smallest index such that $w(f_k) < w(e_k)$; note that for smaller indices $i < k$ we have
$w(f_i) \ge w(e_i)$, where equality need not hold.
Consider the sets $S_{k-1} := \{e_1, \dots, e_{k-1}\}$ and $T_k := \{f_1, \dots, f_k\}$. Using the exchange property we get
that there is a $j \in [k]$ such that $f_j \notin S_{k-1}$ and $S_{k-1} \cup \{f_j\}$ is independent. Since the $f_i$ are ordered we know that
$w(f_j) \le w(f_k) < w(e_k)$. But that means that in step $k$ the greedy algorithm skipped element $f_j$
even though $f_j$ has less weight than $e_k$ and it would not have violated the independence of $S_{k-1}$, a
contradiction to the way the algorithm works.
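The greedy algorithm can be sketched generically against an independence oracle. In the sketch below, the function name, the oracle interface `is_independent`, and the example (a uniform matroid in which every set of at most 2 elements is independent) are all assumptions made for illustration.

```python
def greedy_min_weight_basis(elements, weight, is_independent):
    """Greedy algorithm for a matroid given by an independence oracle.

    elements: iterable of ground-set elements
    weight: function mapping an element to its weight
    is_independent: function taking a set of elements, True iff the set is in I
    """
    basis = set()
    # Scanning in order of increasing weight is equivalent to repeatedly picking
    # the minimum-weight element that keeps the set independent: by the hereditary
    # property, an element that was rejected once can never become addable later.
    for e in sorted(elements, key=weight):
        if is_independent(basis | {e}):
            basis.add(e)
    return basis

# Hypothetical example: every set with at most 2 elements is independent.
weights = {"a": 5, "b": 1, "c": 3, "d": 2}
basis = greedy_min_weight_basis(weights, weights.get, lambda s: len(s) <= 2)
print(sorted(basis))
```

Sorting once up front keeps the sketch at one oracle call per element, instead of rescanning $E \setminus S$ in every step.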
c) We need to show 3 properties to prove that $(E, \mathcal{F})$ is a matroid.
1) The empty set $\emptyset \subseteq E$ does not contain any cycles and therefore is a forest and thus in $\mathcal{F}$.
2) Let $F' \subsetneq F$ and let $F$ be a forest. If $F'$ contained a cycle, then clearly $F$ would contain the same
cycle, which contradicts the assumption that $F$ is a forest. Hence $F' \in \mathcal{F}$.
3) Showing the exchange property is not trivial. Notation: for any subset $F$ of $E$ let $V[F]$ be the set
of vertices that occur in $F$, i.e., $V[F] = \bigcup_{e \in F} e$ (note that for undirected graphs this is indeed a
set of vertices and not a set of edges).
Assume we have two forests $F$ and $F'$ with $|F'| > |F|$, but the exchange property does not hold,
i.e., adding any edge $e \in F' \setminus F$ to $F$ would close a cycle.
In graphs, a tree has no cycle and thus is a forest. Conversely, a forest is a collection of trees.
Now let's split $F$ into its disjoint trees $T_1, \dots, T_m$, such that $V[T_i] \cap V[T_j] = \emptyset$ for all $i \ne j$ in $[m]$.
Assume that there is an edge $e = \{v_1, v_2\} \in F' \setminus F$ such that $v_1 \notin V[F]$. Then we could safely add $e$,
a contradiction. Thus assume that both $v_1$ and $v_2$ are in $V[F]$, but let them lie in different trees,
i.e., $v_1 \in V[T_i]$ and $v_2 \in V[T_j]$ for some $i \ne j$. But then again we just combine two trees to one
without closing a cycle. Thus, for all edges $e \in F' \setminus F$, both vertices must lie in the same vertex
set $V[T_{i_e}]$ for some $i_e \in [m]$. Let $E'[V[T_i]] := \{e \in F' : e \subseteq V[T_i]\}$, i.e., the set of edges of $F'$ that
belong to the vertex set that is spanned by $T_i$. Since $\bigcup_{i=1}^{m} E'[V[T_i]] = F'$ and
$|F'| > |F| = \sum_{i=1}^{m} |T_i|$, there must be an $i$ for
which $|E'[V[T_i]]| > |T_i| = |V[T_i]| - 1$. But then the edges of $F'$ in that component are more than
$|V[T_i]| - 1$, i.e., they exceed the maximum size of a forest on that vertex set, thus there exists a
cycle in $F'$. This contradicts the forest property of $F'$ and finishes our proof.
c) - Alternative solution
We again prove the exchange property via contradiction.
A tree is uniquely defined via its set of edges $T \subseteq E$, but for this solution let's also root the tree,
i.e., define a vertex $v$ as the unique root of a tree. In this scenario it makes sense to look at trees
without edges, only consisting of a single root node. Let $T = (E_T, r_T)$ be such a representation with
$E_T$ the edges and $r_T$ the root of the tree $T$. Furthermore define $|T| := |E_T| + 1$, i.e., the number
of vertices in $T$. Consider a forest $F$ and split it into all its disjoint trees, with every vertex of $V$
that is not incident to any edge of $F$ being an empty tree ($E_T = \emptyset$) with itself as the root. Let
$\mathcal{T}$ be the set of those trees.
Claim: $|\mathcal{T}| = |V| - |F|$.
The proof is simply done via induction, starting with $F = \emptyset$ implying $|\mathcal{T}| = |V|$ and then realizing
that adding an edge without closing a cycle connects exactly two trees which have different roots,
of which one has to give up its status.
Thus, if we assume that $|F| < |F'|$, then analogously $|\mathcal{T}'| < |\mathcal{T}|$, i.e., there are more trees in $F$.
But then there must be at least one tree $T'$ in $F'$ such that at least two of its vertices $u$ and $v$
belong to two different trees $T_u$ and $T_v$ in $F$. Within $T'$ there exists a path from $u$ to $v$ and
thus there must be an edge $\{u_1, u_2\}$ on this path such that $u_1$ is a vertex that belongs to $T_u$ and
$u_2$ does not. This edge lies in $F' \setminus F$ and can be safely added to $F$ without closing a cycle, which
contradicts the assumption that the exchange property fails.
The maximal independent sets are the spanning forests, i.e., the unions of one spanning tree per connected
component of $G$. The rank of the corresponding matroid is $n - k$, where $n = |V|$ and $k$ is the number
of components of $G$ (for connected $G$ we have $k = 1$).
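For the forest matroid, a maximal independent set can be built by adding every edge that does not close a cycle. The sketch below does this with a small union-find structure, assuming vertices are numbered $0, \dots, n - 1$; the function name `spanning_forest` and the example graph are made up for illustration.

```python
def spanning_forest(n, edges):
    """Return a maximal forest of the graph on vertices 0..n-1 with the given edges."""
    parent = list(range(n))        # union-find: every vertex starts as its own tree

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:               # endpoints lie in different trees: no cycle is closed
            parent[ru] = rv
            forest.append((u, v))
    return forest

# Hypothetical graph: a triangle 0-1-2, an edge 3-4, and an isolated vertex 5.
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
forest = spanning_forest(6, edges)
components = 6 - len(forest)       # every added edge merges two trees
print(forest, components)
```

On this example the forest has 3 edges and the graph has 3 components, in line with the rank $n - k$.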