Lecture notes
Lecture 7
General Quadratic Programs and
Quantum Multiplayer Games∗
In the last lecture we introduced general quadratic programs and devised a semidefinite
relaxation. In particular, we established that for any $A \in \mathbb{R}^{m \times n}$,
\[
\mathrm{OPT}(A) := \max_{|x_i|,|y_j| \le 1} \sum_{i,j} A_{ij} x_i y_j
\;\le\;
\sup_{\substack{\vec u_i, \vec v_j \in \mathbb{R}^d \ (d \le m+n) \\ \|\vec u_i\| = \|\vec v_j\| = 1}} \sum_{i,j} A_{ij}\, \vec u_i \cdot \vec v_j
=: \mathrm{SDP}(A).
\]
In this lecture we provide a detailed analysis of how good this relaxation is. We will see
that it is never more than a constant factor off. As an application, we will consider quantum
games and see how one can apply this program to derive some interesting results.
7.1 Analysis of the SDP Relaxation
In this section, we analyze the performance of the SDP relaxed formulation. The main result
is that we can bound the ratio of the objective values of these two programs by a constant
factor, regardless of the choice of A.
Theorem 7.1. There exists a constant $K = K_G$ such that for all $A$, $\mathrm{SDP}(A) \le K \cdot \mathrm{OPT}(A)$.
Remark 7.2. (1) If one does not care about the optimal constant, the result can be made
“algorithmic”, as follows: given any unit vectors $\vec u_i, \vec v_j$ achieving $\mathrm{SDP}(A)$, there is a
randomized (even deterministic) polynomial-time algorithm that outputs $x_i, y_j \in \{\pm 1\}$
such that
\[
\sum_{i,j} A_{ij} x_i y_j \ge \alpha \sum_{i,j} A_{ij}\, \vec u_i \cdot \vec v_j
\]
for some $\alpha \ge 0.01$.
* Lecturer: Thomas Vidick. Scribe: Linqi Guo.
(2) One can do better in the value of $\alpha$. You have a homework problem showing one
can achieve $2\ln(1+\sqrt{2})/\pi \approx 0.56$ with a randomized algorithm. The best possible
constant is known as Grothendieck’s constant, which is strictly greater than $0.56$. We
do not know its precise value; only the first few digits have been pinned down.
We prove the theorem. Suppose $\vec u_i, \vec v_j \in \mathbb{R}^d$ achieving $\mathrm{SDP}(A)$ are given. We can always
assume $d \le m+n$ since this is the number of vectors. Observe that
\[
\sum_{i,j} A_{ij}\, \vec u_i \cdot \vec v_j
= \frac{1}{d} \sum_{k=1}^{d} \sum_{i,j} A_{ij} \left( \sqrt{d}\,(\vec u_i)_k \right) \left( \sqrt{d}\,(\vec v_j)_k \right).
\]
This implies we can find $k \in \{1, \dots, d\}$ such that
\[
\sum_{i,j} A_{ij} \left( \sqrt{d}\,(\vec u_i)_k \right) \left( \sqrt{d}\,(\vec v_j)_k \right)
\ge \sum_{i,j} A_{ij}\, \vec u_i \cdot \vec v_j = \mathrm{SDP}(A) \ge \mathrm{OPT}(A).
\]
How large are the $|(\vec u_i)_k|$? From $\|\vec u_i\| \le 1$ we know $|(\vec u_i)_k| \le 1$, which implies $|\sqrt{d}\,(\vec u_i)_k| \le \sqrt{d}$.
This naive bound is not good enough: we are looking for an assignment of values in the
range $[-1,1]$. Now, if everything were “well-behaved”, we should expect $|(\vec u_i)_k| \simeq 1/\sqrt{d}$,
because $\|\vec u_i\| \le 1$. This heuristic suggests the approach for the proof, which is to perform
a “random” rotation (we will in fact apply a deterministic transformation) that guarantees
the coordinates are well-balanced, so that most of them are not too large, no larger than
some constant. Our rotation is based on the following claim:
Claim 7.3. In dimension $d$, there exist $t = O(d^2)$ vectors $\vec g_1, \dots, \vec g_t \in \{\pm 1\}^d$ such that the
$\vec g_i$ are 4-wise independent.
The condition of 4-wise independence in the claim means the following. Consider the
matrix
\[
G = \begin{pmatrix} -\ \vec g_1\ - \\ -\ \vec g_2\ - \\ \vdots \\ -\ \vec g_t\ - \end{pmatrix}.
\]
Then if we fix any 4 columns of $G$, corresponding to coordinates of the $\vec g_i$, each possible
pattern in $\{\pm 1\}^4$ occurs exactly $t/16$ times. This property naturally implies $r$-wise independence for $r < 4$: if we look at a single column, the number of $1$’s is the same
as the number of $-1$’s, and similar conditions hold for any two or three columns. Note that if
we allow exponentially large $t$, we can just choose all the possible $d$-dimensional $\pm 1$ vectors
to satisfy this property. The main point here is that we can make $t$ only polynomial in
$d$. An (optional) homework exercise will show you how to construct such a family of vectors
based on BCH error-correcting codes.
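The definition can be checked directly in code. The sketch below uses the exponential-size family of all $2^d$ sign vectors (the point of the claim is that $t = O(d^2)$ suffices, e.g. via BCH codes, but the full family is enough to illustrate what 4-wise independence means):

```python
import itertools
import numpy as np

d = 5
# The rows of G are all 2^d sign vectors; this family is trivially
# 4-wise independent (indeed d-wise independent).
G = np.array(list(itertools.product([-1, 1], repeat=d)))  # shape (t, d)
t = G.shape[0]

# Fix any 4 columns and count how often each pattern in {±1}^4 appears.
cols = [0, 1, 2, 4]
patterns = {}
for row in G[:, cols]:
    key = tuple(row)
    patterns[key] = patterns.get(key, 0) + 1

# 4-wise independence: all 16 patterns appear, each exactly t/16 times.
assert len(patterns) == 16
assert all(count == t // 16 for count in patterns.values())
print("every pattern occurs exactly t/16 =", t // 16, "times")
```

The same loop run on a polynomial-size BCH-based family would verify the claim's actual construction.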
For any $\vec u \in \mathbb{R}^d$, define $h(\vec u) \in \mathbb{R}^t$ by $h(\vec u)_i = \frac{\vec g_i \cdot \vec u}{\sqrt{t}}$, i.e.\ $h(\vec u) = \frac{1}{\sqrt{t}} G \vec u$. For $M > 0$ define
\[
h^M(\vec u)_i = \begin{cases}
h(\vec u)_i & \text{if } |h(\vec u)_i| \le M/\sqrt{t} \\
M/\sqrt{t} & \text{if } h(\vec u)_i > M/\sqrt{t} \\
-M/\sqrt{t} & \text{if } h(\vec u)_i < -M/\sqrt{t}
\end{cases}
\]
Then we have the following lemma:
Lemma 7.4. For any $\vec u, \vec v \in \mathbb{R}^d$ with $\|\vec u\| = \|\vec v\| = 1$,
(1) $h(\vec u) \cdot h(\vec v) = \vec u \cdot \vec v$
(2) $\|h(\vec u)\| = 1$
(3) $\|h^M(\vec u)\| \le 1$
(4) $\|h(\vec u) - h^M(\vec u)\| \le \sqrt{3}/M$
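All four properties can be verified numerically. A minimal sketch, again substituting the full sign-vector family for the $O(d^2)$ construction and using random unit vectors for $\vec u, \vec v$:

```python
import itertools
import numpy as np

d = 4
# Full family of sign vectors stands in for the 4-wise independent family.
G = np.array(list(itertools.product([-1, 1], repeat=d)), dtype=float)
t = G.shape[0]

rng = np.random.default_rng(0)
u = rng.standard_normal(d); u /= np.linalg.norm(u)
v = rng.standard_normal(d); v /= np.linalg.norm(v)

def h(w):
    # h(w)_i = (g_i . w) / sqrt(t)
    return (G @ w) / np.sqrt(t)

def h_M(w, M):
    # coordinate-wise truncation of h(w) to [-M/sqrt(t), M/sqrt(t)]
    return np.clip(h(w), -M / np.sqrt(t), M / np.sqrt(t))

M = 4
assert np.isclose(h(u) @ h(v), u @ v)                       # property (1)
assert np.isclose(np.linalg.norm(h(u)), 1.0)                # property (2)
assert np.linalg.norm(h_M(u, M)) <= 1 + 1e-12               # property (3)
assert np.linalg.norm(h(u) - h_M(u, M)) <= np.sqrt(3) / M   # property (4)
print("all four properties of Lemma 7.4 hold")
```

Property (1) is exact here because $\frac{1}{t} G^T G = I$ for this family, which is the 2-wise independence used in the proof below.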
The last property gives a precise trade-off between the size of M and the quality of the
approximation of the vector h(~u) by its truncation hM (~u). Using the lemma, we can write
\begin{align*}
\mathrm{SDP}(A) &= \sum_{i,j} A_{ij}\, \vec u_i \cdot \vec v_j \\
&\overset{(1)}{=} \sum_{i,j} A_{ij}\, h(\vec u_i) \cdot h(\vec v_j) \\
&= \sum_{i,j} A_{ij}\, h^M(\vec u_i) \cdot h^M(\vec v_j)
+ \underbrace{\sum_{i,j} A_{ij}\, h^M(\vec u_i) \cdot \big( h(\vec v_j) - h^M(\vec v_j) \big)}_{\overset{(4)}{\le} \frac{\sqrt{3}}{M}\,\mathrm{SDP}(A)}
+ \underbrace{\sum_{i,j} A_{ij}\, \big( h(\vec u_i) - h^M(\vec u_i) \big) \cdot h(\vec v_j)}_{\overset{(4)}{\le} \frac{\sqrt{3}}{M}\,\mathrm{SDP}(A)} \\
&\le \frac{M^2}{t} \sum_{k=1}^{t} \sum_{i,j} A_{ij} \left( h^M(\vec u_i)_k\, \frac{\sqrt{t}}{M} \right) \left( h^M(\vec v_j)_k\, \frac{\sqrt{t}}{M} \right) + \frac{2\sqrt{3}}{M} \cdot \mathrm{SDP}(A).
\end{align*}
Each error term is bounded using property (4) together with the definition of $\mathrm{SDP}(A)$ as a supremum: all vectors appearing in it have norm at most $1$, except for the truncation differences, which have norm at most $\sqrt{3}/M$, so rescaling them to unit vectors costs exactly that factor.
Thus we can find a $k$ such that
\[
\sum_{i,j} A_{ij} \left( h^M(\vec u_i)_k\, \frac{\sqrt{t}}{M} \right) \left( h^M(\vec v_j)_k\, \frac{\sqrt{t}}{M} \right)
\ge \frac{1}{M^2} \left( \mathrm{SDP}(A) - \frac{2\sqrt{3}}{M}\, \mathrm{SDP}(A) \right).
\]
To conclude, we set $M = 4$, $x_i = h^M(\vec u_i)_k \cdot \frac{\sqrt{t}}{M}$, $y_j = h^M(\vec v_j)_k \cdot \frac{\sqrt{t}}{M}$ and get
\[
\sum_{i,j} A_{ij}\, x_i y_j \ge \mathrm{SDP}(A) \cdot \frac{M - 2\sqrt{3}}{M^3} \approx 0.01 \cdot \mathrm{SDP}(A).
\]
Here $x_i, y_j \in [-1,1]$, and we can always find values in $\{\pm 1\}$ that are at least as good (to see
how, fix all the $x_i$ and observe that there is always an optimal setting of each individual $y_j$
that is either $+1$ or $-1$).
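The last observation, that a fractional assignment can always be rounded to signs without losing value, can be sketched as follows (on hypothetical random data rather than an actual SDP solution):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 6))
x = rng.uniform(-1, 1, 5)   # fractional assignment in [-1, 1]
y = rng.uniform(-1, 1, 6)

def value(x, y):
    return x @ A @ y

before = value(x, y)
# Fix the x_i: the objective is linear in each y_j with coefficient
# sum_i A_ij x_i, so setting y_j to the sign of that coefficient is optimal.
y = np.sign(A.T @ x); y[y == 0] = 1
# Now fix the y_j and round the x_i the same way.
x = np.sign(A @ y); x[x == 0] = 1

assert value(x, y) >= before - 1e-12
print("value improved from", before, "to", value(x, y))
```

Each of the two steps can only increase the objective, since each maximizes a linear function over $[-1,1]$ coordinate by coordinate.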
Thus the theorem is proved, conditioned on the lemma, whose proof we now turn to.
Proof of the lemma. The proof of the lemma relies strongly on the 4-wise independence we
imposed on the vectors $\vec g_i$.
(1) We check that
\begin{align*}
h(\vec u) \cdot h(\vec v) &= \sum_{k=1}^{t} \frac{\vec g_k \cdot \vec u}{\sqrt{t}} \, \frac{\vec g_k \cdot \vec v}{\sqrt{t}} \\
&= \sum_{i,j=1}^{d} \sum_{k=1}^{t} \frac{(\vec g_k)_i u_i}{\sqrt{t}} \, \frac{(\vec g_k)_j v_j}{\sqrt{t}} \\
&= \sum_{i,j=1}^{d} \underbrace{\left( \frac{1}{t} \sum_{k=1}^{t} (\vec g_k)_i (\vec g_k)_j \right)}_{=\,\delta_{ij} \text{ by 2-wise independence}} u_i v_j \\
&= \sum_{i=1}^{d} u_i v_i = \vec u \cdot \vec v.
\end{align*}
(2) From the last result,
\[
\|h(\vec u)\|^2 = h(\vec u) \cdot h(\vec u) = \vec u \cdot \vec u = \|\vec u\|^2 = 1.
\]
(3) Again from the last result,
\[
\|h^M(\vec u)\|^2 = \sum_k \big( h^M(\vec u)_k \big)^2 \le \sum_k \big( h(\vec u)_k \big)^2 = 1.
\]
(4) From the definition of $h^M(\cdot)$, we have
\begin{align*}
\|h(\vec u) - h^M(\vec u)\|^2 &\le \sum_{k:\, |h(\vec u)_k| > M/\sqrt{t}} |h(\vec u)_k|^2 \\
&= \frac{1}{t} \sum_{k:\, |\vec g_k \cdot \vec u| > M} (\vec g_k \cdot \vec u)^2 \\
&\le \sqrt{ \frac{1}{t} \sum_{k:\, |\vec g_k \cdot \vec u| > M} (\vec g_k \cdot \vec u)^4 } \; \sqrt{ \frac{1}{t} \sum_{k:\, |\vec g_k \cdot \vec u| > M} 1 } \\
&\le \sqrt{3} \cdot \frac{\sqrt{3}}{M^2} = \frac{3}{M^2},
\end{align*}
where the second-to-last line is by the Cauchy-Schwarz inequality and the last line is
because
\begin{align*}
\frac{1}{t} \sum_{k} (\vec g_k \cdot \vec u)^4 &= \sum_{i,j,l,m} \left( \frac{1}{t} \sum_{k} (\vec g_k)_i (\vec g_k)_j (\vec g_k)_l (\vec g_k)_m \right) u_i u_j u_l u_m \\
&= \sum_i u_i^4 + 3 \sum_{i \ne l} u_i^2 u_l^2 \\
&\le 3 \left( \sum_i u_i^2 \right)^2 = 3 \|\vec u\|^4 = 3,
\end{align*}
where the second line follows by observing that $\sum_k (\vec g_k)_i (\vec g_k)_j (\vec g_k)_l (\vec g_k)_m = 0$ unless $i = j$
and $l = m$, or $i = l$ and $j = m$, or $i = m$ and $j = l$, by 4-wise independence. From this bound we also get, by Markov's inequality,
$\#\{k : |\vec g_k \cdot \vec u| > M\} \le \frac{3t}{M^4}$, as required above.
Remark 7.5. The same guarantees for the rounding procedure can be obtained by taking
random projections on Gaussian vectors. We will see this later in class.
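A minimal sketch of that Gaussian rounding: project every SDP vector onto one random direction and keep only the sign. The vectors below are random unit vectors standing in for an actual SDP solution, so this only illustrates the procedure, not its approximation guarantee:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, d = 4, 4, 5
A = rng.standard_normal((m, n))

# Random unit vectors standing in for an SDP solution.
U = rng.standard_normal((m, d)); U /= np.linalg.norm(U, axis=1, keepdims=True)
V = rng.standard_normal((n, d)); V /= np.linalg.norm(V, axis=1, keepdims=True)
sdp_value = np.sum(A * (U @ V.T))

# Gaussian rounding: x_i = sgn(g . u_i), y_j = sgn(g . v_j) for a random
# Gaussian g. Repeating and keeping the best trial is the usual
# randomized rounding; a handful of trials suffices at this size.
best = -np.inf
for _ in range(50):
    g = rng.standard_normal(d)
    x, y = np.sign(U @ g), np.sign(V @ g)
    best = max(best, x @ A @ y)
print("value of the vectors:", sdp_value, " rounded value:", best)
```

Since the rounded assignment lies in $[-1,1]^m \times [-1,1]^n$, its value can never exceed $\mathrm{OPT}(A)$.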
7.2 Quantum Multiplayer Games
We have been studying the quantity
\[
\max_{x_i, y_j \in \{\pm 1\}} \sum_{i,j} A_{ij} x_i y_j
\]
as an optimization problem. Now we are going to rephrase it as the optimum of a two-player game. This is a useful perspective on optimization and constraint satisfaction problems
in general. It will also let us make a connection with quantum multiplayer games.
Consider a game between a referee and two players, Alice and Bob. (Alice and Bob
cooperate to win the game “against” the referee.)
(1) The referee selects $(i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}$ according to some distribution $\pi$.
(2) He reveals $i$ to Alice and $j$ to Bob. Alice and Bob reply with signs $x_i, y_j \in \{\pm 1\}$ respectively.
(3) The payoff is $x_i y_j c_{ij}$, where $c_{ij} \in \{\pm 1\}$.
So the goal of the players is to provide answers whose product is a certain target cij . However,
Alice only knows i and Bob only knows j, which is what can make the game challenging.
Consider the following simple example:
Example 7.6. We let $m = n = 2$ and $c_{11} = c_{12} = c_{21} = 1$, $c_{22} = -1$, with $\pi$ uniform. It is not hard to see
that under this setting,
\[
w(G) = \max_{a_i, b_j \in \{\pm 1\}} \sum_{i,j} \pi(i,j)\, c_{ij} a_i b_j
= \max \frac{1}{4} \big( a_1 b_1 + a_1 b_2 + a_2 b_1 - a_2 b_2 \big) = \frac{1}{2}.
\]
So the maximum payoff is $1/2$: it is impossible to coordinate perfectly to always win in this
game.
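The value $1/2$ can be confirmed by brute force over all $16$ deterministic strategies:

```python
from itertools import product

# Payoff coefficients of Example 7.6: c11 = c12 = c21 = 1, c22 = -1,
# with the uniform distribution pi(i, j) = 1/4 on questions.
c = {(1, 1): 1, (1, 2): 1, (2, 1): 1, (2, 2): -1}

best = max(
    sum(0.25 * c[i, j] * a[i - 1] * b[j - 1] for i in (1, 2) for j in (1, 2))
    for a in product([-1, 1], repeat=2)
    for b in product([-1, 1], repeat=2)
)
print("w(G) =", best)  # 0.5
```

Whatever signs Alice and Bob pick, they satisfy at most three of the four constraints, hence the value $3/4 \cdot 1 + 1/4 \cdot (-1) = 1/2$.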
Given a game of the form described above, introduce a matrix $A \in \mathbb{R}^{m \times n}$ with coefficients
$A_{ij} = \pi(i,j)\, c_{ij}$. Then the maximum expected payoff for the players is
\[
w(G) = \max_{x_i, y_j \in \{\pm 1\}} \sum_{i,j} A_{ij} x_i y_j = \mathrm{OPT}(A).
\]
Conversely, for any $A \in \mathbb{R}^{m \times n}$, we can define $c_{ij} = \mathrm{sgn}(A_{ij})$ and $\pi(i,j) = \frac{|A_{ij}|}{\sum_{k,l} |A_{kl}|}$ to
transform any optimization problem of the form we’ve been considering into a game. In
particular, our results so far imply that finding the optimal strategy in such a game is NP-hard (as it would let one solve MAXCUT), but can also be approximated
within a constant factor in polynomial time. This surprising connection, between quadratic
optimization and games, lies at the heart of many recent results in complexity theory (PCP
theorem, anyone?).
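The conversion in both directions is a one-liner. A sketch on a small random matrix, checking that the resulting game value is $\mathrm{OPT}(A)$ up to the normalization $\sum_{k,l} |A_{kl}|$:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))

# Transform A into a game: signs become payoff targets, magnitudes
# (normalized) become the question distribution.
c = np.sign(A)
pi = np.abs(A) / np.abs(A).sum()

def best(B):
    # Brute-force max of x^T B y over sign assignments (fine at this size).
    m, n = B.shape
    return max(np.array(x) @ B @ np.array(y)
               for x in product([-1, 1], repeat=m)
               for y in product([-1, 1], repeat=n))

w = best(pi * c)   # w(G), since the payoff matrix is pi(i,j) * c_ij = A_ij / sum|A|
opt = best(A)      # OPT(A)
assert np.isclose(w * np.abs(A).sum(), opt)
print("w(G) * sum|A| = OPT(A) =", opt)
```

So approximating the game value and approximating $\mathrm{OPT}(A)$ are the same problem, as the text states.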
Remark 7.7. In general one may allow the players to use randomized strategies, including
both private and shared randomness, to select their answers. It is easy to see that this cannot help in
general: if on average (over their random coins) they achieve a certain payoff, then there
must exist some choice of coins that lets them achieve at least the average payoff, and they
might as well fix this choice of coins as part of their strategy.
Now let’s consider quantum players. Compared to the classical setting, quantum players
have a new resource known as entanglement. This means that they can be in a joint quantum
state, |ψi, and by making measurements on their part of |ψi they can get answers that are
correlated in a way that could not happen classically.
In general, a measurement is an observable, which is a Hermitian matrix $X \in \mathbb{R}^{d \times d}$ such
that $X = X^T$ and $X^2 = I$ (in other words, all eigenvalues of $X$ are in $\{\pm 1\}$). Any measurement
produces an outcome $a \in \{\pm 1\}$. The laws of quantum mechanics state that if Alice and Bob
measure their own half of a certain state using $X$ and $Y$ respectively, then the product of the outcomes
in expectation satisfies
\[
\mathbb{E}[a \cdot b] = \frac{1}{d} \mathrm{Tr}(XY) \in [-1, 1].
\]
If we use $d = 1$, the above reduces to the classical setting. Quantum mechanics allows us to
explore higher dimensions. Let’s consider the following example:
Example 7.8. If $X = Y = I$, then the expectation is $1$. If $X = I$, $Y = -I$, then the
expectation is $-1$. If $X = I$, $Y = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, then the expectation is $0$. Beyond these simple
examples, we can get richer things. Consider for instance
\[
X_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad Y_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
X_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ -1 & -1 \end{pmatrix}.
\]
Then $\frac{1}{2} \mathrm{Tr}(X_1 Y_1) = \frac{1}{2} \mathrm{Tr}(X_1 Y_2) = \frac{1}{2} \mathrm{Tr}(X_2 Y_1) = \frac{\sqrt{2}}{2}$ and $\frac{1}{2} \mathrm{Tr}(X_2 Y_2) = -\frac{\sqrt{2}}{2}$. Such correlations are
impossible for classical players! How do we see this? Plugging this “strategy” into our
example game from earlier, we see that the expected payoff achieved by the quantum players
is
\[
\frac{1}{4} \cdot 4 \cdot \frac{\sqrt{2}}{2} = \frac{\sqrt{2}}{2} \approx 0.71,
\]
which is strictly larger than the value $1/2$ that we proved was optimal for classical players.
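The claimed correlations are easy to verify numerically, along with the fact that each matrix is a valid observable ($X = X^T$, $X^2 = I$):

```python
import numpy as np

s = 1 / np.sqrt(2)
X1 = np.array([[1.0, 0.0], [0.0, -1.0]])
X2 = np.array([[0.0, 1.0], [1.0, 0.0]])
Y1 = s * np.array([[1.0, 1.0], [1.0, -1.0]])
Y2 = s * np.array([[1.0, -1.0], [-1.0, -1.0]])

# Each matrix is symmetric and squares to the identity: a valid observable.
for O in (X1, X2, Y1, Y2):
    assert np.allclose(O, O.T) and np.allclose(O @ O, np.eye(2))

corr = lambda X, Y: 0.5 * np.trace(X @ Y)  # E[a.b] = Tr(XY)/d with d = 2
assert np.isclose(corr(X1, Y1), s) and np.isclose(corr(X1, Y2), s)
assert np.isclose(corr(X2, Y1), s) and np.isclose(corr(X2, Y2), -s)

# Expected payoff in the game of Example 7.6.
payoff = 0.25 * (corr(X1, Y1) + corr(X1, Y2) + corr(X2, Y1) - corr(X2, Y2))
print("quantum payoff:", payoff)  # sqrt(2)/2 ≈ 0.707
```

This is the familiar CHSH violation: each of the four correlations has the sign the game wants, at magnitude $\sqrt{2}/2$.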
From this example we see that quantum players strictly outperform their classical peers.
How well can they do? The optimal expected payoff for quantum players is given by
\[
w^*(G) = \sup_{\substack{X_i, Y_j \in \mathbb{R}^{d \times d} \\ X_i^2 = Y_j^2 = I \\ X_i = X_i^T,\ Y_j = Y_j^T}} \sum_{i,j} A_{ij} \cdot \frac{1}{d} \mathrm{Tr}(X_i Y_j)
\;\le\;
\sup_{\substack{\vec u_i, \vec v_j \in \mathbb{R}^{d^2} \\ \|\vec u_i\| = \|\vec v_j\| = 1}} \sum_{i,j} A_{ij}\, \vec u_i \cdot \vec v_j = \mathrm{SDP}(A).
\]
This inequality holds because we can set $\vec u_i = \frac{1}{\sqrt{d}} \mathrm{vec}(X_i)$ and $\vec v_j = \frac{1}{\sqrt{d}} \mathrm{vec}(Y_j)$, where $\mathrm{vec}(\cdot)$
returns the vector concatenating all the columns of the input matrix. Under this choice,
one can verify that
\begin{align*}
\|\vec u_i\|^2 &= \frac{1}{d}\, X_i \cdot X_i = \frac{1}{d} \mathrm{Tr}(X_i^2) = \frac{1}{d} \mathrm{Tr}(I) = 1, \\
\|\vec v_j\|^2 &= \frac{1}{d}\, Y_j \cdot Y_j = \frac{1}{d} \mathrm{Tr}(Y_j^2) = \frac{1}{d} \mathrm{Tr}(I) = 1, \\
\vec u_i \cdot \vec v_j &= \frac{1}{d}\, X_i \cdot Y_j = \frac{1}{d} \mathrm{Tr}(X_i Y_j),
\end{align*}
where $X \cdot Y = \sum_{k,l} X_{kl} Y_{kl} = \mathrm{Tr}(X^T Y)$ denotes the entrywise inner product of matrices.
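The vectorization identity underlying the inequality can be spot-checked. The sketch below uses random symmetric matrices rather than actual observables, since $\mathrm{vec}(X) \cdot \mathrm{vec}(Y) = \mathrm{Tr}(X^T Y)$ holds for any matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3
# Random symmetric matrices standing in for observables (not necessarily
# satisfying X^2 = I; the vec identity does not need that).
X = rng.standard_normal((d, d)); X = X + X.T
Y = rng.standard_normal((d, d)); Y = Y + Y.T

u = X.flatten(order="F") / np.sqrt(d)  # vec: stack the columns
v = Y.flatten(order="F") / np.sqrt(d)

# vec(X) . vec(Y) = Tr(X^T Y) = Tr(XY) for symmetric X, so u.v = Tr(XY)/d.
assert np.isclose(u @ v, np.trace(X @ Y) / d)
print("u . v equals Tr(XY)/d")
```

With genuine observables ($X^2 = I$) the vectors $u, v$ additionally have unit norm, which is exactly the feasibility condition of the SDP.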
Amazingly, the converse inequality is also true: from vectors one can define observables,
hence the optimal value for the quantum players is exactly the optimum of the SDP. This is
quite a surprising connection, and it has been very helpful in the study of quantum games
and nonlocality. Some simple consequences are:
• The maximum expected payoff of quantum players can be computed efficiently (recall
that for classical players it is NP-hard),
• The best quantum strategy can be found efficiently, as the transformation from vectors
to observables is efficient,
• Quantum players can only achieve a payoff that is at most a constant factor larger than the
best classical payoff.