Computation of Nash Equilibria in Bimatrix Games

7th Athens Colloquium on Algorithms and Complexity (ACAC 2012)
University of Athens, 27-28 August 2012
Spyros Kontogiannis
Computer Science Department
University of Ioannina
Computer Technology Institute
& Press ‘‘Diophantus’’
Monday, August 27 2012
Skeleton of the Talk
1 Bimatrix Games Preliminaries
    Complexity of 2NASH
    Formulations for 2NASH
    The algorithm of Lemke & Howson
2 Polynomial-time Tractable Subclasses
3 Approximability of 2NASH
    Theoretical Analysis
    Experimental Study
4 Conclusions
S. Kontogiannis :
Tractability of NE in Bimatrix Games
[2 / 68]
Bimatrix Games: Representation
2 players: the ROW player and the column player.
m (n) alternative actions for ROW (col) player.
Γ[k, j] = (R_{k,j}, C_{k,j}): The pair of payoffs for the two players, for a given pair (profile) of actions (k, j) ∈ [m] × [n].
Representation: By the m × n bimatrix of normalized payoffs:
$\Gamma = \langle R \in [0,1]^{m \times n},\ C \in [0,1]^{m \times n} \rangle$
[Figure: the m × n bimatrix, with rows 1, ..., k, ..., m (actions of ROW) and columns 1, ..., j, ..., n (actions of col). Cell (k, j) holds the pair (R_{k,j}, C_{k,j}): R_{k,j} is the payoff to the ROW player and C_{k,j} is the payoff to the col player, when ROW plays k and col plays j.]
Size of representation: O(m × n) rational numbers.
Uncorrelated Strategies (for the players)
Each player chooses a strategy, ie, a probability distribution over her action set: a mixed strategy in general, or a pure strategy when all the probability mass is placed on a single action.
The adopted action per player is determined randomly, according to her announced strategy, and independently of the other player's behavior and final choice of action.
Each player is interested in optimizing the expected value of her own payoff, as a function of the announced strategies profile (x, y) of both players:
  - ROW player: $x^T R y = \sum_{i \in [m]} \sum_{j \in [n]} x_i \cdot R_{i,j} \cdot y_j$.
  - column player: $x^T C y = \sum_{i \in [m]} \sum_{j \in [n]} x_i \cdot C_{i,j} \cdot y_j$.
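The two expected-payoff formulas above can be sketched in a few lines of Python. This is only an illustration: the 2 × 2 normalized game and the strategies x, y below are hypothetical and do not come from the talk, and the helper name `expected_payoff` is an assumption.

```python
from fractions import Fraction as F

def expected_payoff(x, A, y):
    """Return x^T A y for mixed strategies x (over rows) and y (over columns)."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Hypothetical 2 x 2 normalized game (payoffs in [0, 1]), for illustration only.
R = [[F(1), F(0)], [F(0), F(1)]]   # payoffs of the ROW player
C = [[F(0), F(1)], [F(1), F(0)]]   # payoffs of the column player
x = [F(1, 2), F(1, 2)]             # mixed strategy of ROW
y = [F(1, 4), F(3, 4)]             # mixed strategy of col

row_payoff = expected_payoff(x, R, y)   # x^T R y
col_payoff = expected_payoff(x, C, y)   # x^T C y
```

Exact rationals (`fractions.Fraction`) are used so that equalities can be tested without floating-point tolerance.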
Correlated Strategies (for the mediator)
The players are allowed to coordinate their choices of actions,
via a credible coordinating device, the mediator.
The mediator chooses an action profile for both players, (k, j) ∈ [m] × [n], according to a correlated strategy, ie, a probability distribution over the entire set of action profiles: $W \in \Delta([m] \times [n])$.
[Figure: the m × n bimatrix with the entry $W_{k,j}$ attached to each cell (k, j): the probability of the mediator proposing the action profile (k, j).]
The mediator recommends, via private channels, the corresponding action to each player in the chosen profile, without revealing it to the opponent. Each player then freely chooses an action, already knowing the mediator's recommendation to her.
The expected value of the payoff to each player, for a correlated strategy W of the mediator, is:
  - ROW player: $\sum_{k \in [m]} \sum_{j \in [n]} W_{k,j} R_{k,j}$
  - column player: $\sum_{k \in [m]} \sum_{j \in [n]} W_{k,j} C_{k,j}$
Uncorrelated vs Correlated Strategies
Any uncorrelated strategies-profile (x, y) ∈ ∆([m]) × ∆([n]) for the players induces an equivalent correlated strategy (wrt expected payoffs) for the mediator: $W = x \cdot y^T \in \Delta([m] \times [n])$.
Some correlated strategies cannot be decomposed into equivalent uncorrelated strategies profiles for the players. Eg, in the following (chicken) game, the correlated strategy [(C, c) : 1/3, (C, d) : 1/3, (D, c) : 1/3, (D, d) : 0] is not decomposable into a pair of independent strategies for the players:
                      col
                 dare     chicken
ROW   DARE       0,0      7,2
      CHICKEN    2,7      6,6
Popular Solution Concepts (I)
DEFINITION: Approximations of Nash Equilibrium
For a normalized bimatrix game ⟨R, C⟩, an uncorrelated strategies profile (x̄, ȳ) is:
ε-approximate Nash equilibrium (ε-NE) iff no player can improve her expected payoff by more than an additive term of ε ≥ 0, by unilaterally changing her strategy, against the given strategy of the opponent: ∀x ∈ ∆([m]), ∀y ∈ ∆([n]),
$\bar{x}^T R \bar{y} \ge x^T R \bar{y} - \varepsilon \ \wedge\ \bar{x}^T C \bar{y} \ge \bar{x}^T C y - \varepsilon$
ε-well-supported approximate Nash equilibrium (ε-WSNE) iff no player assigns positive probability mass to any of her actions whose expected payoff, against the given strategy of the opponent, falls short of the maximum expected payoff by more than an additive term ε ≥ 0: ∀i ∈ [m], ∀j ∈ [n],
$[\bar{x}_i > 0 \rightarrow e_i^T R \bar{y} \ge \max(R\bar{y}) - \varepsilon] \ \wedge\ [\bar{y}_j > 0 \rightarrow \bar{x}^T C e_j \ge \max(\bar{x}^T C) - \varepsilon]$
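As a rough illustration of the two definitions, the sketch below computes the smallest ε for which a given profile is an ε-NE or an ε-WSNE. It relies on the fact that a linear function over a simplex attains its maximum at a pure action, so only pure deviations need to be checked. The function names and the small normalized game are assumptions made for this example, not part of the talk.

```python
from fractions import Fraction as F

def payoff_vectors(R, C, x, y):
    """Return (R y, x^T C): the payoff of each pure action against the opponent."""
    m, n = len(R), len(R[0])
    Ry = [sum(R[i][j] * y[j] for j in range(n)) for i in range(m)]
    xC = [sum(x[i] * C[i][j] for i in range(m)) for j in range(n)]
    return Ry, xC

def eps_ne(R, C, x, y):
    """Smallest eps such that (x, y) is an eps-NE (the larger of the two regrets)."""
    Ry, xC = payoff_vectors(R, C, x, y)
    row = sum(x[i] * Ry[i] for i in range(len(x)))
    col = sum(xC[j] * y[j] for j in range(len(y)))
    return max(max(Ry) - row, max(xC) - col)

def eps_wsne(R, C, x, y):
    """Smallest eps such that (x, y) is an eps-WSNE (worst action in a support)."""
    Ry, xC = payoff_vectors(R, C, x, y)
    row_gap = max(max(Ry) - Ry[i] for i in range(len(x)) if x[i] > 0)
    col_gap = max(max(xC) - xC[j] for j in range(len(y)) if y[j] > 0)
    return max(row_gap, col_gap)

# Hypothetical normalized 2 x 2 game and profile, for illustration only.
R = [[F(1), F(0)], [F(0), F(1)]]
C = [[F(0), F(1)], [F(1), F(0)]]
x = [F(3, 4), F(1, 4)]
y = [F(1, 2), F(1, 2)]

e_ne = eps_ne(R, C, x, y)      # F(1, 4)
e_ws = eps_wsne(R, C, x, y)    # F(1, 2)
```

In this example the profile is a 1/4-NE but only a 1/2-WSNE, which illustrates that the well-supported notion is the more demanding one.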
Some Comments on Nash Equilibria
1 We focus on normalized games, so as to be able to compare different algorithms wrt the quality of approximation they achieve.
2 Any ε-WSNE is also an ε-NE, but the converse is not necessarily true.
3 For ε = 0, ε-NE ↔ ε-WSNE: These are the (exact) Nash equilibria, where each player assigns all her probability mass to actions that are payoff maximizers (best responses) against the given strategy of the opponent.
4 Approximate Nash equilibria are invariant under shifts but are affected by scalings of the payoff matrices. Exact Nash equilibria are also invariant under positive scalings.
Popular Solution Concepts (II)
DEFINITION: Correlated Equilibrium
A correlated strategy W̄ ∈ ∆([m] × [n]) (for the mediator) is a correlated equilibrium iff no player can improve her expected payoff by unilaterally ignoring the recommendation of the mediator, given that the opponent will adopt her own recommendation by the mediator:
$\forall i, k \in [m]:\ \sum_{j \in [n]} (R_{i,j} - R_{k,j}) \bar{W}_{i,j} \ge 0$
and
$\forall j, \ell \in [n]:\ \sum_{i \in [m]} (C_{i,j} - C_{i,\ell}) \bar{W}_{i,j} \ge 0$
Remarks:
1 The space of correlated equilibria is a polytope!!!
2 The existence of a mediator, although helpful in having the players' actions coordinated, raises serious concerns wrt implementation, such as trust, manipulability, objectivity, etc.
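The defining inequalities can be checked directly. The sketch below is a minimal, assumed helper (`is_correlated_equilibrium` is not a name from the talk) applied to the chicken game shown earlier, with index 0 standing for DARE / dare and index 1 for CHICKEN / chicken.

```python
from fractions import Fraction as F

def is_correlated_equilibrium(R, C, W):
    """Check that W is a distribution satisfying both families of CE constraints."""
    m, n = len(R), len(R[0])
    if any(W[i][j] < 0 for i in range(m) for j in range(n)):
        return False
    if sum(sum(row) for row in W) != 1:
        return False
    # ROW is recommended i and considers playing k instead.
    rows_ok = all(sum((R[i][j] - R[k][j]) * W[i][j] for j in range(n)) >= 0
                  for i in range(m) for k in range(m))
    # col is recommended j and considers playing l instead.
    cols_ok = all(sum((C[i][j] - C[i][l]) * W[i][j] for i in range(m)) >= 0
                  for j in range(n) for l in range(n))
    return rows_ok and cols_ok

# Chicken game: index 0 = DARE / dare, index 1 = CHICKEN / chicken.
R = [[0, 7], [2, 6]]
C = [[0, 2], [7, 6]]
W = [[F(0), F(1, 3)], [F(1, 3), F(1, 3)]]              # mass 1/3 on (D,c), (C,d), (C,c)
uniform = [[F(1, 4), F(1, 4)], [F(1, 4), F(1, 4)]]     # illustrative non-equilibrium

w_is_ce = is_correlated_equilibrium(R, C, W)
uniform_is_ce = is_correlated_equilibrium(R, C, uniform)
```

Since there are only m² + n² inequalities plus the simplex constraints, all of them linear in W, the check runs in time polynomial in the size of the game.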
Popular Solution Concepts (III)
DEFINITION: MAXMIN Strategies
A profile of (uncorrelated) strategies (x̄, ȳ) is a MAXMIN profile iff:
$\bar{x} \in \arg\max_{x \in \Delta([m])} \min_{y \in \Delta([n])} x^T R y = \arg\max_{x \in \Delta([m])} \min_{j \in [n]} x^T R e_j$
$\bar{y} \in \arg\max_{y \in \Delta([n])} \min_{x \in \Delta([m])} x^T C y = \arg\max_{y \in \Delta([n])} \min_{i \in [m]} e_i^T C y$
Remarks:
1 Efficiently computable via Linear Programming.
2 Extremely pessimistic predictions: It is possible that a MAXMIN profile is ‘‘too far’’ from any notion of approximate equilibrium (not only as points, but also wrt the approximation guarantee).
AN EXAMPLE: Chicken Game
                      col
                 dare     chicken
ROW   DARE       0,0      7,2
      CHICKEN    2,7      6,6
Two pure Nash equilibria: (D, c) and (C, d).
An additional mixed Nash equilibrium: [D : 1/3, C : 2/3], [d : 1/3, c : 2/3].
Expected payoff per player: 0 · 1/9 + 2 · 2/9 + 7 · 2/9 + 6 · 4/9 = 14/3.
One more (extreme) correlated equilibrium (which are the others?), external to the set conv(NE(R, C)):
[(D, d) : 0, (D, c) : 1/3, (C, d) : 1/3, (C, c) : 1/3]
Expected payoff per player: 2 · 1/3 + 7 · 1/3 + 6 · 1/3 = 5.
MAXMIN profile? (C, c) with payoff 6 per player.
Complexity of Equilibria
Hardness of Computing Equilibria
How hard is it to compute an equilibrium in a bimatrix game?
Correlated equilibria: Their space is a polytope described by a polynomial number of constraints.
$$CE(R, C) = \left\{ \bar{W} \in \Delta([m] \times [n]) \ :\ \begin{array}{l} \forall i, k \in [m],\ \sum_{j \in [n]} (R_{i,j} - R_{k,j}) \bar{W}_{i,j} \ge 0 \\ \forall j, \ell \in [n],\ \sum_{i \in [m]} (C_{i,j} - C_{i,\ell}) \bar{W}_{i,j} \ge 0 \end{array} \right\}$$
MAXMIN profiles: Again polynomial-time solvable, via Linear Programming.
$$MAXMIN(R, C) = \left\{ (\bar{x}, \bar{y}) \ :\ \begin{array}{l} \bar{x} \in \arg\max_{x \in \Delta([m])} \min_{j \in [n]} x^T R e_j \\ \bar{y} \in \arg\max_{y \in \Delta([n])} \min_{i \in [m]} e_i^T C y \end{array} \right\}$$
Nash equilibria: Harder case, a PPAD-hard problem, even for bimatrix games, or for arbitrarily good approximations!!!
Solving Bimatrix Games
Determination of One / All Nash Equilibria
We are interested in solving the following problems, if possible in time polynomial in the representation of the game and / or the produced output:
2NASH:
  - INPUT: A bimatrix game with non-negative rational payoff matrices: $R, C \in \mathbb{Q}_{\ge 0}^{m \times n}$.
  - OUTPUT: Any (exact) Nash equilibrium, ie, any point of NE(R, C).
ALL2NASH:
  - INPUT: As for 2NASH.
  - OUTPUT: All the extreme Nash equilibria, that determine NE(R, C).
Complexity of 2NASH
[Nash (1950)]: Existence of NE points, for finite games.
[Kuhn (1961), Mangasarian (1964), Lemke-Howson (1964), Rosenmüller (1971), Wilson (1971), Scarf (1967), Eaves (1972), Laan-Talman (1979), van den Elzen-Talman (1991), ...]: Algorithms for 2NASH. None of them is provably of polynomial complexity.
[Savani-von Stengel (2004)]: LH may take an exponential number of pivots to converge to a NE point, for any initial choice of the label to be rejected.
[Goldberg-Papadimitriou-Savani (2011)]: Any algorithm for 2NASH based on the homotopy method has no hope of being polynomial-time (unless P = PSPACE).
[CD / DGP (2006)]: PPAD-completeness of kNASH, ∀k ≥ 2.
[Gilboa-Zemel (1989), Conitzer-Sandholm (2003)]: NP-complete to determine special NE points (eg, ∃ more than one NE point? ∃ NE with a lower bound on the payoff of one player? ...)
Formulations of 2NASH
Formulations for Nash Equilibrium Sets (I)
As a Linear Complementarity Problem (LCP)
The set NE(R, C) of Nash equilibria of the bimatrix game ⟨R, C⟩ can be expressed as the space of non-zero feasible solutions to a Linear Complementarity Problem, after proper scaling so that its points become probability-distribution pairs:
$$NE(R, C) \approx LCP(M, q) - \{0\}$$
where $R, C \in \mathbb{R}_{>0}^{m \times n}$,
$$M = \begin{bmatrix} O & -R \\ -C^T & O \end{bmatrix}, \qquad q = \begin{bmatrix} 1_m \\ 1_n \end{bmatrix}, \qquad \text{and}$$
$$LCP(M, q) = \{ (w, z) \ :\ q + Mz = w \ge 0;\ z \ge 0;\ w^T z = 0 \}$$
Remark: The positivity of the payoff matrices is not a substantial constraint, due to the invariance of NE(R, C) under shifting.
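To make the correspondence concrete, the sketch below maps a strategy profile to a candidate LCP point and tests the three LCP conditions. The scaling used here (each strategy divided by the opponent's expected payoff, with z = (χ, ψ)) is an assumption about what "proper scaling" means, and the chicken game is shifted by +1 so that both payoff matrices are positive, which the remark above permits.

```python
from fractions import Fraction as F

def lcp_point(R, C, x, y):
    """Map a strategy profile (x, y) to a candidate LCP point (w, z), z = (chi, psi)."""
    m, n = len(R), len(R[0])
    u = sum(x[i] * R[i][j] * y[j] for i in range(m) for j in range(n))  # x^T R y
    v = sum(x[i] * C[i][j] * y[j] for i in range(m) for j in range(n))  # x^T C y
    chi = [xi / v for xi in x]   # assumed scaling of ROW's strategy
    psi = [yj / u for yj in y]   # assumed scaling of col's strategy
    z = chi + psi
    # w = q + M z, with M = [[O, -R], [-C^T, O]] and q the all-ones vector.
    w = [1 - sum(R[i][j] * psi[j] for j in range(n)) for i in range(m)]
    w += [1 - sum(C[i][j] * chi[i] for i in range(m)) for j in range(n)]
    return w, z

def solves_lcp(w, z):
    """Check w >= 0, z >= 0 and the complementarity condition w^T z = 0."""
    return (all(wi >= 0 for wi in w) and all(zi >= 0 for zi in z)
            and sum(wi * zi for wi, zi in zip(w, z)) == 0)

# Chicken game shifted by +1 (index 0 = DARE / dare, 1 = CHICKEN / chicken).
R = [[1, 8], [3, 7]]
C = [[1, 3], [8, 7]]

mixed_ok = solves_lcp(*lcp_point(R, C, [F(1, 3), F(2, 3)], [F(1, 3), F(2, 3)]))
pure_ok = solves_lcp(*lcp_point(R, C, [F(1), F(0)], [F(0), F(1)]))    # (D, c)
bad_ok = solves_lcp(*lcp_point(R, C, [F(1), F(0)], [F(1), F(0)]))     # (D, d)
```

The mixed and the pure equilibrium both pass, while the non-equilibrium profile (D, d) violates w ≥ 0.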
Formulations for Nash Equilibrium Sets (II)
As a Quadratic Programming Problem (QP)
The space of Nash equilibria can be expressed as the space of optimal solutions in a Quadratic Programming Problem. Eg: [Mangasarian-Stone (1964)]
(MS)  minimize $(r - x^T R y) + (c - x^T C y)$
      s.t. $r \cdot 1 - R y \ge 0$
           $c \cdot 1 - C^T x \ge 0$
           $x \in \Delta_m$
           $y \in \Delta_n$
Remark: The objective function is the sum of two upper bounds on the two players' regrets against the given strategy of the opponent:
  - ROW player: $reg_I(x, y) = \max(R y) - x^T R y$
  - column player: $reg_{II}(x, y) = \max(C^T x) - x^T C y$
Formulations of Nash Equilibrium Sets (III)
An alternative, parameterized QP formulation for NE(R, C):
Marginal distributions of a correlated strategy W: ∀k ∈ [m], ∀j ∈ [n],
$$x_k(W) = \sum_{\ell \in [n]} W_{k,\ell}, \qquad y_j(W) = \sum_{k \in [m]} W_{k,j}$$
[Figure: the m × n matrix W, bordered by its row marginals x_1, ..., x_k, ..., x_m and its column marginals y_1, ..., y_j, ..., y_n.]
THEOREM: [Kontogiannis-Spirakis (2010)]
NE(R, C) is the set of marginal distributions for all the optimal solutions in the following parameterized quadratic program, for any choice of the parameter λ ∈ (0, 1) and Z(λ) = λR + (1 − λ)C:
(KS(λ))  minimize $\sum_i \sum_j W_{i,j} Z(\lambda)_{i,j} - x(W)^T Z(\lambda)\, y(W)$
         s.t. $W \in CE(R, C)$
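The objective of (KS(λ)) compares the payoff of a correlated strategy with the payoff of the product of its marginals. The sketch below only evaluates this objective for two correlated equilibria of the chicken game; it does not solve the quadratic program. The helper names and the choice λ = 1/2 are assumptions for illustration.

```python
from fractions import Fraction as F

def marginals(W):
    """Row marginals x(W) and column marginals y(W) of a correlated strategy."""
    x = [sum(row) for row in W]
    y = [sum(W[k][j] for k in range(len(W))) for j in range(len(W[0]))]
    return x, y

def ks_objective(R, C, W, lam):
    """Objective of (KS(lam)): sum_ij W_ij Z_ij - x(W)^T Z y(W), Z = lam R + (1-lam) C."""
    m, n = len(W), len(W[0])
    Z = [[lam * R[i][j] + (1 - lam) * C[i][j] for j in range(n)] for i in range(m)]
    x, y = marginals(W)
    correlated = sum(W[i][j] * Z[i][j] for i in range(m) for j in range(n))
    product = sum(x[i] * Z[i][j] * y[j] for i in range(m) for j in range(n))
    return correlated - product

# Chicken game (index 0 = DARE / dare, 1 = CHICKEN / chicken).
R = [[0, 7], [2, 6]]
C = [[0, 2], [7, 6]]
lam = F(1, 2)

# Product form of the mixed NE [D: 1/3, C: 2/3], [d: 1/3, c: 2/3].
W_ne = [[F(1, 9), F(2, 9)], [F(2, 9), F(4, 9)]]
# The extreme correlated equilibrium with mass 1/3 on (D,c), (C,d), (C,c).
W_ce = [[F(0), F(1, 3)], [F(1, 3), F(1, 3)]]

val_ne = ks_objective(R, C, W_ne, lam)   # F(0): W_ne equals the product of its marginals
val_ce = ks_objective(R, C, W_ce, lam)   # F(1, 3)
```

Both strategies have the same marginals (1/3, 2/3), yet the objective vanishes only for the product-form strategy.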
Lemke & Howson Algorithm
A Combinatorial Algorithm for 2NASH (I)
LH Algorithm [Lemke-Howson (1964)]
Based on pivoting (like Simplex for LPs).
Exploits the best response polyhedra of the game, that are also used in the LCP formulation of NE(R, C), if we ignore the complementarity conditions:
$$\bar{P} = \{ (y, u) \ :\ 1u - Ry \ge 0;\ 1^T y = 1;\ y \ge 0 \}$$
$$\bar{Q} = \{ (x, v) \ :\ 1v - C^T x \ge 0;\ 1^T x = 1;\ x \ge 0 \}$$
$$P = \{ \psi \ :\ 1 - R\psi \ge 0;\ \psi \ge 0 \}$$
$$Q = \{ \chi \ :\ 1 - C^T \chi \ge 0;\ \chi \ge 0 \}$$
ASSUMPTION (wlog): The payoff matrices are positive.
Labels: ∀(χ, ψ) ∈ Q × P,
$$L(\chi, \psi) = \{ i \in [m] : \chi_i = 0 \} \cup (m + \{ j \in [n] : (C^T \chi)_j = 1 \}) \cup \{ i \in [m] : (R\psi)_i = 1 \} \cup (m + \{ j \in [n] : \psi_j = 0 \})$$
A Combinatorial Algorithm for 2NASH (II)
Crucial Observation: A profile of strategies (x̄, ȳ) is a Nash equilibrium iff the corresponding (non-zero) point (χ̄, ψ̄) ∈ Q × P is completely labeled: all actions appear as labels in L(χ̄, ψ̄).
For non-degenerate games: Only pairs of vertices in Q × P may be completely labeled.
Pseudocode of LH (for non-degenerate games)
1. Initialization: Starting from the completely labeled point (0, 0) (artificial equilibrium), pivot-in (ie, label-out) an arbitrary label from (0, 0).
2. Pivot-in (ie, label-out) the unique double-label, at the ‘‘other’’ polyhedron.
3. if the uniquely missing label is pivoted-out (labeled-in)
4. then return the new, completely labeled point (χ, ψ).
5. else goto step 2.
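The labeling itself is easy to compute. The sketch below implements only the label set L(χ, ψ) and the "completely labeled" test, not the LH pivoting; the function names are assumptions. It uses the chicken game shifted by +1 (so that the payoff matrices are positive), with ROW's actions labeled 1, 2 and col's actions labeled 3, 4.

```python
from fractions import Fraction as F

def labels(R, C, chi, psi):
    """Labels of (chi, psi) in Q x P: ROW actions are 1..m, col actions are m+1..m+n."""
    m, n = len(R), len(R[0])
    CTchi = [sum(C[i][j] * chi[i] for i in range(m)) for j in range(n)]
    Rpsi = [sum(R[i][j] * psi[j] for j in range(n)) for i in range(m)]
    L = {i + 1 for i in range(m) if chi[i] == 0}           # unused ROW actions
    L |= {m + j + 1 for j in range(n) if CTchi[j] == 1}    # col's best responses
    L |= {i + 1 for i in range(m) if Rpsi[i] == 1}         # ROW's best responses
    L |= {m + j + 1 for j in range(n) if psi[j] == 0}      # unused col actions
    return L

def completely_labeled(R, C, chi, psi):
    m, n = len(R), len(R[0])
    return labels(R, C, chi, psi) == set(range(1, m + n + 1))

# Chicken game shifted by +1 (index 0 = DARE / dare, 1 = CHICKEN / chicken).
R = [[1, 8], [3, 7]]
C = [[1, 3], [8, 7]]
zero = [F(0), F(0)]

artificial = completely_labeled(R, C, zero, zero)                  # the point (0, 0)
mixed = completely_labeled(R, C, [F(1, 17), F(2, 17)], [F(1, 17), F(2, 17)])
pure = completely_labeled(R, C, [F(1, 3), F(0)], [F(0), F(1, 8)])  # scaled (D, c)
# A vertex pair adjacent to (0, 0), reached by dropping label 4: label 4 is missing.
one_step = labels(R, C, zero, [F(0), F(1, 8)])
```

The scaled equilibrium points pass the test, while the pair adjacent to the artificial equilibrium has exactly one missing label and one duplicate label, as the pseudocode expects.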
EXAMPLE: Execution of LH
[Figure: step-by-step execution of LH on a 3 × 3 bimatrix game. ROW's actions carry the labels 1, 2, 3 and COL's actions the labels 4, 5, 6; the two best-response polytopes are drawn with the label set of each vertex. Frames: INITIALIZATION, STEP 1 through STEP 8, EXIT.]
Payoff bimatrix of the example (rows 1-3 for ROW, columns 4-6 for COL):
        4        5        6
1      4,6      4,12     4,0
2      0,0      0,4      6,0
3      5,8      0,0      0,13
Why LH Works (for non-degenerate games)
The artificial equilibrium (0, 0) is completely labeled.
The current pair of points, during the execution of LH, has
exactly one missing label and two adjacent edges (one per
polyhedron), for labeling-out each of the only two copies of
the unique double-label.
The ‘‘other ends’’ of these two edges lead to pairs of vertices
with at most one double-label. The only missing label (if any)
is always the one initially labeled-out.
Each edge is traversed towards new vertices, so that no cycles
may appear.
Termination: Starting from one end of a finite path, we move
towards the other end of this unique path, that is necessarily a
completely-labeled pair of vertices.
Polynomial-time Tractable Classes wrt 2NASH
Zero-sum Games: Any profile of strategies is a Nash equilibrium iff it is a MAXMIN profile.
  - [J. von Neumann (1928)]: Existence proof for NE in zero-sum bimatrix games, based on a Fixed Point argument.
  - [Dantzig (1947)]: MAXMIN profiles for bimatrix games are equivalent to a pair of primal-dual linear programs.
  - [Khachiyan (1979), Karmarkar (1984)]: Polynomial tractability of LP.
Constant-rank Games: rank(R + C) = k
  - [Kannan-Theobald (2007)]: Existence of FPTAS for constructing ε-NE points.
Rank-1 Games: rank(R + C) = 1
  - [Adsul-Garg-Mehta-Sohoni (2011)]: Polynomial-time algorithm for determining exact Nash equilibria.
(Very) Sparse Games & Games with Pure Equilibria: Relatively easy cases.
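For the last case, games with pure equilibria, a direct scan of the m × n cells suffices. The sketch below (the function name is an assumption) reports every cell in which each player's payoff is a best response to the other's action, and is tried on the chicken game from the earlier example.

```python
def pure_nash_equilibria(R, C):
    """Return all pure profiles (i, j) where i and j are mutual best responses."""
    m, n = len(R), len(R[0])
    best_for_row = [max(R[i][j] for i in range(m)) for j in range(n)]  # per column
    best_for_col = [max(C[i]) for i in range(m)]                       # per row
    return [(i, j) for i in range(m) for j in range(n)
            if R[i][j] == best_for_row[j] and C[i][j] == best_for_col[i]]

# Chicken game (index 0 = DARE / dare, 1 = CHICKEN / chicken).
R = [[0, 7], [2, 6]]
C = [[0, 2], [7, 6]]

pne = pure_nash_equilibria(R, C)   # [(0, 1), (1, 0)], ie (D, c) and (C, d)
```

The scan takes O(m · n) time, which is linear in the size of the representation.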
Other Tractable Classes?
Constant-sum games are poly-time solvable. Constant-rank games admit an FPTAS.
Another subclass of poly-time solvable games: Mutually-concave games.
The Class of Mutually Concave Games
DEFINITION: Mutually Concave (MC) Games
A bimatrix game ⟨R, C⟩ is mutually concave iff ∃λ ∈ (0, 1) s.t. for $Z(\lambda) = \lambda R + (1 - \lambda) C$, the function $H_\lambda(x, y) \equiv x^T Z(\lambda)\, y$ is concave.
Poly-time Solvability of 2NASH for MC Games
THEOREM:
For an arbitrary normalized bimatrix MC game ⟨R, C⟩, with rational payoff matrices, there exists a (unique, except for some trivial cases) rational number λ* ∈ (0, 1) s.t. for $Z(\lambda^*) = \lambda^* R + (1 - \lambda^*) C$ the following quadratic program is convex, and therefore polynomial-time solvable:
minimize $\sum_i \sum_j W_{i,j} Z(\lambda^*)_{i,j} - x(W)^T Z(\lambda^*)\, y(W)$
s.t. $W \in CE(R, C)$
Written over the players' strategies (x, y), as in (MS), the program of the theorem is shown as:
minimize $\lambda^* \cdot (r - x^T R y) + (1 - \lambda^*) \cdot (c - x^T C y)$
s.t. $r \cdot 1 - R y \ge 0$
     $c \cdot 1 - C^T x \ge 0$
     $x \in \Delta_m$
     $y \in \Delta_n$
Are we done?
NOT YET!!!
It is crucial to be able to recognize in polynomial time whether a bimatrix game belongs to the MC class.
Recognition of Poly-time Solvable Bimatrix Classes
Trivial issue for a game ⟨R, C⟩ that...
  - ...has constant-sum payoffs.
  - ...has constant rank.
  - ...possesses a pure Nash equilibrium.
  - ...is a very sparse win-lose game.
Is there a way to detect mutual concavity also in polynomial time?
YES!!!
Characterizations of Mutual Concavity
PROPOSITION:
A bimatrix game ⟨R, C⟩ is mutually concave iff any of the following properties holds:
1 $\exists \lambda \in (0, 1) : \xi^T Z(\lambda) \psi = 0$, for any pair of directions of change in strategies ξ, ψ for the two players (ie, such that $1^T \xi = 1^T \psi = 0$).
2 $\exists \lambda \in (0, 1) : Z(\lambda) \cdot \psi = 1 \cdot c$, for some constant c and any direction of change, $\psi \in \mathbb{R}^n : 1^T \psi = 0$, for the strategy of the column player.
COROLLARY: Constant-Sum Games & MC Class
Every constant-sum game $\langle A, -A + c \cdot 1 \cdot 1^T \rangle$ is mutually concave.
Question: How do we detect the existence (or not) of the proper λ*-value?
Mutual Concavity for 2 × 2 Games
PROPOSITION:
For any 2 × 2 game ⟨A, B⟩, let $a = A_{1,1} + A_{2,2} - A_{1,2} - A_{2,1}$ and $b = B_{1,1} + B_{2,2} - B_{1,2} - B_{2,1}$. Then, ⟨A, B⟩ is mutually concave iff:
$a = b = 0 \ \vee\ \min\{a, b\} < 0 < \max\{a, b\}$
If $a, b \ne 0$, then the unique value $\lambda^* = \frac{-b}{a - b}$ certifies the mutual concavity of the game (if it holds).
COROLLARY: A Necessary Condition for MC
If an m × n bimatrix game ⟨A, B⟩ is mutually concave, then the following must hold: $\exists \lambda^* \in (0, 1) : \forall 1 \le i < k \le m,\ \forall 1 \le j < \ell \le n$,
$$[\, a_{ik,j\ell} = b_{ik,j\ell} = 0 \,] \ \vee\ \left[\, \max\{a_{ik,j\ell}, b_{ik,j\ell}\} > 0 > \min\{a_{ik,j\ell}, b_{ik,j\ell}\} \ \wedge\ \lambda^* = \frac{-b_{ik,j\ell}}{a_{ik,j\ell} - b_{ik,j\ell}} \,\right]$$
where $a_{ik,j\ell}$ and $b_{ik,j\ell}$ denote the a- and b-values of the 2 × 2 subgame induced by rows i, k and columns j, ℓ.
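The 2 × 2 test translates directly into code. The sketch below is a minimal version under the assumption that payoffs are integers or `Fraction`s (floats would not work with `Fraction(-b, a - b)`); the function name `mc_2x2` is hypothetical. It is tried on a zero-sum game (matching pennies, chosen here for illustration), and on the prisoner's dilemma and chicken games used in the talk.

```python
from fractions import Fraction as F

def mc_2x2(A, B):
    """Return (is_mc, lam) for a 2 x 2 game <A, B> with integer or Fraction payoffs.

    lam is the value -b / (a - b) when a and b are nonzero with opposite signs,
    and None otherwise.
    """
    a = A[0][0] + A[1][1] - A[0][1] - A[1][0]
    b = B[0][0] + B[1][1] - B[0][1] - B[1][0]
    if a == 0 and b == 0:
        return True, None
    if min(a, b) < 0 < max(a, b):
        return True, F(-b, a - b)
    return False, None

# Matching pennies (zero-sum, illustrative): a = 4, b = -4.
pennies = mc_2x2([[1, -1], [-1, 1]], [[-1, 1], [1, -1]])
# Prisoner's dilemma with payoffs -5,-5 / 0,-10 / -10,0 / -1,-1: a = b = 4.
dilemma = mc_2x2([[-5, 0], [-10, -1]], [[-5, -10], [0, -1]])
# Chicken game: a = b = -3.
chicken = mc_2x2([[0, 7], [2, 6]], [[0, 2], [7, 6]])
```

For an m × n game, the necessary condition of the corollary amounts to running this test on every 2 × 2 subgame and checking that all nonzero cases induce the same λ*.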
Examples of MC / non-MC Games
[Figure: six 2 × 2 example games, in two groups. MC games: (a) MC-game, ∀γ > −1; (b) MC version of the PD game; (c) MC-game ∀γ > −1, for arbitrary δ1, δ2, with ε1 and ε2 being a function of them. Non-MC games: (d) Battle of the Sexes; (e) Non-MC version of the PD game; (f) Chicken Game. The individual payoff tables are not recoverable from the transcription.]
Checking Mutual Concavity in Poly-time
ALGORITHM: Detecting MC for non-trivial games
INPUT: $R, C \in \mathbb{R}^{m \times n}$.
(1) Check all 2 × 2 subgames of (R, C), for the induced λ-values.
(2) if all 2 × 2 subgames have zero a- and b-values
(3)   then return (‘‘there is PNE’’)
(4) if there is a unique induced λ*-value by all 2 × 2 subgames with nonzero a- and b-values
(5)   then if $\begin{bmatrix} O & Z(\lambda^*) \\ Z(\lambda^*)^T & O \end{bmatrix}$ is negative semidefinite
(6)     then return (‘‘MC-game’’)
(7) return (‘‘non-MC-game’’)
MC-games vs. Fixed-Rank Games
Even rank-1 games may not be MC-games.
                            prisoner 2
                       betray       silent
PRISONER 1   BETRAY    -5,-5        0,-10
             SILENT    -10,0        -1,-1
There exist MC games that have full rank. Eg, for:
$$Z = \begin{bmatrix}
 1 &  2 & 4 &  8 & 16 & 32 & 64 \\
 0 &  1 & 3 &  7 & 15 & 31 & 63 \\
 2 &  3 & 5 &  9 & 17 & 33 & 65 \\
-1 &  0 & 2 &  6 & 14 & 30 & 62 \\
 3 &  4 & 6 & 10 & 18 & 34 & 66 \\
-2 & -1 & 1 &  5 & 13 & 29 & 61 \\
 4 &  5 & 7 & 11 & 19 & 35 & 67
\end{bmatrix}$$
with rank(Z) = 2, the game ⟨R, C⟩ with $R = I_7$ and $C = \frac{4}{3} Z - \frac{1}{3} R$ is indeed (cf. next slide's characterization) an MC-game, but it has rank(R + C) = 7.
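The rank claims of this example can be checked with exact arithmetic. The sketch below uses a small Gaussian-elimination helper (`rank`, an assumed name) over the rationals. The value λ = 1/4 is not stated on the slide; it is derived here as the weight for which λR + (1 − λ)C reproduces Z, given C = (4/3)Z − (1/3)R.

```python
from fractions import Fraction as F

def rank(A):
    """Rank of a matrix via exact Gaussian elimination over the rationals."""
    M = [[F(v) for v in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

base = [1, 2, 4, 8, 16, 32, 64]        # first row of Z
shifts = [0, -1, 1, -2, 2, -3, 3]      # row i of Z equals base + shifts[i]
Z = [[b + s for b in base] for s in shifts]

idx = range(7)
R = [[F(int(i == j)) for j in idx] for i in idx]                        # R = I_7
C = [[F(4, 3) * Z[i][j] - F(1, 3) * R[i][j] for j in idx] for i in idx]

lam = F(1, 4)  # derived here (not on the slide): lam * R + (1 - lam) * C == Z
Z_lam = [[lam * R[i][j] + (1 - lam) * C[i][j] for j in idx] for i in idx]
S = [[R[i][j] + C[i][j] for j in idx] for i in idx]                     # R + C

rank_Z, rank_S = rank(Z), rank(S)
```

Every row of Z is the first row plus a constant, which is why Z has rank 2 even though R + C has full rank.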
MC Games vs. Strategically Zero-Sum Games
PROPOSITION:
For any m, n ≥ 2 and payoff matrices $R, C \in \mathbb{R}^{m \times n}$, the game ⟨R, C⟩ is an MC-game iff: $\exists \lambda \in (0, 1),\ \exists a \in \mathbb{R}^m,\ \exists d = [0, d_2, \ldots, d_n]^T \in \mathbb{R}^n$:
$\forall j \in [n],\ Z(\lambda)[*, j] = -d_j \cdot 1 + a$
Some Remarks:
1 The MC-class matches the class of strategically-zero-sum (SZS) games of [Moulin-Vial (1978)].
2 Characterizing the SZS-property requires the solution of a large linear program, in time $O(n^6)$.
3 Characterizing the MC-property requires the solution of a much smaller quadratic program, in time $O(n^4)$.
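Reminder...
DEFINITION: Approximate Nash Equilibria
For a normalized bimatrix game $\langle R, C \rangle$, a profile of (uncorrelated) strategies $(\bar{x}, \bar{y})$ is:
An ε-approximate Nash equilibrium (ε-NE) iff no player can improve her expected payoff by more than an additive term $\varepsilon \ge 0$ by unilaterally changing her strategy, against the given strategy of the opponent:
\[
\forall x \in \Delta_m, \forall y \in \Delta_n, \quad [\bar{x}^T R \bar{y} \ge x^T R \bar{y} - \varepsilon] \wedge [\bar{x}^T C \bar{y} \ge \bar{x}^T C y - \varepsilon]
\]
An ε-well supported approximate Nash equilibrium (ε-WSNE) iff no player assigns positive probability mass to actions whose payoff is more than an additive term $\varepsilon \ge 0$ below the maximum payoff she may get against the given strategy of the opponent:
\[
\forall i \in [m], \forall j \in [n], \quad [\bar{x}_i > 0 \rightarrow e_i^T R \bar{y} \ge \max(R\bar{y}) - \varepsilon] \wedge [\bar{y}_j > 0 \rightarrow \bar{x}^T C e_j \ge \max(\bar{x}^T C) - \varepsilon]
\]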
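Both notions can be evaluated directly for a given profile. The sketch below is a minimal illustration with hypothetical helper names; it returns the smallest ε for which a profile is an ε-NE, respectively an ε-WSNE. It uses the fact that a best response can always be found among the pure actions, so the maximum over all mixed deviations equals the maximum entry of the payoff vector.

```python
from fractions import Fraction as F

def mat_vec(M, y):
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]

def vec_mat(x, M):
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def ne_epsilon(R, C, x, y):
    """Smallest eps such that (x, y) is an eps-NE: the larger of the two regrets."""
    row_payoffs = mat_vec(R, y)   # payoff of each pure row against y
    col_payoffs = vec_mat(x, C)   # payoff of each pure column against x
    row_regret = max(row_payoffs) - dot(x, row_payoffs)
    col_regret = max(col_payoffs) - dot(col_payoffs, y)
    return max(row_regret, col_regret)

def wsne_epsilon(R, C, x, y):
    """Smallest eps such that (x, y) is an eps-WSNE: worst gap over the supports."""
    row_payoffs = mat_vec(R, y)
    col_payoffs = vec_mat(x, C)
    row_gap = max(max(row_payoffs) - p for p, xi in zip(row_payoffs, x) if xi > 0)
    col_gap = max(max(col_payoffs) - p for p, yj in zip(col_payoffs, y) if yj > 0)
    return max(row_gap, col_gap)

# Illustrative 2 x 2 game (matching-pennies payoffs scaled to [0, 1]).
R = [[1, 0], [0, 1]]
C = [[0, 1], [1, 0]]
x = [F(3, 4), F(1, 4)]
y = [F(1, 2), F(1, 2)]
print(ne_epsilon(R, C, x, y), wsne_epsilon(R, C, x, y))
```

In this example the profile is a 1/4-NE but only a 1/2-WSNE, which illustrates that the well-supported notion is the stricter one.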
Subexponential-time approximation scheme for ε-WSNE, in time $n^{O(\varepsilon^{-2} \cdot \log n)}$. [Althöfer (1994) / Lipton-Markakis-Mehta (2003)]
Polynomial-time Approximation Algorithms for 2NASH?
ε-NE:
- [Kontogiannis-Panagopoulou-Spirakis (2006)] : 0.75
- [Daskalakis-Papadimitriou-Mehta (2006)] : 0.5
- [Daskalakis-Papadimitriou-Mehta (2007)] : ∼ 0.38
- [Bosse-Byrka-Markakis (2007)] : ∼ 0.36
- [Spirakis-Tsaknakis (2007)] : ∼ 0.3393
- [Kontogiannis-Spirakis (2011)] : ∼ 1/3 + δ, for any constant δ > 0 and symmetric games.
ε-WSNE:
- [Kontogiannis-Spirakis (2007)] : 0.667
- [Fearnley-Goldberg-Savani-Sørensen (2012)] : < 0.667
Question: Can we break any of the bounds, 1/3 for ε-NE and 2/3 for ε-WSNE? Is there a PTAS for either case?
Theoretical Analysis of SYMMETRIC-2NASH Approximations
About Symmetric Bimatrix Games...
$n \times n$ games $\langle R, C \rangle$, where the players may exchange roles (i.e., the payoff matrices are $R = S$, $C = S^T$).
Every finite symmetric game has a symmetric Nash equilibrium (in which all players adopt the same strategy). [Nash (1950)]
For symmetric strategy profiles in symmetric bimatrix games, the players have common expected payoffs and also common expected payoff vectors, against the opponent's strategy.
A formalism for SYMMETRIC-2NASH: For $x = \begin{bmatrix} z \\ z \end{bmatrix}$, $Q = \begin{bmatrix} O & S^T \\ S & O \end{bmatrix}$,
\[
\text{(SMS)} \quad
\begin{array}{ll}
\text{minimize} & f(s, z) = s - z^T S z = s - \frac{1}{2} x^T Q x \\
\text{s.t.:} & -\mathbf{1} s + S z \le \mathbf{0} \\
 & -\mathbf{1}^T z + 1 = 0 \\
 & s \in \mathbb{R}, \; z \in \mathbb{R}^n_{\ge 0}
\end{array}
\]
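At the tightest feasible value s = max(Sz), the objective of (SMS) is exactly the regret of the symmetric profile (z, z). The sketch below evaluates that regret and checks the identity z^T S z = (1/2) x^T Q x numerically; the function names and the 3 × 3 win-lose matrix are illustrative assumptions, not taken from the talk.

```python
from fractions import Fraction as F

def regret(S, z):
    """f(z) = max(S z) - z^T S z, the objective of (SMS) at s = max(S z)."""
    n = len(z)
    Sz = [sum(S[i][j] * z[j] for j in range(n)) for i in range(n)]
    return max(Sz) - sum(zi * v for zi, v in zip(z, Sz))

def half_quadratic_form(S, z):
    """(1/2) x^T Q x for x = [z; z] and Q = [[O, S^T], [S, O]]."""
    n = len(z)
    x = list(z) + list(z)
    Q = [[0] * n + [S[j][i] for j in range(n)] for i in range(n)]   # [O, S^T]
    Q += [list(S[i]) + [0] * n for i in range(n)]                   # [S, O]
    Qx = [sum(Q[p][q] * x[q] for q in range(2 * n)) for p in range(2 * n)]
    return F(1, 2) * sum(xp * v for xp, v in zip(x, Qx))

# Illustrative cyclic win-lose game (rock-paper-scissors pattern).
S = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
uniform = [F(1, 3)] * 3
z = [F(1, 2), F(1, 2), 0]
print(regret(S, uniform), regret(S, z), half_quadratic_form(S, z))
```

A regret of 0 means the symmetric profile is an exact Nash equilibrium, which is the case for the uniform strategy in this example.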
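Necessary Optimality (KKT) Conditions for (SMS)
\[
\text{(KKTSMS)} \quad
\begin{array}{l}
\nabla f(\bar{s}, \bar{z}) = \begin{bmatrix} 1 \\ -S\bar{z} - S^T\bar{z} \end{bmatrix} = \begin{bmatrix} \mathbf{1}^T \bar{w} \\ -S^T \bar{w} + \bar{u} + \mathbf{1}\bar{\zeta} \end{bmatrix} \\[4pt]
0 \le \begin{bmatrix} \bar{w} \\ \bar{z} \end{bmatrix}^T \cdot \begin{bmatrix} \mathbf{1}\bar{s} - S\bar{z} \\ \bar{u} \end{bmatrix} \le \delta \\[4pt]
\bar{s} \in \mathbb{R}, \; S\bar{z} \le \mathbf{1}\bar{s}, \; \mathbf{1}^T \bar{z} = 1, \; \bar{z} \ge \mathbf{0} \\
\bar{w} \ge \mathbf{0}, \; \bar{\zeta} \in \mathbb{R}, \; \bar{u} \ge \mathbf{0}
\end{array}
\]
δ-KKT Points of (SMS): Feasible solutions $(\bar{s}, \bar{z}, \bar{w}, \bar{\zeta}, \bar{u})$ for (KKTSMS).
Some Remarks:
1. The Lagrange multiplier $\bar{w}$ is an alternative strategy for the players, i.e., a point from $\Delta([n])$.
2. $\bar{w}$ is also a δ-approximate best response of the ROW player against the given strategy $\bar{z}$ of the opponent: $\bar{s} \le \bar{w}^T S \bar{z} + \delta$.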
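The conditions of (KKTSMS) can be verified mechanically for a candidate point. The following sketch is an illustrative checker, not the authors' code: the function `kkt_report` and the 2 × 2 example are assumptions made here. It returns the two stationarity residuals, the complementarity gap (which must lie in [0, δ]), and a feasibility flag.

```python
from fractions import Fraction as F

def kkt_report(S, s, z, w, u, zeta):
    """Residuals of (KKTSMS) at a candidate point (illustrative helper)."""
    n = len(z)
    Sz = [sum(S[i][j] * z[j] for j in range(n)) for i in range(n)]
    Stz = [sum(S[j][i] * z[j] for j in range(n)) for i in range(n)]
    Stw = [sum(S[j][i] * w[j] for j in range(n)) for i in range(n)]
    # Stationarity in s: 1 = 1^T w.
    stat_s = 1 - sum(w)
    # Stationarity in z: -Sz - S^T z = -S^T w + u + 1*zeta.
    stat_z = [-Sz[i] - Stz[i] + Stw[i] - u[i] - zeta for i in range(n)]
    # Complementarity gap: w^T (1 s - S z) + z^T u.
    gap = sum(w[i] * (s - Sz[i]) for i in range(n)) + sum(z[i] * u[i] for i in range(n))
    feasible = (all(Sz[i] <= s for i in range(n))
                and sum(z) == 1
                and all(v >= 0 for v in list(z) + list(w) + list(u)))
    return stat_s, stat_z, gap, feasible

# Illustrative exact KKT point: the symmetric equilibrium of a 2 x 2 game.
S = [[0, 1], [1, 0]]
half = F(1, 2)
report = kkt_report(S, s=half, z=[half, half], w=[half, half], u=[0, 0], zeta=-half)
print(report)
```

All residuals vanish here, so the point is an exact (δ = 0) KKT point; a δ-KKT point would show a gap between 0 and δ instead.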
Computing (approximate) KKT Points of QPs
Exact computation of a KKT point of Quadratic Programs: NP-hard problem.
THEOREM: Approximate KKT Points in QP [Ye (1998)]
There is an FPTAS for computing δ-KKT points of an n-variable Quadratic Program, in time:
\[
O\left( \left[ \frac{n^6}{\delta} \log\frac{1}{\delta} + n^4 \log(n) \right] \cdot \left( \log\log\frac{1}{\delta} + \log(n) \right) \right)
\]
Question: What is the quality as a Nash approximation of a δ-KKT point?
A Fundamental Property of (KKTSMS) – (I)
LEMMA: [Kontogiannis-Spirakis (SEA 2011)]
For any $m, n \ge 2$, $S \in [0, 1]^{m \times n}$, and any point $(\max(S\bar{z}), \bar{z}, \bar{w}, \bar{u}, \bar{\zeta}) \in$ (KKTSMS) the following properties hold:
1. $\bar{\zeta} = f(\bar{z}) - \bar{z}^T S \bar{z}$.
2. $2f(\bar{z}) = \bar{w}^T S \bar{w} - \bar{z}^T S \bar{w} - \bar{w}^T \bar{u}$.
3. $2f(\bar{z}) + f(\bar{w}) = RI(\bar{z}, \bar{w}) - \bar{w}^T \bar{u}$.
Some Remarks:
$f(\bar{z}) = \max(S\bar{z}) - \bar{z}^T S \bar{z}$ is either player's regret, for the symmetric profile $(\bar{z}, \bar{z})$.
$RI(\bar{z}, \bar{w}) = \max(S\bar{w}) - \bar{z}^T S \bar{w} \le 1$ is (only) the row player's regret, for the asymmetric profile $(\bar{w}, \bar{z})$.
The third property assures that any (exact) KKT point (is not necessarily itself, but) indicates a 1/3-NE of $\langle S, S^T \rangle$, in normalized games.
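The three properties can be checked on a small instance. The sketch below uses a hand-constructed 3 × 3 normalized game together with an exact KKT point of (SMS); both are illustrative assumptions made here, not data from the talk. In this instance the KKT point itself has regret exactly 1/3, while the multiplier w̄ is an exact equilibrium strategy, so the better of the two meets the 1/3 bound.

```python
from fractions import Fraction as F

def mat_vec(S, v):
    return [sum(S[i][j] * v[j] for j in range(len(v))) for i in range(len(S))]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def f(S, z):
    """Regret of the symmetric profile (z, z): max(S z) - z^T S z."""
    return max(mat_vec(S, z)) - dot(z, mat_vec(S, z))

def row_regret(S, z, w):
    """RI(z, w) = max(S w) - z^T S w."""
    return max(mat_vec(S, w)) - dot(z, mat_vec(S, w))

# Hand-made example (assumption): an exact KKT point with multipliers w, u, zeta.
S = [[0, 0, 0],
     [F(1, 3), 1, 1],
     [0, 0, 0]]
z = [1, 0, 0]
w = [0, 1, 0]
u = [0, F(1, 3), F(2, 3)]
zeta = F(1, 3)

prop1 = zeta == f(S, z) - dot(z, mat_vec(S, z))
prop2 = 2 * f(S, z) == dot(w, mat_vec(S, w)) - dot(z, mat_vec(S, w)) - dot(w, u)
prop3 = 2 * f(S, z) + f(S, w) == row_regret(S, z, w) - dot(w, u)
print(prop1, prop2, prop3, f(S, z), f(S, w))
```

Here 2f(z̄) + f(w̄) = 2/3 ≤ 1, so 3 · min{f(z̄), f(w̄)} ≤ 1 holds with room to spare.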
A Fundamental Property of (KKTSMS) – (II)
Proof of the Lemma
\[
-S\bar{z} - S^T\bar{z} = -S^T\bar{w} + \bar{u} + \mathbf{1}\bar{\zeta}
\]
\[
\Rightarrow \left\{ \begin{array}{l}
-\bar{z}^T S\bar{z} - \bar{z}^T S^T\bar{z} = -\bar{z}^T S^T\bar{w} + \underbrace{\bar{z}^T\bar{u}}_{=0} + \underbrace{\bar{z}^T\mathbf{1}}_{=1}\,\bar{\zeta} \\
-\bar{w}^T S\bar{z} - \bar{w}^T S^T\bar{z} = -\bar{w}^T S^T\bar{w} + \bar{w}^T\bar{u} + \underbrace{\bar{w}^T\mathbf{1}}_{=1}\,\bar{\zeta}
\end{array} \right.
\]
\[
\Rightarrow \left\{ \begin{array}{l}
\bar{\zeta} = -2\bar{z}^T S\bar{z} + \bar{z}^T S^T\bar{w} = f(\bar{z}) - \bar{z}^T S\bar{z} \\
\bar{\zeta} = -\bar{w}^T S\bar{z} - \bar{z}^T S\bar{w} + \bar{w}^T S\bar{w} - \bar{w}^T\bar{u}
\end{array} \right.
\]
(The first line uses $\bar{z}^T S^T\bar{w} = \bar{w}^T S\bar{z} = \max(S\bar{z})$, which follows from complementarity and $\mathbf{1}^T\bar{w} = 1$.)
\[
\Rightarrow \left\{ \begin{array}{l}
\bar{\zeta} = f(\bar{z}) - \bar{z}^T S\bar{z} \\
2f(\bar{z}) = -2\bar{z}^T S\bar{z} + 2\bar{w}^T S\bar{z} = -\bar{z}^T S\bar{w} + \bar{w}^T S\bar{w} - \bar{w}^T\bar{u}
\end{array} \right.
\]
We add $f(\bar{w}) = \max(S\bar{w}) - \bar{w}^T S\bar{w}$ to both sides of the equation:
\[
2f(\bar{z}) + f(\bar{w}) = \max(S\bar{w}) - \bar{w}^T S\bar{w} - \bar{z}^T S\bar{w} + \bar{w}^T S\bar{w} - \bar{w}^T\bar{u}
\]
\[
\Rightarrow \quad 3 \cdot \min\{f(\bar{z}), f(\bar{w})\} \le 2f(\bar{z}) + f(\bar{w}) = RI(\bar{z}, \bar{w}) - \bar{w}^T\bar{u} \le 1
\]
(< 1/3)-NE from a Given Exact KKT Point – (I)
THEOREM: [Kontogiannis-Spirakis (2011)]
Starting from any (exact) KKT point of (KKTSMS) for a normalized symmetric bimatrix game $\langle S, S^T \rangle$, computing a $\left(< \frac{1}{3}\right)$-NE can be done in polynomial time.
Proof Sketch:
$(\max(S\bar{z}), \bar{z}, \bar{w}, \bar{u}, \bar{\zeta})$: The given KKT point, along with the proper Lagrange multipliers.
if $f(\bar{z}) \ne f(\bar{w})$ then $3 \cdot \min\{f(\bar{z}), f(\bar{w})\} < RI(\bar{z}, \bar{w}) - \bar{w}^T\bar{u} \le 1$.
∴ ASSUMPTION 1: $f(\bar{z}) = f(\bar{w}) = \frac{1}{3}$.
if $(\max(S\bar{w}), \bar{w}) \notin$ (KKTSMS) then starting from $(\max(S\bar{w}), \bar{w})$, the next step towards a KKT point will give a (< 1/3)-NE.
∴ ASSUMPTION 2: $(\max(S\bar{w}), \bar{w}) \in$ (KKTSMS).
(< 1/3)-NE from a Given Exact KKT Point – (II)
$(\bar{w}', \bar{u}', \bar{\zeta}')$: The appropriate Lagrange multipliers for $(\max(S\bar{w}), \bar{w}) \in$ (KKTSMS).
From the Basic Lemma, applied now to $(\max(S\bar{w}), \bar{w})$:
\[
2f(\bar{w}) + f(\bar{w}') = RI(\bar{w}, \bar{w}') - (\bar{w}')^T\bar{u}'
\]
Observation 1:
\[
1 = 3f(\bar{z}) = RI(\bar{z}, \bar{w}) - \bar{w}^T\bar{u} \;\Rightarrow\; \max(S\bar{w}) = 1 \,\wedge\, \bar{z}^T S\bar{w} = 0 \,\wedge\, \bar{w}^T\bar{u} = 0
\]
\[
1 = 3f(\bar{w}) = RI(\bar{w}, \bar{w}') - (\bar{w}')^T\bar{u}' \;\Rightarrow\; \max(S\bar{w}') = 1 \,\wedge\, \bar{w}^T S\bar{w}' = 0 \,\wedge\, (\bar{w}')^T\bar{u}' = 0
\]
Observation 2:
\[
\frac{1}{3} = f(\bar{w}) = \max(S\bar{w}) - \bar{w}^T S\bar{w} \;\Rightarrow\; \bar{w}^T S\bar{w} = \frac{2}{3}
\]
(< 1/3)-NE from a Given Exact KKT Point – (III)
if $f(\bar{w}) \ne f(\bar{w}')$ then $3 \min\{f(\bar{w}), f(\bar{w}')\} < 1$.
∴ ASSUMPTION 3: $f(\bar{z}) = f(\bar{w}) = f(\bar{w}') = \frac{1}{3}$.
$(\max(S\bar{z}), \bar{z}) \in$ (KKTSMS):
\[
\left. \begin{array}{r}
-S\bar{z} - S^T\bar{z} + S^T\bar{w} = \bar{u} + \mathbf{1}\bar{\zeta} \\
\bar{\zeta} = f(\bar{z}) - \bar{z}^T S\bar{z}
\end{array} \right\} \Rightarrow
\]
\[
-(\bar{w}')^T S\bar{z} - \bar{z}^T S\bar{w}' + \underbrace{\bar{w}^T S\bar{w}'}_{=0} = (\bar{w}')^T\bar{u} + f(\bar{z}) - \bar{z}^T S\bar{z} \;\Rightarrow
\]
\[
0 \le (\bar{w}')^T\bar{u} = -(\bar{w}')^T S\bar{z} - \bar{z}^T S\bar{w}' - f(\bar{z}) + \bar{z}^T S\bar{z} \;\Rightarrow
\]
\[
(\bar{w}')^T S\bar{z} - \bar{z}^T S\bar{z} \le -\frac{1}{3} - \bar{z}^T S\bar{w}' \le -\frac{1}{3} < 0
\]
(< 1/3)-NE from a Given Exact KKT Point – (IV)
$(\max(S\bar{w}), \bar{w}) \in$ (KKTSMS):
\[
\left. \begin{array}{r}
-S\bar{w} - S^T\bar{w} + S^T\bar{w}' = \bar{u}' + \mathbf{1}\bar{\zeta}' \\
\bar{\zeta}' = f(\bar{w}) - \bar{w}^T S\bar{w} = -\frac{1}{3}
\end{array} \right\} \Rightarrow
\]
\[
-\underbrace{\bar{z}^T S\bar{w}}_{=0} - \underbrace{\bar{z}^T S^T\bar{w}}_{=\max(S\bar{z})} + \bar{z}^T S^T\bar{w}' = \bar{z}^T\bar{u}' - \frac{1}{3} \;\Rightarrow
\]
\[
0 \le \bar{z}^T\bar{u}' = \frac{1}{3} - \max(S\bar{z}) + (\bar{w}')^T S\bar{z} \;\Rightarrow
\]
\[
\underbrace{\max(S\bar{z}) - \bar{z}^T S\bar{z}}_{=f(\bar{z})=\frac{1}{3}} \le \frac{1}{3} + (\bar{w}')^T S\bar{z} - \bar{z}^T S\bar{z} \;\Rightarrow\; (\bar{w}')^T S\bar{z} - \bar{z}^T S\bar{z} \ge 0
\]
∴ if $f(\bar{z}) = f(\bar{w}) = f(\bar{w}') = \frac{1}{3}$ ∧ $(\max(S\bar{z}), \bar{z}), (\max(S\bar{w}), \bar{w}) \in$ (KKTSMS) then
\[
0 \le (\bar{w}')^T S\bar{z} - \bar{z}^T S\bar{z} \le -\frac{1}{3} \qquad /* \text{CONTRADICTION} */
\]
Efficient Computation of $\left(\frac{1}{3} + \delta\right)$-NE
THEOREM:
For any normalized symmetric bimatrix game $\langle S, S^T \rangle$ with rational payoff values, $S \in [0, 1]^{n \times n}$, and any constant $\delta > 0$, it is possible to construct a symmetric $(1/3 + \delta)$-NE point, in time polynomial in the description of the game and quasi-linear in $1/\delta$.
The proof is similar to that for the NE approximability of exact KKT points, only working now with δ-approximate (rather than exact) KKT points of (SMS).
Experimental Study of SYMMETRIC-2NASH Approximations
Experimental Evaluation of 2NASH Approximations
Goal: Try various heuristics for providing approximate NE points in symmetric bimatrix games.
Random Game Generator: Win-Lose symmetric games $\langle R, R^T \rangle$, obtained by rounding a normalized-random game $\langle S, S^T \rangle$ whose entries are normal r.v.s with mean 0 and standard deviation 1. A fine-tuning parameter allows for favoring either sparse or dense win-lose games.
OPTION: Clean the random win-lose game by avoiding...
- "all-ones" rows in R (such a row weakly dominates all other actions of the row player).
- "all-zeros" rows in R (a weakly dominated row, which cannot disturb an approximate NE point).
- (1, 1)-elements in the bimatrix $\langle R, R^T \rangle$ (trivial pure NE points).
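A generator of this kind could look roughly as follows. This is only a sketch under stated assumptions: the slides do not give the rounding rule or the exact form of the fine-tuning parameter, so the `threshold` argument (an entry becomes 1 when its standard-normal sample exceeds it) and the names `random_win_lose` and `is_clean` are hypothetical.

```python
import random

def random_win_lose(n, threshold=0.0, seed=None):
    """n x n 0/1 matrix R; assumed rule: R[i][j] = 1 iff a N(0, 1) sample > threshold.

    A larger threshold favors sparse games, a smaller one favors dense games.
    """
    rng = random.Random(seed)
    return [[1 if rng.gauss(0.0, 1.0) > threshold else 0 for _ in range(n)]
            for _ in range(n)]

def is_clean(R):
    """True iff R avoids all-ones rows, all-zeros rows and (1, 1)-elements of <R, R^T>."""
    n = len(R)
    no_all_ones = all(sum(row) < n for row in R)
    no_all_zeros = all(sum(row) > 0 for row in R)
    # Entry (i, j) of the bimatrix <R, R^T> is the pair (R[i][j], R[j][i]).
    no_one_one = all(not (R[i][j] == 1 and R[j][i] == 1)
                     for i in range(n) for j in range(n))
    return no_all_ones and no_all_zeros and no_one_one

R = random_win_lose(10, threshold=0.5, seed=2012)
print(is_clean(R))
print(is_clean([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))  # cyclic game: clean
```

The sketch only tests cleanliness; how the original generator enforces it (for example by resampling or by repairing entries) is not stated in the slides.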
1st Basic Approach
KKT Points of (SMS) as ε-NE Points
Method: Use any polynomial-time construction algorithm to converge to a KKT point of (SMS).
Our approach (KKTSMS): Use the quadprog (active-set) method of MATLAB to locate a KKT point of (SMS).
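2nd Basic Approach
Reformulation-Linearization Relaxation of (SMS)
Method: Create a (1st-level) LP relaxation of (SMS), based on the RLT method of [Sherali-Adams (1998)].
Our approach (RLTSMS): Solve the following relaxation:
\[
\begin{array}{lll}
\text{minimize} & \alpha - \sum_{i \in [n]} \sum_{j \in [n]} R_{i,j} W_{i,j} & \\
\text{s.t.} & \beta - \sum_j (R_{i,j} + R_{k,j}) \gamma_j + \sum_j \sum_\ell R_{i,j} R_{k,\ell} W_{j,\ell} \ge 0, & i, k \in [n] \\
 & \alpha - \sum_j R_{i,j} x_j - \gamma_k + \sum_j R_{i,j} W_{j,k} \ge 0, & i, k \in [n] \\
 & -x_i - x_j + W_{i,j} \ge -1, & i, j \in [n] \\
 & \gamma_k - \sum_j R_{i,j} \gamma_j \ge 0, & i, k \in [n] \\
 & x_i - \sum_j W_{i,j} = 0, & i \in [n] \\
 & \sum_j x_j = 1, & \\
 & -x_i - \alpha + \gamma_i \ge -1, & i \in [n] \\
 & x_i - \gamma_i \ge 0, & i \in [n] \\
 & \beta - \sum_j R_{i,j} \gamma_j \ge 0, & i \in [n] \\
 & \sum_i \gamma_i - \alpha = 0, & \\
 & \alpha - \beta \ge 0, & \\
 & \beta \ge 0; \; \gamma_i \ge 0, \; i \in [n]; \; W_{i,j} \ge 0, \; i, j \in [n] &
\end{array}
\]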
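3rd Basic Approach
Doubly Positive SDP Relaxation of (SMS)
Method: Create an SDP-relaxation of (SMS), considering also the non-negativity of the produced matrix.
Our approach (DPSDP): Solve the following relaxation:
\[
\begin{array}{lll}
\text{minimize} & \alpha - \sum_i \sum_j R_{i,j} W_{i,j} & \\
\text{s.t.} & \alpha - \sum_j R_{i,j} x_j \ge 0, & i \in [n] \\
 & \sum_j x_j = 1, & \\
 & W_{i,j} - W_{j,i} = 0, & i, j \in [n] \\
 & Z \equiv \begin{bmatrix} W & x \\ x^T & 1 \end{bmatrix} \succeq 0, & \\
 & Z \le \mathbf{1} \cdot \mathbf{1}^T, & \\
 & Z \ge \mathbf{0} \cdot \mathbf{0}^T &
\end{array}
\]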
4th Basic Approach
Marginals of Extreme CE points of (SMS)
Method: Return the marginals of extreme points in the CE-polytope of $\langle R, R^T \rangle$.
Our approach (BMXCEV4): Solve the following relaxation and return the profile with the marginals of the optimum correlated strategy $W \in \Delta([n] \times [n])$:
\[
\begin{array}{lll}
\text{min.} & \sum_i \sum_j [R_{i,j} \cdot R_{j,i}] W_{i,j} & \\
\text{s.t.} & \sum_{j \in [n]} (R_{i,j} - R_{k,j}) W_{i,j} \ge 0, & \forall i, k \in [m] \\
 & \sum_{i \in [m]} \sum_{j \in [n]} W_{i,j} = 1, & \\
 & W_{i,j} \ge 0, & \forall (i, j) \in [m] \times [n]
\end{array}
\]
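Once a correlated strategy W is available, the returned profile is simply its pair of marginals. The sketch below shows that last step together with a check of the row-player constraints listed above; it does not solve the optimization itself, and the helper names and the 2 × 2 example are assumptions made for illustration.

```python
from fractions import Fraction as F

def marginals(W):
    """Row and column marginals (x, y) of a correlated strategy W."""
    x = [sum(row) for row in W]
    y = [sum(W[i][j] for i in range(len(W))) for j in range(len(W[0]))]
    return x, y

def satisfies_row_ce(R, W):
    """Row-player constraints: for all i, k: sum_j (R[i][j] - R[k][j]) * W[i][j] >= 0."""
    m, n = len(R), len(R[0])
    return all(sum((R[i][j] - R[k][j]) * W[i][j] for j in range(n)) >= 0
               for i in range(m) for k in range(m))

# Illustrative 2 x 2 win-lose matrix and a correlated strategy on the off-diagonal.
R = [[0, 1], [1, 0]]
W = [[0, F(1, 2)], [F(1, 2), 0]]
print(satisfies_row_ce(R, W), marginals(W))
print(satisfies_row_ce(R, [[1, 0], [0, 0]]))  # all mass on a losing cell: violated
```

Note that the product of the marginals is in general a different distribution from W, which is why the resulting profile is only an approximate equilibrium candidate.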
Hybrid Approaches
Method: For each game, keep only the best result among various pairs or triples of methods.
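A best-of combination can be sketched as follows: each method proposes a symmetric strategy (or nothing, if it failed on the game), and the hybrid keeps the proposal with the smallest regret. The function names and the labels `method_a`, `method_b`, `method_c` are placeholders chosen here, not the methods or outputs of the experiments.

```python
from fractions import Fraction as F

def symmetric_regret(S, z):
    """Regret max(S z) - z^T S z of the symmetric profile (z, z)."""
    n = len(z)
    Sz = [sum(S[i][j] * z[j] for j in range(n)) for i in range(n)]
    return max(Sz) - sum(zi * v for zi, v in zip(z, Sz))

def best_of(S, candidates):
    """Pick the candidate with the smallest regret; None entries count as unsolved."""
    scored = {name: symmetric_regret(S, z)
              for name, z in candidates.items() if z is not None}
    if not scored:
        return None  # every method left the game unsolved
    name = min(scored, key=scored.get)
    return name, scored[name]

S = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
candidates = {
    "method_a": [1, 0, 0],        # pure strategy, regret 1
    "method_b": [F(1, 3)] * 3,    # uniform strategy, regret 0
    "method_c": None,             # unsolved by this method
}
print(best_of(S, candidates))
```

A game stays unsolved for the hybrid only when all of its component methods fail on it, which is consistent with the hybrids reporting fewer unsolved games than their components.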
Experimental Results for Pure Methods (I)
Method     Worst-case ε   #unsolved games   Worst-case round
RLTSMS     0.512432       112999            10950
KKTSMS     0.22222        110070            15484
DPSDP      0.6            0                 16690
BMXCEV4    0.49836        405               12139
Experimental results for worst-case approximation among 500K random 10x10 symmetric win-lose games.
Experimental Results for Pure Methods (II)
Method     Worst-case ε   #unsolved games   Worst-case round
RLTSMS     0.41835        11183             45062
KKTSMS     0.08333        32195             42043
DPSDP      0.51313        0                 55555
BMXCEV4    0.21203        1553              17923
500K random 10x10 symmetric win-lose games, which avoid (1, 1)-elements, (1, *)- and (0, *)-rows.
Experimental Results for Hybrid Methods (I)
Method                    Worst-case ε   #unsolved games   Worst-case round
KKTSMS + BMXCEV4          0.45881        0                 652
KKTSMS + RLT + BMXCEV4    0.47881        0                 1776
KKTSMS + RLT + DPSDP      0.54999        0                 737
500K random 10x10 symmetric win-lose games.
Experimental Results for Hybrid Methods (II)
Method                    Worst-case ε   #unsolved games   Worst-case round
KKTSMS + BMXCEV4          0.08576        0                 157185
KKTSMS + RLT + BMXCEV4    0.08576        0                 397418
KKTSMS + RLT + DPSDP      0.28847        0                 186519
500K random 10x10 symmetric win-lose games, which avoid (1, 1)-elements, (1, *)- and (0, *)-rows.
Distribution of Solved Games (I)
[Figure (a) RLTSMS: histogram "RLT-10X10-10K-WINLOSE", #games solved (vertical axis) against epsilon value (horizontal axis).]
Distribution of games solved for particular values of approximation, in runs of 10K random 10x10 symmetric win-lose games. The games that remain unsolved in each case accumulate at the epsilon value 1.
Distribution of Solved Games (II)
[Figure (b) KKTSMS: histogram "KKTSMS-10X10-10K-WINLOSE", #games solved (vertical axis) against epsilon value (horizontal axis).]
Distribution of Solved Games (III)
[Figure (c) DPSDP: histogram "DPSDP-10X10-10K-WINLOSE", #games solved (vertical axis) against epsilon value (horizontal axis).]
Distribution of Solved Games (IV)
[Figure (d) BMXCEV4: histogram "BMXCEV4-10X10-10K-WINLOSE", #games solved (vertical axis) against epsilon value (horizontal axis).]
Recap & Open Problems
kNASH is, together with FACTORING, probably one of the most important problems on the boundary between P and NP. [Papadimitriou (2001)]
Even 2NASH seems already too hard to solve.
Is there a PTAS, or a lower bound that, unless something extremely unlikely holds (e.g., P = NP), excludes the existence of a better approximation ratio for ε-NE points?
- Extensive experimentation on randomly constructed win-lose games suggests that 1/3 is probably not the end of the story...
How about ε-WSNE points?
Are there any other, more general subclasses of bimatrix games for which 2NASH is polynomial-time tractable, or at least a PTAS exists?
Thanks for your attention!
Questions / Remarks ?