The Euclidean Matching Problem

Università di Pisa
Scuola di dottorato in Scienze di base "Galileo Galilei"
Dottorato in Fisica, Ciclo XXVII

The Euclidean Matching Problem
Statistical Physics Methods for Optimisation

A Thesis for the degree of Philosophiæ Doctor in Theoretical Physics

Supervisor: Prof. Sergio Caracciolo
Candidate: Gabriele Sicuro

December 17, 2014
Pisa
Contents

Introduction
   A matching history

1. Optimisation, disorder and statistical mechanics
   1.1. Graph Theory
   1.2. Combinatorial Optimisation: generalities
   1.3. Weighted matching problems
      1.3.1. The Hungarian algorithm
      1.3.2. Random matching problems
   1.4. Optimisation and statistical mechanics: replicas for optimisation
      1.4.1. Spin glasses
      1.4.2. The solution of the random matching problem via replicas
   1.5. Models on graphs and Belief propagation
      1.5.1. Factor graph and Belief propagation

2. Euclidean Matching Problems
   2.1. Euclidean optimisation problems
      2.1.1. Euclidean optimisation and randomness: empirical measures
      2.1.2. Asymptotics for Euclidean functionals
   2.2. Euclidean matching problems
   2.3. The Monge–Kantorovič transport problem
   2.4. One-dimensional Ebmp: an exact solution for convex weights
      2.4.1. Ebmp on the interval
      2.4.2. Ebmp on the circumference
      2.4.3. General solution and universality for p > 1
   2.5. Higher-dimensional Euclidean matching problem
   2.6. Correlation functions for the Ebmp
   2.7. Functional approach
      2.7.1. Uniform density on the hypertorus
      2.7.2. Matching problem on the line
   2.8. Emmp via replicas
      2.8.1. Recovering the mean field approximation
      2.8.2. Polygonal approximation and zero temperature limit

Conclusions

A. The Wiener process and the Brownian bridge process
   A.1. Wiener process and Brownian bridge process
   A.2. Probability distributions

B. The Simplex Method
   B.1. Linear optimisation problems
   B.2. The algorithm

Bibliography
Introduction
A matching history
In 1781 a French geometer, Gaspard Monge (1746–1818), published his Mémoire sur la théorie des déblais et des remblais, in which he discussed the following very simple problem.
Let us suppose that we have a certain number of mines and the same number of deposits. We want to associate each mine to one deposit only (where the production of the considered mine will be transported and stored). How can we perform this matching in such a way that the total transport cost is minimum?
Monge made the quite natural assumption that the transport cost from a certain mine to a certain deposit is a given function of the distance between the mine and the deposit. Moreover, in this formulation of the problem, everything is deterministic: we know where mines and deposits are, we know their distances, and therefore the problem is fixed in all its details. However, the problem of finding an optimal matching between mines and deposits, given their positions e.g. on a chart, is simple in its formulation but quite difficult to solve: if the number of mines is N, we have N! ways to match mines and deposits and we have to select the cheapest one. It is therefore self-evident that, if N is quite large, a brute force approach is not feasible: if we are able to compute the cost of a matching in, let us say, 10⁻³ seconds, for N = 20 we have to wait 7 · 10⁷ years to obtain the solution in the worst case... It is clear therefore that a smarter approach is needed. Only in 1955¹ did Harold Kuhn propose an algorithm, called the Hungarian algorithm, able to solve the problem in a computation time that scales as N³ in the size N of the original problem [29]: we will discuss this algorithm in Section 1.3.1. The Hungarian algorithm proves that the problem is in the so-called P computational complexity class, but still the computation time required to solve it grows quite rapidly as N increases.
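The waiting-time estimate quoted above is easy to verify. The following minimal sketch (Python; not part of the original thesis, and the 10⁻³ seconds per evaluation is the assumption made in the text) reproduces the order of magnitude:

    import math

    N = 20
    seconds_per_matching = 1e-3                  # assumed evaluation time, as in the text
    total_seconds = math.factorial(N) * seconds_per_matching  # N! matchings in the worst case
    years = total_seconds / (365.25 * 24 * 3600)
    print(f"{years:.1e} years")                  # ~7.7e7, i.e. about 7 · 10^7 years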
The idea of an algorithmic solution of the matching problem was interesting by itself in the 18th century, but not really usable, due to the lack of computational resources. Monge therefore reformulated the problem in a "continuum" version, in which the matching between points of different types was replaced by a transport map between two different measures on the same domain, to be found in such a way that a certain functional of the map is minimised: in Measure Theory this problem is called the optimal transport problem. The combinatorial problem became therefore an interesting problem at the edge between Measure Theory and Geometry. This approach was developed in 1938 by Leonid V. Kantorovič (1912–1986): his (dual) reformulation of the problem is of paramount importance in Measure Theory, Economics and Linear Programming, and led him to the Nobel Prize in Economics in 1975 (we will briefly discuss his contributions in Section 2.3). During the last decade the interest in the Theory of Optimal Transport has increased exponentially in the mathematical community, due to the results of Luigi Ambrosio, Luis A. Caffarelli, Alessio Figalli, Cédric Villani and others on the existence and the properties of optimal transport maps (see the recent review paper of Bogachev and Kolesnikov [9] and the monograph by Villani [53]).

¹ Remarkably, in 2006 it was discovered that the problem had been solved by Carl Gustav Jacobi, but his work was published only in 1890, in Latin, and it was ignored at the time.
Both the previous approaches (the continuum one and the combinatorial one) assume that no disorder or randomness is present in the problem (the point positions or the measures are supposed given). When we find the optimal matching between the two sets of points or the optimal map between the two measures, the problem is solved. But we can consider the problem from a different point of view. Let us suppose for example that we have two sets of random points on a certain domain and that we ask for the optimal matching between them, in such a way that a certain functional of the matching itself and of the point positions is minimum. The specific solution in this case is not of great interest: more interestingly, we could ask what the average properties of the optimal matching are (e.g., the average optimal cost) as functions of the parameters of the problem and of the specific cost functional adopted. The problem, called in this version the random Euclidean matching problem, is not trivial at all: indeed, correlation among the distances is present, due to the Euclidean constraint. Marc Mézard and Giorgio Parisi treated this problem in a set of papers published between 1985 and 1988 [35–37, 40]: they considered as a first-order approximation the so-called random matching problem, in which the distances are supposed totally uncorrelated, as a sort of mean field model of the original problem; subsequently, they introduced correlations as corrections to the mean field results, only sketching the complete computation. Their results are remarkable for several reasons: firstly, they were able to give a complete solution of the purely random case, obtaining the correct average optimal cost and its distribution; secondly, their results were obtained using statistical physics techniques developed for the study of disordered systems. Their approach was therefore not rigorous, but the results were completely confirmed sixteen years later with rigorous probability arguments; they showed that statistical physics methods are extremely powerful in treating combinatorial problems when randomness is assumed.
In the present thesis we overview the main results obtained on the Euclidean matching problem in the last fifty years and we present the results of our investigation on this subject. The material is organized in two chapters:

Optimisation and physics In the first chapter we review the two main subjects of this work, optimisation and the physics of disordered systems. On the one hand, the availability in the last decades of powerful computational resources encouraged the development of efficient algorithms to solve difficult optimisation problems on graphs; on the other hand, dating back to the work of Edwards and Anderson [22] on spin glasses, the physics of disordered systems developed a plethora of powerful techniques with an impressive impact in many scientific fields: we will present in particular the contribution of this field to Combinatorial Optimisation.

The Euclidean Matching Problem and our contributions In the second part we deal with the main subject of the thesis, the Euclidean matching problem. We followed very different approaches, each of them able to provide an insight on the problem and its peculiarities. In particular:

• we solved the one-dimensional matching problem with convex cost function using Measure Theory arguments, showing that, in the thermodynamical limit, the solution of the problem is equivalent to a Brownian bridge process [12];

• we proposed a scaling ansatz for the optimal cost in any dimension, obtaining a precise analytical prediction for its scaling with respect to the system size and for the finite size corrections to the average optimal cost (this part of the work was developed with Carlo Lucibello and Giorgio Parisi, from University "La Sapienza" in Rome) [14];

• we computed the correlation function in any dimension for the optimal transport map in the thermodynamical limit, using both a generalization of the previous ansatz and a functional argument that provided a recipe for a very general case [11, 13];

• we developed the computation of Mézard and Parisi for the corrections to the mean field approximation of the Euclidean matching problem; we also showed that there is a formal correspondence between the corrections to the mean field contribution and the finite size corrections in the purely random case, already computed by Parisi and Ratiéville [46]. This part, as the scaling ansatz, was again developed with Carlo Lucibello and Giorgio Parisi.
Chapter 1
Optimisation, disorder and statistical mechanics
Summary of the chapter In the first part of this chapter we collect some well-known mathematical results on Combinatorial Optimisation, Matchings and Graph Theory. The aim of this part is to provide a brief introduction to the subject. As concerns the Section on Graph Theory, we refer to the work of Diestel [19]. The book of Papadimitriou and Steiglitz [43] is a classical introduction to Combinatorial Optimisation in general. We recommend also the monograph of Lovász and Plummer [31] on Matching Theory. In the second part we recall some classical results on Spin Glass Theory. For a pedagogical introduction, we refer to the nice paper of Castellani and Cavagna [15]. A classical book on the subject was published by Mézard, Parisi, and Virasoro [33]. We refer also to the review paper of Binder and Young [8] and to the modern expositions of Nishimori [42] and Fischer and Hertz [25]. We will recall some fundamental results on the application of techniques developed in the Theory of Disordered Systems to optimisation problems, like the Rmmp or the Rbmp: again, this type of investigation was pioneered by Mézard and Parisi [37] in a famous seminal paper. The fruitful application of the celebrated Cavity Method is also discussed. A modern and quite complete exposition of the techniques discussed here can be found in the recent monograph by Mézard and Montanari [34]. The material presented here is a collection and re-elaboration of the main results presented in the above works.
1.1. Graph Theory

In the present thesis we will consider some relevant combinatorial problems defined on graphs. We will therefore give a small introduction to Graph Theory, without claiming to be complete: we will fix the fundamental definitions and the notation to be used throughout the present work.

Fundamental definitions A graph G = Graph(V; E) is a pair of sets (V, E) such that E ⊆ V × V. The elements of V are called vertices (or nodes), whilst the elements of E are usually called edges. We will denote by¹ V = |V| the number of elements of V (sometimes called the order of G) and we will always suppose that V ∈ N is finite unless otherwise specified, N being the set of natural numbers. Moreover, given a vertex v and an edge e, we say that v is incident with e if v ∈ e: in this case e is an edge at v. We will call E(v) the set of edges at v and |E(v)| the degree of v. We say that u, v ∈ V are adjacent if (u, v) ∈ E; we will denote by ∂v the set of vertices adjacent to v. Finally, we define the complete graph K_V as the graph with V vertices in which each vertex is adjacent to all the others.

¹ In the present work, given a set A of N ∈ N elements, we will use the notation |A| = N for the cardinality of the set.
Figure 1.1.1.: Three examples of graphs. (a) An example of separable planar graph. (b) The Sephiroth is an example of a graph with 10 vertices and 22 edges. (c) The Petersen graph is an example of a non-planar graph: its genus is g = 1.
Two graphs G = Graph(V; E) and G′ = Graph(V′; E′) are isomorphic, or G ∼ G′, if a bijection φ : V → V′ exists such that (u, v) ∈ E ⇔ (φ(u), φ(v)) ∈ E′. Finally, given two graphs G = Graph(V; E) and G′ = Graph(V′; E′), if V ⊆ V′ and E ⊆ E′, then we say that G is a subgraph of G′ and G′ is a supergraph of G: in symbols, G ⊆ G′. We say that G is a spanning subgraph of G′ if and only if V = V′. A certain graph G = Graph(V; E), V = {v_i}_{i=1,...,V}, E = {e_i}_{i=1,...,E}, can be represented also by a V × E matrix, called the incidence matrix, B := (b_ij)_ij, in such a way that

b_ij = 1 if v_i ∈ e_j, 0 otherwise.   (1.1.1)

Similarly, we introduce the more commonly used V × V adjacency matrix A := (a_ij)_ij, such that

a_ij = 1 if (v_i, v_j) ∈ E, 0 otherwise.   (1.1.2)
A vertex cover of G is a subset of V such that any edge of G has at least one endpoint in it. The vertex covering number c_V(G) is the smallest possible size of a vertex cover of G. Similarly, an edge cover of G is a subset of E such that any vertex of G is the endpoint of at least one element in it. The edge covering number c_E(G) is the smallest possible size of an edge cover of G.
A directed graph (or digraph) is a graph in which we assign an initial vertex and a terminal vertex to each edge in the edge set: in a digraph, an edge in which the initial vertex and the terminal vertex coincide is called a loop, and corresponds to a non-zero element on the diagonal of the adjacency matrix. A weighted graph is a graph to which we associate a certain function w : E → R; given an edge e, we say that w(e) is the weight of the edge. For a given weighted graph G, we can introduce the weighted adjacency matrix as W := (w(e_ij) a_ij)_ij.
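As a concrete illustration of these definitions, here is a minimal sketch (Python with numpy; the small graph and its weights are arbitrary examples, not taken from the thesis) building the incidence matrix (1.1.1), the adjacency matrix (1.1.2) and the weighted adjacency matrix:

    import numpy as np

    V = 4                                             # vertices 0, 1, 2, 3
    edges = [(0, 1), (1, 2), (2, 3), (0, 3)]          # edge set E
    weights = {(0, 1): 0.5, (1, 2): 1.2, (2, 3): 0.3, (0, 3): 2.0}   # w : E -> R

    B = np.zeros((V, len(edges)))                     # V x E incidence matrix, eq. (1.1.1)
    A = np.zeros((V, V))                              # V x V adjacency matrix, eq. (1.1.2)
    W = np.zeros((V, V))                              # weighted adjacency matrix
    for j, (u, v) in enumerate(edges):
        B[u, j] = B[v, j] = 1
        A[u, v] = A[v, u] = 1
        W[u, v] = W[v, u] = weights[(u, v)]

    print(B, A, W, sep="\n\n")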
Given two graphs G = Graph(V; E) and G′ = Graph(V′; E′) with the same vertex set V, we define

G△G′ := Graph(V_{G△G′}; E△E′),   (1.1.3)

where E△E′ is the symmetric difference between the two edge sets and V_{G△G′} is the set of the vertices that are ends of the edges in E△E′.
Paths, cycles, trees and forests A path P = Graph(V_P; E_P) in a graph G is a particular subgraph P ⊆ G such that V_P = {v_0, ..., v_k} is a set of distinct vertices and the edge set is given by E_P = {(v_0, v_1), (v_1, v_2), ..., (v_{k−1}, v_k)}. We say that the path links v_0 and v_k and has length k. The length of the shortest path joining two vertices u and v of G is called the distance between u and v on G. If any two vertices of a graph G can be connected by a path, we say that G is connected: any graph can be expressed as a union of maximal connected subgraphs (possibly given by single vertices), called components, and a connected graph is a graph with only one component. Given a connected graph G = Graph(V; E), a subset X ⊂ V ∪ E is said to be a separating set if G′ = Graph(V \ X; E \ X) is not connected. If X contains only a single vertex, the vertex is called a cutvertex; similarly, if X contains one edge, we say that the selected edge is a bridge. For k ≥ 3, a cycle C = Graph(V_C; E_C) in a graph G is a subgraph C ⊆ G such that V_C = {v_0, ..., v_{k−1}} is a set of distinct vertices and E_C = {(v_0, v_1), (v_1, v_2), ..., (v_{k−2}, v_{k−1}), (v_{k−1}, v_0)}; we say that such a cycle has length k. The minimum length of a cycle contained in a certain graph G is called the girth of G, whilst the maximum length is called the circumference of G. Clearly, not all graphs contain cycles: an acyclic graph is called a forest, and a connected forest is called a tree. A non-trivial forest (i.e., a forest with |E| ≠ 0) always has vertices of degree 1, called leaves. Given a tree, sometimes a specific vertex is considered special and called the root of the tree: with reference to the root, we define the height of a vertex as the distance of the vertex itself from the root.
Multipartite graphs A graph G = Graph(V; E) is called q-partite if we can partition V into q subsets (or classes), V = ∪_{i=1}^q V_i with V_i ∩ V_j = ∅ for i ≠ j, in such a way that every edge in E connects vertices in different classes: we will denote such a graph as G = Graph(V₁, ..., V_q; E). A q-partite graph is called complete if, given two vertices in two different classes, there exists an edge connecting them; we will denote it by K_{V₁,...,V_q}, V_i := |V_i|. If q = 2 a multipartite graph is called bipartite: bipartite graphs have the characterising property of having no odd cycles.

Figure 1.1.2.: K₃,₃; observe that c_V(K₃,₃) = 3 (for complete bipartite graphs c_V(K_{V,V}) = V).
Matchings Given a certain graph G = Graph(V; E), we say that M = Graph(V_M; E_M) ⊆ G is a matching of size |E_M| in G if any two edges in M have no vertex in common. The size of the largest matching in G, m(G), is called the matching number of G; if V_M = V we say that M is perfect. For a bipartite graph G = Graph(V₁, V₂; E) the König minimax theorem holds:

c_V(G) = m(G).   (1.1.4)

Moreover, Hall proved that G has a matching if and only if, ∀X ⊆ V₁,

|∪_{x∈X} ∂x| ≥ |X|.   (1.1.5)

In particular, if |V₁| = |V₂| = V the previous inequality guarantees the existence of a perfect matching (Marriage theorem): observe that in this case, numbering separately the vertices of each class in the graph, a perfect matching can be expressed as a permutation of V elements. For a proof of the previous statements, see [31]. Finally, given a path P (or a cycle) in G, we say that the path P is M-alternating if the edges of P are alternately in and not in M.
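Hall's condition (1.1.5) can be checked directly by brute force on small instances; a minimal sketch follows (Python; the bipartite graph is an arbitrary example, not from the thesis):

    from itertools import combinations

    # adjacency of a bipartite graph: vertex of V1 -> set of neighbours in V2
    adj = {0: {0, 1}, 1: {1}, 2: {1, 2}}

    def hall_condition(adj):
        """Return True iff |union of boundaries of X| >= |X| for every X in V1, eq. (1.1.5)."""
        V1 = list(adj)
        for r in range(1, len(V1) + 1):
            for X in combinations(V1, r):
                boundary = set().union(*(adj[x] for x in X))
                if len(boundary) < len(X):
                    return False
        return True

    print(hall_condition(adj))   # True: a matching saturating V1 exists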
Planar graphs and Euler formula Let us consider a connected graph G = Graph(V; E) with E = |E| edges and V = |V| vertices. It can be proved that there always exists at least one surface S on which the graph can be embedded through a proper map, in such a way that no edge-crossing appears. In particular, let us consider the set S_G of the surfaces on which we can perform such an operation. We say that g is the genus of the graph if

g(G) = min_{S∈S_G} g(S),   (1.1.6)

where g(S) is the genus of the surface S. By definition, G is planar if g(G) = 0. We say that L + 1 is the number of faces of the graph, i.e., the number of regions delimited by edges, without any edge inside, when the graph is drawn on a surface of genus g(G). Euler proved that, for a graph of genus g, the following relation holds:

V + L − E = 1 − 2g.   (1.1.7)

Figure 1.1.3.: On the left, the complete graph K₈ and an example of perfect matching on it. On the right, the complete bipartite graph K₄,₄ and an example of perfect bipartite matching on it.
1.2. Combinatorial Optimisation: generalities
In the previous section we have introduced some fundamental definitions of Graph Theory.
To introduce our main subject, let us now give a brief survey of Combinatorial Optimisation
from a mathematical point of view.
An instance of an optimisation problem is a pair of mathematical objects, i.e., a space of configurations Ξ and a cost function E : Ξ → R. The space Ξ can in general be a set with a finite, countable or uncountable number of elements. The target is to find the (globally) optimal solution (if it exists), i.e. an element ξ* ∈ Ξ such that

E[ξ*] ≤ E[ξ]   ∀ξ ∈ Ξ.   (1.2.1)

The set of all instances of a certain optimisation problem is, by definition, the optimisation problem itself. Combinatorial optimisation deals with optimisation problems in which the cardinality of Ξ is finite for all their instances, |Ξ| ∈ N: obviously, in this case the problem always has a solution. However, a direct inspection of the optimal solution requires the evaluation of the cost function over all possible elements of Ξ, and this brute-force approach is usually computationally unfeasible.
As a specific example, let us consider the Travelling Salesman Problem (Tsp). In this problem, a (finite) set Σ of N points is given; moreover, to each pair (i, j) of points a weight w_ij ∈ R⁺ is associated. In the Tsp, the space Ξ is given by all possible closed paths ξ passing only once through each point, whilst the cost function is given by

ξ ↦ E_Tsp[ξ] := Σ_{i=1}^N w_{ξ(i)ξ(i+1)}.   (1.2.2)

From a graph theoretical point of view, the Tsp can be expressed as the problem of finding the minimum weight Hamiltonian cycle². Observe that there are ½(N − 1)! possible paths and therefore the direct inspection of the solution by computing all possible values of the cost function requires at least ½(N − 1)! steps, i.e. a huge number of evaluations as N grows: it is clear that a smarter approach is needed.

² A Hamiltonian cycle is a closed path traversing all vertices of a certain graph exactly once.
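The ½(N − 1)! count translates directly into code. A minimal brute-force sketch (Python; the random symmetric weight matrix is chosen purely for illustration) fixes the starting vertex and discards each tour's reversal, so that exactly ½(N − 1)! closed paths are evaluated:

    import numpy as np
    from itertools import permutations

    rng = np.random.default_rng(0)
    N = 8
    W = rng.random((N, N))
    W = (W + W.T) / 2                       # symmetric weights w_ij

    def tour_cost(tour):
        return sum(W[tour[i], tour[(i + 1) % N]] for i in range(N))

    # fix vertex 0 as starting point; requiring p[0] < p[-1] discards reversed tours,
    # so exactly (N - 1)!/2 distinct closed paths are inspected
    best = min(tour_cost((0,) + p) for p in permutations(range(1, N)) if p[0] < p[-1])
    print(best)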
Optimisation problems can be classified in a more fundamental way with reference to their complexity class, i.e., with reference to the number of computational operations (running time) needed to solve the problem, as a function of the size of the instance, through a certain algorithm. Algorithms themselves can be classified as polynomial if the running time is bounded from above by a certain polynomial in the size of the input, and superpolynomial if such a bound does not exist. We say that a certain optimisation problem belongs to the class P if there exists a polynomial algorithm that solves it. There is a larger class of problems, called the NP class, containing non-deterministic polynomial problems, i.e., problems that can be solved in polynomial time by a non-deterministic algorithm. Clearly P ⊆ NP, but it is still unknown whether P = NP or P ≠ NP. The Tsp belongs to the NP class, but no polynomial algorithm to solve it is known.

In the following we will analyse in some detail a specific combinatorial problem belonging to the P class, i.e., the matching problem. As explained below, for our problem there exists a quite efficient polynomial algorithm that provides the solution for a given instance.
1.3. Weighted matching problems

Let us now consider a weighted graph G = Graph(V; E) with weight function w : E → R. In the (monopartite) matching problem we want to find a matching M ⊆ G such that a certain functional E_G[M] is minimum. Usually, the cost functional has the form, in the notation above,

E_G[M] := (1/|E_M|) Σ_{e∈E_M} w(e).   (1.3.1)

If w(e) = 1 ∀e ∈ E the problem is sometimes called the cardinality matching problem. In the relevant case in which G is a bipartite graph, the problem is called the bipartite matching problem or assignment problem. The matching problem, both in its monopartite and in its bipartite version, belongs to the P computational complexity class. As explained in Section 1.3.1, a fast algorithm is available for the solution of the assignment problem on K_{N,N} in polynomial time with respect to the size of the input N. In the following we will consider weighted matching problems on complete graphs only.
1.3.1. The Hungarian algorithm

A lot of different algorithms to treat optimisation problems are available in the literature. The most popular algorithm for the solution of linear optimisation problems is the Simplex Algorithm (see Appendix B); this algorithm is however in general superpolynomial and less efficient than specific algorithms in the solution, e.g., of matching problems. In the present section we will analyse in some detail the celebrated Hungarian algorithm for the solution of the assignment problem on the complete graph³ K_{N,N}, proposed by Kuhn [29] in 1955: this algorithm has known polynomial computational complexity, proving that the assignment problem belongs to the P class.

Let us consider therefore a weighted complete bipartite graph K_{N,N} = Graph(V, U; E), |V| = |U| = N, V = {v_i}_i, U = {u_i}_i. Denoting by w_ij the weight associated to the edge e_ij := (v_i, u_j), let us introduce also the following N × N weight matrix W = (w_ij)_{i,j=1,...,N}.

³ Observe that more general polynomial algorithms are available to solve the matching problem on K_{N,M}, N ≠ M.
A matching on this graph can be represented by a matrix M = (m_ij)_{i,j=1,...,N} such that

m_ij = 1 if e_ij is included in the matching, 0 otherwise;   Σ_{i=1}^N m_ij = Σ_{j=1}^N m_ij = 1.   (1.3.2)

We associate to the matching M the matching cost

E_N[M] := (1/N) Σ_{i,j} w_ij m_ij.   (1.3.3)

The optimal matching cost is therefore given by

E_N := min_M (1/N) Σ_{i,j} w_ij m_ij.   (1.3.4)
Denoting by M* the matrix associated to the optimal matching, we want to find the expression of M* for a given instance W of the problem. To obtain an algorithm for this matrix, observe firstly that our problem is invariant under certain gauge transformations: in particular, given two sets of real values L = {λ_i}_{i=1,...,N} and M = {μ_i}_{i=1,...,N}, we can map w_ij ↦ w′_ij := w_ij − λ_i − μ_j, whilst E_N[M] ↦ E_N[M] + e₀, where e₀ = (1/N) Σ_{i=1}^N (λ_i + μ_i); therefore the optimality of a certain solution is not altered by the gauge transformation. The couple (L, M) defines the gauge transformation; we say that the gauge is proper if w′_ij ≥ 0. Examples of proper gauges are the following ones:

(L¹, M¹) : {λ_i = min_j w_ij, μ_i = 0}_{i=1,...,N},   (1.3.5a)
(L², M²) : {λ_i = 0, μ_i = min_j w_ji}_{i=1,...,N}.   (1.3.5b)

Let us suppose that a proper gauge transformation W → W′ has been performed on our matrix, and let us define the zeroes graph Z = Graph(V; E_Z) ⊆ K_{N,N} as the subgraph of the complete bipartite graph such that e_ij ∈ E_Z ⇔ w′_ij = 0: if Z contains a perfect matching on K_{N,N}, we say that the proper gauge (L, M) is Hungarian. It is clear that to find M* we have to find a suitable Hungarian gauge: observe however that the space of Hungarian gauges is in general quite large. It can be proved that the optimal matching cost is given by

E_N = max_{(L,M) proper} e₀;   (1.3.6)

moreover, a proper gauge (L*, M*) that realizes the previous equality always exists, it is Hungarian, and the corresponding zeroes graph Z admits a matching. This set of nontrivial properties was proved by Egerváry in 1931. On the basis of the König theorem, eq. (1.1.4), and Egerváry's results, Kuhn elaborated the Hungarian algorithm, so called in honour of the two mathematicians, as a sequence of gauge operations on the original weight matrix. His algorithm can be summarised in the flow chart in figure 1.3.1. The output of the algorithm provides a weight matrix W^out = (w^out_ij)_ij with at least N zero entries, in such a way that we can construct a matching matrix M* so that m_ij = 1 ⇒ w^out_ij = 0, corresponding to an optimal matching⁴. The Hungarian algorithm is polynomial and its time complexity is O(N⁴). A modified version by Edmonds and Karp achieves O(N³): it follows that the assignment problem is a "simple" combinatorial problem, but still not trivial at all. Remarkably, the algorithm presented above is deeply connected with the more general theory of the Cavity Method, which will be presented in Section 1.5.

⁴ Observe that different matching solutions could be possible.
Figure 1.3.1.: Hungarian algorithm (flow chart): note that at the first step the gauge transformations (1.3.5) are applied. Input: weight matrix W. (i) Apply the gauges (L¹, M¹) and (L², M²): W → W̃ → W′. (ii) Draw lines through rows and columns of W′ in such a way that all zero entries are covered with the minimum number of lines. (iii) If the number of drawn lines is N, output the weight matrix W^out; otherwise select the minimum uncovered entry c, apply the gauge transformation λ_i = c if the i-th row is uncovered (λ_i = 0 otherwise), μ_i = −c if the i-th column is uncovered (μ_i = 0 otherwise), and repeat from step (ii).
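For completeness, a minimal sketch of how an assignment instance can be solved in practice (Python; scipy's linear_sum_assignment is a standard polynomial-time solver for the same problem, used here purely as an illustration and not as the thesis's own implementation):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(1)
    N = 6
    W = rng.random((N, N))                    # weight matrix of K_{N,N}

    rows, cols = linear_sum_assignment(W)     # optimal assignment (Hungarian-type solver)
    print(list(zip(rows, cols)))              # optimal matching as a permutation
    print(W[rows, cols].sum() / N)            # optimal cost E_N of eq. (1.3.4)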
Figure 1.3.2.: Structure of the unfolding map for a K₄,₄ graph (root φ with children 1, 2, 3, 4, grandchildren 11, 12, 13, 21, ..., 43, and so on). In bold line, an example of matching. The directed edge ~e = (33, 3) is represented with its children ~e₁ = (3, φ), ~e₂ = (3, 31), ~e₃ = (3, 32).
1.3.2. Random matching problems

Given a certain instance of the matching problem, expressed by the weighted adjacency matrix W of the complete graph under analysis, we can find a solution of the problem using well known algorithms, both in the monopartite case and in the bipartite case (e.g., the Hungarian algorithm presented above for the bipartite matching). Here however we are interested in the case in which the weights {w(e)}_{e∈E} on the complete graph are random variables with a certain distribution. In the simplest case all weights are independently and identically distributed random variables with a distribution density ρ(w). The distribution defines an ensemble of random instances and we ask for the typical properties (average optimal cost, optimal cost distribution...) of a given ensemble. In this case we say that we are considering a random matching problem; in particular, in the random monopartite matching problem (Rmmp) the matching problem is defined on a complete graph K_{2N}, N ∈ N, whilst in the random bipartite matching problem (Rbmp) the problem is formulated on the complete bipartite graph K_{N,N}, N ∈ N. In the latter case, the matching can be expressed as a permutation of N elements π ∈ S_N, S_N being the set of all permutations of N elements: indeed, numbering the vertices of each class in K_{N,N} = Graph(V, U; E) as V = {v₁, ..., v_N} and U = {u₁, ..., u_N}, the optimal cost can be expressed as

E_N = min_{π∈S_N} (1/N) Σ_{i=1}^N w_{i π(i)},   w_ij = w((v_i, u_j)).   (1.3.7)

Both the Rmmp and the Rbmp have been investigated using purely combinatorial and probabilistic arguments, but also through statistical physics approaches that proved themselves particularly powerful. In this Section we will summarise some mathematical results: remarkably, many of them were first obtained using non-rigorous statistical physics arguments and were proved rigorously only subsequently. We will consider the Rmmp in Section 1.4, where a replica-based approach is used. Here we discuss, from a probabilistic point of view, some interesting results on the Rbmp.
The Rbmp: Aldous' solution and the Parisi formula

In 2001 Aldous [2] provided a rigorous treatment of the Rbmp on the complete bipartite graph K_{N,N} in the N → ∞ limit, adopting a weight distribution density given by

ρ(w) = e^{−w}.   (1.3.8)

Due to the importance of his solution, we will give a brief sketch of his results.

Aldous introduced an unfolding map Φ⁻¹ to construct, from the weighted complete graph K_{N,N} = Graph(V, U; E), V = {v₁, ..., v_N} and U = {u₁, ..., u_N}, an infinite ordered tree T = Graph(V_T; E_T), in the following way. Let us select a vertex of K_{N,N} at random and let us call it φ: for the sake of simplicity, let us suppose that φ ≡ v_N. We identify the N vertices at the other ends of the edges starting from φ by 1, ..., N. We order them in such a way that w_{φi} < w_{φj} ∀i, j such that i < j. Given a vertex i connected to φ, we denote by ik the vertex connected to i but different from φ such that w_{ki} is the k-th weight in increasing order in the set of w_{li}, l = 1, ..., N − 1. We proceed in this way, constructing an infinite tree like in figure 1.3.2: each vertex of the tree has (N − 1) "children" and each child is associated to one and only one vertex in the original K_{N,N} graph. Taking the limit N → ∞, an element of the infinite tree is denoted by φ ∈ V_T or by ω = ω₁ω₂···ω_d ∈ V_T at the d-th level, where ω_i ∈ N. Observe that the inverse Φ of the unfolding map is a function Φ : V_T → V ∪ U: denoting by e = (ω, ω′) an edge of the tree, there is only one corresponding edge (Φ(ω), Φ(ω′)) ∈ E. As anticipated, Aldous moreover supposed that the weights {w(e)} are independently and identically distributed random variables with probability density (1.3.8). For this reason the constructed tree is called the Poisson-weighted infinite tree. An optimal matching on this tree is given by a matching such that

E_T = min_M Σ_{e∈M} w(e),   (1.3.9)

where the minimum is taken over all possible matchings M ⊆ T on the tree. Observe that the previous quantity is formally divergent in the N → ∞ limit.
Denoting by ~e = (ω, ω′) ∈ E⃗_T a directed edge with tail in ω and head in ω′, E⃗_T being the set of directed edges on the tree, we say that ~e₁, ..., ~e_{N−1} are children of ~e if they are directed edges with tail in ω′. Let us consider a certain edge ~e = (ω, ω′) on the Poisson-weighted infinite tree and let us introduce the tree T_~e, obtained considering the descendants of ~e only, and the forest T^nc_~e, obtained considering all the descendants of ~e with the exclusion of its children. Then, with reference to eq. (1.3.9), we introduce

X(~e) := E_{T_~e} − E_{T^nc_~e}.   (1.3.10)

X(~e) is therefore the difference between an optimal matching cost with and without the children of ~e. We can write E_{T_~e} in the previous expression as

E_{T_~e} = min_{~e_i child} [w(~e_i) + E_{T^nc_{~e_i}} + Σ_{j≠i} E_{T_{~e_j}}].   (1.3.11)

It follows that

X(~e) = min_{~e_i child} [w(~e_i) + E_{T^nc_{~e_i}} + Σ_{j≠i} E_{T_{~e_j}}] − Σ_j E_{T_{~e_j}} = min_{~e_i child} [w(~e_i) − X(~e_i)].   (1.3.12)

Denoting by ~e* the child for which the minimum is attained, ~e* is an edge in the optimal matching and

w(~e*) = X(~e) + X(~e*).   (1.3.13)
From eq. (1.3.12) and assuming a distribution in the form (1.3.8) for the weights w, it follows that X(~e) has a logistic probability distribution density:

p_X(x) = (e^{x/2} + e^{−x/2})^{−2}.   (1.3.14)

From this result and from eq. (1.3.13), Aldous proved a formula obtained by Mézard and Parisi [36] in 1985 using the replica method, i.e. that in the N → ∞ limit

E̅_N → π²/6,   (1.3.15)
where we denoted by •̄ the average over all instances. Aldous' approach gives also the distribution density of the optimal matching cost, already found by Mézard and Parisi, as

ρ_E(x) = x e^x/(e^x − 1)² − 1/(e^x − 1).   (1.3.16)

Finally, Aldous showed that e ∈ E_T is an edge of the optimal matching if and only if

w(e) < X(~e) + X(~e′),   (1.3.17)

where ~e and ~e′ denote the two orientations of the edge e.
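The logistic law (1.3.14) can be checked numerically as the fixed point of the recursion (1.3.12): on the Poisson-weighted infinite tree the children's weights are the points of a rate-one Poisson process, so X = min_i(ξ_i − X_i) in distribution can be iterated over a population of samples. A minimal population-dynamics sketch follows (Python; the population size, cutoff and number of sweeps are arbitrary numerical choices, not from the thesis):

    import numpy as np

    rng = np.random.default_rng(0)
    M, CUT, SWEEPS = 20000, 30.0, 20       # population size, process cutoff, iterations
    pop = np.zeros(M)                      # population of X values

    for _ in range(SWEEPS):
        new = np.empty(M)
        for j in range(M):
            k = rng.poisson(CUT)           # number of Poisson points falling in [0, CUT]
            xs = rng.uniform(0.0, CUT, size=k)
            picks = pop[rng.integers(0, M, size=k)]
            new[j] = np.min(xs - picks) if k > 0 else CUT   # X = min_i (xi_i - X_i)
        pop = new

    # the logistic density (1.3.14) has zero mean and variance pi^2 / 3
    print(pop.mean(), pop.var(), np.pi**2 / 3)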
Aldous' results are valid in the N → ∞ limit, but no information about finite size effects can be obtained from the arguments above. In 1998 Parisi [44] conjectured that for the Rbmp on K_{N,N} the expected cost of the optimal matching, in the hypothesis of exponential distribution (1.3.8), is given by

E̅_N = Σ_{k=1}^N 1/k².   (1.3.18)

This conjecture was generalised by Coppersmith and Sorkin one year later and eventually proved rigorously (we refer to the monograph of Mézard and Montanari [34] for a complete proof).
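Formula (1.3.18) can be tested numerically at small N; in the sketch below (Python, with scipy's linear_sum_assignment as the exact solver; the unnormalised total cost is used, the convention under which Σ_k 1/k² → π²/6 matches (1.3.15)) the Monte Carlo average is compared with the exact sum:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(7)
    N, samples = 10, 2000
    costs = []
    for _ in range(samples):
        W = rng.exponential(size=(N, N))   # i.i.d. exp(1) weights, eq. (1.3.8)
        r, c = linear_sum_assignment(W)    # exact optimal assignment
        costs.append(W[r, c].sum())

    print(np.mean(costs))                              # Monte Carlo average
    print(sum(1.0 / k**2 for k in range(1, N + 1)))    # Parisi formula (1.3.18)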
1.4. Optimisation and statistical mechanics: replicas for optimisation

In the present work we deal with combinatorial problems in which disorder plays an important role, and therefore we are not interested in the properties of a specific instance of our problem, but in its average properties, possibly depending on the parameters of the problem itself. In Statistical Physics, the study of the average properties of systems with a huge number of degrees of freedom in which disorder appears has a long standing tradition. In these systems disorder is usually introduced in the Hamiltonian function of the system, H = H(σ, J), as a set of parameters {J} randomly extracted from a certain distribution: these parameters usually couple to the degrees of freedom {σ} of the system itself. Different techniques were developed to treat these systems properly, depending on whether the disordered parameters are constant on the time scale over which the degrees of freedom of the system fluctuate (quenched disorder) or not (annealed disorder).

In a famous set of seminal papers, Mézard and Parisi [37] discussed the application of the methods of spin glass theory to some optimization problems. In particular they studied the Tsp and the Rmmp. To understand their results, some concepts from the Theory of Disordered Systems are necessary. We will present here a brief survey of Spin Glass Theory and, subsequently, we will reproduce their argument for the Rmmp, neglecting the first order finite size corrections obtained by Parisi and Ratiéville [46]: remarkably, these results on random matching problems, published in 1985, were rigorously proved by Aldous for the Rbmp only in 2001, as explained in Section 1.3.2.
1.4.1. Spin glasses

In the context of the Theory of Disordered Systems, spin glasses played the role of a reference frame in which physicists analysed the peculiar effects of disorder on the behaviour of systems with a large number of degrees of freedom. To summarise the important results obtained in the last forty years, let us briefly introduce the Ising model and the corresponding mean field disordered models. After that, we will discuss the theory of spin glasses, with particular attention to the SK-model and its solution.
Ising model

Let us consider a graph G = Graph(V; E) and suppose that we assign to each node i ∈ V of G an Ising spin variable σ_i ∈ {−1, 1} and to each edge e ≡ (i, j) ∈ E an interaction energy −Jσ_iσ_j, J ∈ R, depending on the value of the Ising spins on the end points of the edge. We introduce also an energy functional for the graph in the following form:

H_G[σ; J, h] := −J Σ_{(ij)∈E} σ_i σ_j − h Σ_{i∈V} σ_i.   (1.4.1)

A model defined as above is an Ising model on the graph G: if G is a complete graph, the model is sometimes called the Curie–Weiss model (in the literature, Ising-type models on complete graphs are also called infinite range models). In the problem of magnetism, the Ising spins represent microscopic magnetic moments on a certain lattice, coupled in a ferromagnetic (J > 0) or antiferromagnetic (J < 0) way and in the presence of a certain external uniform magnetic field h. In eq. (1.4.1) we denoted by σ := {σ_i}_{i∈V} the generic configuration of the system. To each configuration a Gibbs–Boltzmann weight is associated,

μ_G[σ; β, J, h] := e^{−βH_G[σ;J,h]}/Z_G(β, J, h),   Z_G(β, J, h) := Σ_σ e^{−βH_G[σ;J,h]},   (1.4.2)

in such a way that Σ_σ μ_G[σ; β, J, h] = 1. The parameter β⁻¹ > 0 is, as is well known, called the temperature of the system, whilst Z_G(β, J, h) is its partition function. In the following, given a function f := f(σ), we will denote by

⟨f⟩ := Σ_σ f(σ) μ_G[σ; β, J, h].   (1.4.3)

If N := |V| ∈ N, Z_G(β, J, h) is an analytic function of its arguments β, J and h; however, in the limit N → ∞, some singularities may appear at some points (critical points) or surfaces (critical surfaces) in the (β, J, h) space. The existence and location of these critical points or lines is of paramount interest, being the fingerprint of phase transitions, i.e. they identify the interfaces between different macroscopic behaviours of the system. This change of macroscopic behaviour is in general characterized by the sharp change of the value of a certain parameter, called the order parameter. Unfortunately, there are no general criteria to identify the proper order parameter for a certain system.
To be more precise in our exposition, let us consider a hypercubic lattice L^d in d dimensions of side L, in which there are L^d ≡ N vertices and each of them is connected to its 2d nearest neighbours. On this lattice, let us consider an Ising model (fig. 1.4.1a) and let us denote by

H_N[σ; J, h] := −J Σ_{⟨ij⟩} σ_i σ_j − h Σ_{i=1}^N σ_i   and   Z_N(β, J, h) := Σ_σ e^{−βH_N[σ;J,h]}   (1.4.4)

its Hamiltonian functional and its partition function respectively. We denote by ⟨ij⟩ pairs of nearest neighbour spins and we assume also J > 0 (ferromagnetic model). The Gibbs–Boltzmann measure is therefore

μ_N[σ; β, J, h] := e^{−βH_N[σ;J,h]}/Z_N(β, J, h).   (1.4.5)

In the large N limit (i.e., for an infinite lattice) a useful order parameter is the so-called magnetisation:

m(β, J, h) := lim_N (1/N) Σ_i ⟨σ_i⟩ ≡ −lim_N (1/N) ∂F_N(β, J, h)/∂h ∈ [−1, 1],   F_N := −β⁻¹ ln Z_N.   (1.4.6)
Figure 1.4.1.: Pictorial representation of two-dimensional lattice models ((a) Ising model, (b) EA-model) and of a mean field model with Ising variables ((c) SK-model). Different edge colours represent different values of the coupling constants.
Magnetisation is an order parameter in a strict sense: in fact, |m(β, J, h)| ≈ 1 is indicative of the fact that, on average, almost all spins have the same value, whilst |m(β, J, h)| ≈ 0 suggests that the system is disordered.

Suppose now J > 0 fixed. Remarkably, for d > 1 the function m(β, J, h) has a branch cut in the (β, h) plane for h = 0 and β > β_c(d, J), where β_c(d, J) is a certain critical value depending on dimensionality: formally, for d = 1

β_c(1, J) = +∞,   (1.4.7)

whilst for d = 2 a famous duality argument by Kramers and Wannier [28] shows that

β_c(2, J) = ln(1 + √2)/(2J).   (1.4.8)

For d > 2 only numerical estimates of β_c are available. The branch cut of m(β, J, h) is a signal of the non-analyticity of Z_N(β, J, h) in the N → ∞ limit; moreover, for β > β_c(d, J), we have that

lim_{h→0⁺} m(β, J, h) ≠ lim_{h→0⁻} m(β, J, h)   (1.4.9)

and therefore

lim_{h→0⁺} μ_N[σ; β, J, h] ≠ lim_{h→0⁻} μ_N[σ; β, J, h].   (1.4.10)

In other words, on the critical line two different equilibrium measures coexist and we have a spontaneous symmetry breaking of the Z₂ invariance {σ_i} ↦ {−σ_i} of the Hamiltonian functional for h = 0. We say that for β > β_c(d, J) and h = 0 we have two different phases, each of them identified by the value of the order parameter m.
The intuitive concept of pure phase can be formulated more rigorously in the context of Statistical Field Theory [57]. In this context, we simply state that an equilibrium measure μ describes a pure phase if and only if, given two local observables A(x) and B(x) (where x is a graph site),

lim_{δ(x,y)→+∞} [⟨A(x)B(y)⟩ − ⟨A(x)⟩⟨B(y)⟩] = 0,   (1.4.11)

where δ(x, y) is the distance between the node x and the node y in the graph. The previous condition is called the cluster property.
The Ising model is easily solvable on a one dimensional lattice; in 1944 Lars Onsager announced his celebrated solution of the Ising model with h = 0 in d = 2. Unfortunately, no exact solutions are available for d > 2, even in the absence of an external magnetic field. For this reason a proper approximation can be useful to understand, at least qualitatively, the properties of our system. The most used approximation is the mean field approach:
we assume that σ_i = m + δσ_i, where m ≡ m(β, J, h), and we neglect fluctuations in the non-linear terms of the Hamiltonian. For the Ising model on a d-dimensional lattice,

H_N[σ; J, h] = −J Σ_{⟨ij⟩} (m + δσ_i)(m + δσ_j) − h Σ_{i=1}^N σ_i ≈ dJNm² − (2dJm + h) Σ_{i=1}^N σ_i.   (1.4.12)

In this approximation one easily obtains

Z_N(β, J, h) = e^{−βdNJm²} [2 cosh β(Jmz + h)]^N,   m = tanh β(2dJm + h),   (1.4.13)

where z = 2d is the coordination number of the lattice. From the last equation, we have that a non-vanishing magnetisation can be obtained for h = 0 only if 2dβJ > 1, i.e. for β > 1/(2dJ). The critical value obtained,

β_c^mf = 1/(2dJ),   (1.4.14)

is incorrect: however, the mean field theory qualitatively reproduces the critical behaviour of the original model and correctly predicts the existence of a critical temperature.
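The self-consistency equation in (1.4.13) is solved immediately by fixed-point iteration; a minimal sketch follows (Python; the parameter values are arbitrary illustrations):

    import numpy as np

    def magnetisation(beta, J=1.0, d=2, h=0.0, tol=1e-12, max_iter=100000):
        """Iterate m = tanh(beta (2 d J m + h)) of eq. (1.4.13) to a fixed point."""
        m = 0.5                                   # non-trivial starting point
        for _ in range(max_iter):
            m_new = np.tanh(beta * (2 * d * J * m + h))
            if abs(m_new - m) < tol:
                break
            m = m_new
        return m

    beta_c = 1.0 / (2 * 2 * 1.0)                  # beta_c^mf = 1/(2dJ) for d = 2, J = 1
    for beta in (0.8 * beta_c, 1.2 * beta_c, 2.0 * beta_c):
        print(beta, magnetisation(beta))          # m = 0 below beta_c, m > 0 above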
The mean field free energy for the Ising model has the form

(1/N) F_N(β, J, h) = −(1/(βN)) ln Z_N(β, J, h) ≈ −(ln 2)/β + (1/(2β_c^mf))(1 − β/β_c^mf) m² + (β³/(12 (β_c^mf)⁴)) m⁴ + o(m⁴),   (1.4.15)

in which we performed a power expansion in m up to the fourth order (at h = 0). Note that the functional above has a global minimum at m = 0 for β < β_c^mf, which becomes a local maximum for β > β_c^mf; on the other hand, two new minima appear for β > β_c^mf. This simple observation, proposed for the first time by Landau, gives a qualitative explanation of the spontaneous symmetry breaking discussed above.
Edwards–Anderson model and Sherrington–Kirkpatrick model

Let us consider an Ising-type model on a graph G = Graph(V; E) as introduced above. We consider the following Hamiltonian functional:

H_G[σ; {J}, h] := −Σ_{(i,j)∈E} J_ij σ_i σ_j − h Σ_i σ_i.   (1.4.16)

This time, each J_ij is supposed to be extracted, independently of all the others, from a certain given probability distribution density ρ(J_ij), identical for all e = (i, j) ∈ E. The set of values is extracted once and for all (for this reason we say that the disorder is quenched), and usually both positive and negative values of J_ij are admitted. If we consider as graph the hypercubic lattice L^d, the model obtained is called the Edwards–Anderson model (EA-model) and it is an example of a disordered system. A disordered system is a system in which both frustration⁵ and randomness appear. Clearly we are not interested in the properties of a precise realisation of our system, but we have to average in some sense over the possible values of {J}. How to perform this average, however, is a delicate matter. In fact, it turns out that the free energy F_G for a disordered system on a generic graph G is self-averaging, i.e.,

lim_{|V|→∞} (F̄ − F)²̄ / F̄² = 0,   (1.4.17)

where we denote by •̄ the average over the disorder, but the partition function is not. It follows that we have in general to compute the following configurational average:

F̄_G[{J_ij}; β] := −(1/β) ∫ [Π_{(i,j)∈E} ρ(J_ij) dJ_ij] ln Z_G[{J}; β].   (1.4.18)

⁵ A lattice model is frustrated whenever there exists a set of local constraints in conflict with each other. Suppose that we have a cycle of length four in the EA-model in which the product of the coupling constants J along the cycle is negative; then, no configuration of Ising spins capable of minimising simultaneously the energy contribution of all edges separately exists: there is no way to satisfy locally all interactions in such a plaquette.
Replica approach

To perform the integral (1.4.18), a famous and powerful trick, the so-called replica trick, is usually adopted. The manipulation is based on the following identity:

ln x = lim_{n→0} (xⁿ − 1)/n,   x > 0.   (1.4.19)

To proceed with our computation, let us assume that G ≡ K_N is a complete graph on which the Hamiltonian functional (1.4.16) is defined: the model obtained (an infinite range version of the EA-model) is called the Sherrington–Kirkpatrick model (SK-model). In this model the probability distribution density

ρ(J_ij) = √(N/(2πJ²)) e^{−N J²_ij/(2J²)}   (1.4.20)

is assumed⁶. We want to compute the average free energy density

−βf̄ := lim_{N→∞} lim_{n→0} (Z̄ⁿ − 1)/(nN).   (1.4.21)

It follows that

Z̄ⁿ = ∫ [Π_{(ij)∈E} ρ(J_ij) dJ_ij] Σ_{{σ^α}} exp[β Σ_{i<j} J_ij Σ_{α=1}^n σ_i^α σ_j^α + βh Σ_{i=1}^N Σ_{α=1}^n σ_i^α]
   = e^{Nβ²J²n/4} Σ_{{σ^α}} exp[(β²J²/(2N)) Σ_{α<β} (Σ_i σ_i^α σ_i^β)² + βh Σ_{i=1}^N Σ_{α=1}^n σ_i^α]
   = e^{Nβ²J²n/4} ∫ [Π_{α<β} dq_αβ] exp[−(Nβ²J²/2) Σ_{α<β} q²_αβ + N ln z[q]],   (1.4.22)

where we have introduced

z[q] := Σ_{σ¹,...,σⁿ} exp[β²J² Σ_{α<β} q_αβ σ^α σ^β + βh Σ_α σ^α].   (1.4.23)

The exponent of the integrand being proportional to N, we can use the steepest descent method to evaluate the expression. In particular, the extremality condition with respect to the generic element q_αβ of the matrix q := (q_αβ)_αβ gives

q_αβ = (1/(β²J²)) ∂ ln z[q]/∂q_αβ ≡ ⟨σ^α σ^β⟩_z,   (1.4.24)

⁶ The normalisation constant is due to the fact that we want the energy to be an extensive quantity.
where we denoted by

⟨O(σ¹, ..., σⁿ)⟩_z := (1/z) Σ_{σ¹,...,σⁿ} O(σ¹, ..., σⁿ) exp[β²J² Σ_{α<β} q_αβ σ^α σ^β + βh Σ_α σ^α].   (1.4.25)

Let Q := (Q_αβ)_αβ be the solution of eq. (1.4.24). Calling

S[q] := −(β²J²/(4n)) Σ_{α≠β} q²_αβ + (β²J²/4) + (1/n) ln z[q],   H_{(αβ)(γδ)} := ∂²S[q]/(∂q_αβ ∂q_γδ) |_{q=Q},   (1.4.26)

it follows that

Z̄ⁿ ∼ (1/√(det H)) {1 + nN [−(β²J²/(4n)) Σ_{α≠β} Q²_αβ + (β²J²/4) + (1/n) ln z[Q]]}.   (1.4.27)

Notably, the Q_αβ are not merely a set of auxiliary variables: they express a sort of "average superposition" between replicas, being (as can be directly checked)

Q_αβ = ⟨σ_i^α σ_i^β⟩_R,   (1.4.28)

where ⟨•⟩_R is the average with respect to the replicated system whose Hamiltonian function is

H_R := −Σ_{α=1}^n Σ_{i<j} J_ij σ_i^α σ_j^α − h Σ_{α=1}^n Σ_{i=1}^N σ_i^α.   (1.4.29)

The variables Q_αβ play the role of spin glass order parameters, as we will see. If we suppose that all replicas are equivalent (replica symmetric hypothesis),

Q_αβ = ⟨σ_i^α σ_i^β⟩_R = ⟨σ_i^α⟩_R ⟨σ_i^β⟩_R = ⟨σ_i⟩² = q =: q_EA.   (1.4.30)

The quantity q_EA is called the Edwards–Anderson order parameter: we expect that for β → 0 the spins are randomly oriented, and therefore q_EA = 0, whilst in the β → +∞ limit q_EA > 0, having ⟨σ_i⟩ ≠ 0 for each realization.

Stability of the replica symmetric solution

A first question arising about the solution obtained is whether it is stable with respect to fluctuations around the solution Q_αβ = q. From a mathematical point of view, the solution is stable if the Hessian matrix H is positive definite at Q_αβ = q. We have to solve the eigenvalue problem

Hη = λη.   (1.4.31)

We will not reproduce here all the steps of this computation, but only present the results, obtained for the first time by Almeida and Thouless [3]. A direct computation shows that

H_{(αβ)(γδ)} = δ_{αγ}δ_{βδ} − β²J² (⟨σ^α σ^β σ^γ σ^δ⟩_z − ⟨σ^α σ^β⟩_z ⟨σ^γ σ^δ⟩_z).   (1.4.32)

Therefore, the structure of the Hessian matrix is the following:

H_{(αβ)(αβ)} = 1 − β²J² (1 − ⟨σ^α σ^β⟩²_z) ≡ a,   (1.4.33a)
H_{(αβ)(αγ)} = −β²J² (⟨σ^β σ^γ⟩_z − ⟨σ^α σ^β⟩_z ⟨σ^α σ^γ⟩_z) ≡ b,   (1.4.33b)
H_{(αβ)(γδ)} = −β²J² (⟨σ^α σ^β σ^γ σ^δ⟩_z − ⟨σ^α σ^β⟩_z ⟨σ^γ σ^δ⟩_z) ≡ c,   (1.4.33c)

i.e., the Hessian matrix has a particular symmetric form.

Figure 1.4.2.: Structure of the Hessian matrix H for n = 4.

In the paramagnetic phase, being q = 0, a direct computation shows that b = c = 0 and a = 1, so the solution is stable. In the ferromagnetic phase, Almeida and Thouless [3] found, for the eigenvectors symmetric under the exchange of replica indices (i.e. of the form η^αβ = η ∀α, β), the eigenvalues

λ₁ = [1 − a − 4b + 3c ± |1 − a + 4b − 3c|]/2.   (1.4.34)

The eigenvalue λ₂, obtained for the n − 1 eigenvectors of the form η^αβ ≡ η₀(δ_αθ + δ_βθ) + η₁ for a certain θ, is such that λ₂ → λ₁ for n → 0. The third eigenvalue, corresponding to the n(n−3)/2 eigenvectors of the form η^θν ≡ η₁, η^θα = η^να ≡ η₂ and η^αβ ≡ η₃ ∀α, β, for fixed ν and θ, is

λ₃ = a − 2b + c.   (1.4.35)

It can be seen that, for Q_αβ ≡ q, λ₁ is always positive. However,

λ₃ = a − 2b + c > 0 ⇒ 1/(β²J²) > (1/√(2π)) ∫ e^{−w²/2} sech⁴(βJ√q w + βh) dw.   (1.4.36)

The previous inequality defines a region in the (β, h) space in which the replica symmetric solution is stable and, as a consequence, a region in which this solution is unstable. The boundary between these two regions is called the Almeida–Thouless line (AT-line). The eigenvector η₃ corresponding to the eigenvalue λ₃ is called the replicon.
Returning to eq. (1.4.27), a careful computation gives

1/√(det H) = exp[−(1/2) ln λ₁ − ((n−1)/2) ln λ₂ − (n(n−3)/4) ln λ₃]
   ≈ 1 + n [−(1/(2λ₁)) dλ₁/dn + (1/(2λ₂)) dλ₂/dn − (1/2) ln λ₂ + (3/4) ln λ₃]_{n=0} + o(n)
   = 1 − (n/2) [(b − (3/2)c)/λ₁ + ln λ₁ − (3/2) ln λ₃]_{n=0} + o(n).   (1.4.37)

In the n → 0 limit, at first order we simply obtain 1/√(det H) → 1. If all replicas are equivalent, i.e., Q_αβ ≡ q,

−βf̄ = (β²J²/4)(1 − q)² + (1/√(2π)) ∫ e^{−w²/2} ln[2 cosh(βJ√q w + βh)] dw,   (1.4.38)

from which we get

m = (1/√(2π)) ∫ e^{−w²/2} tanh(βJ√q w + βh) dw,   (1.4.39)

whilst the value of q to be inserted in the previous equation can be obtained from the extremization condition on eq. (1.4.38):

q = (1/√(2π)) ∫ e^{−w²/2} tanh²(βJ√q w + βh) dw.   (1.4.40)

The replica approach for the Sherrington–Kirkpatrick model seems to give a complete solution of the model, and some information can be extracted from a proper Landau theory on eq. (1.4.38). However, our solution presents an anomalous behaviour in the β → +∞ limit. For example, a direct computation of the entropy density s = β² ∂f̄/∂β gives s → −1/(2π) for β → +∞. The origin of these pathological behaviours (instability, negative entropy) was identified and fixed by Parisi [45].
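The replica symmetric saddle-point equation (1.4.40) is a one-dimensional self-consistency condition and can be solved by iteration, using Gauss–Hermite quadrature for the Gaussian integral; a minimal sketch follows (Python; the parameters and the quadrature order are arbitrary choices):

    import numpy as np

    def rs_q(beta, J=1.0, h=0.0, nodes=80, iters=5000):
        """Iterate q = Int Dw tanh^2(beta J sqrt(q) w + beta h), eq. (1.4.40)."""
        x, w = np.polynomial.hermite_e.hermegauss(nodes)   # nodes/weights for e^{-x^2/2}
        w = w / np.sqrt(2.0 * np.pi)                       # normalise to the Gaussian measure
        q = 0.5                                            # non-trivial starting point
        for _ in range(iters):
            q = float(np.sum(w * np.tanh(beta * J * np.sqrt(q) * x + beta * h) ** 2))
        return q

    for beta in (0.5, 1.5, 3.0):
        print(beta, rs_q(beta))   # q = 0 for beta J < 1 at h = 0, q > 0 beyond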
Replica symmetry breaking and Parisi's solution In the replica symmetric hypothesis, the matrix Q = (Q_αβ)_αβ can be represented in the following way (in the example, n = 12):

    Q = [ 0 q q q q q q q q q q q
          q 0 q q q q q q q q q q
          q q 0 q q q q q q q q q
          q q q 0 q q q q q q q q
          q q q q 0 q q q q q q q
          q q q q q 0 q q q q q q
          q q q q q q 0 q q q q q
          q q q q q q q 0 q q q q
          q q q q q q q q 0 q q q
          q q q q q q q q q 0 q q
          q q q q q q q q q q 0 q
          q q q q q q q q q q q 0 ]   (1.4.41)
Here we impose Q_αα = 0 ∀α and Q_αβ = q for α ≠ β, under the hypothesis that the invariance under the exchange of replicas is satisfied. Parisi suggested to break this symmetry in a sequence of steps. At the first step we choose an integer m₁ such that all replicas are divided into n/m₁ groups: if α and β belong to the same group, then Q_αβ = q₁, otherwise Q_αβ = q₀. The matrix Q has the following form (in the example, n = 12 and m₁ = 4):

    Q = [ 0  q1 q1 q1 q0 q0 q0 q0 q0 q0 q0 q0
          q1 0  q1 q1 q0 q0 q0 q0 q0 q0 q0 q0
          q1 q1 0  q1 q0 q0 q0 q0 q0 q0 q0 q0
          q1 q1 q1 0  q0 q0 q0 q0 q0 q0 q0 q0
          q0 q0 q0 q0 0  q1 q1 q1 q0 q0 q0 q0
          q0 q0 q0 q0 q1 0  q1 q1 q0 q0 q0 q0
          q0 q0 q0 q0 q1 q1 0  q1 q0 q0 q0 q0
          q0 q0 q0 q0 q1 q1 q1 0  q0 q0 q0 q0
          q0 q0 q0 q0 q0 q0 q0 q0 0  q1 q1 q1
          q0 q0 q0 q0 q0 q0 q0 q0 q1 0  q1 q1
          q0 q0 q0 q0 q0 q0 q0 q0 q1 q1 0  q1
          q0 q0 q0 q0 q0 q0 q0 q0 q1 q1 q1 0 ]   (1.4.42)

We can proceed further, subdividing each one of the m₁ groups of replicas into m₂ subgroups, and so on. We finally obtain a set of integer numbers n ≥ m₁ ≥ m₂ ≥ ··· ≥ 1 and a gradually finer subdivision of the matrix Q, in which at the k-th step of symmetry breaking we admit k + 1 possible different q values (in the example below, n = 12, m₁ = 4, m₂ = 2):
    [rs matrix, all off-diagonal entries q, as in (1.4.41)]
  ⇒ [1rsb matrix with entries q₁ (inside the m₁ × m₁ diagonal blocks) and q₀ (outside), as in (1.4.42)]
  ⇒ [ 0  q2 q1 q1 q0 q0 q0 q0 q0 q0 q0 q0
      q2 0  q1 q1 q0 q0 q0 q0 q0 q0 q0 q0
      q1 q1 0  q2 q0 q0 q0 q0 q0 q0 q0 q0
      q1 q1 q2 0  q0 q0 q0 q0 q0 q0 q0 q0
      q0 q0 q0 q0 0  q2 q1 q1 q0 q0 q0 q0
      q0 q0 q0 q0 q2 0  q1 q1 q0 q0 q0 q0
      q0 q0 q0 q0 q1 q1 0  q2 q0 q0 q0 q0
      q0 q0 q0 q0 q1 q1 q2 0  q0 q0 q0 q0
      q0 q0 q0 q0 q0 q0 q0 q0 0  q2 q1 q1
      q0 q0 q0 q0 q0 q0 q0 q0 q2 0  q1 q1
      q0 q0 q0 q0 q0 q0 q0 q0 q1 q1 0  q2
      q0 q0 q0 q0 q0 q0 q0 q0 q1 q1 q2 0 ]  (2rsb)  ⇒ ···   (1.4.43)
The idea of evaluating the free energy taking into account different levels of replica symmetry
breaking is not justified by any general principle, but the results obtained are in excellent
agreement with the data. Moreover, not always infinite steps of replica symmetry breaking
(full rsb) are needed, and it happens for certain disordered systems that a finite number
of steps is sufficient to reproduce correctly the behaviour of all interesting quantities. For
18
Chapter 1. Optimisation, disorder and statistical mechanics
the SK-model, numerical evidences show that a full rsb is necessary. To perform it, note
that, at the kth level of rsb,
k
X
1X l
Qαβ =
(mj − mj+1 )qjl ,
n
j=0
m0 ≡ n, mk+1 ≡ 1, l ∈ N.
(1.4.44)
αβ
Let us also define
q l (x) :=
k
X
qil I[mi+1 ,mi ] (x)
(1.4.45)
i=1
where I[mi+1 ,mi ] (x) takes the value 1 if x ∈ [mi+1 , mi ], zero otherwise. In the limit n → 0
we have that n ≥ m1 ≥ m2 · · · ≥ 1 −→ 0 ≤ m1 ≤ · · · ≤ 1, and therefore q(x) is defined in
the interval [0, 1]; taking n → 0 and in the large k limit, mj − mj+1 → − d x and we can
write
Z1
k
X
1X l
l k→+∞
Qαβ =
(mj − mj+1 )qj −−−−−→ − q l (x) d x.
(1.4.46)
n
j=0
αβ
0
From eq. (1.4.27) and eq. (1.4.37), we obtain
− β f¯ ∼ −
1
1
β2J 2 X 2
Qαβ + β 2 J 2 + ln z[Q ].
4
4
n
(1.4.47)
α6=β
and then we get for the energy and the susceptibility
R1
P
2
2
¯ := ∂β f¯ = − βJ2 1 + n1 α6=β Q2αβ −→ − βJ2 1 − 0 q 2 (x) d x
R1
P
χ
¯ := ∂h f¯ = β 1 + n1 α6=β Qαβ −→ β 1 − 0 q(x) d x .
(1.4.48a)
(1.4.48b)
The complete expression for f¯ in the full rsb scheme can be obtained after some computations [42]:


Z1 p
Z1
w2
2 2
1
β
J
2
¯


P
1 + q (x) d x − 2q(1) − √
q(0)w e− 2 d w,
βf = −
4
2π
(1.4.49)
0
0
where the function P (x, h) satisfies the so called Parisi equation:
∂ P (x, h)
∂P (x, h)
J 2 dq(x) ∂ 2 P (x, h)
+x
=−
,
∂x
2 dx
∂h
∂h2
P (1, h) = ln (2 cosh βh) .
(1.4.50)
As before, q must be found extremising the functional (1.4.49): this is a quite difficult task
but numerical simulations are in perfect agreement with Parisi’s solution when the limit
β → +∞ is taken.
The reason of the success of Parisi’s approach is that, in the thermodynamical limit, the
free energy of a spin glass has a multivalley structure in which many minima separated by
indefinitely high barriers appear: the rsb is necessary to probe all these different valleys
simultaneously. It is said that each valley corresponds to a phase (or state) and ergodicity
breaking occurs. To better understand this concept, let us introduce the overlap between
states for a certain realization {J} of the system in the form
Qab :=
N
1 X
hσi ia hσi ib ,
N i=1
(1.4.51)
1.4. Optimisation and statistical mechanics: replicas for optimisation
19
where h•ia denote the average restricted to the state a, and let be
X
ρJ (Q) := hδ(Q − Qab )i =
wa wb δ(Q − Qab ),
(1.4.52)
ab
where wa is the probability of being in the state a. We are interested, as usual, to the
average over the disorder of ρJ ,
(1.4.53)
ρ(Q) := ρJ (Q).
Note that in the Ising model for β → +∞ and h = 0 only two phases are possible and
ρ(Q) = δ(Q−1)
+ δ(Q+1)
. In the full rsb scheme, it can be proved [45] that
2
2
1
Qk
Z
:=
Z
X
1
k
Q ρ(Q) d Q = lim
Qαβ = q k (x) d x,
n→0 n(n − 1)
k
α6=β
(1.4.54)
0
average of all Qαβ , solutions of (1.4.24). Here we used eq. (1.4.46). In particular, Q =
R1
R1
q(x) d x = −1 q dx
dq d q, i.e., denoting by x(q) the inverse function of q(x),
0
Zq
ρ(Q) d Q.
x(q) =
(1.4.55)
−∞
This equation stresses the strong relation between the overlap between different replicas
Qαβ and the overlap between different states Qαβ : in particular, it is sufficient to invert
q(x) to obtain the cumulative distribution of the overlaps.
Thouless–Anderson–Palmer equations
A different but instructive approach was proposed in 1977 by Thouless, Anderson, and
Palmer [52] to obtain a set of equations for the local magnetisation in the SK-model. Given
a certain realisation of the SK-model, the free energy can be written as



X
X
X
− βF [β; {h}; {J}] := ln 
exp β
Jij σi σj + β
hi σi  .
(1.4.56)
{σi }
i
i6=j
Here we suppose that the external magnetic field depends on the site position. As known
from the general theory, the local magnetisation is given by mi := hσi i = −∂hi F . Let us
now perform a Legendre transform, introducing
"
#
X
G[β; {m}; {J}] = max F [β; {h}; {J}] +
mi hi .
(1.4.57)
hi
i
The new functional is a constrained free energy, in the form



X
X
X
− βG[β; {m}; {J}] = ln 
exp β
Jij σi σj + β
hi (mi )(σi − mi ) ,
{σi }
(1.4.58)
i
i6=j
where hi (mi ) are such that hσi i = mi . Let us now expand the previous functional around
β = 0. We obtain
− βG[β; {m}; {J}] =
N X
1 − mi
2
i=1
+β
X
ij
ln
1 − mi
2
Jij mi mj +
1 + mi
−
ln
2
1 + mi
2
β2 X 2
J (1 − m2i )(1 − m2j ) + o(β 2 ). (1.4.59)
2 ij ij
20
Chapter 1. Optimisation, disorder and statistical mechanics
σ1
σ2
σ5
σ3
σ4
(a) B2N .
(b) G2N,5
σ1
σ1
σ2
σ3
σ0
σ2
σ5
σ4
σ5
σ0
σ3
σ4
(d) Example of both link addition σ1 − σ2 ,
and site addition, σ0 .
(c) Addition of the site σ0 .
Figure 1.4.3.: Cavity method. In a pictorial way we stressed that in a Bethe lattice loops at large distance
N
) preserving the frustration property of the disordered system.
are present (of lenght of order ln
ln k
Plefka [48] showed that, for the SK-model, the additional terms can be neglected in the
N → +∞. The minimum condition respect to mi , ∂mi G = 0, gives the celebrated Thouless–
Anderson–Palmer equations (TAP-equations):
 
mi = tanh β 

X
Jij mj + hi − mi
j6=i
X
2
Jij
β(1 − mj )2 
(1.4.60)
j6=i
In the previous expression an additional term appears respect to the Ising mean field magnetisation (1.4.13), i.e. the so called Onsager reaction
hO
i := −mi
X
2
Jij
β(1 − mj )2 .
(1.4.61)
j6=i
This additional term is not negligible in the spin glass case due to the fact that Jij ∼
and not Jij ∼ N1 as in the ferromagnetic case.
√1
N
1.4. Optimisation and statistical mechanics: replicas for optimisation
21
The cavity method
The cavity method provides a set of iterative equations to obtain information on the ground
state of a disordered system: in a quite dated formalism, the method was applied to the
random matching problem by M´ezard and Parisi [35], as we will discuss later. We briefly
present this method for a glass on a Bethe lattice following a more recent paper [39].
Suppose that we have a Bethe lattice BkN = Graph(V; E), i.e. a random graph of N 0
vertices with fixed connectivity k+1 (see figure 1.4.3a). Moreover, let σi be the spin variable
on site i, Jij the random coupling on the link (i, j). We suppose that Jij are identically
and independently distributed random variables with probability distribution density ρ(J),
and that the Hamiltonian has the form
X
HBk [σ; {J}] = −
Jij σi σj .
(1.4.62)
N
(i,j)∈E
Let EN be the global ground state (ggs) energy for this system. Consider a new graph
GkN,q ⊆ BkN obtained from BkN decreasing the degree of q > k spins from k + 1 to k (cavity
spins, see figure 1.4.3b). Fixing the values of these spins to σ1 , . . . , σq , we suppose that the
ggs energy of this new system is related to the old one in the following way
EN (σ1 , . . . , σq ) − EN =
q
X
hi σi ,
(1.4.63)
i=1
where hi are called local cavity fields. Due to the random nature of the system, we assume
that {hi }i are i.i.d. random variables with distribution %(h). Now suppose that a new spin
σ0 is added to the lattice and coupled to k cavity spins σ1 , . . . , σk by J1 , . . . , Jk respectively
(see figure 1.4.3c). We fix the value of σ0 and change the value of σ1 , . . . , σk in such a way
that the new ggs energy is minimized. The energy of each link (i, 0) is minimized taking
i = min [(−hi − Ji σ0 )σi ] =: −a(Ji , hi ) − σ0 b(Ji , hi ).
σi
(1.4.64)
It is clear that the final cavity field on σ0 is
h0 =
k
X
b(Ji , hi ),
(1.4.65)
i=1
implying that the following recursion relation holds
%(h) =
k Z
Y
%(hi ) d hi δ h −
i=1
k
X
!
b(Ji , hj )
(1.4.66)
i=1
Now, it can be observed that, denoting by δ10 the energy shift of the ggs due to a site
addition and by δ20 the energy shift of the ggs due to link addition to GkN,q [38],
EN
k+1 2
= δ10 −
δ0 .
N →∞ N
2
lim
(1.4.67)
If we are able to solve eq. (1.4.66), then we can compute the ggs energy in the β → ∞
limit by


k+1
k+1
k+1
Y Z
X
X
%(hi ) d hi 
a(Jj , hj ) + δ10 = −
b(Jj , hj ),
(1.4.68a)
j=1
i=1
j=1
δ20
= −
2 Z
Y
i=1
%(hi ) d hi max (h1 σ1 + h2 σ2 + J12 σ1 σ2 ).
σ1 ,σ2
(1.4.68b)
22
Chapter 1. Optimisation, disorder and statistical mechanics
Finally, observe that the method, in the formulation presented here, works only in the
replica symmetric hypothesis. In other words, we assume that there are not local ground
states, i.e. local minima in energy whose corresponding configurations can be obtained from
the global minimum ones only by an infinite number of spin flip. This eventuality is not
so rare and it is necessary to break the replica symmetry to correctly reproduce the ggs
energy.
1.4.2. The solution of the random matching problem via replicas
After the digression above on the general Theory of Disordered Systems, let us turn back
to the Rmmp. As explained before, in the Rmmp we work on the complete graph G = K2N
with a weight function w in which the edge weights are random values extracted from a
certain probability density function ρ(w). We suppose that this distribution is
ρ(w) =
wr e−w
.
r!
(1.4.69)
Labelling by i = 1, . . . 2N the vertices of the graph, the 2N × 2N weighted adjacency
matrix W = (wij ), wij = wji weight on the edge joining i and j, identifies a realization of
our problem. The cost of a matching on this graph and the optimal matching cost can be
expressed respectively as
EN [M ] =
1
N
X
EN := min EN [M ],
mij wij ,
M
1≤i<j≤2N
(1.4.70)
where the symmetric matrix M = (mij )ij is such that mij = 1 if the vertices i and j are
P2N
connected, zero otherwise, mii = 0 ∀i, and j=1 mij = 1 ∀i.
To study the optimal matching properties, we can write down a “partition function” for
our “system”, assuming EN [M ] as an “energy” density functional of the “configuration” M ,
in the form



2N
2N
Y
X
X
 δ 1,
mij  e−βN EN [M ]
Z[β; W ] :=
{nij =0,1}

=
i=1
2N Z2π iλi
Y
i=1 0
e
j=1
(1.4.71)

d λi  Y 1 + e−βwjk −i(λj +λk ) ,
2π
j<k
where we introduced the Kronecker symbol δ (a, b) ≡ δa,b . To get a finite free energy
1
ˆ r+1
density in the N → ∞ limit we impose the scaling β = βN
. We proceed using the
replica approach. After some computations, we get in the large N limit
 

Z Y
n
n
Y
X
X
d
q
1
1
α
...α
1
p
p
Zn =
exp N −
qα2 1 ...αp + 2 ln z[q] ,
2
g(p)
2πN
g(p)
α
<···<α
α
<···<α
p=1
p=1
p
1
p
1
(1.4.72)
where g(p) :=
1
ˆ r+1
(pβ)
and



n Z2π iλα
n
n
α
Y
X
X
X
α1
αp
e
d
λ
 exp i
z[q] := 
λα +
qα1 ...αp e−i(λ +···+λ )  .
2π
α=1
α=1
p=1 α <···<α

1
0
p
(1.4.73)
The saddle point equation gives
D
E
α1
αp
qα1 ...αp = 2g(p) e−i(λ +···+λ ) ,
z
(1.4.74)
1.5. Models on graphs and Belief propagation
23
where h•iz is the average computed respect to the measure defined by the partition function
z[q]. In the following we will denote by Qα1 ...αp the solution of the saddle point equation.
In the replica symmetric hypothesis Qα1 ...αp ≡ Qp and
− β f¯ = −
Z∞ ∞
w
1 X (−1)p−1 2
e− e − e−Gr (w) d w,
Qp + 2
2 p→0 g(p)p
Gr (x) :=
∞
X
(−1)p−1
p=1
−∞
Qp exp
.
p!
(1.4.75)
The saddle point equation can be rewritten in terms of Gr (w) as
Gr (w) =
2
βˆr+1
+∞
Z
Br (w + y) e−Gr (y) d y,
Br (x) :=
∞
X
(−1)p−1
p=1
−∞
epx
.
pr (p!)2
From the expression of f¯ the average mean energy can be easily obtained:
R +∞
1
Gr (w) e−Gr (w) d w.
hE (β)i = N − r+1 r+1
−∞
βˆ
(1.4.76)
(1.4.77)
Remarkably, in the limit βˆ → +∞
1
hE (β)i = N − r+1 (r + 1)
+∞
Z
˜ r (w) e−G˜ r (w) d w,
G
˜ r (w) := 2
G
−∞
+∞
Z
(w + y)r −G˜ r (w)
e
d y.
r!
−w
(1.4.78)
and for r = 0, the computation can be performed exactly and the the famous result for the
average optimal cost in the flat Rmmp is obtained:
lim N EN = lim N hE (β)i =
N →∞
β→∞
π2
.
12
(1.4.79)
Moreover, M´ezard and Parisi [37] were able to obtain the distribution ρ(w) of the optimal
1
weights: rescaling by w 7→ wN − r+1 , they obtained for r = 0
ρ(w) =
w − e−w sinh w
.
sinh2 w
(1.4.80)
By similar arguments, M´ezard and Parisi [36] showed that in the Rbmp for r = 0
lim N EN =
N →∞
π2
.
6
(1.4.81)
Finally, observe that, also in the replica symmetric hypothesis, an infinite number of order
parameters is necessary to deal with such combinatorial problems, contrary to what happens
for the SK-model.
1.5. Models on graphs and Belief propagation
1.5.1. Factor graph and Belief propagation
The idea of a map between an infinite bipartite graph and a tree, suggested by Aldous
and presented in section 1.3.2, meets the statistical physics approach in an algorithmic
method that can be used to solve the Rbmp called Belief Propagation (Bp): this method is
indeed quite general, as we will see below. Moreover, this algorithmic approach is strictly
connected to the cavity method. Let us now introduce the general theory of Bp following
the pedagogical exposition of M´ezard and Montanari [34]. Suppose that we have a model in
24
Chapter 1. Optimisation, disorder and statistical mechanics
N variables xi , i = 1, . . . , N , taking their values in a finite set and suppose also that their
joint probability has the form
M
1 Y
ψa (x∂a ),
µ(x1 , . . . , xN ) =
Z a=1
(1.5.1)
where by x∂a we denote a certain subset of the set of variables, depending on a and by Z
a proper normalization constant. A joint probability of the form (1.5.1) can be graphically
represented by a factor graph, i.e. a bipartite graph in which two types of vertices are
present: in a factor graph each “function–type” vertex a is connected to the corresponding
subset ∂a of the N “variable–type” vertices. We will denote by Vf , |Vf | = M , the set of
function nodes, by Vv the set of variable nodes and by Gµ = Graph(Vv , Vf ; E) the factor
graph itself, E set of the edges. We can define on the edges of this bipartite graph two
functions (in this context called messages), one for each direction in which the edge can
be crossed. In particular, for an edge e = (i, a) joining the vertex corresponding to xi to
the function vertex corresponding to ψa , we define ui→a (xi ) and va→i (xi ). The messages
are elements in the space of probability distribution functions, subject to the normalization
conditions
N
M
X
X
ui→a = 1
va→i = 1.
(1.5.2)
a=1
i=1
We can write down an iterative “message–passing” algorithm (the so-called Bp-equations),
in which we update a message on an edge on the basis of the values of the incoming messages
on the tail of the directed edge, i.e., up to multiplicative (normalising) constants,
Y
t
ut+1
vb→i
(xi ),
(1.5.3a)
i→a (xi ) '
b∈∂i\a
t+1
va→i
(xi )
'
X
ψa (x∂a )
{x∂a\i }
Y
utk→a (xk ).
(1.5.3b)
k∈∂a\i
where ∂i denote the set of function nodes connected to the variable node i and t is the index
of the iteration. After T iterations
Y
T −1
uTi (xi ) '
va→i
(xi ),
(1.5.4)
a∈∂i
It can be proved that, if the factor graph Gµ is a tree, eq. (1.5.3) converge after at most
T ∗ iterations, where T ∗ = diam(Gµ ) is the diameter of the graph, irrespectively of the
initial conditions, to some functions u∗i , vi∗ . Moreover, in this case the obtained functions
are strictly related to the marginal probability distributions:
X
u∗i (xi ) ≡
µ(x1 , . . . , xN ).
(1.5.5)
{xj6=i }
This provides the connection between our initial µ function and the message functions.
There is a simple correspondence between the cavity method and the Bp algorithm. To
make clearer this relation between the cavity equations and the Bp equations, let us consider
the glass model in Section 1.4.1 and let us start from the cavity equations (1.4.68a): we
can consider for a moment a non zero temperature, β < ∞; the marginal distribution of σ0
µ0 (σ0 ) ' eβh0 σ0 is simply given by
X
X
µ0 (σ0 ) '
eβJi σ0 σi +βhi σi ≡
eβJi σ0 σi µi (σi ).
(1.5.6)
{σi }
Comparing with eq. (1.5.3) and observing that:
{σi }
1.5. Models on graphs and Belief propagation
25
21
2
1
23
22
11
2
12
32
13
31
3
1
33
3
Figure 1.5.1.: Factor graph for eq. (1.5.8) for the assignment problem on K3,3 , see figure 1.1.2: each factor
node i represents the constraint
P
j
mij = 1; similarly
circle nodes ij represent the variables nij and
j
represents the constraint
represents the factor e
−βmij wij
P
i
mij = 1; the
.
1. in the factor graph of this model each factor node has degree 2, being, in the notation
of the previous section, ψa ≡ ψ(σi , σj ) = Z1 eβJij σi σj ∀a;
2. µi (σi ) ≡ u∗i (σi );
it follows that the cavity equations at zero temperature are the fixed point Bp equations.
Observe that the model introduced in Section 1.4.1 is supposed on a Bethe lattice, i.e. on
a locally tree-like graph in which however large loops are present.
Belief propagation for loopy factor graphs: the Rbmp.
It is not obvious that the algorithm above works (even approximately) on “loopy” factor
graphs (i.e., factor graphs with cycles), but the general idea in this case is to adapt our
method imposing local consistent conditions. It can be proved that, in this case, if ψa (x∂a ) >
0 ∀a, then the Bp algorithm has at least one fixed point. We will not treat this delicate
problem in details, being outside the purposes of the present work (for a treatment of the
problem, see [34]).
To exemplify the described method on a loopy factor graph, let us consider its application
to the Rbmp. Given a complete bipartite graph KN,N = Graph(V; E), let us call W = (wij )ij
the weight matrix, where wij are i.i.d random variable with a probability density ρ(w). We
define also M = (mij )ij , matching matrix, such that mij = 1 if the edge (i, j) is occupied,
0 otherwise. Note that in this case the first index and the second one in M and W run
on different sets (i.e., W is not a weight adjacency matrix in a strict sense). Again, M
PN
PN
represents a matching if, and only if,
i=1 mij =
j=1 mij = 1 and the energy of the
matching M is given by
X
EN [M ] =
mij wij .
(1.5.7)
ij
In the following we identify M with the matching itself. We can write the joint probability
µ(M ) =
Y
(i,j)
δ

! N
X
e−βmij wij
.
mij , 1 δ 
mij , 1
Z
i=1
j=1
N
X
(1.5.8)
The corresponding factor graph for N = 3 is depicted in figure 1.5.1. Looking to the
26
Chapter 1. Optimisation, disorder and statistical mechanics
previous formula, the Bp-equations become in this case
ut+1
(ij)→i (mij ) '
t+1
vi→(ij)
(mij ) '
t
(mij ) e−βmij wij ,
vj→(ij)


X
X
Y
δ mij +
mkj , 1
ut(kj)→j (mkj ).
mij =0,1
k6=i
(1.5.9a)
(1.5.9b)
k6=i
−−→
To take the β → ∞ limit, let us introduce, for each oriented edge ~e = (i, j) of the original
bipartite graph,
t
1 vi→(ij) (1)
X t (~e) := ln t
.
(1.5.10)
β vi→(ij) (0)
Through this change of variable and taking the β → ∞ limit, we obtain the so called
min-sum algorithm for the assignment problem:
X t+1 (~e) = min w(ej ) − X t (~ej ) .
(1.5.11)
~
ej child
Observe moreover that for a given edge e ∈ E we have two message functions, i.e. X(~e) and
X(e~) and we have to update both simultaneously. Comparing eq. (1.5.11) with eq. (1.3.12),
it is easily seen that the fixed point solution of the previous equation X(~e) is the same function introduced by Aldous on the Poisson weighted directed tree: indeed, our Bp-equations
suggest that the problem of finding an optimal matching on the (loopy) bipartite complete
graph is equivalent to the searching for an optimal matching on an (infinite) tree in the
N → ∞ limit. This fact however does not guarantee the convergence for finite N . Indeed,
Fichera [24] showed that, for finite N , after a certain time tc the message functions enter in
a quasi-periodic regime that however still make possible to identify the correct solution. To
summarize his results, let us define the drift of a matching M with corresponding matching
matrix M = (mij )ij from the optimal matching M∗ , whose corresponding matching matrix
is M ∗ = (m∗ij )ij , as
EN [M ] − EN [M ∗ ]
P
δE [M] =
.
(1.5.12)
N − ij mij m∗ij
Introducing the set of matching differing from the optimal one for one loop,
D := {M matching on KN,N : M4M∗ is a cycle} ,
(1.5.13)
we can define
∆1
=
M1
=
min δE [M],
(1.5.14a)
arg min δE [M],
(1.5.14b)
M∈D
M∈D
∆2
=
min
M∈D\{M1 }
δE [M].
(1.5.14c)
Let us now consider a certain edge e = (i, j) ∈ E. Fichera proved that, running the
algorithm, we have that
∗
min (maxi,j (wij ), EN )
+ N − 1,
sign X t (~e) = (−1)mij −1 for t >
2∆1
(1.5.15)
where as usual EN ≡ EN [M ∗ ] := minM EN [M ] is the optimal matching cost. The previous
formula gives a criterion to identify the optimal matching in the hypothesis ∆1 6= 0. If
∆1 6= 0 and ∆1 6= ∆2 , then there exists a period T and a time tc such that
∗
X t+T (~e) = X t (~e) + (−1)mij −1 T ∆1 for t > tc .
(1.5.16)
1.5. Models on graphs and Belief propagation
27
It follows that the algorithm does not converge but it still provides useful informations on
the solution through the relation (1.5.15). More importantly, Fichera proved that, in the
quasi-periodic regime, denoting as above by e = (i, j) ∈ E,
λi = −
t+T −1
1 X
min X τ (~e),
T τ =t j∈∂i
µj = −
t+T −1
1 X (2) τ
min X (e~)
T τ =t i∈∂j
(1.5.17)
(2)
are an Hungarian gauge for our instance. In the previous formula, min gives as output the
second minimum in a certain set of values. If ∆1 1 or ∆1 ≈ ∆2 the algorithm converges
very slowly.
Chapter 2
Euclidean Matching Problems
Summary of the chapter In the present Chapter we present our original contributions to the main subject of the thesis, the Euclidean matching problem. Firstly, we review some well known results on Euclidean
matching problems. We discuss also the so called Monge–Kantoroviˇ
c transport problem mainly following
the lecture notes by Ambrosio [4] and Evans [23] and the monograph of Villani [54]; some fundamental
results from Measure Theory on the existence and properties of optimal maps are reviewed. Subsequently,
we discuss and present our results on the Euclidean matching problem: we give an exact solution for the
one dimensional case both on the line and on the circle; we discuss a proper scaling ansatz to correctly
reproduce the scaling behaviour of the optimal cost in any dimension; we present two different approaches
for the computation of the correlation function of the optimal matching map in any dimension; we present
a replica approach showing the analogies between the Euclidean problem and the purely random problem
that manifests itself as a mean field approximation of the Euclidean version.
2.1. Euclidean optimisation problems
In the previous chapter we presented some optimisation problems on weighted graphs and
some useful tools from Mathematics and Physics to solve them. However, we did not
discuss the effect of eventual correlations between the weights on the specific weighted graph
G = Graph(V; E) on which the problem is formulated, or we supposed that no correlation
at all was present (as in the random matching problems). In the present chapter, we will
consider the case in which the vertices of the graph correspond to geometrical points on a
certain d-dimensional space Ω ⊆ Rd . To be more precise, we will consider the case
Ω ≡ Ωd (L) := [0, L]d ⊆ Rd ,
Ωd := Ωd (1)
(2.1.1)
and a mapping
Φ : V → Ωd (L) such that vi ∈ V 7→ Φ(vi ) ≡ ξi ∈ Ωd (L).
(2.1.2)
In an Euclidean problem, given an edge eij = (vi , vj ) of the weighted graph, the corresponding weight wij := w(eij ) is a function of the Euclidean distance kξi − ξj k between the
geometric points corresponding to the ends of the considered edge; in particular, we will
consider the case
p
eij = (vi , vj ) ∈ E 7→ wp (eij ) ≡ wp,ij = kξi − ξj k ,
p ∈ R+ .
(2.1.3)
As in the non-Euclidean case, Euclidean optimisation problems are therefore formulated as
the search for the subgraph g∗ ⊆ G among a set of feasible subgraphs F such that a certain
functional
1 X
w(e), g = Graph(Vg , Eg ) ⊆ G,
(2.1.4)
E (p) [g; Φ] :=
|Eg |
e∈Eg
29
30
Chapter 2. Euclidean Matching Problems
is minimised:
E (p) [gΦ ] = min E (p) [g; Φ].
g∈F
(2.1.5)
In the previous expression we denoted by gΦ the subgraph that realises the optimality.
In the Tsp, for example, G ≡ KN , whilst F is the set of all Hamiltonian cycles; in the
monopartite matching problem, G ≡ K2N and the set of feasible solutions is given by the
set of all possible perfect matchings on G.
The Euclidean versions of many optimisation problems are of great interest, due to the
fact that some optimisation problems are actually defined in the geometric space (consider,
for example, the Tsp to be solved in a certain geographical area); on the other hand, the
Euclidean constraint determines correlation between the weights that cannot be neglected
when we deal with average properties of some Euclidean random optimisation problems.
2.1.1. Euclidean optimisation and randomness: empirical measures
As in the general case, we are not interested in a specific solution of a certain instance of an
Euclidean optimisation problem: indeed, from an algorithmic point of view, the fact that
the weights are obtained from Euclidean distances or not, does not matter. Instead, we are
interested in the average properties of a certain Euclidean optimisation problem in presence
of disorder. In a random Euclidean optimisation problem, the vertices of the graph G, on
which the problem is defined, are associated to points randomly generated on Ωd ⊆ Rd .
The weights to be associated to the edges are then computed as functions of the point
positions. As anticipated above, this fact determines, above all, the presence of correlations
(due to the underlying Euclidean structure) between the so obtained random weights. The
average procedure on the interesting quantities is therefore more subtle than in the purely
uncorrelated case and the point generation procedure becomes of paramount importance.
This generation takes place usually considering a set of points independently and identically distributed on ΩdRand using a certain probability distribution density ρ(x) : Ωd → R+
on the same domain, Ωd ρ(x) dd x = 1. In the present work we assume for Ωd the form
(2.1.1) and therefore we deal with a compact subset of Rd . Let us suppose then that a set
XN = {ξi }i=1,...,N ∈ Ωd is generated on the considered domain with probability distribution
density ρ; we introduce the empirical measure
N
1 X (d)
δ (x − ξi ) .
ρN [x; XN ] =
N i=1
(2.1.6)
It can be proved [21] that for N → ∞ the empirical measures ρN converge to ρ almost
surely.
2.1.2. Asymptotics for Euclidean functionals
The asymptotics of Euclidean functionals in the form (2.1.5) has been widely studied in the
hypothesis of randomly generated points on a certain domain Ω ⊆ Rd (see the monograph
of Yukich [56] for a complete treatment). In particular, the asymptotics of the Tsp problem
was firstly analysed by Beardwood, Halton, and Hammersley [6] in a celebrated result
subsequently generalised by Redmond and Yukich [49]: we will reproduce here only the
main results, being the analysis of the specific probabilistic methods used in the proofs
outside the purpose of this thesis. The scaling properties of the optimal solution will be
however of great interest in the following.
In the previous paragraph we introduced the optimal value of a certain Euclidean functional E [gΦ ] (2.1.5), in which we assume the form (2.1.3) for the weight function on the
edges of the considered graph. The defined functional is homogeneous and translationally
2.2. Euclidean matching problems
31
invariant, i.e., if xi 7→ λxi + r ∀i, λ ∈ R+ , r ∈ Rd , then E [gΦ ] 7→ λp E [gΦ ]. Moreover, we
say that the considered functional is continuous if, assuming Ω ≡ Ωd ,
1− p
(2.1.7)
|EgΦ |E (p) [gΦ ] − EgΦ0 E (p) [gΦ0 ] ≤ cVgΦ 4gΦ0 d ,
for some constant c ∈ R+ . Finally, we distinguish three type of Euclidean functionals in
relation to their behaviour respect to a partition of Ωd in M d cubes {ωi }i=1,...,M d of edge
1
length M
.
Subadditive functional An Euclidean functional E (p) is said subadditive if there exist a
constant cs ∈ R+ such that
X
|EgΦ |E (p) [gΦ ] ≤
|EgΦ ∩ωi |E (p) [gΦ ∩ ωi ] + cs M d−p
(2.1.8)
i
for all possible Φ. In the previous expression gΦ ∩ ωi denotes the graph obtained
suppressing all vertices v ∈ VgΦ such that v 7→ ξ ∈ Ωd \ ωi and all edges at them.
Superadditive functional An Euclidean functional E (p) [gΦ ] is said superadditive if there
exist a constant cS ∈ R+ such that
X
|EgΦ |E (p) [gΦ ] ≥
|EgΦ ∩ωi |E (p) [gΦ ∩ ωi ] − cS M d−p
(2.1.9)
i
for all possible Φ in Ωd .
Quasi-additive functional A subadditive Euclidean functional E (p) [gΦ ] is said quasi-additive
(p)
(p)
if there exist a superadditive Euclidean functional ES [gΦ ] such that ES [gΦ ] ≤
(p)
E [gΦ ] for all possible Φ and
(p)
1− p
|Eg |E (p) [gΦ ] − ES [gΦ ] = o |Vg | d ,
(2.1.10)
where the average • is intended over all possible Φ.
Redmond and Yukich [49] proved that for a quasi-additive Euclidean functional with 0 <
p < d and for independently and identically distributed positions of the vertices of the
graph, generated following the distribution density ρ(x) on Ωd , the identity
lim
|VgΦ |→∞
|EgΦ |E (p) [gΦ ]
|VgΦ |
d
1− p
Z
= αE (p) ,d
p
1− d
[ρ(x)]
dd x
(2.1.11)
Ωd
holds. Moreover they proved that the Euclidean functional (2.1.4) for the Tsp and for the
Euclidean monopartite matching problem on the complete graph K2N are quasi-additive
p
and therefore the optimal cost scales N − d for p ≥ 1. In the following we will propose new
arguments and, moreover, we will derive by our considerations finite size corrections to the
leading scaling with exact coefficients.
2.2. Euclidean matching problems
As anticipated in many occasions, in the present work we will consider Euclidean matching
problems on the unit hypercube Ωd . Random points on the unit hypercube are intended
generated independently and with the same uniform distribution ρ(x) = 1, unless otherwise
specified. We will consider here two different types of Euclidean matching problems.
32
Chapter 2. Euclidean Matching Problems
(a) p = 0.5.
(b) p = 1.
(c) p = 1.5.
Figure 2.2.1.: Optimal grid–Poisson Euclidean bipartite matching with N = 100 on the square with open
boundary conditions for the same instance: observe that for p = 1 there are no intersecting links.
Euclidean monopartite matching problem In the Euclidean monopartite matching problem (Emmp), we consider a matching problem on the complete graph K2N , assuming
that to each vertex v ∈ V is associated a point ξ = Φ(v) randomly generated on
the unit hypercube Ωd . We are interested in the perfect matching m∗ ⊆ K2N that
minimises the functional (2.1.4):
1 X
wp (e),
m = Graph(Vm ; Em ) matching;
(2.2.1)
E (p) [m; Φ] :=
N
e∈Em
where wp (e) has the expression (2.1.3). In the following we will denote by
(p,m)
EN
[Φ] :=
min
m matching
E (p) [m; Φ]
(2.2.2)
the optimal matching cost and by
(p,m)
EN
(p,m)
(d) := EN
[Φ]
(2.2.3)
the average optimal matching cost in d dimensions.
Euclidean bipartite matching problem In the Euclidean bipartite matching problem (Ebmp),
we consider a matching problem on the complete bipartite graph KN,N = Graph(V, U; E),
N := |V| = |U| ∈ N. Randomness can be introduced in two different ways: we can
assume that both vertices in V and vertices in U are associated to randomly generated
points in Ωd (Poisson–Poisson Euclidean bipartite matching problem, PP Ebmp) or
generating randomly one set of points, and fixing the other ones on a certain lattice
positions: if a square lattice is adopted, we say that we are dealing with a grid–Poisson
Euclidean bipartite matching problem (gP Ebmp). In the Ebmp we are interested on
the perfect matching m such that the functional
1 X
(p)
EN [m; Φ] :=
w(e),
m = Graph(Vm ; Em ) matching;
(2.2.4)
N
e∈Em
is minimised as usual. Denoting by V := {vi }i=1,...,N and by U := {ui }i=1,...,N , a given
matching can be uniquely associated to an element σ ∈ SN of the set of permutations
of N elements, in such a way that σ ↔ m if (vi , uσ(i) ) ∈ Em . In the following we will
use quite often this correspondence and the previous functional (2.2.4) will be written
as
N
1 X
(p)
w((vi , uσ(i)) );
(2.2.5)
EN [σ; Φ] :=
N i=1
2.2. Euclidean matching problems
33
Figure 2.2.2.: Optimal gP Euclidean bipartite matching with N = 225 and p = 2 on the torus.
We will denote by
(p)
EN [Φ] :=
(p)
min
m matching
(p)
EN [m; Φ] = min EN [σ; Φ]
(2.2.6)
σ∈SN
the optimal matching cost and by
(p,PP)
EN
(p)
(p,gP)
(d) := EN [Φ],
EN
(p)
(d) := EN [Φ]
,
(2.2.7)
one set fixed
the average optimal costs for the PP Ebmp and the gP Ebmp respectively in d
dimensions.
In the present work we will consider both matching problems on the unit hypercube Ωd
and matching problems on the unit hypercube with periodic boundary conditions, i.e., on
the flat hypertorus Td := Rd /Zd : in this case the Euclidean norm in eq. (2.1.3) is intended
on this flat manifold. Observe finally that, although the Ebmp appears as a slight variation
of the Emmp, it presents a crucial complication. Let
S us consider for example an Euclidean
matching problem on Ωd and a certain partition i ωi ⊂ Ωd : in a greedy approach, in
the monopartite case, we can always perform a optimal matching procedure inside each ωi
leaving at worst one point unmatched for each element of the partition; in the bipartite
case it could happen that a relevant fraction of points remains unmatched for each ωi
and therefore they have to be connected with points outside ωi . This suggests that local
fluctuations in the number of points of different type in the Ebmp can be relevant in the
scaling properties of the average optimal cost of the Ebmp in a non trivial way.
The scaling of the average optimal cost for the Emmp can be directly obtained from the
general results on subadditive Euclidean functional. It turns out indeed that, for 0 < p < d,
in the Emmp on Ωd we have that
(p,m)
EN
p
(d) ∼ N − d .
(2.2.8)
Remarkably, from a mathematical point of view, the treatment of the scaling of the Ebmp
appears more difficult, in particular in low dimensions. We will treat exactly this problem
for d = 1 later. For d = 2, in 1983 Ajtai, Koml´os, and Tusn´ady [1] proved that the average
optimal cost scales as
p
ln N 2
(p,PP)
EN
(2) ∼
.
(2.2.9)
N
34
Chapter 2. Euclidean Matching Problems
Observe that a logarithmic correction appears respect to the monopartite case: this correction is indeed due to the relevance of local fluctuations in the number of points of different
type. For d ≥ 3 and p = 1 Dobri´c and Yukich [20] proved that
(1,PP)
EN
1
(d) ∼ N − d .
(2.2.10)
The previous result can be justified by an heuristic argument: given a certain point on
1
, and therefore we
Ωd , its nearest neighbour of different type is expected at a distance √
d
N
expect that
p
(p,PP)
(2.2.11)
EN
(d) ∼ N − d .
This result is indeed confirmed by numerical simulations for d ≥ 3. To our knowledge,
however, no further scaling properties of the average optimal costs for Euclidean matching
problems are known from a rigorous point of view. In the following, we will show that, using
a proper scaling ansatz, we derived the correct known scaling properties for the Ebmp in
any dimension and we obtained also the finite size corrections to the leading terms for d ≥ 2.
ˇ transport problem
2.3. The Monge–Kantorovic
To properly introduce our scaling ansatz and our results on the Ebmp, let us consider a
“continuum version” of the Ebmp, the so called Monge–Kantoroviˇc problem, or optimal
transport problem, a well studied problem in Measure Theory.
Let us suppose that two non-negative measures ρR (x) and ρB (y) are given on Rd , such
that the overall mass balance condition is satisfied
Z
Z
d ρR (x) = d ρB (y).
(2.3.1)
Rd
Rd
Let us define also the set of maps


Z
Z


M(ρR , ρB ) := T : Rd → Rd h(T(x)) d ρR (x) = h(y) d ρB (y) ∀h ∈ C0 (Rd ) .


Rd
Rd
(2.3.2)
We ask for the map M ∈ M(m, n) such that, given the cost functional
Z
def
E[T; ρR , ρB ] =
w(x, T(x)) d ρR (x),
(2.3.3)
Rn
and the cost function w(x, y) : Rd × Rd → R+ , we have
E[M; ρR , ρB ] =: E[ρR , ρB ] :=
inf
T∈M(ρR ,ρB )
E[T; ρR , ρB ].
(2.3.4)
The previous formulation is due to Monge (1780): in the original problem, Monge considered
a finite set of piles of soil and a finite set of excavations of the same cardinality and he wanted
to find a proper matching to minimise the transportation cost. It is self-evident that the
problem in the formulation above appears as a generalisation of the Ebmp, in which points
are replaced with measures on a certain support. In the 1940s Kantoroviˇc suggested a
slightly different approach. He considered the set of transfer plans




Z
Z


MK (ρR , ρB ) := π : Rd × Rd → R+ d π(x, y) = d ρR (x),
d π(x, y) = d ρB (y) .




d
d
y∈R
x∈R
(2.3.5)
2.3. The Monge–Kantoroviˇc transport problem
35
and the relaxed functional
Z
EK [π; ρR , ρB ] :=
w(x, y) d ρ(x, y);
(2.3.6)
Rn ×Rn
∗
he asked therefore for the transfer plan π such that
EK [π∗ ρR , ρB ] :=
inf
π∈MK (ρR ,ρB )
EK [π; ρR , ρB ].
(2.3.7)
As mentioned before, this problem is a weak formulation of the original Monge transport
problem: the delicate problem of the existence of an optimal transport plan is treated in
[54]. In general we note that the existence of a transport plan π does not guarantee that an
optimal map M in the Monge sense exists. We will not discuss this problem here and we
will assume that, in our hypothesis, an optimal map always exists. It can be proved indeed
[4] that, if w is a continuous function on its support and ρR (x) has no atoms1 , an optimal
transport plan and a corresponding optimal map always exists. In the following we will
consider only cost function in the form w(x, y) ≡ w(x − y).
Kantoroviˇc introduced also a dual formulation, defining the following space:
K(w) := (u, v)|u, v : Rd → R, u, v ∈ C0 (Rd ), u(x) − v(y) ≤ w(x − y) (x, y) ∈ Rd × Rd .
(2.3.8)
and asking for the couple (u∗ , v ∗ ) ∈ K(w) that maximise the following functional
Z
Z
K[u, v; ρR , ρB ] := u(x) d ρR (x) − v(y) d ρB (y).
(2.3.9)
Rd
Rd
It can be proved [23] that, if w is uniformly continuous, such a couple of functions exists
and they satisfy the following relations
(
u∗ (x) = inf y∈Rd [v ∗ (y) + w(x − y)]
(2.3.10)
v ∗ (y) = supx∈Rd [u∗ (x) − w(x − y)]
The correspondence between the previous set of equations and the Bp equations is remarkable. To understand what is the relation between the couple of functions (u∗ , v ∗ ) and the
optimal map of the original problem M, we present the following theorem:
Theorem (Strictly convex cost). Let w ∈ C 1 (Rd × Rd ) a strictly convex function, i.e.,
w(αx1 + (1 − α)x2 ) < αw(x1 ) + (1 − α)w(x2 ) for α ∈ (0, 1). Then an optimal map M for
the original plan exists almost everywhere and it satisfies the following differential equation
∇w (x − M(x)) = ∇u∗ (x);
(2.3.11)
K[u∗ , v ∗ ; ρR , ρB ] = E[M; ρR , ρB ].
(2.3.12)
moreover
If the cost function w is not strictly convex (e.g., w(x) = |x|), then the uniqueness of
the map is not guaranteed any more. If the cost function w is strictly convex, under some
additional assumptions on m and n, we can also write a differential equation for M in the
form of a change-of-variable formula (sometimes called Jacobian equation)
d ρR (x) = d ρB (M(x)) det JM (x),
(2.3.13)
where JM (x) is the Jacobian matrix for the map M. Remarkably, this formula does not
hold in general, as we will see later; note that the optimal map is not usually “smooth”
and the meaning of the Jacobian matrix itself must be clarified. The validity conditions
of the Jacobian equation (2.3.13) are pointed out in the following theorem (a more general
statement can be found in [54, Chapter 11])
1 In
Measure Theory, given a measurable space (m, Σ), A ⊂ Σ is called atom if m(A) > 0 and ∀B ⊂ A
measurable such that m(B) < m(A), m(B) = 0.
36
Chapter 2. Euclidean Matching Problems
Theorem (Jacobian equation). Let be ρR , ρB two non-negative measures on Rd and suppose that d ρR (x) = ρR (x) dd x, where ρR (x) ∈ L1 (Rd ). Let us assume that M ∈ M(ρR , ρB )
and
• ∃Σ ⊂ Rd such that ρR (x) = 0 almost everywhere outside Σ and M is injective on Σ;
• M is approximately differentiable2 on Σ almost everywhere.
If ∇M(x) is the approximate gradient of M and by definition JM (x) := det |∇M(x)|, then
Eq. (2.3.13) holds.
Working with strictly convex potentials and compact submanifolds of Rd , the hypothesis
of the previous theorem are satisfied.
As an important particular case, let us consider the following cost function:
2
w(x − y) := kx − yk ,
x, y ∈ Rd ,
(2.3.14)
where k•k is the Euclidean norm. The cost is strictly convex; we suppose also that two
measures are given on Rd such that
d ρR (x) := ρR (x) dd x,
d ρB (y) := ρB (y) dd y,
(2.3.15)
such that ρR (x), ρB (x) ∈ L1 (Rd ). The optimal map is a minimum of the following functional
Z h
i
2
L[M, λ] :=
kx − M(x)k + λ(x) (ρB (M(x)) det JM (x) − ρR (x)) dd x
(2.3.16)
Rd
where λ(x) is a Lagrange multiplier. A direct computation of the Euler–Lagrange equations
gives that
M(x) = ∇φ(x)
(2.3.17)
for a certain (convex) potential φ, i.e., in the quadratic case, the optimal map can be
expressed as a gradient of a convex potential. The Jacobian equation assumes the well
known form of a Monge–Amp`ere equation,
ρR (x) = ρB (∇φ(x)) det Hessφ (x),
(2.3.18)
where Hessφ is the Hessian matrix of φ.
2.4. L One dimensional Ebmp: an exact solution for
convex weights
Let us start considering the Ebmp on the line and on the circle. The low dimensionality
allows us to give an exact solution to the problem in the case of convex cost functions.
In particular, we will consider the gP Ebmp in one dimension both with open boundary
conditions (obc) and with periodic boundary conditions (pbc) on the interval Ω1 = [0, 1].
We will suppose that we work on the complete graph KN,N = Graph(V, U; E), in which the
first set of vertices is associated to a set of fixed points on the interval and in particular
vi ∈ V 7→ Φ(vi ) = ri ≡
2 It
2i − 1
,
2N
i = 1, . . . , N,
(2.4.1)
is said that F(x) is approximately differentiable in x if there exists a differentiable function Φ(x) such
that the set {F(x) = Φ(x)} has zero density in x; ∇Φ(x) is called approximate gradient of F(x).
2.4. One dimensional Ebmp: an exact solution for convex weights
37
whilst the set of vertices U is associated to a set of N points, {bi }i=1,...,N , randomly generated in the interval, such that ui ∈ U 7→ Φ(ui ) = bi . We will suppose the U-vertices indexed
in such a way that i ≤ j ⇒ bi ≤ bj . Finally, we will consider the following cost functionals
v
u
N
u1 X
(p,gP)
(p,gP)
p
˜
:= min E˜N(p,gP) [σ; Φ], (2.4.2a)
EN
[σ; Φ] = t
wp ri − bσ(i) , E˜N
σ
N i=1
(p,gP)
EN
[σ; Φ]
=
N
1 X
wp ri − bσ(i) ,
N i=1
(p,gP)
EN
:= min EN(p,gP) [σ; Φ],
σ
(2.4.2b)
in which Φ is the map between the complete graph and the points on the line defined above
(i.e. represent as usual the random process by which the points are generated), p ∈ R+
0 and
the function wp : [0, 1] → [0, 1] is given by:
(
p
|x|
for obc,
(2.4.3)
wp (x) =
p
1
1
p
|x| θ 2 − |x| + (1 − |x|) θ |x| − 2
for pbc.
Before any analytical consideration on the optimal cost, let us discuss the structure of the
optimal matching. The point configurations in our problem are of this type:
Let us consider for a moment the cost functional (2.4.2a) assuming p ∈ (0, +∞). It can be
easily observed that, in the hypothesis p > 1, the optimal matching is always ordered, i.e.
the optimal permutation σ ∗ is
(
i
for obc,
∗
(2.4.4)
σ (i) =
i + λ mod N for pbc, for a certain λ ∈ {1, . . . , N }.
The optimal matching can be pictorially represented as
This simple fact follows directly from the convexity of the weight function wp for p > 1:
indeed, it can be seen by direct inspection in the N = 2 case that the ordered solution
minimises the cost functional among all possibilities. If instead p < 1, the weight function wp
is concave and the ordered solution is not necessarily the optimal one. Indeed, the optimal
matching σ ∗ has to satisfy a different requirement, i.e., it must be uncrossing, or nested [10,
17]: given two interval [ri , bσ∗ (i) ] and [rj , bσ∗ (j) ], i 6= j, either [ri , bσ∗ (i) ] ∩ [rj , bσ∗ (j) ] = ∅ or
one of the two intervals is a subset of the other one. Pictorially, drawing arcs connecting
matched pairs above the line, they must be uncrossing, as in the picture below:
Clearly many nested solutions are possible: the uncrossing condition is necessary for the
optimal solution in the p ∈ (0, 1) case but not sufficient to identify it: this simple fact in
the discrete case, appears as a no crossing rule for the optimal map in the one dimensional
Monge–Kantoroviˇc transport problem in presence of concave cost functions [32]. Remarkably, for p = 1 it is possible to have instances in which the same optimal cost is obtained
by different matching solutions.
38
Chapter 2. Euclidean Matching Problems
The observation above on the properties of the solution for p > 1 solves completely the
problem. Indeed, we can perform a simple probabilistic argument to obtain the correct
average optimal cost and its distribution in the large N limit both on the line and on the
circumference [10]. Here we will perform also a different calculation following our paper [12]
and inspired by the Monge–Kantoroviˇc theory.
2.4.1. Ebmp on the interval
As shown before, in the case of the gP Ebmp on Ω1 , the optimal matching is always ordered
for p ∈ (1, +∞), i.e. the optimal permutation is simply σ ∗ (i) = i. Note that, due to the
monotony of the function x 7→ xp for p > 1, the optimal solution for the cost functional
(p,gP)
(p,gP)
E˜N
[σ; Φ] is also optimal for the cost functional EN
[σ; Φ]. The probability density
distribution for the position of the i-th b-point is:
N i
i
(2.4.5)
Pr (bi ∈ d b) =
y (1 − b)N −i d b,
b
i
where we used the short-hand notation x ∈ d z ⇔ x ∈ [z, z + d z]. In the N → ∞ limit, a
non trivial result is obtained introducing the variable µ(y)
√
i
−b
(2.4.6)
µ(b) := N
N
expressing the rescaled (signed) distance between a b-point in [b, b+d b] and its corresponding
r-point in the optimal matching. We finally obtain a distribution for the variable µ(t)
depending on the position on the interval t ∈ [0, 1]:
µ2
e− 2t(1−t)
d µ.
Pr (µ(t) ∈ d µ) = p
2πt(1 − t)
(2.4.7)
The distribution (2.4.7) is the one of a Brownian bridge on the interval [0, 1], a continuous
time stochastic process defined as
B(t) := W(t) − tW(1),
t ∈ [0, 1],
(2.4.8)
where W(t) is a Wiener process (see Appendix A). The joint distribution of the process can
be derived similarly. In particular, the covariance matrix for the 2-points joint distribution
has the form, for s, t ∈ [0, 1],
2φ(s)
φ(s) + φ(t) − φ(t − s)
Σ2 =
,
(2.4.9)
φ(s) + φ(t) − φ(t − s)
2φ(t)
where we introduced the function
φ(x) := |x|
1 − |x|
.
2
Averaging over the positions s, t and fixing the distance τ := |t − s|, we have
1
c
c − φ(τ )
hΣ2 i (τ ) =
, c= .
c − φ(τ )
c
6
(2.4.10)
(2.4.11)
The Euclidean matching problem on the interval [0, 1] with open boundary conditions and
cost functional (2.4.2a) is therefore related to the Brownian bridge in the N → ∞ limit. By
using this correspondence, the correlation function ∀p > 1 is computed as
hµ(t)µ(t + τ )i =
1
− φ(τ ).
6
(2.4.12)
µ(t)
2.4. One dimensional Ebmp: an exact solution for convex weights
39
0
−0.5
0
0.1
0.2
0.3
0.4
0.5
t
0.6
0.7
0.8
0.9
1
(a) Plot of µ(t) for a certain realisation of the problem with N = 4096.
0.15
2
ρ˜p (u)
µ(r)µ(0)
p=2
p = 50
p = 500
p→∞
3
0.1
1
0.05
0
0
0.1
0.2
0.3
0.4
0.5
0
0.5
r
1
1.5
2
u
(b) Correlation function ∀p > 1 in the obc case
for N = 1000 obtained averaging over 104 realisations (for clarity, not all data are represented).
(c) Distribution density of the rescaled cost
p
(p,gP)
N 2 E˜N
for different values of p computed
with 5 · 105 iterations and N = 5000 for the obc
case.
Figure 2.4.1.: One dimensional gP Ebmp on the interval.
where the average h•i is intended on the position t, whilst we denoted by • the average over
different realisations of the problem. This theoretical prediction was confirmed numerically.
Introducing the normalised variable
σ(t) =
µ(t)
= sign(µ(t)),
|µ(t)|
(2.4.13)
Boniolo, Caracciolo, and Sportiello [10] computed also the correlation function for this
quantity, finding
s
√
Z1−t
2
min(s, t)(1 − max(s, t))
1− t
√ . (2.4.14)
σ(s)σ(t) = arctan
⇒
σ(s)σ(s + t) d s =
π
|t − s|
1+ t
0
Both formulas were confirmed numerically. To compute the average cost of the matching,
from eq. (2.4.7)
N
p
2
(p,gP) N →∞
EN
−−−−→
Z1
p
|B(t)| d t =
p
1
Γ
+1 .
2
2 (p + 1)
p
2
(2.4.15)
0
The optimal cost
√
(p,gP)
N E˜N
in the N → ∞ limit can be written as
v
uZ1
u
√
(p,gP) N →∞ u
p
p
N E˜N
−−−−→ t
|B(t)| d t.
0
(2.4.16)
40
Chapter 2. Euclidean Matching Problems
Although the previous expression is difficult to evaluate exactly for finite p (see for example
the paper of Shepp [51] for additional information about the distribution of the random
R1
p
variable Xp := 0 |B(t)| d t), the calculation can be easily performed in the relevant limit
p → ∞, being
v
uZ1
u
u
p→∞
p
p
t
|B(t)| d t −−−→ sup |B(t)|.
(2.4.17)
t∈[0,1]
0
The distribution of the sup of the absolute value of the Brownian bridge is the well known
Kolmogorov distribution [21]
!
+∞
X
2 2
Pr sup |B(t)| < u =
(−1)k e−2k u
(2.4.18)
t∈[0,1]
and therefore
√
k=−∞
(p,gP) N →∞
N E˜N
−−−−→
r
π
ln 2.
2
(2.4.19)
h p
i
(p,gP)
≤u
for different values of p:
In figure 2.4.1c we plotted ρp (u) := ddu Pr N 2 E˜N
observe that ρp approaches the Kolmogorov distribution in the large p limit.
(p,gP)
Finally, observe also that the variance of EN
2
(1 + p)Γ(p + 2) − (2p + 1)Γ2 p2 + 1
(p,gP)
(p,gP)
p
N EN
− EN
=
(2.4.20)
2p (p + 1)2 (2p + 1)
increases with p and that
(p,gP)
EN
(p,gP)
− EN
(p,gP)
EN
2
=
2
(p + 1)Γ(p + 2)
p→∞
2 − 1 −−−→ +∞.
p
(2p + 1)Γ 2 + 1
(2.4.21)
(p,gP)
requires a larger
From a numerical point of view, this means that a computation of EN
amount of iterations as p increases and fluctuations around the mean value become extremely large in the p → ∞ limit. On the other hand, fluctuations of the optimal cost
(p,gP)
(p,gP)
E˜
around its mean value E˜
for p → ∞ remain finite
N
N
2 p→∞ π 2
π
(p,gP)
(p,gP)
− E˜N
−−−→
− ln2 2.
N E˜N
12
2
(2.4.22)
(p,gP)
This fact allows to perform a precise computation of E˜N
for large p.
2.4.2. Ebmp on the circumference
Let us now consider the case of periodic boundary conditions. For p > 1, the solution for
both the functionals (2.4.2), with p ∈ (1, +∞), is again ordered; however, in this case the
optimal permutation is i 7→ σ ∗ (i) = i + λ mod N for a certain λ ∈ {0, 1, . . . , N − 1}. In the
continuum limit, the solution is a generalised Brownian bridge, µp (t) = B(t) + λp , t ∈ [0, 1],
for a certain constant λp ∈ R depending on p. The constant λp can be found by optimality
condition on the cost functional (2.4.2a):
v
uZ1
u
∂ u
p
p
t
|B(t) + λp | d t = 0.
(2.4.23)
∂λp
0
2.4. One dimensional Ebmp: an exact solution for convex weights
41
The
R 1 previous equation can be solved only
R 1 for p = 2 and p → +∞. For p = 2, λ2 =
− 0 B(τ ) d τ ; therefore, m2 (t) = B(t) − 0 B(t) d t and
hµ2 (t)µ2 (t + τ )i =
1
− φ(τ ).
12
(2.4.24)
R
p1 p→∞
1
p
−−−→ supt∈[0,1] |B(t) + λ∞ | and therefore the optimality
For p → ∞, 0 |B(t) + λp | d t
condition becomes
!
supt∈[0,1] B(t) + inf t∈[0,1] B(t)
∂
sup |B(t) + λ| = 0 ⇒ λ∞ = −
.
(2.4.25)
∂λ t∈[0,1]
2
We have therefore, for t ∈ [0, 1],
µ∞ (t) = B(t) −
sups∈[0,1] B(s) + inf s∈[0,1] B(s)
.
2
(2.4.26)
The correlation function can be directly found using the known joint distributions for the
Brownian bridge and its sup and for the sup and inf of a Brownian bridge (see Appendix
A). After some calculations we obtain
hµ∞ (t)µ∞ (t + τ )i =
2 − ζ(2)
− φ(τ ),
4
(2.4.27)
P∞
2
where ζ(2) := n=1 n12 = π6 . The value µ2∞ (t) = 2−ζ(2)
= 0.0887665 . . . is very close to
4
1
the value obtained for p = 2, µ22 (t) = 12
= 0.08¯3. In figure 2.4.2b we plot the values of
cp := µ2p (t)
(2.4.28)
as function of p. Note, finally, that due to the fact that we have imposed pbc, h•i ≡ • holds
in all the previous formulas.
Let us now introduce the normalised transport field
σp (t) :=
µp (t)
= sign(µp (t)).
|µp (t)|
(2.4.29)
The correlation function σp (s)σp (t) can be computed from the covariance matrix observing
that the process is still Gaussian. The correlation function is found in the form
σp (s)σp (t) =
2
cp − φ(t − s)
arctan p
π
φ(t − s) (2cp − φ(t − s))
(2.4.30)
1
where c2 = 12
for p = 2 and c∞ = 2−ζ(2)
.
4
The optimal cost in the p → ∞ limit can be evaluated as the average spread of the
Brownian bridge. Denoting
ξ := sup B(s) − inf B(s),
s∈[0,1]
(2.4.31)
s∈[0,1]
the distribution of the spread ξ is given by [21]
2
2
d
Pr (ξ < u) = θ3 e−2u + u
θ3 e−2u ,
du
where
θ3 (x) ≡ ϑ3 (0, x),
ϑ3 (z, q) := 1 + 2
∞
X
n=1
2
q n cos(2nz)
(2.4.32)
(2.4.33)
42
Chapter 2. Euclidean Matching Problems
p=2
p=5
p = 200
p→∞
0.088
0.05
cp
hµp (t)µp (t + τ )i
0.1
0.086
0
0.084
p=2
−0.05
0
0.1
0.2
0.3
0.4
5
0.5
10
15
(a) Correlation function in the pbc case for N =
1000 obtained averaging over 104 realisations (for
clarity, not all data are represented). The continuous line is the theoretical prediction for p = 2;
the dashed line corresponds to the p → ∞ case.
(b) Value of cp in the pbc case for N = 1000
obtained from the fit of the correlation function,
averaging over 5000 realisations, for different values of p.
101
ρ˜p (u)
tan π2 σp (t)σp (t + τ )
p=2
p = 50
p = 200
p→∞
4
2
100
10−1
10−2
p=2
p = 200
0
0
(c)
d
du
20
p
τ
0.2
Density
p
(p,gP)
Pr N 2 E˜N
0.4
0.6
u
distribution
<u
of
0.8
1
1.2
0
0.1
0.2
0.3
0.4
0.5
τ
ρ˜p (u)
the
:=
rescaled
cost for different values of p computed with
2 · 104 iterations and N = 1000 for the pbc case.
(d) Plot for the normalised map correlation function (2.4.30) obtained with 104 iterations and
N = 1000 for the pbc case. The continuous line
is the theoretical prediction for p = 2, whilst
the dotted line is obtained from the theoretical
prediction
for p → ∞ The pole is at τ (p) =
√
1
2
−
1−8cp
.
2
Figure 2.4.2.: One dimensional gP Ebmp on the circumference.
2.4. One dimensional Ebmp: an exact solution for convex weights
43
is the third Jacobi theta function. From eq. (2.4.32) the distribution of the optimal cost in
the p → ∞ limit is easily derived. Moreover,
r
!
√
ξ
1
1 π
N
→∞
(p,gP)
(2.4.34)
N E˜N
−−−−→ =
sup B(s) −
inf B(s)
=
p→∞ 2
2 s∈[0,1]
2 2
s∈[0,1]
with corresponding variance
2 p→∞ π 2 − 3π
(p,gP)
(p,gP)
N E˜N
− E˜N
.
−−−→
24
(2.4.35)
2.4.3. General solution and universality for p > 1
In the present section we will justify and generalise the previous results looking at the
so Monge–Kantoroviˇc formulation of the problem. Let us consider the interval Ω1 (L) :=
[0, L] ⊂ R, L ∈ R+ and let us suppose also that two different measures are given on Ω, i.e.,
the uniform (Lebesgue) measure
d m(x) :=
1
d x,
L
(2.4.36)
and a non uniform measure d n(x) with measure density ν(x),
∞ X
2πkx
2πkx
dx
ν1 (k) cos
+ dx
+ ν2 (k) sin
.
d n(x) := ν(x) d x =
L
L
L
(2.4.37)
k=1
We ask for the optimal map M : Ω1 (L) → Ω1 (L) such that the following transport condition
is satisfied
Z
Z
1
dx =
d n(x) ∀A ⊂ Ω measurable.
(2.4.38)
L
A
M −1 (A)
and M minimises the following functional
ZL
Ep [M ; m, n] :=
p
|x − M (x)| d n(x),
p ∈ R+ .
(2.4.39)
0
In the hypothesis p > 1 the Jacobian equations (2.3.13) and eq. (2.4.38) can be rewritten
as a change-of-variable formula:
L d n(x) = d M (x).
(2.4.40)
Imposing pbc, that is M (0) = M (L) − L, the solution of (2.4.40) determines the optimal
map up to a constant as
M = x + Mp (0) + ϕ(x).
(2.4.41)
In the previous equation we have introduced ϕ(x)
X
∞
∞
X
L2 sin kπx
kπx
L2
kπx
L
cos
+
ν2 (k)
sin2
.
ϕ(x) :=
ν1 (k)
πk
L
πk
L
k=1
(2.4.42)
k=1
Note that ϕ(0) = ϕ(L) = 0. The value of Mp (0) must be determined requiring that the
functional (2.4.39) is minimum: we have that
ZL
p−1
sign (Mp (0) + ϕ(x)) |µp (0) + ϕ(x)|
p
0
d n(x) = 0.
(2.4.43)
44
Chapter 2. Euclidean Matching Problems
If instead obc are considered, then Mp (0) = 0 and the solution is obtained explicitly ∀p > 1.
Let us now suppose that L ≡ N ∈ N and that the measure n(d x) is obtained as a limit
measure of a random atomic measure of the form
!
N
N
dx X
1 X
d nN (x) :=
δ (x − ηi ) = d
θ (x − ηi ) ,
(2.4.44)
N i=1
N i=1
where {ηi }i=1,...,N is a set of N points uniformly randomly distributed in Ω1 (N ). The
previous measure can be written as
N
∞ r
X
2 Zk
πkx
x
1 X
sin
+
+
ηi − 1.
(2.4.45)
nN (x) =
N πk
N
N
N i=1
k=1
where we have introduced
N
1 X
Zk ≡ Zk (x) :=
zi (x),
N i=1
√
2ηi + x
zi (x) := − 2N cos kπ
.
N
(2.4.46)
Observe now that Zk (x) is a sum of independent identically distributed random variables.
From the central limit theorem, we have that Zk (x) is normally distributed as
Zk ∼ N (0, 1)
∀k ∈ N.
(2.4.47)
Remarkably the previous distribution does not depend on x. Moreover, the Zk and Zl are
independent random variables for k 6= l, being Gaussian distributed and hZl Zk i = 0, where
the average h•i is intended over the possible values {ηi }i . In eq. (2.4.45) the Karhunen–
Lo`eve expansion for the Brownian bridge [5] on the interval [0, N ] appears:
BN (x) :=
∞
X
√
k=1
2N
πkx
Zk
sin
,
πk
N
Zk ∼ N (0, 1)
∀k ∈ N.
(2.4.48)
It follows that nN (x) can be written for large N , up to irrelevant additive constants, as
nN (x) '
1
x
BN (x) +
N
N
(2.4.49)
and therefore we cannot associate a density measure to it, due to the fact that nN (x) is not
differentiable. However the solution of the matching problem in the continuum can be still
obtained directly from eq. (2.4.37); considering pbc, then
Mp (x) = Mp (0) + x + BN (x),
x ∈ [0, N ].
(2.4.50)
Denoting by
µp (x) := Mp (x) − x,
(2.4.51)
it follows that ∀p > 1
µp (x)µp (y) = cp (N ) − N φ
x−y
N
.
(2.4.52)
where cp (N ) is a constant depending on N and p. Adopting the notation F (N ) ∼ G(N ) ⇔
F (N )
0 < limN →∞ G(N
) < +∞, for two positive real functions F and G depending on N , note
that
cp (N ) ∼ cp N
(2.4.53)
for some positive constant cp depending on p. Indeed, in the discrete case, from the fact
that the solution must be ordered, in the large N limit it can be easily seen that
p
min Ep [M ; m, n] ∼ N 2 ,
M
(2.4.54)
2.4. One dimensional Ebmp: an exact solution for convex weights
45
where Ep is the cost functional (2.4.39) in which the measure (2.4.44) is adopted. Therefore,
√
Mp (0) = µp (0) ∼ µp (x) ∼ N . Moreover, note also that Mp2 (t) < N 2 ⇒ cp (0) = 0. For
p = 2, eq. (2.4.43) becomes
1
M2 (0) = −
N
ZN
1
BN (x) ◦ d BN (x) −
N
0
ZN
1
BN (x) d x = −
N
0
ZN
BN (x) d x
(2.4.55)
0
where the first integral is intended in the Stratonoviˇc sense. The result is in agreement with
the one presented in the Subsection 2.4.2. If we consider the transport cost functional
v
uN
uZ
u
p
p
˜
Ep [M ; m, n] := t
|x − M (x)| d n(x)
(2.4.56)
0
the matching problem has the same solution M˜p ≡ Mp obtained
√ for the cost (2.4.39) for
all finite values of p, due to the fact that the function f (x) = p x is monotone. However,
for the functional cost (2.4.56), in the p → ∞ limit, we can reproduce the computations of
Section 2.4.2, obtaining
lim M˜p (0) = −
p→∞
sups∈[0,N ] BN (s) + inf s∈[0,N ] BN (s)
.
2
(2.4.57)
If obc are considered, then Mp (0) = 0 and we have simply, ∀p > 1,
µp (x) ≡ µ(x) = BN (x),
x ∈ [0, N ];
(2.4.58)
It can be easily seen that
1
N
ZN
µ(x)µ(x + r) d x = c(N ) + N φ
r
,
N
c(N ) =
N
.
6
(2.4.59)
0
Note that, therefore, if we consider the problem of a matching of N random b-points to N
lattice r-points on the interval [0, N ], for s, t ∈ [0, N ] the correlation function assumes the
form
|τ |
|τ |
G(τ ) := hµp (t)µp (t + τ )i = cp N −
1−
,
(2.4.60)
2
N
where Mp (t) is the signed distance between t ∈ [0, N ] and its corresponding inverse image
under the action of the optimal map. It follows that for N → ∞ the correlation function
G(τ ) has a divergent part, G(0) = cp N , depending through cp on the specific details of
the problem (e.g., the boundary conditions adopted or the value of p), a universal finite
τ2
part − |τ2| and a (universal) finite size correction 2N
. We obtained also numerical evidences
of the validity of eq. (2.4.60) for different values of p both with obc and with pbc: an
exact derivation of cp for obc ∀p > 1 and for c2 and c∞ for pbc was presented. This fact
suggests that all Euclidean matching problems in one dimension with strictly convex cost
functionals belong to the same universality class and that the specific details of the model
determine only the value of the constant cp in the divergent contribution G(0). Similarly,
on the interval [0, N ] eq. (2.4.30) becomes
s
τ
2cp N − τ2 1 − N
1
2
1
2τ
hσp (t)σp (t + τ )i = arctan q
= 1 − π c N + o √
π
τ
N
p
1 − τ 4cp N − τ 1 − τ
2
N
N
(2.4.61)
in which the universal part is given by the constant 1 and finite size corrections scale as
√1 up to a scaling factor depending on cp .
N
46
Chapter 2. Euclidean Matching Problems
2.5. L Higher dimensional
Euclidean matching problem
In Section 2.3 we discussed the Monge–Kantoroviˇc problem and we observed that it appears
as a “continuum” generalisation of our discrete matching problem. In the previous section,
moreover, we gave a general solution for the one dimensional case for p > 1 using the
Jacobian equation (2.3.13), that for d = 1 is linear.
Inspired by the previous results and by the continuum theory, we tried, in a non conventional way, to obtain useful informations on the discrete problem for any dimension from
the continuum problem by proper regularisation procedures.
In particular, we made an ansatz about the functional dependence of the optimal cost
(2,PP)
of the Ebmp EN
from the empirical measures obtained from the two sets of points.
Indeed, let us consider a bipartite matching problem on KN,N := Graph(V, U; E), and let us
suppose that we associate to the vertices of KN,N 2N points independently and uniformly
randomly generated on the hypercube Ωd = [0, 1]d ⊂ Rd through the map Φ:
V := {vi }i=1,...,N
U := {ui }i=1,...,N
7→
R := Φ(V) = {ri := Φ(vi )}i=1,...N ⊂ Ωd ,
7
→
B := Φ(U) = {bi := Φ(ui )}i=1,...N ⊂ Ωd .
(2.5.1a)
(2.5.1b)
As above, we want to find the optimal permutation σ ∗ that minimises the following functional:
N
1 X
(2,PP)
bσ(i) − ri 2 .
(2.5.2)
[σ; Φ] :=
EN
N i=1
We can associate to the set R and to the set B two empirical measures respectively
ρR (x) :=
N
1 X
δ (x − ri ) ,
N i=1
(2.5.3a)
ρB (x) :=
N
1 X
δ (x − bi ) ,
N i=1
(2.5.3b)
N →∞
such that ρR (x), ρB (x) −−−−→ ρ(x) = 1 weakly. We suppose that periodic boundary conditions are imposed, so we work on the d dimensional flat torus Td . Let us call
δρ(x) := ρR (x) − ρB (x)
(2.5.4)
the difference between the two densities and
δ ρˆ(k) :=
N
X
e−2πik·ri − e−2πik·bi
N
i=1
,
k ∈ Zd ,
(2.5.5)
the corresponding Fourier modes. We ask therefore for an optimal map M : Ωd → Ωd such
that
M(ri ) = bσ∗ (i) .
(2.5.6)
The previous map is indeed a map between the two atomic densities (2.5.3) that minimises
the following transport functional
Z
E[ν; ρR , ρB ] :=
Ωd
2
kx − ν(x)k ρR (x) dd x ≡
N
1 X
2
kri − ν(ri )k .
N i=1
(2.5.7)
among all possible maps between the two given atomic measures. Observe that the optimal
map is not unique in our hypotheses (indeed, there are infinite possible maps satisfying the
2.5. Higher dimensional Euclidean matching problem
47
constraint (2.5.6)). Neglecting for a moment the fact the highly non regular structure of
our measures, we have that for a quadratic random cost such as (2.5.7) the optimal map
can be expressed as a gradient of a scalar function ϕ(x),
M(x) := x + µ(x),
µ(x) = ∇ϕ(x)
(2.5.8)
satisfying the Monge–Amp`ere equation (2.3.18):
∂ 2 ϕ(x)
ρR (x) = ρB (x + ∇ϕ(x)) det I +
.
∂xi ∂xj
(2.5.9)
In the limit N → ∞ however we expect that |µ(x)| 1 (the two empirical measures
converge weakly to the same uniform measure and therefore M(x) → x, identity map):
under this working hypothesis, the Monge–Amp`ere equation can be linearised as
∆ϕ(x) = δρ(x).
(2.5.10)
The previous equation has only one non trivial solution on the d-dimensional hypertorus
and therefore identifies our optimal map. Substituting in the equation (2.5.7) we have that
for the optimal map M
2
X
|δ ρˆ(k)|
(2.5.11)
E[M; ρR , ρB ] =
2.
4π 2 kkk
k∈Zd \{0}
Our hypothesis is that the previous functional at large N captures the leading terms of the
(2,PP) ∗
exact optimal cost of our original matching problem EN
[σ ; Φ]; in particular, observing
that
2
2
|δ ρˆ(k)| =
(2.5.12)
N
we expect that
X
1
1
(2,PP)
(d) ≈
(2.5.13)
EN
2.
2
N
2π kkk
d
k∈Z \{0}
One dimensional case Let us consider the Ebmp in dimension d = 1. In low dimensions,
that is for d = 1, 2, the optimal cost is no longer a subadditive quantity and the fluctuations
of the density of points are dominant. Moreover for d = 1 the optimal cost is also not selfaveraging [27]. The one-dimensional case is the simplest application of our formula, since
this is the only where no regularisation is needed. We obtain straightforwardly
(2,PP)
EN
(1) =
1
6N
(2.5.14)
This result is exact and it can be obtained reproducing arguments similar to the gP case
presented in Section 2.4. We checked the validity of our prediction numerically: results are
presented in fig. 2.5.1.
Two dimensional case For d ≥ 2 the sum appearing in eq. (2.5.13) diverges; therefore
we have to find a suitable regularization to give meaning to the expression. To regularise
the sum, let us introduce a proper cut-off in the momentum space, i.e., let us consider only
the element of the sum with
2π
1
kkk ≤
, ` := √
,
(2.5.15)
d
`
N
where ` is the characteristic length of the system at finite N , being of the order of the
distance between two points of different type. Clearly 2π`−1 → +∞ for N → +∞.
48
Chapter 2. Euclidean Matching Problems
−0.155
Numerical data
(2)
Fit by f1 (N )
(2,PP)
0.164
−0.16
−0.165
N EN
EN
(2,PP)
(1)
(1) −
N
6
0.166
0.162
−0.17
0
0.005 0.01 0.015 0.02 0.025 0.03
N
0
0.01
0.01
0.02
−1
N
(2,PP)
(a) Numerical data for N EN
0.02
0.03
(2,PP)
(b) Numerical results for N EN
(1).
0.03
−1
(1) − 16 .
Figure 2.5.1.: Numerical simulations for d = 1; the fit was performed using the three parameters fit function
(2)
(2)
(2)
f1 (N ) = α1
+
e1
N
+
c
.
N2
We averaged the optimal cost of the Ebmp given by an exact algorithm [18] for system sizes up to N =
(2)
2048. From a least square fit we obtained the coefficient α1 = 0.166668(3), in perfect agreement with
our analytical prediction. Once verified the validity our theoretical result, we used it to extrapolate the
(2)
(2)
(2)
subleading coefficient e1 , fixing α1 = 16 and using the fitting function f1 (N ) with two free parameters
(see table 2.1).
Let us start with the d = 2 case. We choose a regularizing smooth function F (x) such
that F (0) = 1 and F (x) → 0 for x → +∞. The function has to decrease rapidly enough to
make the series
X
2πknk
1
(2.5.16)
2F
2π`−1
knk
2
n∈Z \{0}
converge. Finally, let us denote by
Nd (r) := {x ∈ Zd \ {0}|kxk < r},
(2.5.17)
the number of lattice points (excluded the origin) included in a ball of radius r centred in
the origin in dimension d. Then, for arbitrary a ∈ (0, 1), we can write
1
X
k∈Z2 \{0}
kkk
2F
kkk
√
N
ZR
= lim
R→∞
1
F
r2
a
ZR
≈ lim
R→∞
r
√
N
Z∞ ∂N2 (r)
r
dr
− 2πr d r+2π F √
∂r
r
N
a
1 ∂N2 (r)
− 2πr d r + 2π
r2
∂r
Z∞
F (r)
d r,
r
(2.5.18)
√a
N
a
where we have isolated in the last term the singular part of the series. The first integral in
the right hand side is well behaved: indeed,
ZR
a
R ZR
N2 (r) − πr2 1 ∂N2 (r)
N2 (r) − πr2
−
2πr
d
r
=
.
+
2
2
r
∂r
r
2r3
a
(2.5.19)
a
Both the first and the second term are finite in the R → ∞ limit due to the fact that [26]
2.5. Higher dimensional Euclidean matching problem
49
0.134
0.132
(2) −
(2,PP)
1
0.13
N EN
N EN
(2,PP)
(2)
ln N
2π
1.5
0.5
0.128
Numerical data
Fit by
0
10−5
10−4
(2)
f2 (N )
10−3
10−2
10−1
10−4
100
N −1
10−3
10−2
10−1
N −1
(2,PP)
(a) Numerical data for N EN
(2,PP)
(b) Numerical results for N EN
(2).
(2) −
ln N
2π
.
(2,PP)
Figure 2.5.2.: Numerical simulations for d = 2. We fitted our numerical data for N EN
(2) using the
function
b
(2)
(2)
.
f2 (N ) = a ln N + e2 +
ln N
1
The ln N correction was suggested by the right hand plot. From a least square fit we obtain the coefficient
2πa = 1.0004(6), in perfect agreement with our analytical prediction. Once verified the theoretical prediction
(2)
1
and fitting the other two
for a, we used it to extrapolate the subleading coefficient e2 , fixing a = 2π
parameters (see table 2.1).
d=1
1
2π 2 ζd (1)
(2)
αd
(2)
ed
d=2
−
−
0.1332(5)
1
6
0.166668(3)
−0.1645(13)
d=3
−0.45157 . . .
−0.4489(16)
0.66251(2)
d=4
−0.28091 . . .
−0.282(4)
0.571284(6)
d=5
−0.21423 . . .
−0.2139(13)
0.584786(2)
(2)
Table 2.1.: Results of numerical simulations for p = 2. In the first line the analytical prediction for αd
for d 6= 2 is presented.
√
N2 (r) − πr2 ≤ 1 + 2 2πr. Therefore we have
1
X
k∈Z2 \{0}
kkk
2F
kkk
√
N
+∞
Z
≈
√
Z∞
F (r)
1 N
2
N2 (r) − πr d r + 2π log
+ 2π
d r.
3
2r
a
r
a
1
(2.5.20)
Eq. (2.5.13) for the case d = 2 can then be rewritten as
(2)
(2,PP)
EN
(2)
where e2
(2) ∼
ln N
e
+ 2 ,
2πN
N
(2.5.21)
is some constant. To our knowledge the result
(2,PP)
N EN
(2)
1
=
N →∞
ln N
2π
lim
(2.5.22)
is new to the literature. The validity of eq. (2.5.21) has been confirmed by numerical
simulations, see fig. 2.5.2.
Higher dimensional case As in the previous paragraph, we use a sufficiently rapidly
decaying function F (x), with limx→∞ F (x) = 0 and limx→0 F (x) = 1, to regularise our
50
Chapter 2. Euclidean Matching Problems
0.572
0.66
(4)
(2,PP)
0.64
0.568
N EN
√
√
3
N 2 EN
(2,PP)
(3)
0.57
0.62
0.566
0.564
0
0.06
0.12
1.5 · 10−2
N −γ4
0
N −γ3
(a) d = 3.
3 · 10−2
(b) d = 4.
0.584
0.583
√
5
N 2 EN
(2,PP)
(5)
0.585
0.582
0
5 · 10−3
1.3 · 10−2
N
−γ5
(c) d = 5.
Figure 2.5.3.: Numerical simulations for d > 2. We fitted our numerical data for
function
c
(2)
(2)
(2)
fd (N ) = ed + αd N −γd + .
N
√
d
(2,PP)
N 2 EN
(d) using the
(2)
The expected value for αd was 2π1 2 ζd (1). We verified the validity of eq. (2.5.33) with numerical simulation
on systems with sizes up to N = 10648 in dimension d = 3, N = 14641 in dimension d = 4 and N = 32768
in dimension d = 5. The scaling exponents γd are readily confirmed to be the right ones and the fitted
(2)
(2)
ζd (1)
coefficients αd are in strong agreement with the analytical prediction. We fixed also αd = 2π
in
2
(2)
(2)
fd (N ) to extrapolate the extensive coefficients ed , reported in table 2.1.
divergent series. Denoting as before by Nd (r) the number of lattice points inside a sphere
centred in the origin and of radius r, with the exclusion of the origin itself, we can write
1
X
2F
kkk
k∈Zd \{0}
+∞
Z
=
kkk
√
d
N
1
F
r2
0
=
r
√
d
N
+∞
Z
≈
0
Z∞
d−2
∂Nd (r)
d−1
− Sd r
d r + N d Sd F (r) rd−3 d r
∂r
0
d−2
1 ∂Nd (r)
− Sd rd−1 d r + N d Sd
2
r
∂r
Z∞
0
F (r) rd−3 d r,
(2.5.23)
2.5. Higher dimensional Euclidean matching problem
51
d
where Sd =
2π 2
Γ( d
2)
is the unit sphere surface in d dimensions. The last term in Eq. (2.5.23)
cannot be explicitly computed since it depends on the choice of the regularizing function
F (x). We name
+∞
Z
1 ∂Nd (r)
d−1
Σd :=
− Sd r
d r.
(2.5.24)
r2
∂r
0
It can be shown that Σd = ζd (1), the analytic continuation to the point s = 1 of the function
1
X
ζd (s) :=
k∈Zd \{0}
kkk
2s
for <s >
d
.
2
(2.5.25)
The previous function is a particular case of a more general class of zeta functions, called
Epstein zeta functions. Indeed, let us fix d > 2 and a > 0. Then, for <s > d2 , we can rewrite
eq. (2.5.25) as



ζd (s) = lim 
R→+∞ 

ZR
1
X
− Sd
2s
k∈Zd \{0}
kkk≤R
kkk
rd−1
d r + Sd
r2s
a

R→+∞
a
kkk≤R
ZR
1
X
= lim 
2s
k∈Zd \{0}
ZR
kkk
− Sd

rd−1

d
r

r2s


d−2s
rd−1
 + Sd a
d
r
. (2.5.26)
r2s
2s − d
a
Assuming that the limit in the last equation exists also for <s <
singular term. The analytic continuation of ζd (s) then reads




ζd (s) = lim 
R→+∞ 
ZR
1
X
2s
k∈Zd \{0}
kkk≤R
kkk
− Sd
0

rd−1

d
r

r2s

d
2,
we have isolated the
for <s <
d
,
2
(2.5.27)
where the limit a → 0 has been taken. Note that, comparing the previous equation with
eq. (2.5.24), ζd (1) ≡ Σd . On the other hand, for <s > d2
ζd (s) =
X
k∈Zd \{0}
1
Γ(s)
+∞
+∞
Z
Z
πs
s−1 −kkk2 z
z
e
dz =
z s−1 Θd (z) − 1 d z,
Γ(s)
0
(2.5.28)
0
P+∞
2
where Θ(z) is defined by Θ(z) := n=−∞ e−πn z ≡ ϑ3 (0; iz), where ϑ3 (τ ; z) is the third
Jacobi theta function. Noticing that for z 1 asymptotically we have Θ(z) ∼ √1z , while
Θ(z) ∼ 1 + 2 e−z for z 1, we can isolate the singular parts of ζd (s) writing


+∞
Z
Z1
1
1
πs  2
− +
z s−1 Θd (z) − 1 d z + z s−1 Θd (z) − d d z  .
ζd (s) =
Γ(s) 2s − d s
z2
1
0
Last expression can be readily continued to the region <s < d2 . Using the property
Θ(t−1 ), we can write


+∞
Z
d
2
Σd = ζd (1) = π 
−1+
1 + z 2 −2 Θd (z) − 1 d z  .
2−d
1
√ (2.5.29)
t Θ(t) =
(2.5.30)
52
Chapter 2. Euclidean Matching Problems
d=3
−0.233(2)
0.7472(2)
p=1
d=4
−0.152(17)
0.7181(2)
d=5
−0.127(16)
0.73905(5)
αd
(3)
ed
d=3
−0.652(3)
0.6313(2)
p=3
d=4
−0.36(1)
0.4804(1)
d=5
−0.28(4)
0.4830(1)
(4)
αd
(4)
ed
d=3
−0.863(3)
0.6286(2)
p=4
d=4
−0.44(2)
0.4183(2)
d=5
−0.34(1)
0.41043(4)
(1)
αd
(1)
ed
(3)
√
d
Table 2.2.: Results of the fit of numerical data for
(p,PP)
N p EN
(p)
(p)
(p)
fd (N ) = ed
+
αd
N γd
+
(d) by a function of the form
c
.
N
Therefore we finally have that for d > 2,
(2,PP)
EN
(2)
2
(2) ∼ ed N − d +
ζd (1)
.
2π 2 N
(2.5.31)
(2)
The expression for ζd (1) is given by Eq. (2.5.30), while ed have to be determined numerically. Note that for d → +∞ we recover the correct meanfield scaling behaviour already
analysed by Houdayer, Boutet de Monvel, and Martin [27] for the average optimal cost
(mf)
EN of the random assignment problem
√
d
(2,PP)
N 2 EN
d→∞
(mf)
(d) −−−→ EN
(2.5.32)
Observe indeed that the rescaled variable above has the scaling behaviour
√
d
(2,PP)
N 2 EN
(2)
(d) ∼ ed +
ζd (1)
,
2π 2 N γd
2
γd := 1 − .
d
(2.5.33)
However, for finite d, the scaling behaviour can be very different from the mean field one
that Houdayer, Boutet de Monvel, and Martin [27] used to fit also the Euclidean optimal
cost. The predictions above were confirmed by proper numerical simulations (see fig. 2.5.3).
Numerical results for generic p The asymptotic form we proposed for the average
optimal cost in the Ebmp with quadratic costs and periodic boundary conditions could
not be extended to cover the case of generic cost exponent p. Nonetheless, our numerical
(p,PP)
simulations give strong evidence that, for d > 2 and any p ≥ 1, EN
(d) has the asymptotic
form
(p)
√
α
d
(p,PP)
(p)
N p EN
(d) ∼ ed + dγ .
(2.5.34)
N d
We thus find the same scaling exponent γd analytically predicted only for the case p = 2.
Again, the non-trivial scaling exponent γd differs from the mean-field exponent γ∞ = 1
of the random link matching problem [46] and of the Eucliden monopartite matching [27].
(p)
The identification of the correct exponent γd is crucial to the extrapolation of ed from
numerical data. Since in the literature a mean-field like scaling has been considered [27], we
2.6. Correlation functions for the Ebmp
(p)
53
(p)
report in table 2.2 the values of ed and αd for different values of p. They were obtained
√
(p,PP)
fitting d N p EN
(d) with functions of the form
(p)
(p)
(p)
fd (N ) = ed +
αd
c
+ .
N γd
N
(2.5.35)
2.6. L Correlation functions for the Ebmp
Let us now discuss a different aspect of the Ebmp with reference to the optimal matching
map µ(ri ) := bσ∗ (i) − ri , σ ∗ optimal permutation, between a set of N r-points and a set
of N b-points, in the notation above, in the large N limit. In particular, let us analyse the
correlation function of the optimal map in the Ebmp with quadratic cost both in the gP
case and in the PP case. To avoid finite size effects, we will work on the unit hypertorus
Td (i.e., we will assume periodic boundary conditions). The correlation function is defined
as follows:
.
(2.6.1)
Cd (x) := µ(ri ) · µ(rj )
ri −rj =x
To obtain the previous quantity we will proceed in two different ways:
• we will generalise the scaling ansatz proposed for the average optimal cost [13];
• we will introduce a functional formalism to compute the two point correlation function
for the optimal map [11].
Both the methods and the results are, to our knowledge, new to the literature. Let us
start with the first approach: in the next Section we will introduce the second one in all
generality.
Let us consider the Ebmp between two set of points, R = {ri }i=1,...,N ⊂ Ωd and B =
{bi }i=1,...,N ⊂ Ωd , in the unit hypercube with periodic boundary conditions and quadratic
cost. In the following we will adopt the same notation of Section 2.5. Using the simple
equation (2.5.10), obtained by a proper linearisation of the Monge–Amp´ere equation in
the large N limit. We can simply write down an expression for the correlation function
assuming that
µ(x) = ∇ϕ(x), ∆ϕ(x) = δρ(x) := ρR (x) − ρR (x).
(2.6.2)
R
d
Being Ωd ρ(x) d x = 0, eq. (2.5.10) has a unique solution on the compact manifold Td ,
given by
Z
ϕ(x) = ρ(y)Gd (y, x) dd y,
(2.6.3)
Td
where Gd is the Green’s function for the Laplace operator ∆ on the flat torus Td defined by
∆y Gd (x, y) = δ (d) (x − y) − 1,
(2.6.4)
where we have introduced the empirical measures (2.5.3); in a Fourier mode expansion, we
can write
X e2πin·(x−y)
Gd (x, y) ≡ Gd (x − y) = −
(2.6.5)
2 .
4π 2 knk
n∈Zd \{0}
Assuming that at least one set of points is randomly generated, we introduce the following
correlation function
CdPP (x − y) :=∇φ(x) · ∇φ(y)
ZZ
=
∇z Gd (z − x) · ∇w Gd (w − y)ρ(z)ρ(w) dd z dd w
(2.6.6)
54
Chapter 2. Euclidean Matching Problems
where we used the fact that periodic boundary conditions are assumed and eq. (2.6.3) holds;
moreover, we denoted by
ρ(x) := ρR (x) − ρB (x)
N
i
1 X h (d)
δ (x − ri ) − δ (d) (x − bi )
=
N i=1
(2.6.7)
and the average • is intended over all possible instances as usual. As above, we will distinguish two different cases:
PP Ebmp The points of R and the points of B are random points uniformly distributed
within Ωd . In this case we obtain
i
2 h (d)
δ (x − y) − 1 ,
ρ(x)ρ(y) =
(2.6.8)
N
and therefore the correlation function is
CdPP (x − y) = −
2
Gd (x − y).
N
(2.6.9)
gP Ebmp In the gP Ebmp we suppose that N = Ld for some natural L ∈ N and that one
set of points, e.g. the set R = {ri }i=1,...,N ,is fixed on the vertices
of an hypercubic
k
lattice, in such a way that {ri }i=1,...,N = L
|k ∈ [0, L]d ∩ Nd , whilst the set B =
{bi }i=1,...,N ⊂ Ωd is obtained as before considering randomly generated points in Ωd .
We have
2
ρ(x)ρ(y) = N1 δ (d) (x − y) + N N−N
2
P (d)
1
+ N 2 ij δ (x − ri ) δ (d) (y − rj )
P − N1 i δ (d) (x − ri ) + δ (d) (y − ri ) .
(2.6.10)
In this case the correlation function is therefore
Cdgp (x − y) = −
1
Gd (x − y).
N
(2.6.11)
We will consider also the correlation function for the normalised transport field, i.e. the
following quantity:
cPP
(2.6.12)
d (x − y) = σ(x) · σ(y),
in which the correlation between the values normalized transport field
σ(x) :=
µ(x)
∇x φ(x)
=
kµ(x)k
k∇x φ(x)k
(2.6.13)
in different positions is evaluated. Note that σ lives on the d-dimensional unit sphere.
To compute the correlation function (2.6.12) for the normalised field in the PP case, we
assume a Gaussian behaviour for the joint probability distribution of two values of the
optimal transport field, and therefore we have
ZZ
−1
1 T
µ1 · µ2 e− 2 µ ·Σ (x,y)·µ
d
d
PP
cd (x − y) =
d µ1 d µ2
(2.6.14)
√
d
kµ1 kkµ2 k
2π det Σ
T
where µ = µ1 µ2 and Σ(x, y) is the covariance matrix,
µ(x) · µ(x)
µ(x) · µ(y)
CdPP (0)
Σ(x, y) :=
≡
PP
Cd (x − y)
µ(y) · µ(x)
µ(y) · µ(y)
CdPP (x − y)
.
CdPP (0)
(2.6.15)
2.6. Correlation functions for the Ebmp
55
For d ≥ 2 (the case d = 1 was studied in [10]), introducing
A :=
CdPP (0)
,
det Σ(x, y)
(2.6.16a)
B :=
CdPP (x − y)
,
det Σ(x, y)
(2.6.16b)
2
+
PP
PP
1− d
observe that B
A → 0 for N → ∞, being N Cd (x) finite for x 6= 0 and N Cd (0) ∼ N
for d > 2, N CdPP (0) ∼ ln N for d = 2. We have therefore that, in the notation above,
det Σ =
A2
1
− B2
(2.6.17)
and
B 2Γ2 d+1
2
cPP
d (x, y) =
A dΓ2 d2
B2
1− 2
A
d2
d+1
2
d
2
2 F1
d+1
2
B 2 N →∞ 2
;
−−−−→
B
d
+ 1 A2
A →0
Γ
Γ
d+1
2
d
2
!2
In the previous expression, we have introduced the hypergeometric function
∞
X
a b
Γ (a + 1)
(a)n (b)n z n
; z :=
, (a)n :=
.
2 F1
c
(c)n n!
Γ (a − n + 1)
n=0
CdPP (x − y)
(2,PP)
EN
(d)
(2.6.18)
(2.6.19)
Observe that the corresponding correlation function for the normalised field in the gP case
has the same expression, i.e.,
!2
2 Γ d+1
CdgP (x − y)
gP
2
cd (x − y) =
.
(2.6.20)
d
(2,gP)
d
Γ 2
EN
(d)
Finally, for d ≥ 2 we can compute also the so called wall-to-wall correlation function for
the PP case:
 1

!2
Z
d
Y
G1 (r)
4 Γ d+1
PP
2


.
(2.6.21)
Wd (r) :=
d xi cd (r, x2 , . . . , xd ) = −
d
(2,PP)
d
Γ 2
EN
(d)
i=2
0
Similarly, the computation for the gP case gives
 1

Z
d
Y
2
 d xi  cgp
WdgP (r) :=
d (r, x2 , . . . , xd ) = −
d
i=2
0
Γ
Γ
d+1
2
d
2
!2
G1 (r)
.
(2,gP)
(d)
EN
(2.6.22)
In the following we consider explicitly the cases d = 1 and d = 2 and we numerically verify
the results presented above.
One dimensional case In Section 2.4 we discussed an exact solution of the one dimensional matching problem in the grid–Poisson case. Let us consider now the PP case (as
always, similar considerations can be carried on for the gP case). We have that
G1 (r) = −
X
n6=0
1
|r|
1
e2πinr = − +
(1 − |r|) .
2
2
4π n
12
2
(2.6.23)
It follows from eq. (2.6.9) that
C1PP (x − y) =
1
N
1
6
− |x − y| (1 − |x − y|) ;
(2.6.24)
.
56
Chapter 2. Euclidean Matching Problems
G2
0
−0.2
0.5
−0.4
0
−0.4 −0.2
0
0.2
x
y
0.4 −0.5
Figure 2.6.1.: Laplacian Green’s function on T2 .
moreover, note that the average total cost of the optimal matching is given by
(2,PP)
EN
(1) = C1PP (0) =
1
6N .
(2.6.25)
In the gP the results of Section 2.4 are easily recovered; Boniolo, Caracciolo, and Sportiello
[10] obtained the correlation function for the normalised transport field as
"
#
2
1
−
6x(1
−
x)
gP
cPP
arctan p
.
(2.6.26)
1 (x) = c1 (x) =
π
12x(1 − x) (1 − 3x(1 − x))
Two dimensional case
have that [30]
G2 (x − y) = −
For d = 2, denoting by x = (x1 , x2 ) and by y = (y1 , y2 ), we
X
1
n6=0
2
4π 2 knk
e2πin·(x−y)
1
3/4 ϑ1 (πz|i)
ln 2π
=
1
2π Γ 4 2
z=(x1 −y1 )+i(x2 −y2 )
(x2 − y2 )
−
,
2
(2.6.27)
where, denoting by q := eiπτ ,
1
ϑ1 (z|τ ) := 2q 4
∞
X
(−1)n q n(n+1) sin [(2n + 1)z]
(2.6.28)
n=0
is the first Jacobi theta function. From eq. (2.6.9) we have simply
"
#
1 1
x22
3/4 ϑ1 (πz|i)
PP
C2 (x) = −
ln 2π
.
−
N 2π 2
Γ 41 (2.6.29)
z=x1 +ix2
In the gP case, we have as usual
C2gP (x − y) =
1 PP
C (x − y).
2 2
(2.6.30)
Observe that the previous expressions contains no free parameters and therefore a direct
comparison with numerical data is possible. We present our numerical results both from the
2.6. Correlation functions for the Ebmp
57
gP case and the PP case in fig. 2.6.2a. The average optimal cost for the PP Ebmp is given
(2)
by EN (2) = C2 (0): however, the right hand side of (2.6.29) is divergent: as explained in
Section 2.5, a proper regularisation led to the correct scaling,
ln N
βPP
1
C2PP (0) =
+
+o
, βPP = 0.1332(5).
(2.6.31)
2πN
N
N
A numerical fit of the optimal costs for d = 2 for the gP Ebmp gives
1
ln N
1
C2gP (0) =
+ βgP + o
, βgP = 0.3758(5).
2N
2π
N
(2.6.32)
The correlation function (2.6.12) for the normalised matching field in the PP case has
the expression (2.6.18),
π C2PP (x − y)
.
(2.6.33)
cPP
(x
−
y)
=
2
4 C2PP (0)
Observe that the only free parameter in this quantity is C2PP (0): inserting the value obtained
by Caracciolo et al. [14], eq. (2.6.31), we obtain the theoretical prediction in fig. 2.6.2b,
where we also present some numerical results for c2 (x, y) that show the agreement with
the analytical prediction. As remarked above, the form of the correlation function for the
normalised transport field in the gP case, cgP
2 (x − y) is the same as in the PP case, being
the derivation of eq. (2.6.18) independent from the explicit form of C2gP (x). Using the value
(2.6.32) in eq. (2.6.20) for d = 2 we obtained the theoretical curve depicted in fig. (2.6.2b),
where, once again, an excellent agreement is found with numerical data.
Finally, let us compute the wall-to-wall correlation function for the PP Euclidean matching problem on the flat torus. The theoretical prediction is given by eq. (2.6.21),
W2PP (r) = −
π
G1 (r).
2N C2PP (0)
(2.6.34)
π
G1 (r).
4N C2gP (0)
(2.6.35)
In the gP case, instead, we have
W2gP (r) = −
Numerical results both for the PP case and for the gP case are presented in fig. 2.6.2c
and fig. 2.6.2d. Once again observe that the values of the average optimal cost in the
corresponding cases, eq. (2.6.31) and eq. (2.6.31), fix completely the expression of the wallto-wall correlation function.
Three-dimensional case For d = 3 eq. (2.7.24) and eq. (2.6.9) give
C3PP (x − y) =
1
2π 2 N
X
1
2
knk
n∈Z3 \{0}
e2πin·(x−y) .
(2.6.36)
Clearly the previous function can not be represented in a plot, but we computed the correlation function for a system of size N = 8000 averaging on 104 realisations. From the
correlation function C3PP (x), the wall to wall correlation function can be obtained as before
in the form
16
W3PP (r) = −
G1 (r).
(2.6.37)
3πN C3 (0)
As in the previous cases, C3PP (0) can be evaluated from the cost fit and it is equal to
2
C3 (0) = 0.66251(2)N − 3 − 0.45157...
(see Section 2.5). Following the same procedure of the
N
PP case, we can compute the wall-to-wall correlation function for on the unit hypercube in
58
Chapter 2. Euclidean Matching Problems
1.5
1
C2PP (r1 , 0)
C2gp (r1 , 0)
cPP
2 (r1 , 0)
cgp
2 (r1 , 0)
0.8
1
cPP
2
C2PP
0.6
0.4
0.5
0.2
0
0
0
0.1
0.2
0.3
0.4
0.5
0
0.1
0.2
0.3
r1
0.4
0.5
r1
(a) Section C2PP (r1 , 0) and C2gP (r1 , 0) of the correlation function for N = 104 in the PP case and
for N = 3600 in the gP case and corresponding
theoretical predictions.
(b)
Section cPP
= 104 and
2 (r1 , 0) for N
cgp
(r
,
0)
for
N
=
3600
and
theoretical
predic1
2
tions, eq. (2.6.18) and eq. (2.6.20): note that the
theoretical curves almost overlap.
p=2
Theoretical prediction
p=2
Theoretical prediction
0.05
W2gp
W2PP
0.05
0
0
−0.05
−0.05
0
0.1
0.2
0.3
0.4
0.5
r
(c) Wall-to-wall correlation function in two dimensions for the PP matching with N = 104 on the
unit flat torus. The continuous line corresponds to
the analytical prediction.
0
0.1
0.2
0.3
0.4
0.5
r
(d) Wall-to-wall correlation function in two dimensions for the gP matching with N = 3600
on the unit flat torus. The continuous line corresponds to the analytical prediction.
Figure 2.6.2.: Correlation functions for the Euclidean bipartite matching problem in two dimensions and
numerical results.
2.7. Functional approach
59
PP
gP
0.01
W3
0.005
0
−0.005
0
0.1
0.2
0.3
0.4
0.5
r
Figure 2.6.3.: Wall-to-wall correlation function in three dimensions for the PP matching and the gP
matching with d = 3 and N = 8000 on the unit flat hypertorus. We attributed the discrepancy between
numerical data and theoretical prediction for r → 0 for the PP case to binning effects.
the three-dimensional case for the gP matching problem. Reproducing the computations
of the d = 2 case we have
W3gp (r) = −
8
G1 (r).
3πN C3gp (0)
(2.6.38)
We evaluated C3gp (0) from the cost scaling, obtaining
2
C3gp (0) = 0.4893(4)N − 3 −
0.23(5)
.
N
(2.6.39)
The prediction obtained and the numerical data are presented in fig. 2.6.3.
2.7. L Functional approach
The previous results suggest that the Ebmp with quadratic cost, in the large N limit,
appears as a Gaussian free theory on the d-dimensional flat hypertorus, in such a way that
the correlation function of the matching map is related directly with the free propagator
of the theory itself. To properly prove this fact, let us consider the Ebmp on the unit
hypercube and, for the sake of clarity, let us restate once again the problem. We suppose
that the we are working on a bipartite complete graph KN,N = Graph(V, U; E) and that we
have a certain map Φ such that to each element v ∈ V ∪ U 7→ Φ(v) ∈ Ω, point randomly
generated on a certain domain Ω (with given boundary conditions) with a certain probability
distribution density ρ(x). In particular, we will use the following notation:
vi ∈ V
ui ∈ U
7→ ri := Φ(vi ) ∈ Ω,
7→ bi := Φ(ui ) ∈ Ω.
(2.7.1a)
(2.7.1b)
We define the empirical measures
ρR (x) :=
ρB (y) :=
1
N
1
N
PN
i=1
PN
i=1
δ (d) (x − ri ) ,
δ
(d)
(y − bi ) .
(2.7.2a)
(2.7.2b)
A perfect matching m = Graph(Vm ; Em ) on KN,N can be represented by a N × N square
matrix M = (mij )ij such that
(
N
N
X
X
1 if (vi , uj ) ∈ Em ,
mij =
,
mij =
mij = 1.
(2.7.3)
0 otherwise.
i=1
j=1
60
Chapter 2. Euclidean Matching Problems
We define the following functional
ZZ
1 X
p
p
mij kbj − ri k =
kx − yk π(x, y) dd x dd y
N i,j
Ω
Z
p
= kµ(x)k ρR (x) dd x,
(p)
EN [M ; Φ] :=
(2.7.4)
Ω
where we have introduced
π(x, y) :=
1 X
mij δ (d) (x − ri ) δ (d) (y − bi )
N i,j
and the transport vector µ(x) on Ω such that
X
µ(rk ) =
mkj (bj − rk ),
(2.7.5)
∀k = 1, . . . , N.
(2.7.6)
j
In the previous expressions, k•k is the Euclidean norm on the considered domain, taking
into account the imposed boundary conditions. Observe that we can write
π(x, y) = δ (d) (y − x − µ(x)) ρR (x) = δ (d) x − y − µ−1 (y) ρB (y)
(2.7.7)
= δ (d) (y − x − µ(x)) ρB (y).
The constraints (2.7.3) can be expressed in terms of π as
R
π(x, y) dd x = ρB (y),
Ω
R
π(x, y) dd y = ρR (x),
Ω
(2.7.8a)
(2.7.8b)
or, alternatively, in terms of µ(x) substituting the expression (2.7.7) in the previous relations. Observe however that, if adopting (2.7.7), only one constraint is needed: indeed,
R
Ω
π(x, y) dd x =
ρR (z) =
R
Ω
δ (d) (y − x − µ(x)) ρR (x) dd x = ρB (y) ⇒
RR
Ω
δ (d) (y − z − µ(z)) δ (d) (y − x − µ(x)) ρR (x) dd x dd y =
Z
= δ (d) (y − z − µ(z)) ρB (y) dd y. (2.7.9)
Ω
We can now introduce a partition function for our problem in the form
X
(p)
Z[β, N ; Φ] :=
e−βN EN [M ;Φ]
M

=
X
N
Y

{mij =0,1}
' −β 2 N 2
j=1
ZZZ
δ 1−
N
X
!
mij
i=1
N
Y

δ 1 −
i=1
N
X

−β
mij  e
P
ij
mij kbj −ri kp
j=1
[DπDuDv] e−βN Sπ [π,u,v]
π>0
ZZZ
' iβN
[DµDλ] e−βN Sµ [µ,λ]
(2.7.10)
where we introduced the two actions
RR
p
Sπ [π, u, v] := Ω [(kx − yk − u(x) − v(y)) π(x, y)] dd x dd y
R
+ Ω [u(x)ρR (x) + v(x)ρB (x)] dd x
(2.7.11a)
2.7. Functional approach
61
and
p
[kµ(x)k ρR (x) − λ(x)ρB (x)] dd x
RR + Ω λ(y)δ (d) (y − x − µ(x)) ρR (x) dd x dd y
R
p
= Ω [kµ(x)k ρR (x) + λ(x)%(x)] dd x
R
+ Ω ρR (x)µ(x) · ∇λ(x) dd x + s[µ, λ],
Sµ [µ, λ] :=
R
Ω
(2.7.11b)
where we have introduced
%(x) := ρR (x) − ρB (x).
(2.7.12)
and s[µ, λ] are higher order non-linear terms in the fields obtained from the Taylor series
expansion of around
P µ = 0.
P Observe that we have implicitly relaxed the condition mij = 0, 1
requiring only i mij = j mij = 1. Let us use the saddle point approximation: we have
that for Sπ
δSπ
δπ(x,y)
δSπ
δu(x)
δSπ
δv(x)
p
= kx − yk − u(x) − v(y) = 0,
R
= − Ω π(x, y) dd y + ρR (x) = 0,
R
= − Ω π(y, x) dd y + ρB (x) = 0.
(2.7.13a)
(2.7.13b)
(2.7.13c)
Let us denote by π ∗ , u∗ and v ∗ a solution of the previous set of equations; observe that
in the functional integration u and v are supposed imaginary, whilst in the saddle point
u∗ + v ∗ is real.
The saddle point equations for the functional Sµ can be written instead as
δSµ
δµ(x)
=
pµ(x)kµ(x)k
δSµ
δλ(x)
=
%(x) − ∇ · (ρR (x)µ(x)) +
p−2
ρR (x) − ρR (x)∇λ(x) +
δs
δλ(x)
δs
δµ(x)
= 0,
= 0,
(2.7.14a)
(2.7.14b)
where we supposed that either periodic boundary conditions are imposed on Ω or µ is zero
on its boundary.
In the thermodynamical limit, i.e. for N → ∞, we expect that kµ(x)k → 0 ∀x ∈ Ω
due to the fact that both the empirical measures ρR and ρB converge weakly to the same
probability density function ρ(x). In this limit it is more convenient to use the functional
Sµ , in which we neglect the higher order contribution s to the action. Moreover, in the
βN → ∞ limit we can perform a saddle point approximation. The saddle point is given by
the solutions µ0 (x), λ0 (x) of eq. (2.7.14). In particular, we obtain
hµ(x) · µ(y)i = µ0 (x) · µ0 (y).
(2.7.15)
To properly perform the approximation, we have to take into account that kµk ∼ λ ∼ %.
For the sake of simplicity, let us consider the quadratic case, p = 2. In the spirit of the
saddle point approximation, we neglect therefore the non quadratic terms as higher order
corrections in the inspected limit. The approximate action3 is
i
R h
R
2
2
Sµ
[µ, λ] = Ω kµ(x)k ρ(x) + λ(x)%(x) dd x + Ω ρ(x)µ(x) · ∇λ(x) dd x.
(2.7.18)
3 We
observe here that the functional reproduce the correct discrete in the limit of the validity of the
approximation
µ(ri ) · ∇λ(x)|x=ri ≈
X
mij (λ(bj ) − λ(ri )) ,
i = 1, . . . , N,
(2.7.16)
j
for a given function λ : Ω → R: for d > 2 and ρ(x) = 1, for example, the
optimal
transport vector
1
2
scale as kµk ∼ N − d and therefore the error on the previous relation is o N − d . If we assume that
62
Chapter 2. Euclidean Matching Problems
In this approximation the Euler–Lagrange equations greatly simplify as
%(x)
=
µ(x)
=
∇ · (ρ(x)µ(x)) ,
i∇λ(x)
.
2
(2.7.19a)
(2.7.19b)
The same linearisation procedure can be performed in the functional Sπ , obtaining the same
result: we can indeed adopt the expression
π(x, y) = δ (d) (y − x) ρR (x) − ρ(x)µ(x) · ∇y δ (d) (y − x) .
(2.7.20)
Substituting in eq. (2.7.11a) and observing that
hR
i
R
d
u(x)
π(x,
y)
d
y
−
ρ
(x)
dd x = 0,
R
Ω
Ω
i
hR
R
R
R
v(y) Ω π(x, y) dd x − ρB (y) dd y = Ω v(x)%(x) dd x + Ω ρ(x)µ(x) · ∇v(x) dd x.
Ω
(2.7.21)
reproducing therefore the contribution in eq. (2.7.18) up to the identification v(x) ≡ λ(x).
As suggested by the notation, the Lagrange multipliers corresponds therefore to the functions u, v introduced by Kantoroviˇc in its dual formulation, presented in Section 2.3, of
the transport problem. Remarkably, in eq. (2.7.19b) we recover the exact result (2.3.17)
for the optimal transport map; we justify also the scaling ansatz proposed in Section 2.5,
generalised in eq. (2.7.19a). We can therefore compute directly by Gaussian integration the
correlation function
∇ϕ(x) · ∇ϕ(y)
,
(2.7.22)
hµ(x) · µ(y)i =
4
where we denoted by ϕ(x) the solution of the following equation on the d-dimensional
hypertorus:
∇ · [ρ(x)∇ϕ(x)] = 2%(x).
(2.7.23)
R
Observing that Ω %(x) dd x = 0, we introduce now the Green’s function Gρ (x, y) on Ω
defined by
X φi (x)φi (y)
0
0
∇y · [ρ(y)∇y Gρ (x, y)] = δ (d) (x − y) −
(2.7.24)
R i 2 d ,
φ
(z)
d
z
i
0
Ω
In the previous expression φi0 (y) are the zero mode eigenfunctions
∇x · ρ(x)∇x φi0 (x) = 0.
(2.7.25)
We can write an explicit expression for ϕ(x) as
Z
ϕ(x) = 2 Gρ (x, y)%(y) dd y,
(2.7.26)
Ω
eq. (2.7.16) holds, the partition function in the discrete case is recovered as


Z
Z = iβN
Z
[DµDλ] exp −βN
[kµ(x)kp ρR (x) + λ(x)%(x) + ρR (x)µ(x) · ∇λ(x)] dd x
Ω
=
N
X Y
{mij } i=1
(2.7.17)
!
δ
X
mij − 1
!
δ
X
j
up to the relaxed constraint mij = 0, 1.
j
mji − 1
e
−β
PN
i=1
p
mij kbj −ri k
2.7. Functional approach
63
from which we obtain
ZZ
[%(z)%(w)∇x Gρ (x, z) · ∇y Gρ (y, w)] dd z dd w.
(2.7.27)
Averaging over the disorder, we obtain easily the following expression:
ZZ h
i
hµ(x) · µ(y)i =
%(z)%(w)∇x GR (x, z) · ∇y GR (y, w) dd z dd w
(2.7.28)
hµ(x) · µ(y)i =
Ω
Ω
where we denoted by • the average over all instances and
%(z)%(w) = 2
i
ρ(z) h (d)
δ (z − w) − ρ(w) .
N
Therefore, up to higher order corrections, we have
Z
2
hµ(x) · µ(y)i =
[ρ(z)∇x Gρ (x, z) · ∇y Gρ (y, z)] dd z
N
Ω
ZZ
2
[ρ(z)ρ(w)∇x Gρ (x, z) · ∇y Gρ (y, w)] dd z dd w.
−
N
(2.7.29)
(2.7.30)
Ω
The previous expressions provide a recipe for the calculation of the average optimal cost
and for the correlation function in the random Euclidean bipartite matching problem in the
case of quadratic cost. Observe that a general expression for the distribution from which
the points are extracted is assumed.
2.7.1. Uniform density on the hypertorus
If ρ(x) = 1 and Ω ≡ Td , then the previous expression can be greatly simplified.Observe
indeed that in this case Gρ (x, y) ≡ Gd (x − y), where Gd (x − y) is Green’s function of the
Laplacian on the unit flat hypertorus Td :
∇2y Gd (x − y) = δ (d) (x − y) − 1.
Under these hypotheses we have that, up to o N1 terms,
2
1
hµ(x) · µ(y)i = − Gd (x − y) + o
,
N
N
(2.7.31)
(2.7.32)
where we used the fact that, if ρ(x) = 1,
%(z)%(w) =
i
2 h (d)
δ (z − w) − 1 .
N
(2.7.33)
We recover therefore the same expression obtained through a scaling ansatz already adopted.
2.7.2. Matching problem on the line
Let us consider now Ω = R case and a certain probability density distribution ρ(x) on
R
it, R ρ(x) d x = 1, limx→±∞ ρ(x) = limx→±∞ ∂ρ(x)
= 0, ρ(x) > 0 ∀x ∈ Ω. We assume
∂x
Neumann boundary conditions for the scalar potential ϕ(x)
lim ∂x ϕ(x) = 0,
x→±∞
(2.7.34)
64
Chapter 2. Euclidean Matching Problems
N = 5000
Theoretical prediction
N hµ(x)µ(0)i
3
2.5
2
1.5
0
0.5
1
1.5
x
Figure 2.7.1.: Numerical result and theoretical prediction for the correlation function of the optimal Euclidean matching on the line between two set of N = 5000 points independently and identically distributed
according to a standard Gaussian distribution. The numerical data are obtained averaging over 5000 realisations.
(i.e., the map µ(x) = ∂x ϕ(x) is zero at large distances). It follows that
Zx
2
∂x ϕ(x) =
ρ(x)
%(y) d y,
(2.7.35)
−∞
and therefore
Zy
Zx
hµ(x)µ(y)i =
d z1
−∞
d z2
2 Φρ (min{x, y}) − Φρ (x)Φρ (y)
%(z1 )%(z2 )
=
ρ(x)ρ(y)
N
ρ(x)ρ(y)
(2.7.36)
−∞
where we introduced the partition function
Zx
Φρ (x) :=
ρ(ξ) d ξ.
(2.7.37)
−∞
If we assume, for example, a Gaussian distribution
1
x2
ρ(x) = √
exp − 2 ,
2σ
2πσ
(2.7.38)
we obtain the theoretical prediction for hµ(x)µ(0)i for σ = 1 in fig. 2.7.1. Numerical data
are is an excellent agreement with the theoretical prediction.
Observe also that eq. (2.7.36) provides the correct correlation function for the Euclidean
matching problem on the segment Ω = [0, 1] with uniform distribution. Assuming indeed
(
1 x ∈ Ω,
ρ(x) =
(2.7.39)
0 otherwise,
we have
(
2 min{x,y}−xy
N
hµ(x)µ(y)i =
0
(x, y) ∈ [0, 1]2
otherwise.
(2.7.40)
Observe that the covariance of a Brownian bridge process naturally appear, as shown in
[10] (observe however that in the cited work one set of point is supposed fixed and therefore
the factor 2 is not present).
2.8. Emmp via replicas
65
2.8. L Emmp via replicas
Let us now consider the problem of the Emmp in a replica approach. The computation
follows directly the method used for the Rmmp but in this context a more subtle correlation
between the different weights appears, due to the presence of an Euclidean structure. We
will present here a detailed computation for this case: we consider the weighted matching
problem on the complete graph K2N = Graph(V; E), whose vertices are associated to point
independently and uniformly generated on the unit hypercube Ωd ,
vi ∈ V 7→ Φ(vi ) = ri ∈ Ωd ,
(2.8.1)
with weights given by
p
wij := w ((vi , vj )) = kri − rj k ,
p ∈ R+ .
(2.8.2)
We want to find a perfect matching m on K2N such that the following functional
(p,m)
EN
[m, Φ] :=
1 X
w(e)
N e∈m
(2.8.3)
is minimised. In particular, let us introduce the adjacency matrix M for the matching m as
M = (mij ∈ {0, 1})1≤i,j≤2N ,
mij = mji ,
2N
X
mij = 1;
(2.8.4)
i=1
the functional above can be written as
(p,m)
EN
(p)
[m, Φ] ≡ EN [M ; Φ] :=
1 X
mij wij .
N i<j
(2.8.5)
In a celebrated paper, M´ezard and Parisi [40] showed that the Emmp can be studied via a
replica approach considering correlations between different weights as a perturbation to the
purely random case, computing the very first order corrections due to correlations among
three different weights (two weights are always uncorrelated). We want to perform here a
general computation following their perturbative approach. We start from the following
partition function
Z[β, N ; Φ] :=
X
p
+1
e−βN d
(p)
EN [M ;Φ]
M

X
=
2N
Y

{mij =0,1}

=
2π
δ 1−
j=1
2N
X
i=1
!

mij  exp −βN

p
d
X
mij wij 
(2.8.6)
i<j

2N Z
i
Y
p
eiλi d λi  Y h
1 + exp −βN d wij − iλi − iλj .
2π
i=1
i<j
0
p
In the previous expression we have properly rescaled the temperature β → βN d to obtain
p
a finite average cost in the large N limit (note that the average optimal cost scales as N − d
in any dimension in the Emmp). As in the purely random case, we have to properly average
over the disorder, i.e., over all the possible maps Φ, and therefore we proceed via a replica
66
Chapter 2. Euclidean Matching Problems
trick, as in Section 1.4 for the Rmmp. The replicated partition function is


n Yh
n Y
2N Z2π iλa
i
Y
Y
i d λa
p
e
i 
n

Z [β, N ; Φ] =
1 + exp −βN d wij − iλai − iλaj
2π
a=1 i<j
a=1 i=1 0


n Y
2N Z2π iλa
Y
e i d λai  Y

=
(1 + uij ) ,
2π
a=1 i=1
i<j
(2.8.7)
0
uij :=
∞
X
p
e−i
X
e−rβN d wij
Pr
m=1
(λai m +λaj m ) .
a1 <···<ar
r=1
In the Rmmp the average can be performed directly, using the fact that the joint probability distribution of the weights wij factorizes [37]. Here this is not true any more, due to
the Euclidean constraint. Therefore, we write the following:
Y
(1 + uij ) = 1 +
i<j
∞
X
X0
E
Y
uek ,
(2.8.8)
E=1 {ek }k=1,...,E ⊂E k=1
−1)
where the primed sum runs over all the N (2N
set of different E edges. In other words,
E
the sum runs over all the possible subgraphs g ⊆ K2N . In particular, let us suppose that
g = Graph(Vg ; Eg ) ⊆ K2N is a generic connected subgraph with V = |Vg | ≤ 2N vertices
Q
and E = |Eg | edges. Then we have to evaluate e∈Eg ue . The average must be performed
using the following joint distribution4


! E
E Z
Y
Y
Y
X
p
ρg (w(e1 ), . . . , w(eE )) =
dd zi
δ (w(ei ) − kzi k )
δ (d) 
zi  ,
i=1
k=1
f∈F (g)
i : ei ∈Ef
(2.8.9)
where F(g) is the set of cycles f := Graph(Vf ; Ef )5 surrounding the faces of g. To evaluate
QE
k=1 uek we have to compute an integral in the form
E Z
Y
p
−rk βN d w(ek )
d w(ek ) e
!
ρg (w(e1 ), . . . , w(ek )),
r1 , . . . , rE ∈ N.
(2.8.10)
k=1
In the relevant limit of short links, i.e., taking into account that the occupied links are of
p
the order N − d , the previous quantity can be rewritten as
!
E Z
Y
1
Wβ (g; r1 , . . . , rE )
d(ek ) e−rk βw(ek ) ρg (w(e1 ), . . . , w(eE )) =:
.
(2.8.11)
N E−L
N E−L
k=1
where L := |F(g)| is the number of faces of the graph. Using the Euler formula, V −E +L =
1 − 2g, where g is the genus of the graph, we have that the contribution of the subgraph g
1
.
scales as N V −1+2g
Now we must evaluate the number of distinct subgraph of K2N homeomorphic to g.
(2N )!
V
There are (2N
different ordered subsets of V vertices in K2N . Being the
−V )! ∼ (2N )
graph complete, we can associate to each set a subgraph homeomorphic to g. However, if
g has some additional symmetry under the exchange of vertices, we overcount the number
of subgraphs and a certain correcting factor taking into account this symmetry σ(g) must
4 Note
5 We
d(V −1)
−E
that ρg ({αw}) = α p
ρg ({w}).
consider here also cycle of length 3.
2.8. Emmp via replicas
67
be introduced6 : note that this factor depends on the subgraph size and not on the size
N of the complete graph in which it is embedded. Finally, we have that a given topology
gives a contribution of the order N 1−2g . If we restrict ourselves to the leading order contribution, than we must consider planar graphs, for which g = 0 and therefore the Euler
formula holds in its simpler form V − E + L = 1. To proceed further, note that for a
generic connected subgraph g of the complete graph K2N , if, e.g., v is a vertex separating g = Graph(Vg ; Eg ) into g1 = Graph(V1 ; E1 ) and g2 = Graph(V2 , E2 ), V1 ∩ V2 = {v},
Q
Q
Q
then, e∈Eg ue = e∈E1 ue e∈E2 ue . Remember that each connected graph can be decomposed in non-separable components in a unique manner [55]. For a generic subgraph
g = Graph(Vg ; Eg ) of the complete weighted graph K2N , we can introduce therefore the
Ursell function of g




X
Y Y
Y ∂
X
 ln exp 
(2.8.12)
U (g) :=
(−1)|Π|−1 (|Π|−1)!
ue ≡ 
ze ue 
∂z
e
e∈Eg
e∈Eg
Π
π∈Π e∈π
ze =0
where Π is a partition in |Π| subsets of the set Eg : if g is separable, U (g) = 0. Indeed, in
˜ such that
this case ∃!Π


!
X
X
X


ln exp
ln exp
ze ue .
(2.8.13)
ze ue =
e∈π
˜
π∈Π
e∈Eg
1
Finally, we can rewrite Eq. (2.8.8), up to o N terms, as


∞ X
Y
X
X0
(1 + uij ) ≈ exp 
σ(g)
U (g)
i<j
E=1 [g,E]
[E]
(2.8.14)
1
1
1
1
1
+
+
+
+
+ ... .
= exp
2
6
8
10
4
P0
where the primed sum [E=x] runs over the all possible subsets of E = x distinct edges
P
and [g,E] is a sum over all planar different topologies of non separable planar graphs with
E edges. The previous expression generalises an analogue one appearing in [40] in which
E ≤ 3 was considered. We can further simplify our computation observing that, given a
planar non-separable graph g,
U (g) :=
X
Y Y
Y
X Y
(−1)|Π|−1 (|Π| − 1)!
ue −
ue
ue˜ + . . .
ue =
Π
π∈Π e∈π
e∈Eg
e∈Eg
(2.8.15)
e˜6=e
The first term of the previous sum is of
order O N V1−1 as explained above; all other terms
1
of the sum are of order O N V +2|Π|−3
(each “cut” augments the number of vertices of 2).
Therefore at the first order we can simply consider
U (g) '
Y
ue
(2.8.16)
e∈Eg
6 Let
us consider for example the graph

1



g=2

4

3
For a given set {x1 , x2 , x3 , x4 } of points, we can arrange them in 4! ways but only 3! are really different,
due to the symmetry under the exchanges 1 ↔ 3 and 2 ↔ 4. For this graph σ(g) = 14 .
68
Chapter 2. Euclidean Matching Problems
Following M´ezard and Parisi [40], let us introduce now, using a proper set of Lagrange
multipliers qˆa1 ,...,ak , the order parameter,


2N
k
X
X
1
a
qa1 ...ak :=
exp −i
(2.8.17)
λi j  , 1 ≤ k ≤ n,
2N i=1
j=1
symmetric under permutation of replica indices. For a given value of E and a planar graph
g = Graph(Vg ; Eg ), |Vg | = V , we have in the notation above
X0
X Wβ (g; {ri }) X0 Y
2qa(v)
(2.8.18)
U (g) = N
QE
r
!
k
1
E
r
,...,r
k=1
v∈V
[E]
1
E
(a )...(a )
g
where (ai ) = (ai1 , . . . , airi ) is a set of ri replica indices, i = 1, . . . , E, the primed sum runs
over all different sets of replica indices. In the expression above we used also the notation
[
a(v) =
(ak ),
(2.8.19)
k : ek ∈E(v)
where E(v) is the set of edges at v. Once again, if we restrict ourselves to the E = 3
contribution, the previous formula reduces to the results of M´ezard and Parisi [40]. We are
interested in the replica symmetric solution, in which
qa1 ...ap ≡ Qp ,
and therefore
X0
U (g) = 2V N
X
{re }e∈Eg
[E]
(2.8.20)
Y
Wβ (g; {re })
Q
ςn (g; {re })
Qr(v) ,
e∈Eg re !
r(v) :=
v∈Vg
X
rj .
j : ej ∈E(v)
(2.8.21)
In the previous expression we P
introduced the combinatorial coefficient ςn (g; {re }), denoting
the number of ways to select e re (not necessarily distinct) replica indices satisfying the
following properties: we can chose V sets Rv , v = 1, . . . , V , such that, given v ∈ V,
Rv ∩ Rv0 = ∅ for ∀v 0 ∈ ∂v. Note that ς0 = 0: we denote by ς = limn→0 ςnn .
The averaged replicated partition function is therefore
Z
Zn =
d µ(q, qˆ) e−N βS[β;q,ˆq] ,
(2.8.22a)
−βS[β; q, qˆ] :=
Σ(q) + 2 ln z(ˆ
q) − 2
n X
n
k=1
where
R
d µ(q, qˆ) :=
Σ(q) :=
R +∞
k=1 −∞
Qn
∞ X
X
E=1 [g,E]
n→0
−−−→ n
∞
X
d qk
X
2V
{re }e∈Eg
X
E=1 [g,E]
V
R +i∞
−i∞
k
qˆk qk ,
(2.8.22b)
d qˆk and
V
Y
Wβ (g; {re })
Q
ςn (g; {re })
qr(v)
e∈Eg re !
v=1
2 σ(g)
(2.8.23a)
X
{re }e∈Eg
Y
Wβ (g; {re })
Q
ς(g; {re })
qr(v) .
e∈Eg re !
v
The purely random link contribution is related to the E = 1 contribution and omitting all
other terms we recover the Rmmp as mean field approximation. Following [37],


!
Pr
n Z2π iλa
n a
Y
X
aj
e
d
λ
n
−i
λ
 exp

j=1
2 ln z(ˆ
q ) := 2 ln 
qˆr e
2π
r
a=1 0
r=1
(2.8.23b)
+∞
Z
βx
n→0
−−−→ 2nβ
e− e − e−G (x) d x,
−∞
2.8. Emmp via replicas
69
where the function
G (x) := G(βx),
G(x) :=
∞
X
(−1)r−1
r=1
qˆr erx
r!
(2.8.24)
and we used the fact that
n n→0 (−1)k−1
−−−→ n
.
k
k
(2.8.25)
The saddle point equations can be written as
+∞
Z
Qr = β
eβrx−G (x)
d x.
(r − 1)!
(2.8.26a)
−∞
∞ X
X
V
X
Q
er(vi )βx j6=i Qr(vj )
Wβ (g; {re })
Q
G (x) =
ςn (g; {re })
2
σ(g)
Γ(r(vi ))
e∈Eg re !
i=1 {re }e∈Eg
E=1 [g,E]
ZZ
∂
β(x + y − w)
=−2
ρ(w) e−G (y)
J0 2 exp
dwdy
∂x
2



V
∞ X
X
YZ
YZ
X
ρg ({w})

d w(e)
d xv G 0 (xv ) e−G (xv ) 
+
2V −1 σ(g) 
·
β
e∈g
i=1
E=3 [g,E]
v6=i
∂
Kg [{β(xa + xb − wab )}] ·
∂xi
xi ≡x
(2.8.26b)
V −1
X
where J0 (x) is a Bessel function of the first kind. In the previous equation we introduced
the following function depending on the topology of the considered graph
"
Kg ({xe }) :=
X
{re ∈N}
#
ς(g; {re }) Y ere xe
Q
.
re !
v∈V r(v)! i
(2.8.27)
We can finally write down a general expression for the free energy as
+∞
+∞
Z
Z
Zn − 1
− eβx
−G (x)
¯
− f (β) := lim
=2
e
−e
dx −
G (x) e−G (x) d x+
N →∞ nN β
n→0
−∞
−∞
" E Z
!
#
∞ X
V Z
X
Y
Y
ρ
({w}
)
g
k
Kg [{β(xa + xb − wab )}] .
+
2V σ(g)
d we
d xv G 0 (xv ) e−G (xv )
β
v=1
E=3 [g,E]
e∈E
(2.8.28)
From the previous expression we have that
p
p
(p)
(p)
lim N d EN (d) := lim N d min EN [M ; Φ] = lim f¯(β).
N
N
M
β→∞
(2.8.29)
2.8.1. Recovering the mean field approximation
In the mean field approximation only the E = 1 contribution is considered (i.e., different
weights in the complete graph are considered identically and independently distributed).
70
Chapter 2. Euclidean Matching Problems
The saddle point equations becomes:
+∞
Z
eβrx−G (x)
d x,
(r − 1)!
−∞
ZZ
β(x + y − w)
−G (y) ∂
G (x) = −2
ρ(w) e
J0 2 exp
dwdy
∂x
2
Qr = β
Sd d
p −1 ,
p w
where in the short-link limit ρ(w) ∼
(2.8.30a)
(2.8.30b)
d
where Sd :=
2π 2
Γ( d
2)
is the surface of a d-
dimensional unit sphere as usual. In the large β limit, we have that
βx
β→∞
− J0 2 exp
−−−−→ θ(x)
2
(2.8.31)
and therefore eq. (2.8.30b) becomes
d
4π 2
G (x) =
pΓ d2
Z∞
d
w p −1 e−G (w−x) d w.
(2.8.32)
0
The previous equation is the mean field equation for the function G , as already found in
[37, 40] and can be numerically solved for given values p, d. The mean field contribution to
the average optimal length is therefore obtained from eq. (2.8.28) in the β → ∞ limit:
lim N
N
p
d
(p,mf)
(d)
EN
Z∞
=
G (x) e
−G (x)
dx − 2
−∞
+∞
Z
θ(x) − e−G (x) d x.
(2.8.33)
−∞
The previous equation and eq. (2.8.32) shows that, up to a rescaling factor, the mean field
solution for the Emmp with parameters p, d corresponds to the mean field solution of a
d
Rmmp whose weight distribution has the form ρ(w) ∼ w p −1 for w → 0. Observing that,
for α > 0
Z∞
f (x) = α e−f (y−x) d y ⇒ f (x) = ln (1 + eαx ) ,
(2.8.34)
0
then for p = d we have the explicit solution
"
d
G (x) = ln 1 + exp
2π 2
x
Γ d2 + 1
!#
(2.8.35)
and the average optimal length is
(d,mf)
lim N 2 EN
N
(d) =
1 π2
,
α(d) 12
α(d) :=
Sd
.
d
(2.8.36)
2.8.2. Polygonal approximation and zero temperature limit
Preliminaries Let us now consider the polygonal corrections in the evaluation of the free
energy: we will show that a formal correspondence between the polygonal contribution and
the finite size corrections evaluated by Parisi and Rati´eville [46] for the Rmmp appears in the
1
computation. Let us denote by pE the polygon with E edges: for this graph, σ(pE ) = 2E
.
Moreover, we can write the saddle point action (2.8.22a) in the n → 0 limit as
ˆ := Smf [β; Q, Q]
ˆ +
S[β; Q, Q]
∞
X
ˆ
SE [β; Q, Q]
,
2E
E=3
(2.8.37)
2.8. Emmp via replicas
71
where
+∞
Z
ˆ := 2nβ
− βSmf [β; Q, Q]
− eβx
e
−G (x)
−e
+∞
Z
d x − nβ
G (x) e−G (x) d x
(2.8.38)
−∞
−∞
is the well known mean field contribution, whilst, denoting by
"E Z
#
!
E
Y
X
p
d
(d)
ρE (w1 , . . . , wE ) :=
d zi δ (wi − kzi k ) δ
zi ,
i=1
(2.8.39)
i=1
the series terms are given by
ˆ := 2E
−βSE [β; Q, Q]
X
ς(pE , {rk })
r1 ,...rE
E Z
Y
−ri βwi
d wi e
i=1
Qri +ri+1
ri !
ρE (w1 , . . . , wE )
∞
Z
E
Y
2E Sd X ς(pE , {rk })
d−1
=
k
Qri +ri+1 gri (k) d k.
QE
d
(2π) r ,...r
i=1 ri !
i=1
E
1
0
(2.8.40)
In the previous expression Sd =
d
2π 2
d
Γ 2
( )
is the solid angle in d dimensions and we have
introduced
gr (k) :=
Z∞
Sd−1
β
Zπ
dz
d
p
0
Z∞
= Sd
z
d φ z d−1 eikβ
−1
p
z cos φ−rz p
sind−2 φ
0
d−1 −rβz p
e
(2.8.41)
0 F1
−
d
2
0
k2 z 2
;−
d z.
4
In analogy with the computation of Parisi and Rati´eville [46] for the finite size corrections
to the Rmmp, we can introduce the (2n − 1) × (2n − 1) matrix
q
Tαα0 := δα∩α0 =∅ Q|α|+|α0 | g|α| (k)g|α0 | (k),
(2.8.42)
where α is an element of the power set of the replica indices with cardinality |α| and
(
1 if α ∩ α0 = ∅
δα∩α0 =∅ =
(2.8.43)
0 otherwise.
The contribution of the polygon pE can be written as
E
ˆ = 2 Sd
− βSE [β; Q, Q]
(2π)d
Z∞
k d−1 tr T E (k) d k.
(2.8.44)
0
To proceed further, we will attempt to diagonalize the operator T following the strategy of
Almeida and Thouless [3]. An eigenvector cα of the matrix T must satisfy the equation
q
X
X
Tαγ cγ =
Q|α|+|γ| g|α| (k)g|γ| (k)cγ = λcα .
(2.8.45)
γ
γ : α∩γ=∅
cqα
We will look for eigenvectors
with q distinguished replicas, in the form
(
0
if |α| < q,
cqα =
i
d|α| if α contains q − i + 1 distinguished replica indices, i = 1, . . . , q + 1.
(2.8.46)
72
Chapter 2. Euclidean Matching Problems
If we consider q − 1 distinguished replicas, q ≥ 2, it can be proved [36] that the following
orthogonality condition holds:
q−j X
k
|α| − (k + j) q+1−(k+j)
d|α|
= 0.
q−j
n−q
(2.8.47)
k=0
The orthogonality conditions provide a relation between all the different values di|α| , showing
that we can keep only one value, say d1|α| , as independent. Using this assumption, it can be
shown [46] that the eigenvalues of the original Tαα0 matrix can be evaluated diagonalising
the infinite dimensional matrices in the n → 0 limit
p
Γ(a + b)Γ(b)
(q)
Qa+b ga (k)gb (k),
(2.8.48)
Nab = (−1)b
Γ(a)Γ(b − q + 1)Γ(b + q)
In particular, for q = 0 a direct computation gives
p
n−a
Γ(a + b)
n→0
(0)
Nab =
Qa+b gb (k) −−−→ (−1)b
Qa+b ga (k)gb (k)
b
Γ(a)b!
(2.8.49)
and for q = 1 we obtain
(1)
Nab =
p
n−a
b
Qa+b ga (k)gb (k).
b−n
b
(2.8.50)
(q)
Remarkably, limn→0 N (1) = N (0) . Computing the spectrum of the matrix Nab is equivalent
to the computation of the spectrum of
s
gb+q (k) Γ(a + 1)Γ(b + q) (q)
(q)
Mab := (−1)a+b
N
ga+q (k) Γ(b + 1)Γ(a + q) b+q a+q
(2.8.51)
a+q Γ(a + b + 2q)
= (−1)
Qa+b+2q gb+q (k);
Γ(a + 2q)b!
n
each eigenvalue of the previous matrix has multiplicity nq − q−1
in the spectrum of T .
(q)
The eigenvalue equation for M
has the form
Z
0
∞
X (q) (q)
(−1)r e(r+q)βu X e(r +q)βu−G (u) (q)
q
M
c
=
β(−1)
λc(q)
=
cr0 gr0 +q (k) d u.
r
rr 0 r 0
Γ(r + 2q)
r0 !
r0
r 0 =0
(2.8.52)
Defining now
$$\phi^{(q)}(u; k) := \sum_{r=0}^{\infty} \frac{e^{(r+q)\beta u - \frac{G(u)}{2}}}{r!}\, c^{(q)}_r\, g_{r+q}(k), \qquad (2.8.53)$$
the eigenvalue equation becomes
$$\lambda\, \phi^{(q)}(u; k) = (-1)^q \int A^{(q)}(u, v; k)\, \phi^{(q)}(v; k)\, \mathrm{d}v, \qquad (2.8.54)$$
where
$$A^{(q)}(u, v; k) := \beta\, e^{-\frac{G(u)+G(v)}{2} + q\beta(u+v)} \sum_{r=0}^{\infty} \frac{(-1)^r\, e^{r\beta(u+v)}}{\Gamma(r+2q)\, r!}\, g_{r+q}(k)$$
$$= \beta S_d\, e^{-\frac{G(u)+G(v)}{2} + q\beta(u+v)} \int_0^{+\infty} \mathrm{d}z\; z^{d-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 z^2}{4}\right) e^{-q\beta z^p} \sum_{r=0}^{\infty} \frac{(-1)^r\, e^{r\beta(u+v-z^p)}}{\Gamma(r+2q)\, r!}. \qquad (2.8.55)$$
The relation between $A^{(q)}(u, v; k)$ and $\beta S_E$ can be explicitly written as
$$-\beta S_E[\beta; Q, \hat Q] = \frac{2^E S_d}{(2\pi)^d} \sum_{q=0}^{n} (-1)^{qE} \left[ \binom{n}{q} - \binom{n}{q-1} \right] \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ A^{(q)}(u, v; k) \right]^E \mathrm{d}k. \qquad (2.8.56)$$
The combinatorial coefficient in the $n \to 0$ limit has the form
$$\binom{n}{q} - \binom{n}{q-1} \xrightarrow{n\to 0} \begin{cases} 1 & \text{if } q = 0, \\ -1 + n & \text{if } q = 1, \\ n\,(-1)^q\, \frac{2q-1}{q(1-q)} & \text{if } q > 1, \end{cases} \quad + o(n). \qquad (2.8.57)$$
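This limit is elementary to verify; the following small symbolic check (ours, for illustration) treats $n$ as a real parameter:

```python
# Symbolic check of the n -> 0 behaviour (2.8.57) of binom(n, q) - binom(n, q - 1).
import sympy as sp

n = sp.symbols('n')
for q in range(0, 5):
    expr = sp.binomial(n, q) - sp.binomial(n, q - 1)
    print(q, sp.simplify(sp.series(expr, n, 0, 2).removeO()))
# expected: 1 for q = 0; n - 1 for q = 1; n (-1)^q (2q - 1)/(q (1 - q)) for q >= 2
```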
Sectors q ≥ 2  Let us consider the contribution of the $q \geq 2$ sectors; in the $n \to 0$ limit the sum over $q \geq 2$ in (2.8.56) becomes
$$\frac{2^E S_d}{(2\pi)^d} \sum_{q=2}^{n} (-1)^{qE} \left[ \binom{n}{q} - \binom{n}{q-1} \right] \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ A^{(q)}(u, v; k) \right]^E \mathrm{d}k$$
$$\xrightarrow{n\to 0}\; n\, \frac{2^E S_d}{(2\pi)^d} \sum_{q=2}^{\infty} (-1)^{q(E+1)}\, \frac{2q-1}{q(1-q)} \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ A^{(q)}(u, v; k) \right]^E \mathrm{d}k$$
$$= n\, \frac{2^E S_d}{(2\pi)^d} \left\{ \sum_{q=1}^{\infty} \frac{4q-1}{2q(1-2q)} \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ A^{(2q)}(u, v; k) \right]^E \mathrm{d}k + (-1)^E \sum_{q=1}^{\infty} \frac{4q+1}{2q(2q+1)} \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ A^{(2q+1)}(u, v; k) \right]^E \mathrm{d}k \right\}. \qquad (2.8.58)$$
To perform the zero temperature limit, let us follow Parisi and Ratiéville [46], writing for $q \geq 2$
$$A^{(q)}(u, v; k) = \frac{i\beta S_d}{2\pi}\, e^{-\frac{G(u)+G(v)}{2}} \int_0^{+\infty} \ell^{\frac{d}{p}-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 \ell^{\frac{2}{p}}}{4}\right) \frac{\mathrm{d}\ell}{p} \int_\gamma e^{s(q, \beta; z, u+v-\ell)}\, \mathrm{d}z,$$
$$s(q, \beta; z, u+v-\ell) := \beta q(u+v-\ell) - z - 2q\ln(-z) + \frac{e^{\beta(u+v-\ell)}}{z}, \qquad (2.8.59)$$
where $\gamma$ is the contour in fig. 2.8.1 and we used the fact that
$$\sum_{r=0}^{\infty} \frac{(-1)^r\, e^{\beta(q+r)x}}{\Gamma(r+2q)\, r!} = \frac{i\, e^{\beta q x}}{2\pi} \int_\gamma e^{-z - 2q\ln(-z) + \frac{e^{\beta x}}{z}}\, \mathrm{d}z. \qquad (2.8.60)$$
To compute the $\beta \to \infty$ limit, let us consider the saddle point equations for $s$, obtaining
$$\begin{cases} \frac{\partial s}{\partial z} = 0, \\ \frac{\partial s}{\partial \ell} = 0, \end{cases} \;\Rightarrow\; \begin{cases} z_{\mathrm{sp}} = -q, \\ \ell_{\mathrm{sp}} = u + v - \frac{2\ln q}{\beta}, \end{cases}$$
$$\left.\frac{\partial^2 s}{\partial z^2}\right|_{\substack{z = z_{\mathrm{sp}} \\ \ell = \ell_{\mathrm{sp}}}} = 0, \qquad \left.\frac{\partial^2 s}{\partial \ell^2}\right|_{\substack{z = z_{\mathrm{sp}} \\ \ell = \ell_{\mathrm{sp}}}} = -\frac{q \ln^2 q}{t^2}, \qquad \left.\frac{\partial^2 s}{\partial z\, \partial \ell}\right|_{\substack{z = z_{\mathrm{sp}} \\ \ell = \ell_{\mathrm{sp}}}} = \frac{\ln q}{t}. \qquad (2.8.61)$$
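These saddle point values can be checked symbolically; the snippet below (a sketch of ours, with $x := u + v$ and the second $\ell$-derivative expressed through $\ln q = t\beta$) verifies them:

```python
# Symbolic verification of the saddle point (2.8.61) for the action s of eq. (2.8.59).
import sympy as sp

beta, q, l, x = sp.symbols('beta q ell x', positive=True)   # x stands for u + v
z = sp.symbols('z')                                          # saddle variable, sits at -q
s = beta*q*(x - l) - z - 2*q*sp.log(-z) + sp.exp(beta*(x - l))/z

saddle = {z: -q, l: x - 2*sp.log(q)/beta}
print(sp.simplify(sp.diff(s, z).subs(saddle)))       # -> 0
print(sp.simplify(sp.diff(s, l).subs(saddle)))       # -> 0
print(sp.simplify(sp.diff(s, z, 2).subs(saddle)))    # -> 0
print(sp.simplify(sp.diff(s, l, 2).subs(saddle)))    # -> -beta**2*q, i.e. -q ln^2 q / t^2
```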
Moreover, observe that the saddle point has a fixed position assuming that $\ln q = t\beta$ for some $t$: we will consider this case, with $q \to \infty$. Indeed, taking $q$ fixed and assuming $\beta \to \infty$, it is easily seen from eq. (2.8.59) that $A^{(q)}(u, v; k) \to \infty$ for $u + v > 0$, whilst $A^{(q)}(u, v; k) \to 0$ for $u + v < 0$. Observing that the saddle point is inside the range of integration only for $u + v - 2t > 0$, we have that

Figure 2.8.1.: Path of integration $\gamma$ (contour $\gamma_\varepsilon$ in the complex $z$ plane).

$$H_{t,k}(u, v) := \lim_{\substack{\beta\to\infty,\ q\to\infty \\ \ln q = \beta t}} A^{(q)}(u, v; k) \approx \frac{S_d}{p}\, e^{-\frac{G(u)+G(v)}{2}}\, \Theta^{d,p}_k(u + v - 2t), \qquad (2.8.62)$$
where we introduced
$$\Theta^{d,p}_k(x) := x^{\frac{d}{p}-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 x^{\frac{2}{p}}}{4}\right) \theta(x). \qquad (2.8.63)$$
For $k = t = 0$ the operator $H_{0,0}$ has eigenvalue $1$ with the eigenvector
$$h^1_{0,0}(u) = G'(u)\, e^{-\frac{G(u)}{2}}. \qquad (2.8.64)$$
Observing that $\sum_{q=2}^{\infty} \frac{1}{\beta q} \to \int_0^{+\infty} \mathrm{d}t$, let us define
$$\mathcal{E}_E(p, d) := \begin{cases} \frac{2^{E+1} S_d}{(2\pi)^d} \iint_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ H_{k,t}^E \right] \mathrm{d}t\, \mathrm{d}k & E \text{ odd}, \\ 0 & E \text{ even}. \end{cases} \qquad (2.8.65)$$
A numerical analysis of the operator $H_{k,t}$ was carried out through a proper discretisation both of the saddle point equation (2.8.32), to obtain $G$ in the mean field approximation, and of eq. (2.8.62), defining the operator itself. In particular, the largest eigenvalue $\lambda_1(k, t)$ of the operator determines the convergence behaviour of the series (2.8.76).
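The following sketch (ours, not the code used for the thesis) illustrates this procedure under simplifying assumptions: we take $d = p = 2$, set the geometric prefactor $S_d/p$ to one, and use the standard mean field form $G(u) = \ln(1 + e^u)$ in place of the numerical solution of (2.8.32); with these assumed normalisations the eigenvalue-1 property (2.8.64) can be checked directly.

```python
# Discretisation of the kernel H_{t,k}(u, v) of eq. (2.8.62) on a uniform grid,
# extracting the largest eigenvalue lambda_1(k, t); assumptions as stated above.
import numpy as np
from scipy.special import hyp0f1

def lambda1(k, t, L=20.0, M=600):
    u = np.linspace(-L, L, M)
    du = u[1] - u[0]
    G = np.log1p(np.exp(u))                              # assumed mean field G(u)
    x = u[:, None] + u[None, :] - 2.0 * t                # argument of Theta_k^{d,p}
    xp = np.where(x > 0.0, x, 0.0)
    theta = hyp0f1(1.0, -k ** 2 * xp / 4.0) * (x > 0.0)  # Theta_k^{2,2}, eq. (2.8.63)
    H = np.exp(-(G[:, None] + G[None, :]) / 2.0) * theta
    eigs = np.linalg.eigvals(H * du)                     # du: quadrature weight of the grid
    return np.max(eigs.real)

print(lambda1(k=0.0, t=0.0))   # ~ 1, the eigenvalue of eq. (2.8.64)
print(lambda1(k=1.0, t=0.5))   # < 1, controlling the convergence of (2.8.76)
```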
Sectors q = 0 and q = 1  Let us compute now the contribution of the two sectors $q = 0$ and $q = 1$ in eq. (2.8.56). Up to a global multiplicative constant, it is given by
$$\int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ \left( A^{(0)}(u, v; k) \right)^E + (n-1) \left( -A^{(1)}(u, v; k) \right)^E \right] \mathrm{d}k. \qquad (2.8.66)$$
Observe that, from eq. (2.8.49) and eq. (2.8.50),
$$\tilde N^{(0)}_{ab}(k) := \frac{N^{(1)}_{ab}(k) - N^{(0)}_{ab}(k)}{n} = \frac{N^{(0)}_{ab}(k)}{b} + o(n) = (-1)^b\, \frac{\Gamma(a+b)}{\Gamma(a)\, b!\, b}\, Q_{a+b} \sqrt{g_a(k)\, g_b(k)} + o(n). \qquad (2.8.67)$$
In the usual approximation in which $Q_a$ is given by (2.8.30a), we can introduce an integral operator to diagonalise. Indeed, denoting by $(\tilde c_p)_p$ the eigenvector of
$$\tilde M^{(0)}_{ab}(k) := (-1)^{a+b} \sqrt{\frac{a+1}{b+1}\, \frac{g_b(k)}{g_a(k)}}\; \tilde N^{(0)}_{ab}(k) = (-1)^a\, \frac{\Gamma(a+b)}{\Gamma(a)\, b!\, b}\, Q_{a+b}\, g_b(k), \qquad (2.8.68)$$
corresponding to the eigenvalue $\lambda$, we have that
$$\tilde A^{(0)}(u, v; k) := \beta S_d\, e^{-\frac{G(u)+G(v)}{2}} \int_0^{\infty} z^{d-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 z^2}{4}\right) J_0\!\left( 2\, e^{\beta \frac{u+v-z^p}{2}} \right) \mathrm{d}z \qquad (2.8.69)$$
has the same eigenvalues as $\tilde M^{(0)}(k)$, with eigenvectors
$$\tilde\phi^{(0)}(u; k) := \sum_{r=0}^{\infty} \frac{e^{r\beta u - \frac{G(u)}{2}}}{r!\, r}\, \tilde c_r\, g_r(k). \qquad (2.8.70)$$
Under the hypotheses above we can write
$$\int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ \left( A^{(0)} \right)^E + (n-1) \left( A^{(0)} + n \tilde A^{(0)} \right)^E \right] \mathrm{d}k =$$
$$= n \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left( A^{(0)} \right)^E \mathrm{d}k + nE \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ \left( A^{(0)} \right)^{E-1} \tilde A^{(0)} \right] \mathrm{d}k + o(n). \qquad (2.8.71)$$
Observe now that $A^{(1)}(u, v; k) = -A^{(0)}(u, v; k)$ and that
$$A^{(0)}(u, v; k) := -\beta S_d\, e^{-\frac{G(u)+G(v)}{2}} \int_0^{+\infty} \mathrm{d}z\; z^{d-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 z^2}{4}\right) e^{\beta \frac{u+v-z^p}{2}}\, J_1\!\left( 2\, e^{\beta \frac{u+v-z^p}{2}} \right)$$
$$= S_d\, e^{-\frac{G(u)+G(v)}{2}} \int_0^{+\infty} \mathrm{d}z\; z^{d-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 z^2}{4}\right) \frac{\partial}{\partial u} J_0\!\left( 2\, e^{\beta \frac{u+v-z^p}{2}} \right). \qquad (2.8.72)$$
Remembering eq. (2.8.31), in the large $\beta$ limit we obtain a finite expression:
$$A^{(0)}(u, v; k) \xrightarrow{\beta\to\infty} -\frac{S_d}{p}\, e^{-\frac{G(u)+G(v)}{2}}\, (u+v)^{\frac{d}{p}-1}\, {}_0F_1\!\left(-; \frac{d}{2}; -\frac{k^2 (u+v)^{\frac{2}{p}}}{4}\right) \theta(u+v) =: -K_k(u, v). \qquad (2.8.73)$$
Similarly, we have that
$$\tilde A^{(0)}(u, v; k) \xrightarrow{\beta\to\infty} -\frac{\beta S_d}{d}\, e^{-\frac{G(u)+G(v)}{2}}\, (u+v)^{\frac{d}{p}}\, {}_0F_1\!\left(-; \frac{d}{2}+1; -\frac{k^2 (u+v)^{\frac{2}{p}}}{4}\right) \theta(u+v) =: -\beta \tilde K_k(u, v). \qquad (2.8.74)$$
Results  We can collect the previous results in a unique expression. Defining
$$\kappa_E(p, d) := \frac{2^E S_d}{(2\pi)^d} \int_0^{\infty} k^{d-1}\, \operatorname{tr}\left[ (K_k)^{E-1}\, \tilde K_k \right] \mathrm{d}k, \qquad (2.8.75)$$
the average optimal cost, up to non-polygonal contributions, is therefore given by
$$\lim_{N\to\infty} N^{\frac{p}{d}} E_N = \lim_{N\to\infty} N^{\frac{p}{d}} E^{(p,\mathrm{mf})}_N(d) + \sum_{E=3}^{\infty} (-1)^E \kappa_E(p, d) + \sum_{\substack{E=3 \\ E \text{ odd}}}^{\infty} \frac{\mathcal{E}_E(p, d)}{2E} + \ldots, \qquad (2.8.76)$$
where $\mathcal{E}_E(p, d)$ is given by eq. (2.8.65), whilst $\lim_N N^{\frac{p}{d}} E^{(p,\mathrm{mf})}_N(d)$ is obtained as in eq. (2.8.33).
Observe that three different contributions appear in the polygonal approximation: the first one is related to the mean field approximation for the Emmp; the second and the third are corrections due to correlations among weights. Remarkably, the third contribution has an expression totally analogous to the finite size corrections for the Rmmp computed by Parisi and Ratiéville [46].
Conclusions
In the present work we discussed some connections between combinatorial optimisation and statistical physics. In particular, we analysed the so-called Euclidean bipartite matching problem, in which an optimal matching between two sets of points in a certain domain in $\mathbb{R}^d$ must be found in such a way that a given functional of the matching (depending on the positions of the points) is minimised. We considered the random version of the problem, assuming that the points are randomly generated, independently and identically distributed according to a given probability distribution density, which we generally supposed uniform. The combination of randomness and Euclidean constraints makes the study of the average properties of the solution highly non-trivial: statistical physics in general, and the theory of disordered systems and of stochastic processes in particular, can be very useful in the inspection of these properties. For what concerns the Euclidean matching problem, in this thesis we presented the following results.
1. We provided a complete and general solution for the one dimensional problem in the case of convex cost functionals, showing that the solution can be mapped into specific stochastic processes related to the Brownian bridge process, depending on the imposed boundary conditions. We also computed the correlation function for the solution of the one dimensional problem.
2. We proposed a scaling ansatz for the average optimal matching cost in the quadratic
case, obtaining an analytical expression for the finite size corrections to the thermodynamical limit in any dimension.
3. We generalised the scaling ansatz above to compute the correlation function of the
optimal matching in any dimension.
4. We obtained, using a functional approach, a general recipe for the computation of
the correlation function of the optimal matching in any dimension and any domain
under the hypothesis of points generated according to a given (in general non-uniform) probability distribution density.
5. We performed a replica computation (in the replica symmetric hypothesis) for the Euclidean monopartite matching problem in $d$ dimensions, generalising some results known in the literature, in which correlations between weights are considered as a perturbation to a mean field model.
Many of the results above suggest further directions of investigation.
1. The case of convex cost functionals has been largely studied in this thesis; however, very little is known about the thermodynamical limit of the concave case, even on the line.
2. The proposed functional approach seems able to fully justify the proposed scaling ansatz and to generalise our results, even to the case of concave cost functionals: a proper investigation in this sense is required (and is actually in progress).
3. The replica approach proved itself quite powerful in the treatment of random optimisation problems; however, in random Euclidean optimisation problems correlations highly complicate the computation. We isolated the simplest part (which we called the polygonal contribution), generalising the classical results, but we completely neglected the non-polygonal terms: a proper (even numerical) treatment of these contributions is required.
As the present thesis, in its own small way, suggests, combinatorial optimisation problems in general, and Euclidean combinatorial optimisation problems in particular, can be analysed using very different approaches from very different research areas and, in return, they provide insights into the methodologies adopted and into the interconnections between apparently distant research fields of mathematics and physics.
Appendix A
The Wiener process and the Brownian bridge process
Summary of the appendix  In the present appendix we introduce some fundamental properties of the Wiener process and of the Brownian bridge process. A complete treatment of these processes is beyond the scope of this short appendix, and excellent books on the subject are already available in the literature (see, for example, the work of Rogers and Williams [50] or the monograph of Mörters and Peres [41]): here we restrict ourselves to the main properties, presenting also some useful probability distributions for the Wiener process and the Brownian bridge process.
A.1. Wiener process and Brownian bridge process
Let us suppose that $\Omega$ is the sample space of the process and that $\mathcal{F}$ corresponds to the set of events, a $\sigma$-algebra$^1$ over $\Omega$; let us suppose also that a probability $P\colon \mathcal{F} \to \mathbb{R}^+$ is given on the space of events. A Wiener process, or standard Brownian process, $W\colon \mathbb{R}^+ \to \mathbb{R}$ on this probability space is such that $W_\omega(0) = 0$ $\forall\omega \in \Omega$ and the map $t \mapsto W_\omega(t)$ is continuous $\forall t \in \mathbb{R}^+$ and $\forall\omega \in \Omega$. Moreover, we require that the process satisfies the fundamental property that $\forall t, \tau \in \mathbb{R}^+$ the increment $W(t+\tau) - W(t)$ is independent of $W(u)$, $u \in [0, t]$, and normally distributed, $W(t+\tau) - W(t) \overset{d}{\sim} \mathcal{N}(0, \tau)$. This set of properties uniquely identifies the Wiener process, and it can be proved that such a process exists. A Wiener process can also be introduced using different probabilistic terminologies, connected to its main properties, which follow directly from the definition.
$^1$Recall that a $\sigma$-algebra on a certain set $A$ is a collection $\Sigma$ of subsets of $A$ such that $A \in \Sigma$, $a \in \Sigma \Rightarrow A \setminus a \in \Sigma$ and $a, b \in \Sigma \Rightarrow a \cup b \in \Sigma$. If $C$ is a set of subsets of $A$, then $\sigma(C)$ is the smallest $\sigma$-algebra on $A$ containing $C$.
Figure A.1.1.: A realisation of a Wiener process.
Figure A.1.2.: A realisation of a Brownian bridge process.
Gaussian process  A Gaussian process is a stochastic process $X(t)\colon T \subseteq \mathbb{R} \to \mathbb{R}$ such that, given a set of $n$ elements $\{t_1, \ldots, t_n\} \subset T$, the joint distribution of $(X(t_1), \ldots, X(t_n))$ is a multivariate Gaussian distribution. It follows that we have to specify only the mean$^2$ $\mu(t) := \mathbb{E}(X(t))$ and the covariance $\mathbb{E}(X(t) X(t'))$. The Wiener process is a Gaussian process with mean $\mathbb{E}(W(t)) = 0$ and covariance $\mathbb{E}(W(t) W(t')) = \min\{t, t'\}$. Most importantly, if a continuous process satisfies the previous relations, it is a Wiener process.
Martingale  The concept of martingale is at the basis of the modern analysis of stochastic processes. Due to its importance, let us briefly introduce the fundamental ideas. Let us firstly consider a probability space $(\Omega, \mathcal{F}, P)$ as above; we say that a family $\{\mathcal{F}_t : t \in \mathbb{R}^+\}$ of sub $\sigma$-algebras of $\mathcal{F}$ is a filtration if, for $s < t$,
$$\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}_\infty := \sigma\!\left( \bigcup_\tau \mathcal{F}_\tau \right) \subseteq \mathcal{F}. \qquad (A.1.1)$$
A certain process $X(t)$, $t \in \mathbb{R}^+$, on our probability space is said to be adapted to the filtration if $X(t)$ is measurable on $\mathcal{F}_t$. Given an adapted process such that $\mathbb{E}(X(t)) < +\infty$ $\forall t$ and $\mathbb{E}(X(t) \mid \mathcal{F}_s) = X(s)$ for $s < t$ almost surely, the process is called a martingale. In this sense we can say that the Wiener process is a martingale, due to the fact that, denoting by $\mathcal{W}_s := \sigma(\{W(\tau) : \tau \leq s\})$ the filtration of the probability space, $\mathbb{E}(W(t) - W(s) \mid \mathcal{W}_s) = 0 \Rightarrow \mathbb{E}(W(t) \mid \mathcal{W}_s) = W(s)$, directly from the defining properties of the Wiener process.
Markov process  An adapted process $X(t)$, $t \in \mathbb{R}^+$, with filtration $\{\mathcal{F}_s\}_{s \in \mathbb{R}^+}$, is a Markov process if there exists a Markov kernel $p_X(\tau, X(s), A)$, for $A$ open subset of $\mathbb{R}$, such that
$$\Pr(X(t) \in A \mid \mathcal{F}_s) = p_X(t - s, X(s), A). \qquad (A.1.2)$$
The Wiener process is a Markov process, having
$$p_W(\tau, y, (x, x + \mathrm{d}x)) = \frac{e^{-\frac{(x-y)^2}{2\tau}}}{\sqrt{2\pi\tau}}\, \mathrm{d}x =: \rho(x - y; \tau)\, \mathrm{d}x. \qquad (A.1.3)$$
It is well known that the probability density $\rho(x, t)$ is the fundamental solution of the heat equation
$$\partial_t \rho(x, t) = \frac{1}{2} \partial_x^2 \rho(x, t), \qquad (A.1.4)$$
obtained by Einstein in his celebrated work on Brownian diffusion.
$^2$We denote by $\mathbb{E}(\bullet)$ the expected value of a certain quantity $f\colon \Omega \to \mathbb{R}$, $\mathbb{E}(f) = \int_\Omega P(\omega) f(\omega)\, \mathrm{d}\omega$; similarly, if $\mathcal{G} \subset \mathcal{F}$ is a sub $\sigma$-algebra of $\mathcal{F}$, $\mathbb{E}(f \mid \mathcal{G}) := \int_\Omega f(\omega) P(\omega \mid \mathcal{G})\, \mathrm{d}\omega$.
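As a quick cross-check (ours, not part of the appendix), the kernel density can be verified against the heat equation symbolically:

```python
# Symbolic check that the kernel density rho of eq. (A.1.3) solves eq. (A.1.4).
import sympy as sp

x, t = sp.symbols('x', real=True), sp.symbols('t', positive=True)
rho = sp.exp(-x**2 / (2 * t)) / sp.sqrt(2 * sp.pi * t)
print(sp.simplify(sp.diff(rho, t) - sp.diff(rho, x, 2) / 2))   # -> 0
```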
A stochastic process strictly related to the Wiener process is the Brownian bridge process. A Brownian bridge $B(t)$ is a Gaussian process on the unit interval $[0, 1]$ such that $\mathbb{E}(B(t)) = 0$ and $\mathbb{E}(B(s) B(t)) = \min\{s, t\} - st$. It can be written in terms of a Wiener process on the unit interval as
$$B(t) := W(t) - t\, W(1), \quad t \in [0, 1]. \qquad (A.1.5)$$
It can be proved that the previous process can be obtained simply by conditioning a Wiener process to be equal to zero for $t = 1$. We can introduce also the Gaussian stochastic process
$$B^{(-1)}(t) := \int_0^t B(s)\, \mathrm{d}s, \quad t \in [0, 1], \qquad (A.1.6)$$
the area of the stochastic process. It is easily proved that $\mathbb{E}\left( B^{(-1)}(t) \right) = 0$ and $\mathbb{E}\left[ \left( B^{(-1)}(1) \right)^2 \right] = \frac{1}{12}$. Finally, the zero-area Brownian bridge can be defined as
$$B^0(t) = B(t) - B^{(-1)}(1). \qquad (A.1.7)$$
The previous process has
$$\mathbb{E}\left( B^0(t) \right) = 0, \qquad \mathbb{E}\left( B^0(s)\, B^0(t) \right) = \frac{1}{2}\left( |s - t| - \frac{1}{2} \right)^2 - \frac{1}{24}. \qquad (A.1.8)$$
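The processes defined in this section are straightforward to simulate; the following sketch (ours, for illustration) samples discretised realisations, as in figs. A.1.1 and A.1.2, and checks the stated covariances by Monte Carlo:

```python
# Discretised realisations of the Wiener process, the Brownian bridge and the
# zero-area bridge, with Monte Carlo checks of the covariances quoted above.
import numpy as np

rng = np.random.default_rng(0)
M, N = 10000, 500                                  # samples, time steps
t = np.linspace(0.0, 1.0, N + 1)
dW = rng.normal(0.0, np.sqrt(1.0 / N), size=(M, N))
W = np.concatenate([np.zeros((M, 1)), np.cumsum(dW, axis=1)], axis=1)   # W(0) = 0
B = W - t * W[:, -1:]                              # Brownian bridge, eq. (A.1.5)
B0 = B - np.trapz(B, t, axis=1)[:, None]           # zero-area bridge, eq. (A.1.7)

i, j, s, u = N // 4, N // 2, 0.25, 0.5             # grid indices for s = 1/4, t = 1/2
print(np.mean(W[:, i] * W[:, j]), min(s, u))                        # ~ min(s, t)
print(np.mean(B[:, i] * B[:, j]), min(s, u) - s * u)                # ~ min(s, t) - st
print(np.mean(B0[:, i] * B0[:, j]),
      0.5 * (abs(s - u) - 0.5) ** 2 - 1 / 24)                       # eq. (A.1.8)
```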
A.2. Probability distributions
A series of mathematical results is available for the probability distributions of some interesting quantities related to the Wiener process or the Brownian bridge on the unit interval. We collect here some of these results, without proving them (for further details and more general results, see the detailed paper of Beghin and Orsingher [7]). However, we remark here that the probability analysis is usually carried out through the so-called reflection principle: the general idea is that, given a certain time $t^*$ and a value $W(t^*)$ for the Wiener process, any set of possible realisations starting at $W(t^*)$ for $t = t^*$, and the set obtained by reflecting them in the line $w = W(t^*)$, have the same probability. By this simple procedure, it can be easily proved that
$$\Pr\left( \sup_{\tau \in (0,t)} W(\tau) > W \right) = 2 \int_W^{\infty} \frac{e^{-\frac{z^2}{2t}}}{\sqrt{2\pi t}}\, \mathrm{d}z. \qquad (A.2.1)$$
Similarly, it can be proved that
$$\Pr\left( \sup_{\tau \in (0,t)} W(\tau) > W \,\middle|\, W(t) = w \right) = \begin{cases} \exp\left( -\frac{2W(W - w)}{t} \right) & W > w, \\ 1 & W \leq w. \end{cases} \qquad (A.2.2)$$
If $w = 0$ and $t = 1$ we have the probability distribution for the sup of the Brownian bridge process as $\Pr\left( \sup_{\tau \in (0,1)} B(\tau) > B \right) = e^{-2B^2}$. For the absolute value of the Wiener process we have that
$$\Pr\left( \sup_{\tau \in (0,t)} |W(\tau)| < W \,\middle|\, W(t) = w \right) = \sum_{k \in \mathbb{Z}} (-1)^k \exp\left( -\frac{2kW(kW - w)}{t} \right). \qquad (A.2.3)$$
Again, for $w = 0$ and $t = 1$ we have $\Pr\left( \sup_{\tau \in (0,1)} |B(\tau)| < B \right) = \sum_{k \in \mathbb{Z}} (-1)^k \exp\left( -2k^2 B^2 \right)$.
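These closed forms are easy to test numerically; the following Monte Carlo sketch (ours) compares the empirical bridge suprema with $e^{-2B^2}$ and with the alternating series above (the finite time grid slightly biases the empirical suprema downwards):

```python
# Monte Carlo check of the Brownian bridge suprema distributions quoted above.
import numpy as np

rng = np.random.default_rng(1)
M, N = 10000, 500
dW = rng.normal(0.0, np.sqrt(1.0 / N), size=(M, N))
W = np.cumsum(dW, axis=1)
t = np.arange(1, N + 1) / N
B = W - t * W[:, -1:]                               # Brownian bridge realisations

Bs = 0.8
print(np.mean(B.max(axis=1) > Bs), np.exp(-2 * Bs ** 2))
series = sum((-1) ** k * np.exp(-2 * k ** 2 * Bs ** 2) for k in range(-30, 31))
print(np.mean(np.abs(B).max(axis=1) < Bs), series)
```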
Darling [16] proved that for the zero-area Brownian bridge the following distribution holds:
$$\Pr\left( \sup_{t \in (0,1)} B^0(t) < B \right) = \frac{4\sqrt{\pi}}{3} \sum_{n=1}^{\infty} \frac{1}{\alpha_n}\, \psi\!\left( \frac{\sqrt{8}\, B}{3\alpha_n} \right), \quad B > 0, \qquad (A.2.4)$$
where $0 < \alpha_1 < \cdots < \alpha_n < \ldots$ are the zeros of the combination of Bessel functions
$$f(\alpha) := J_{\frac{1}{3}}(\alpha) + J_{-\frac{1}{3}}(\alpha), \qquad (A.2.5)$$
whilst $\psi(x)$ is the solution of the following integral equation:
$$\int_0^{\infty} e^{-\lambda x}\, \psi(x)\, \mathrm{d}x = e^{-\lambda^{\frac{2}{3}}}. \qquad (A.2.6)$$
The theorem of Darling is not trivial at all, both in the derivation and in the final result: the explicit treatment of the probability distribution density is indeed quite involved. More implicit results on the Laplace transform of the probability distribution of $B^{(-1)}$ have been obtained by Perman and Wellner [47], but we do not present them here.
Appendix B
The Simplex Method
B.1. Linear optimisation problems
The simplex algorithm is the most widely used algorithm for the solution of linear optimisation problems. A linear optimisation problem can be stated as follows: let us consider an $m \times n$ matrix of integers $A = (a_{ij})_{ij}$, a vector of $m$ integers $b \in \mathbb{Z}^m$ and a vector of $n$ integers $c \in \mathbb{Z}^n$. We want to find a vector $X = (X_i)_{i=1,\ldots,n} \in (\mathbb{R}^+)^n$ such that
$$z := c \cdot X = \min_{x \in F} c^\intercal \cdot x \qquad (B.1.1)$$
on a certain non-empty space $F$ of feasible solutions,
$$F := \{ x = (x_1, \ldots, x_n) \in \mathbb{R}^n : A \cdot x = b,\ x_i \geq 0\ \forall i = 1, \ldots, n \} \neq \emptyset. \qquad (B.1.2)$$
The linear optimisation problem, when stated in the previous form, is said to be in standard form.
The algorithmic solution of the previous optimisation problem is due to George B. Dantzig, who formulated the celebrated simplex algorithm. To have a solution of our optimisation problem it is necessary that $n - \operatorname{rank}[A] > 0$ (otherwise no point satisfies all the constraints): we therefore require for simplicity that $\operatorname{rank}[A] = m < n$. On the $(n-m)$-dimensional affine subspace determined by the constraint condition $A \cdot x = b$, the additional constraints $x_i \geq 0$ $\forall i = 1, \ldots, n$ eventually delimit an $(n-m)$-dimensional bounded convex subspace called a convex polytope.

Figure B.1.1.: An example of a 3-dimensional convex polytope.

B.2. The algorithm

In the hypotheses above, we can select $m$ linearly independent columns in $A$, let us call them $B := \{a_{i_k}\}_{k=1,\ldots,m}$, such that the system
$$A \cdot x = b, \qquad x_j = 0 \text{ if } j \neq i_k\ \forall k = 1, \ldots, m \qquad (B.2.1)$$
has a unique solution, i.e. we have a problem in the form
$$\underbrace{\begin{pmatrix} a_{1,1} & a_{2,1} & a_{3,1} & a_{4,1} & a_{5,1} & a_{6,1} & a_{7,1} & a_{8,1} & a_{9,1} \\ a_{1,2} & a_{2,2} & a_{3,2} & a_{4,2} & a_{5,2} & a_{6,2} & a_{7,2} & a_{8,2} & a_{9,2} \\ \vdots & & & & & & & & \vdots \\ a_{1,6} & a_{2,6} & a_{3,6} & a_{4,6} & a_{5,6} & a_{6,6} & a_{7,6} & a_{8,6} & a_{9,6} \end{pmatrix}}_{A} \cdot \underbrace{\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ 0 \\ x_5 \\ x_6 \\ 0 \\ x_8 \\ 0 \end{pmatrix}}_{x} = \underbrace{\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_6 \end{pmatrix}}_{b} \qquad (B.2.2)$$
where the columns of $A$ multiplying the non-zero components of $x$ form the highlighted submatrix $B$. If the solution $x$ of the previous problem has $x_i \geq 0$, it is called a basic feasible solution (bfs). Remarkably, the optimal solution is a bfs. It can be proved that
$$x \in F \text{ and } x \text{ is a bfs} \iff x \text{ is a vertex of the polytope.} \qquad (B.2.3)$$
Moreover, for any instance of a linear programming problem there is an optimal bfs; it follows that an optimal solution can be found among the vertices of the polytope corresponding to the instance. Let us suppose that a certain bfs $x^*$ is known and, for the sake of simplicity, that this bfs corresponds to the set of columns $\{a_i\}_{i=1,\ldots,m}$, in such a way that $x^* = (x^*_1, \ldots, x^*_m, 0, \ldots, 0)$. In the simplex method, we move from one vertex to another until the optimal vertex is found, through proper pivoting operations. The first step is to write down the simplex tableau:
$$\begin{cases} z - c^\intercal \cdot x = 0 \\ A \cdot x = b \end{cases} \;\Rightarrow\; \left( \begin{array}{cc|c} A & 0 & b \\ -c^\intercal & 1 & 0 \end{array} \right). \qquad (B.2.4)$$
Let us now give a pictorial representation of the algorithm (in the pictures, $m = 6$ and $n = 9$). The simplex matrix is
$$\begin{pmatrix} a_{1,1} & a_{2,1} & \cdots & a_{9,1} & 0 & b_1 \\ a_{1,2} & a_{2,2} & \cdots & a_{9,2} & 0 & b_2 \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ a_{1,6} & a_{2,6} & \cdots & a_{9,6} & 0 & b_6 \\ -c_1 & -c_2 & \cdots & -c_9 & 1 & 0 \end{pmatrix} \qquad (B.2.5)$$
After a sequence of row operations we can transform the simplex tableau in the following way:
$$\begin{pmatrix} 1 & 0 & \cdots & 0 & \tilde a_{7,1} & \tilde a_{8,1} & \tilde a_{9,1} & 0 & \tilde b_1 \\ 0 & 1 & \cdots & 0 & \tilde a_{7,2} & \tilde a_{8,2} & \tilde a_{9,2} & 0 & \tilde b_2 \\ \vdots & & \ddots & & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & \tilde a_{7,6} & \tilde a_{8,6} & \tilde a_{9,6} & 0 & \tilde b_6 \\ 0 & 0 & \cdots & 0 & -\tilde c_7 & -\tilde c_8 & -\tilde c_9 & 1 & z_B \end{pmatrix} \qquad (B.2.6)$$
where $z_B$ is the value of $z$ on the current bfs. If all the entries $\{\tilde c_i\}_{i=m+1,\ldots,n}$ are positive, then the current bfs is optimal and $z_B$ is the optimal cost.
Otherwise, we have to proceed further: we choose a non-zero pivot element $\tilde a_{rc} \neq 0$ in the simplex tableau, and we multiply the corresponding row by $\frac{1}{\tilde a_{rc}}$; proper multiples of the new row are added to the remaining rows of $\tilde A$ in such a way that the $c$th column of the new matrix has $1$ in correspondence of the original pivot element and zero otherwise. The chosen variable is a new basic variable (entering variable) substituting the old $r$th basic variable (leaving variable): we switch the $r$th column with the current $c$th column to obtain a new simplex tableau in the form (B.2.6). Due to the fact that the value of $z$ must be minimised, the entering variable is chosen in a column $c$ in such a way that $\tilde c_c < 0$ (non-zero values of the newly selected variable decrease $z$). The condition that the new solution must be feasible determines a criterion for the row: it can be shown that, $c$ being the chosen pivot column, the row $r$ must be such that $\frac{\tilde b_r}{\tilde a_{rc}}$ is minimum among all rows. The iteration of this sequence of steps leads to the optimal solution, exploring the bfss. Note that the method requires a bfs as starting point.
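To fix the ideas, the following compact sketch (ours, not from the thesis) implements the pivoting iteration just described; instead of physically swapping columns as in the text, it tracks the basis indices. It assumes that the last $m$ columns of $A$ form an identity block, as produced by slack variables (see below), so that a starting bfs is immediately available.

```python
# Minimal tableau simplex (a sketch): minimise c.x subject to A.x = b, x >= 0,
# assuming the last m columns of A are an identity block providing the initial bfs.
import numpy as np

def simplex(A, b, c, tol=1e-12):
    m, n = A.shape
    T = np.zeros((m + 1, n + 1))
    T[:m, :n], T[:m, -1] = A, b
    T[-1, :n] = -c                            # bottom row (-c | 1 | 0), cf. (B.2.4)
    basis = list(range(n - m, n))             # identity columns give the starting bfs
    while True:
        col = int(np.argmax(T[-1, :n]))       # entering column c: tilde-c_c < 0
        if T[-1, col] <= tol:                 # all tilde-c >= 0: the bfs is optimal
            x = np.zeros(n)
            x[basis] = T[:m, -1]
            return x, T[-1, -1]               # optimal point and z_B, cf. (B.2.6)
        body = T[:m]
        mask = body[:, col] > tol
        if not mask.any():
            raise ValueError("unbounded problem")
        ratios = np.full(m, np.inf)
        ratios[mask] = body[mask, -1] / body[mask, col]
        row = int(np.argmin(ratios))          # leaving row r: minimal tilde-b_r / tilde-a_rc
        T[row] /= T[row, col]                 # pivot: normalise the pivot row ...
        for i in range(m + 1):
            if i != row:
                T[i] -= T[i, col] * T[row]    # ... and clear the rest of column c
        basis[row] = col
```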
Finally, observe that if we have constraints expressed in terms of inequalities, e.g.,
$$\sum_{j=1}^{n} a_{ij} x_j \leq b_i, \qquad (B.2.7)$$
we can introduce a new slack variable $s_i \geq 0$ for each inequality and write it as $\sum_{j=1}^{n} a_{ij} x_j + s_i = b_i$. Obviously, we can introduce a "dependence" of $z$ on the new variables in a trivial way as
$$z = \sum_{i=1}^{n} c_i x_i + 0 \cdot \sum_j s_j. \qquad (B.2.8)$$
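As an illustration of the slack-variable construction (with hypothetical problem data of ours), a small system of inequalities is brought to standard form below and solved with the simplex() sketch above; the same standard-form data can be cross-checked, e.g., with scipy.optimize.linprog(c_std, A_eq=A, b_eq=b).

```python
# Slack variables in practice: inequalities sum_j a_ij x_j <= b_i become equalities
# with s_i >= 0 appended, cf. (B.2.7)-(B.2.8); the data below are illustrative only.
import numpy as np

A_ineq = np.array([[1.0, 2.0],
                   [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -1.0])                 # minimise -x1 - x2, i.e. maximise x1 + x2

m = A_ineq.shape[0]
A = np.hstack([A_ineq, np.eye(m)])         # slack columns: the starting identity basis
c_std = np.concatenate([c, np.zeros(m)])   # slacks enter z with coefficient 0

x, z = simplex(A, b, c_std)
print(x[:2], z)                            # -> [1.6 1.2] -2.8
```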
Bibliography
[1] M. Ajtai, J. Komlós, and G. Tusnády. "On optimal Matchings". In: Combinatorica 4.4 (1984), pp. 259–264.
[2] D. J. Aldous. "The ζ(2) limit in the random assignment problem". In: Random Structures and Algorithms 2 (2001), pp. 381–418.
[3] J. R. de Almeida and D. J. Thouless. "Stability of the Sherrington-Kirkpatrick solution of a spin glass model". In: Journal of Physics A: Mathematical and General 11.5 (1978).
[4] L. Ambrosio. Lecture notes on optimal transport problems. 2003, pp. 1–52.
[5] M. Barczy and E. Iglói. "Karhunen-Loève expansions of α-Wiener bridges". In: Central European Journal of Mathematics 9.1 (Dec. 2010), pp. 65–84.
[6] J. Beardwood, J. H. Halton, and J. M. Hammersley. "The shortest path through many points". In: Mathematical Proceedings of the Cambridge Philosophical Society 55.04 (Oct. 1959), p. 299.
[7] L. Beghin and E. Orsingher. "On the maximum of the generalized Brownian bridge". In: Lithuanian Mathematical Journal 39.2 (Apr. 1999), pp. 157–167.
[8] K. Binder and A. P. Young. "Spin glasses: Experimental facts, theoretical concepts, and open questions". In: Reviews of Modern Physics 58.4 (1986).
[9] V. I. Bogachev and A. V. Kolesnikov. "The Monge-Kantorovich problem: achievements, connections, and perspectives". In: Russian Mathematical Surveys 67.5 (Oct. 2012), pp. 785–890.
[10] E. Boniolo, S. Caracciolo, and A. Sportiello. "Correlation function for the Grid-Poisson Euclidean matching on a line and on a circle". In: Journal of Statistical Mechanics: Theory and Experiment 2014.11 (Nov. 2014), P11023.
[11] S. Caracciolo and G. Sicuro. "Functional Approach for the Euclidean Bipartite Matching Problem". In preparation. 2014.
[12] S. Caracciolo and G. Sicuro. "One-dimensional Euclidean matching problem: Exact solutions, correlation functions, and universality". In: Physical Review E 90.4 (Oct. 2014), p. 042112.
[13] S. Caracciolo and G. Sicuro. "Scaling hypothesis for the Euclidean bipartite matching problem. II Correlation functions". In preparation. 2014.
[14] S. Caracciolo et al. "Scaling hypothesis for the Euclidean bipartite matching problem". In: Physical Review E 90.1 (2014), p. 012118.
[15] T. Castellani and A. Cavagna. "Spin-glass theory for pedestrians". In: Journal of Statistical Mechanics: Theory and Experiment (May 2005), P05012.
[16] D. A. Darling. "On the supremum of certain Gaussian processes". In: The Annals of Probability 11.3 (1983), pp. 803–806.
[17] J. Delon, J. Salomon, and A. Sobolewski. "Minimum-weight perfect matching for non-intrinsic distances on the line". In: Zapiski Nauchnykh Seminarov POMI 390.52 (2011).
[18] B. Dezső, A. Jüttner, and P. Kovács. "LEMON – an Open Source C++ Graph Template Library". In: Electronic Notes in Theoretical Computer Science 264.5 (July 2011), pp. 23–45.
[19] R. Diestel. Graph Theory. Springer Graduate Texts in Mathematics (GTM) 173. 2012.
[20] V. T. Dobrić and J. E. Yukich. "Asymptotics for Transportation Cost in High Dimensions". In: Journal of Theoretical Probability 8.1 (1995), pp. 97–118.
[21] R. Dudley. Real Analysis and Probability. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2002.
[22] S. F. Edwards and P. W. Anderson. "Theory of spin glasses". In: Journal of Physics F: Metal Physics 5 (1975), pp. 965–974.
[23] L. C. Evans. "Partial differential equations and Monge-Kantorovich mass transfer". In: Current developments in mathematics (1997).
[24] D. Fichera. "Cavity methods in optimization problems: exact and approximated algorithms". PhD thesis. Università di Milano, 2008.
[25] K. Fischer and J. Hertz. Spin Glasses. Cambridge Studies in Magnetism. Cambridge University Press, 1993.
[26] G. H. Hardy. Ramanujan: twelve lectures on subjects suggested by his life and work. AMS Chelsea Publishing Series, 1999.
[27] J. Houdayer, J. H. Boutet de Monvel, and O. C. Martin. "Comparing mean field and Euclidean matching problems". In: The European Physical Journal B 6.3 (Nov. 1998), pp. 383–393.
[28] H. A. Kramers and G. H. Wannier. "Statistics of the two-dimensional ferromagnet. Part I". In: Physical Review 60 (1941).
[29] H. W. Kuhn. "The Hungarian method for the assignment problem". In: Naval Research Logistics Quarterly 2 (1955), pp. 83–97.
[30] C.-S. Lin and C.-L. Wang. "Elliptic functions, Green functions and the mean field equations on tori". In: Annals of Mathematics 172.2 (2010), pp. 911–954.
[31] L. Lovász and D. Plummer. Matching Theory. AMS Chelsea Publishing Series. 2009.
[32] R. J. McCann. "Exact Solutions to the Transportation Problem on the Line". In: Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 455.1984 (1999), pp. 1341–1380.
[33] M. Mézard, G. Parisi, and M. Virasoro. Spin Glass Theory and Beyond. Lecture Notes in Physics Series. World Scientific Publishing Company, Incorporated, 1987.
[34] M. Mézard and A. Montanari. Information, Physics, and Computation. Oxford Graduate Texts. OUP Oxford, 2009.
[35] M. Mézard and G. Parisi. "Mean-field equations for the matching and the travelling salesman problems". In: Europhysics Letters 2.12 (1986), pp. 913–918.
[36] M. Mézard and G. Parisi. "On the solution of the random link matching problems". In: Journal de Physique 48.9 (1987), pp. 1451–1459.
[37] M. Mézard and G. Parisi. "Replicas and optimization". In: Journal de Physique Lettres 46.17 (1985), pp. 771–778.
[38] M. Mézard and G. Parisi. "The Bethe lattice spin glass revisited". In: The European Physical Journal B 20.2 (Mar. 2001), pp. 217–233.
[39] M. Mézard and G. Parisi. "The cavity method at zero temperature". In: Journal of Statistical Physics 111.1/2 (2003), pp. 1–24. arXiv:cond-mat/0207121v2.
[40] M. Mézard and G. Parisi. "The Euclidean matching problem". In: Journal de Physique 49 (1988), pp. 2019–2025.
[41] P. Mörters and Y. Peres. Brownian Motion. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2010.
[42] H. Nishimori. Statistical Physics of Spin Glasses and Information Processing: An Introduction. Clarendon Press Oxford, 2001.
[43] C. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Dover Books on Computer Science Series. Dover Publications, 1998.
[44] G. Parisi. "A conjecture on random bipartite matching". In: arXiv:cond-mat/9801176 (1998).
[45] G. Parisi. "Order parameter for spin-glasses". In: Physical Review Letters 50.24 (1983), pp. 1946–1948.
[46] G. Parisi and M. Ratiéville. "On the finite size corrections to some random matching problems". In: The European Physical Journal B 29.3 (Oct. 2002), pp. 457–468.
[47] M. Perman and J. A. Wellner. "On the distribution of Brownian areas". In: The Annals of Applied Probability 6.4 (1996), pp. 1091–1111.
[48] T. Plefka. "Convergence condition of the TAP equation for the infinite-ranged Ising spin glass model". In: Journal of Physics A: Mathematical and General 15 (1982), pp. 1971–1978.
[49] C. Redmond and J. E. Yukich. "Asymptotics for Euclidean functionals with power-weighted edges". In: Stochastic Processes and their Applications 61.2 (Feb. 1996), pp. 289–304.
[50] L. Rogers and D. Williams. Diffusions, Markov Processes, and Martingales: Volume 1, Foundations. Cambridge Mathematical Library. Cambridge University Press, 2000.
[51] L. A. Shepp. "On the integral of the absolute value of the pinned Wiener process". In: The Annals of Probability 10.1 (1982), pp. 234–239.
[52] D. Thouless, P. Anderson, and R. Palmer. "Solution of 'Solvable model of a spin glass'". In: Philosophical Magazine 35.3 (1977), pp. 593–601.
[53] C. Villani. "Current trends in optimal transportation", pp. 1–13.
[54] C. Villani. Optimal transport. Old and new. Springer-Verlag, 2009.
[55] H. Whitney. "Non-separable and planar graphs". In: Proceedings of the National Academy of Sciences of the United States of America 17.1 (1931), pp. 125–127.
[56] J. Yukich. Probability theory of classical Euclidean optimization problems. Lecture Notes in Mathematics. Springer, 1998.
[57] J. Zinn-Justin. Quantum Field Theory and Critical Phenomena. The International Series of Monographs on Physics. Clarendon Press, 2002.