The Kardashian Kernel - Find a separating hyperplane with this One

Transcription

The Kardashian Kernel - Find a separating hyperplane with this One
The Kardashian Kernel
David F. Fouhey
Sublime and Distinguished Grand Poobah,
CMU, Karlsruhe Inst. of Technology, Kharkiv Polytechnic Inst.
Daniel Maturana
Distinguished Appointed Lecturer of Keeping it Real,
CMU, KAIST, Kyushu Inst. of Technology
Introduction
Outline
1 Introduction
Motivation
Related work
2 The Kardashian Kernel
Formalities
On Some Issues Raised by the Kardashian Kernel
3 Applications
Kardashian SVM
Graph Kardashiancian
Kardashian Kopula
4 Conclusions and future work
2 / 46
Introduction
Motivation
Motivation
• Kernel machines are popular
• Have fancy math
• They work well
3 / 46
Introduction
Motivation
Motivation
• Kernel machines are popular
• Have fancy math
• They work well
• The Kardashians are popular
• (TODO)
4 / 46
Introduction
Motivation
Motivation
• Kernel machines are popular
• Have fancy math
• They work well
• The Kardashians are popular
• (TODO)
• Why not combine them?
5 / 46
Introduction
Related work
Related work
•
•
•
•
•
•
•
•
•
•
•
•
•
Kronecker product
Krylov subspace methods
Kolmogorov axioms
Kalman Filters
Kent distribution
Karhunen-Loève Transform
Keypoint retrieval w/ K-d tree search
Kriging (AKA Gaussian process regression)
Kohonen maps (AKA Self-Organizing Maps)
K-grams
K-folds
K-armed bandits
...
6 / 46
Introduction
Related work
Related work
•
•
•
•
•
•
•
•
•
•
•
•
•
•
Kronecker product
Krylov subspace methods
Kolmogorov axioms
Kalman Filters
Kent distribution
Karhunen-Loève Transform
Keypoint retrieval w/ K-d tree search
Kriging (AKA Gaussian process regression)
Kohonen maps (AKA Self-Organizing Maps)
K-grams
K-folds
K-armed bandits
...
Our approach: provably k-optimal, as our paper has
significantly more k’s and substantially more pictures of the
Kardashians
7 / 46
The Kardashian Kernel
Outline
1 Introduction
Motivation
Related work
2 The Kardashian Kernel
Formalities
On Some Issues Raised by the Kardashian Kernel
3 Applications
Kardashian SVM
Graph Kardashiancian
Kardashian Kopula
4 Conclusions and future work
8 / 46
The Kardashian Kernel
Formalities
Definitions
• Let X be an instance space.
9 / 46
The Kardashian Kernel
Formalities
Definitions
• Let X be an instance space.
• The Kardashian Kernel is an inner product operator
KK : X × X → R.
10 / 46
The Kardashian Kernel
Formalities
Definitions
• Let X be an instance space.
• The Kardashian Kernel is an inner product operator
KK : X × X → R.
• Kernel trick (Mercer): KK (x, x 0 ) = κ(x)T κ(x), with
κ : X → K.
11 / 46
The Kardashian Kernel
Formalities
Definitions
• Let X be an instance space.
• The Kardashian Kernel is an inner product operator
KK : X × X → R.
• Kernel trick (Mercer): KK (x, x 0 ) = κ(x)T κ(x), with
κ : X → K.
• can leverage the Kardashian Feature space without suffering
the Kurse of Dimensionality.
12 / 46
The Kardashian Kernel
Formalities
The Kardashian Kernel Trick
13 / 46
The Kardashian Kernel
On Some Issues Raised by the Kardashian Kernel
On Some Issues Raised by the Kardashian Kernel
14 / 46
The Kardashian Kernel
On Some Issues Raised by the Kardashian Kernel
On Reproducing Kardashian Kernels
• Does KK define a Reproducing Kernel Hilbert Space (RKHS)?
i.e. are the Kardashians Reproducing Kernels?
15 / 46
The Kardashian Kernel
On Some Issues Raised by the Kardashian Kernel
On Reproducing Kardashian Kernels
• Does KK define a Reproducing Kernel Hilbert Space (RKHS)?
i.e. are the Kardashians Reproducing Kernels?
• Only proven for case of Kourtney
16 / 46
The Kardashian Kernel
On Some Issues Raised by the Kardashian Kernel
On Reproducing Kardashian Kernels
• Does KK define a Reproducing Kernel Hilbert Space (RKHS)?
i.e. are the Kardashians Reproducing Kernels?
• Only proven for case of Kourtney
• But prominent bloggers argue that it is also true for Kim
17 / 46
The Kardashian Kernel
On Some Issues Raised by the Kardashian Kernel
On Divergence Functionals
Crucial question: does the space induced by κ have structure
that is advantageous to minimizing the f -divergences?
Theorem
n
n
i=1
j=1
1X
1X
λn
min =
hw , κ(xi )i −
loghw , κ(yj )i + ||w ||2K
w
n
n
2
Proof.
Obvious by the use of the Jensen-Jenner Inequality.
18 / 46
Applications
Outline
1 Introduction
Motivation
Related work
2 The Kardashian Kernel
Formalities
On Some Issues Raised by the Kardashian Kernel
3 Applications
Kardashian SVM
Graph Kardashiancian
Kardashian Kopula
4 Conclusions and future work
19 / 46
Applications
Kardashian SVM
Kardashian SVM problem setting
Regular Support Vector Machines (SVMs) are boring. We propose
to solve the following optimization problem, which is subject to the
Kardashian-Karush-Kuhn-Tucker (KKKT) Conditions:
n
X
1
ξi
min ||w||2 + C
w,ξ,b 2
i=1
20 / 46
Applications
Kardashian SVM
Kardashian SVM problem setting
Regular Support Vector Machines (SVMs) are boring. We propose
to solve the following optimization problem, which is subject to the
Kardashian-Karush-Kuhn-Tucker (KKKT) Conditions:
n
X
1
ξi
min ||w||2 + C
w,ξ,b 2
i=1
such that
yi (wT κ(xi ) − b) ≥ 1 − ξi
ξi ≥ 0
ζj = 0
1≤i ≤n
1≤i ≤n
1 ≤ j ≤ m.
21 / 46
Applications
Kardashian SVM
Learning algorithm
• Standard approach: Kuadratic Programming (KP)
22 / 46
Applications
Kardashian SVM
Learning algorithm
• Standard approach: Kuadratic Programming (KP)
• But see Kurvature of optimization manifold
23 / 46
Applications
Kardashian SVM
Learning algorithm
• Standard approach: Kuadratic Programming (KP)
• But see Kurvature of optimization manifold
• Take advantage of geometry: Konvex-Koncave Procedure
(KKP)
24 / 46
Applications
Kardashian SVM
Experiment: Kardashian or Cardassian?
(a) Kardashian - (l. to r.)
Kim, Khloé,
Kourtney, Kris
(b) Cardassian - (l. to r.)
Gul Dukat,
Elim Garak
(c) A failure case (or is it?):
Kardashian - Rob
Our “Kardashian or Cardassian” dataset.
25 / 46
Applications
Kardashian SVM
Schematic for Kardashian or Cardassian SVM
wTx – b = 0
Cardassian Union
Cardassia Prime
w
Bajor
Earth
United Federation
of Planets
In the feature space K induced by κ, the decision boundary
between Cardassian and Kardashian lies approximately 5 light years
from Cardassia Prime.
26 / 46
Applications
Graph Kardashiancian
Graph Kardashiancian
27 / 46
Applications
Graph Kardashiancian
Graph Kardashiancian
• The Graph Laplacian `
`i,j


deg(vi ) if i = j
:= −1
if i 6= j and vi is adjacent to vj


0
otherwise.
28 / 46
Applications
Graph Kardashiancian
Graph Kardashiancian
• The Graph Kardashiancian K
Ki,j


deg(vi )
:= −κ


0
if i = j
if i 6= j and vi is Kardashian-adjacent to vj
otherwise.
29 / 46
Applications
Graph Kardashiancian
Graph Kardashiancian
30 / 46
Applications
Graph Kardashiancian
Graph Kardashiancian
• Application: KardashianRank
31 / 46
Applications
Kardashian Kopula
Kardashian Kopula
32 / 46
Applications
Kardashian Kopula
Kardashian Kopula
• Powerful generalization of the Gaussian Copula
cΣ (u) = √
−1
1
1 −1 T −1
exp − Φ (u) Σ − I Φ (u)
2
det Σ
33 / 46
Applications
Kardashian Kopula
Kardashian Kopula
• Powerful generalization of the Gaussian Copula
cΣK (u)
=√
−1
1
1 −1 T −1
exp − K (u) Σ − I K (u)
2
det Σ
34 / 46
Applications
Kardashian Kopula
Kardashian Kopula
• Powerful generalization of the Gaussian Copula
cΣK (u)
=√
−1
1
1 −1 T −1
exp − K (u) Σ − I K (u)
2
det Σ
• Video illustrating the
Kardashian Kopula
(featuring rapper Ray J)
may be found in the
supplementary material
35 / 46
Conclusions and future work
Outline
1 Introduction
Motivation
Related work
2 The Kardashian Kernel
Formalities
On Some Issues Raised by the Kardashian Kernel
3 Applications
Kardashian SVM
Graph Kardashiancian
Kardashian Kopula
4 Conclusions and future work
36 / 46
Conclusions and future work
Celebrity-based Machine Learning
We have exhausted Kardashianity, but currently working on:
37 / 46
Conclusions and future work
Celebrity-based Machine Learning
The Tila Tequila Transform (TTT )
38 / 46
Conclusions and future work
Celebrity-based Machine Learning
The Jensen-Shannon-Jersey-Shore (JS 2 ) divergence
A powerful generalization of The Kardashian-Kulback-Leibler
(KKL) divergence
39 / 46
Conclusions and future work
Celebrity-based Machine Learning
Jamie Lee Curtis Regularization
min ||y −
β (t)
L
X
!
Xβ(t)l ||22
+ λ||β(t) − β(t − 24h)||2
l=1
40 / 46
Conclusions and future work
Celebrity-based Machine Learning
The Richard Pryor Prior
41 / 46
Conclusions and future work
Celebrity-based Machine Learning
The Carrie Fisher Information Matrix
42 / 46
Conclusions and future work
Celebrity-based Machine Learning
Miley Cyrus Markov Chain Monte Carlo (MCMCMC ) methods for
inference
43 / 46
Conclusions and future work
Celebrity-based Machine Learning
Hannah Montana Hidden Markov Models (HMHMHMM).
Train with MCMCMC for best of both worlds!
44 / 46
Conclusions and future work
Celebrity-based Machine Learning
The Orlando Bloom Filter
45 / 46
Conclusions and future work
Celebrity-based Machine Learning
Johnny Depp Belief Nets (JDBNs)
46 / 46
Conclusions and future work
Thank you
47 / 46