Writer-Independent Feature Learning for Offline
Signature Verification using Deep Convolutional Neural
Networks
Luiz G Hafemann, Robert Sabourin and Luiz Oliveira
June 22, 2016
Luiz Gustavo Hafemann
WI Feature Learning for Signature Verification using CNNs
1 / 29
Table of Contents
1 Review on Neural Networks
2 Introduction
3 Proposed solution
4 Experimental Protocol
5 Results and Conclusion
Review of Feedforward Neural Networks
Models that use composition of functions (in “layers”)
In the simplest case (fully connected layer), simply an affine
transformation followed by an element-wise nonlinear function:
$h_i^l = f(W_{i:}^l h^{l-1} + b_i^l)$
Learn a non-linear function of the input (in fact, they are universal
function approximators). Drawback: non-convex optimization
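A minimal sketch of the fully connected layer described above, in plain Python. The weights, biases and inputs are illustrative values that do not come from the slides, and the choice of ReLU and sigmoid for f is an assumption made only for this example.

```python
import math

def dense_layer(weights, bias, inputs, activation):
    """Compute h_i = f(W_i: . h_prev + b_i) for every unit i of the layer."""
    outputs = []
    for row, b in zip(weights, bias):
        pre_activation = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(pre_activation))
    return outputs

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

# Illustrative parameters (not from the source).
x = [1.0, -2.0]
W1 = [[0.5, -1.0], [1.5, 0.25]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0]]
b2 = [0.0]

# Two stacked layers: the network is a composition of layer functions.
h1 = dense_layer(W1, b1, x, relu)
h2 = dense_layer(W2, b2, h1, sigmoid)
```

Stacking the two calls is what makes the model a composition of functions: the output of one layer is the input of the next.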
Training Neural Networks
Key idea is to use gradient-based learning: define a loss function that
is differentiable, and use components that are differentiable
Example: for multi-class classification, use cross-entropy loss
$L = -\sum_i \log P(y^{(i)} \mid X^{(i)})$
Calculate gradients (partial derivatives w.r.t. each parameter of the
network): $\nabla_W L$
Update weights iteratively until convergence: $W \leftarrow W - \alpha \nabla_W L$
Figure: Forward propagation (left) and backpropagation (right) (LeCun et al., 2015)
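The cross-entropy loss and the iterative weight update can be sketched on a toy problem. This is only an illustration under stated assumptions: a single softmax layer instead of a deep network, made-up two-dimensional data, and a learning rate and epoch count chosen arbitrarily.

```python
import math
import random

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    scores = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(scores)

def cross_entropy(W, b, data):
    """L = - sum_i log P(y_i | x_i)."""
    return -sum(math.log(predict_proba(W, b, x)[y]) for x, y in data)

def sgd_step(W, b, x, y, lr):
    """One update W <- W - lr * grad, computed on a single example (x, y)."""
    p = predict_proba(W, b, x)
    for k in range(len(W)):
        # Gradient of -log P(y|x) w.r.t. the score of class k is p_k - 1[k == y].
        delta = p[k] - (1.0 if k == y else 0.0)
        for j in range(len(x)):
            W[k][j] -= lr * delta * x[j]
        b[k] -= lr * delta

# Toy data (made up): 3 classes, 2 features.
data = [([1.0, 0.0], 0), ([0.9, 0.1], 0),
        ([0.0, 1.0], 1), ([0.1, 0.9], 1),
        ([-1.0, -1.0], 2), ([-0.8, -1.1], 2)]

rng = random.Random(0)
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]

loss_before = cross_entropy(W, b, data)
for epoch in range(200):
    rng.shuffle(data)
    for x, y in data:
        sgd_step(W, b, x, y, lr=0.5)
loss_after = cross_entropy(W, b, data)
```

In a deep network the same update is applied to every layer, with the gradients obtained by backpropagation rather than by the closed-form expression used here.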
NNs for Computer Vision
Network architectures that exploit the characteristics of the data:
Convolutional Neural Networks: $h^{l,c} = W^{l,c} * h^{l-1}$
Uses local connections (exploits the spatial 2D correlation of pixels)
Uses shared weights (same detector is used in multiple parts of the
image)
Figure: Visualization of a convolution operation
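The two properties listed above (local connections and shared weights) can be seen in a small plain-Python sketch of a single-channel convolution. The image and the kernel are made-up values; as in most deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image, keeping only "valid" positions."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            # Local connection: each output only sees a kh x kw patch.
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        output.append(row)
    return output

# Made-up image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Shared weights: the same detector is applied at every position.
vertical_edge = [[-1, 1],
                 [-1, 1]]
feature_map = conv2d_valid(image, vertical_edge)
```

The feature map responds only where the patch contains the edge, regardless of where in the image that edge is located.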
The input to the network is the image itself (pixel intensities)
Figure: LeNet-5 (1998)
Nowadays, computer vision uses CNNs almost exclusively
Improvements in architecture (ReLUs, Batch Normalization)
Improvements in training (Nesterov Momentum, Adagrad, Adadelta, Adam)
Mostly: more computing power (GPUs) and larger datasets
Resources:
http://videolectures.net/deeplearning2015_bengio_theoretical_motivations/
LeCun, et al. “Deep learning.” (Nature article)
Bengio, et al. “Deep learning.” (book)
Relevant papers (to this presentation)
CNNs trained in a purely supervised manner halved the error rates on
ImageNet in 2012
Krizhevsky, A, et al. “Imagenet classification with deep convolutional
neural networks.”
CNNs trained on one dataset can be used to extract features for other
datasets (aka Transfer Learning)
Donahue, J, et al. “Decaf: A deep convolutional activation feature for
generic visual recognition.”
Razavian, Ali, et al. “CNN features off-the-shelf: an astounding
baseline for recognition.”
Transfer Learning
Figure: (Yosinski et al., 2015)
What are the layers doing?
Last layer is a linear classifier on the activations from the previous
layer
Layers learn non-linear transformations that “disentangle” the inputs
so that in the end the classes are linearly separable
Figure: A network with 2 input units, a hidden layer of 2 sigmoid units, and 1 sigmoid output unit
http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
Offline Handwritten Signature Verification
Biometric verification system based on handwritten signatures
Widely used to verify a person’s identity in legal, financial and
administrative areas
“Offline” refers to the acquisition process: scanning the document
that contains the signature.
Problem formulation
Enrollment: the user provides a few genuine samples.
Operations: a person provides a sample and claims an identity
Main challenge: discriminating skilled forgeries, i.e. forgeries created
targeting a particular individual
Figure: Three genuine signatures and one skilled forgery for a set of users
Other challenges: low number of samples per user; the list of users is not fixed
Related Work
There are two main approaches for the task:
Writer-Dependent classification: one classifier is trained for each user
Writer-Independent classification: a single classifier is trained, which
compares a query signature to a template signature
Most of the research effort has been devoted to answering the question: “What is
a good representation for signatures?”
Proposed solution
Learn features for Signature Verification from data.
Learning features for each user is impractical (few samples per user).
Signatures from different users share common properties
Proposed solution: a two-stage approach:
Learn features in a Writer-Independent manner
Train Writer-Dependent classifiers using the learned representation
Proposed solution
Figure: Overview of the two-stage approach. Writer-Independent feature learning: CNN training on the Development dataset produces the CNN model. Writer-Dependent training: features are extracted from the signature images of the training set and used to train a binary WD classifier. Generalization (verification): features are extracted from the signature image of a new sample and the classifier outputs a decision (Accept/Reject).
Writer-Independent Feature Learning
Learn to classify between users in the Development set.
Using a Deep CNN, with a last layer that estimates P(y|X), where X
is the signature image, and y ∈ Y is the user in D (the Development set)
Input is a signature image of size 155x220 pixels.
Minimize cross-entropy: $-\sum_i \log P(y^{(i)} \mid X^{(i)})$ using Stochastic
Gradient Descent
Figure: CNN diagram (labels: 155x220 input, Convolutions, Max-pooling, Fully-connected, 4096, N)
Preprocessing
Noise removal (using Otsu’s algorithm)
Size Normalization:
Figure: (a) Simply resizing to desired size; (b) centering signatures in a canvas
and then resizing (suffix CNNnorm in the experiments)
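The two preprocessing steps can be sketched in plain Python. Several details are assumptions made for illustration and are not stated in the slides: pixels brighter than the Otsu threshold are treated as background and set to white, the signature is centered using its bounding box, and the final resizing step is omitted. The tiny `scan` image is made up.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximises the between-class variance.

    Pixels with intensity <= t form one class, the remaining pixels the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0
    sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def remove_background(image, white=255):
    """Assumption: pixels brighter than the Otsu threshold are background noise."""
    t = otsu_threshold([p for row in image for p in row])
    return [[p if p <= t else white for p in row] for row in image]

def center_on_canvas(image, canvas_h, canvas_w, white=255):
    """Paste the ink bounding box in the middle of a white canvas.

    Assumes the image contains ink and that the bounding box fits the canvas.
    """
    ink_rows = [r for r, row in enumerate(image) if any(p != white for p in row)]
    ink_cols = [c for c in range(len(image[0])) if any(row[c] != white for row in image)]
    top, bottom = ink_rows[0], ink_rows[-1]
    left, right = ink_cols[0], ink_cols[-1]
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]
    off_r = (canvas_h - len(crop)) // 2
    off_c = (canvas_w - len(crop[0])) // 2
    canvas = [[white] * canvas_w for _ in range(canvas_h)]
    for r, row in enumerate(crop):
        canvas[off_r + r][off_c:off_c + len(row)] = row
    return canvas

# Made-up 3x4 grayscale scan: dark ink on a noisy light background.
scan = [
    [250, 240, 245, 250],
    [250,  20,  30, 235],
    [245,  25, 240, 250],
]
clean = remove_background(scan)
canvas = center_on_canvas(clean, canvas_h=5, canvas_w=6)
```

Centering on a fixed canvas before resizing keeps the aspect ratio of each signature, whereas resizing directly to the target size distorts it.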
Writer-Dependent classification
For each user, we build a training set: r genuine signatures as positive
samples, and genuine signatures from users in D as negative samples
For each signature X in the training set, we compute the
representation φ(X ) by performing forward propagation on the CNN
until the last layer before softmax
We use these feature vectors to train a binary classifier f (an SVM,
either linear or with an RBF kernel)
For a new query signature X_new, we compute f(φ(X_new))
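The Writer-Dependent stage can be sketched as follows. This is a simplified illustration, not the authors' implementation: the feature vectors φ(X) are hypothetical three-dimensional values standing in for CNN activations, only the linear case is shown, and the SVM is trained with a basic stochastic subgradient method written for this example instead of an SVM library.

```python
import random

def build_training_set(user_genuine, dev_genuine):
    """Positives: the user's r genuine signatures.
    Negatives: genuine signatures from users in D."""
    X = list(user_genuine) + list(dev_genuine)
    y = [1] * len(user_genuine) + [-1] * len(dev_genuine)
    return X, y

def train_linear_svm(X, y, reg=0.01, lr=0.1, epochs=200, seed=0):
    """Stochastic subgradient descent on reg/2 * ||w||^2 + hinge loss.

    Hyperparameter values are arbitrary choices for this sketch.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * reg) for wj in w]  # regularisation step
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def decision_function(w, b, features):
    """f(phi(X_new)): positive means accept, negative means reject."""
    return sum(wj * xj for wj, xj in zip(w, features)) + b

# Hypothetical feature vectors phi(X); real ones would come from the CNN.
user_feats = [[0.9, 0.1, 0.2], [1.0, 0.0, 0.3], [0.8, 0.2, 0.1]]  # r = 3
dev_feats = [[0.1, 0.9, 0.5], [0.0, 1.0, 0.4], [0.2, 0.8, 0.6], [0.1, 0.7, 0.9]]

X, y = build_training_set(user_feats, dev_feats)
w, b = train_linear_svm(X, y)
score_genuine_like = decision_function(w, b, [0.95, 0.05, 0.2])
score_other = decision_function(w, b, [0.05, 0.95, 0.5])
```

One such classifier is trained per user, while the feature extractor is shared by all users.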
Datasets
Dataset            #users   #genuine samples   #skilled forgeries
Brazilian PUC-PR   168      30                 10
GPDS-960           881      24                 30
Dataset split (figure with axes Users and Samples, and regions labelled Training and Testing):
Users 1 - 160 (160) or Users 1 - 300 (300)
Users 161 - 881 (721) or Users 301 - 881 (581)
Experiments
Experiments on GPDS with the two size-normalization approaches
Training the Writer-Dependent classifiers with a variable number of samples
per user available for training
Learn the feature representation on one dataset (GPDS) and use it to
obtain representations for another dataset (Brazilian PUC-PR)
Metrics
We evaluate the following criteria:
FAR (False Acceptance Rate): % of forgeries accepted as genuine
FRR (False Rejection Rate): % of genuine signatures rejected as
forgeries
EER (Equal Error Rate): % of errors when FAR = FRR, using
user-specific thresholds.
Mean AUC (Area Under the Curve): Average area under ROC curves
for each user
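The error rates defined above can be computed from classifier scores as in the sketch below. The scores are made up, and the convention that a signature is accepted when its score is at or above the threshold is an assumption of this example. The EER is approximated by scanning candidate thresholds and taking the point where FAR and FRR are closest.

```python
def far_frr(genuine_scores, forgery_scores, threshold):
    """Return (FAR, FRR) in %; a signature is accepted when score >= threshold."""
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in forgery_scores if s >= threshold)
    frr = 100.0 * false_rejects / len(genuine_scores)
    far = 100.0 * false_accepts / len(forgery_scores)
    return far, frr

def equal_error_rate(genuine_scores, forgery_scores):
    """Return (EER, threshold) at the candidate threshold where |FAR - FRR| is smallest."""
    candidates = sorted(set(genuine_scores) | set(forgery_scores))
    candidates.append(candidates[-1] + 1.0)  # a threshold that rejects everything
    best = None
    for t in candidates:
        far, frr = far_frr(genuine_scores, forgery_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]

# Made-up scores for one user.
genuine = [0.9, 0.8, 0.7, 0.4]
forgery = [0.6, 0.3, 0.2, 0.1]

far, frr = far_frr(genuine, forgery, threshold=0.5)
eer, eer_threshold = equal_error_rate(genuine, forgery)
```

With user-specific thresholds, `equal_error_rate` would be called once per user on that user's scores and the resulting rates averaged.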
Results
Testing different preprocessing options:
Table: Classification errors on GPDS-160 (%) and mean AUC
Features       Classifier     EER     Mean AUC
CNN GPDS       SVM (Linear)   14.35   0.9153
CNN GPDS       SVM (RBF)      14.64   0.9097
CNN GPDSnorm   SVM (Linear)   11.32   0.9381
CNN GPDSnorm   SVM (RBF)      10.70   0.9459
Results
Performance as we vary the number of samples per user for training
Figure: EER (classification error, %) and mean AUC (Area Under the Curve) as a function of training set size; (a) GPDS-160, (b) Brazilian PUC-PR
Results
Using features learned in GPDS for discriminating users in the PUC-PR
dataset
Table: Classification errors on the Brazilian PUC-PR dataset (%) and mean AUC
Features        Classifier     FRR    FAR skilled   EER    Mean AUC
CNN Brazilian   SVM (Linear)   1.00   27.17         7.33   0.9668
CNN Brazilian   SVM (RBF)      2.83   14.17         4.17   0.9837
CNN GPDSnorm    SVM (Linear)   0.17   29.00         6.67   0.9653
CNN GPDSnorm    SVM (RBF)      2.17   13.00         4.17   0.9800
Results
Table: Comparison with the state of the art on GPDS-160 (errors in %)
Reference        Features             Classifier         FRR     FAR    EER
Hu and Chen      LBP, GLCM, HOG       Adaboost           -       -      7.66
Yilmaz           LBP                  SVM (RBF)          -       -      9.64
Yilmaz           LBP, HOG             Ensemble of SVMs   -       -      6.97
Guerbai et al.   Curvelet transform   OC-SVM             12.5    19.4   -
Present work     CNN GPDSnorm         SVM (RBF)          19.81   5.99   10.70
Results
Table: Comparison with the state of the art on the Brazilian PUC-PR dataset
(errors in %)
Reference          Features       Classifier   FRR     FAR random   FAR simple   FAR skilled   AER    AER genuine + skilled   EER genuine + skilled
Bertolini et al.   Graphometric   SVM (RBF)    10.16   3.16         2.8          6.48          5.65   8.32                    -
Batista et al.     Pixel density  HMM + SVM    7.5     0.33         0.5          13.5          5.46   10.5                    -
Rivard et al.      ESC + DPDF     Adaboost     11      0            0.19         11.15         5.59   11.08                   -
Eskander et al.    ESC + DPDF     Adaboost     7.83    0.02         0.17         13.5          5.38   10.67                   -
Present Work       CNN GPDSnorm   SVM (RBF)    2.17    0.17         0.50         13.00         3.96   7.59                    4.17
Conclusion
We can learn features for signature verification in a
Writer-Independent way, competitive with the state-of-the-art
Features generalize well to different users, and even across different
datasets
Proper preprocessing techniques (e.g. size normalization) are required
since the network requires fixed-size inputs
Results in terms of EER are good, but FRR and FAR are imbalanced
and not stable across users or datasets. This highlights the
importance of defining user-specific thresholds (future work)