Writer-Independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks

Luiz G Hafemann, Robert Sabourin and Luiz Oliveira
June 22, 2016

Luiz Gustavo Hafemann WI Feature Learning for Signature Verification using CNNs 1 / 29

Table of Contents
1 Review on Neural Networks
2 Introduction
3 Proposed solution
4 Experimental Protocol
5 Results and Conclusion

Review of Feedforward Neural Networks
- Models that use composition of functions (in “layers”).
- In the simplest case (a fully connected layer), a layer is simply an affine transformation followed by an element-wise nonlinear function: $h_i^l = f(W_{i:}^l h^{l-1} + b_i^l)$
- They learn a non-linear function of the input (in fact, they are universal function approximators).
- Drawback: non-convex optimization.

Training Neural Networks
- The key idea is to use gradient-based learning: define a differentiable loss function, and use differentiable components.
- Example: for multi-class classification, use the cross-entropy loss $L = -\sum_i \log P(y^{(i)} \mid X^{(i)})$
- Calculate gradients (partial derivatives of the loss w.r.t. each parameter of the network): $\nabla_W L$
- Update the weights iteratively until convergence: $W \leftarrow W - \alpha \nabla_W L$
Figure: Forward propagation (left) and backpropagation (right) (LeCun et al., 2015)

NNs for Computer Vision
- Network architectures that exploit the characteristics of the data.
- Convolutional Neural Networks: $h^{l,c} = W^{l,c} * h^{l-1}$
  - Use local connections (exploit the spatial 2D correlation of pixels).
  - Use shared weights (the same detector is applied in multiple parts of the image).
Figure: Visualization of a convolution operation
- The input to the network is the image itself (pixel intensities).
Figure: LeNet-5 (1989)
- Nowadays, CNNs are used almost exclusively.
- Improvements in architecture (ReLUs, Batch Normalization).
- Improvements in training (Nesterov Momentum, Adagrad, Adadelta, Adam).
- Mostly: more computing power (GPUs) and larger datasets.
Resources:
- theoretical_motivations/
- LeCun, et al. “Deep learning.” (Nature article)
- Bengio, et al. “Deep learning.” (book)

Relevant papers (to this presentation)
- CNNs trained in a purely supervised manner halved the error rates on ImageNet in 2012.
  Krizhevsky, A., et al. “Imagenet classification with deep convolutional neural networks.”
- CNNs learned on one dataset can be used to extract features on other datasets (aka Transfer Learning).
  Donahue, J., et al. “Decaf: A deep convolutional activation feature for generic visual recognition.”
  Razavian, Ali, et al. “CNN features off-the-shelf: an astounding baseline for recognition.”

Transfer Learning
Figure: (Yosinski et al., 2015)

What are the layers doing?
- The last layer is a linear classifier on the activations from the previous layer.
- Layers learn non-linear transformations that “disentangle” the inputs so that in the end the classes are linearly separable.
Figure: Input (2), Hidden (2 sigmoid), Output (1 sigmoid)

Offline Handwritten Signature Verification
- Biometric verification system based on handwritten signatures.
- Widely used to verify a person’s identity in legal, financial and administrative areas.
- “Offline” refers to the acquisition process: scanning the document that contains the signature.

Problem formulation
- Enrollment: the user provides a few genuine samples.
- Operations: a person provides a sample and claims an identity.
- Main challenge: discriminating skilled forgeries, i.e. forgeries created targeting a particular individual.
Figure: Three genuine signatures and one skilled forgery for a set of users
- Other challenges: low number of samples per user; the list of users is not fixed.

Related Work
- There are two main approaches for the task:
  - Writer-Dependent classification: one classifier is trained for each user.
  - Writer-Independent classification: a single classifier is trained, which compares a query signature to a template signature.
- Most of the research effort has been devoted to answering the question: “What is a good representation for signatures?”

Proposed solution
- Learn features for Signature Verification from data.
- Learning features for each user is impractical (low number of samples per user).
- Signatures from different users share common properties.
- Proposed solution: a two-stage approach:
  1. Learn features in a Writer-Independent manner.
  2. Train Writer-Dependent classifiers using the learned representation.
Figure: Overview of the proposed solution. Writer-Independent feature learning: a CNN model is trained on the development dataset. Writer-Dependent training: features are extracted from the signature images of the training set and used to train a binary WD classifier. Generalization (verification): features are extracted from the signature image of a new sample, and the classifier outputs a decision (Accept/Reject).

Writer-Independent Feature Learning
- Learn to classify between users in the Development set.
- Use a deep CNN whose last layer estimates $P(y \mid X)$, where $X$ is the signature image and $y \in Y$ is the user in $D$.
- The input is a signature image of size 155x220 pixels.
- Minimize the cross-entropy $-\sum_i \log P(y^{(i)} \mid X^{(i)})$ using Stochastic Gradient Descent.
Figure: CNN architecture (labels: 155x220 input, Convolutions, Max-pooling, Fully-connected, 4096, N)

Preprocessing
- Noise removal (using Otsu’s algorithm).
- Size normalization:
Figure: (a) Simply resizing to the desired size; (b) centering signatures in a canvas and then resizing (suffix CNNnorm in the experiments)

Writer-Dependent classification
- For each user, we build a training set: r genuine signatures of the user as positive samples, and genuine signatures from users in $D$ as negative samples.
- For each signature $X$ in the training set, we compute the representation $\phi(X)$ by performing forward propagation on the CNN up to the last layer before the softmax.
- We use these feature vectors to train a binary classifier $f$ (an SVM with a linear kernel or with an RBF kernel).
- For a new query signature $X_{new}$, we compute $f(\phi(X_{new}))$.

Datasets

Dataset            #users  #genuine samples  #skilled forgeries
Brazilian PUC-PR   168     30                10
GPDS-960           881     24                30

Figure: Dataset split by users: Users 1 - 160 (160) or Users 1 - 300 (300); Users 161 - 881 (721) or Users 301 - 881 (581). Labels: Users, Samples, Training, Testing.

Experiments
- Experiments on GPDS with the two size-normalization approaches.
- Training the Writer-Dependent classifiers with a variable number of samples per user available for training.
- Learn the feature representation on one dataset (GPDS) and use it to obtain representations for another dataset (Brazilian PUC-PR).

Metrics
We evaluate the following criteria:
- FAR (False Acceptance Rate): % of forgeries accepted as genuine.
- FRR (False Rejection Rate): % of genuine signatures rejected as forgeries.
- EER (Equal Error Rate): % of errors when FAR = FRR, using user-specific thresholds.
- Mean AUC (Area Under the Curve): average of the areas under the per-user ROC curves.

Results
Testing different preprocessing options:

Table: Classification errors on GPDS-160 (%) and mean AUC

Features       Classifier     EER    Mean AUC
CNN GPDS       SVM (Linear)   14.35  0.9153
CNN GPDS       SVM (RBF)      14.64  0.9097
CNN GPDSnorm   SVM (Linear)   11.32  0.9381
CNN GPDSnorm   SVM (RBF)      10.70  0.9459

Results
Performance as we vary the number of samples per user for training.
Figure: Performance with a variable number of samples per user: (a) GPDS-160; (b) Brazilian PUC-PR. Horizontal axis: training set size. Vertical axes: classification error (%) for the EER curve and Area Under the Curve for the mean AUC curve.

Results
Using features learned on GPDS for discriminating users in the PUC-PR dataset.

Table: Classification errors on the Brazilian PUC-PR dataset (%) and mean AUC

Features        Classifier     FRR   FAR skilled  EER   Mean AUC
CNN Brazilian   SVM (Linear)   1.00  27.17        7.33  0.9668
CNN Brazilian   SVM (RBF)      2.83  14.17        4.17  0.9837
CNN GPDSnorm    SVM (Linear)   0.17  29.00        6.67  0.9653
CNN GPDSnorm    SVM (RBF)      2.17  13.00        4.17  0.9800

Results

Table: Comparison with the state-of-the-art on GPDS-160 (errors in %)

Reference        Features            Classifier         FRR    FAR    EER
Hu and Chen      LBP, GLCM, HOG      Adaboost           -      -      7.66
Yilmaz           LBP                 SVM (RBF)          -      -      9.64
Yilmaz           LBP, HOG            Ensemble of SVMs   -      -      6.97
Guerbai et al.   Curvelet transform  OC-SVM             12.5   19.4   -
Present work     CNN GPDSnorm        SVM (RBF)          19.81  5.99   10.70

Results

Table: Comparison with the state-of-the-art on the Brazilian PUC-PR dataset (errors in %). AER(g+s) and EER(g+s) are computed on genuine + skilled samples only.

Reference         Features       Classifier   FRR    FAR random  FAR simple  FAR skilled  AER   AER(g+s)  EER(g+s)
Bertolini et al.  Graphometric   SVM (RBF)    10.16  3.16        2.8         6.48         5.65  8.32      -
Batista et al.    Pixel density  HMM + SVM    7.5    0.33        0.5         13.5         5.46  10.5      -
Rivard et al.     ESC + DPDF     Adaboost     11     0           0.19        11.15        5.59  11.08     -
Eskander et al.   ESC + DPDF     Adaboost     7.83   0.02        0.17        13.5         5.38  10.67     -
Present work      CNN GPDSnorm   SVM (RBF)    2.17   0.17        0.50        13.00        3.96  7.59      4.17

Conclusion
- We can learn features for signature verification in a Writer-Independent way, obtaining results competitive with the state-of-the-art.
- The features generalize well to different users, and even across different datasets.
- Proper preprocessing (e.g. size normalization) is required, since the network requires fixed-size inputs.
- Results in terms of EER are good, but FRR and FAR are imbalanced and not stable across users or datasets.
  This highlights the importance of defining user-specific thresholds (future work).
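
The Metrics slide defines FAR, FRR and an EER computed with user-specific thresholds, and the conclusion notes that FAR and FRR become imbalanced when thresholds are not tuned per user. The following is a minimal sketch of those definitions, not the authors' implementation. It assumes each signature has a real-valued classifier score where higher means "more likely genuine" (for example an SVM decision value), and the function names `far_frr` and `user_eer` are hypothetical.

```python
import numpy as np

def far_frr(genuine_scores, forgery_scores, threshold):
    """Return (FAR, FRR) in %, accepting a signature when score >= threshold.

    Assumption: higher scores mean "more likely genuine".
    """
    genuine_scores = np.asarray(genuine_scores, dtype=float)
    forgery_scores = np.asarray(forgery_scores, dtype=float)
    far = 100.0 * np.mean(forgery_scores >= threshold)  # forgeries accepted
    frr = 100.0 * np.mean(genuine_scores < threshold)   # genuine rejected
    return far, frr

def user_eer(genuine_scores, forgery_scores):
    """Approximate one user's EER by scanning candidate thresholds.

    Returns (eer, threshold) at the point where |FAR - FRR| is smallest.
    """
    candidates = np.unique(np.concatenate([genuine_scores, forgery_scores]))
    # Also consider a threshold above every score (reject everything).
    candidates = np.append(candidates, candidates[-1] + 1.0)

    def gap(t):
        far, frr = far_frr(genuine_scores, forgery_scores, t)
        return abs(far - frr)

    best = min(candidates, key=gap)
    far, frr = far_frr(genuine_scores, forgery_scores, best)
    return (far + frr) / 2.0, float(best)

# Toy scores for one hypothetical user (illustrative values, not real data).
genuine = [2.0, 1.5, 0.4, 1.8]
forgeries = [-1.0, 0.5, -0.3, -2.0]

eer, threshold = user_eer(genuine, forgeries)
print(f"user-specific threshold {threshold:.2f}: EER = {eer:.2f}%")

# A fixed global threshold (here 0.0) can leave FAR and FRR imbalanced.
far, frr = far_frr(genuine, forgeries, 0.0)
print(f"global threshold 0.00: FAR = {far:.2f}%, FRR = {frr:.2f}%")
```

Averaging `user_eer` over all users gives an EER in the sense of the Metrics slide. The scan over observed scores is a simple design choice for small per-user sample counts; an interpolated ROC curve would be an alternative.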