PILL-ID: Matching and Retrieval of Drug Pill Imprint Images
Young-Beom Lee1, Unsang Park2, and Anil K. Jain1,2
1Brain and Cognitive Engineering, Korea University, Korea
2Computer Science and Engineering, Michigan State University, USA
http://Biometrics.cse.msu.edu
• Legal drug pill or illicit drug pill?
• If illicit pill, which cartel manufactured it?
• What is an effective way to identify illicit drugs?
• ~35M people in the U.S. used illicit drugs or abused prescription drugs;
$14B was spent on drug treatment & prevention (2007)
• Prescription pills must be identifiable (by color, shape,
and imprints) per FDA regulations
• Illicit pills (e.g., narcotics) also contain imprints to
identify the cartel or distributor
• Databases of prescription pills and illegal pills
are available (pharmaceutical companies, FBI)
[Figure: a query pill image and its retrievals at ranks 1 to 6, with record contents: Imprint: 5883; Shape: round; Color: brown; Ingredient: MDMA, BZP, TFMPP; Cartel: Gulf]
• Imprint is an indented or printed mark on a
pill, tablet or capsule
• Symbol, text, digits or their combination
[Figure: example imprints on legal drug pills and illicit drug pills]
• Sobel operator to obtain gradient magnitude image
• Segmentation, scale normalization
[Figure: original image and gradient magnitude image]
• Rotation normalization
[Figure: primary & secondary dominant orientations]
• Landmarks (key-points) are selected within a preset radius (SIFT descriptor)
[Figure: multiple templates with rotation variation]
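The first preprocessing step above, the Sobel gradient magnitude, can be sketched as follows. This is a minimal NumPy-only illustration, not the authors' MATLAB code; the function name `sobel_gradient_magnitude` and the replicated-border handling are assumptions made for the example.

```python
import numpy as np

def sobel_gradient_magnitude(gray):
    """Return the Sobel gradient magnitude of a 2-D grayscale array.

    Assumption: borders are handled by replicating edge pixels so the
    output has the same shape as the input.
    """
    gray = np.asarray(gray, dtype=float)
    p = np.pad(gray, 1, mode="edge")
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

# Toy example: a vertical step edge between columns 2 and 3.
image = np.zeros((5, 6))
image[:, 3:] = 1.0
magnitude = sobel_gradient_magnitude(image)
```

On the toy step edge, the magnitude is nonzero only in the two columns adjacent to the edge, which is the kind of imprint-outline response the slides rely on.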
• Gradient magnitude images have smaller intra-class variations
[Figure: original image, gray image, and gradient magnitude image]
Rank-1 accuracy (%) (using 602 query-gallery dataset)
Method                       Gradient magnitude   Grayscale
Optimized SIFT descriptor    90.03                83.55
Images that did not match at rank-1 using SIFT but matched using
the proposed method (fixed key points + SIFT descriptor)
Method                          Number of key-points (Min / Max / Avg.)   Rank-1 accuracy (%)
Original SIFT                   17 / 340 / 126                            43.02
Our method (SIFT descriptor)    29                                        90.03
Red dots: SIFT key points, Blue dots: preset key points
• Select a set of key-points
• Collect gradient magnitude and
orientation with Gaussian
weighting and tri-linear
interpolation
• Truncation
• Length of feature vector:
4 × 4 × 8 = 128 per key-point
128 × 29 = 3,712 per image
[Figure: Gaussian weighting (Gaussian window centered at a key point), tri-linear interpolation, and truncation]
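The truncation step and the feature-length arithmetic above can be illustrated with a short sketch. It assumes the common SIFT convention of normalizing to unit length, clipping at the threshold, and renormalizing; the slides name only "truncation", so the exact order of operations and the helper name `truncate_descriptor` are assumptions.

```python
import numpy as np

CELLS_PER_AXIS = 4   # 4 x 4 spatial cells around each key-point
ORIENT_BINS = 8      # 8 orientation bins per cell
NUM_KEYPOINTS = 29   # preset key-points per image (from the slides)

def truncate_descriptor(vec, threshold=0.2):
    """Normalize, clip large entries at `threshold`, then renormalize.

    Assumption: this follows the usual SIFT convention; the slides do not
    spell out the normalization around the truncation step.
    """
    v = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    v = np.minimum(v / norm, threshold)
    return v / np.linalg.norm(v)

per_keypoint = CELLS_PER_AXIS * CELLS_PER_AXIS * ORIENT_BINS  # 128
per_image = per_keypoint * NUM_KEYPOINTS                      # 3712

# Toy descriptor with one dominant bin that truncation should damp.
raw = np.ones(per_keypoint)
raw[0] = 10.0
clipped = truncate_descriptor(raw, threshold=0.2)
```

Clipping limits the influence of a few very strong gradients, which is why the threshold value (0.2, 0.5, or 1) is one of the parameters tuned later in the slides.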
• LBP histograms with multiple neighborhood parameters (P,R) are
created and concatenated
[Figure: LBP neighborhoods for (P=8, R=1.0), (P=4, R=1.0), and (P=12, R=2.0)]
• Feature vectors are constructed with the following parameters
(P, R)      Window size   Shift value
U(8, 1)     20 × 20       4
U(4, 1)     10 × 10       2
U(12, 2)    30 × 30       6
• Length of feature vector:
U(8,1) = 59, (4,1) = 16, U(12,2) = 135
59 × (13 × 13) + 16 × (31 × 31) + 135 × (7 × 7) = 31,962
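The feature-length arithmetic above can be checked with a small sketch. It assumes a square 70 × 70 input (the image size reported for the best MLBP configuration elsewhere in these slides) and that windows slide without padding; the histogram bin counts 59, 16, and 135 are taken directly from the slide, and the function names are hypothetical.

```python
def windows_per_axis(image_size, window, shift):
    """Number of window positions along one axis, with no padding."""
    return (image_size - window) // shift + 1

def mlbp_length(image_size, configs):
    """Total MLBP length: sum of bins x windows^2 over all (P, R) settings."""
    total = 0
    for bins, window, shift in configs:
        n = windows_per_axis(image_size, window, shift)
        total += bins * n * n
    return total

# (histogram bins, window size, shift value) for (8,1), (4,1), (12,2).
CONFIGS = [(59, 20, 4), (16, 10, 2), (135, 30, 6)]

counts = [windows_per_axis(70, w, s) for _, w, s in CONFIGS]
length = mlbp_length(70, CONFIGS)
print(counts, length)  # [13, 31, 7] 31962
```

The window counts 13, 31, and 7 reproduce the factors in the slide's formula, which supports the 70-pixel assumption.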
• Given a query image (q) and N gallery images (g), the K
feature vectors of the query are compared with the Ln feature
vectors of the nth gallery image (n = 1 to N) using the L2 norm.
• Ln is different for each gallery image
• The ID of the closest match in the gallery is selected as the ID
of the query
[Figure: the K_m feature vectors q_m^i of query image m (i = 1, …, K_m) are compared with the L_n feature vectors g_n^j of each gallery image n (j = 1, …, L_n; n = 1, …, N)]
$$\mathrm{ID}_m = \arg\min_{n} \, \min_{i,j} \, d(q_m^{i}, g_n^{j})$$
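The matching rule above can be sketched as a nearest-neighbor search over all template pairs. This is an illustrative brute-force version, assuming each image is stored as an array with one row per template feature vector; the names `template_distance` and `identify` and the dictionary layout of the gallery are hypothetical.

```python
import numpy as np

def template_distance(query_templates, gallery_templates):
    """Smallest L2 distance between any query and any gallery template."""
    q = np.asarray(query_templates, dtype=float)    # shape (K, D)
    g = np.asarray(gallery_templates, dtype=float)  # shape (L_n, D)
    diff = q[:, None, :] - g[None, :, :]            # shape (K, L_n, D)
    return float(np.sqrt((diff ** 2).sum(axis=2)).min())

def identify(query_templates, gallery):
    """Return the gallery ID whose templates are closest to the query.

    `gallery` maps an ID to an (L_n, D) array; L_n may differ per image.
    """
    return min(gallery,
               key=lambda gid: template_distance(query_templates, gallery[gid]))

# Toy 2-D example with a different number of templates per gallery image.
gallery = {
    "a": [[0.0, 0.0], [10.0, 10.0]],
    "b": [[5.0, 5.0]],
    "c": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
}
query = [[9.0, 9.0], [20.0, 20.0]]
best = identify(query, gallery)
```

Taking the minimum over template pairs is what lets the multiple rotated templates absorb rotation differences between query and gallery images.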
• 822 illicit drug pill images from the Australian Federal
Police; 138 illicit drug pill images and 14,003 legal pill
images from the U.S. DEA website, Drug information
online and pharmer.org
• Image size: from 48 × 42 to 2,088 × 1,550 pixels; 96 dpi
• Query set: 602 illicit drug pills with duplicate images of
the same imprint pattern (88 distinctive patterns)
• Gallery set: 960 (illicit drug pill images) + 14,003 (legal
drug pill images) = 14,963 images
• Leave-one-out method to match each of the 602 queries against
the remaining 14,962 gallery images
• SIFT descriptor parameters are optimized for pill imprint matching
1. Smoothing
2. Gradient orientation & magnitude
3. Gaussian weighting
4. Trilinear interpolation
5. Truncation with threshold values of 0.2, 0.5 and 1
Rank-1 accuracy (%) by method
Method                                     Truncation value   Rotation normalization   Edge image   Grayscale image
SIFT with 1, 2, 3, 4, 5 (Original SIFT)    0.2                No                       83.89        83.39
SIFT with 2, 3, 4, 5                       0.2                No                       87.87        78.74
SIFT with 2, 4, 5                          0.2                No                       88.70        79.57
SIFT with 2, 5                             0.2                No                       87.54        81.56
SIFT with 2, 4, 5                          0.5                No                       87.71        -
SIFT with 2, 4, 5                          1.0                No                       87.71        -
SIFT with 2, 4, 5                          0.2                Yes                      90.03        -
• 602 query and 14,962 gallery images
Method                     Rank 1 (%)   Rank 20 (%)
MLBP                       64.78        82.72
SIFT descriptor            82.72        90.20
SIFT (0.7)+MLBP (0.3)      84.39        91.53
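The fused row in the table above combines the two matchers with weights 0.7 and 0.3. A minimal sketch of such a weighted-sum fusion follows; the slides give only the weights, so the min-max normalization of each matcher's distances and the function names are assumptions.

```python
import numpy as np

def min_max(scores):
    """Scale scores to [0, 1]; returns zeros if all scores are equal."""
    s = np.asarray(scores, dtype=float)
    span = s.max() - s.min()
    return np.zeros_like(s) if span == 0 else (s - s.min()) / span

def fuse(sift_dist, mlbp_dist, w_sift=0.7, w_mlbp=0.3):
    """Weighted sum of normalized distances (smaller means more similar).

    Assumption: distances are min-max normalized per query before fusion.
    """
    return w_sift * min_max(sift_dist) + w_mlbp * min_max(mlbp_dist)

# Toy distances from one query to three gallery images.
sift_dist = [0.9, 0.2, 0.5]
mlbp_dist = [10.0, 40.0, 20.0]
fused = fuse(sift_dist, mlbp_dist)
ranking = np.argsort(fused, kind="stable")  # gallery indices, best first
```

Normalizing first keeps the matcher with the larger raw distance range from dominating the sum, so the 0.7/0.3 weights express the intended relative trust.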
[Figure: example queries with their top-6 retrievals]
• Queries that were not correctly retrieved in the top 20 matches
[Figure: queries, top-6 retrievals, and rank of the true mate; ranks shown: 13042, 12841, 3402, 3259, 1897]
− Illumination noise in the background
− Similar shape and imprints
− Very similar pattern between query and top retrieved images
• Numeric or text information in imprints can be used for matching/filtering
[Figure: a pill with imprint "5883" and its record: Imprint: 5883; Shape: round; Color: brown; Ingredient: MDMA, BZP, TFMPP; Cartel: Gulf]
[Figure: retrievals for a query with Shape: round; Color: pink; Text: no; Numbers: no. Using only imprints: ranks 1 to 7, …, 97. Using imprint, shape and color: ranks 1 to 7, …, 15.]
Content based matching can reduce retrieval errors
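One way to picture this kind of filtering is to drop gallery entries whose shape or color disagrees with the query before ranking by imprint distance. The sketch below is purely illustrative: the record fields, the exact-match rule, and the function name `filter_then_rank` are assumptions, and the slides do not say how shape and color were actually combined with the imprint score.

```python
def filter_then_rank(query_attrs, gallery, distances):
    """Keep gallery records matching all query attributes, then rank them.

    gallery:   list of dicts with an "id" plus attribute fields.
    distances: dict mapping gallery id to imprint distance (smaller is better).
    """
    keep = [g for g in gallery
            if all(g.get(key) == value for key, value in query_attrs.items())]
    return [g["id"] for g in sorted(keep, key=lambda g: distances[g["id"]])]

# Hypothetical records and imprint distances for one query.
gallery = [
    {"id": "g1", "shape": "round", "color": "pink"},
    {"id": "g2", "shape": "oval", "color": "pink"},
    {"id": "g3", "shape": "round", "color": "brown"},
    {"id": "g4", "shape": "round", "color": "pink"},
]
distances = {"g1": 0.9, "g2": 0.1, "g3": 0.2, "g4": 0.5}

imprint_only = sorted(distances, key=distances.get)
filtered = filter_then_rank({"shape": "round", "color": "pink"}, gallery, distances)
```

In this toy case the record `g1` moves from rank 4 to rank 2 once mismatching shapes and colors are removed, mirroring the rank improvement shown in the figure.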
• Proposed an image retrieval system for identifying illicit drugs
• 84.39% rank-1 (91.53% rank-20) accuracy with ~600 query and
~15K gallery images
• Evaluated two image descriptors (SIFT and MLBP) & their fusion;
rotation invariant matching scheme was used
• Computation time: 2.3 (0.5) sec/image for feature extraction
and 13.0 (4.0) sec per query against the ~15K gallery for SIFT
(MLBP); MATLAB code running on a 2.8 GHz CPU with 8 GB RAM
• Future work
– Content based matching/filtering
– Evaluation on a larger database; collaboration with AFP
– More efficient matching scheme
• If we can identify numbers or text in
imprints, content based methods can be
used.
[Figure: examples of number and text imprints; Number: 5883, Text: WYETH]
• MLBP was also evaluated with various parameters on the 602 query-gallery dataset to optimize it for pill imprint matching
1. Number of LBPs
2. Sub-region (window size, shift value)
3. Input image size
LBP                            Sub-region (window size, shift value)   Image size   Rank-1 accuracy (%)
LBP^u2 (8,1)+(4,1)             No                                      60           51.01
LBP^u2 (8,1)+(4,1)+(12,2)      No                                      60           54.15
LBP^u2 (8,1)+(4,1)+(12,2)      No                                      70           55.81
LBP^u2 (8,1)+(4,1)+(12,2)      (32, 8)(16, 4)(48, 12)                  70           63.12
LBP^u2 (8,1)+(4,1)+(12,2)      (16, 4)(8, 2)(24, 6)                    70           65.78
LBP^u2 (8,1)+(4,1)+(12,2)      (20, 4)(10, 2)(30, 6)                   70           75.42
[Figure: gradient magnitude image, its orientation histogram, and the resulting multiple templates]
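The orientation histogram in this figure relates to the primary and secondary dominant orientations used for rotation normalization. A minimal sketch of picking the two strongest orientations follows; the 36-bin resolution, the magnitude weighting, and the function name `dominant_orientations` are assumptions borrowed from common SIFT practice, not details confirmed by the slides.

```python
import numpy as np

def dominant_orientations(gx, gy, n_bins=36, top=2):
    """Return the centers (degrees) of the `top` strongest orientation bins.

    gx, gy: arrays of horizontal and vertical gradients.
    Each gradient votes for its orientation bin, weighted by its magnitude.
    """
    gx = np.asarray(gx, dtype=float)
    gy = np.asarray(gy, dtype=float)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist, edges = np.histogram(angle, bins=n_bins, range=(0.0, 360.0),
                               weights=magnitude)
    centers = (edges[:-1] + edges[1:]) / 2.0
    strongest = np.argsort(hist)[::-1][:top]
    return [float(centers[i]) for i in strongest]

# Toy gradients: ten pointing along 0 degrees, four along 225 degrees.
gx = np.array([1.0] * 10 + [-1.0] * 4)
gy = np.array([0.0] * 10 + [-1.0] * 4)
primary, secondary = dominant_orientations(gx, gy)
```

Rotating the image by each dominant orientation would then yield one template per orientation, which is one plausible reading of the "multiple templates" in the figure.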