WG4 PatMedia Description

Transcription

WG4 PatMedia Description
PatMedia - Patent Image Retrieval
Stefanos Vrochidis
Centre for Research and Technology Hellas
Informatics and Telematics Institute
MUMIA
WG4: Semantic Search, Faceted Search and Visualization
Vienna, 9 June 2011
CERTH-ITI Multimedia Group
• Personnel
•
25 people (researchers,
(researchers developers
developers, administration)
• Participation in European and national research
projects:
•
•
FP7: WeKnowIt (coordination), JUMAS, CHORUS+,
PESCaDO
FP6: AceMedia,
AceMedia X-Media,
X Media MESH,
MESH BOEMIE,
BOEMIE VIDI-Video,
VIDI Video
K-Space, PATExpert, ELU, etc.
• Contracts with Industryy
• 30 Journal publications, 100+ conference
publications, 18 book chapters, 3 patents
• Numerous events: ACM CIVR09, WWW09 tutorial,
WIAMIS 2007, etc.
Centre for Research and Technology Hellas
Informatics and Telematics Institute
2
Expertise Relevant to WP4
Main research interests
•
Semantics
•
•
Image & Video analysis
•
•
feature extraction, content classification, concept
extraction, event detection
Image & Video Retrieval
•
•
ontology design, reasoning
Search engines, multimodal retrieval
P t t IImage R
Patent
Retrieval
ti
l
Centre for Research and Technology Hellas
Informatics and Telematics Institute
3
PatMedia
• Patent Image Retrieval system
• Query by visual example was presented in
IRFS 2008
• Evaluated by professional patent searchers in
IRFS 2010
p extraction presented
p
in IRFS 2011
• Concept
Centre for Research and Technology Hellas
Informatics and Telematics Institute
4
Introduction
•
•
Why patent images are important
•
Figures, drawings and diagrams are almost always
patents, as a means to further specify
p
y the
contained in p
objects and ideas to be patented.
•
Image examination is important to patent experts in their
attempt to deeply understand the patent contents and find
relevant inventions and prior art
Advantages
g
•
•
Images are the same in every language
Problems
•
Low quality, handwritten labels
•
Complex black and white images
Centre for Research and Technology Hellas
Informatics and Telematics Institute
5
Overview
• PatMedia Framework
• P
Patent page segmentation
i into
i
figures
fi
• Figure textual descriptions extraction and
matching to corresponding images
• Visual Feature extraction from patent images
• PatMedia Retrieval Engine
• Image Retrieval by visual similarity
• Textual search on figure captions
• Concept based Retrieval based on visual and
textual features
Centre for Research and Technology Hellas
Informatics and Telematics Institute
6
PatMedia
•
Page
g Segmentation
g
to single
g figures
g
Frequently, a patent drawing page
contains more than one figure.
An image segmentation step that
splits a multi - figure page to single
images has to be applied.
•
•
•
•
Each separate figure
g
in patents is
accompanied
d by
b a label
l b l off the
h form
f
“Figure x”.
OCR for Figure Label Detection.
This task is particularly challenging in certain cases:
•
•
•
a single figure consists of spatially disjoint elements or when multiple
figures are adjacent.
Figure Labels are hand written.
written
Centre for Research and Technology Hellas
Informatics and Telematics Institute
7
Visual Features
Binary CBIR
•
A special case of Content Based Image Retrieval
Patent images are mostly binary (black and white) images
•
•
•
th d
they
do nott contain
t i any color
l or texture
t t
iinformation
f
ti
Use shape-based feature vector extraction methods
•
•
describe the image geometric information accurately
Patent databases contain a vast amount of patents (and so
patent drawings)
•
•
Feature vector extraction and retrieval techniques need to be
computational inexpensive
Extraction of Adaptive Hierarchical Density Histograms
(AHDH) as visual feature vectors [1]
•
•
•
Global visual features based on the pixel distribution of a
d
drawing
i
Low dimension feature vector (~100 features)
[1] P. Sidiropoulos, S. Vrochidis, I. Kompatsiaris, "Content-Based Binary Image Retrieval using the Adaptive
Hierarchical Density Histogram
Histogram",, Pattern Recognition Journal, Elsevier, Volume 44, Issue 4, pp 739739
750, April 2011.
Centre for Research and Technology Hellas
Informatics and Telematics Institute
8
Textual Features
Textual information extraction
•
Example: Patent US 20020152637 A1
FIG. 7 shows the reversible tongue containing
a pocket in its upper half, and which may be
secured by Velcro, or the like, into closure
Bag of words technique
•
Indexing
g textual information using
g Lemur [2]
[ ]
Stemming using Porter stemmer
Creation of Lexicon using training data
Feature vector includes lexicon term weights
•
•
•
•
•
•
<boot snowboard illustr tongu footwear heel...>
[0 0 0 0.0909091 0 0...]
[2] http://www.lemurproject.org/
http://www lemurproject org/
Centre for Research and Technology Hellas
Informatics and Telematics Institute
9
Query by Visual Example
•
•
•
•
The images are ranked based on L1 of the AHDH
f t
features.
The system was tested with more than 300.000
patent images from around 10.000 patents of the
optical recording domain
Query time was around 5-7seconds.
An R-tree structure was used
d to speed
d up the
h
retrieval process
Centre for Research and Technology Hellas
Informatics and Telematics Institute
10
Concept Extraction Framework
• Supervised
p
Machine learning
g based framework based
on Support Vector Machines
• Trained with textual and visual low level features
• Requires Manually annotated Training Data
Patent
Figures
Patents
Page to figure
P
fi
segmentation
Concept annotation
Feature
Vectors
Textuall and
T
d
Visual
Features
Extraction
Trained
Model
Training
Data
Training
T
i i andd
testing data
separation
Training
T
i i andd
testing of
Classifier
Test
Data
Centre for Research and Technology Hellas
Informatics and Telematics Institute
11
Dataset and Concepts
Patent Images
•
•
•
A43B IPC sub-class
Contain “Parts of Footwear”
Concepts*
•
•
•
•
•
•
•
Cleat
Ski boot
High heel
Lacing Closure
Spring Heel
Tongue
* The selection of concepts was done with the help of Dominic DeMarco
Centre for Research and Technology Hellas
Informatics and Telematics Institute
12
Concepts
Cleat
•
•
•
Description: A short piece of rubber, metal etc attached to the bottom
of a sports shoe used mainly for preventing someone from slipping
IPC subclass: A43B5/18S
Ski boot
•
•
•
Description:
p
A specially
p
y made boot that fastens onto a ski
IPC subclass: A43B5/04
Centre for Research and Technology Hellas
Informatics and Telematics Institute
13
Concepts
High Heel
•
•
•
Description:
i i
Sh
Shoes
with
i h high
hi h heels
h l
IPC subclass: A43B21
Lacing closure
•
•
•
Description: A cord that is drawn through eyelets or around hooks in
order to draw together the two edges of a shoe
IPC subclass: A43B5/04
Centre for Research and Technology Hellas
Informatics and Telematics Institute
14
Concepts
Spring Heel
•
•
•
Description:
D
i i
(Heels
H l with
i h metall springs
i
IPC subclass: A43B21/30
Tongue
•
•
•
Description: The part of a shoe that lies on top of your foot, under the
part where you tie it
IPC subclass: A43B23/26
Centre for Research and Technology Hellas
Informatics and Telematics Institute
15
Dataset Statistics
Concept
All figures
Train Data
Test Data
Cleat
148
89
59
Ski boot
123
74
49
High heel
148
89
59
Lacing Closure
117
71
46
Spring Heel
106
64
42
Tongue
124
75
49
Total
766
352
304
Creation of training/testing set
•
•
•
Testing/Training ratio = 3/5
Positive/Negative ratio = 1/3
Centre for Research and Technology Hellas
Informatics and Telematics Institute
16
Approach
We trained one classifier for each concept
The following cases are considered
•
•
Visual
•
•
SVMs were trained only with visual features (AHDH)
Visual extended
•
•
•
Extension of visual case
The output of all classifiers forms a vector and is passed to a new
classifier to yield the final score
Textual
•
•
SVMs were trained only with textual features
Visual + Textual
•
•
•
SVMs were trained with a feature vector containing visual and
textual features
200 ffeatures
t
(100 visual
i
l and
d 100 textual)
t t l)
Centre for Research and Technology Hellas
Informatics and Telematics Institute
17
Max
score
output
Approach
Feature
Vector
Visual
Textual
Textual + Visual
Cleat
Classifier
Cleat
score
Cleat score
Classifier
Ski boot
Classifier
Ski boot
score
Ski boot score
Classifier
High Heel
Classifier
High Heel
score
High Heel
score
Classifier
Lacing
Closure
Classifier
Lacing
Closure
score
Lacing
Closure score
Classifier
Spring
Heel
Classifier
Spring
Heel
score
Spring Heel
score
Classifier
Tongue
Classifier
Tongue
score
Tongue score
Classifier
Centre for Research and Technology Hellas
Informatics and Telematics Institute
Visual extended
Max
score
output
18
Time
i
for
f the
h d
demos
http://mklab-services.iti.gr/patmedia
http://mklab-services.iti.gr/patmediac
Centre for Research and Technology Hellas
Informatics and Telematics Institute
19
Results
• Average
g Results
for all concepts
Features
Accuracy
Precision
Recall
F-score
Visual
85,58%
70,38%
63,78%
60,7%
Textual
90,3%
71,59%
71,94%
71,25%
Visual ext.
87,22%
65,6%
73,13%
66,81%
Visual + Textual
93,25%
82,24%
78,14%
78,58%
Centre for Research and Technology Hellas
Informatics and Telematics Institute
20
Conclusions
• Image search could provide quickly results especially in multilingual
patents without much preprocessing
• Combination of visual and textual information performs better for
concept extraction.
• Classification based solely on visual information is still very
satisfactory.
• Cl
Classification
ifi i based
b d on visual
i
l features
f
could
ld fail
f il when
h two visually
i
ll
similar images are described with different concepts.
• Automatic segmentation could be supported, however an error
(~20%) might be introduced.
• The concept retrieval module could be a part of a larger patent
retrieval framework.
Centre for Research and Technology Hellas
Informatics and Telematics Institute
21
Future Work
• Further optimize the retrieval function for visual query by
example
• Produce results for more concepts, bigger datasets and for
different IPC classes.
classes
• Test performance in the case of automatically segmented
images (i.e. of lower quality).
• Combine more efficiently text and visual information
•
give more weight to visual or textual description depending on the concept
type
• Investigate late fusion techniques.
Centre for Research and Technology Hellas
Informatics and Telematics Institute
22
Relevant Publications and Presentations
•
S. Vrochidis, "Patent Image Retrieval based on Concept Extraction and
Classification", Information Retrieval Facility Symposium 2011 (IRFS
2011), Vienna, Austria, June 7, 2011.
•
P. Sidiropoulos,
p
, S. Vrochidis,, I. Kompatsiaris,
p
, "Content-Based Binaryy
Image Retrieval using the Adaptive Hierarchical Density Histogram",
Pattern Recognition Journal, Elsevier, Volume 44, Issue 4, pp 739-750,
April 2011.
•
S. Vrochidis, S. Papadopoulos, A. Moumtzidou, P. Sidiropoulos, E.
Pianta, I. Kompatsiaris, "Towards Content-based Patent Image
Retrieval; A Framework Perspective",
Perspective , World Patent Information
Journal, Volume 32, Issue 2, pp 94-106, June 2010.
•
G. Ypma, "Evaluation of Patent Image Retrieval", Information Retrieval
Facility Symposium 2010 (IRFS 2010),
2010) Vienna
Vienna, Austria,
Austria June 1-4,
1 4 2010.
2010
Centre for Research and Technology Hellas
Informatics and Telematics Institute
23
Relevant Publications
•
J. List, "Review of ITI Approach for Searching Non-Text Information in
Patents", Information Retrieval Facility Symposium 2010 (IRFS 2010),
Vienna, Austria, June 1-4, 2010.
•
P. Sidiropoulos,
p
, S. Vrochidis and I. Kompatsiaris,
p
, "Content-Based
Retrieval of Complex Binary Images", in Proceedings of the 8th
International Workshop on Content-Based Multimedia Indexing (CBMI
2010), pp. 182-187, Grenoble, France, 23-25 June 2010.
•
S. Vrochidis, "Towards Patent Image Retrieval", International Patent
Information Conference & Exposition, IPI-ConfEx 2009, Venice, Italy,
March 1
1-4,
4, 2009.
•
S. Vrochidis, "Patent Image Retrieval", Information Retrieval Facility
Symposium 2008 (IRFS 2008), Vienna, Austria, November 5-7, 2008.
•
J. Codina, E. Pianta, S. Vrochidis and S. Papadopoulos, "Integration of
Semantic, Metadata and Image search engines with a text search
engine for patent retrieval", Semantic Search 2008 Workshop, Tenerife,
S i June
Spain,
J
2,
2 2008.
2008
Centre for Research and Technology Hellas
Informatics and Telematics Institute
24
Feel free to test the demo!
http://mklab services iti gr/patmediac
http://mklab-services.iti.gr/patmediac
Thank you!
http://mklab.iti.gr
Centre for Research and Technology Hellas
Informatics and Telematics Institute
25