Object Matching for Improving Information Quality.

Transcription

Object Matching for Improving Information Quality.
Erhard Rahm
http://dbs.uni-leipzig.de
http://dbs.uni-leipzig.de/wdi-lab
November 25, 2009
Innovation Lab at Univ. of Leipzig on
semantic Web Data Integration
` Funded by BMBF (German ministry for research and
education)
`
¾ 2009:
initial phase
¾ Full funding starts in Jan. 2010 (10 full-time
employees
l
+ students)
d
)
`
Goals
¾ Semi-automatic,
Semi automatic
high quality data integration of
heterogeneous (web) data
¾ Faster development
p
of data integration
g
solutions than
with traditional integration approaches, e.g. data
warehouses
M k research
h approaches
h ready
d for
f the
th market
k t
¾ Make
2
`
Mashup/Workflow-like data integration
¾ Framework
F
k
to specify
if and
d execute workflows
kfl
f data
for
d
acquisition from web sources, data transformation,
integration and analysis
¾ Support for dynamic (runtime) data integration
¾ Research prototype: iFuice + extensions
BusinessIntelligence
Analysis
Query generation
(e g for Google
(e.g.
product search)
Data extraction
and
transformation
Product
list
Object
Matching
Generation of
web service requests
(e.g. Amazon.com)
Repository
3
`
Ontology and Schema matching
¾ Semi-automatic
S i
i
generation
i
off mappings
i
b
between
related schemas (e.g., XML business schemas) or
ontologies (e.g., product catalogs)
¾ Support for large schemas/ontologies
¾ Research prototype: COMA++
Shopping.Yahoo.com
Electronics
DVD Recorder
Beamer
Digital Cameras
Digital Photography
Camcorders
4
Amazon.com
Electronics & Photo
TV & Video
DVD Recorder
Projectors
Camera & Photo
Digital Cameras
`
Object
j
Matching
g ((Entity
y resolution,, Deduplication)
p
)
¾ Effective
strategies for matching related objects
(entities, instances) from one or several sources
¾ Offline
ffl
matching
h
(
(e.g.
with
h data
d
warehouse)
h
) and
d online
l
matching (e.g., within mashup applications)
¾ Research prototypes: MOMA
MOMA, FEVER
5
`
Introduction (Object Matching)
`
FEVER platform for object matching strategies
¾ Architecture
¾ Manually
M
ll
specified
ifi d match
h strategies
i ((operator trees))
¾ Training-based learning of match strategies
¾ Evaluation
`
Dynamic object matching in mashups
¾ OCS
`
(Online
(O li Citation
Ci i
Service)
S i )
Instance-based ontology
gy matching
g
¾ Approaches
¾ Support
`
6
in COMA++
Conclusions
Data Quality Problems
Single-Source
Problems
Schema Level
(Lack of integrity
constraints, poor
schema design)
- Uniqueness
U i
- Referential integrity
…
Better schemas/
models
Instance Level
(Data entry errors)
Multi-Source Problems
Schema Level
Instance Level
(Heterogeneous
data models and
schema designs)
(Overlapping,
contradicting
and inconsistent
data)
- Misspellings
- Redundancy/duplicates
- Contradictory values
…
- Naming conflicts
-Structural conflicts
…
- Inconsistent aggregating
- Inconsistent timing
…
Object matching
Schema matching,
conflict resolution
Object matching
(entity resolution,
duplicate detection,
reference reconcilation …))
* E. Rahm, H. H. Do: Data Cleaning: Problems and Current Approaches.
IEEE Techn. Bull. Data Eng., Dec. 2000
7
`
Identify semantically equivalent (matching) objects
within one data source or between different sources
¾ to integrate (merge) them, compare them, improve data
quality, etc.
¾
`
Most previous work for structured (relational) data
Source1: Customer
Cno LastName FirstName Gender Address
23 Harley St, Chicago
24
Smith
Christoph
M
IL, 60633-2394
493
Smith
Source2:
Client
8
Kris L.
F
2 Hurley Place, South
Fork MN, 48503-5998
Phone/Fax
333-222-6542
/ 333-2226599
444-555-6666
CID
Name
Street
City
11
Kristen Smith
2 Hurley Pl South Fork, MN 48503
24
Christian Smith Hurley St 2 S Fork MN
Sex
0
1
9
Duplicates due to
• Order of authors
• Extraction error ((title,, author))
• Different titles!
• Typos (author name)
etc.
etc
10
Object matching approaches
Value-based
Context-based
...
Multiple attributes
unsupervised
p
supervised
p
...
• Aggregation function with threshold
• User-specified Rules:
• Hernandez et al. (SIGMOD 1995)
• Clustering
• Monge, Elkan (DMKD 1997)
• Mc Callum et al. (SIGKDD 2000)
• Cohen, Richman (SIGKDD 2002)
• decision trees
• Verykios et al. (Inf.Sciences 2000)
• Tejada et al. (Inf. Systems 2001)
• Support vector machine
• Bilenko, Mooney (SIGKDD 2003)
• Minton et al. (2005)
11
http://dc-pubs.dbs.uni-leipzig.de
12
• Hierarchies:
• Ananthakrishna et.al. (VDLB 2002)
• Graphs:
• Bhattacharya, Getoor (DMKD 2004)
• Dong et al. (SIGMOD 2005)
• Ontologies
l
...
Single attribute
`
Support combination of several match techniques
`
Manual construction of combined strategies
¾ BN,
BN
`
MOMA,
MOMA SERF …
Learning-based frameworks
¾ FEBRL,
FEBRL
`
MARLIN,
MARLIN TAILOR
TAILOR, Active Atlas …
Problems
¾ Evaluation
results not conclusive
¾ High tuning effort needed
¾ Dependency
d
on training data
d
f learning-based
for
l
b
d
approaches
* H. Köpcke, E. Rahm: Frameworks for Entity Matching: An Overview.
Data and Knowledge Engineering, 2009
13
`
Introduction (Object Matching)
`
FEVER platform for object matching strategies
¾ Architecture
¾ Manually
M
ll
specified
ifi d match
h strategies
i ((operator trees))
¾ Training-based learning of match strategies
¾ Evaluation
`
Dynamic object matching in mashups
¾ OCS
`
(Online
(O li Citation
Ci i
Service)
S i )
Instance-based ontology
gy matching
g
¾ Approaches
¾ Support
`
14
in COMA++
Conclusions
FEVER = Framework for EValuating Entity Resolution
¾ Platform for configuration and evaluation of entity
resolution (object matching) algorithms and strategies
¾ Key
K features:
f t
¾ Flexible specification of object matching workflows
¾ Semi
Semi-automatic
automatic parameter configuration (e.g.,
(e g similarity
thresholds)
¾ Support for training-based matching to reduce manual
tuning effort
¾ Comparative evaluations of different match approaches
¾
Köpcke, H.; Thor, A.; Rahm, E.: Comparative evaluation of entity
resolution approaches with FEVER. Demo, Proc. VLDB, 2009
FEVE
ER Core
Comp
ponents
Workflow
Edittor
Runtime
Engiine
15
16
Parameter
Setting
Param’s
Operator Tree
Execution
• Logging
• Breakpoints
ea po ts
Data Specification
Entities
Mappings
• Source
• Target
• Generated M.
• Training M.
• Perfect M.
Evaluation
• Prec., Recall, ...
Result • Reduc. Ratio, ..
• Runtime,
u t e, ...
Operator Tree Specification
Operator Library
• Blockingg
• Matcher
• Combination
• ...
Quality
Analysis:
Quality vs.
• Param. Effort
• Labeling
abe g Effort
ot
Configuration Specification
Configuration
Strategies for
• Non-Learn. Matchers
• Learn.-based Matchers
`
Match results are represented as instance mappings
(
(correspondences)
d
) between
b t
2 sources
¾ Mappings
`
can be stored for re-use
Source1
Source2
Sim
p1
p‘1
1
p2
p‘1
p
09
0.9
p3
p‘3
0.8
Matchers also operate on mappings
¾ Cartesian
product between input sources
¾ Output of previously executed matchers/operators
17
¾
Describe workflows implementing a match strategy
¾
Leaves: data sources
¾
inner nodes: operators (for blocking, matching etc.)
¾
Execution in post-order
post order traversal sequence
¾
Match result = Result of root operator
Merge
Matcher 1
Matcher 2
Blocking
Source1
18
Source2
¾ Blocking:
Sorted Neighborhood, Canopy Clustering, …
9 Necessary to reduce search space from Cartesian product to
more likely matching object pairs
¾ Attribute
matchers (on preselected pair of attributes):
¾ string similarity (TFIDF, Jaccard, Cosine, Trigram, …))
¾ PPJoinPlus, EdJoin
¾ External implementations, e.g., Fuzzy Lookup (MS SQL Server)
¾ Context
matchers (e.g., Neighborhood matcher)
¾ Combination
of match results: Merge, Compose
19
map1
1. Merge
A1
Titl Matcher
Title
M t h
map1
A2
map2
Author Matcher
map2
• Overcome short-comings (e.g., precision or recall)
2. Compose
A1
map1
A3
map2
• Efficient re-use of mappings
20
dblp
A2
p1
p2
p4
p‘‘1
pp‘‘2
p‘‘4
p‘1
pp‘2
p‘3
map2
B1
map1
A1
B2
p1
map3
A2
p2
... pn
v1
p‘2
p‘n ...
pp‘1
dblp
v‘1
`
C
Combine
bi
match
t h mappings
i
with
ith generall object
bj t relationships
l ti
hi
`
Bibliographic example: Conference@DBLP - Conference@ACM
` Attribute matching suffers from highly different values
`
`
„Two conferences are the same if they share a significant number of
publications.“
`
Reuse of match result for publications
Very effective in experiments
Thor, A.; Rahm, E.: MOMA - A Mapping-based Object Matching
System. Proc. CIDR, 2007
21
Configuration Specification
Operator Tree Specification
Model
Application
• Learner (Dec.
(Dec Tree,
Tree SVM
SVM, ...))
• Matcher selection
• No. of examples
• Selection scheme (Ratio, Random)
• Similarity threshold
22
Model
Generation
Blocking
Training
Data
Selection
Training
Data
Source1
Source2
¾ Use
of training data to find effective matcher combination
and configuration (supervised learning)
¾ Learners
for model generation in FEVER:
¾ Decision Tree, Logistic Regression, SVM
¾ Multiple learning approach
Cosine(title) > 0.629
-
+
Trigram(venue) > 0.197
-
+
EditDistance(year) > 0.25
Non-match
Trigram(authors) > 0.7
+
...
...
...
match
tree-based
Decision tree
based
match strategy
23
`
Training data: set of object pairs with manually
l b ll d match/mon-match
labelled
t h/
t h decisions
d i i
¾#
`
training pairs should be low (limit manual effort)
Training pairs should be non-trivial
non trivial
¾ Similarity
`
above a certain threshold
Selection approaches in FEVER
¾ RANDOM: randomly select n object pairs above a
similarity threshold t for labeling
¾ RATIO: reduce n randomly selected pairs (above sim.
threshold t) so that at least a fraction ratio (<=0.5)
(< 0 5) of
matching or non-matching pairs are in the training set
9 ratio 0.4: 40%/60%
matches/non-matches
/
/
((or vice versa))
9 Balances positive and negative training
24
7 real data sources:
`
Bibliographic:
g p
DBLP,, ACM Digital
g
library,
y,
GoogleScholar (GS)
E-commerce: Abt.com, Buy.com, Amazon.com,
G
Google
le Product
P d t Search
Se h (GP),
(GP)
from 1,100 to 64,000 objects per source
¾
¾
¾
4 match tasks
`
¾
¾
publications: DBLP-ACM
DBLP-GS
E-Commerce: Abt-Buy
Amazon - GP
P f
Perfect
mapping:
i
`
¾
¾
manually determined for bibliographic tasks
E commerce data
use of UPCs for E-commerce
25
Use of MS Fuzzy Lookup for comparison
` Similarity on 1-2 attributes
` Default setting for similarity threshold vs. manually
tuned settings (varying more than 1000 settings on
test data
d
using FEVER))
`
90%
70%
Baseline F-measure results
1 attribute
2 attributes
2 attributes (tuned)
50%
30%
Amaz
.-GP
ABTBuy
DBL
P-GS
26
DBL
PACM
10%
Amazon-GoogleBase
1.0
1.0
0.9
0.9
08
0.8
0.8
0.7
0.7
F-measu
ure
F-meas
sure
Abt-Buy
y
0.6
0.5
0.4
03
0.3
Random
Ratio
baseline strategy
0.2
0.1
0.6
0.5
0.4
Random
R ti 0
Ratio
0.4
4
baseline strategy
03
0.3
0.2
0.1
0.0
0.0
0.3
0.4
0.5
0.6
0.7
0.8
0.3
similarity threshold
0.4
0.5
0.6
0.7
0.8
similarity threshold
E-commerce tasks, labeling effort 50
27
1.0
0.9
0.9
0.8
Decision Tree
Logistic Regression
SVM
Multiple Learning
baseline strategy
0.7
06
0.6
0.8
Decision Tree
Logistic Regression
SVM
Multiple Learning
baseline strategy
0.7
06
0.6
0.5
0.5
20
50
100
labeling effort
28
F-me
easure
10
1.0
F-me
easure
DBLP-Scholar
F-me
easure
F-me
easure
DBLP-ACM
500
20
50
100
500
labeling effort
Bibliographic tasks, Ratio training selection
Abt-Buy
Abt
Buy
Amazon GoogleBase
Amazon-GoogleBase
1.0
1.0
09
0.9
0.8
F-mea
asure
F-me
easure
0.9
0.7
06
0.6
0.8
Decision Tree
Logistic Regression
SVM
Multiple Learning
baseline strategy
0.7
06
0.6
0.5
20
50
100
500
0.5
20
labelling effort
50
100
500
labeling effort
E-commerce tasks, Ratio training selection
29
30
Decision Tree
Logistic Regression
SVM
Multiple Learning
baseline strategy
`
Match configurations with several matchers are
diffi lt to
difficult
t tune
t
manually
ll with
ith currentt
implementations, e.g. MS Fuzzy Lookup
`
Learning-based
L
i
b
d match
h strategies
i can clearly
l
l
outperform manual match strategies even with
small training data,
data especially for challenging tasks
`
Ratio is a simple and effective approach for training
`
Multiple learning approach effectively combines
selection providing a balanced number of matching
and non-matching object pairs
several basic learners
`
Introduction (Object Matching)
`
FEVER platform for object matching strategies
¾ Architecture
¾ Manually
M
ll
specified
ifi d match
h strategies
i ((operator trees))
¾ Training-based learning of match strategies
¾ Evaluation
`
Dynamic object matching in mashups
¾ OCS
`
(Online
(O li Citation
Ci i
Service)
S i )
Instance-based ontology
gy matching
g
¾ Approaches
¾ Support
`
in COMA++
Conclusions
31
`
On-demand citation service (OCS)*
What are the most cited papers of conference X or author Y?
¾ Frequent changes, i.e., new publications & new citations
¾
`
Idea: Combine publication lists, e.g. from DBLP or
Pubmed, with citation counts, e.g from Google
Scholar Citeseer or Scopus
Scholar,
DBLP, Pubmed: high bibliographic data quality
¾ GS: large
g coverage
g of citations counts
¾
`
Query and match problem: Given a set of DBLP
publications
bli i
→ How
H
to effectively
ff i l find
fi d corresponding
di
GS publications?
* http://labs.dbs.uni-leipzig.de/ocs
32
Select publications
from DBLP
(Author venue,
(Author,
venue …))
Query generation
. for Google Scholar
Publication
list
Object Matching
(Query results vs.
publications
Iterative queryy refinement
Automatic generation of search queries, e.g. on
author, venue, title (pattern)
` Dynamic object matching for search results
`
33
Bibliographic data from DBLP
Corresponding
GS publications
34
Sum of
GS citations
35
`
Interactive approach, i.e., user selects match
th h ld
thresholds
Title
Year
Authors
80% +/- two years
50%
85% +/- one year
60%
90% equal year
70%
95%
80%
100%
90%
relaxed
restrictive
i i
100%
`
36
Aggregated result is adjusted automatically based
on match definition
`
Introduction (Object Matching)
`
FEVER platform for object matching strategies
¾ Architecture
¾ Manually
M
ll
specified
ifi d match
h strategies
i ((operator trees))
¾ Training-based learning of match strategies
¾ Evaluation
`
Dynamic object matching in mashups
¾ OCS
`
(Online
(O li Citation
Ci i
Service)
S i )
Instance-based ontology
gy matching
g
¾ Approaches
¾ Support
`
in COMA++
Conclusions
37
O1
pp g
Mapping:
Matching
O2
O1, O2
iinstances
t
`
O1.e11, O2.e23, 0.87
O1.e13,O2.e27, 0.93
…
further input,
e.g. dictionaries
Process of identif
identifying
ing semantic correspondences
between 2 ontologies
¾ Result:
ontology mapping
¾ Mostly equivalence mappings: correspondences specify
equivalent ontology concepts
`
38
Variation of schema matching problem
Amazon.com
Shopping.Yahoo.com
Electronics & Photo
Electronics
TV & Video
DVD Recorder
Beamer
DVD Recorder
Projectors
Digital Cameras
Digital Photography
Camera & Photo
Digital Cameras
Camcorders
`
Ontology mappings useful for
IImproving
i
query results,
l
e.g. to fi
find
d specific
ifi products
d
` Automatic categorization of products in different catalogs
` Merging catalogs
`
39
semantics of a concept/category may be better
p
by
y the instances associated to category
g y
expressed
than by metadata (e.g. concept name, description)
` Categories with most similar instances should match
`
¾ Requires
concepts
shared or similar instances for most/all
?
O1
O1
instances
?
O2
O2
instances
Two cases
- ontologies share instances
- ontologies do not share but have similar instances
40
Amazon
by Brands
Microsoft
Novell
Softunity
by Category
Books
DVD
Kid & H
Kids
Home
Software
Software
Languages
Utilities & Tools
Traveling
B i
Business
&P
Productivity
d ti it
Operating System
Windows
Burning
Software
Operating
System
Handheld
Software
Linux
Id =158298302X
EAN = "662644467122"
Title = "SuSE Linux 10.1 (DVD)"
Price = 49.99
Ranking = 180
Id = B0002423YK
Id = ECD435127K
EAN = 0662644467122
ProductName = "SuSE Linux 10.1"
DateOfIssue = 02.06.2006
P i =Id59
Price
59.95
= 95
ECD851350K
EAN = 0805529832282
ProductName = "WindowsXP Home"
DateOfIssue = 15.10.2004
Price = 238.90
EAN = 0805529832282
Title = "Windows XP Home Edition incl. SP2"
Price = 191.91
Ranking = 47
Thor, A., Kirsten, T., Rahm, E.: Instance-based matching of
hierarchical ontologies. Proc. 12th BTW Conf., 2007
41
Dmoz
Yahoo
Clothing
Sports
Sports
Swimwear
Water Sports
Swimming and Diving
Swimming and Diving
Gear and Equipment
Apparel
pp
URL =http://www.beachwear.net
Name =The Beachwear Network
Description
=Selection of beachwear.
URL
=http://www.skinzwear.com/
Name =Skinz Deepp
Description
=Swimwear, bikinis and
URL
=http://www.ritchieswimwear.com/
streetwear.
Name
= Ritchie Swimwear
Description =Designer brand for
women, men and little girls.
42
URL =www.skinzwear.com
Name =Skinz Deep, Inc.
Description
=Bikinis, swimwear,
URL =www.ritchieswimwear.com
beachwear,
and streetwear
Name =Ritchie
Swimwearfor men and
women.
Description =Offers bathing suits, beachware,
and cover-ups for men, women, and children.
Stores located throughout South Florida.
* Massmann, S., Rahm, E.: Evaluating Instance-based Matching of Web
Directories. Proc. WebDB 2008
Extends previous COMA prototype (VLDB2002)
` Matching of XML & rel
rel. Schemas and OWL ontologies
` Several match strategies: Parallel (composite) and
`
sequential
q
matching;
g Instance-based matching;
g Fragmentg
based matching for large schemas; Reuse of previous match
results
Model
M
d l Pool
P l
Directed
graphs
S1
S2
Import,
Load,
Save
Model
Manipulation
Match Strategy
Component
identification
{s11, s12, ...}}
{s21, s22, ...}
Mapping Pool
Matcher
execution
Similarity
Combination
Matcher 1
s11↔s21
s12↔s22
s13↔s23
Matcher 2
Matcher 3
Similarity cube
Mapping
s
Mapping
Nodes, ...
Paths, ...
Name, Children,
Leaves,
NamePath, …
Aggregation,
Direction,
Selection,
CombinedSim
Diff, Intersect,
Union,
MatchCompose,
Eval, ...
Component
Types
Matcher
Library
Combination
Library
Mapping
Manipulation
*Schema and Ontology Matching with COMA++. Proc. SIGMOD 2005
43
`
Instance matchers introduced in 2006
Constraint-based matching
` Content-based matching: 2 variations
`
`
Coma++ maintains instance value set per element
XML schema instances
(
Ontology instances
44
Instance constraints are assigned to schema elements
`
General constraints: always applicable
Example: average length and used characters (letters
(letters, numeral
numeral,
special char.)
Numerical constraints: for numerical instance values
Example
p : p
positive or negative,
g
, integer
g or float
“M @
“[email protected]”
il
” vs.
Pattern constraints:
“[email protected]”
Example: Email and URL
¾
¾
¾
Use of constraint similarity matrix to determine
element
l
t similarity
i il it (lik
(like d
data
t type
t
matching)
t hi )
` Simple and efficient approach
`
Effectiveness depends on availability of constrained value ranges / pattern
¾
A
Approach
h does
d
nott require
i shared
h
d iinstances
t
`
Compare
Constraints
c11↔c21
…
c1h↔c2k
Determine Constraints
General constraints
element1
Numerical constraints
element2
Pattern constraints
Similarity
value
Using
U
i Constraint
C t i t
Similarity Matrix
45
2 variations
`
¾
Value Matching: pairwise similarity comparison of instance
¾
Document (value set) matching: combine all instances into a
¾
`
values
virtual document and compare documents
Both approaches do not require shared instances
Document matching
¾
¾
1 instance document per category or selected string category attribute
(e.g. description)
Document comparison
p
based on TF-IDF to focus on most significant
g
terms
element1
element2
Compare virtual documents
document1
document2
instance11
instance21
instance12 Similarity function instance22
…
instance1n
46
based on
TF-IDF
…
instance2m
Similarity
value
Introduction (Object Matching)
` FEVER platform for object matching strategies
`
¾ Architecture
¾ Manually
specified match strategies (operator trees)
¾ Training-based
b
d llearning off match
h strategies
¾ Evaluation
`
Dynamic object matching in mashups
¾ OCS
`
(Online Citation Service)
Instance-based ontology matching
¾ Approaches
pp
¾ Support
`
in COMA++
Conclusions
47
`
Object matching is a critical step for data quality and
data integration
`
`
Offline and online data integration
Effective match strategies combining several
matchers are hard to find and tune
Very large number of possible combinations and
configurations
` High quality vs. efficiency tradeoff
` Utilization of domain knowledge
`
`
Learning-based approaches support semi-automatic
generation of suitable match strategies
Requires suitable training selection (e.g. Ratio approach)
` Multiple
is robust and effective
p Learning
g approach
pp
(but slow)
`
48
`
Instance
based matching of ontologies facilitated by
Instance-based
object matching
Instances can reflect well semantics of categories
Same/similar instances required in both ontologies
`
`
`
Instance-based
Instance
based matching in COMA++
`
`
3 basic instance matchers (constraint-based, content-based) not
requiring shared instances
Flexible combination with many metadata-based
metadata based approaches
`
Correct ontology mappings NOT limited to 1:1
correspondences
`
Support for high efficiency and high effectiveness
49
¾ Performance
P f
50
techniques,
h i
e.g. parallel
ll l object
bj
matching
hi
`
Evaluation and validation for larger datasets
`
Self-Tuning
Self
Tuning of context matchers
`
Scalable
l bl instance-based
b
d ontology
l
match
h
approaches
`
`
`
`
`
`
51
Köpcke, H.; Thor, A.; Rahm, E.: Comparative evaluation of
entity resolution approaches with FEVER. Proc. 35th Intl.
Conference on Very Large Databases (VLDB), Demo, 2009
Köpcke, H., Rahm, E.: Frameworks for Entity Matching An
Overview Data and Knowledge Engineering,
Overview.
Engineering 2009
Köpcke, H.; Rahm, E.: Training Selection for Tuning Entity
Matching. Proc. 6th Int. Workshop on Quality in Databases and
Management of Uncertain Data (QDB/MUD)
(QDB/MUD), 2008
Massmann, S.; Rahm, E.: Evaluating Instance-based matching
of web directories. Proc. 11th Int. Workshop on the Web and
Databases (WebDB)
(WebDB), 2008
Thor, A., Kirsten, T., Rahm, E.: Instance-based matching of
hierarchical ontologies. Proc. 12th German Database Conf.
(BTW) 2007
(BTW),
Thor, A.; Rahm, E.: MOMA - A Mapping-based Object
Matching System. Proc. of the 3rd Biennial Conf. on Innovative
Data Systems Research (CIDR)
(CIDR), 2007