Transcription

Identifying Relations for Open Information Extraction
Anthony Fader, Stephen Soderland, and Oren Etzioni (2011)

Timeline of Open IE systems: TextRunner (2007), TR2 (2008), TR3 (2009), WOE (2010), ReVerb (2011), OLLIE (2012)
Agenda
• Solution overview
• Solution details
• Evaluation
Why? WOE and TextRunner extractions are incoherent and uninformative.
What? Improve the quality of extractions.
How? Add constraints on relation phrases, both syntactic and lexical.
Why? WOE and TextRunner extractions are incoherent.
Samples: an incoherent extraction is a relation phrase with no meaningful interpretation.
{ The extractor makes a decision about each word separately. }
Up to 13-30% of the output is incoherent.
How? Syntactic constraint: a multiword relation phrase
1. begins with a verb;
2. ends with a preposition;
3. is a contiguous sequence of words in the sentence.
{ … made a deal with … }
Why? WOE and TextRunner extractions are uninformative.
Samples: an uninformative extraction omits critical information.
{ The extractor handles verb-noun relation phrases (light verb constructions, LVC) improperly. }
4-7% of the output is uninformative.
Ex. "Faust made a deal with the devil."
Uninformative: (Faust, made, a deal); informative: (Faust, made a deal with, the devil).
How? The same syntactic constraint: the relation phrase { … made a deal with … } begins with a verb, ends with a preposition, and is contiguous in the sentence.
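The syntactic constraint can be viewed as a regular expression over coarse POS classes (V | V P | V W* P). Below is a minimal sketch of that check on the Faust example; the tag grouping is a simplification I have assumed, not the paper's exact pattern definition.

```python
import re

def satisfies_syntactic_constraint(pos_tags):
    """Check whether a candidate relation phrase's POS sequence matches V | V P | V W* P."""
    classes = "".join(
        "V" if t.startswith("VB") else                      # verbs
        "P" if t in {"IN", "TO", "RP"} else                 # prepositions / particles
        "W" if t.startswith(("NN", "JJ", "RB", "PRP", "DT")) else "O"
        for t in pos_tags)
    return re.fullmatch(r"V+(W*P+)?", classes) is not None

print(satisfies_syntactic_constraint(["VBD"]))                    # "made" -> True
print(satisfies_syntactic_constraint(["VBD", "DT", "NN", "IN"]))  # "made a deal with" -> True
print(satisfies_syntactic_constraint(["DT", "NN"]))               # "a deal" -> False
```

Both "made" and "made a deal with" satisfy the constraint; the extractor prefers the longer, informative phrase when it is available.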
Demo time
Are we perfect now?
No: overly specific relation phrases remain.
Example:
[The Obama administration] {is offering only modest greenhouse gas reduction targets at} [the conference]
How: what is inside?
Constraints on the relation phrase of [ arg1 - rel - arg2 ], the common relation form for Open IE:
• Syntactic (as above)
• Lexical: a valid relation phrase should take many distinct arguments in a large corpus.
Example: "Extendicare agreed to buy Arbor Health Care for about US $432 million in cash and assumed debt."
TextRunner output: (Arbor Health Care, for assumed, debt).
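A minimal sketch of how the lexical constraint could be applied, assuming the corpus has already been reduced to normalized (arg1, relation phrase, arg2) tuples; the threshold of 20 distinct argument pairs is a tunable choice here, not necessarily the authors' setting.

```python
from collections import defaultdict

def relations_passing_lexical_constraint(tuples, min_distinct_pairs=20):
    """Return relation phrases that take at least `min_distinct_pairs`
    distinct (arg1, arg2) pairs across the corpus of extracted tuples."""
    pairs = defaultdict(set)
    for arg1, rel, arg2 in tuples:
        pairs[rel].add((arg1, arg2))
    return {rel for rel, args in pairs.items() if len(args) >= min_distinct_pairs}
```

An overly specific phrase such as "is offering only modest greenhouse gas reduction targets at" occurs with very few distinct argument pairs and gets filtered out, while a general phrase like "made a deal with" survives.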
First evaluation: how much recall do we lose?
First test set:
- Random web pages
- 300 sentences
- 327 verb relation phrases
(Dependency parsers are still too slow at web scale.)
ReVerb
Input: a POS-tagged and NP-chunked sentence.
1. Relation extraction (find a verb, then expand it while the constraints are satisfied).
2. Argument extraction (find the nearest noun phrase to the left of the relation phrase and the nearest noun phrase to the right).
Output: (X1; R1; Y1), (X2; R2; Y2), …, (Xn; Rn; Yn)

Example:
{We} [talk about] {Open Information Extraction}
{PRP} [VBP IN] {NNP NNP NNP}
{B-NP} [B-VP B-PP] {B-NP I-NP I-NP}
(‘we’; ’talk about’; ‘open information extraction’)
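To make the two steps concrete, here is a toy sketch of the pipeline over pre-tagged, pre-chunked input. The tag sets, the NP handling, and the absence of the lexical constraint and confidence scoring are all simplifying assumptions, not the authors' implementation.

```python
import re

VERB = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
PREP = {"IN", "TO", "RP"}
WORD = {"NN", "NNS", "NNP", "NNPS", "PRP", "JJ", "RB", "DT"}

def pos_class(tag):
    return "V" if tag in VERB else "P" if tag in PREP else "W" if tag in WORD else "O"

def nearest_np(chunks, indices):
    """Token indices of the nearest NP chunk found while walking `indices`."""
    for i in indices:
        if chunks[i].endswith("NP"):
            span = [i]
            while span[-1] + 1 < len(chunks) and chunks[span[-1] + 1] == "I-NP":
                span.append(span[-1] + 1)          # grow right to the chunk end
            while span[0] - 1 >= 0 and chunks[span[0]] == "I-NP":
                span.insert(0, span[0] - 1)        # grow left to the chunk start
            return span
    return []

def extract(tokens, tags, chunks):
    classes = "".join(pos_class(t) for t in tags)
    triples = []
    for m in re.finditer(r"V(?:W*P)?", classes):                  # 1. relation extraction
        left = nearest_np(chunks, range(m.start() - 1, -1, -1))   # 2. argument extraction
        right = nearest_np(chunks, range(m.end(), len(tokens)))
        if left and right:
            triples.append((" ".join(tokens[i] for i in left),
                            " ".join(tokens[m.start():m.end()]),
                            " ".join(tokens[i] for i in right)))
    return triples

tokens = ["We", "talk", "about", "Open", "Information", "Extraction"]
tags   = ["PRP", "VBP", "IN", "NNP", "NNP", "NNP"]
chunks = ["B-NP", "B-VP", "B-PP", "B-NP", "I-NP", "I-NP"]
print(extract(tokens, tags, chunks))
# -> [('We', 'talk about', 'Open Information Extraction')]
```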
But! Precision is still low, even though recall is comparatively high.
Confidence function: a confidence function is trained on all extractions from 1,000 sentences drawn from the Web and Wikipedia.
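A minimal sketch of what such a confidence function can look like, assuming a logistic-regression classifier over simple hand-crafted extraction features; the features and the toy labels below are illustrative placeholders, not the paper's feature set or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(tokens, rel_span, arg1, arg2):
    """Illustrative features of one extraction (not the paper's set)."""
    start, end = rel_span
    return [
        end - start,                          # relation phrase length
        len(tokens),                          # sentence length
        1.0 if end == len(tokens) else 0.0,   # sentence ends with the relation phrase
        1.0 if arg1[:1].isupper() else 0.0,   # arg1 looks like a proper noun
        1.0 if arg2[:1].isupper() else 0.0,
    ]

# Toy labeled data: 1 = correct extraction, 0 = incorrect.
X = np.array([[2, 6, 1, 0, 1], [1, 25, 0, 0, 0], [3, 9, 1, 1, 1], [1, 30, 0, 1, 0]])
y = np.array([1, 0, 1, 0])
clf = LogisticRegression().fit(X, y)

x_new = features("We talk about Open Information Extraction".split(),
                 (1, 3), "We", "Open Information Extraction")
print(clf.predict_proba([x_new])[0, 1])   # confidence that the extraction is correct
```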
Evaluation results
Test set: 500 sentences from the Web, labeled by 2 judges (agreement 0.68).
ReVerb shows a boost over the "lex" variant (ReVerb without the lexical constraint).
Flow comparison
TextRunner & WOE:
1. Automatically label the sentences.
2. Learn an extractor using a sequence-labeling graphical model.
3. Extraction: find the arguments, then label the words between them as part of the relation phrase (or not).
ReVerb:
1. Relation extraction.
2. Argument extraction.
Evaluation results
Achievements
• Elimination of incoherent extractions is much better.
• Outperforms the earlier systems on precision-recall.
• Faster.
ReVerb error analysis and possible improvements: precision.
ReVerb error analysis and possible improvements: recall.
Thank you
Open Language Learning for Information Extraction
Mausam, Michael Schmitz, Robert Bart, Stephen Soderland, and Oren Etzioni (2012)
Agenda
• Introduction
• Relation Extraction
• Context Analysis
• Evaluation
Why?
ReVerb and WOE: relation phrases match V | V P | V W*P, i.e. only verb-mediated relations, and no context is taken into account.
OLLIE: covers relations mediated by nouns, adjectives, and more, and includes contextual information.
(Some example extractions shown on the slide were not factual; the latest version, from Nov 2015, no longer gives these results.)
Introduction (+ confidence function)
Bootstrapping set
Goal: create a large training set.
Hypothesis: every relation can be expressed in ReVerb style, and a sentence that contains all the content words of a tuple expresses the original tuple.
Start from 110,000 seed tuples extracted by ReVerb from ClueWeb, and retrieve all sentences with the same content words.
Example: for the seed (Students, build, bootstrap set), matching sentences include "Bootstrap set is built by students", "While working on OIE, students built the set", …
Bootstrapping error reduction: spurious matches such as "Students worked on a set of tasks; workers built a new cafe on the campus" are filtered out by requiring a linear dependency path of size < 5 between the matched words.
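A minimal sketch of the sentence-matching step, under stated assumptions: lemmatization is faked with a tiny dictionary, arguments are matched by a crude last-token "head word" heuristic, and the dependency-path-length filter is omitted because it needs a parser. OLLIE's actual matching may differ in the details.

```python
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "by", "of",
             "to", "in", "on", "for", "with", "while"}
LEMMAS = {"built": "build", "builds": "build", "students": "student"}  # toy lemmatizer

def norm(word):
    w = word.lower().strip(".,;")
    return LEMMAS.get(w, w)

def words(text):
    return {norm(w) for w in text.split()} - STOPWORDS

def seed_keywords(arg1, rel, arg2):
    # head word of each argument (crudely, the last token) + content words of the relation
    return {norm(arg1.split()[-1]), norm(arg2.split()[-1])} | words(rel)

def bootstrap(seed, corpus):
    needed = seed_keywords(*seed)
    return [s for s in corpus if needed <= words(s)]

corpus = [
    "Bootstrap set is built by students.",
    "While working on OIE, students built the set.",
    "Students worked on a set of tasks; workers built a new cafe on the campus.",
]
for s in bootstrap(("Students", "build", "bootstrap set"), corpus):
    print(s)
# The third sentence is the kind of spurious match that the
# dependency-path-length filter is meant to remove.
```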
Open Pattern Learning
Goal: learn general patterns that encode different types of relations.
Open pattern templates map a dependency path to an open extraction.
Sample for template 2: "We do interesting things."
But what are syntactic and semantic patterns?
Open Pattern Learning
Purely syntactic patterns:
• There are no slot nodes in the path.
• The relation node lies between arg1 and arg2.
• The prep edge in the pattern matches the preposition in the relation.
• The path has no nn or amod edges.
Semantic/lexical patterns: the remaining patterns, kept together with the words/types they are used with (words are generalized into types where possible).
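A minimal sketch of the purely-syntactic check, under an assumed toy representation of a pattern as a list of edges, each with a dependency label, the kind of node it reaches, and an optional preposition; this is illustrative, not OLLIE's data structures.

```python
def is_purely_syntactic(edges, relation_prep):
    """edges: dicts like {"label": "prep_by", "node": "arg2", "prep": "by"},
    where "node" is one of "arg1", "arg2", "rel", "slot", or "other"."""
    labels = [e["label"] for e in edges]
    nodes = [e["node"] for e in edges]
    if "slot" in nodes:                                  # no slot nodes in the path
        return False
    if any(l in ("nn", "amod") for l in labels):         # no nn or amod edges
        return False
    i_rel = nodes.index("rel")                           # relation node between the args
    i_a1, i_a2 = nodes.index("arg1"), nodes.index("arg2")
    if not (min(i_a1, i_a2) < i_rel < max(i_a1, i_a2)):
        return False
    preps = [e["prep"] for e in edges if e["label"].startswith("prep")]
    return all(p == relation_prep for p in preps)        # prep edge matches the relation's prep

pattern = [{"label": "nsubjpass", "node": "arg1"},
           {"label": "rcmod", "node": "rel"},
           {"label": "prep_by", "node": "arg2", "prep": "by"}]
print(is_purely_syntactic(pattern, relation_prep="by"))  # True
```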
Semantic/lexical patterns: example.
Context analysis
• Conditional truth: a ClausalModifier field is added when there is an advcl dependency edge, but only if one of {if, then, although} is present.
• Attribution: an AttributeTo field is added when there is a ccomp dependency edge and the context verb matches a list of communication/cognition verbs from VerbNet.
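A minimal sketch of these two checks, assuming a toy edge list of (label, head, dependent) triples with lemmatized heads; the marker set and the verb list are small illustrative samples rather than the full VerbNet-derived lists.

```python
CLAUSAL_MARKERS = {"if", "then", "although"}
COMM_COGN_VERBS = {"say", "believe", "think", "claim", "report"}   # sample only

def analyze_context(edges, sentence_words):
    """Return extra context fields for extractions from this sentence.
    edges: (dependency label, head lemma, dependent word) triples."""
    fields = {}
    for label, head, dep in edges:
        if label == "advcl" and CLAUSAL_MARKERS & set(sentence_words):
            fields["ClausalModifier"] = dep    # conditional truth (clause head as a stand-in)
        if label == "ccomp" and head in COMM_COGN_VERBS:
            fields["AttributeTo"] = head       # attribution
    return fields

# "Early astronomers believed that the earth is the center of the universe."
edges = [("nsubj", "believe", "astronomers"), ("ccomp", "believe", "is")]
print(analyze_context(edges, "early astronomers believed that the earth is ...".split()))
# -> {'AttributeTo': 'believe'}
```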
Comparison
Thank you