Christos Christodoulopoulos, Cynthia Fisher and Dan Roth

Exploring the assumptions of language acquisition models
Midwest Speech & Language Days 2015
Models of language acquisition
“The girl chases the boy”
“The boy runs”
Semantic Role Labeling
PropBank corpus
[Palmer et al. 2005]
Core arguments:
A0 - Agent
A1 - Patient
A2 - Recipient
…
Modifiers:
Locative
Temporal
Manner
…
“The girl chases the boy”
[The girl]A0 [chases]pred [the boy]A1
BabySRL [Connor et al. 2008; 2010]
BabySRL corpus
Adam, Eve, Sarah [Brown, 1973]
Adult utterances (cleaned up)
Focus on verb predicates
1 verb 2 args (24% of sent.)
“The girl chases the boy”
[The girl]A0 [chases]pred [the boy]A1
Experiment 1: Supervised learning
Given perfect feedback, do simple, bottom-level features capture anything useful about semantic roles/verb preferences?
Experiment 1: Supervised learning
• Supervised classifier (averaged perceptron)
• LBJava [Rizzolo and Roth, 2010]
• Train on BabySRL corpus
• Test on novel verb sentences
Intransitive: “The bunny krads”
Transitive: “The boy krads the girl”
Ditransitive: “The girl krads the boy a bunny”
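The training setup above can be illustrated with a minimal averaged perceptron. This is only a sketch in plain Python, not the LBJava implementation used in the talk; the class name `AveragedPerceptron`, the feature strings, and the two training examples are assumptions made for illustration.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Multiclass averaged perceptron over sparse string features (illustrative)."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = defaultdict(float)  # (feature, label) -> current weight
        self.totals = defaultdict(float)   # (feature, label) -> sum of weights over steps
        self.averaged = {}                 # filled in by train()
        self.steps = 0

    def predict(self, features, averaged=True):
        # Use the averaged weights once they exist, otherwise the current ones.
        w = self.averaged if averaged and self.averaged else self.weights
        return max(self.labels,
                   key=lambda y: sum(w.get((f, y), 0.0) for f in features))

    def train(self, examples, epochs=5):
        for _ in range(epochs):
            for features, gold in examples:
                guess = self.predict(features, averaged=False)
                if guess != gold:
                    # Standard perceptron update: promote gold, demote the guess.
                    for f in features:
                        self.weights[(f, gold)] += 1.0
                        self.weights[(f, guess)] -= 1.0
                # Accumulate the current weights after every example.
                for key, value in self.weights.items():
                    self.totals[key] += value
                self.steps += 1
        self.averaged = {k: v / self.steps for k, v in self.totals.items()}


# Hypothetical training data: one known verb, two arguments.
train_examples = [
    (["verb=chase", "npat=1st_of_2", "vpos=before"], "A0"),
    (["verb=chase", "npat=2nd_of_2", "vpos=after"], "A1"),
]
model = AveragedPerceptron(["A0", "A1"])
model.train(train_examples)

# Novel-verb test sentence: "The boy krads the girl".
pred_first = model.predict(["verb=krads", "npat=1st_of_2", "vpos=before"])
pred_second = model.predict(["verb=krads", "npat=2nd_of_2", "vpos=after"])
```

Because the verb feature is unseen at test time, the prediction here rests entirely on the position-based features, which is the situation the novel-verb test is meant to probe.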
Experiment 1: Features
• Most frequent label
• Lexical features
• Noun Pattern
• Verb Position

Example: “The girl chases the boy”
girl (A0): Lexical chase-girl; Noun Pattern 1st of 2; Verb Position Before
boy (A1): Lexical chase-boy; Noun Pattern 2nd of 2; Verb Position After
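As a rough illustration of the features listed on this slide, the sketch below derives the lexical, Noun Pattern, and Verb Position features for each noun in a sentence. The function names, the dictionary layout, and the `verb_lemma` parameter are assumptions for this example, not the BabySRL code.

```python
def ordinal(n):
    """Return 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th', ..."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def extract_features(tokens, noun_positions, verb_position, verb_lemma=None):
    """Build one feature dict per noun, in left-to-right order."""
    verb = verb_lemma if verb_lemma is not None else tokens[verb_position]
    total = len(noun_positions)
    features = []
    for rank, pos in enumerate(sorted(noun_positions), start=1):
        features.append({
            "noun": tokens[pos],
            "lex": f"{verb}-{tokens[pos]}",                          # lexical feature
            "npat": f"{ordinal(rank)} of {total}",                   # Noun Pattern
            "vpos": "before" if pos < verb_position else "after",    # Verb Position
        })
    return features


tokens = ["the", "girl", "chases", "the", "boy"]
feats = extract_features(tokens, noun_positions=[1, 4], verb_position=2,
                         verb_lemma="chase")
```

Note that Noun Pattern depends only on the nouns in the sentence, while Verb Position requires knowing which word is the verb.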
Experiment 1: Results
[Figure: bar chart, y-axis 0 to 100. Conditions: Most Freq., Lex, +NPat, +VPos, +NPat & VPos. Legend: A0A1, A0A2, A0A0, A1A1. Annotation on the final build: “Predicate knowledge”.]
Multiple predicates
“Remember how we play the surprise game?” (label shown: A1)
“Remember how we play the surprise game?” (labels shown: A0, A1)

          # sent    %
1 verb    10,356    69.86
2 verbs    3,614    24.38
Effect of multiple predicates (Noun Pattern)
[Figure: bar chart, y-axis 0 to 100. Conditions: all, first, last. Legend: A0, A1, A2.]
Syntactic surface the same, e.g. NPat: 4th out of 5
Experiment 1: Supervised learning
Given perfect feedback, do simple, bottom-level features capture anything useful about semantic roles/verb preferences?
Yes, but predicate knowledge is crucial
Experiment 2: Unsupervised learning
Can we predict arguments/predicates using distributional clusters and a few seed nouns?
Syntactic Bootstrapping via Structure-Mapping [Gleitman, 1990; Fisher et al. 2010]
Experiment 2: Unsupervised learning
• HMM over 2.2M tokens (CHILDES)
• 80 induced clusters, list of function words
• List of seed nouns [Dale and Fenson, 1996]
• Noun identification: “Cluster contains more than k seed nouns”
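The noun identification rule quoted above (“Cluster contains more than k seed nouns”) can be sketched as follows. For simplicity this assumes each word type maps to a single HMM state, which is a simplification of token-level HMM decoding; the function name, the word-to-state table, and the seed list are invented toy values, not the talk's data.

```python
from collections import defaultdict


def find_noun_states(word_to_state, seed_nouns, k):
    """Return the states (clusters) that contain more than k distinct seed nouns."""
    seeds_per_state = defaultdict(set)
    for word, state in word_to_state.items():
        if word in seed_nouns:
            seeds_per_state[state].add(word)
    return {state for state, seeds in seeds_per_state.items() if len(seeds) > k}


# Toy word -> state table (hypothetical values).
word_to_state = {
    "she": 45, "mommy": 45,
    "krads": 51,
    "a": 19,
    "red": 60,
    "truck": 73, "ball": 73, "dog": 73,
}
seed_nouns = {"mommy", "ball", "dog"}  # hypothetical seed list

noun_states_k0 = find_noun_states(word_to_state, seed_nouns, k=0)
noun_states_k1 = find_noun_states(word_to_state, seed_nouns, k=1)
```

Raising the threshold k makes the rule more conservative: in this toy table, state 45 holds only one seed noun and stops counting as a noun cluster once k is 1.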
Experiment 2: Verb Identification
Word       She   krads   a    red   truck
HMM        45    51      19   60    73
N Ident.   N                        N
Funct.                   F
[Figure: histograms over 0 args, 1 arg, 2 args, 3 args for HMM states 51 and 60, y-axis 0 to 60.]
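One way to read the histograms on this slide is as per-state statistics of how many identified nouns a state tends to co-occur with. The sketch below assumes the verb is chosen as the non-noun, non-function word whose state most often appears with the number of nouns found in the sentence. That selection rule, the function name, and the counts in `arg_counts` are assumptions for illustration, not figures from the talk.

```python
def identify_verb(states, noun_states, function_states, arg_counts):
    """Return the index of the predicted verb, or None for a null prediction."""
    n_args = sum(1 for s in states if s in noun_states)
    best_index, best_prob = None, 0.0
    for i, state in enumerate(states):
        if state in noun_states or state in function_states:
            continue  # nouns and function words are not verb candidates
        counts = arg_counts.get(state, {})
        total = sum(counts.values())
        # Relative frequency of this state occurring with n_args nouns.
        prob = counts.get(n_args, 0) / total if total else 0.0
        if prob > best_prob:
            best_index, best_prob = i, prob
    return best_index


# "She krads a red truck" with the state sequence shown on the slide.
states = [45, 51, 19, 60, 73]
noun_states = {45, 73}
function_states = {19}
# Hypothetical argument-count histograms for the two candidate states.
arg_counts = {
    51: {1: 20, 2: 60, 3: 5},
    60: {0: 10, 1: 50, 2: 15},
}

verb_index = identify_verb(states, noun_states, function_states, arg_counts)
no_stats_index = identify_verb(states, noun_states, function_states, {})
```

Returning `None` when no candidate has any supporting counts is one possible way to represent a null prediction.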
Experiment 2: Results
[Figure: line plot, y-axis 0 to 1, x-axis ticks 1, 25, 49, 73. Series: arg-F, verb-F, verbRand-F.]
Experiment 2: Parameters
• Random/frequent seed noun selection
• Variants + plurals of seed nouns
• Verb/predicate evaluation
• Multiple predicates
• Seed noun threshold k
• Null predictions
• Function words
[Figure: line plot, y-axis 0 to 1, x-axis ticks 1, 25, 49, 73. Series: verb, verb FREQ, verbRand, verbRand FREQ.]
[Figure: bar chart at 24 seed nouns, y-axis 0.5 to 1. Conditions: Freq, Freq + Var. Series: verb, verbRand.]
[Figure: bar chart at 24 seed nouns, y-axis 0.5 to 1. Conditions: Relaxed, Strict (all), Strict (first). Series: verb, verbRand.]
Experiment 2: Unsupervised learning
Can we predict arguments/predicates using distributional clusters and a few seed nouns?
Yes, with as few as 24 seed nouns
(need to consider multiple predicates)
Conclusions
• BabySRL model of language acquisition
  • Evidence for syntactic bootstrapping
• Exploration of assumptions
  • Data representation
  • Evaluation
  • Psycholinguistic validity
Future Directions
• BabySRL from scratch [Connor et al. 2012]
• Beyond single predicates
  • Multiple verbs
  • Prepositions
• Relaxing perfect feedback (scene ambiguity)
  • Superset
  • Bootstrapped Animacy
Thanks
