Christos Christodoulopoulos, Cynthia Fisher and Dan Roth
Exploring the assumptions of language acquisition models
Midwest Speech & Language Days 2015

Models of language acquisition
“The girl chases the boy”
“The boy runs”

Semantic Role Labeling
PropBank corpus [Palmer et al. 2005]
Core arguments: A0 - Agent, A1 - Patient, A2 - Recipient, …
Modifiers: Locative, Temporal, Manner, …
“The girl chases the boy”: A0 pred A1

BabySRL [Connor et al. 2008; 2010]
BabySRL corpus
Adam, Eve, Sarah [Brown, 1973]
Adult utterances (cleaned up)
Focus on verb predicates
1 verb, 2 args (24% of sentences)
“The girl chases the boy”: A0 pred A1

Experiment 1: Supervised learning
Given perfect feedback, do simple, bottom-level features capture anything useful about semantic roles/verb preferences?
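The A0 / pred / A1 annotation on the BabySRL slide can be pictured as a small data structure. The sketch below is illustrative only: the `Utterance` class, the `one_verb_two_args` helper and the two-sentence toy corpus are hypothetical, the A0 label on “The boy runs” is an assumption, and "1 verb, 2 args" is read here as exactly one predicate with exactly two labelled arguments, which the slides do not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Utterance:
    tokens: List[str]
    roles: Dict[int, str]  # token index -> "pred" or a core label such as "A0"

    def n_predicates(self) -> int:
        return sum(1 for r in self.roles.values() if r == "pred")

    def n_arguments(self) -> int:
        return sum(1 for r in self.roles.values() if r != "pred")


def one_verb_two_args(u: Utterance) -> bool:
    # Assumed reading of "1 verb, 2 args": one predicate, two arguments.
    return u.n_predicates() == 1 and u.n_arguments() == 2


# Toy corpus built from the two example sentences on the slides.
corpus = [
    Utterance("The girl chases the boy".split(), {1: "A0", 2: "pred", 4: "A1"}),
    Utterance("The boy runs".split(), {1: "A0", 2: "pred"}),  # A0 is assumed
]

# Share of utterances matching the filter (toy value, not the 24% on the slide).
share = sum(one_verb_two_args(u) for u in corpus) / len(corpus)
print(share)
```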
Experiment 1: Supervised learning
• Supervised classifier (averaged perceptron)
• LBJava [Rizzolo and Roth, 2010]
• Train on BabySRL corpus
• Test on novel-verb sentences
Intransitive: “The bunny krads”
Transitive: “The boy krads the girl”
Ditransitive: “The girl krads the boy a bunny”

Experiment 1: Features
Example: “The girl chases the boy” (girl: A0, boy: A1)
• Most frequent label
• Lexical features: chase-girl, chase-boy
• Noun Pattern: 1st of 2, 2nd of 2
• Verb Position: Before, After

Experiment 1: Results
[Figure: bar chart, y-axis 0 to 100; conditions Most Freq., Lex, +NPat, +VPos, +NPat & VPos; series A0A1, A0A2, A0A0, A1A1; callout: Predicate knowledge]
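The Experiment 1 features (lexical, Noun Pattern, Verb Position) and the averaged perceptron can be sketched roughly as follows. This is a minimal illustration, not the authors' LBJava implementation: the feature string formats, the `AveragedPerceptron` class, the two training sentences and the hand-supplied noun and verb indices are all assumptions. The lexical feature uses the surface verb form rather than the lemma shown on the slide.

```python
from collections import defaultdict


def extract_features(tokens, noun_idxs, verb_idx):
    """Return one feature list per noun: lexical, Noun Pattern, Verb Position."""
    verb = tokens[verb_idx]
    total = len(noun_idxs)
    feats = []
    for rank, i in enumerate(noun_idxs, start=1):
        feats.append([
            f"lex={verb}-{tokens[i]}",                        # verb-noun pair
            f"npat={rank}_of_{total}",                        # e.g. 1_of_2
            f"vpos={'before' if i < verb_idx else 'after'}",  # noun relative to verb
        ])
    return feats


class AveragedPerceptron:
    def __init__(self, labels):
        self.labels = list(labels)
        self.w = defaultdict(float)      # current weights keyed by (feature, label)
        self.total = defaultdict(float)  # running sum of weights over all steps
        self.steps = 0

    def score(self, feats, label, weights):
        return sum(weights.get((f, label), 0.0) for f in feats)

    def predict(self, feats, weights=None):
        # Ties resolve to the first label in self.labels.
        weights = self.w if weights is None else weights
        return max(self.labels, key=lambda y: self.score(feats, y, weights))

    def train(self, data, epochs=5):
        for _ in range(epochs):
            for feats, gold in data:
                guess = self.predict(feats)
                if guess != gold:
                    for f in feats:
                        self.w[(f, gold)] += 1.0
                        self.w[(f, guess)] -= 1.0
                self.steps += 1
                for key, value in self.w.items():
                    self.total[key] += value
        # Averaged weights: mean of the weight vector over all training steps.
        return {k: v / self.steps for k, v in self.total.items()}


# Toy training set (hypothetical); noun and verb positions are given by hand.
train_sents = [
    ("the girl chases the boy".split(), [1, 4], 2, ["A0", "A1"]),
    ("the boy runs".split(), [1], 2, ["A0"]),
]
data = [(f, y) for toks, nouns, verb, gold in train_sents
        for f, y in zip(extract_features(toks, nouns, verb), gold)]

model = AveragedPerceptron(["A0", "A1"])
avg = model.train(data)

# Novel-verb test sentence from the slides.
novel = extract_features("the boy krads the girl".split(), [1, 4], 2)
predicted = [model.predict(f, avg) for f in novel]
print(predicted)
```

Because "krads" never occurs in training, the lexical features carry no weight for the test sentence, so any prediction has to come from the Noun Pattern and Verb Position features (or from the tie-breaking default).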
Multiple predicates
“Remember how we play the surprise game?”   A1
“Remember how we play the surprise game?”   A0 A1

            # sent    %
1 verb      10,356    69.86
2 verbs     3,614     24.38

Effect of multiple predicates (Noun Pattern)
[Figure: bar chart, y-axis 0 to 100; conditions all, first, last; series A0, A1, A2; callout: Syntactic surface the same, e.g. NPat: 4th out of 5]

Experiment 1: Supervised learning
Given perfect feedback, do simple, bottom-level features capture anything useful about semantic roles/verb preferences?
Yes, but predicate knowledge is crucial

Experiment 2: Unsupervised learning
Can we predict arguments/predicates using distributional clusters and a few seed nouns?
Syntactic Bootstrapping via Structure-Mapping [Gleitman, 1990; Fisher et al. 2010]

Experiment 2: Unsupervised learning
• HMM over 2.2M tokens (CHILDES)
• 80 induced clusters, list of function words
• List of seed nouns [Dale and Fenson, 1996]
• Noun identification: “Cluster contains more than k seed nouns”

Experiment 2: Verb Identification
            She    krads    a     red    truck
HMM         45     51       19    60     73
N Ident.    N                            N
Funct.                      F
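The noun identification rule from the Experiment 2 setup (“Cluster contains more than k seed nouns”) can be sketched as below. Only the cluster IDs for "She krads a red truck" come from the slide; the seed nouns, their cluster assignments, the function word list, the threshold value and both function names are made up for illustration.

```python
from collections import Counter


def noun_clusters(word_to_cluster, seed_nouns, k):
    """Clusters holding more than k seed nouns are treated as noun clusters."""
    counts = Counter(word_to_cluster[w] for w in seed_nouns if w in word_to_cluster)
    return {c for c, n in counts.items() if n > k}


def tag_sentence(tokens, word_to_cluster, nouns, function_words):
    """Tag each token as F (function word), N (noun cluster) or '-' (other)."""
    tags = []
    for tok in tokens:
        if tok in function_words:
            tags.append("F")
        elif word_to_cluster.get(tok) in nouns:
            tags.append("N")
        else:
            tags.append("-")
    return tags


# Cluster IDs for the example sentence are from the slide; the rest is toy data.
word_to_cluster = {
    "she": 45, "krads": 51, "a": 19, "red": 60, "truck": 73,
    "mommy": 45, "daddy": 45,  # hypothetical seed nouns placed in cluster 45
    "ball": 73, "dog": 73,     # hypothetical seed nouns placed in cluster 73
}
seed_nouns = ["mommy", "daddy", "ball", "dog"]  # hypothetical seed list
function_words = {"a", "the"}                   # hypothetical function word list

nouns = noun_clusters(word_to_cluster, seed_nouns, k=1)
tags = tag_sentence("she krads a red truck".split(), word_to_cluster,
                    nouns, function_words)
print(nouns, tags)
```

With these toy values the untagged content words ("krads" and "red") are the remaining candidates for the verb, which is the situation the Verb Identification slide illustrates.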
[Figure: for “She krads a red truck”, histogram over 0 args, 1 arg, 2 args, 3 args for the candidate HMM clusters 51 (krads) and 60 (red); y-axis 0 to 60]

Experiment 2: Results
[Figure: line chart, y-axis 0 to 1; x-axis ticks 1, 25, 49, 73; series arg-F, verb-F, verbRand-F]

Experiment 2: Parameters
• Random/frequent seed noun selection
• Variants + plurals of seed nouns
• Verb/predicate evaluation
• Multiple predicates
• Seed noun threshold k
• Null predictions
• Function words

[Figure: line chart, y-axis 0 to 1; x-axis ticks 1, 25, 49, 73; series verb, verb FREQ, verbRand, verbRand FREQ]
[Figure: chart @24 seed nouns, y-axis 0.5 to 1; conditions Freq, Freq + Var; series verb, verbRand]
[Figure: chart @24 seed nouns, y-axis 0.5 to 1; conditions Relaxed, Strict (all), Strict (first); series verb, verbRand]

Experiment 2: Unsupervised learning
Can we predict arguments/predicates using distributional clusters and a few seed nouns?
Yes, with as few as 24 seed nouns
(need to consider multiple predicates)

Conclusions
• BabySRL model of language acquisition
• Evidence for syntactic bootstrapping
• Exploration of assumptions
  • Data representation
  • Evaluation
  • Psycholinguistic validity

Future Directions
• BabySRL from scratch [Connor et al. 2012]
• Beyond single predicates
  • Multiple verbs
  • Prepositions
• Relaxing perfect feedback (scene ambiguity)
  • Superset
  • Bootstrapped Animacy

Thanks