Morphosyntactic production in a head-marking language
Order, agreement, and optional morphology in Yucatec Maya
Lindsay K. Butler, U. Rochester
Elisabeth Norcliffe, MPI for Psycholinguistics
Jürgen Bohnemeyer, U. at Buffalo, SUNY
T. Florian Jaeger, U. Rochester
Acknowledgments: It takes a team
• Space & participant recruitment:
– Carlos Pérez, Director of UNO
– Marta Beatriz Poot Nahuat
– Ángel Viriglio Salazar
– Michal Brody
– Betsy Kraft
• Programming of experiments:
– Andrew Watts (U. Rochester)
– Carlos Gomez Gallo (post-doc, Miami)
• Experiment and travel logistics:
– Carlos Gomez Gallo (post-doc, Miami)
– Ashlee Shinn (Univ. at Buffalo)
– Katrina Furth (grad, Boston U)
• Transcription and annotation:
– Samuel Canul Yah (UNO)
– José Cano Sosaya (UNO)
• Stimulus preparation:
– Yucatec recordings: Samuel Canul Yah (UNO), Gerónimo Can Tec (UNO), Serapio Canul Dzib (UNO)
– Video creation: Katrina Furth (grad, Boston U); Cassandra Jacobs, Irene Minkina, Andy Wood (undergrads, Rochester)
• Funding:
– NSF Grant BCS-0844472 to JB and TFJ
– Wilmot Award, Alfred P. Sloan Fellowship to TFJ
– Dissertation Improvement Grant from SBSRI, Univ. of Arizona to LKB
– Mellon/ACLS dissertation completion fellowship awarded to EJN
[2]
Why study the processing of “exotic” languages?
• Psycholinguistics and the empiricist turn in the
social/behavioral sciences
• Moving away from data exclusively from
• College students
• who are members of the WEIRDest societies
– “western, educated, industrialized, rich, and democratic”
[Henrich et al 2010]
• and speak mostly English or closely related languages
[3]
Sources of potential language-specific effects
• Variation
– configurationality
– constituent order
– head-marking vs. dependent-marking
– argument ellipsis
– presence and organization
of grammatical relations
– voice and alignment systems
– other functional categories
– lexical categories …
Dryer, Matthew S., 2011. Order of Object and Verb. In: Dryer, Matthew S. & Haspelmath, Martin (eds.) The World Atlas of Language Structures Online. Munich: Max Planck Digital Library, feature 83A. Available online at http://wals.info/feature/83A. Accessed on 2011-12-20.
→ Short-before-long reversed for head-final languages [Hawkins 2004, 2007; Yamashita and Chang 2001; Chang 2001; Choi 1997]
[4]
Roadmap
• Background
– Part 1a: Challenges and methods
– Part 1b: Introducing Yucatec Maya
• Example studies
– Part 2: Redundancy and reduction
– Part 3: Accessibility-based production
– Part 4: Optional plural marking
• Part 5: Revisiting methods and conclusions
[5]
Part 1a
Challenges and methods
[6]
Challenges and methodological issues
• High uncertainty about linguistic structures of target language
• Methodological and cultural issues
– Literacy
– Computer literacy
– Attention span
– Interpretation of the task (more practice trials, more instructions)
– Different norms about privacy, personal information
• Logistical issues
– Maximizing output per visit
– Participant recruitment
[7]
Part 1b
Yucatec Maya
[8]
Yucatec basics
• Mayan, Yucatecan branch
• 759,000 speakers age 5+ in Mexico in 2005
– http://www.inegi.gob.mx
• Polysynthetic
– but relatively rigid order
• Verb-initial, VOS
– but left-dislocation pervasive in discourse
• Typologically different from English
Figure 1. Approximate geographic area where Yucatec is spoken (map labels: our study site, UNO Valladolid; JB’s field site, Yaxley)
[9]
Yucatec: Our study sites
• La Universidad de Oriente in
Valladolid, Yucatán, Mexico
– Sound-proof recording room
– Computer-literate
participants
– Familiar with testing
paradigms
• Other field sites
– Villages surrounding Valladolid
– Yaxley
[10]
Part 2
Redundancy and Reduction
(Norcliffe 2009, Jaeger & Norcliffe, in prep)
[11]
Redundancy and grammatical choice
• The language production system exhibits a bias to reduce the
expected (contextually redundant), e.g.:
– Phonetic or phonological reduction is more common for contextually
expected instances of words
[e.g. AylettTurk04,06; BellETAL03,09; GahlGarnsey04; TilyETAL09]
– Morphological contraction of negation or auxiliaries is more common
when the contractible element is contextually expected [FrankJaeger08;
Melnick10; cf. BybeeScheibman99]
– Optional function words are likely to be omitted if the phrase they
introduce is contextually expected
[Jaeger10,11; LevyJaeger07; WasowETAL11]
– Optional arguments are more likely to be omitted if their meaning is
more expected given the verb [Resnik96]
[12]
Theoretical relevance
• Findings like these have been taken by some as evidence that the
mechanisms underlying language production are organized to
facilitate robust communication
[Jaeger 06,10; LevyJaeger07; see also AylettTurk04; Fenk-Oczlon01; Lindblom90;
vanSon&vanSanten05 and related ideas: e.g. Zipf49, Givon92]
• Also offers potential account of:
– Differential case-marking [FedzechkinaETAL11; KurumadaJaeger12]
– Pronominalization [Arnold98; ArnoldGriffin07; TilyPiantadosi09]
– Word order alternations [MauritsETAL10]
– Derivation of Zipf’s law [PiantadosiETAL11]
• But, except for some work on phonetic reduction, all evidence
comes from English.
[13]
Assessing the effects of redundancy in Yucatec sentence production
• We present a first step as to how to explore this question for a
language like Yucatec.
• The phenomenon: Optional morphology in Yucatec Maya
relative clauses [Bricker78, Gutiérrez-BravoMonforte09, Norcliffe09]
a. Le turista ku-t’aan-ik maya-o’
   DEF tourist ASP.A3-speak-INC maya-D2
b. Le turista t’aan-ik maya-o’ [‘Agent-Focus voice’]
   DEF tourist speak-INC maya-D2
“The tourist who speaks Maya”
[14]
Assessing the effects of redundancy in Yucatec sentence production
• Hypothesis: choice of morphological form is influenced by the
expectedness of the relative clause
• Point of departure: parallels with English omission phenomena
– Optional that in English object relative clauses
The cake that he baked
The cake he baked
• That omission correlates with expectedness of the relative clause
[Wasow, Jaeger, Orr, 2011]
[15]
Optional that in English relative clauses
• For pragmatic reasons, some properties of noun phrases
lead to increased probability of a relative clause …
This is the thickest book [that I had ever seen …]
(DT ADJ N)
… which correlates with increased omission of that
Figure: Relative Clause Rate and that Rate by Determiner. x-axis: expectedness of RC given DT (NPs with RCs, 0% to 12%); y-axis: omission of that (RCs without that, 0% to 100%); adjusted r² = .91
[Wasow, Jaeger, Orr, 2011]
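The two quantities plotted on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the determiner labels and all counts in `toy_counts` are invented for the example and are not the Wasow, Jaeger, Orr data, and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical counts per determiner (invented for illustration only):
# (all NPs, NPs containing an RC, RCs produced without "that")
toy_counts = {
    "det_low": (1000, 20, 6),
    "det_mid": (1000, 60, 33),
    "det_high": (1000, 100, 75),
}

def rc_expectedness(nps, nps_with_rc):
    """Estimate P(RC | determiner) as the share of NPs that contain an RC."""
    return nps_with_rc / nps

def that_omission_rate(nps_with_rc, rcs_without_that):
    """Share of relative clauses produced without 'that'."""
    return rcs_without_that / nps_with_rc

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

expectedness = [rc_expectedness(n, rc) for n, rc, _ in toy_counts.values()]
omission = [that_omission_rate(rc, bare) for _, rc, bare in toy_counts.values()]
r = pearson_r(expectedness, omission)
```

With toy counts constructed this way, determiners that make an RC more expected also show more that-omission, which is the pattern the slide reports for the real corpus data.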
[16]
Language-specific morphosyntactic cue
• Yucatec boundary morphology is a cue to the likelihood of an
upcoming RC
– Definite NPs require a NP-final deictic particle (–o’)
Xmariae’ tu-che’ehtah le turista ku t’aanik maya-o’ (absence of -o’ directly after turista)
“Maria laughed at the tourist who speaks Maya”
• Therefore, absence of the –o’ particle directly after the noun is a strong cue
that the NP contains post-nominal modification (including relative clauses)
[17]
Prediction
• The distribution of Yucatec boundary morphology increases the expectedness of relative clauses after definite NPs, compared to indefinite NPs …
• … speakers should prefer to use reduced verb forms after definite NPs.
[18]
Method
Spoken sentence recall
[19]
Result
• Fewer full RC verb forms (with ku) if modified NP is definite and lacks the -o’ particle (difference marked * on the slide).
Le turista ku-t’aan-ik maya-o’
(example labels: modified NP, relative clause, full RC verb, final particle)
[20]
Conclusion
• Yucatec speakers prefer morphological reduction where
RCs are highly expected
→ An effect of a preference for communicative robustness seems to show up in Yucatec as in English
• This generalization only becomes apparent once the language-specific morpho-syntactic cue is taken into account [cf. Hawkins04,07,11]
[21]
Part 3
Accessibility
(Butler, Jaeger, Bohnemeyer, Gómez Gallo, Furth, in prep)
[22]
Accessibility-based production
• Crosslinguistically, conceptually accessible referents (e.g. more animate ones) tend to be ordered early and aligned with prominent grammatical functions, e.g. subject
[BockWarren85, Branigan et al. 2007, Tanaka et al. 2007]
– Does this effect hold across languages, e.g. in head-marking languages?
“The swing hit the scooter”
“The man was hit by the swing”
[from Prat-Sala & Branigan 2000]
[23]
Experiment
• Video description task: Human and animal agents and
undergoers
[24]
Experiment (cont’d)
• Video description task: Human and inanimate “agents”
and undergoers
[25]
Results
• Animacy significantly affected constituent order (human patients more likely to result in OVS) (χ²(1) = 17.1, p < 0.0001)
• Animacy of the patient, however, did not significantly affect voice choice (active vs. passive)
“The man, the truck pulled him”
“The man was chased by the dog”
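As a rough check on the reported statistic, the p-value for a chi-square value with one degree of freedom can be computed directly, as in the sketch below. The slide does not say how the statistic was obtained, so the 2x2 contingency-table function and the counts in `toy_table` are assumptions made purely for illustration (the counts are invented, not the experimental data).

```python
from math import erfc, sqrt

def chi2_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For df = 1 the statistic is a squared standard normal, hence erfc.
    return erfc(sqrt(chi2 / 2.0))

def chi2_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# The value reported on the slide: chi-square(1) = 17.1
reported_p = chi2_p_value_df1(17.1)

# Hypothetical counts: rows = patient animacy (human, non-human),
# columns = constituent order (OVS, other)
toy_table = [[30, 10], [10, 30]]
toy_stat = chi2_2x2(toy_table)
toy_p = chi2_p_value_df1(toy_stat)
```

Here `reported_p` falls below 0.0001, consistent with the p-value given on the slide.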
[26]
Additional results
• Universal effects of animacy on constituent order
• Variation in size of the effect and language-particulars
Animacy and order in Yucatec
Animacy and order in Spanish
[27]
Part 4
Plural Marking
(Butler 2011)
[28]
Language-specific morphosyntax
• The optional nominal plural marker in Yucatec, –o’ob, is
right-adjoined to the DP (occupying a high position and
occurring linearly late in the phrase) [Butler11]
– Predicted by the syntax of plural marking [Wiltschko08]
a. The girl[SG] and the woman[SG]
b. The girls[PL] and the woman[SG]
c. The girl[SG] and the women[PL]
d. The girls[PL] and the women[PL]
[29]
Experiment Design
• Is there experimental evidence for the DP-adjoined plural
hypothesis in Yucatec Maya?
• Translation task with conjoined noun phrases
– Cond. 1: N1-SG and N2-SG Verb (intransitive)
– Cond. 2: N1-SG and N2-PL Verb (intransitive)
– Cond. 3: N1-PL and N2-SG Verb (intransitive)
– Cond. 4: N1-PL and N2-PL Verb (intransitive)
• The DP-adjoined plural hypothesis predicts N1-Ø N2-PL responses in Yucatec to be possible in Cond. 3
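The four stimulus conditions form a 2x2 crossing of the number of N1 and the number of N2. A minimal sketch of that design, assuming the conditions are numbered in the order listed on the slide (the function and field names are hypothetical):

```python
from itertools import product

NUMBERS = ("SG", "PL")

def design_conditions():
    """Cross the number of N1 with the number of N2, in slide order."""
    return [
        {"cond": i, "n1": n1, "n2": n2}
        for i, (n1, n2) in enumerate(product(NUMBERS, NUMBERS), start=1)
    ]

def stimulus_frame(condition):
    """Render a condition as the schematic frame used on the slide."""
    return f"N1-{condition['n1']} and N2-{condition['n2']} Verb (intransitive)"

conditions = design_conditions()
frames = [stimulus_frame(c) for c in conditions]

# Prediction stated on the slide: under the DP-adjoined plural hypothesis,
# an N1-Ø N2-PL response is possible for the Cond. 3 stimulus.
predicted_possible = {3: "N1-Ø N2-PL"}
```

Listing the frames this way makes it easy to see that Cond. 3 is the one where the stimulus marks plural on N1 only, while the predicted response marks it only at the right edge of the conjoined phrase.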
[30]
Results
• Phrase-final, DP-adjoined plural hypothesis predicts
Figure: Yucatec responses by Spanish stimulus conditions
[31]
Results
• Responses ruled out by “underspecification”
Figure: Yucatec responses by Spanish stimulus conditions
[32]
Point of departure
• Morphosyntactic priming in translation: Plural marking is obligatory in Spanish and optional in Yucatec, so the task has an inherent potential for crosslinguistic priming
Spanish stimulus: Las muchachas[PL] y las mujeres[PL] …
“The girls and the women …”
Yucatec response: Le x-ch’úupal-o’ob[PL] yéetel le ko’olel-o’ob[PL]-o’
“The girls and the women”
[33]
Translation vs. picture description
• Use of plural marking in singular/one, two, and
plural/many conditions compared
• TRANSLATION TASK: Singular, “Two”, Plural
“The baby is crying”, “Two babies are crying”, “The babies are crying”
• PICTURE DESCRIPTION TASK: One, Two, Many (seven)
[34]
Translation vs. picture description results
Figure panels: Plural use in translation task; Plural use in picture description task (comparisons marked * on the slide)
[35]
Results
• Accounted for by “underspecification” and priming
• Remaining data unambiguously accounted for by DP-adjoined, phrase-final morphosyntax
Figure: Yucatec responses by Spanish stimulus conditions
[36]
Conclusion
• Some responses only accounted for by the DP-adjoined phrase-final plural hypothesis
• Production results
informing linguistic theory
[37]
Part 5
Revisiting methods & Conclusions
[38]
Methods revisited
• Video description tasks
– Advantages: Elicits unscripted speech
– Disadvantages: Limited to messages that can be unambiguously depicted
• Recall tasks
– Advantages: Especially useful for messages that are not easily depictable
– Disadvantages: Unfamiliar task
• Translation tasks
– Advantages: More familiar (in bilingual communities)
– Disadvantages: Priming from stimulus language
[39]
Conclusions
• Despite inherent challenges to field-based
psycholinguistics, the crosslinguistic perspective provided
by typologically diverse languages is essential to research
on human language processing
• Language-specific effects can explain results that are
otherwise counter to known effects [cf. Hawkins04,07,11]
• Quantitative production data to address structural
differences informed by linguistic theory
[40]
Selected references
Arnold, J. E. (1998). Reference Form and Discourse Patterns. Dissertation, Stanford University
Arnold, J.E., & Griffin, Z. (2007). The Effect of Additional Characters on Choice of Referring Expression: Everyone Competes. Journal of Memory and Language
Aylett, M. P., & Turk, A. (2004). The Smooth Signal Redundancy Hypothesis: A Functional Explanation for Relationships between Redundancy, Prosodic
Prominence, and Duration in Spontaneous Speech. Language and Speech, 47(1), 31-56.
Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., & Gildea, D. (2003). Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America, 113(2), 1001-1024.
Bell, A., Brenier, J. M., Gregory, M., Girand, C., & Jurafsky, D. (2009). Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language, 60(1), 92-111.
Bock, J. K. (1986). Syntactic persistence in language production. Cognitive Psychology, 18, 355-387.
Bock, J. K., & Warren, R. (1985). Conceptual accessibility and syntactic structure in sentence formulation. Cognition, 21.
Branigan, H. P., Pickering, M. J., & Tanaka, M. (2008). Contributions of animacy to grammatical function assignment and word order during production. Lingua, 118, 172-189.
Bresnan, J., Cueni, A., Nikitina, T., & Baayen, H. (2007). Predicting the Dative Alternation. In G. Bouma, I. Kraemer & J. Zwarts (Eds.), Cognitive Foundations of Interpretation (pp. 69-94). Amsterdam: Royal Netherlands Academy of Science.
Bricker, V. R. 1978. The source of the ergative split in Yucatec Maya. Journal of Mayan Linguistics, 2, pp. 83–127.
Butler, L. K. 2011. The morphosyntax and processing of number marking in Yucatec Maya. Ph.D. thesis. University of Arizona.
Bybee, Joan and Joanne Scheibman. 1999. The effect of usage on degrees of constituency: the reduction of don't in English. Linguistics 37-4. 575-596.
Christianson, K. and F. Ferreira. 2005. Conceptual accessibility and sentence production in a free word order language. Cognition 98: 105-135.
Fedzechkina, M., Jaeger, T. F. and E. Newport. Functional biases in language learning: Evidence from word order and case-marking interaction. The 33rd Annual
Meeting of the Cognitive Science Society, Boston, July 2011.
Fenk-Oczlon, G. 2001. Familiarity, Information Flow, and Linguistic Form. In: J. Bybee and P. Hopper (eds.) Frequency and the Emergence of Linguistic Structure,
431-448. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Ferreira, V. and H. Yoshita. 2003. Given-new ordering effects on the production of scrambled sentences in Japanese. Journal of Psycholinguistic Research 32: 6, 669-692.
Frank, A., & Jaeger, T. F. 2008. Speaking Rationally: Uniform Information Density as an Optimal Strategy for Language Production. The 30th Annual Meeting of the Cognitive Science Society (CogSci08) (pp. 933-938).
Gahl, S., & Garnsey, S. M. (2004). Knowledge of grammar, knowledge of usage: syntactic probabilities affect pronunciation variation. Language, 80(4), 748-775.
[42]
Selected references
Gutiérrez-Bravo, R. & J. Monforte. 2009. 'Focus, agent focus and relative clauses in Yucatec Maya', in New Perspectives on Mayan Linguistics, H. Avelino, J.
Coon, & E. Norcliffe (eds.), MIT Working Papers in Linguistics.
Hawkins, J. A. (2004). Efficiency and Complexity in Grammar. Oxford: Oxford University Press.
Hawkins, J. A. (2007). Processing typology and why psychologists need to know about it. New Ideas in Psychology, 25, 87–107.
Jaeger, T. F. (2006). Redundancy and Syntactic Reduction in Spontaneous Speech. PhD thesis, Stanford University.
Jaeger, T. F. (2010). Redundancy and reduction: Speakers manage syntactic information density. Cognitive Psychology, 61, 23-62.
Jaeger, T. F., & Norcliffe, E. (2009). The cross-linguistic study of sentence production. Language and Linguistics Compass, 3, 866-887
Kurumada, C. and T. F. Jaeger. 2012. Communicatively efficient language production and case-marker omission in Japanese. Paper presented at the 86th
Annual Meeting of the Linguistic Society of America.
Levy, R. and T. F. Jaeger. 2007. Speakers optimize information density through syntactic reduction. Proceedings of the Twentieth Annual Conference on
Neural Information Processing Systems.
Lindblom, B. (1990). Explaining phonetic variation: a sketch of the H&H theory. Speech production and speech modelling, 55, 403-439.
Maurits, L., Perfors, A., & Navarro, D. (2010). Why are some word orders more common than others? A uniform information density account. Adv. in Neural
Information Processing Systems, 23, 1585-1593.
Norcliffe, E. 2009. Head-marking in usage and grammar: A study of variation and change in Yucatec Maya. PhD Thesis, Stanford University, Stanford, CA.
Piantadosi, S. T., H. Tily, and E. Gibson. Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences,
108(9):3526, 2011.
Prat-Sala, M., & Branigan, H. P. 2000. Discourse constraints on syntactic processing in language production: A cross-linguistic study in English and Spanish. Journal of Memory and Language, 42, 168-182.
Tanaka, M., Branigan, H.P., McLean, J.F., & Pickering, M.J. 2011. Conceptual influences on word order and voice in sentence production: Evidence from Japanese. Journal of Memory and Language, 65, 318-330.
Tily, H. and S. T. Piantadosi. 2009. Refer efficiently: Use less informative expressions for more predictable meanings. In Proceedings of the workshop on the production of referring expressions: Bridging the gap between computational and empirical approaches to reference.
Van Son R. and Van Santen, J.P.H. (2005). "Duration and spectral balance of intervocalic consonants: A case for efficient communication", Speech Communication 47, 100-123.
Wasow, T., Jaeger, T. F., & Orr, D. (2011). Lexical Variation in Relativizer Frequency. In H. Wiese & H. Simon (Eds.), Proceedings of the 2005 DGfS workshop “Expecting the unexpected: Exceptions in Grammar” (pp. 175-196). Berlin/New York: De Gruyter Mouton.
Wiltschko, M. 2008. The syntax of plural marking. Natural Language and Linguistic Theory
Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison-Wesley.
[43]