full text - Queen Margaret University
Language Interaction in the Bilingual Acquisition of Sound Structure: A longitudinal study of vowel quality, duration and vocal effort in pre-school children speaking Scottish English and Russian

Olga B. Gordeeva

PhD Thesis, 2005

QMUC SSRC Theses Online, release date: 31.05.2006

© Olga Gordeeva, 2005

This is an electronic (pdf) version of Olga Gordeeva’s PhD, submitted by the author as a pdf file, with this cover sheet added. The definitive printed and bound version of the thesis is available by inter-library loan (including via microfilm or electronically scanned), but reference can be made to this version if it is clearly identified as the “QMUC SSRC Theses Online” version with the appropriate date of release.

LANGUAGE INTERACTION IN THE BILINGUAL ACQUISITION OF SOUND STRUCTURE: A longitudinal study of vowel quality, duration and vocal effort in pre-school children speaking Scottish English and Russian

Olga Gordeeva

A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Speech and Hearing Sciences

Queen Margaret University College
February 2006

Declaration

I confirm that the thesis submitted is my own work and that appropriate credit has been given where reference has been made to the work of others.

Olga Gordeeva
15 February, 2006

Publications from the Thesis

Gordeeva, O., Mennen, I. and Scobbie, J.M. (2003). Vowel duration and spectral balance in Scottish English and Russian. In M.J. Solé, D. Recasens and J. Romero (Eds.). Proceedings of the 15th International Congress of Phonetic Sciences (pp. 3193–3196), Universitat Autònoma de Barcelona.

Acknowledgements

Conducting a PhD has been an enjoyable time for me from start to finish. My deep appreciation goes to Peter, my husband, who encouraged me to pursue this path. All this time he was of invaluable support in many ways, most importantly by being as he is. He helped me out practically by writing the algorithms of noise subtraction and RMS-power analysis.
My bilingual son Maxim has given me great motivation for this research, and he is just such a cute boy. I dedicate this thesis to him.

I am lucky to have had Dr. Ineke Mennen and Dr. Jim Scobbie as my supervisory team. Ineke has been encouraging and motivating from the moment we got in touch. This was crucial to me. Her great knowledge of intonation and of bilingual and second language acquisition has inspired me and affected my views in the thesis. Jim has given me great support throughout, not least with his deep analytic way of thinking, and he infected me with his keen interest in, and insights into, sociophonetic variation. Both of them have been around with a critical piece of advice, and we have been a fantastic team and friends. Thanks for all this!

This PhD would have been impossible without the kind and enduring participation of the two lovely bilingual girls and their parents on many occasions in 2002/2003. Claire Withnell was a great help with the Scottish part of the recordings. Many thanks go to all the children, their parents and the adults who participated in this study.

Thanks to QMUC for the financial contribution during my PhD, and to all members of staff at Speech and Language Sciences for supporting my research. Special thanks go to Steve Cowen for his technical support, and to Robin Lickley for his help with statistical analyses. Thanks to Ben Matthews, Suzanne Fuchs, Alan Wrench, Joanne McCann, Lianne Carroll, Ioulia Grichkovtsova and Natalia Zharkova for nice chats during coffee breaks and practical help. Thanks to Alice Turk and Bert Remijsen for giving good advice. Thanks to Michael Jessen for introducing me to his study. I am grateful to Prof. Jeannine Vereecken and Prof. Voordeckers at the department of Slavic Languages at the University of Ghent (Belgium) for revealing to me the joys of scientific thinking.

Abstract

This PhD thesis contributes new empirical knowledge to the question of what paths bilingual acquisition of sound structure can take in early simultaneous bilinguals. The issues of language differentiation and interaction are considered in their relationship to language input, crosslinguistic structure and longitudinal effects. Two Russian-Scottish English subjects aged between 3;4 and 4;5 were recorded longitudinally. Russian was spoken in their families, and Scottish English in the community (Edinburgh, UK). The family environments were similar, but one subject had received substantially more input in Russian than the other one. We addressed the detail of their production of prominent syllable-nuclear vowels /i ɪ ʉ/ in Scottish English and /i u/ in Russian with regard to their vowel quality, duration and vocal effort. Language differentiation and interaction patterns were derived by accounting for the language mode, and by statistical comparison of the crosslinguistic structures to the speech of monolingual peers (n=7) and adults (n=14). Subjects’ bilingual results revealed both substantial language differentiation and systematic language interaction patterns. The extent of language differentiation and directionality of interaction depended on the amount of language exposure. Its directionality did not necessarily depend on the markedness of the crosslinguistic structures, and could be bi-directional for the same properties. Longitudinally, language differentiation increased, while interaction reduced. The amount of reduction depended on both language input and the structural complexity of the languages, with segmental tense/lax contrast and complex postvocalic vowel duration conditioning showing more persistent language interaction effects. The results confirmed the importance of language input. We showed that in bilingual phonological development language interaction should be considered as a normal but non-obligatory process.
Besides, some structurally complex processes potentially explainable by ‘markedness’ (applied to isolated segments) could rather be explained by lexical and phonotactic factors. v Table of Contents Declaration.....................................................................................................................ii Publications from the Thesis........................................................................................ iii Acknowledgements.......................................................................................................iv Abstract ..........................................................................................................................v Table of Contents..........................................................................................................vi List of Tables ...............................................................................................................xii List of Figures .............................................................................................................xvi List of Equations .........................................................................................................xxi List of Abbreviations and Conventions .....................................................................xxii 1 Background ............................................................................................................1 1.1 Introduction....................................................................................................1 1.2 Important Concepts and Definitions ..............................................................2 1.2.1 Bilinguals and Bilingualism...................................................................2 1.2.2 Language Interaction .............................................................................5 1.3 Bilingual Language Differentiation and Interaction ......................................8 1.3.1 What is it about? 
....................................................................................8 1.3.2 Factors Affecting Language Interaction ..............................................10 1.3.2.1 Language Mode and Pragmatic Awareness.....................................10 1.3.2.2 Language Mixing in the Input..........................................................12 1.3.2.3 Structural differences of the languages in contact ...........................13 1.3.2.3.1 Why should language structure be important?...........................13 1.3.2.3.2 Cross-Language Cue Competition.............................................18 1.3.2.3.3 Markedness ................................................................................20 1.3.2.4 Language dominance .......................................................................22 1.3.2.5 Bilingual bootstrapping....................................................................24 1.4 Summary ......................................................................................................25 2 Crosslinguistic Differences in Sound Structure and Their Acquisition...............26 2.1 Differences in Sound Structure between Scottish English and Russian ......26 2.1.1 Introduction..........................................................................................26 2.1.2 Theoretical Framework for the Research Variables ............................27 2.1.2.1 A Short Sketch of the Research Variables.......................................27 2.1.2.2 ‘Stress-Accent Hypothesis’..............................................................28 2.1.2.3 Stress and Vocal Effort in ‘Stress Accent’ Languages ....................29 2.1.2.4 Acoustic Correlates of Vocal Effort in ‘Stress-Accent’ Languages 31 2.1.2.5 Functional Load ...............................................................................36 2.1.3 Segmental Differences between Scottish English and Russian ...........41 2.1.3.1 Russian vowel system 
......................................................................41 2.1.3.2 Scottish English vowel system.........................................................42 2.1.3.3 Segmental Differences in the Focus of Investigation ......................43 2.1.4 Prosodic Differences between Scottish English and Russian ..............44 2.2 Language Interaction in Bilingual Acquisition of Vowel Quality...............51 2.2.1 Monolingual Acquisition .....................................................................51 2.2.1.1 Non-Scottish English and Scottish English .....................................51 2.2.1.2 Russian.............................................................................................55 2.2.2 Bilingual Acquisition ...........................................................................58 2.3 Language Interaction in Bilingual Acquisition of Vowel Duration.............64 vi 2.3.1 Monolingual Acquisition .....................................................................64 2.3.2 Bilingual Acquisition ...........................................................................67 2.4 Acquisition of Vocal Effort .........................................................................71 2.4.1 Monolingual Acquisition .....................................................................71 2.4.2 Bilingual Acquisition ...........................................................................74 2.5 Summary and Research Questions...............................................................74 3 Methodology ........................................................................................................77 3.1 Introduction..................................................................................................77 3.2 Subjects ........................................................................................................78 3.2.1 Common Linguistic and Environmental Background .........................78 3.2.2 Differences in 
Linguistic and Environmental Background .................81 3.2.2.1 Subject BS........................................................................................81 3.2.2.2 Subject AN.......................................................................................83 3.3 Control groups .............................................................................................84 3.3.1 Children................................................................................................84 3.3.2 Adults...................................................................................................86 3.4 Materials ......................................................................................................87 3.4.1 Children................................................................................................87 3.4.2 Adults...................................................................................................90 3.5 Data Collection ............................................................................................90 3.5.1 Children................................................................................................90 3.5.1.1 Recording Equipment and Set up ....................................................90 3.5.1.2 Procedure .........................................................................................91 3.5.1.3 Games ..............................................................................................92 3.5.2 Adults...................................................................................................94 3.5.3 Summary of the Elicited Data..............................................................95 3.5.4 Digital Audio Data Formats.................................................................96 3.6 Phonetic and Acoustic Measurements .........................................................96 3.6.1 
Overview..............................................................................................96 3.6.2 Data Annotation ...................................................................................98 3.6.2.1 Phonetic Labelling ...........................................................................98 3.6.2.2 Annotation of Timing ......................................................................98 3.6.2.3 Annotation of Prominence and Utterance Type.............................102 3.6.3 Automatic Acoustic Measurements ...................................................102 3.6.3.1 Steady-State of the Vowel .............................................................102 3.6.3.2 Formant Analysis ...........................................................................103 3.6.3.3 RMS-Power Analysis.....................................................................106 3.6.3.4 Fundamental Frequency Analysis..................................................107 3.6.4 Data Validation and Normalisation. 
..................................................108 3.6.4.1 Validation of Phonetic Labels........................................................108 3.6.4.2 Validation of Estimated Formant Frequencies ..............................109 3.6.4.2.1 Introduction..............................................................................109 3.6.4.2.2 Adults.......................................................................................110 3.6.4.2.3 Children....................................................................................113 3.6.4.3 Normalisation of RMS-Power Measurements ...............................114 4 Acquisition of Vowel Quality............................................................................118 4.1 Introduction................................................................................................118 4.2 Statistical Analysis.....................................................................................119 4.3 Acquisition of Vowel Quality....................................................................119 4.3.1 Scottish English Monolingual Results ...............................................119 4.3.1.1 Acquisition of close(-mid) unrounded vowels...............................119 vii 4.3.1.2 Acquisition of close rounded vowels.............................................122 4.3.1.3 Summary of results for the SSE monolingual peers ......................125 4.3.2 Bilingual Acquisition .........................................................................126 4.3.3 Subject AN.........................................................................................126 4.3.3.1 Acquisition of close unrounded vowels.........................................126 4.3.3.1.1 Language differentiation..........................................................126 4.3.3.1.2 Longitudinal perspective..........................................................129 4.3.3.2 Acquisition of close rounded 
vowels.............................................130 4.3.3.2.1 Language differentiation..........................................................130 4.3.3.2.2 Longitudinal perspective..........................................................132 4.3.3.3 Summary of AN’s results...............................................................134 4.3.4 Subject BS..........................................................................................136 4.3.4.1 Acquisition of close unrounded vowels.........................................136 4.3.4.1.1 Language differentiation..........................................................136 4.3.4.1.2 Longitudinal perspective..........................................................138 4.3.4.2 Acquisition of close rounded vowels.............................................139 4.3.4.2.1 Language differentiation..........................................................139 4.3.4.2.2 Longitudinal results .................................................................141 4.3.4.3 Summary of BS’ Results................................................................143 5 Acquisition of Vowel Duration..........................................................................146 5.1 Introduction................................................................................................146 5.2 Data Analysis .............................................................................................147 5.3 Acquisition of Vowel Duration..................................................................148 5.3.1 A comparison of adult models ...........................................................148 5.3.1.1 Vowel /i/ ........................................................................................148 5.3.1.2 Vowel // ........................................................................................150 5.3.1.3 Close rounded vowels ....................................................................153 5.3.1.4 Summary 
of results for monolingual adults...................................155 5.3.2 SSE monolingual acquisition.............................................................157 5.3.2.1 Vowel /i/ ........................................................................................157 5.3.2.1.1 Group results............................................................................157 5.3.2.1.2 Individual results......................................................................159 5.3.2.2 Vowel // ........................................................................................161 5.3.2.2.1 Group results............................................................................161 5.3.2.2.2 Individual results......................................................................163 5.3.2.3 Close rounded vowel......................................................................165 5.3.2.3.1 Group results............................................................................165 5.3.2.3.2 Individual results......................................................................167 5.3.2.4 Summary of results for the SSE monolingual children .................169 5.3.3 Bilingual acquisition ..........................................................................170 5.3.3.1 Subject AN.....................................................................................170 5.3.3.1.1 SSE /i/ ......................................................................................170 5.3.3.1.2 SSE // ......................................................................................172 5.3.3.1.3 SSE // .....................................................................................174 5.3.3.1.4 MSR/SSE differentiation for /i/ ...............................................176 5.3.3.1.5 MSR/SSE differentiation for /u/ and // ..................................179 5.3.3.1.6 Summary of AN’s results.........................................................181 
5.3.3.2 Subject BS......................................................................................184 viii 5.3.3.2.1 SSE /i/ ......................................................................................184 5.3.3.2.2 SSE // ......................................................................................186 5.3.3.2.3 SSE // .....................................................................................189 5.3.3.2.4 MSR/SSE differentiation for /i/ ...............................................192 5.3.3.2.5 MSR/SSE differentiation for /u/ and //. .................................194 5.3.3.2.6 Summary of BS’ results ...........................................................196 6 Acquisition of Vocal Effort ...............................................................................198 6.1 Introduction................................................................................................198 6.2 Data Analysis .............................................................................................200 6.3 Acquisition of Vocal Effort .......................................................................201 6.3.1 A comparison of adult models ...........................................................201 6.3.1.1 Unrounded vowel /i/ ......................................................................201 6.3.1.2 Vowel /i/ compared to // ...............................................................204 6.3.1.3 Rounded vowels.............................................................................206 6.3.1.4 Summary of results for monolingual adults...................................209 6.3.2 SSE monolingual children .................................................................210 6.3.2.1 Vowel /i/ ........................................................................................210 6.3.2.1.1 Group results............................................................................210 6.3.2.1.2 Individual 
results......................................................................212 6.3.2.2 Vowel /i/ compared to // ...............................................................214 6.3.2.2.1 Group results............................................................................214 6.3.2.2.2 Individual results......................................................................215 6.3.2.3 Vowel // .......................................................................................217 6.3.2.3.1 Group results............................................................................217 6.3.2.3.2 Individual results......................................................................219 6.3.2.4 Summary of results for the SSE monolingual children .................220 6.3.3 Bilingual Acquisition .........................................................................221 6.3.3.1 Subject AN.....................................................................................221 6.3.3.1.1 SSE /i/ ......................................................................................221 6.3.3.1.2 SSE /i/ compared to //.............................................................223 6.3.3.1.3 SSE // .....................................................................................225 6.3.3.1.4 MSR/SSE differentiation for /i/ ...............................................227 6.3.3.1.5 MSR/SSE differentiation for /u/ and // ..................................230 6.3.3.1.6 Summary of AN’s results.........................................................232 6.3.3.2 Subject BS......................................................................................235 6.3.3.2.1 SSE /i/ ......................................................................................235 6.3.3.2.2 SSE /i/ compared to //.............................................................237 6.3.3.2.3 SSE // .....................................................................................239 
6.3.3.2.4 MSR/SSE differentiation for /i/ ...............................................241 6.3.3.2.5 MSR/SSE differentiation for /u/ and // ..................................243 6.3.3.2.6 Summary of BS’ results ...........................................................246 7 Discussion and Conclusion ................................................................................248 7.1 Overview of the main findings ..................................................................248 7.1.1 Language differentiation and interaction patterns .............................248 7.1.2 Conditioning Factors of Language Differentiation and Interaction...253 7.1.2.1 The role of language input conditions versus language structure..253 ix 7.1.2.2 Sound-structural effects .................................................................257 7.1.2.3 Lexicalisation effects .....................................................................261 7.1.2.4 Maturation and age effects.............................................................261 7.1.2.5 Other environmental effects...........................................................265 7.1.2.6 Methodological issues....................................................................266 7.1.3 Implications of the bilingual findings ................................................268 7.1.3.1 Language differentiation/interaction patterns and their mental representation.................................................................................................268 7.1.3.2 Implications of the findings for the theory and models of language acquisition ......................................................................................................270 7.1.4 Implications of vocal effort findings..................................................272 7.2 Suggestions for further research ................................................................275 7.3 General 
Conclusion....................................................................................276 References..................................................................................................................277 Appendix A Phonetic ranges of the production of the target /i/ by the SSE monolingual children. ................................................................................................291 Appendix B Distributions of the three most frequent phonetic labels (per carrier word) for the target // produced by the SSE monolingual children. ........................292 Appendix C Duration of the close(-mid) vowels produced by the adult subjects as a function of the following consonant in SSE, MSR and SSBE. .................................293 Appendix D Duration of the close(-mid) vowels produced by the adult subjects averaged per language (SSE, MSR and SSBE) and speaker as a function of the following consonant...................................................................................................296 Appendix E Individual results of the SSE monolingual children for the duration of the vowel /i/ as a function of the following consonant. .............................................297 Appendix F Individual results of the SSE monolingual children for the duration of the vowel // as a function of the following consonant. ............................................298 Appendix G Individual results of the SSE monolingual children for the duration of the vowel // as a function of the following consonant. .............................................299 Appendix H Duration of the vowel /i/ as a function of the following consonant produced by the bilingual subject AN: longitudinal results for MSR and SSE.........300 Appendix I Duration of the vowels // and /u/ as a function of the following consonant produced by the bilingual subject AN: longitudinal results for MSR and SSE. 
301 Appendix J Duration of the vowel /i/ as a function of the following consonant produced by the bilingual subject BS: longitudinal results for MSR and SSE..........302 Appendix K Duration of the vowels /u/ and // as a function of the following consonant produced by the bilingual subject BS: longitudinal results for MSR and SSE. 303 Appendix L Mean RMS-power around F2 (dB) for the adult subjects averaged per language (SSE, MSR and SSBE) for the vowel /i/ as a function of the following consonant. 304 Appendix M Mean RMS-power around F2 (dB) for the adult subjects averaged per language (SSE, MSR and SSBE) for the close rounded vowels as a function of the following consonant...................................................................................................305 Appendix N Mean RMS-power around F2 (dB) produced by the SSE subjects of different ages for the vowel /i/ as a function of the following consonant. ................306 x Appendix O Mean RMS-power around F2 (dB) produced by the SSE subjects of different ages for the vowels /i/ and // across all consonantal contexts. ..................307 Appendix P Mean RMS-power around F2 (dB) produced by the SSE subjects of different ages for the vowel // as a function of the following consonant.................308 Appendix Q Descriptive statistics of SSE/MSR bilingual production of vocal effort for the vowel /i/ as a function of the following consonant based on three acoustic measures A2, A2*a, A2*b (dB) per speaker, language and age................................309 Appendix R Descriptive statistics of bilingual SSE production of vocal effort for the tense/lax vowels /i/ and // based on three acoustic measures A2, A2*a, A2*b (dB) per speaker and age. 
Appendix S Descriptive statistics of SSE/MSR bilingual production of vocal effort for the close rounded vowels as a function of the following consonant based on three acoustic measures A2, A2*a, A2*c (dB) per speaker, language and age
Appendix T Durational ratios for the postvocalic conditioning of vowel duration for all subjects by language, age and bilinguality.

List of Tables
Table 2-1 Russian vowel phonemes (Bondarko, 1998)
Table 2-2 Russian vowel allophones (adopted from Bondarko, 1998; Kuznetsov, 1997)
Table 2-3 Scottish English vowel monophthongs (adopted from Wells, 1982)
Table 2-4 Comparison between monophthong phonemes between SSE and SSBE (adapted from Matthews, 2002)
Table 2-5 Broad differences and similarities between Russian and Scottish English word-prosodic systems
Table 2-6 Broad characterisations, for one representative vowel [], of vowel duration conditioning effects by various contexts in SSE and SSBE (adapted from Scobbie et al., 1999a)
Table 2-7 Most frequent ‘non-adult-like’ substitutes for SSE target // in child speech (adapted from Matthews, 2002)
Table 2-8 Most frequent ‘non-adult-like’ substitutes for SSE target [] (adapted from Matthews, 2002)
Table 2-9 Frequency of vowel phonemes in 5 subjects (the higher the row – the more frequent the sound in the table) (adapted from Zharkova, 2002).
Table 2-10 A summary of five studies that dealt with bilingual phonological acquisition of vowel inventories
Table 2-11 Summary of the total of 8 research variables for three levels of speech production, vowel sets, crosslinguistic differences and a cross-reference to Section numbers containing discussion for these variables.
Table 3-1 Identification codes, age and sex of the children who participated in experiments; the children are listed by age.
Table 3-2 Native language, age, sex of adult participants.
Table 3-3 Elicited target words: orthography and adult target phonetic transcription per language.
Table 3-4 Main type carrier sentences used in the two languages
Table 3-5 Summary of the number of sessions and the total number of elicited tokens per child (and age sample)
Table 3-6 Raw acoustic measurements in this study.
Table 3-7 Summary of the fixed frequency bandwidths for three frequency slices.
Table 3-8 A comparison of different acoustic studies of formant frequencies (Hz), estimated for adult native speakers of SSBE, SSE, MSR and General American
Table 4-1 Phonetic ranges of adult target // produced by SSE monolingual children
Table 4-2 Frequencies of adult and non-adult like realisations of /i/ and // for SSE monolingual children (aged 3;4 to 4;9)
Table 4-3 Phonetic range in the realisation of adult target [] by SSE monolingual children
Table 4-4 The effect of factor bilinguality of subject AN compared to the SSE monolingual peers for the production of phonetic variants [i] and [] for the target //.
Table 4-5 AN’s production of phonetic variants [i] and [] for the target /i/ in SSE compared to MSR (across age).
Table 4-6 Distribution of palatalised and non-palatalised consonants in the preceding context of the vowels [i] and [] for Russian target /i/.
Table 4-7 Longitudinal production of [i] and [] for target /i/ in SSE by AN.
Table 4-8 Longitudinal production of [i] and [] for target /i/ in Russian by AN
Table 4-9 The effect of factor bilinguality of the subject AN on the production of phonetic variants [] and [u] in SSE in comparison to the SSE monolingual children.
Table 4-10 Phonetic ranges of the MSR target /u/ and SSE // produced by the bilingual subject AN
Table 4-11 Longitudinal production of phonetic variants [u] and [] for the MSR target /u/ by AN
Table 4-12 The effect of carrier words on the proportions of the variants [] and [u] for the MSR target /u/ produced by the subject AN.
Table 4-13 The effect of factor bilinguality of the subject BS on the proportion of phonetic variants [i] and [] produced for the target // in comparison to the SSE monolingual children.
Table 4-14 The effect of language on the phonetic ranges for the target /i/ produced by BS in SSE compared to MSR language modes across age samples
Table 4-15 Longitudinal production of [i] and [] for target // in SSE by the subject BS.
Table 4-16 Phonetic ranges for the SSE target // produced by BS in comparison to the SSE monolingual children (across age)
Table 4-17 Contingency table showing the effect of the factor bilinguality on the distribution of two most frequent non-adult phonetic targets for SSE // produced by the subject BS in comparison to SSE monolingual children
Table 4-18 Phonetic ranges of the MSR adult target /u/ and SSE // produced by the bilingual subject BS.
Table 4-19 Contingency table showing the effect of language on the realisations of [] and [u] for subject BS in SSE compared to MSR language modes across her age samples
Table 4-20 Longitudinal production of [u] and [] for the target // in SSE by the subject BS
Table 4-21 Longitudinal production of [u] and [] for the target /u/ in MSR by the subject BS
Table 4-22 The effect of BS’ age on the use of (non-)palatalised consonants preceding the MSR target /u/.
Table 5-1 Mean duration and standard deviation of the vowel /i/ (ms) for three right consonantal contexts per language averaged for all the adult speakers
Table 5-2 Mean duration and standard deviation of the vowel // (ms) in three right consonantal contexts per language (SSE or SSBE) averaged for all the speakers.
Table 5-3 Mean duration and standard deviation of close rounded vowels (ms) as a function of the following consonant averaged for all the SSE, MSR and SSBE adult speakers
Table 5-4 Mean duration and standard deviation for the SSE vowel /i/ as a function of the following consonant in four age groups of the SSE monolingual controls
Table 5-5 Results of Tukey HSD post-hoc tests for the differences between age groups within SSE monolingual controls
Table 5-6 Mean duration and standard deviation for the SSE vowel // as a function of the following consonant for each age group of the SSE monolingual controls
Table 5-7 Results of Tukey HSD post-hoc tests for the age effects for the SSE monolingual speakers.
Table 5-8 Mean duration and standard deviation for the SSE vowel // as a function of the following consonant for each age group of the SSE monolingual controls
Table 5-9 Results of Tukey HSD post-hoc tests for the differences in the duration of // between age groups within SSE monolingual controls.
Table 5-10 Number of tokens and duration of the vowel /i/ (ms) as a function of the following consonant for subject AN in three age samples
Table 5-11 Number of tokens and duration of the vowel // as a function of the following consonant produced by the subject AN in three age samples
Table 5-12 Number of tokens and median duration of the vowel // as a function of the following consonant for subject AN in three age samples
Table 5-13 Median duration and number of tokens of the vowel /i/ as a function of the following consonant produced by the subject BS in three age samples.
Table 5-14 Number of tokens and median duration of the vowel // as a function of the following consonant produced by the subject BS in three age samples.
Table 5-15 Number of tokens and median duration of the vowel // as a function of the following consonant for subject BS for three longitudinal moments.
Table 6-1 Summary of the ANOVA results for adult controls for the vocal effort measures in the vowel /i/
Table 6-2 Summary of the ANOVA results for the three normalisation methods of vocal effort for the tense/lax vowel pair /i / in adult SSE/SSBE speakers
Table 6-3 SSE and SSBE adult means and standard deviations for three normalisation methods of vocal effort for the vowels /i/ versus //.
Table 6-4 Summary of the ANOVA results for the three normalisation methods of vocal effort for the close rounded vowels in adults.
Table 6-5 Summary of the ANOVA results for the three normalisation methods of vocal effort of the vowel /i/ in four SSE monolingual age groups.
Table 6-6 Summary of the ANOVA results for the three normalisation methods of vocal effort of the vowels /i/ versus // produced by four SSE monolingual age groups.
Table 6-7 Summary of the ANOVA results for the three normalisation methods of vocal effort for the vowel // in four SSE monolingual age groups
Table 6-8 Summary of the ANOVA results for the three normalisation methods of vocal effort for the SSE vowel /i/ produced by the bilingual subject AN as compared to the SSE monolingual peers.
Table 6-9 Summary of the ANOVA results for the three normalisation methods of vocal effort for the SSE vowel /i/ and // produced by the bilingual subject AN compared to the SSE monolingual peers.
Table 6-10 Summary of the ANOVA results for the three normalisation methods of vocal effort for the SSE vowel // produced by the bilingual subject AN in comparison to the SSE monolingual peers.
Table 6-11 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*b, dB) for the vowel /i/ as a function of the following consonant produced by the bilingual subject AN in MSR and SSE
Table 6-12 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*c, dB) for the SSE vowel // and MSR /u/ as a function of the following consonant produced by the bilingual subject AN in MSR and SSE.
Table 6-13 Summary of the ANOVA results for the normalisation methods of vocal effort for the SSE vowel /i/ produced by the bilingual subject BS as compared to the SSE monolingual peers.
Table 6-14 Summary of the ANOVA results for vocal effort for the SSE vowel /i/ and // produced by the bilingual subject BS in comparison to SSE monolingual peers
Table 6-15 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*b, dB) for the vowel /i/ produced by the bilingual subject BS in MSR compared to SSE.
Table 6-16 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*c, dB) for the SSE vowel // and MSR /u/ as a function of the following consonant produced by the bilingual subject BS in MSR and SSE
Table 7-1 Patterns of language differentiation and interaction observed for the two bilingual subjects (BS and AN) in different age samples, for three research variables and two vowel sets.

List of Figures
Figure 1-1 Visual representation of the language mode continuum (Grosjean, 2001, p.3).
Figure 2-1 Variations in the flow glottogram of a single cycle (left part of the diagram) when a speaker was instructed to increase phonatory loudness (conditions 1a to 3a from soft to loud). Right part of the diagram represents the acoustic consequence of such increase in the radiated spectrum (2nd and 3rd ticks on the horizontal axes show frequencies between 2 and 3 kHz) (adapted from Gauffin & Sundberg, 1989)
Figure 2-2 Acoustic representation of SSE, SSBE and MSR cardinal vowel space (adopted from Bondarko, 1998; Deterding, 1997; Kuznetsov, 1997; Walker, 1992)
Figure 2-3 Acoustic differences in extrinsic vowel duration conditioning (raw duration in ms) for close vowels between SSE and General American. The solid lines represent tense close vowels, while the broken lines represent the lax ones (adapted from House, 1961; Agutter, 1988; McKenna, 1988)
Figure 2-4 Mean duration (ms) of /i/ in SSE and Russian prominent CVC words as a function of the following consonant (per position in utterance pos1 = medial, pos2 = final in an utterance with more than one pitch accents, pos3 = final in an utterance with one pitch accent).
Figure 2-5 Mean spectral level (dB) in 4 frequency bands in three utterance positions in Scottish and in Russian. B1 = mean F1 ± 150 (Hz), B2 = mean F2 ± 300 (Hz), B3 = mean F3 ± 300 (Hz), B4 = mean F4 ± 300 (Hz)
Figure 2-6 Mean duration (ms) for SSE vowel /i/ as a function of the right consonantal context for three speakers in Matthews (2002).
Figure 2-7 Individual means of the differences in intrinsic vowel duration in German (short and long vowels) in the speech production of bilingual German-Spanish (broken bars) and monolingual German (solid bars) children.
Figure 2-8 Output sound pressure levels (dB) in female 4-, 8-year-olds and adults, when they are asked to adjust phonatory loudness for syllable trains /p/ (adopted from Strathopoulos & Sapienza, 1993)
Figure 2-9 Maximum flow declination rate (L/s/s) in female 4-, 8-year-olds and adults, when they are asked to adjust phonatory loudness for syllable trains /p/ (adopted from Strathopoulos & Sapienza, 1993)
Figure 3-1 BS’s language exposure pattern (% per 3 month) throughout the pre-school period, based on nursery attendance hours and 336 waking hours/month
Figure 3-2 AN’s language exposure pattern throughout the pre-school period, based on nursery attendance hours and 336 waking hours/month
Figure 3-3 Data flow diagram of the encoding process of the acoustic waveform into acoustic parameters and phonetic labels
Figure 3-4 Timing marker indicating the end of the voiceless fricative [s] and the beginning of the following vowel [] in “sieve” (annotated in SAMPA)
Figure 3-5 Timing marker indicating the end of the vowel [] and the beginning of the devoiced stop [t] in “food” (annotated in SAMPA).
Figure 3-6 Timing marker indicating the end of the vowel [] and the beginning of the voiceless stop [k] in “cook” (annotated in SAMPA).
Figure 3-7 Timing marker indicating the end of the vowel [i] and the beginning of the voiced fricative [z] in “cheese” (annotated in SAMPA).
Figure 3-8 Two timing markers indicating the boundaries between the end of the vowel [] and the preaspirated whispered transition [] and the following voiceless fricative [s] in “fish” (annotated in SAMPA). The duration of [] is 142 ms
Figure 3-9 Data flow diagram of the formant analysis process of the acoustic waveform and annotated timing of vowels
Figure 3-10 Data flow diagram of the RMS-power analysis of the acoustic waveform in fixed frequency bands.
Figure 3-11 RMS errors (F1 to F3, Hz) for four automatic formant analysis methods as compared to manual formant measurements from FFT spectra
Figure 4-1 Phonetic range of variation in the production of the lax vowel // by SSE monolingual children (plotted by age on the horizontal axis)
Figure 4-2 Phonetic range of the production of the adult target vowel // by SSE monolingual children (sorted by age on the horizontal axis).
Figure 5-1 Mean duration and standard deviation of the vowel /i/ in the three languages (SSE, MSR and SSBE) in the contexts before voiced fricatives, voiced stops and voiceless stops produced by monolingual adults.
Figure 5-2 Durational means (ms) in all SSE versus SSBE adults of the vowel // in the contexts before voiced fricatives, voiced stop and voiceless fricatives
Figure 5-3 Mean duration (ms) and standard deviation of the close rounded vowels in the three languages (SSE, MSR and SSBE) in the contexts before voiced fricatives, voiced stops and voiceless stops produced by monolingual adults.
Figure 5-4 Mean duration of the vowel /i/ (ms) as a function of the following consonant in four age groups of the SSE monolingual speakers
Figure 5-5 Individual results of SSE monolingual children on the duration of /i/ as a function of the following consonant
Figure 5-6 Mean duration of the vowel // as a function of the following consonant in 4 SSE monolingual age groups
Figure 5-7 Individual results of SSE monolingual children on the duration of // as a function of the following consonant
Figure 5-8 Mean duration of the vowel // (ms) as a function of the following consonant in four age groups of SSE monolingual speakers
Figure 5-9 Individual results of SSE monolingual children on the duration of // as a function of the following consonant
Figure 5-10 Median duration of the vowel /i/ (ms) as a function of the following consonant for subject AN compared to age matched SSE monolingual children in three age samples.
Figure 5-11 Median duration of the vowel // (ms) as a function of the following consonant produced by the subject AN in comparison to the SSE monolingual peers in three age samples (plotted from left to right).
Figure 5-12 Median duration of the vowel // (ms) as a function of the following consonant for subject AN compared to age matched SSE monolingual children in three age samples
Figure 5-13 Mean duration of the vowel /i/ (ms) as a function of the following consonant produced by the subject AN in MSR and SSE in three longitudinal age samples (from left to right).
Figure 5-14 A comparison of AN’s longitudinal results for the mean duration of /i/ (ms) to that of her mother speaking Russian and of the principal investigator (subject R3) in child directed speech
Figure 5-15 Mean duration of the close rounded vowels (ms) as a function of the following consonant for the subject AN in MSR and SSE
Figure 5-16 A comparison of AN’s longitudinal results for the mean duration of MSR /u/ (ms) compared to that of her mother speaking Russian, and of the principal investigator (subject R3) in Russian child directed speech.
Figure 5-17 Median duration of the vowel /i/ (ms) as a function of the following consonant produced by the subject BS compared to the SSE monolingual peers in three age samples
Figure 5-18 Median duration of the target vowel // (ms) as a function of the following consonant produced by the subject BS compared to the SSE monolingual peers in three age samples
Figure 5-19 Median duration of all phonetic realisations of [] (ms) as a function of the following consonant produced by the subject BS compared to the SSE monolingual peers in three age samples
Figure 5-20 Median duration of the vowel // (ms) as a function of the following consonant for subject BS compared to age matched SSE monolingual children for three longitudinal moments
Figure 5-21 Longitudinal results for the mean duration of the vowel /i/ (ms) as a function of the following consonant produced by the subject BS in MSR and SSE
Figure 5-22 A comparison of BS’ longitudinal results for the mean duration of /i/ in SSE and MSR to those of her mother speaking Russian, and those of the principal investigator (subject R3) in child directed MSR.
Figure 5-23 Mean duration of the close rounded vowels (ms) as a function of the following consonant produced by the subject BS in MSR and SSE in three age samples
Figure 5-24 A comparison of BS’ longitudinal results for the mean duration of /u/ and // (ms) in SSE and MSR to those of her mother speaking Russian, and those of the principal investigator (subject R3_CDS) in child directed MSR speech
Figure 6-1 Crosslinguistic effect on vocal effort (based on A2*a measure, dB) produced by adults for the vowel /i/ as a function of the following consonant.
Figure 6-2 Correlation between the measure A2*a (dB) and vowel duration (ms) between MSR (left panel) and SSE (right panel) adult speakers.
Figure 6-3 Individual results for SSE and MSR adults for the production of measure A2*a of vocal effort for the vowel /i/ as a function of the following consonant.
Figure 6-4 Differences between vocal effort spent (based on mean A2*b, dB) to produce lax vowel // and tense vowel /i/ for 5 SSE and 4 SSBE adult speakers
Figure 6-5 Crosslinguistic effect on vocal effort (based on mean A2*c, dB) in the adult production of close rounded vowels as a function of the following consonant.
Figure 6-6 Individual results for SSE and MSR adults for the production of vocal effort (based on median A2*c, dB) for the close rounded vowels as a function of the following consonant.
Figure 6-7 Context dependent vocal effort pattern (based on mean A2*a, dB) for the vowel /i/ produced by the SSE adults compared to three groups of children aged 3;4 to 4;9.
Figure 6-8 Individual SSE child results of vocal effort (based on median A2*a, dB) for the vowel /i/ as a function of the following consonant.
Figure 6-9 Vowel dependent vocal effort (based on mean A2*a, dB) for the vowels /i/ versus // in SSE adults compared to three groups of children aged 3;4 to 4;9.
Figure 6-10 Individual results for SSE children for the vocal effort differences (based on median A2*b, dB) between the tense/lax vowels /i /
Figure 6-11 Context dependent vocal effort pattern (based on mean A2*a, dB) for the vowel // in the SSE adults compared to three groups of children aged 3;4 to 4;9.
Figure 6-12 Individual SSE child results of vocal effort (based on median A2*a, dB) for the vowel // as a function of the following consonant
Figure 6-13 Vocal effort for the vowel /i/ (based on A2*a, dB) as a function of the following consonant produced by the subject AN as compared to the SSE monolingual peers in three age samples
Figure 6-14 Vocal effort applied to /i/ and // (based on A2*b, dB across all consonantal contexts) produced by the bilingual subject AN and by the SSE monolingual peers of three age groups.
Figure 6-15 Vocal effort for the vowel // (based on mean A2*a, dB) as a function of the following consonant produced by AN in comparison to the SSE monolingual peers in the three age samples
Figure 6-16 AN’s crosslinguistic production of vocal effort for the vowel /i/ (based on mean A2*a, dB) as a function of the following consonant (age is plotted from left to right).
Figure 6-17 A comparison of AN’s vocal effort for /i/ in different consonantal contexts in MSR (based on median A2*a, dB) to that of her mother and experimenter (R3 in child directed speech).
Figure 6-18 AN’s crosslinguistic production of vocal effort for SSE // and MSR /u/ (based on mean A2*c, dB) as a function of the following consonant (age is plotted from left to right).
Figure 6-19 A comparison of AN’s vocal effort for /u/ in different consonantal contexts in MSR (based on median A2*c, dB) to that of her mother (reading) and experimenter (R3 in spontaneous speech).
Figure 6-20 Vocal effort for the vowel /i/ (based on mean A2*a, dB) as a function of the following consonant produced by the subject BS in comparison to the age-matched SSE monolingual children in three age samples.
Figure 6-21 Vocal effort of the vowel /i/ and // (based on mean A2*c, dB) produced by the bilingual subject BS compared to the SSE monolingual peers in three age samples (BS’ target /i/ and // are plotted separately from the phonetic labels [i] []).
Figure 6-22 Vocal effort for the vowel // (based on mean A2*a, dB) as a function of the following consonant produced by the subject BS in comparison to the SSE monolingual peers in three age samples
Figure 6-23 BS’s crosslinguistic production of vocal effort for the vowel /i/ (based on mean A2*a, dB) as a function of the following consonant (age is plotted from left to right).
Figure 6-24 A comparison of BS’s vocal effort for /i/ in different consonantal contexts in MSR (based on A2*a, dB) to that of her mother (read speech) and experimenter (R3 spontaneous speech).
Figure 6-25 BS’ crosslinguistic production of vocal effort for SSE // and MSR /u/ (based on mean A2*c, dB) as a function of the following consonant (age is plotted from left to right).
Figure 6-26 A comparison of BS’s vocal effort for /u/ in different consonantal contexts in MSR (based on A2*a, dB) to that of her mother (read speech) and experimenter (R3, in spontaneous speech).
Figure 7-1 Visual footprint of BS’ and AN’s language differentiation in their two languages, speech immaturity and the direction of language interaction based on the results in this study
Figure 7-2 Abstract representation of the longitudinal effect for the bilingual subjects AN and BS on their bilingual language differentiation based on the number of sound structure variables involved in total and partial language differentiation across their two languages.

List of Equations
Equation 3-1
Equation 3-2
Equation 3-3
Equation 3-4
Equation 3-5
Equation 3-6
Equation 3-7

List of Abbreviations and Conventions
1;2.3: convention for the child’s age, meaning year;month.days
BFLA: bilingual first language acquisition
BSLA: bilingual second language acquisition
C: Consonant
CCCH: Cross-Language Cue Competition Hypothesis
DTFT: Discrete Time Fourier Transform
F0: Fundamental frequency
Fn: n-th Formant
FFT: Fast Fourier Transform
L2: second language
MSR: Modern Standard Russian
ns: not significant
RMS: root mean square
SLA: second language acquisition
SSE: Scottish Standard English
SSBE: Southern Standard British English
SVLR: The Scottish Vowel Length Rule
V: Vowel
VOT: Voice Onset Time
VLF/VF: ratio of the duration of vowels before voiceless fricatives relative to that before voiced fricatives
VLS/VF: ratio of the duration of vowels before voiceless stops relative to that before voiced fricatives
VLS/VS: ratio of the duration of vowels before voiceless stops relative to that before voiced stops

1 Background

1.1 Introduction

This study is built upon a general epistemological assumption that language acquisition is probabilistic rather than deterministic in nature. This idea is well expressed by Mohanan (1992, p. 653) in a metaphor of the self-organisation of snowflakes, which form as perfectly symmetrical six-branch structures in interaction with the environment, yet vary infinitely in their patterns: “If the growth of form in language formation is analogous to that in snowflake formation, we must formulate it as a problem of morphogenesis: how does a grammar arise and develop in an individual through interaction with the linguistic environment.” Under this view the constrained randomness of a snowflake is similar to the problem of internalised human grammar, where “no two grammars are identical, even when the input is the same, yet the variability across grammars is severely constrained” (Mohanan, 1992, p. 653). Patterns that differ in complexity from those of adults may arise in the child grammar, just as more complex innovations are possible in language change. The outcome and the process of language acquisition are thus predictable to some degree, and variability may be part of the grammar.

Over years of research one of the general questions addressed by second language (L2-) acquisition studies has been whether there is any functional difference between early and late L2-acquisition.
One of the less controversial statements forthcoming from this research is that the age of onset of language acquisition has some predictive power about the level of ultimate linguistic attainment in adulthood: i.e. the age at which L2-acquisition begins is the strongest predictor of level of ultimate attainment in adulthood (Flege et al., 1995; Birdsong, 2004). So we may state that, given continuous and systematic exposure to their two languages and positive motivation, bilingual children (like the Russian–Scottish English pre-school subjects in this study) are very likely to sound like native speakers of the two languages in adulthood. However, there remain controversial questions in simultaneous bilingual acquisition such as: What does it take for a child to become a fluent speaker in adulthood? How systematic is language interaction in their speech production? And what are the conditioning factors for language differentiation and interaction? Besides, it is still an open question whether bilingual phonological acquisition is different from monolingual and L2 acquisition. This study sets out to bridge the empirical gap for these general questions.

In this introductory chapter we outline the concepts of language interaction and differentiation as they apply to bilingual acquisition of language in general, and to that of sound structure in particular. The aims of this chapter are to introduce important general concepts in bilingual studies; to review some issues in bilingual language acquisition, specifically focusing on language differentiation and interaction; and to clarify the research questions of this study.

In Section 1.2 we introduce general concepts and definitions in bilingual language acquisition such as ‘bilingual’, ‘bilingualism’ and ‘language mode’. In addition to that, we account for the typology of bilingual situations relevant to this study.
Finally, we discuss the concept of ‘language interaction’, and its relation to other terms such as ‘interference’, ‘transfer’, ‘code-switching’ and ‘language mixing’. In Section 1.3 we discuss assumptions on the mental representation of a bilingual child’s two languages. Since this thesis addresses the issue of language interaction in bilingual acquisition of sound structure, we first discuss factors that are known to affect language interaction in general and we also address the issue of what forms language interaction can take in phonological bilingual acquisition.

1.2 Important Concepts and Definitions

1.2.1 Bilinguals and Bilingualism

There is general agreement in the literature that a ‘bilingual’ is a person who regularly uses two languages in his/her daily life (Weinreich, 1953; Grosjean, 1982). Consequently ‘bilingualism’ involves “regular use of the two languages” (Grosjean, 1982, p. 230), and this is a generic term used to refer to a range of bilingual situations. The term ‘bilinguality’ (or individual bilingualism) is used to refer to “the psychological state of the individual who has access to more than one linguistic code as a means of social communication” (Hamers & Blanc, 2000, p.6).

Since this thesis is concerned with bilingual language acquisition from the early years of life and studies child speech production, we limit this discussion to terminology referring to bilingual children rather than adults. Bilingual language acquisition in pre-school children has been characterised as a result of “early, simultaneous, regular, and continued exposure to more than one language” (de Houwer, 1995, p.222) from before the age of two and after.

There is a wide range of dimensions through which bilingual speech can be viewed. One of them is the age of onset of language acquisition. The exact configuration of the relationship between the age of acquisition and bilingual proficiency in adulthood is a controversial issue.
There is no agreement on whether it is just a correlation (Flege et al., 1995), i.e. a younger person is more likely to have native-like proficiency in L2 than an older person, or whether there is some ‘critical age’ (Lenneberg, 1967). The critical age hypothesis of language acquisition was linked to cerebral lateralisation of language functions at some developmental point after which native-like attainment was thought to become less likely. However, different researchers have found different ‘critical ages’ ranging from three years of age up to puberty, and the critical age seems to differ between language components (Meisel, 2003).

There are a number of typological distinctions of bilingual children made alongside the age of onset of bilingual acquisition. Some researchers have made a distinction between ‘simultaneous’ and ‘consecutive’ bilingual children (McLaughlin, 1984; Lyon, 1996; Hamers & Blanc, 2000). For Hamers & Blanc (2000, p.28), the term ‘simultaneous’ refers to early or infant bilinguals acquiring two mother tongues (LA and LB). ‘Consecutive’ bilinguals, on the other hand, first acquire some basis of the mother tongue (L1), and only then start acquiring L2 some time before the age of 10/11. However, this subdivision underspecifies the upper age limit for being a ‘simultaneous’ bilingual and the lower age limit for being a ‘consecutive’ bilingual. Therefore, it is not clear whether a child exposed to two languages from the age of 1;0 will become a simultaneous or a consecutive bilingual. A similar distinction is proposed by McLaughlin (1984) and Lyon (1996) between simultaneous and successive bilinguals, with the difference that the boundary between the two types is set at the age of three years.

Other researchers have used the term “Bilingual First Language Acquisition” (Meisel, 1989; de Houwer, 1995). De Houwer (1995, p.
223) makes a distinction between Bilingual First Language Acquisition (BFLA) and Bilingual Second Language Acquisition (BSLA). BFLA refers to “the acquisition of two or more languages from birth or at most a month after birth” (de Houwer, 1995, p. 223), while BSLA includes other cases. However, in this case it is not clear what maturational or empirical reasons are put forward for this distinction, which makes a child aged four weeks a BFLA type bilingual, and a child aged five weeks a functionally different BSLA type.

Meisel (2003) proposes to distinguish between three types of bilingual acquisition: (1) simultaneous bilingual acquisition, in which the child begins to acquire two or more languages during the first three to five years of life; (2) child second language acquisition, if L2 acquisition starts between the ages of 5 and 10; (3) adult L2 acquisition, after the age of ten.

In this study, our bilingual subjects can best be regarded as simultaneous bilinguals. Bilingual acquisition began at the ages of 1;3 (subject BS) and 0;7 (subject AN), when both subjects started to attend the English-speaking nursery.

Obviously, the distinction based on the ‘age of onset’ of language acquisition refers only to one important dimension affecting the linguistic output of a bilingual. Language proficiency is another important factor (Hamers & Blanc, 2000). In this dimension bilinguals are subdivided into ‘balanced’ and ‘dominant’, where balanced bilinguals are equally proficient in both languages and dominant ones are not. We shall further discuss the issue of language dominance in Section 1.3.2, since it directly relates to the question of language interaction.

Age and conditions of acquisition may also lead to differences in the functioning of the cognitive system, described as ‘subordinate’, ‘compound’ and ‘coordinate’ bilingualism (Weinreich, 1953).
In ‘subordinate’ bilingualism, there is one (L1-based) conceptual store associated with meanings for both L1 and L2. In the ‘compound’ type, there is one conceptual store merged from L1 and L2. In the ‘coordinate’ type, translation equivalents have two separate sets of concepts. However, since this cognitive factor depends on the age of acquisition and context (accounted for in this study), it should not influence the way we view the question of phonological bilingual acquisition.

Finally, there are societal dimensions to bilingualism. Hamers & Blanc (2000) emphasize the importance of cultural identity, and the issue of ‘valorisation’ (relative status of the language in society). Since this study addresses language-related issues in bilingualism, we need to control for the relative similarity of these societal factors across our subjects, and we account for these important issues in Chapter 3 dealing with methodology.

1.2.2 Language Interaction

The term ‘language interaction’ used in this study covers a broad range of effects occurring during bilingual language acquisition due to the fact that the two languages may influence each other (Paradis & Genesee, 1996), though it is not necessary that they will. The term reflects empirical findings that there seems to be a functional separation of the two language systems of young simultaneous bilinguals (Genesee, 1989; Genesee et al., 1995; de Houwer, 1995; Deuchar & Quay, 2000; Petitto, 2001; Keshavarz & Ingram, 2002), but that the systems are not necessarily hermetically sealed from one another (Petersen, 1988; Döpke, 1998; Schlyter, 1993; Müller, 1998; Döpke, 2000; Paradis, 2001; Grosjean, 2001; Kehoe, 2002; Lleó, 2002; Guion, 2003), and can interact for a number of reasons. The factors enabling language interaction will be described in Section 1.3.2.

In language acquisition, interaction is thought to have a broad range of manifestations like ‘transfer’, ‘acceleration’ or ‘delay’ (Paradis & Genesee, 1996).
We shall discuss these manifestations later in this section after introducing other related terms.

The older terms ‘mixing’, ‘language mixing’ and ‘code mixing’ also cover any “interactions between the bilingual child’s developing systems” (Genesee, 1989). However, they usually refer to physical co-occurrences of elements from two or more languages within a single utterance (Genesee, 1989), and, in the past, presence or absence of language mixing was often taken as evidence to treat the question of initial single or dual language systems in bilingual children in an ‘either or’ fashion (Volterra & Taeschner, 1978; Redlinger & Park, 1980). As opposed to that, the term ‘language interaction’ acknowledges the fact that language interaction and differentiation are not mutually exclusive.

‘Code-switching’ (Muysken, 2000) is a more specific term that refers to a rule-governed communication strategy among bilinguals. It involves a complex set of sociolinguistic and linguistic rules, whereby linguistic elements (usually lexical items) from a non-base language are used during communication in the base language for pragmatic reasons. In addition, it may involve an adaptation of the non-base lexeme to the grammatical rules of the base language (Muysken, 2000).

‘Transfer’ is a term originating from Second Language Acquisition (SLA) studies. It usually means “the influence resulting from similarities and differences between the target language and any other language that has been previously (and perhaps imperfectly) acquired” (Odlin, 1989, p.27). Thus, it presupposes that the L1 of the L2-learner is already acquired. Nevertheless the term is also used in simultaneous bilingual acquisition studies.

‘Transfer’ has a lot in common with ‘interference’. The phenomenon of ‘interference’ was defined in Weinreich’s “Languages in Contact” (1953).
He labeled it as “instances of deviation from the norms of either language, which occur in the speech of bilinguals as a result of their familiarity with more than one language, i.e. as a result of language contact” (Weinreich, 1953, p.1). The word ‘either’ means that both directions are possible: i.e. LA may influence LB, and LB may influence LA; and ‘deviation’ refers to some variation away from a monolingual norm. The difference between ‘transfer’ and ‘interference’ lies in the fact that the former presupposes an established L1 competence, while the latter does not require it, and so is more suitable for intermediate stages in acquisition.

Weinreich conceived ‘interference’ to describe all types of “deviation from the norms” of monolinguals. However, the words chosen to label the phenomenon, ‘deviation’ and ‘norms’, may imply that language interference in bilinguals is abnormal and that monolingualism is ‘the norm’, even though phenomena like ‘code-switching’ are rule-governed and require intricate bilingual skills. Some researchers introduced more neutral terms like ‘translinguistic markers’ (Lüdi, 1987) or ‘transference’ (Clyne, 1967). Despite this criticism, the term ‘interference’ took off in the literature, and is often used interchangeably with ‘transfer’.

Paradis (1993) and Grosjean (2001) narrowed down the term ‘interference’ to all non-pragmatic aspects of interference (e.g. excluding ‘code-switching’). In the narrowed down definition, ‘language interference’ is thought to be either:

• Incidental in nature, in which case it is called ‘dynamic interference’ (Paradis, 1993; Grosjean, 2001). Dynamic interference is explained as being an ‘ephemeral deviation’ due to the incidental ‘on-line’ influence of the deactivated language, or as “unrepaired slips of the tongue” (de Houwer, 1995, p.248).
• Or a ‘representational interference’ (static) (Paradis, 2004): i.e. it occurs at the point of acquisition, and is stored in the mental representation of the wrong language for the time being. ‘Static’ does not mean that it is stored and does not change any more. This representation can change with growing linguistic experience.

A way to tease apart dynamic and representational interference might be to account for the systematicity of interference. For example, if an interference is persistent, that could be taken as a sign that it is representational.

The effects of language interaction may not be confined to the mere occurrence of some linguistic elements of a non-base language in the base language (like mixing, interference or transfer). In addition, in language development these effects are thought to take the form of either ‘acceleration’ or ‘delay’ (Paradis & Genesee, 1996). ‘Transfer’ in Paradis & Genesee’s (1996, p.3) understanding means a structural incorporation of a grammatical property from one language into another. ‘Acceleration’ means an earlier (compared to monolinguals) acquisition of a certain property due to the availability of the other language. On the other hand, ‘delay’ refers to the “whole rate of acquisition” and manifests itself in a slowdown in the “overall progress in the grammatical development” (1996, p.4).

Paradis & Genesee did not find evidence for any of these language interaction processes. Their conclusion was that for the syntactic properties of their French-English bilingual subjects the bilingual development was autonomous rather than interdependent. Thus, the whole subdivision of manifestations of language interaction remains tentative, and we see some problems with it. First of all, the authors were not clear why ‘acceleration’ should refer to only one grammatical property, while ‘delay’ should refer to the whole grammatical system.
Secondly, they functionally separate ‘transfer’ from ‘acceleration’ and ‘delay’ in the typology of language interaction, but later in the paper ‘acceleration’ is compared to “transfer of knowledge” (Paradis & Genesee, 1996, p.8), and indeed we do not see why ‘acceleration’ or ‘delay’ (if evidence is found) could not be a type of ‘transfer’. For example, it might be that a static interference from LB persists in LA for some time (and superficially looks ‘delayed’), and later it is acquired in some native-like form. Finally, it needs to be emphasized that any claim of a ‘delay’ should be substantiated by evidence that a certain property (or maybe the whole grammatical system) is indeed acquired later within some monolingual range, for if not, we are probably dealing with interference and not with a delay.

1.3 Bilingual Language Differentiation and Interaction

1.3.1 What is it about?

A great part of research on simultaneous bilingual acquisition has been devoted to the question of the abstract mental representation of the child’s two languages. In this discussion the presence/absence of language interaction is crucial.

From at least the 1970s bilingual language acquisition studies addressed the question of whether a child acquiring two languages simultaneously develops ‘one or two systems’ from the outset of linguistic experiences. Proponents of the ‘unitary system’ hypothesis (Volterra & Taeschner, 1978; Redlinger & Park, 1980) claimed that the presence of language-mixing and the apparent lack of translation equivalents in the early speech of bilingual children should be interpreted as a sign of a unitary (single) system at the onset of speech production, reflecting some innate predisposition to acquire one language rather than more than one, and that the two language systems gradually became differentiated later in the acquisition process.
Empirical evidence gathered to date strongly supports the view that rather than going through any initial ‘unitary’ lexical and syntactic development stages (Volterra & Taeschner, 1978) children growing up bilingually differentiate between their languages from the onset of their language production in the second year of life (Genesee, 1989; de Houwer, 1990; Genesee et al., 1995; Gawlitzek-Maiwald & Tracy, 1996; Paradis & Genesee, 1996; Deuchar & Quay, 2000; Petitto, 2001; Khattab, 2002; Keshavarz & Ingram, 2002).

The evidence of language differentiation should not necessarily imply the total functional separation of the two systems, even though such claims have been made (Genesee, 1989; de Houwer, 1990; Genesee et al., 1995; Paradis & Genesee, 1996). Indeed young simultaneous bilinguals seem to produce the majority of linguistic structures within the ranges of monolingual peers, but there is also ample empirical evidence for the presence of language interaction in their speech (Petersen, 1988; Schlyter, 1993; Schnitzer & Krasinski, 1994; Gawlitzek-Maiwald & Tracy, 1996; Döpke, 1998; Döpke, 2000; Khattab, 2000; Paradis, 2000; Paradis, 2001; Kehoe et al., 2001; Kehoe, 2002; Keshavarz & Ingram, 2002; Khattab, 2002; Lleó, 2002). Such language interaction does not consist of ‘slips of the tongue’ only, as is sometimes claimed by proponents of total language separation in bilinguals (de Houwer, 1995, p.248), but there seems to be systematicity in the way this language interaction happens. Alternatively, bilingual children may acquire language-specific structures and gain differentiated control of their languages, where a marginal (and variable) LA/LB interaction forms a normal developmental path throughout the process.
In addition, language differentiation should not necessarily imply an innate differentiation of language systems ‘from the start’, since it can alternatively be constructed with growing linguistic experience (Deuchar & Quay, 2000; Vihman, 2002; Paradis, 2004).

Besides, dual (LA separated from LB) or unitary (LA stored alongside LB) mental representations may not be the only options for the type of mental construction. For example, the “Subsystems Hypothesis” and the “Activation Threshold Hypothesis” (Paradis, 1993; Paradis, 1998; Paradis, 2004) together form an alternative account of how a bilingual’s languages can be stored. Jointly these hypotheses explain why language differentiation and interaction may co-occur, and how this can happen.

These hypotheses are part of the neurolinguistic theory of bilingualism formulated by Paradis (1993; 1998; 2004). Findings from studies on bilingual language differentiation and interaction are claimed to converge with the evidence of different recovery patterns in language pathology like bilingual aphasia. The major tenets of the theory are (1) neurofunctional modularity with sets of dedicated neural pathways for each module; (2) the distinction between implicit knowledge (automatic procedures like grammar or phonology) and explicit knowledge (retrievable facts like metalinguistic knowledge); (3) a set of hypotheses about language processing.

The “Subsystems Hypothesis” (Paradis, 2004, p. 210) is a neurofunctional proposal that postulates two independent language subsystems within one linguistic system. The subsystems are functionally independent in that they can be selectively inhibited or impaired, but they form part of the same neurofunctional language system. Thus this hypothesis is compatible with the evidence for language differentiation. It is also compatible with the evidence for language interaction.
The choice of an appropriate language to speak is determined by the cognitive system, and not by the linguistic system or subsystems (Paradis, 2004, p.210). This choice of the language together with concepts activates the appropriate language networks (subsystems or modules), by lowering their ‘activation threshold’ values. The items with the appropriate (lower than the alternatives) threshold values are selected. The ‘activation threshold’ (Paradis, 2004, p.28) is lowered when a “sufficient amount of positive neural impulses have reached the neural substrate”. The ‘activation threshold’ depends on the frequency and recency of language experience. However, under certain circumstances the activation threshold of the alternative, like an item from the non-base language, is lower and the alternative is subsequently selected. This can be due to the input from the cognitive system (e.g. a pragmatic choice to code-switch activates the appropriate language items), or it might be that there is no appropriate alternative available, and, thus, dynamic interference occurs.

Only recently have bilingual child language studies started to address the question of what parts of the language system are prone to language interaction and why. With a few exceptions the question of language interaction in phonetics and phonology is largely unexplored, and the results and analyses are divergent (Paradis, 2000; Paradis, 2001; Khattab, 2002; Kehoe, 2002; Whitworth, 2003; Kehoe, 2004; Keshavarz & Ingram, 2002; Lleó, 2002). It is the aim of the following sections to review current knowledge about the forms of language interaction in language components, and phonology specifically. In addition, we discuss the factors that may determine language interaction.
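
The selection mechanism attributed to the Activation Threshold Hypothesis earlier in this section (thresholds lowered by language choice, frequency and recency, with the lowest-threshold item being selected) can be illustrated with a toy numerical sketch. The sketch is not part of Paradis’s theory or of the present study: the linear formula, the weights, the 0 to 1 scoring scale and the item names are all hypothetical assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class LexicalItem:
    form: str         # hypothetical label, not real data
    language: str     # "LA" or "LB"
    frequency: float  # assumed 0..1 score: how often the item is used
    recency: float    # assumed 0..1 score: how recently the item was used


def activation_threshold(item, selected_language, base=1.0,
                         w_freq=0.3, w_rec=0.3, language_boost=0.3):
    """Toy threshold: a LOWER value means the item is easier to activate.

    All weights are illustrative assumptions, not empirical estimates.
    """
    threshold = base - w_freq * item.frequency - w_rec * item.recency
    # The language chosen by the cognitive system lowers the thresholds
    # of all items belonging to that language.
    if item.language == selected_language:
        threshold -= language_boost
    return threshold


def select_item(candidates, selected_language):
    """Select the candidate with the lowest activation threshold."""
    return min(candidates,
               key=lambda it: activation_threshold(it, selected_language))


# Case 1: equally frequent and recent equivalents, base language LA.
balanced = [LexicalItem("item_LA", "LA", 0.5, 0.5),
            LexicalItem("item_LB", "LB", 0.5, 0.5)]

# Case 2: the LA item is rare, the LB equivalent is frequent and recent.
skewed = [LexicalItem("rare_LA", "LA", 0.1, 0.0),
          LexicalItem("common_LB", "LB", 0.9, 0.9)]

print(select_item(balanced, "LA").form)
print(select_item(skewed, "LA").form)
```

Under these assumed weights, the base-language item is selected in the balanced case, whereas in the skewed case the non-base item has the lower threshold despite the language boost, which corresponds to the situation described above as dynamic interference.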

1.3.2 Factors Affecting Language Interaction

1.3.2.1 Language Mode and Pragmatic Awareness

One of the facts about children acquiring two or more languages from infancy is that they are confronted with some pragmatic aspects of speech acts (such as the right language choice) with which their monolingual peers do not usually have to deal. As Grosjean (1998, p.132) pointed out, not only do bilinguals adapt to their language background accordingly, they also change their communication strategies depending on whether the person they are talking to is a bilingual or a monolingual (Lanza, 1992). Thus, ‘language mode’ is an empirically-based construct (see the overview of evidence in Grosjean, 2001, pp. 8 - 13) that models how this adaptation to the socio-linguistic background of the interlocutor takes place.

Language mode is “a state of activation of the bilingual’s languages and language processing mechanisms at a given point in time” (Grosjean, 2001, p.3); the notion is exemplified in Figure 1-1. The figure represents activation of the two languages in a bilingual person communicating in a base LA. LB is the non-base language of the bilingual. The state of activation (the darker the box the more active the language) of the non-base LB differs depending on the position on the language mode continuum. While the base LA is equally activated in all situations, the non-base LB is least activated in the monolingual language mode (state ‘1’ in the Figure), and it is activated most in the bilingual language mode (state ‘3’).

This empirically based model has important implications. First of all, it means that the non-base LB is never totally deactivated. Grosjean draws evidence from studies showing that the speech of bilinguals exhibits signs of dynamic interference even in the most monolingual of situations.
It must be noted that, if we consider the possibility of a further distinction between ‘dynamic’ and ‘representational interference’ (Paradis, 2004) discussed in Section 1.2.2, which is not considered by Grosjean, we would still assume that both types of interference should occur in the monolingual language mode, since both are unintended in a pragmatic sense. However, each separate type would have different implications for the state of activation of the deactivated (non-base) language: i.e. the presence of solely representational interference (as opposed to dynamic) can imply a total deactivation of the non-base language, since such interference is represented in LA rather than being borrowed on-line from LB. This is purely a logical consequence, since we think that most plausibly both types of interference operate at the same time.

Figure 1-1 Visual representation of the language mode continuum (Grosjean, 2001, p.3).

The second implication is that in the monolingual language mode (e.g. in communicating with monolinguals) the amount of ‘code-switching’ should be drastically reduced compared to the bilingual language mode. The remaining effect of the incomplete deactivation is dynamic interference (or static if it affected the representation of LA).

Thirdly, interference can also occur in the bilingual language mode (for example, in communication with other bilinguals). However, this is more difficult to separate from code-switching.

Bilingual studies that controlled for the language mode found differences in the involvement of code-switching between monolingual and bilingual modes (syntactic: Lanza, 1992; phonological: Khattab, 2002). Therefore, this important model is accounted for in our methodology (see Chapter 3).

1.3.2.2 Language Mixing in the Input

There are controversial accounts of the influence of parental language mixing on bilingual child language development.
Traditionally the ‘one parent – one language’ principle is considered to be more advantageous for bilingual language development, while language mixing in a caregiver’s input is viewed as somewhat harmful (see de Houwer, 1995, p. 225 for a review). As Deuchar & Quay (2000, p.8) point out, there is so far no evidence of “any ‘type’ of environment as being more or less beneficial for bilingual upbringing”.

One view holds: “one would expect children exposed to frequent and general mixing to mix frequently, since there is no reason for them to know that the languages should be separated” (Genesee, 1989). However, there is evidence showing that even though the metalinguistic and pragmatic skills of two-year-olds are not yet fully developed, bilingual children can produce interlocutor-dependent code-switching patterns. For example, Lanza (1992) found that her bilingual Norwegian-English child used more English content word switching in Norwegian with her Norwegian father, who code-switched himself (despite the girl’s general preference for Norwegian). On the other hand, she used less Norwegian content word switching in her weaker English in conversation with her English-speaking mother, who did not approve of code-switching. Thus, such evidence suggests that bilingual children are well aware of the bilingual context of their upbringing.

Another view, that of Chambers (2002), hypothesises the existence of an ‘innate accent-filter’ as a part of sociolinguistic (or pragmatic) competence in bilinguals. He called this ‘the Ethan experience’ after a boy, Ethan, born and raised in Toronto in a family of immigrants from Eastern Europe. His parents spoke English with a medium to strong accent from their L1. Yet Ethan never acquired the parental accent, but sounded like his English-speaking peers. However, Khattab’s (2002; 2004) empirical data on phonological acquisition contradicted the ‘accent filter’ hypothesis. She controlled for monolingual and bilingual language modes in her three English-Arabic bilinguals (aged
She controlled for monolingual and bilingual language modes in her three English-Arabic bilinguals (aged 12 5;0, 7;0 and 10;0). Her data showed that the children did acquire the parents’ accents, but they produced them in appropriate sociolinguistic settings: i.e. while communicating in the bilingual language mode (with their parents), and used the native-like registers of English to communicate with the English-speaking peers. Khattab’s data suggested that such a register switch or continuum is possible as a part of sociolinguistic competence. However, the children did not seem to filter out accents innately, since they acquired parents’ accents and used them in appropriate situations. In fact, both findings of Lanza (1992) for syntactic acquisition and Khattab (2002; 2004) for the phonetic/phonological level of speech converge and point in the same direction. Acquiring parental communication patterns (including code-switching and interference) seems to be a part of sociolinguistic and pragmatic competence of a simultaneous bilingual, and it can be used in appropriate communicative contexts, such as in the bilingual language mode. 1.3.2.3 1.3.2.3.1 Structural differences of the languages in contact Why should language structure be important? In the context of monolingual language acquisition, it is known that different linguistic properties may develop at a different pace depending on their complexity. For example, phonological acquisition studies emphasise the interdependence of complexity in sound structure, phonological learning processes and the order of acquisition. This interdependence was incorporated in different theories ranging from the universal ‘laws of irreversible solidarity’ (Jakobson, 1941) to the probabilistic and self-organising ‘articulatory naturalness hierarchy’ (Lindblom, 1998) of phonological learning (which depends on articulatory naturalness, salience in the input, and other lexical forms already present in the lexicon). 
The common theme of these otherwise very different accounts of phonological learning is that learning proceeds from the acquisition of some basic to more complex tasks (Lindblom, 1998), so that linguistic and phonological structure matters in this most common sense. In the context of two distinct languages in contact, language structure matters in measuring language interaction for an obvious reason: if the structure of some components of the two languages in contact were identical, there would be nothing to interact with or to transfer.

In claiming that linguistic structure affects language interaction in simultaneous bilingual acquisition, researchers have taken at least two extreme views. The ‘minimalist’ view claims that language structure does matter in language interaction, but only in determining its docking sites. Structure does not determine the direction of interaction; hence it is not its cause. Such a view is implied in the Language Dominance Hypothesis (Petersen, 1988), which claims that language dominance is the major source of language interaction in young bilinguals. In contrast, the ‘maximalist’ view claims that linguistic structure does matter, and that the structure itself and its complexity determine the direction of interaction; hence the structural differences cause language interaction. For simultaneous bilingual acquisition such a view is taken in the Cross-Language Cue Competition Hypothesis (Döpke, 1998; Döpke, 2000) and in the Markedness Hypothesis (Müller, 1998).

The term ‘structural’, in our understanding, follows the definition of ‘phonological knowledge’ by Docherty et al. (2005), which “embraces all the systematic relationships between the sound patterns of spoken language and the external environment”. This definition includes, among others, socially structured variation, rather than only a structural subset covering lexical contrasts. It embraces any systematic sound structure property, be it linguistic or sociolinguistic in nature.
Crosslinguistic differences can be addressed from the point of view of mere structure itself (e.g. phonological or syntactic constituents) and its distribution in the language (e.g. intralanguage frequency or frequency of input). With regard to the structural complexity of linguistic tasks and their acquisition, much research has been done in Second Language Acquisition (SLA) studies. For example, the Contrastive Analysis Hypothesis (CAH) (Lado, 1957) predicted the transfer of linguistic habits in adult L2 learners on the basis of the similarities and differences between the two phonological systems in contact, and also predicted the degree of difficulty of learning. Similar categories in L2 were considered to be easy to acquire (‘positive transfer’), while those that were different were considered difficult to acquire (‘negative transfer’). In contrast to CAH, Flege’s Speech Learning Model (SLM) (Flege, 2002) predicts that L2 learners have more difficulty establishing similar (but phonetically differing) phonological categories than new ones. Evidence for SLM is more solid, since it is empirically based (that for CAH is drawn from impressionistic observations in foreign language classes). Both accounts build on the idea that structure is an important factor in determining transfer (whatever the direction). Such structural differences should also be important in simultaneous bilingualism, but the direction of language interaction (if it takes place) might be qualitatively different, and require a separate model (MacWhinney, 1997; MacWhinney, 2004).

Weinreich (1953, p.18) introduced a typology of processes of interference at the level of sound structure for L2 learners and adult bilinguals, which was derived from structural differences at phonological and phonetic levels. He distinguished between:

1. ‘Under-differentiation of phonemes’, i.e.
a process when “two sounds of the secondary system whose counterparts are not distinguished in the primary system are confused”. For example, RP English features a phonemic tense-lax vowel contrast such as /i/ and /ɪ/, while in Spanish this contrast is absent (only tense vowels are available). Therefore, a Spanish L2-learner of English must acquire a subtle vowel quality difference between English tense and lax vowels. Empirical evidence confirms that the presence or absence of a tense-lax contrast causes difficulties in acquisition for L2 learners (Panasyuk et al., 1995; Markus & Bond, 1999; Escudero, 2000; Guion, 2003; Piske et al., 2002). In terms of taxonomy, such crosslinguistic differences in structure have been labeled as systemic: i.e. involving different inventories (Wells, 1982). Another example of ‘under-differentiation’ is the presence and absence of postvocalic conditioning of vowel duration by the voicing of the following consonant in languages like English and French. In Mack’s (1982) study, proficient French speakers of L2 English produced a relatively short (similar to the French model) phonetic duration of vowels before voiced stops that should be long in the monolingual English model. Unlike native English speakers, the French L2 learners of English used their French ‘undifferentiated’ vowel categories in English irrespective of the voicing of the following consonant.

2. ‘Over-differentiation of phonemes’ also involves systemic crosslinguistic differences and occurs when L2-learners impose distinctions available in their L1 in structural places of L2 where they should be absent. Such a systemic difference is, for example, the presence of intrinsic vowel duration conditioning in L1 Finnish, and its absence in L2 Russian. It is found to affect the speech production of Finnish students learning L2 Russian: i.e. they superimpose the Finnish short/long vowel distinction on Russian, where it should not occur at all (de Silva, 1999).

3.
‘Reinterpretation of distinctions’ occurs when an L2-learner distinguishes a set of phonemes in L2 by some secondary phonetic features which are more important in his L1. In taxonomic terms, the crosslinguistic differences involved have been labeled as distributional (or phonotactic): i.e. involving differences in the phonotactic distribution of an element in the system (Wells, 1982). For example, beginning English learners of L2 Thai interpret the Thai three-way voicing onset distinction /b p pʰ/ as a two-way distinction /p pʰ/ based on aspiration rather than on voicing in word-initial positions (Pater, 2003), even though both voicing and aspiration are present in English.

4. ‘Phone substitution’ may happen when two languages feature the same phoneme, but differ in its phonetic implementation. In taxonomic terms, such crosslinguistic differences have been labeled as realisational (Wells, 1982). For example, for the voiceless stop /t/ in Spanish, the place of articulation is dental, but in English it is predominantly alveolar. The alveolar articulation used by English L2 learners of Spanish is thus considered to be an interference, and together with VOT differences it constitutes a source of foreign accent in Spanish (González-Bueno, 2002).

It is not only the surface structure that affects the acquisition processes in bilinguals, but also the distributional characteristics within a language, such as the frequency of tasks and input (MacWhinney, 1997; MacWhinney, 2004; Paradis, 2004). For example, de Houwer (1990) addressed distributional differences between the grammatical tense systems in a Dutch-English bilingual subject. Both English and Dutch feature very similar simple past and present perfect tense systems. However, monolingual Dutch children first acquire the present perfect, while monolingual English children first acquire the simple past tense. Both are acquired first due to their more frequent use in colloquial speech.
The bilingual child in de Houwer’s (1990) study acquired the syntactic tense properties in the appropriate language-specific order, despite the crosslinguistic similarities from the point of view of surface structure.

Bilingual speech production patterns can also be accounted for by looking into the sociolinguistic variation of the language varieties bilinguals speak. For example, Khattab (2002) studied the speech production patterns of three English-Arabic bilingual children (aged five, seven and ten). The children were raised in Lebanese families residing in Yorkshire (England). One of the aspects addressed by Khattab was the production of /l/. From the standard descriptions of the two acquired language varieties it could be expected that word-initially the children would acquire a ‘dark’ /l/ in English and a ‘clear’ /l/ in Arabic. However, Khattab’s English monolingual control data suggested that the peers, their parents and the subjects in the Leeds IVIE corpus also produced clear /l/ word-initially alongside the dark variant. Her bilingual subjects showed similar variation ranges in their English. Khattab’s data showed how such structured variation can be part of a child’s sociolinguistic competence, and the study provided a good example of how easy it would have been to misinterpret these data in favour of interference from Arabic, if appropriate control data had been lacking.

Differences in the acquisition rates of sound structure might also be connected to articulatory phonetic complexity. A good example is the bilingual acquisition of Voice Onset Time (VOT) differences. Kehoe (2004) studied the acquisition of VOT by four German-Spanish bilingual children aged 1;9 to 2;6. For the word-initial voiceless stops, German features long lag and Spanish short lag. Phonologically voiced stops also involve a realisational difference: i.e. Spanish features a voicing lead while German features short lag. In both languages, the patterns of voiced stops are acquired later than those of the voiceless stops.
Furthermore, the monolingual acquisition rates of the VOT in voiced stops in Spanish seem to be even slower than those in German. One of the explanations proposed for these acquisition rate differences was that the voicing lead is inherently more difficult to produce than the short lag. Kehoe tried to test this difference by involving the concept of ‘markedness’, since it incorporates such acquisition rate differences (see also Section 1.3.2.3.3). According to Kehoe, the bilingual results were consistent with three different patterns of language interaction (Paradis & Genesee, 1996): i.e. delay, transfer and no interaction. The variability of the results could partly be explained by the differences in the amount of bilingual input received by different children. However, one ‘balanced’ bilingual showed evidence for language interaction, while another one did not. Thus dominance was not the only conditioning factor. This study showed that differences in the surface structure of some ‘end product’ are not sufficient to explain the complexity of the acquisition process, and that acquisition rates should be taken into account.

To summarise, if we look at the structure of languages in contact at the phonological and phonetic level, the differences may involve systemic differences and realisational phonetic differences in the implementation of the same phoneme, but also phonotactic differences in the implementation of the same set (or subsets) of structures. Such structural differences can be potential docking sites for language interaction; however, the distributional characteristics (in terms of frequency) of these structural differences in the language input are also important factors not to be dismissed. The next sections will discuss current hypotheses on the direction of language interaction specifically formulated for simultaneous bilingual acquisition.
1.3.2.3.2 Cross-Language Cue Competition

The Cross-Language Cue Competition Hypothesis (Döpke, 1998; Döpke, 2000) builds on the Competition Model (Bates & MacWhinney, 1989; MacWhinney, 1997) to account for how structural ambiguities of the languages in contact determine transfer in simultaneous bilingual language acquisition. The Competition Model (Bates & MacWhinney, 1989; MacWhinney, 1997; MacWhinney, 2004) views first and second language acquisition as a constructive, data-driven process, which relies on universals not of the linguistic but of the cognitive system. The basic claim of the Competition Model with regard to first language acquisition is that structural cues available within a language compete with each other based on their ‘cue strength’: i.e. cues which are most reliable (i.e. unambiguous) and frequent are acquired first, while less reliable cues (i.e. more ambiguous or less frequent) are acquired later. This applies to all linguistic components: phonological, morphosyntactic, lexical, and semantic. With regard to SLA the strongest claim of the Competition Model is “all that can transfer, will transfer” (MacWhinney, 1997), given a potential for a crosslinguistic conflict. All beginning L2 learners start with a ‘parasitic’ (MacWhinney, 1997) set of linguistic structures based on their L1. In the context of first language acquisition, ‘cue strength’ is the factor determining the acquisition process, its relative order and difficulty. Linguistic cues with the strongest ‘cue strength’ are acquired first. ‘Task frequency’ is an important factor determining ‘cue strength’. The factor ‘task frequency’ comprises language-internal frequencies of properties, but also environmental frequency (no input means there is nothing to acquire).
MacWhinney notes that in the context of SLA and simultaneous bilingual language acquisition the factor ‘task frequency’ is of importance, because if one of the languages is infrequently used, “task frequency could become a factor determining a general slowdown of acquisition” (MacWhinney, 1997, p.122).

The Cross-Language Cue Competition Hypothesis (Döpke, 1998; Döpke, 2000) builds on the Competition Model (Bates & MacWhinney, 1989; MacWhinney, 1997) to account for the situation of simultaneous bilingual acquisition. The hypothesis is derived from the longitudinal data of three German-English children (aged between 2;0 and 5;0) growing up in Australia in families with German-speaking mothers and English-speaking fathers. English is spoken between the parents and in the community. The author analysed language-specific word order in verb phrases with auxiliary verbs and the acquisition of finiteness markers. For example, in both languages simple sentences feature the SVO word order. However, in sentences with auxiliary verb phrases the word order is different: i.e. in English they retain the SVO word order as in simple sentences (e.g. I find it versus I will find it), while in German the complement moves from the post-verbal to the pre-verbal position (e.g. Ich finde es versus Ich werde es finden). Word order is thus more complex in German than in English. The results showed that before the age of three the three subjects produced German sentences with word order like Ich möchte essen das instead of Ich möchte das essen significantly more often than has been reported for German monolingual peers. Döpke (1998; 2000) explains the appearance of such non-target structures in the speech of bilingual subjects by introducing the Cross-language Cue Competition Hypothesis (CCCH). According to CCCH, structurally ambiguous cues of the languages in contact (like the above word order difference) present a cognitive challenge to a child.
The appearance of ‘non-target structures’ (i.e. interference or transfer) is caused by the presence of such structural crosslinguistic differences, and their relative ‘cue strength’. ‘Non-target structures’ from the less complex language (in this case English) appear in the language containing a more complex structure (German). However, despite the claimed importance of ‘task frequency’ in the Competition Model (MacWhinney, 1997, p.122), Döpke (2000) argues that the environmental situation affects the frequency of non-target structures, but is not their primary cause. It certainly does not determine the direction of transfer.

If CCCH is true, transfer can be bi-directional for different sets of structural ambiguities, depending on the surface structure of the languages in contact. However, for any given structure it is always unidirectional towards the language with the more ambiguous structure. This hypothesis can be falsified if an ambiguous (unreliable) cue of one language is transferred onto another language where there is no ambiguity. This would be the case, for example, if we observed a transfer of the type ‘over-differentiation of phonemes’ (Weinreich, 1953) involving systemic crosslinguistic differences (see Section 1.3.2.3.1). However, since CCCH uses ‘cue strength’ devoid of ‘task frequency’ as a possible factor of language interaction, it is not clear how CCCH can account for the realisational and distributional differences described in Section 1.3.2.3.1, that is, for structures that are equally ambiguous from the point of view of either language or that differ in their distributions.

1.3.2.3.3 Markedness

Another explanation (similar to ‘cue strength’) of the directionality of language interaction in simultaneous bilingual acquisition employs the concept of ‘markedness’. ‘Markedness’ offers an explanation with some predictive power as to why certain linguistic features are basic, frequent and relatively easy to acquire while others are not.
The idea originally referred to phonological privative oppositions (Trubetskoy, 1939, p.87), with those members of the opposition lacking a distinctive feature being ‘unmarked’ (like [-voice] in unvoiced stop /p/), while the members of the opposition featuring a distinctive feature are ‘marked’ (like [+voice] in /b/). An additional test of markedness involved neutralised positions, where the member of the opposition which appeared in the neutralised position was considered unmarked (like word final devoicing of [p] for /b/ in German or Russian). However, over the years other interpretations have been given for markedness. One of them is the distinction in Chomsky’s Universal Grammar between the core (innate marked and unmarked) rules of language and the peripheral marked rules (Chomsky, 1986). Another interpretation comes from the point of view of linguistic typological universals with features present in most languages being unmarked and those more exceptional being marked (Ellis, 1994). In SLA and bilingual acquisition studies, the concept often encompasses a combination of the structure of the languages in contact (presence and absence of features), their versatility (number of rules determining a feature in different contexts) and frequency within and between the languages in contact (Ellis, 1994). There is some empirical evidence that markedness may help to explain the direction of transfer in simultaneous bilingual acquisition. Müller (1998) claims that transfer in young simultaneous bilinguals can be viewed as a relief strategy that helps them to cope with structurally ambiguous input in the two languages. According to Müller (1998), markedness is a key to understanding the ambiguity of input, though like Döpke (1998; 2000) she views markedness from the point of view of surface structure devoid of the frequency effects. 
Müller (1998) looked at the acquisition of word order in subordinate clauses in simultaneous bilingual children acquiring German and either English, French or Italian. In German, the rules of word order in subordinate clauses are marked, since there are many rules determining it. German presents a child with a more ambiguous (marked) input, with ‘verb-object’ (VO) order being just an option, while the English input is unambiguous (and unmarked) with regard to this syntactic property. Müller reviewed ten different case studies of simultaneous bilingual children with regard to the acquisition of this syntactic ambiguity. She concluded that in all cases transfer was unidirectional from the language with unmarked syntactic structure (French, Italian or English) into the language containing an optional marked structure in contexts determined by rule (German). Importantly, Müller (1998, p.160) claims that this unidirectionality is independent of whether German is a “preferred language of the bilinguals or not” [our italics], in other words whether they are balanced or dominant bilinguals. She interpreted the differences between monolingual and bilingual children as quantitative, since similar patterns of errors were found in both groups, and qualitative, since in monolingual children such error patterns were exceptional while in bilingual children they were frequent. Müller does not claim that such transfer is a necessary feature of bilingual language. Thus, we can further derive that markedness of a language structure is just a factor of language interaction, but not its direct source.

The unidirectionality claimed by Müller is very similar to the formulation of CCCH (Döpke, 1998; Döpke, 2000). In fact Müller’s hypothesis can be refuted in exactly the same way as CCCH, as discussed in the previous section.
The difference between the two hypotheses is the paradigm adopted by the authors, since markedness implies a nativist paradigm while ‘cue strength’ builds on a connectionist view. Further, the two hypotheses have common ground in rejecting the importance of language dominance.

Markedness has also been invoked to explain language interaction at the phonological level of language in simultaneous bilinguals (Paradis, 2001; Kehoe, 2002; Lleó, 2002; Whitworth, 2003; Kehoe, 2004). We discuss two of these studies (Kehoe, 2002; Whitworth, 2003) in greater detail in Section 2.3.2, since they deal with vowel duration. Paradis (2001) examined word-truncation patterns in 17 French-English bilingual children aged about 29 months growing up in the bilingual community of Montreal, Canada. She compared bilingual speech production to that of English and French monolingual children (n=18 in each group). French features an iambic rhythm (e.g. WWWS). In English the trochaic pattern is predominant (SWS’W); however, other patterns (WS’WS, WSWW, SS’WW) are also common. Accordingly, the patterns of syllable omissions in monolingual children show a trochaic bias (SW) in English and an iambic one (WS) in French. The results for the bilingual children (Paradis, 2001) showed that they performed similarly to the two monolingual peer groups, except for the crosslinguistically ambiguous WS’WS English pattern, which structurally resembles the French WWWS pattern. For this English pattern the bilingual children showed an iambic (French) bias in the preservation of the syllables, unlike the English-speaking monolingual peers. Thus, despite the language-specificity of the majority of bilingual realisations, this result also firmly suggested the presence of crosslinguistic effects from French in the bilinguals’ English. Therefore, this finding supported both the idea of a ‘cue strength’-driven (Döpke, 1998; Döpke, 2000) and that of a markedness-driven (Müller, 1998) direction of language interaction, as proposed for syntax.
However, Paradis (2001) also considered the possibility of language dominance in her 17 bilingual subjects. She performed post-hoc tests for English-dominant and French-dominant bilinguals (determined by amount of exposure). The results showed that the French-dominant group had a significantly stronger tendency to treat the English WS’WS words like French words than the English-dominant group. Paradis’ (2001) results emphasised the importance of considering both structural linguistic properties and environmental factors such as dominance.

1.3.2.4 Language dominance

Language dominance, the notion that “one language is somehow stronger than the other and affects processing of the other” (Lanza, 2000), has been a controversial issue in bilingual acquisition studies. The controversy revolves either around the presence of the confounding effect of dominance on the child’s developing linguistic system, or around methodological issues of measuring dominance (is it defined in the language input or output?). The fact that language dominance may play a role in bilingual acquisition has been assumed or at least considered in quite a few handbooks in the field (Hamers & Blanc, 2000; Grosjean, 1982; Meisel, 2003); however, there is surprisingly little empirical evidence for its operation.

Grosjean (1982, p. 189) pointed out that “the main reason for dominance in one language is that the child has had greater exposure to it and needs it more to communicate with people in the immediate environment”. In that sense, it seems reasonable to consider the amount of input and motivation for language use to determine language dominance. It is also reasonable that (if it matters) language dominance should somehow be reflected in the mental representation, and that as an effect of this dominance the language output can contain transferred structures.
However, since we know that transfer of linguistic structures might be influenced by other factors, such as the language structure itself, measuring the amount of transfer in both languages in the output, and attributing it to dominance only, seems problematic.

For example, the Dominant-Language Hypothesis (Petersen, 1988, p. 486) claims that for word-internal language mixing, grammatical morphemes (like plural or tense markers) of the dominant language of a bilingual child may co-occur with the lexical morphemes of either the dominant or the non-dominant language, while grammatical morphemes of the non-dominant language never occur in the dominant language. The hypothesis predicts a unidirectional transfer from the more dominant into the less dominant language irrespective of the language structure. The hypothesis is drawn from the data of a Danish-English bilingual child (aged 3;2), who lived in the USA and attended English-speaking day-care for 30 hours a week, while spending the rest of the week with her Danish-speaking parents. Petersen measured the girl’s proficiency in the two languages in terms of the occurrence of language mixing in the language output: i.e. in English she had only two mixed items, while in Danish a considerable amount of mixing from English occurred. However, the problem with this analysis is that, although it does measure the girl’s language output, it does not consider any explanations of this language mixing other than language dominance, such as, for example, the structural complexity of tense marking in Danish and English verbs and their monolingual acquisition patterns.

Lanza’s (1992) data support Petersen’s (1988) language dominance hypothesis. Like Petersen, Lanza interpreted the directionality of mixing in her data as an indication of language dominance. She claimed that dominance is not a necessary correlate of simultaneous bilingualism, and that its state is prone to change over time.
She measured similar morphosyntactic features to Petersen. Lanza’s subject acquired a language pair typologically similar to that in Petersen’s study, i.e. Norwegian and English, with the difference that the girl was dominant in Norwegian rather than in English, due to a greater exposure to Norwegian in Norway. Interestingly, Lanza (1992, p. 642) noted that the patterns of mixing were a mirror image of those in Petersen’s study, which is not surprising given the mirror image of dominance. Lanza (2000) noted that the same bilingual girl in fact also used an English pattern of negation in her dominant Norwegian, and that this could be interpreted in favour of CCCH. Lanza (2000) refined her previous claim (1992) by stating that the claims of language dominance and CCCH “need not be mutually exclusive”.

To conclude, at both ends of the ‘minimalist’ and ‘maximalist’ approaches to structural differences in language interaction, there is no conclusive evidence that these accounts should be looked at in an ‘either/or’ fashion or be dismissed. It thus seems reasonable to test the two accounts simultaneously, so that the apparent binary nature of these proposals could be given a continuous interpretation, should tendencies emerging over a large number of studies point in this direction.

1.3.2.5 Bilingual bootstrapping

The “Bilingual Bootstrapping Hypothesis” (Gawlitzek-Maiwald & Tracy, 1996) accounts for syntactic acquisition, and views language mixing in young simultaneous bilinguals as a relief strategy which involves a temporary use of the child’s expertise in one domain of LA to solve similar problems in LB. The hypothesis was derived from the longitudinal data of one bilingual child (aged 2;3 to 4;3) acquiring German and British English in South Germany. The term ‘bootstrapping’ generally means that the improvement of one capability automatically improves any dependent capabilities.
Thus, bilingual bootstrapping means that “something that has been acquired in language A, fulfils a booster function for language B” (Gawlitzek-Maiwald & Tracy, 1996, p. 903), or at least it serves as “a temporary pooling of resources”. For example, the subject in their study first acquired the infinitival constructions in English, with frequent use of want to constructions. At the same time in German she did not produce such infinitival constructions; however, she did use mixed utterances like: “Papa du mußt warten for me to dressed” (‘Daddy, you must wait for me to get dressed’). The subject later acquired the German infinitival constructions.

This hypothesis is attractive because it puts the process of acquisition in a developmental perspective. Unfortunately, Gawlitzek-Maiwald & Tracy (1996) do not explicitly state how the hypothesis can be tested. In our view, one of the predictions of this hypothesis should be that once a structure is acquired, language interaction involving this structure should cease, since the child can ‘pool the newly acquired resources’ in the appropriate language, and no longer needs to rely on those of the other language. Of course, acquisition is a gradual process, and the difficult question in child language acquisition is ‘when is something fully acquired?’ In this sense this hypothesis is difficult to test empirically, since if a structure emerges, but is not yet fully acquired, and there is still language interaction present for this structure, then the hypothesis holds true. However, if a structure is fully acquired, but there is language interaction involving this structure, then the hypothesis should be falsified.

1.4 Summary

In this chapter we introduced the concepts of language interaction and differentiation.
We saw that the current overriding assumption about bilingual phonological acquisition is that the mental representation of a bilingual’s languages is differentiated and that additionally both systems can interact to variable degrees. This assumption has received consensus among various researchers studying phonological aspects of bilingual language acquisition, even though the empirical foundation for this assumption mainly comprises studies of non-speech related language structures in morphology, syntax and lexicon.

We discussed the concept of ‘language interaction’, and its relation to other broadly used terms such as ‘interference’, ‘transfer’, ‘code-switching’ and ‘language mixing’. From this discussion, we concluded that the concept of ‘transfer’ or ‘interference’ retains its validity and usefulness in bilingual language acquisition research. Further, it remains an open question what forms language interaction can take in phonological simultaneous bilingual acquisition.

The discussion of the factors affecting language interaction concentrated on the claimed importance of system-internal factors such as ‘cue strength’ (or ‘markedness’ in the nativist paradigm), environmental factors such as ‘language dominance’, and the role of other environmental factors such as “the quality of input” and language mode. We emphasised that the available empirical evidence does not make it possible to treat the confounding factors such as linguistic structure or language dominance in an ‘either/or’ fashion. It is necessary to consider these factors simultaneously.

2 Crosslinguistic Differences in Sound Structure and Their Acquisition

2.1 Differences in Sound Structure between Scottish English and Russian

2.1.1 Introduction

The bilingual subjects in this study are acquiring two languages: the Edinburgh variety of Scottish Standard English (SSE or Scottish English) and the Moscow variety of Modern Standard Russian (MSR or Russian).
Russian belongs to the East-Slavic branch of the Slavic group, while Scottish English belongs to the West-Germanic branch of the Germanic group, both within the Indo-European language family. There are large Russian-speaking communities in 30 countries (including former USSR, China, Mongolia, Israel and the USA). Scottish Standard English is usually described as being one end of “a language continuum [ranging] from Broad Scots to Scottish Standard English” (Corbett et al., 2003, p.2). Broad Scots, also known as the Scots language, derived from the Anglian variety of Old English spoken in the 6th century A.D. According to Corbett et al. (2003) the term ‘continuum’ suggests “that there is a shading and overlap of language uses from ‘Broad Scots’ to ‘Scottish Standard English’”. The vocabulary and grammar of the two extremes of the Scots continuum are shared to a varying extent depending on (among others) the amount of influence of other standard varieties of English spoken in the U.K. Between the two extremes, Scottish Standard English has undergone great influence from other standard varieties of English spoken in the U.K., with which it shares a substantial overlap in written language, while retaining a more Scots phonology. As might be expected given their common inheritance, Scottish English and Russian have a number of similarities and a number of differences with regard to their phonological systems (both at the segmental and suprasegmental levels). As we mentioned in Chapter 1.3.2.3, structurally conflicting crosslinguistic differences may trigger language interaction in simultaneous bilingual language acquisition. Therefore, this chapter outlines the subset of crosslinguistic differences in sound structure with a potential for language interaction between SSE and MSR. Having described the crosslinguistic differences in detail, we shall review previous findings concerning such differences in monolingual and bilingual language acquisition.
Finally, based on the literature review we shall introduce the research hypotheses for this study.
2.1.2 Theoretical Framework for the Research Variables
2.1.2.1 A Short Sketch of the Research Variables
Scottish English and Russian have similar word-prosodic systems, i.e. the languages employ prosodic parameters other than pitch (f0) to encode lexical stress (Beckman, 1986), but they differ as follows:
(1) SSE features a certain amount of phonological encoding of tense and lax vowels, while such a contrast is absent in Russian. We intend to look at this contrast in three phonetic respects: vowel quality, vowel duration and laryngeal differences.
(2) SSE features the Scottish Vowel Length Rule (SVLR) (Aitken, 1981; Scobbie et al., 1999a; Scobbie et al., 1999b; Scobbie, 2002), a highly systematic distribution of vowel duration conditioned by post-vocalic consonantal voicing and manner of articulation. In SSE, SVLR applies only to the tense vowels /i/ /ʉ/ /ai/ and not to the lax /ɪ/. Such substantial extrinsic conditioning of vowel duration is absent in Russian (Chen, 1970; Gordeeva et al., 2003), where duration mainly cues prominence relations: i.e. the presence of a pitch accent versus lack of it, or the presence of word stress versus lack of it (Svetozarova, 1998).
(3) The presence of the SVLR in SSE seems to trigger a differential employment of acoustic cues to accentual contrasts (Gordeeva et al., 2003). In prominent positions, a higher ‘spectral balance’1 (an acoustic correlate of vocal effort) is associated with phonetically short vowels compared to the long ones.
The higher vocal effort of the short vowels is initiated at the pulmonic and laryngeal levels, and is reflected in the laryngeal adjustment towards a more asymmetrical glottal pulse (Gauffin & Sundberg, 1989; Sluijter & van Heuven, 1996b; Hanson, 1997). In the radiated spectrum, this glottal asymmetry is reflected in a higher energy of spectral partials above 1000 Hz (Sluijter & van Heuven, 1996b). We hypothesised (Gordeeva et al., 2003) that the enhancement of spectral balance in short but prominent vowels in SSE serves as an additional cue to achieve sufficient prominence. The acoustic cue ‘duration’ is functionally loaded: i.e. the SVLR conditions a short duration in words such as ‘sheep’, while prominence requires accentual lengthening. Therefore, the SVLR interacts with the Scottish English accentual system and dynamically affects the acoustic cues to prominence. For Russian there is no such association. In Russian, vocal effort applied during speech production typically results in a relatively ‘slack articulation’ of stressed vowels (Bondarko, 1998), with their spectral levels being similar to those of the SSE long vowels (Gordeeva et al., 2003).
1 In the literature, different terms - ‘spectral tilt’ (Campbell, 1995; Hanson, 1997; Sluijter & van Heuven, 1996a), ‘spectral balance’ (Sluijter & van Heuven, 1996b; Sluijter et al., 1997; Jessen, 2002) and ‘spectral emphasis’ (Traunmüller & Eriksson, 2000; Heldner, 2001; Heldner, 2003) - refer to the energy in spectral midfrequencies. The terminological differences reflect differences in methodology: i.e. they infer the same laryngeal effect by different acoustic measurements. ‘Spectral tilt’ is a ratio of the intensity of the first harmonic to that of F3 or F2, while ‘spectral emphasis’ measures energy (dB) in a signal low-pass filtered at 1.5 times f0. Since we use Sluijter & van Heuven’s (1996b) methodology with some adaptations, we call the acoustic parameter ‘spectral balance’ throughout the study.
These substantial structural differences are part of the adult language, and of the language input to bilingual children. We limited the scope of phonemes considered for the current analysis to a subset of close to close-mid vowels: /i, ɪ, ʉ/ in Scottish English, and /i, u/ in Russian, since this subset most clearly exemplifies the above crosslinguistic ambiguities. Before we proceed with a detailed explanation of these crosslinguistic structures, we would like to outline the framework within which we view the relation between these research variables.
2.1.2.2 ‘Stress-Accent Hypothesis’
In our analysis of word-prosodic features (such as the presence of systematic extrinsic vowel duration conditioning in SSE and its absence in MSR) and their influence on the prominence relations in the two languages, we follow Beckman’s (1986) ‘stress-accent hypothesis’. The hypothesis states that “stress accent differs phonetically from non-stress accent in that it uses to a greater extent material other than pitch”. In Beckman’s view, ‘accent’ is defined as a “system of syntagmatic contrasts for constructing prosodic patterns” (1986, p.1), and it is an organisational phonological feature rather than a distinctive one such as those found in segmental contrasts. The prosodic patterns subdivide an utterance into shorter phrases and organise them into larger units. Since such a prosodic system involves only syntagmatic oppositions (rather than paradigmatic), the distinctive function becomes secondary for prosodic properties (as opposed to its primacy in segmental contrasts). According to Trubetskoy (1939, p.35) accent in a language can serve delimitative, distinctive and culminative functions at the same time. However, in Beckman’s interpretation (1986), the culminative function subserves either distinctive or delimitative functions, since accentual systems only participate in syntagmatic contrasts.
‘Stress’ is “a phonologically delimitable type of accent” (1986, p.1), with no specification of pitch shape in the lexicon, because this varies depending on the pragmatic meaning. Contrary to Beckman’s interpretation of the relation between stress and accent, we agree with Sluijter and van Heuven (1996b) in understanding that stress and accent are two distinct dimensions, where “accentuation is used to focus and is determined by the communicative intentions of the speaker”, while “stress is a structural, linguistic property of a word that specifies which syllable in the word is the strongest” and a potential docking site for accent. In Beckman’s view, phonological categories of accentual systems are not necessarily phonetically uniform across languages (or even within a language). A phonological property in one language can differ in phonetic detail from the same property in another language. The difference in such phonetic detail is a question of extent rather than an absolute.
2.1.2.3 Stress and Vocal Effort in ‘Stress Accent’ Languages
Traditionally, the phonetic detail of lexical stress in languages like English has been measured by acoustic cues such as f0, vowel duration, vowel quality, and overall intensity. Importantly, the set of acoustic measurements taken to infer stress depends on our understanding of what constitutes stress physiologically for a given language (vocal effort and its levels in speech motor control). In the past, differences in stressed versus unstressed English syllables were attributed “to differences in physical effort” (Lehiste, 1977, p. 106). Jones impressionistically defined ‘stress’ as “the degree of force with which a sound or syllable is uttered” (1918, p.245). The strong force in his view “gives the objective impression of loudness” (1918, p.245). Similarly to Jones and Lehiste, Bloomfield (1933, p. 110) associated English stress with loudness.
However, there was no agreement as to the level of speech production at which this effort is achieved. Bloomfield (1933, p. 110) defined stress in terms of pulmonic, laryngeal and supralaryngeal effort: i.e. “more energetic movements, such as pumping more breath, bringing vocal cords closer together for voicing, and using muscles more vigorously for oral articulations”. Similarly, Jones (1918, p.245) defined the source of the ‘strong force’ as a result of “energetic action of all the articulating organs” that involves laryngeal and supralaryngeal levels and “a strong push from the chest wall”. However, Lehiste (1977) defined ‘effort’ on the pulmonic level only: i.e. involving physical activity of the muscles controlling respiration. In Lehiste’s understanding, the force exerted by the muscles was to be reflected in the subglottal pressure and ultimately in the compound effect on f0, vowel duration, vowel quality and overall intensity in the speech. An important turning point from associating loudness (and force) with stress in English came from Fry’s (1955; 1976) series of empirical studies. He investigated the hierarchy of acoustic cues to English stress in production and perception. Perceptually, higher f0 was found to be the strongest perceptual correlate of stress (as compared to unstressed syllables in word pairs like “OBject” versus “obJECT”), followed in cueing strength by longer duration, greater overall intensity and fuller segmental quality. Overall intensity was found to be a poor correlate of stress. There are two problems with Fry’s studies (1955; 1976). The first one concerns his treatment of f0 as a correlate of stress in the presence of accent.
As Sluijter and van Heuven (1996b) point out, the primacy of pitch in cueing stress emphasised in Fry’s studies (1955; 1976) “is a major source of the common misunderstanding in the experimental literature that f0 excursion is a direct acoustic correlate of the feature ‘stress’”, since stress and accent are “distinct (though non-orthogonal) dimensions” (1996b, p.2471) and should be treated separately. The second problem is his dissociation of stress from loudness based on manipulations of the overall intensity of the sound. The latter problem needs more clarification. Over the years, our understanding of physiological manifestations of lexical stress has substantially evolved. While a relative amount of ‘effort’ remains an important part of the physiological definition of stress (for example, Beckman, 1986; Laver, 1994), effort at the subglottal level only as defined by e.g. Lehiste (1977, p.106) is not sufficient to explain linguistic stress. In agreement with Jones’ (1918) and Bloomfield’s (1933) views, in addition to pulmonic effort the laryngeal and supralaryngeal levels need to be taken into account in explaining stress production mechanisms (Rietveld & van Heuven, 1997). Empirical evidence supports this broader physiological definition of vocal effort (Fónagy, 1966). At the pulmonic level the intercostal muscles and the diaphragm control subglottal pressure by a dynamic balance of expiratory and inspiratory effort (Laver, 1994, p. 513). The subglottal pressure in the lungs produces an airstream that creates a difference in transglottal pressure at the vocal folds (the Bernoulli effect) and sets them in vibration. At the laryngeal level, however, a speaker can control the cricothyroid muscles regulating the frequency of vocal fold vibration (f0) to convey the appropriate communicative intentions through intonation.
A speaker is also able to control the shape of the vibration pattern of the vocal folds, by reciprocal control of the activation of the abductor and adductor muscle groups (Hirose, 1999, p.128). The shape of the vibration pattern determines voice quality. For example, a breathy voice quality (often found in female speakers) is associated with a constant glottal leakage (the vocal folds never come completely together) and a relatively symmetrical glottal pulse (Hanson, 1997; Ní Chasaide & Gobl, 1999). At the supralaryngeal level, the effort in the production of stressed syllables is characterised by a more careful (and slower) articulation by the tongue body, tongue blade and lips resulting in a spectral expansion (fuller quality) in the vowel production (Rietveld & van Heuven, 1997). Therefore, vocal effort is created at three levels in the vocal tract, and the acoustic measurements inferring stress should reflect pulmonic, laryngeal and supralaryngeal contributions to vocal effort. 2.1.2.4 Acoustic Correlates of Vocal Effort in ‘Stress-Accent’ Languages Fónagy (1966) analysed acoustic spectra of Hungarian stressed and unstressed vowels. This is how he described the acoustic differences he observed (1966, p.239): “The greater effort was reflected in different ways. In most cases ... the formants of the vowels in stressed syllables had higher amplitudes and broader bandwidths. Especially sharp was the divergence in the higher frequency ranges. In these cases, the stressed syllables took a higher level and were longer and had a higher pitch. The saturation of the spectrum indicated stress even when the greater effort was not indicated by relatively higher sound pressure levels.” [emphasis added – O.G.] 
Even though Fónagy treated f0 as a correlate of stress rather than one of accent, he was probably one of the first phoneticians after Fant (1960) to emphasise the role of energy in the higher frequency ranges in differentiating stressed from unstressed syllables (in addition to sound pressure level, vowel duration and quality). It is known that an increase in subglottal pressure affects the f0, the radiated sound pressure level (SPL) and ultimately the overall intensity (Finnegan et al., 2000; Gauffin & Sundberg, 1989). Gauffin and Sundberg (1989) observed that when their subjects (four singers and two non-singers) were instructed to increase phonatory loudness, this was always accomplished by an increase in subglottal pressure. When subglottal pressure was low (1a in Figure 2-1) there was glottal leakage: i.e. transglottal airflow did not reach zero during the quasi-closed phase (the vocal folds did not come completely together). When subglottal pressure increased with the increase in phonatory loudness (2a in Figure 2-1) a nearly complete glottal closure occurred. Thus, increases in SPL also affected the laryngeal level. The shape of the pulse became more asymmetrical (from 1a to 3a the steepness of the slope of the dotted line increases): i.e. the trailing end of the flow glottogram pulses grew continuously steeper as subglottal pressure increased. This change of shape was due to the relative increase of the adduction force resulting in the increase of speed of the closing phase. This closing phase slope was the steepest in the condition of the highest phonatory loudness (3a in Figure 2-1). Gauffin and Sundberg’s (1989) study showed that at the level of the (inversely filtered) radiated spectrum (1b to 3b in Figure 2-1) the change in closing phase due to increased loudness resulted in a boost of frequencies between 2 and 4 kHz (hence midfrequencies).
Figure 2-1 Variations in the flow glottogram of a single cycle (left part of the diagram) when a speaker was instructed to increase phonatory loudness (conditions 1a to 3a from soft to loud). Right part of the diagram represents the acoustic consequence of such increase in the radiated spectrum (2nd and 3rd ticks on the horizontal axes show frequencies between 2 and 3 kHz) (adapted from Gauffin & Sundberg, 1989). Panels 1a to 3a plot transglottal airflow against time; panels 1b to 3b plot sound pressure level against frequency (all axes in arbitrary units).
Traunmüller and Eriksson (2000) defined vocal effort as “the quantity that ordinary speakers vary when they adapt their speech to the demands of an increased or decreased … distance”. They investigated the acoustic effects of the adjustment of vocal effort as a consequence of changes in the physical distance (0.3 to 187.5 m) between a speaker (n=20) and the addressee. They found that the overall SPL in subjects’ production was affected by vocal effort. However, the overall SPL was an ambiguous cue: i.e. when the distance of the microphone from the subject was variable (uncontrolled), it became an unreliable cue in terms of its correlation with vocal effort. In comparison with the overall SPL, more reliable information on vocal effort was conveyed by spectral emphasis (a methodological alternative to ‘spectral balance’, measuring SPL above 1.5 * f0 (Hz) relative to the overall SPL), which was not affected by the location of the microphone, age or sex of the speakers. ‘Spectral emphasis’ in Traunmüller and Eriksson’s (2000) study is an acoustic inference of the same laryngeal adjustments as in Gauffin and Sundberg’s (1989) study: i.e. the more asymmetrical the glottal pulse, the higher the energy in the midfrequency ranges of the radiated spectrum. Importantly, adjustments in laryngeal configuration are also found to reliably cue linguistic properties such as accent and stress.
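As a rough illustration of the spectral emphasis idea as it is paraphrased above (level above 1.5 * f0 relative to the overall level), the following Python sketch may help. It is not Traunmüller and Eriksson's implementation: the function names, the naive DFT and the use of a single analysis frame with a known f0 are all assumptions made for this example.

```python
import cmath
import math


def dft_power(samples):
    """Power of each non-negative frequency bin, via a naive DFT (illustrative only)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_emphasis_db(samples, sample_rate, f0_hz):
    """Level of the energy above 1.5 * f0 relative to the overall level, in dB.

    Hypothetical helper: the 1.5 * f0 cutoff follows the description in the
    text; everything else is a simplification.
    """
    power = dft_power(samples)
    n = len(samples)
    cutoff_hz = 1.5 * f0_hz
    total = sum(power)
    upper = sum(p for k, p in enumerate(power)
                if k * sample_rate / n > cutoff_hz)
    # Small constant avoids log(0) for silent input.
    return 10 * math.log10((upper + 1e-12) / (total + 1e-12))


if __name__ == "__main__":
    sr, n, f0 = 8000, 400, 200.0
    # Toy "voice": fundamental plus a second harmonic of adjustable strength.
    frame = [math.sin(2 * math.pi * f0 * t / sr)
             + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / sr)
             for t in range(n)]
    print(spectral_emphasis_db(frame, sr, f0))
```

Because the measure is a ratio of two energies taken from the same frame, scaling the whole signal by a constant gain leaves it unchanged, which is one way to see why such a measure would be insensitive to microphone distance while overall SPL is not.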
In the light of Fry’s studies (1955; 1976) trying and failing to link stress, overall intensity and loudness, Sluijter et al. (1996b; 1996a; 1997) re-addressed the issue of acoustic cues to stress and accent for Dutch and American English. They argued that the laryngeal level needs to be taken into account in studying the acoustic correlates of stress or accent. In their first study Sluijter et al. (1996b) separated three prosodic conditions: [+stress][-accent], [+stress][+accent], [-stress][-accent], for syntagmatically comparable word pairs like “CAnon” versus “kaNON” in Dutch. They examined the acoustic correlates of stress and accent other than pitch. In addition to the ‘traditional’ acoustic correlates of stress such as duration, overall intensity and vowel quality, they also measured ‘spectral balance’. Spectral balance is an acoustic inference of asymmetry of the glottal pulse. It was measured by comparing spectral levels (dB) in four contiguous frequency bands B1-B4: 0-0.5, 0.5-1.0, 1.0-2.0, and 2.0-4.0 kHz after normalisation for vowel quality differences. The relative importance of each of the acoustic parameters was defined by their statistical ability to discriminate between the three prosodic conditions in a syntagmatic comparison. With regard to the hierarchy of acoustic cues for Dutch, Sluijter & van Heuven (1996b) found that duration remains the most effective acoustic correlate of stress. However, spectral balance (intensity levels in B2 to B4) appeared to be a reliable correlate of stress (irrespective of accent) close in strength to duration. Overall intensity and vowel quality were the poorest indicators of stress in Dutch. Sluijter et al.
(1996a) further established the hierarchy of acoustic cues for Dutch by inferring laryngeal configuration using more established voice source measures (such as open quotient, amplitude of volume velocity, closure rate/skewness of the glottal pulse and glottal leakage) derived from the inversely filtered radiated spectrum (for a discussion see Ní Chasaide & Gobl, 1999). These additional acoustic inferences of vocal effort revealed the same role as in the previous study (Sluijter & van Heuven, 1996b), confirming the reliability of spectral balance as a measure of glottal pulse asymmetry, and confirming the results for the hierarchy. The analysis of overall intensity indicated that it is not a reliable acoustic correlate of stress, even though it reliably cued accent2. With regard to the hierarchy of the acoustic cues for American English, Sluijter et al. (1996a) found that duration, glottal parameters (high frequency emphasis and glottal leakage in B1), and vowel quality reliably cue stress in the [+stress][-accent] condition, while f0 and overall intensity are unreliable cues. Compared to Dutch, American English stress patterns had somewhat more influence from vowel quality. The relevance of intensity measurements in midfrequencies of the radiated spectrum as an acoustic cue to prominence has been confirmed in other studies (Campbell, 1995; Heldner, 2003). The perceptual relevance of vocal effort initiated at the laryngeal level as a cue to prominence has been addressed to some degree, but needs more study.
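The four-band comparison used for spectral balance (B1-B4: 0-0.5, 0.5-1.0, 1.0-2.0 and 2.0-4.0 kHz) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Sluijter and van Heuven's procedure: it omits their normalisation for vowel quality, uses a naive DFT on a single frame, and all function and variable names are hypothetical.

```python
import cmath
import math

# Band edges in Hz for B1..B4, as listed in the text.
BANDS_HZ = [(0, 500), (500, 1000), (1000, 2000), (2000, 4000)]


def power_spectrum(samples, sample_rate):
    """Return (frequency_hz, power) pairs from a naive DFT (illustrative only)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        spectrum.append((k * sample_rate / n, abs(acc) ** 2))
    return spectrum


def band_levels_db(samples, sample_rate, bands=BANDS_HZ):
    """Level (dB, arbitrary reference) of the energy falling in each band.

    Assumption: no vowel-quality normalisation is applied, unlike the
    published method, so levels are only comparable across tokens of the
    same vowel.
    """
    spectrum = power_spectrum(samples, sample_rate)
    levels = []
    for lo, hi in bands:
        energy = sum(p for f, p in spectrum if lo <= f < hi)
        levels.append(10 * math.log10(energy + 1e-12))  # avoid log(0)
    return levels


if __name__ == "__main__":
    sr, n = 8000, 400
    # Toy frame: strong 200 Hz component plus a weaker 3000 Hz component.
    frame = [math.sin(2 * math.pi * 200 * t / sr)
             + 0.2 * math.sin(2 * math.pi * 3000 * t / sr)
             for t in range(n)]
    for name, level in zip(("B1", "B2", "B3", "B4"),
                           band_levels_db(frame, sr)):
        print(name, round(level, 1))
```

Comparing the B2 to B4 levels of two tokens while B1 stays roughly constant is the kind of contrast the text describes for stressed versus unstressed syllables; a real analysis would add windowing, a proper FFT and the vowel-quality normalisation.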
Potentially vocal effort could be a perceptually relevant cue to stress, since perception of the loudness of a pure tone depends on its frequency and intensity (Fletcher & Munson, 1933), and the spectral midfrequency ranges, where the energy levels appear most differentiated for stressed syllables as compared to unstressed ones, lie within the frequency region of 2 – 5 kHz, in which the human ear is sensitive to smaller changes in intensity levels (Robinson & Dadson, 1956). The perceptual primacy of spectral tilt in cueing voice quality has been empirically found by e.g. Gobl & Ní Chasaide (1999a). With regard to prominence, Campbell (1995) addressed the issue of the perceptual relevance of spectral tilt at the level of the statistical ability of this parameter to discriminate between prominent/non-prominent syllables by means of linear discriminant analysis (LDA), based on spontaneous speech production data. However, the problem with Campbell’s link of prominence to perception is that LDA statistics do not necessarily reflect real human perception of prominence and stress; at least this claim should be empirically substantiated. Sluijter (1997) addressed the issue of the perceptual relevance of the acoustic cues to stress for Dutch in an experiment with synthesized polysyllabic nonsense stimuli like “nana”. Syllable duration, energy in appropriate spectral bands, and overall intensity were separately manipulated to mimic differences in stress. In addition, there was an extra condition: i.e. with and without addition of reverberating noise. The stimuli were presented to 24 phonetically trained and 22 phonetically naive Dutch listeners.
2 Like Traunmüller and Eriksson (2000), Sluijter et al. rigidly controlled the position of the microphone from the speaker’s mouth; thus, in more real-world recordings exhibiting variation in this distance, the reliability of overall intensity may not hold.
Results showed that overall intensity was a minor stress cue under all conditions; that when the perception of stimuli was hampered by variable reverberating noise, vocal effort implemented as spectral balance was the strongest perceptual cue (stronger than duration); and that in the condition with a stable noise background spectral balance and duration were primary cues close in strength. Contrary to Sluijter’s (1997) finding for stress, Heldner’s (2001) study on spectral emphasis as a perceptual cue to prominence in Swedish focal accents did not show any effect on perceived strength of focal accents through manipulated ‘spectral emphasis’. Heldner used already-accented words as a baseline for the manipulations. A potential explanation for Heldner’s negative results could be that he used already-accented words as a baseline without manipulating pitch. In the utterance contexts containing focal accents one would expect pitch to be a primary correlate of prominence, so that the listeners may have solely attended to the pitch information rather than to any changes in spectral emphasis. If the pitch information was made unreliable (e.g. monotonous), the listeners might have been forced to attend to other potential cues to prominence such as spectral emphasis. However, we agree with Heldner (2001, p.57) that the perceptual relevance of spectral emphasis in cueing prominence “at the upper end of the prominence scale” such as in focal accents does remain to be proven. To conclude this section, recent empirical studies (Campbell, 1995; Sluijter & van Heuven, 1996a; Sluijter & van Heuven, 1996b; Sluijter et al., 1997; Traunmüller & Eriksson, 2000; Jessen, 2002; Remijsen, 2002; Heldner, 2003) strongly underline the importance of the laryngeal level (in addition to pulmonic) in conveying linguistic information about stress and prominence in speech production and (somewhat less strongly) in perception. 
Overall intensity is unreliable, while selective energy in spectral midfrequencies measured as spectral balance, emphasis or tilt seems to reliably reflect a laryngeal contribution to vocal effort, stress and prominence. These empirical studies confirmed earlier (impressionistic) observations of linguists and phoneticians (among others Jones, 1918; Bloomfield, 1933; Lehiste, 1977) that stress and prominence are associated with increased vocal effort or force, and provided a good solution to the puzzle as to why overall intensity is not a reliable cue to vocal effort and what a more reliable cue is. In Section 2.1.3 we address the differences between the acoustic correlates of the accentual systems of Russian and Scottish English.
2.1.2.5 Functional Load
The theoretical concept that further connects the research variables in this study (such as duration and spectral balance) is the idea of ‘functional load’, i.e. the notion that the presence of certain phonological contrasts can influence the relative amount of work done by other contrasts within a phonological system (Beckman, 1986). For the acoustic correlates of accentual systems the hypothesis was formulated by Berinstein (1979) upon her finding that in K’ekchi (a Mayan language spoken in Guatemala) the presence of phonological vowel length contrast interacts with the use of duration as a cue to stress, since duration did not serve as a perceptual cue to stress for K’ekchi speakers. Thus, according to Berinstein, the use of duration as a cue to stress was precluded due to its function as a cue to phonologically contrastive length in K’ekchi. Other cues (f0 and intensity) played a greater role in conveying stress. There are at least two problems with Berinstein’s account of functional load. In Berinstein’s study, the stress condition was treated in Fry’s (1955) fashion: i.e. the polysyllabic target words with stress were recorded in prominent positions only.
Thus, stress was always confounded with accent, and f0 could be a correlate of accent rather than of stress (see Ladd, 1996 for the criticism of this approach). Another problem with Berinstein’s formulation of the functional load is its absolute terms: i.e. the use of duration to cue stress is precluded in the presence of a phonological vowel length contrast. Recent studies (Potisuk et al., 1996; Remijsen, 2002; Taff et al., 2004) provide evidence that, while the notion of functional load seems justified as such, the functional load hypothesis needs a more relative interpretation. For example, Taff et al. (2004) found for Aleut (a member of the Eskimo-Aleut language family with a 3-vowel system and contrastive phonological vowel length) that, unlike Berinstein’s findings for K’ekchi, stress in Aleut does increase vowel duration, despite the presence of lexically contrastive vowel length. The increase in duration in Aleut due to prominence is of a much lesser extent than, for example, in English, i.e. a language where phonological length contrasts are less frequent and involve additional differences in vowel quality. Taff et al. (2004) conclude that Aleut uses duration as a weaker acoustic correlate of stress than English. Therefore, the functional load on a suprasegmental acoustic correlate of stress is a matter of degree rather than absolute. These different interpretations of the scope of functional load with regard to stress may at least partly result from the differences in methodology: i.e. Berinstein derived her absolute interpretation from perceptual experiments, while both Potisuk et al. (1996) and Remijsen (2002) derived their relative interpretations of the functional load based on statistical regression from their production data; and Taff et al. (2004) from the raw production data only.
Despite these methodological differences and the fact that speech perception is known to work more categorically than speech production, it seems that the idea of functional load in its relative interpretation is more useful, since the relative scalar interpretation includes the case of preclusion, while the absolute interpretation excludes scalability. The relative interpretation of functional load is also compatible with the stress-accent hypothesis, in that according to it, the same phonological property can have a different phonetic implementation across languages and within a language, so that the phonetic implementation of stress is a question of extent rather than an absolute. A way to measure the functional load of accentual contrast within a language is by a syntagmatic comparison of the phonetic properties of stressed and unstressed syllables in polysyllabic words, like “INcrease” and “inCREASE”. This method has been traditionally applied over the years for different languages (for example Fry, 1955; Berinstein, 1979; Beckman, 1986; Sluijter & van Heuven, 1996b; Potisuk et al., 1996; Remijsen, 2002; Taff et al., 2004). However, since phonological properties of (non-) stress accent may not necessarily be phonetically uniform within a language, a paradigmatic comparison should also be possible, at least in those places of the phonological systems where the amount of work of the acoustic cues representing the property is expected to differ due to the presence of some structural contrasts (functional load). For example, for K’ekchi Berinstein (1979, p.34) found that in the presence of phonological vowel length and, thus, unavailability of duration to cue stress (in prominent positions) f0 and peak intensity of stressed syllables are more important perceptual and acoustic cues to stress than duration.
In addition, vowel peak intensity was also found to play a secondary role in a paradigmatic contrast in distinguishing short and long vowels in words with the same structure and utterance position. In such words, the peak intensity in short vowels was on average 1 dB higher than that in the long ones. The subjects’ distance from the microphone was fixed. At first sight, one can doubt the significance of such a small difference in overall intensity as that reported by Berinstein. However, there are more studies pointing in the same direction. This suggests that there may be a more salient underlying cause to this difference in peak intensity. Fónagy (1966) found for Hungarian (a language with lexically contrastive vowel length) that in the same utterance position and prosodic context short vowels have higher overall intensities than their long counterparts. Since Hungarian vowel length opposition also involves vowel quality differences (i.e. long vowels have more tense and close articulations than the short ones), Lehiste (1977, p.121) interpreted this overall intensity difference in favour of the existence of intrinsic intensities (i.e. due to vowel quality differences). While we agree that vowel quality could be a confounding factor, we argue that, in this paradigmatic comparison of prominent syllables, we should also consider the concomitant effect of another factor, i.e. greater vocal effort exerted by Hungarian speakers to achieve sufficient prominence for the phonologically (and phonetically) short vowels in the prominent positions. As we discussed in the previous section, differences in vocal effort can be measured by looking at intensities in midfrequencies. We know that given stringent control of the distance between the speaker’s mouth and the microphone (as in Gauffin & Sundberg, 1989; Sluijter & van Heuven, 1996b; Traunmüller & Eriksson, 2000), SPL is directly proportional to spectral balance.
Thus, given stringent control of the recording settings, significant differences in spectral balance may be proportional to less significant differences in overall intensity (e.g. about 2 dB on average as measured by Fónagy, and 1 dB in Berinstein’s study). It is therefore possible that the seemingly insignificant differences in overall intensity between Hungarian and K’ekchi short and long vowels resulted from more significant differences of intensities in the midfrequency range, very much as in the studies of the acoustic cues of stress and accent (Campbell, 1995; Sluijter & van Heuven, 1996b; Heldner, 2003).

Given that the above paradigmatic differences in overall intensities are systematic, it could be argued from the standpoint of the stress-accent hypothesis that the presence of phonological length may not only limit the relative extent of use (possibly including complete preclusion) of duration as a cue to prominence, but may also affect the extent to which the other secondary acoustic cues to stress, such as spectral balance, are employed. Vowel quality cannot be affected for these reasons, since in prominent positions it is also ‘occupied’ by segmental phonological oppositions, while the overall intensity has been found to be too ‘elusive’ (Lehiste, 1977) and unreliable (Traunmüller & Eriksson, 2000) to cue vocal effort.

Jessen (2002) showed that in German, spectral tilt (H1-A2 and H1-A3, following the methodology in Hanson, 1997) was an important discriminant of the tense/lax vowel opposition in prominent syllables. He found that in prominent positions German tense vowels have significantly lower intensities in frequencies around F2 than the lax ones. This finding also means that the traditional phonetic term ‘tense’ is in fact confusing, as it can stand for a more ‘lax’ configuration in laryngeal terms, and vice versa.
Jessen substantiated his acoustic inferences of laryngeal effort with more direct electroglottographic evidence: the intensity differences in midfrequencies were indeed a consequence of the asymmetrical glottal pulse. Jessen interpreted his evidence in favour of the syllable-cut prosody (e.g. Trubetskoy, 1939). According to this theory, the vowel [ɪ] in the word “Mitte” (centre) is ‘cut off’ by the following consonant within the same syllable, while the vowel [i] in “Miete” (rent) is simply followed (but not interrupted) by a consonant belonging to another syllable. However, syllable structure alone provides only a partial picture of this German contrast, since substantial phonetic differences in vowel quality and vowel duration should also be considered. For example, Stevens (1998, p.297) pointed out that there are more than just segmental differences to the English tense/lax contrast. There may also be differences in the laryngeal configuration involved in addition to vowel quality. The more breathy laryngeal configuration for the tense vowels reduces spectrum amplitudes in midfrequencies, whereas a less breathy laryngeal configuration for the lax vowels enhances the amplitude of midfrequencies. It is thus possible that the differences measured by Jessen were due to a different voice configuration adopted by the German speakers to mark the tense/lax vowel contrast. Besides, since the German tense vowels are roughly twice as long as the lax ones (for an overview see Whitworth, 2003), the difference in spectral balance might be explained by the differences in vowel duration.

There is an important terminological note to make here. It is common in the voice source variation literature to view different voice qualities as being on a continuum, with the ‘modal’ voice source configuration being a neutral midpoint (Ní Chasaide & Gobl, 1999).
Small deviations in adductive tension, medial and longitudinal compression of the vocal folds from this midpoint are usually described as ‘tense’ (with higher values of these parameters) or ‘lax’ (with lower values) (Ladefoged, 1971). Extreme changes resulting in perceptually different voice modalities with similar parameter changes are accordingly described as ‘creaky’ or ‘breathy’ (Ní Chasaide & Gobl, 1999). Stevens’ (1998) remark mentioned above implies that there is a big terminological problem: i.e. the ‘tense/lax’ supralaryngeal vowel quality means the opposite at the laryngeal level. To avoid the terminological confusion, in this study we shall use ‘more breathy’ for the ‘laxer’ laryngeal configuration, and ‘less breathy’ for the ‘tenser’ one, while we limit the terms ‘tense’ and ‘lax’ to the segmental opposition only. This also implies that the ‘neutral’ voice modality midpoint in this study is ‘breathy’, rather than ‘modal’, which is in fact more applicable to the female and child voices used in this study (Hanson, 1997; Kent & Read, 2002).

2.1.3 Segmental Differences between Scottish English and Russian

2.1.3.1 Russian vowel system

Russian features the six vowel phonemes shown in Table 2-1. Thus, the system of phonological oppositions involved is relatively small.

Table 2-1 Russian vowel phonemes (Bondarko, 1998)
i u

However, the phonetic vowel space (Table 2-2) is quite crowded, reflecting the contextual variability of vowels.

Table 2-2 Russian vowel allophones (adapted from Bondarko, 1998; Kuznetsov, 1997)
i e u æ a

The variability is mainly a result of the presence of two features in the Russian phonological system:

(1) Consonant palatalisation influences the following vowel allophone. The vowel [] appears after palatalised consonants, and in complementary distribution with the main allophone of phoneme /u/, as in [luk] ‘onion’ versus [lk] ‘hatch’.
Similarly, the sound [e] in stressed syllables is in complementary distribution with the main allophone of phoneme //, and [æ] with the main allophone of phoneme //.

(2) Like English, Russian features vowel reduction. However, the patterns of reduction are more complicated in Russian. The sounds [],[],[],[] in Table 2-2 typically appear in unstressed syllables. Additionally, [] appears in both stressed and unstressed syllables.

The vowel reduction patterns jointly depend on (a) the position of the unstressed syllable in relation to the stressed one; (b) the position of the unstressed syllable in relation to the word onset; (c) the underlying phoneme of the unstressed vowel; (d) whether or not the unstressed syllable has an onset. Depending on the above four factors, there are three vowel reduction patterns in Russian. For example, // in “ostorozhnogo” (‘from the careful one’) [.st.r.n.v] is reduced to [] in the 1st pre-tonic syllable or in the word-initial unstressed syllable without onset; // is reduced to [] in any other pre- or post-tonic syllable. The phonemic contrasts between // and // (as well as between /i/ and //) are neutralised in unstressed syllables.

2.1.3.2 Scottish English vowel system

With regard to the phonology involved in lexical contrasts, the SSE vowel system is more crowded than the Russian one. There are thirteen vowel phonemes. Ten of them are monophthongs (see Table 2-3), of which /ə/ (schwa) appears only in unstressed syllables, and / / appear only in closed syllables. In addition, the system features the three diphthongs /ai a i/.

Table 2-3 Scottish English vowel monophthongs (adapted from Wells, 1982)
i e o / a

As we have mentioned in Section 2.1.1, the system of SSE monophthongs is different from that of Southern Standard British English (SSBE) and smaller in the number of oppositions involved, since it retains a Scots phonology. The differences in the monophthongs between the two standard varieties are shown in Table 2-4.
Table 2-4 Comparison of monophthong phonemes between SSE and SSBE (adapted from Matthews, 2002)
Word: foot – goose. SSE: //. SSBE: //, /u/.
Word: palm – bath – trap. SSE: /a/. SSBE: //, //, /a/.
Word: lot – thought. SSE: //. SSBE: //, //.

It is important to note these cross-varietal differences in the context of the study, since bilingual and monolingual children in Edinburgh are exposed to different English varieties through mass media, nurseries and community (see more in Section 3.2.1).

2.1.3.3 Segmental Differences in the Focus of Investigation

Scottish English features a tense/lax contrast between /i/ and /ɪ/, while such a contrast is absent in Russian (it has only the phoneme /i/). In the bilingual context, this constitutes a systemic difference that is a potential ‘docking site’ for language interaction. We will discuss the bilingual acquisition studies dealing with this particular contrast and its apparent difficulty in acquisition in Section 2.2.

The SSE tense/lax contrast is different from that of SSBE. In SSBE, the phonological opposition usually implies both a difference in vowel quality (tense/lax) and a phonetic difference in vowel duration (with a lax/tense duration ratio of 0.7 in the same consonantal context). In SSE, however, the tense/lax opposition does not involve an extensive phonetic difference in duration, and it is featured only in the vowels /i ɪ/. Both “ship” and “sheep” are short in SSE (Aitken, 1981).

In the bilingual Russian/Scottish English situation, according to the ‘Cross-language Cue Competition Hypothesis’ (CCCH) (Döpke, 1998; Döpke, 2000) discussed in Section 1.3.2.3, the situation involving absence of the tense/lax contrast (Russian) should have stronger ‘cue strength’ than the situation involving its presence (SSE). Thus, if we extrapolate the CCCH to the level of speech, for this type of systemic difference we should observe unidirectional language interaction from Russian into Scottish English, but not the other way around.
The alternative ‘Dominant Language Hypothesis’ (DLH) (Petersen, 1988) would predict language interaction from the more dominant into the less dominant language, irrespective of their structure.

The second difference relevant to this study is between the phonetic quality of the close rounded phonemes: i.e. in SSE a more central [ʉ] and in MSR a back [u]. Crosslinguistically, this constitutes a realisational phonetic difference along the frontness – backness dimension. In Döpke’s (1998; 2000) terms, structures involved in such a realisational difference should have a similar ‘cue strength’. The CCCH would not predict any language interaction here. On the other hand, the DLH (Petersen, 1988) would predict a unidirectional language interaction from the more dominant language into the less dominant one.

Let us now consider the crosslinguistic differences between these vowels from the acoustic point of view. Figure 2-2 shows an acoustic representation of the vowels /i/, /u/, // and // in SSE, SSBE and MSR adapted from four acoustic studies (Bondarko, 1998; Deterding, 1997; Kuznetsov, 1997; Walker, 1992). The measurements are averages from adult female speakers. The vowels show the extremes of the vowel space.

Figure 2-2 Acoustic representation of SSE, SSBE and MSR cardinal vowel space, plotted as F1 (Hz) against F2 (Hz) (adapted from Bondarko, 1998; Deterding, 1997; Kuznetsov, 1997; Walker, 1992).

The vowel [i] seems to be similar in the three language varieties. The phonetic realisation of // in the SSE speakers (Walker, 1992) is lower in comparison with the SSBE speakers (Deterding, 1997). The main allophones of MSR /u/ and SSE /ʉ/ are very different acoustically, with the Russian vowel being back. Another striking issue is that the Russian close back rounded vowel /u/ and the more central SSBE /u/ are annotated with the same phonetic symbol in the literature.
While different studies (Bauer, 1985; Deterding, 1997; Hawkins & Midgley, 2004) (see also Section 3.6.4.2) reported that fronting has been progressing in the RP /u/ over the years, the phonological representation reflects the phoneme in its state of 40 years ago.

2.1.4 Prosodic Differences between Scottish English and Russian

Despite substantial typological differences in grammar and lexicon, Scottish English and Russian have quite similar word-prosodic systems.

Both languages have variable syllabic location of lexical stress, in that stress can create a syntagmatic opposition of polysyllabic words in both languages and change their meaning or grammatical properties. For example:
Russian: “ZAmok” (a castle) versus “zaMOK” (a lock)
Scottish English: “an INcrease” versus “to inCREASE”

Both Scottish English and Russian are ‘stress accent’ languages (Beckman, 1986). They employ stress (primarily encoded by duration) for syntagmatic contrasts between words. Pitch conveys the pragmatic meaning of intonation rather than being a correlate of stress, and it is aligned with stressed syllables. Pitch functions on a distinct dimension of prominence. Table 2-5 summarizes the main similarities and differences between the Russian and SSE word-prosodic systems.

Table 2-5 Broad differences and similarities between Russian and Scottish English word-prosodic systems.
Vowel reduction in unstressed syllables? Russian: Yes. Scottish English: Yes.
Acoustic correlates of stress-accent. Russian: 1. Duration; 2. Vowel quality; 3. Spectral balance? Scottish English: 1. Duration; 2. Spectral balance; 3. Vowel quality; 4. Intensity.
Suprasegmental paradigmatic contrasts available? Russian: No. Scottish English: Yes (SVLR).
Intonation (both languages): Pitch movement is not fixed at lexical level. Pitch expresses variable pragmatic meaning. Changes in pitch are associated with stressed syllables.

Both languages feature vowel reduction. However, the rules of vowel reduction are more complicated in Russian. We discussed their implementation in Section 2.1.3.1.
As a result, in Russian vowel quality seems to play a relatively more important perceptual role in distinguishing word stress (Bondarko, 1998; Svetozarova, 1998) compared to English. In Russian, it is second in strength as an acoustic correlate of word stress (Table 2-5). We are not aware of any studies addressing the role of spectral balance as a correlate of word stress in Russian. However, Bondarko (1998, p.55) supports the traditional viewpoint of Russian phoneticians that in Russian the pronunciation of vowels is characterised by a rather slack articulation in prominent positions. This observation may indicate a secondary role of the acoustic parameter ‘spectral balance’ (which implies relative slackness/tenseness of the glottal configuration) as compared to the primary role of vowel duration and vowel quality.

We have found no studies on the order of importance of acoustic correlates of the word-prosodic system in Scottish English. However, other varieties, like General American, have been studied in detail (Sluijter & van Heuven, 1996a; Beckman, 1986; Fry, 1955). As we discussed in Section 2.1.2.3, American English word stress is encoded by duration, spectral balance, vowel quality and intensity, with duration and spectral balance being close in strength (Sluijter & van Heuven, 1996a). Word stress in American English is defined at lexical and morphological levels in a way similar to SSE. Both varieties use largely the same lexicon and grammar, as well as largely the same rules of syllabification and vowel reduction. We assume, then, that the order of importance of the acoustic cues to word stress in the SSE word-prosodic system should be similar to that in American English.

As we discussed in Section 2.1.2.5, the presence of suprasegmental contrasts in a language (like phonological tone or length) may affect the relative strength of acoustic correlates to prominence and word stress in a word-prosodic system.
Scottish English features ‘The Scottish Vowel Length Rule’ (SVLR) (Aitken, 1981; Scobbie et al., 1999a; Scobbie et al., 1999b). It involves a highly systematic distribution of vowel duration conditioned by postvocalic consonantal voicing and manner of articulation. SVLR applies to the vowels /i/, /ʉ/ and /ai/, and it is conditioned by either (Scobbie, 2002):
• the right consonantal context of the vowel: i.e. voiced fricatives and /r/ condition long duration of the vowel, all other consonants condition short duration;
• the morphological context following the vowels: i.e. word-final open syllables are long, as in “brew”, and they remain long if followed by a morpheme “_ed”, as in “brewed”.

The differences between the application of the morphological conditioning and the consonantal conditioning in SSE create a quasi-phonemic length contrast in a limited number of words like “brood” /brud/ and “brewed” /bruːd/ (Scobbie, 2002).

Postvocalic consonantal conditioning of vowel duration has been claimed to be a phonetic universal (Chen, 1970), i.e. the duration is lengthened automatically due to the voicing of the following consonant, with a number of mainly physiological phonetic explanations (for a review see Lisker, 1974). The SSE SVLR pattern contradicts this argument of automaticity, since there is a strong segmental dependence in its applicability. Neither does the argument stand for Russian, since the language features final devoicing. As Keating (1984) put it, in Russian the “duration pattern was apparently determined by underlying values of the voicing features” (1984, p. 123), rather than by any physical voicing of the consonants. We agree with Keating that such vowel duration patterns are not universally predictable, but instead “each language must specify its own phonetic facts by rule” (Keating, 1984, p. 123).
This also means that vowel quantity and quality are interdependent in a language-specific way, and monolingual and bilingual speakers have to acquire the patterns. For example, if a child produces the Russian close back rounded [u] instead of the central [ʉ] in Scottish English, it is worth considering whether the negligible postvocalic vowel duration conditioning of Russian also replaces the SVLR.

Table 2-6 sketches the differences in the consonantal conditioning between Scottish English and other varieties (such as Southern Standard British English or General American). The table shows examples of consonantal conditioning for the vowel /ʉ/, and the relative length triggered by the following consonants. There are substantial cross-varietal differences in the application of the postvocalic conditioning: i.e. in SSBE vowel duration is mainly conditioned by the voicing of the following consonant, while in SSE both voicing and manner of articulation (voiced fricatives) trigger longer vowel duration. Unlike in SSBE, the SVLR conditions short duration in vowels followed by voiced stops, as in “brood” or “seed”. Differences similar to those in Table 2-6 apply to the vowel /i/.

Figure 2-3 represents acoustic differences in vowel duration (ms) between the postvocalic conditioning for the vowels /ɪ/ and /i/ in SSE and that in General American. In SSE, the SVLR applies to the tense vowel, but does not apply to the lax one to the same extent (Agutter, 1988; McKenna, 1988). The SSE lax vowel has been described as ‘invariably short’ (Aitken, 1981). In General American, there is no such differential implementation: i.e. voicing of the following consonant seems to trigger vowel lengthening in all contexts in both tense and lax vowels (House, 1961).

As we discussed in Section 2.1.2.4, the availability of certain paradigmatic contrasts in a phonological system (such as length) can affect the system of syntagmatic contrasts of accentual systems (cf. functional load in Section 2.1.2.5).
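The two conditioning patterns contrasted above can be sketched as a pair of small rule functions. This is only an illustrative simplification of the description in the text, not a model taken from the cited studies: the ASCII consonant labels, the function names and the reduction of the SSBE pattern to a plain voicing rule are all assumptions made for the example.

```python
# Hypothetical ASCII labels for the consonant that follows the vowel.
VOICED_FRICATIVES = {"v", "dh", "z", "zh"}
VOICED_CONSONANTS = VOICED_FRICATIVES | {"b", "d", "g", "m", "n", "ng", "l", "r"}


def svlr_length(following, morpheme_boundary=False):
    """Relative duration of an SVLR vowel in SSE: 'longer' or 'shorter'.

    following: label of the next consonant, or None if the vowel is word-final.
    morpheme_boundary: True if a morpheme boundary separates vowel and consonant.
    """
    if following is None or morpheme_boundary:
        return "longer"    # morphological conditioning: "brew", "brewed"
    if following in VOICED_FRICATIVES or following == "r":
        return "longer"    # consonantal conditioning: "bruise"
    return "shorter"       # all other consonants: "brood", "spoon", "Bruce"


def voicing_effect_length(following):
    """Simplified SSBE-like pattern: longer word-finally and before voiced consonants."""
    if following is None or following in VOICED_CONSONANTS:
        return "longer"
    return "shorter"


# Example words from the text: (following consonant, morpheme boundary before it).
WORDS = {
    "spoon": ("n", False), "Bruce": ("s", False), "bruise": ("z", False),
    "brood": ("d", False), "brute": ("t", False),
    "brew": (None, False), "brewed": ("d", True),
}

for word, (consonant, boundary) in WORDS.items():
    print(word, svlr_length(consonant, boundary), voicing_effect_length(consonant))
```

Under these assumptions the two functions disagree only for “spoon” and “brood”, i.e. before a nasal and a voiced stop, which is where the text locates the cross-varietal difference. The pair “brood” versus “brewed” differs only in the morpheme boundary flag, which corresponds to the quasi-phonemic length contrast.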
Since the SVLR in Scottish English is so different from the voicing effect in other English varieties, it can be expected that the phonetic detail of the phonological accentual system in SSE might differ somewhat from that of either American English or SSBE.

Table 2-6 Broad characterisations, for one representative vowel [ʉ], of vowel duration conditioning effects by various contexts in SSE and SSBE (adapted from Scobbie et al., 1999a)
Contexts: consonantal _n, _s, _z, _d, _t; morphological _#, _#d.
SSE, longer: bruise (_z), brew (_#), brewed (_#d).
SSE, shorter: spoon (_n), Bruce (_s), brood (_d), brute (_t).
SSBE (or Gen.Am.), longer: spoon (_n), bruise (_z), brood (_d), brew (_#), brewed (_#d).
SSBE (or Gen.Am.), shorter: Bruce (_s), brute (_t).

Figure 2-3 Acoustic differences in extrinsic vowel duration conditioning (raw duration in ms) for close vowels between SSE and General American, plotted by right consonantal context (voiceless stop, voiced stop, voiceless fricative, voiced fricative). The solid lines represent tense close vowels, while the broken lines represent the lax ones (adapted from House, 1961; Agutter, 1988; McKenna, 1988).

We can assume with more certainty that the Scottish English word-prosodic system is different from the Russian one, since in Russian vowel quality is the second most important acoustic cue to stress, and the extent of postvocalic conditioning of vowel duration is small compared to the SSE system (Chen, 1970; Gordeeva et al., 2003). For example, as shown in Figure 2-4, for /i/ the increase in vowel duration conditioned by the change of the following consonant from /t/ to /z/ is 118% in Scottish English and only 18% in Russian (Gordeeva et al., 2003).
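To make the percentage figures above concrete, the following minimal sketch shows one way such an increase can be computed, assuming it is expressed relative to the duration in the shorter (/t/) context. The millisecond values are hypothetical and were chosen only so that the arithmetic reproduces the reported 118% and 18%; they are not measurements from Gordeeva et al. (2003).

```python
def percent_increase(short_ms, long_ms):
    """Increase of long_ms over short_ms, as a percentage of short_ms."""
    return 100.0 * (long_ms - short_ms) / short_ms


# Hypothetical mean durations (ms) before /t/ and before /z/.
sse_increase = percent_increase(100.0, 218.0)      # Scottish English-like pattern
russian_increase = percent_increase(100.0, 118.0)  # Russian-like pattern

print(sse_increase, russian_increase)
```

With the same starting duration, a 118% increase more than doubles the vowel, whereas an 18% increase adds less than a fifth of its length.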
Figure 2-4 Mean duration (ms) of /i/ in SSE and Russian prominent CVC words as a function of the following consonant (voiced fricative versus voiceless stop), per position in utterance (pos1 = medial, pos2 = final in an utterance with more than one pitch accent, pos3 = final in an utterance with one pitch accent).

Gordeeva et al. (2003) also addressed crosslinguistic differences in vocal effort across prominent words in Russian and Scottish English. The study addressed the question of whether short and long SVLR vowels /i/ in prominent words like “sheep” and “cheese” differ in vocal effort (as determined by spectral balance), and whether the SSE pattern is different from the Russian one given phonologically similar word structure and utterance position. The materials were the same as in this study. The vowel /i/ in the CVC words (e.g. “sheep” versus “cheese”) was followed by either phonologically voiceless stops or voiced fricatives (see Section 3.4.1 for the implications of phonetic differences). The words were compared in multiword utterances in similar prosodic contexts. A similar structure was applied to the Russian words and utterances. The subjects were female middle class speakers (five Scottish and four Russian), aged between 25 and 45. Spectral balance was measured in a steady-state portion of /i/ in four fixed frequency bands around F1 to F4 for each token of /i/. The methodology was addressed in detail in Gordeeva et al. (2003).

The results are shown in Figure 2-5. They revealed that in the SSE midfrequency bands of 2.5 to 4.5 kHz, the short /i/ had significantly higher RMS power than the long /i/. In Russian, the contextual difference in the spectral balance of /i/ was not significant, but the RMS-power means were close to those of the SSE long vowel. Gordeeva et al.
(2003) showed that spectral balance is a relatively more important acoustic correlate of the SSE word-prosodic system than of the Russian one, since in Scottish English the application of SVLR in the vowel /i/ differentially affects the spectral balance of prominent short and long vowels, an effect that results from the greater vocal effort adopted by the speakers in short vowels in order to make them sufficiently prominent in the utterance. This context-dependent enhancement of spectral balance exemplifies the functional load of SVLR on the Scottish word-prosodic system. This supports Beckman’s (1986) dynamic view of an accentual system, in which phonological categories of the systems are not necessarily phonetically uniform across languages or within a language.

Figure 2-5 Mean spectral level (dB), expressed as the ratio of spectral level to overall intensity, in 4 frequency bands in three utterance positions in Scottish and in Russian, for /i/ before voiceless plosives and before voiced fricatives. B1 = mean F1 ± 150 (Hz), B2 = mean F2 ± 300 (Hz), B3 = mean F3 ± 300 (Hz), B4 = mean F4 ± 300 (Hz).

In Russian, the role of spectral balance was less important than in SSE, since the spectral balance was undifferentiated between the postvocalic conditioning contexts, and generally was as low as that of the SSE long vowel, indicating a rather slack glottal source configuration in the CVC words. The result was not surprising given the small extent of extrinsic vowel duration conditioning in Russian and the relatively great importance of vowel quality as a cue to stress.
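The band-level measure behind Figure 2-5 can be illustrated with a short sketch. This is not the RMS-power algorithm used in the study (which also involved noise subtraction); it is a simplified illustration that assumes the measure is the power inside a band centred on a formant, relative to the total power of the analysis frame, expressed in dB. The function names, the naive DFT and the synthetic two-component signal are all hypothetical choices made for the example.

```python
import cmath
import math


def one_sided_power(samples):
    """Naive DFT power per bin for bins 0..N/2 (adequate for a short frame)."""
    n = len(samples)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Interior bins stand for a positive and a negative frequency.
        weight = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        powers.append(weight * abs(coeff) ** 2)
    return powers


def band_level_db(samples, sample_rate, centre_hz, half_width_hz):
    """Power in centre_hz +/- half_width_hz relative to total power, in dB."""
    powers = one_sided_power(samples)
    bin_hz = sample_rate / len(samples)
    low, high = centre_hz - half_width_hz, centre_hz + half_width_hz
    band = sum(p for k, p in enumerate(powers) if low <= k * bin_hz <= high)
    return 10.0 * math.log10(band / sum(powers))


# Synthetic frame (hypothetical values): a strong 300 Hz component and a
# weaker 3000 Hz component, standing in for energy near F1 and near F3.
RATE, N = 8000, 400
frame = [math.sin(2 * math.pi * 300 * i / RATE)
         + 0.1 * math.sin(2 * math.pi * 3000 * i / RATE) for i in range(N)]

print(round(band_level_db(frame, RATE, 300, 150), 2))   # band around "F1"
print(round(band_level_db(frame, RATE, 3000, 300), 2))  # band around "F3"
```

In this toy frame the higher band lies about 20 dB below the overall level. A flatter spectral balance, of the kind associated in the text with greater vocal effort, would show up as a less negative value in the higher bands.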
2.2 Language Interaction in Bilingual Acquisition of Vowel Quality

2.2.1 Monolingual Acquisition

2.2.1.1 Non-Scottish English and Scottish English

Studies on vowel development in young American English-speaking monolingual children (Stoel-Gammon & Herrington, 1990) suggest that vowel monophthongs can be grouped into three categories based on their accuracy rates and the order of acquisition. At the same time, vowel substitution patterns in earliest normal and disordered acquisition highlight the more ‘difficult’ phonological categories that children have to acquire. For American English, Stoel-Gammon and Herrington (1990) report (based on auditory description) that:

(1) The corner vowels /i,,u/, midback /o/, and central // are acquired relatively early. These vowels seem to cause the least difficulty both for normally developing children and for children with phonological disorders. However, some children with phonological disorders substitute the tense vowel [i] with the lax counterpart [ɪ], while [u] can be substituted by [o].
(2) The group /æ,,,/ is acquired somewhat later than the first group of vowels.
(3) The front vowels /e,,/ are acquired the latest among these vowels. The target [] is sometimes substituted with [i] in phonologically disordered child speech.

However, other recent studies on the acquisition of the tense/lax vowel contrast contradict the above findings to some extent, at least for the tense/lax opposition. Kehoe and Stoel-Gammon (2001) found, based on auditory transcriptions, that while normally developing English-speaking children (n=14, aged 1;3 to 2;0) systematically substituted [i] for [ɪ] and the other way around (at least for the youngest children), the number of such realisations was limited and the majority of their productions had adult-like vowel quality. Besides, other substitution patterns of [ɪ] with [] have been reported in Otomo and Stoel-Gammon (1992): i.e. lowering of /ɪ/ is the most common pattern in their dataset.
In American English, the tense/lax contrast in vowel quality also involves a contrast in phonological length (long/short phonetic duration), with a lax-to-tense duration ratio (in words like “bit” versus “beat”) of .71 (House, 1961). With regard to the interplay of these factors in acquisition, Stoel-Gammon and colleagues (Stoel-Gammon et al., 1995; Buder & Stoel-Gammon, 2002) found that children acquiring such contrasts in American English initially acquire only the differentiation in vowel quality; the durational differences are added later.

To summarise, based on previous research, we can anticipate (at least for children acquiring American English) that the tense/lax contrast should already be established by the age of 3;0 (i.e. the starting age in our study). Given the substantial cross-varietal differences between American and Scottish English, we need to treat these findings with caution when transposing their possible consequences to children acquiring SSE. However, the American English monolingual data show that the tense/lax opposition poses a certain degree of difficulty in child speech production, while the tense counterparts of the opposition, /i/ and /u/, seem to be relatively easy to acquire.

As opposed to American English, the SSE close rounded vowel /ʉ/ is central (or even front), rather than back (Wells, 1982; Walker, 1992; Scobbie et al., 1999b), and unlike American English or RP it does not involve a tense/lax opposition. Contrary to the acquisition patterns for American English, Matthews (2002) found a very broad range of ‘non-adult-like’ realisations of the SSE vowel /ʉ/. Matthews (2002) is the only longitudinal study devoted to the acquisition of vowels in Scottish English children. It deserves close attention, because it also suggests the range of segmental variation that can be considered ‘native-like’ in the speech of our bilingual subjects.
Matthews’ (2002) study focused on the acquisition of segmental aspects of SSE vowels in Scottish English children (n=7, aged 18 to 36 months) growing up in Edinburgh middle class families. Matthews’ analysis concentrated on the qualitative aspects of vowel mastery. He discussed the developmental trends of the children’s vowel production in terms of being ‘adult-like’ versus ‘non-adult-like’. The vowels with an accuracy of production (in reaching adult-like targets) above one standard deviation in individual sessions were labelled as ‘easy’, while the ones below one standard deviation were labelled as ‘difficult’ for the children. His conclusions were mainly based on narrow phonetic transcriptions.

Matthews found a broad range of segmental variability in the vowel production of the SSE children. It ranged from a substantial variation in quality (such as nasalisation, rhoticity or rounding) to approximant-like and consonantal realisations. Regarding the close central rounded /ʉ/, he found that this vowel is a ‘difficult’ type, responsible for a wide range of ‘non-adult-like’ realisations in child speech. The ‘difficulty’ in acquisition of SSE /ʉ/ was surprising, since the vowel belongs to the traditional set of the ‘corner vowels’ /a,i,u/, often considered in the literature to be acquired first, according to the “law of irreversible solidarity” formulated by Jakobson (1941). Thus, Matthews argues that “the primacy of acquisition of the corner vowels [..] does not necessarily hold true for the case of” SSE (Matthews, 2002, p. 268). On the one hand, we agree with Matthews’ criticism that Jakobson’s “law of irreversible solidarity” has proven to be an overgeneralisation of some tendencies in child speech (see e.g. the discussion in Menn & Stoel-Gammon, 1995). On the other hand, Matthews’ criticism is not necessarily empirically substantiated in his dataset.
Jakobson’s analysis of the corner vowels handles the acquisition of vowel contrasts in terms of the order of ‘emergence’ (Jakobson, 1941), rather than in terms of the relative degree of adult-likeness, as measured by Matthews. Thus, it is also possible that the SSE /ʉ/ emerges early, while retaining a broad range of non-adult-like realisations for some time.

The substantial range of variability in the child realisations of the SSE adult target /ʉ/, reported by Matthews, is an important finding. In Table 2-7, we summarise the most common substitutions for /ʉ/ in his dataset for the latest longitudinal sessions (age 29 to 36 months). From this list, we excluded idiosyncratic realisations with a frequency of 1, and 12 cases with rhoticity attributed to productions of one child.

Table 2-7 Most frequent ‘non-adult-like’ substitutes for SSE target /ʉ/ in child speech (adapted from Matthews, 2002).
Substitute   Nr tokens   %
o            13          23
[]           11          20
[]           9           16
[]           7           13
u            5           9
ø            3           5
[]           3           5
[]           3           5
[]           2           4

As shown in the table, 20% of the non-adult realisations of [ʉ] can be attributed to changes in lip rounding (i.e. unrounding). 29% of the cases can be attributed to changes in both rounding and quality (lowering, and/or backing or fronting). In fact, similarly to Stoel-Gammon and Herrington’s (1990) finding for phonologically disordered speech, the most frequent ‘non-adult-like’ realisation in the SSE child data for /ʉ/ is [o]. The total percentage of non-adult-like realisations varied from child to child. For example, Esther (2;9) makes only 28% of errors, while Ben (2;8) makes 75%. Also, interestingly, a number of the realisations involve the lax vowel [ʊ], which does not feature as a phoneme in the phonological system of adult SSE speakers. Of course, the presence of this lax realisation of /ʉ/ and its broad phonetic range could simply be due to speech immaturity.
However, in addition to that, the relatively late acquisition of the SSE adult-like quality of /ʉ/ could also be attributed to cross-varietal influences on SSE from other British English varieties. Most children in Edinburgh are exposed to non-SSE varieties indirectly through TV, but additionally one quarter of the middle class children grow up in families with at least one parent from a non-SSE British English background (Scobbie et al., 1999a). Children attending local nurseries are regularly exposed to different varieties of British English from either staff members or peers (see also Section 3.2.1 for discussion). The SSE /ʉ/, and SSBE /u/ and /ʊ/ are tightly clustered in the vowel space (see Figure 2-2), and yet they are cross-varietally distinct. The extensive exposure of children to the tightly clustered phonetic variants of SSE and non-SSE varieties such as [ʉ] versus [u] and [ʊ] in the same lexical items may explain the difficulty of acquiring this particular SSE vowel as well as its non-corner location.

Acquisition of the vowels /i/ and /ɪ/ in Matthews’ dataset was not treated in terms of a special opposition, since he focused on the ranges and processes for all SSE vowels. However, from his data we can derive that, as in American English, the SSE tense vowel /i/ is acquired early, since it belongs to the ‘easy’ category (i.e. all children produced most targets [i] with adult-like accuracy). Among the rare cases of substitutions, in the age group of 29 to 36 months, the most common realisation for [i] (56% of all substitutions) was the lax vowel [ɪ], indicating that the tense/lax opposition may also cause some difficulty in monolingual SSE acquisition. On the other hand, the lax /ɪ/ seemed to belong to the ‘difficult’ category, since the number of ‘non-adult-like’ realisations was relatively high in all sessions.
In the older age group (29 to 36 months), the substitutions (see Table 2-8) mainly involved lowering in vowel quality rather than raising (in at least 80% of cases), and there was only one case of substitution by [i]. This is not surprising, since /ɪ/ in the adult ‘Scots continuum’ varies substantially along the degree of aperture (increasing F1), ranging from [] to a more open [] quality (Wells, 1982, v.2 p.404). Walker’s (1992) data show that this is also true for middle class SSE speakers from Edinburgh. The relative difficulty of acquisition of SSE /ɪ/ (as compared to /i/) parallels the acquisition pattern reported by Otomo and Stoel-Gammon (1992) for American English.

Table 2-8 Most frequent ‘non-adult-like’ substitutes for SSE target [ɪ] (adapted from Matthews, 2002)
Substitute    Nr tokens    %
[]            24           46
[]            12           24
e             4            8
[]            4            8
o             2            2
y             1            2
[]            1            2
ou            2            2
i             1            2

To summarise, Matthews’ (2002) data on the acquisition of SSE vowels confirm findings for American English on the relative difficulty of acquiring the lax vowel /ɪ/ as compared to /i/. In SSE, there is a limited number of substitutions of /i/ by the lax [ɪ] in the course of monolingual acquisition (even at the age of 29 to 36 months), indicating a certain tension in acquiring this specific contrast. By contrast, substitutions of /ɪ/ mainly involve lowering. In comparison to American English, the acquisition of SSE /ʉ/ is not straightforward and easy, since at the age of 18 to 36 months SSE child speech production exhibits a broad range of variation with a substantial proportion of ‘non-adult-like’ realisations, indicating relative difficulty in acquiring this vowel.

2.2.1.2 Russian

As discussed in Section 2.1.3, there is a substantial difference between the number of segmental oppositions involved in Russian and SSE phonology.
The Russian phonemic vowel inventory is substantially expanded at the phonetic level due to the presence of the palatalised – non-palatalised opposition in consonants, and a more complicated (compared to English) system of vowel reduction. This section discusses previous findings on these phonological issues in monolingual acquisition of the Russian sound system.

Shvachkin’s (1948/1948) investigation of the development of phonemic perception in Russian gave an important impetus to the study of other languages in terms of the order of acquisition of phonological contrasts. The suggested patterns of phonological acquisition in his study replicate to some extent the views on the emergence of contrasts in terms of ‘the laws of irreversible solidarity’ in Jakobson’s Kindersprache (Jakobson, 1941). Shvachkin’s longitudinal analysis of child speech (n=18, aged from 0;10 to 2;0) suggested a common pattern of emergence of phonological oppositions. The ‘discrimination of vowels’ (1948, p.123) is the first stage of phonological development. At this stage, // is initially discriminated from all non-// phonemes; subsequently there emerges an opposition of /i/ – /u/, // – //, /i/ – //, /u/ – //. The first stage is followed by the stage of ‘discrimination of presence of consonants’, and the further stages involve discrimination between different consonants. Even a brief comparison reveals that there are differences between the universal orders postulated by Jakobson and Shvachkin. In Jakobson’s analysis (Jakobson, 1941), the acquisition of the low vowel is followed by the emergence of the first consonantal opposition between nasal and oral stops. However, the two analyses suggest that both vowels /i/ and /u/ emerge early in the process of Russian child speech acquisition.
In parallel to findings for English (Menn & Stoel-Gammon, 1995), a recent study of phonological development in Russian conducted by Zharkova (2002, p.48) (n=7, age 1;3 to 3;2) also calls into question the universal order of acquisition suggested by Jakobson. Despite some common tendencies (e.g. all children first acquired //, and // was last), Zharkova reports quite variable orders of emergence of phonological oppositions in her subjects’ speech (e.g. // – /u/ – // or // – // – /i/). Table 2-9 summarises the frequencies of vowel phonemes in the speech of five subjects from Zharkova’s study. It is clear that the order of acquisition is probabilistic rather than deterministic, as the frequencies of occurrence of the 3rd to 5th most frequent phonemes differ across the five children. Despite these individual differences, we can assume that in monolingual Russian development all six vowel phonemes (in Table 2-9) will have emerged and become established to a certain degree by the age of 3;0.

It is established that palatalisation is perceptually more salient for Russian speakers than the voicing/voicelessness distinction or even place of articulation changes (labial versus dental) (Kavitskaya, 2002). The phonemic opposition of palatalised – non-palatalised consonants in Russian profoundly influences the quality of the following vowels, and this consonantal palatalisation is acquired early in phonological development (Jakobson, 1941; Shvachkin, 1948; Tsejtlin, 2002). It is also known that one of the common patterns of Russian child speech production in the first three years of life involves an extensive substitution of non-palatalised consonants by their palatalised counterparts (Jakobson, 1941; Zharkova, 2002).

Table 2-9 Frequency of vowel phonemes in 5 subjects (the higher the row, the more frequent the sound in the table) (adapted from Zharkova, 2002).
Subject Nr (age):    1 (1;3)    2 (1;9)    3 (2;0)    4 (2;0)    5 (3;0)
[Table body: vowel phonemes ranked by frequency for each subject; of the cell contents only the symbols i and u are legible.]

This process of palatalisation ought to affect the quality of the following vowel: e.g. the back vowel [u] should become more fronted after palatalised consonants. This fronting due to speech immaturity in Russian could be confused with language interaction from Scottish English, where /ʉ/ is phonetically central (or front). Palatalisation of the preceding consonant might be an additional cue to language identification in such cases. With regard to this influence of palatalisation on the vowel quality in child speech, Zharkova provided a limited analysis based on formant measurements for //, and she carried out a more comprehensive analysis of phonemic substitutions encountered in her child subjects. Formant analysis showed that as children get older, the segmental differences between vowels following the palatalised consonants and those following the non-palatalised ones become more pronounced: i.e. the vowel representing phoneme // following the non-palatalised consonants becomes more open and fronted. Her analysis of phonemic substitutions confirmed that in the age groups concerned, the substitution of non-palatalised consonants by palatalised ones is a frequent feature of Russian child speech.

To summarise, based on previous research we can anticipate that by the age of 3;0 Russian monolingual children acquire the system of phonological oppositions involving the vowels in focus in this study. There should be no difficulty in producing an adult-like [i] in stressed syllables. For [u], a certain amount of phonetically palatalised realisations of the preceding non-palatalised consonants can affect the vowel quality, i.e. it can become more fronted. Thus, in child speech the Russian back /u/ can become less different in formant structure from the Scottish vowel [ʉ].
The variability in the order of segmental acquisition in both languages (MSR and SSE) discussed in this section is in line with Vihman’s (2002) view on the emergence of phonology in child language. According to her view, the initial phonological system is not directly constructed in the child’s language in terms of segments, phonological contrasts or distinctive features, but is rather based on explicit lexical ‘item learning’ in the second year of life. This initial speech production base is derived from the implicit learning and motor practice (‘Vocal Motor Schemes’) (Vihman, 2002) in the first year of life, and from prosodic and segmental language patterns available in the input language. The accumulation of vocabulary in the second year of life provides a further base to induce regularities from the available set, and, thus, helps to form characteristic production patterns (‘word templates’). Items learned are selected and restructured (if they don’t fit) into these ‘word templates’. Since ‘item learning’ is incidental to some degree (in that it also depends on the input), it predicts both variation in the individual paths of development, and similarities in the paths of development of a specific language.

2.2.2 Bilingual Acquisition

The question of language interaction in bilingual acquisition has only recently started to receive some attention from researchers, but studies on language interaction in phonological development are scant. There are also no studies on early bilingual phonological development of vowel systems dealing either with Russian or Scottish English.
As we discussed in Chapter 1, most of the studies on phonological development in early bilinguals addressed the question of ‘one versus two systems’ (Schnitzer & Krasinski, 1994; Schnitzer & Krasinski, 1996; Johnson & Lancaster, 1998; Deuchar & Quay, 2000; Keshavarz & Ingram, 2002), and they focussed on speech sounds in terms of their inventories, rather than structural contrasts between them. Table 2-10 summarises the set-up of five studies (Schnitzer & Krasinski, 1994; Schnitzer & Krasinski, 1996; Johnson & Lancaster, 1998; Deuchar & Quay, 2000; Keshavarz & Ingram, 2002) that addressed early bilingual phonological acquisition of vowel systems. These studies employ similar methodologies in that they are based on children exposed to the two languages from birth; all analyses are single case studies; phonological analyses are drawn from diary records or auditory phonetic analysis. All these studies found that the two vowel systems were differentiated in bilingual child speech. However, the question of ‘one or two systems from the start’ seems to yield conflicting results and interpretations.

For example, regarding the question of ‘one versus two’ phonological systems, the two case studies of Schnitzer and Krasinski (1994; 1996) give different outcomes. In the two consecutive studies Schnitzer and Krasinski addressed segmental aspects of phonological development of their two children: Fernando (age 1;1 to 3;9) and Zevio (age 1;6 to 4;6). The children’s father was a native speaker of American English, and their mother spoke Puerto Rican Spanish. The children grew up in Puerto Rico. The authors found no evidence in either case for an initial ‘single system’ in the acquisition of vowels, while for consonants one child seemed to have an initial single system with later differentiation, and the other subject differentiated between the two consonantal systems from the outset of speech production.
The authors explain this discrepancy by individual differences in ‘avoidance’ strategies employed by the children: i.e. Zevio avoided target words that he could not pronounce (thus ultimately he produced more target-like forms), while Fernando attempted them at earlier, more immature stages. These two papers present little evidence for language interaction.

Similarly, Deuchar & Quay (2000) considered the question of one versus two systems and the role of input in bilingual language acquisition. Their case study is an exception in that it simultaneously addressed several domains of language acquisition: i.e. lexicon, syntax and phonology. The subject M (aged 0;10 to 2;3) was raised following the ‘one parent – one language’ approach until around age 1;0. Her mother spoke British English and the father Cuban Spanish. The family lived in England. After the age of 1;0, Spanish became their home language while English was spoken outside the home. Based on the input until the age of 2;0, the girl was exposed to more English than to Spanish.

Table 2-10 A summary of five studies that dealt with bilingual phonological acquisition of vowel inventories.
S&K,94 S&K,96 1 N J&L1998 1 K&I,2002 1 Age 1;1 to 3;9 1;6 to 4;6 1;2 to 1;11 0;8 to 1;8 Mother speaks Puerto Rican Spanish Puerto Rican Spanish Father speaks American English American English Bokmål Norwegian Farsi Farsi & Canadian American English English Puerto Rico Residence Upbringing situation Puerto Rico Canada UK -- Iran two languages from birth 1 0;10 to 2;3 British English (SSBE) & Cuban Spanish Cuban Spanish UK diary records, audio recordings, auditory phonetic analysis Method Questions single initial system versus two systems from the outset? Sizes of compared vowel inventories Vowel Systems differentiated? Language Interaction Observed? One or two phonological systems initially?
D&Q2000 1 small versus large 2 large small versus large Yes Yes yes Yes Yes Yes two (vowels) single (consonants) No two (vowels) two (consonants) unclear Yes No neither is true Two Neither
Key: S&K,94 = Schnitzer & Krasinski, 1994; S&K,96 = Schnitzer & Krasinski, 1996; J&L,98 = Johnson & Lancaster, 1998; D&Q,2000 = Deuchar & Quay, 2000; K&I,2002 = Keshavarz & Ingram, 2002.

Deuchar & Quay’s (2000) method involved analysis of diary records and audio-video recordings. Among other things, the research included analysis of a broad phonetic inventory based on auditory transcription of words that entered the girl’s lexicon. Spanish features a five-vowel system, and RP twelve monophthongs and eight diphthongs. Deuchar & Quay (2000) reported that by the age of 1;10 the vowels produced by the girl reflected those of the input languages, and followed the patterns found in monolingual acquisition. There is no explicit report of either variability ranges in the vowel production or language interaction in this study. This is interesting, since the differences between Spanish and RP in terms of contrasts and oppositions in the vowel systems are similar to those between Russian and Scottish English.

As Vihman (2002) argues, the question of ‘one versus two systems from the start’, put in such a mutually exclusive way, may just be the wrong one to ask, since other plausible questions can be asked as well. Vihman proposes a hypothesis of bilingual phonological acquisition in which a bilingual child in the pre-linguistic period implicitly develops considerable distributional knowledge about the languages in his environment, while explicit lexical learning at the start of speech production allows the child to induce and build phonological knowledge about the two languages in contact. Thus, in fact there may be no question of ‘one or two phonological systems’ at all when a child starts to speak, but the systems can be constructed as phonological rules are gradually induced as the lexicon grows.
In fact, Deuchar & Quay (2000), who tried to address the question of ‘one or two phonological systems’, have come to a similar conclusion: rather than having ‘one or two systems from the start’, their subject’s bilingual acquisition could be seen as “a progression from a lack of system in either languages” to the establishment of a vowel (and VOT) system in English and Spanish. In Deuchar & Quay’s view (2000, p.34) the vowels acquired by their bilingual subject “reflect those in the input languages”, and looking at phonetic inventories does not reveal much “about the nature of the system in terms of contrasts and oppositions”. Since induced phonological rules may be incidental in the sense that they at least depend on the lexicon acquired, the input a child is exposed to (within and across languages) is very important. Thus, in looking at the sources of language interaction in the phonological system we should at least consider the question of input conditions (such as different language exposure patterns) in addition to structural issues of the languages in contact.

Guion’s (2003) study of adult bilinguals is relevant to this discussion because it provides evidence that systemic crosslinguistic differences, such as relative crowdedness of vowel space (absence/presence of certain vowel contrasts in the languages in contact), can cause difficulties in the process of bilingual acquisition. Guion (2003) argues that the success of bilingual acquisition depends on the age of onset of language learning. In this study Quichua-Spanish adult bilinguals (n=20) were compared to Spanish monolinguals. The Quichua-Spanish bilinguals all acquired Quichua from birth, but differed in the age of acquisition onset for Spanish (from birth to 38;9). Quichua features only three monophthong vowels (rather lax), while Ecuadorian Spanish features five tense vowels. In the acoustic vowel space the Quichua vowels are less dispersed than the Spanish ones.
Guion performed formant analysis of vowels with subsequent normalisation for vocal tract length differences. There were three groups of ‘tightly packed’ vowels with a potential structural conflict in the vowel space: (1) Quichua // versus Spanish /i/ and /e/; (2) Quichua // versus Spanish /u/ and /o/; (3) Quichua /a/ versus Spanish /a/ (F1 of the Spanish /a/ is phonetically lower than the Quichua one). The results indicated that age of onset of acquisition is an important factor in approaching native-like speech production in the two languages. All the simultaneous and most of the early bilinguals (onset between 5;0 and 7;0) distinguished between Spanish and Quichua vowels in speech production. The late bilinguals transferred their native lax Quichua vowel quality into the tense Spanish system. In addition, the simultaneous bilinguals were able to acquire ‘more tightly packed’ vowels, while the early and late bilinguals were able to acquire new vowels, but were less successful in “partitioning the vowel space in the same fine-grained way” (Guion, 2003, p.121) as simultaneous bilinguals. Since such systemic differences are difficult for adult bilinguals to acquire, they may also cause difficulties in the course of early bilingual acquisition, before the two systems grow into an adult product.

Keshavarz & Ingram (2002) addressed the question of ‘one versus two systems’, but they also reported some variation in the vowel productions of their bilingual subject. Some phonetic variants showed signs of language interaction. The subject, Arsham (n=1, age 0;8 to 1;10), was brought up according to the ‘one parent – one language’ principle (the father spoke American English, while the mother was a Farsi speaker). The amount of input in the two languages changed as the family moved around: i.e. from 8 to 14 months Arsham received more input in Farsi, and from 15 to 24 months more input in English.
Farsi has a six-vowel system, while American English features thirteen vowel monophthongs in addition to three diphthongs. Keshavarz & Ingram annotated the audio recordings with broad phonetic transcriptions. The results showed that Arsham’s development of syllable structure was consistent with the syllabic structure of the target languages. The majority of English words were monosyllabic, and the majority of Farsi words were polysyllabic. Stress patterns in both languages were respected: i.e. Farsi has fixed stress on the ultimate syllable, while English has a variable word stress location. Vowel and consonant inventories of the child were mainly language-specific. Thus, Keshavarz & Ingram (2002) concluded that Arsham developed two separate phonologies. However, the authors also reported (2002, p.265) that the child produced a limited number of vowels, which at least suggested a possibility of transfer from English into Farsi: i.e. Arsham produced [] for /u/, [] for /o/ and [] for // in some target Farsi words. Keshavarz & Ingram (2002, p.265) suggested that this transfer could be viewed as “a sign of shifting dominance” in Arsham’s language acquisition process.

Interestingly, the direction of transfer in Arsham’s case contradicts the unidirectional ‘markedness’ hypothesis proposed by Müller (1998) for simultaneous bilingual acquisition. We discussed these issues in Section 1.3.2.3.3. The transfer is directed from a more marked English system (/u/ and //, and // and //) into a less marked six-vowel Farsi system rather than the other way around as predicted by the hypothesis (Müller, 1998). Similarly, the direction of transfer in Arsham’s data contradicts the Cross-Language Cue Competition Hypothesis (Döpke, 1998; Döpke, 2000), since elements of the more ambiguous English vowel system were introduced into a less ambiguous Farsi system, and not the other way around.
Further, Kehoe’s (2002) study directly dealt with the sources of language interaction between different vowel systems of young simultaneous bilinguals. Since the segmental issues in her study also concerned vowel duration, we shall discuss it in Section 2.3.2.

To conclude, there seems to be a consensus across the majority of studies devoted to bilingual phonological acquisition of vowel systems that the two systems are acquired language-specifically by early simultaneous bilinguals. However, handling acquisition of vowels in terms of pure inventories (rather than a system of meaningful contrasts, distributions and ranges of variation) may not reveal possible language interaction effects. It is also not clear from the scant reports of language interaction what determines its direction: the structure of languages in contact or environmental factors (such as the amount of input in the two languages) or both.

2.3 Language Interaction in Bilingual Acquisition of Vowel Duration

2.3.1 Monolingual Acquisition

The process of acquisition of a phonological system involves both acquisition of segmental properties and their language-specific timing. It is known that a number of languages (e.g. Scottish English, Hungarian, Swedish or Finnish) feature contrasts such as postvocalic consonantal conditioning of vowel duration or phonological vowel length, while others (e.g. Russian, Spanish, Polish, Arabic) do not to the same extent, if at all. Studies of monolingual acquisition suggest that in languages involving such paradigmatic vowel duration conditioning (intrinsic or extrinsic) the durational contrasts are acquired relatively early.

Stoel-Gammon et al. (1995) and Buder & Stoel-Gammon (2002) addressed the issue of acquisition of vowel duration by Swedish and American English children (age 30 months, n=18). Both languages feature intrinsic vowel duration conditioning for vowels /i/ and /ɪ/ in words like “beat” and “bit”.
In English, the two vowels differ in vowel quality and duration with a lax-to-tense vowel duration ratio of .71 (House, 1961), while in Swedish the difference is substantially larger with a lax-to-tense ratio of .65. In addition, there are crosslinguistic differences in the implementation of extrinsic conditioning due to the voicing of the following consonant (in words like “beat” and “bead”): i.e. the voiceless-to-voiced ratio is .51 in American English, while in Swedish the extent of such conditioning is negligible.

An important finding in these studies (Stoel-Gammon et al., 1995; Buder & Stoel-Gammon, 2002) was that intrinsic and extrinsic vowel duration conditioning seemed to follow different paths of acquisition in the two languages. In American English, where the adult model features the tense/lax contrast for some vowels (but not for all) and their extrinsic conditioning is substantial, 30-month-old children had acquired the extrinsic vowel duration conditioning pattern but, for the intrinsic contrast, only the vowel quality differences and not the differences in duration. In contrast, the Swedish children acquired the intrinsic vowel duration, but not the vowel quality distinction. The Swedish pattern of acquisition was in line with the importance of intrinsic vowel duration conditioning in Swedish, where the length contrast is present for all vowels.

The above results for American English are also supported in Stoel-Gammon & Buder (1999). The child averages (n=20; age 2;0) for extrinsic and intrinsic patterns in their study conformed to the adult models. Stoel-Gammon & Buder also reported that both patterns varied considerably across subjects: in 35% of the children the duration of lax vowels exceeded that of the tense ones (unlike in the adult model in Figure 2-3).

Empirical data on the acquisition of SVLR in Scottish English is rather limited. Hewlett et al.
(1999) studied extrinsic vowel duration patterns in Edinburgh-born children (n=7; age 6;0 to 9;0). The children had different parental backgrounds: i.e. two children had both SSE-speaking parents, two had one SSE and one non-SSE speaking parent, and the remaining children had two non-SSE speaking parents from different English backgrounds. Regarding the cross-varietal differences in the implementation of the extrinsic vowel duration (Figure 2-3 and Table 2-5), it could be expected that children with different parental backgrounds (SSE or non-SSE) would have different extrinsic vowel duration conditioning patterns. The study concentrated on the differences for the vowels /i/ and /ʉ/ in CVC-words.

The results in Hewlett et al. (1999) showed that all the children mastered an extrinsic vowel duration conditioning pattern, and the parental dialectal background largely determined what pattern it followed. Those children with one or two SSE speaking parents acquired the Scottish pattern, whereas the children with two non-SSE English-speaking parents acquired a pattern similar to the SSBE pattern for the tense vowels shown in Figure 2-3. In addition, two subjects had a different extrinsic pattern between the vowels /i/ and /ʉ/: i.e. these children mastered the SVLR pattern for /i/, and the non-SSE pattern for /u/ and lax /ʊ/ (the SSE /ʉ/ stands for two realisations in SSBE, tense /u/ as in “food” and lax /ʊ/ as in “foot”). This suggests that additional vowel quality contrasts on top of durational differences may play a mediating role in the acquisition process of vowel duration.

Matthews (2002) carried out a limited acoustic analysis of vowel duration for three (out of the seven) children in his study. He subdivided the tokens with syllable nucleus /i/ into ‘long’ and ‘short’ categories. Individual results of the acoustic analyses per child are presented in Figure 2-6.
The ‘long’ category included /i/ followed by voiced fricatives, while ‘short’ included /i/ followed by voiceless stops.

[Figure 2-6: plot of mean duration (ms) by child_age (R_2;6, E_2;8, B_2;6) for the ‘long’ and ‘short’ categories]
Figure 2-6 Mean duration (ms) for SSE vowel /i/ as a function of the right consonantal context for three speakers in Matthews (2002).

Figure 2-6 shows that two children (R and B) acquired the SVLR distinction (though the differences were not statistically significant), since the differences between the means are in the right SVLR direction, with the ‘short’ category being shorter than the ‘long’ one. There are substantial individual differences between the subjects, which might be due to the low number of tokens in this sample (between three and five for each category). Matthews’ data suggest that Scottish children may acquire some SVLR-like pattern by the age of 2;6 to 2;8. However, we cannot fully assume this, since we do not know (given that there was no /i/ followed by voiced stops in Matthews’ data) whether the children acquired a Scottish or a non-Scottish extrinsic vowel duration pattern. Nor can we assume, given Hewlett et al.’s (1999) results, that the bilingual children in our study would acquire the SVLR pattern, given that they have two non-SSE speaking parents (Russian), and that they are exposed to all English varieties in Edinburgh including the SSE majority (see Section 3.2.2 for further discussion). The results for American English, however, support the suggestive SSE data (Matthews, 2002) that postvocalic conditioning of vowel duration can be acquired by the age of 2;0.
Thus, given that intrinsic and postvocalic vowel duration conditioning is of similar extent within General American and SSE phonological systems (even though they differ in phonetic detail), we can expect that SSE monolingual children: (1) should generally have no difficulties in distinguishing between the vowel qualities of tense and lax vowels (/i/ and /ɪ/); (2) will have acquired the SVLR patterns for /i/ and /ʉ/, and will have a differentiated pattern for the lax vowel /ɪ/ from the tense /i/; (3) will show somewhat different individual SVLR patterns from the adult SSE model with a broad range of variation in its realisation.

2.3.2 Bilingual Acquisition

With a few exceptions, the acquisition of contrasts such as intrinsic and extrinsic vowel duration conditioning has rarely been addressed in early bilingual acquisition. Kehoe (2002) addressed the question of language interaction between young bilinguals’ phonological systems. She followed up Paradis & Genesee’s (1996) proposal for the bilingual acquisition of syntax that language interaction (if any) may take the form of acceleration, delay or transfer. She examined language interaction between German and Spanish vowel systems in German-Spanish bilingual children (n=3, aged 1;0 to 3;0). Her study concentrated on the influences of the Spanish 5-vowel system (with no intrinsic vowel duration conditioning) on the German 14-vowel system (with such conditioning) in the speech of bilingual children. The bilingual children lived in Germany in families with German fathers and Spanish-speaking mothers. The families mainly followed the ‘one parent – one language’ approach. Control groups included age-matched German (n=3) and Spanish (n=3) monolinguals. The methodology involved acoustic analysis of vowel duration and auditory transcriptions of the target vowels in both languages. The results of her study for the German monosyllabic words are presented in Figure 2-7.
[Figure 2-7: bar chart of duration (ms) + 1 SD by subject (S, N, J, B, T, M); series: short ML, short BL, long ML, long BL]
Figure 2-7 Individual means of the differences in intrinsic vowel duration in German (short and long vowels) in the speech production of bilingual German-Spanish (broken bars) and monolingual German (solid bars) children.

At the age of 2;3 to 2;6 the German monolingual children (B, T and M in Figure 2-7) produced significant differences between short and long vowels, even though the difference was not as substantial as in the adult model. The bilingual children (S, N and J) did acquire the difference between short and long vowels in the right direction, but the extent of the difference was much smaller than that of the monolingual children, and was not statistically significant. For the Spanish vowels, Kehoe found no systematic differences in the acquisition pattern between bilinguals and monolinguals. Marginally, the bilingual children produced more non-Spanish vowels (6%) than the Spanish monolingual children (3%). However, Kehoe does not interpret this as evidence for transfer from German since most non-Spanish vowels fitted into the non-adult ranges of variation of the monolingual children, and were mostly non-German vowels. The results further showed that bilingual children: (1) had a delayed acquisition of the German vowel length distinction compared to the German monolingual children; (2) had no problems acquiring the Spanish 5-vowel system.

Similarly to Müller’s (1998) study for bilingual syntactic acquisition, Kehoe invoked the concept of ‘markedness’ to explain the ‘delayed’ acquisition of German vowel length by the German-Spanish bilingual children. She argued that the marked German vowel system (featuring vowel length and more vowel contrasts) was more difficult to acquire than the unmarked 5-vowel system in Spanish.
In arguing that the language interaction takes the form of delay, and that it is due to the systemic differences between German and Spanish, Kehoe (2002) did not provide the evidence that the children in her study did eventually acquire the German intrinsic vowel duration conditioning in a manner similar to German monolingual children. In fact, the bilingual patterns in Kehoe’s study seem also to be influenced by the undifferentiated Spanish system (i.e. undergo transfer). We disagree with Kehoe’s treatment of the notions ‘transfer’ and ‘delay’ and their relations inherited from Paradis & Genesee’s (1996) study (see Section 1.2.2). Both delay and transfer could potentially be viewed as an effect of language interaction, only if the delay (a developmental notion) has nothing to do with transfer (a static process). For example, two phonological systems, like the durational features in Spanish/German bilinguals’ languages in Kehoe’s (2002) study can influence each other, because the nonbase LB is not completely inhibited, or because Spanish durational structures are stored in the German patterns. This effect may seem a delay compared to monolinguals because the transfer ultimately ceases. So we need more clarification on the relations between the notions of ‘transfer’, ‘acceleration’ and ‘delay’ in the taxonomy of language interaction effects. Whitworth’s (2003) study addressed the issue of acquisition of intrinsic and extrinsic vowel duration conditioning in early and late German-English bilinguals. German and English have similar phonological systems, but differ in the phonetic detail of the implementation. Intrinsic vowel duration conditioning plays a greater role in German, while extrinsic conditioning plays a greater role in English. The early simultaneous bilinguals (n=6) in her study aged from 5;0 to 13;2, and lived in West Yorkshire (UK) in families following the ‘one parent – one language’ upbringing principle. 
The extrinsic and intrinsic vowel durations were acoustically measured and expressed as ratios. There are some problems with interpreting Whitworth’s results on vowel duration. The first problem is that in a substantial part of the study she averaged the results of a cross-section of six early bilinguals as one group, despite the fact that they were too different in age to be treated this way (5;0 to 13;2). Being different in age and environmental situations, the children naturally produced very different results. For example, Whitworth (2003, p.151) reports that on average the bilingual children produced the non-language-specific patterns for German and English lax-to-tense ratios (with the German ratio being greater than the English one) compared to the patterns of all ‘fathers’ and all ‘mothers’. On the contrary, further individual results (2003, p.157) showed that the results of six children could, in fact, be split into two groups: (1) Max (6;2), Anneliese (7;6) and Reuben (10;10), whose German lax-to-tense ratio was greater than the English one, acquired a ratio unlike the adult model; and (2) Leonore (5;0), Rieke (8;2) and Salome (13;2), whose German lax-to-tense ratio was smaller than the English one, acquired a ratio more similar to the adult model. Despite this discrepancy between group and individual results, further analyses, conclusions and discussion of the bilingual acquisition of lax-to-tense ratio are based on the averaged group results, rather than on individual results (Whitworth, 2003, p.180). Another problem is that Whitworth treats German-speaking mothers and English-speaking fathers as groups, rather than treating them as individuals and relating them to the speech production of their own children, despite the fact that averaging results of the parents makes them comparable to just any sample from the population from their dialectal area.
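The concern about group averaging can be illustrated with a minimal sketch. The ratios below are invented placeholders, not Whitworth's measurements, and the labels `child_1` to `child_6` and both function names are hypothetical. The sketch only shows that a group mean can point in one direction (German ratio greater than English) while half of the individuals show the opposite pattern.

```python
from statistics import mean

# Hypothetical lax-to-tense duration ratios (lax duration / tense duration).
# These values are invented for illustration; they are NOT Whitworth's data.
ratios = {
    "child_1": {"German": 0.80, "English": 0.60},
    "child_2": {"German": 0.75, "English": 0.65},
    "child_3": {"German": 0.85, "English": 0.60},
    "child_4": {"German": 0.50, "English": 0.70},
    "child_5": {"German": 0.55, "English": 0.65},
    "child_6": {"German": 0.60, "English": 0.75},
}


def group_means(data):
    """Average each language's ratio over all children."""
    return {
        lang: mean(child[lang] for child in data.values())
        for lang in ("German", "English")
    }


def split_by_direction(data):
    """Group children by which language has the greater ratio."""
    german_greater = [c for c, r in data.items() if r["German"] > r["English"]]
    english_greater = [c for c, r in data.items() if r["German"] < r["English"]]
    return german_greater, english_greater


means = group_means(ratios)
german_greater, english_greater = split_by_direction(ratios)

print("Group means:", means)
print("German ratio greater:", german_greater)
print("English ratio greater:", english_greater)
```

With these placeholder values the group mean for German exceeds that for English, although three of the six hypothetical children show the reverse relation, which is the kind of discrepancy between group and individual results described above.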
Results for extrinsic vowel duration conditioning (voiceless-to-voiced ratio) in Whitworth’s study showed that the children produced results intermediate between the two language values (except for the youngest child aged 5;0). Whitworth argues (2003, p.192) that the intermediate results in the production of extrinsic conditioning in the bilingual children are “affected by markedness rather than by language transfer”. In our view, the two phenomena cannot be put together in an ‘either/or’ fashion, as markedness is shaped by the relative systems of the two languages in contact and their distributional characteristics, whereas ‘transfer’ is a process resulting from the relative (in-)dependence in their mental representations. In these capacities these processes can operate on top of each other. To summarise, there is some evidence that by the age of 5;0 bilingual children acquiring two systems with very different extrinsic and intrinsic vowel duration conditioning systems may experience relative ‘difficulty’ in acquiring the structurally more complex system of the two. The difficulty in acquisition may result in apparent delay in comparison to the monolingual children (Kehoe, 2002). What is not clear is whether this delay resolves (if it exists at all independently of ‘transfer’), in what form it settles down, and what factors other than language structure and its relative markedness can influence this.

2.4 Acquisition of Vocal Effort

2.4.1 Monolingual Acquisition

In Section 2.1.2.3 we discussed the issue of association of stress and prominence with vocal effort, and showed that vocal effort is at least a result of interaction of respiratory and laryngeal levels of speech production. In acoustic output, the sound pressure level (SPL) is controlled by neuromuscular actions of the respiratory system, while the neuromuscular actions at the laryngeal level control the shape of the glottal pulse and, thus, affect the slope of the radiated spectrum in midfrequencies.
Physiologically, the speech production of children is not just a scaled-down version of the adult system. At the respiratory and laryngeal levels, maturational processes take place throughout childhood. These processes make the child’s speech production system qualitatively different from the adult’s. In the respiratory system, children are more dependent on diaphragmatic breathing, due to anatomical differences in the angle of the ribs, so that children cannot control chest volume in the same way as adults until approximately the age of seven (see literature review in Mackenzie Beck, 1997). Netsell et al. (1994) concluded from their developmental measurements of laryngeal and respiratory functions in speech production that the respiratory system of pre-school children differs from that of adults in a substantially greater employment of expiratory muscle forces as compared to the inspiratory ones. At the laryngeal level, apart from the fact that vocal fold length increases as a linear function of age (Titze, 1994; Mackenzie Beck, 1997), the vocal fold ligament is still immature at the age of 4;0. It is thinner and does not have the same layered morphology as that of adults, and its maturation continues into adolescence (Titze, 1994). The thyroarytenoid muscle continues to develop throughout childhood. Despite these (and other) non-linear physiological differences, children aged between two and six are able to speak as loudly as adults. One reason for that is the correlation of f0 and intensity: i.e. an octave increase in f0 corresponds to an 8-9 dB increase in intensity; and children have higher f0 (Titze & Sundberg, 1992; Titze, 1994). Children seem to be working harder to achieve loudness like that of adults, by achieving higher lung pressures and longer volume excursions than adults, and by breathing more frequently (Strathopoulos & Sapienza, 1993).
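The f0-intensity relation mentioned above can be made concrete with a small worked sketch. It assumes, purely for illustration, that the 8-9 dB per octave figure can be applied proportionally to the number of octaves separating two f0 values; the function name and the f0 values are hypothetical and are not taken from the cited studies.

```python
import math


def intensity_gain_db(f0_low_hz, f0_high_hz, db_per_octave):
    """Intensity difference implied by an f0 difference.

    Assumption (not from the source): the dB-per-octave rate is applied
    proportionally to the f0 distance expressed in octaves.
    """
    octaves = math.log2(f0_high_hz / f0_low_hz)
    return octaves * db_per_octave


# Hypothetical f0 values (Hz), chosen only so that the child's f0
# lies exactly one octave above the adult's.
adult_f0 = 120.0
child_f0 = 240.0

low = intensity_gain_db(adult_f0, child_f0, 8.0)
high = intensity_gain_db(adult_f0, child_f0, 9.0)
print(f"Implied intensity difference: {low:.1f} to {high:.1f} dB")
```

For the one-octave case the sketch simply reproduces the 8-9 dB range quoted in the text; for other f0 distances the proportional scaling is an assumption rather than a reported finding.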
With regard to control of vocal effort, Strathopoulos & Sapienza (1993) simultaneously measured aerodynamic, acoustic and kinematic correlates of vocal intensity. They found that when 4-year-old children (n=20) were asked to adjust phonatory loudness from soft to comfortable, and then to loud voice, their acoustic speech output was similar to that of 8-year-olds and adults in many ways, while kinematic correlates (lung volume, rib cage and abdominal displacement) differed quantitatively and functionally. However, their respiratory and laryngeal adjustments still resulted in an increase of the sound pressure level, like those of adults (see Figure 2-8). The overall higher levels of SPL in children in Figure 2-8 could result from smaller vocal tract size (the same force applied over a small area results in higher tracheal pressure than that applied to larger areas). Importantly for this study, Strathopoulos & Sapienza (1993) showed that when performing the same adjustments in phonatory loudness, 4-year-old children control the rate at which the vocal folds close. This aerodynamic measure is called ‘maximum flow declination rate’ (MFDR), and it is known to affect the spectral intensities in midfrequencies of the radiated spectrum (Gauffin & Sundberg, 1989). The children’s control of the vocal fold closure rate produced results similar to those of the adults (see Figure 2-9): i.e. they increased MFDR with the increase of phonatory loudness. The mean difference in MFDR between boys and girls was small. Strathopoulos (1995) built on the data from Strathopoulos & Sapienza (1993), addressing the issue of age-related variability in acoustic, aerodynamic, and respiratory kinematic measurements. Even though Strathopoulos (1995) generally found that children were not consistently more variable than adults (not for all measurements), she did find that 4-year-olds produced several parameters significantly more variably than adults.
These parameters included SPL and MFDR, which are closely related to the overall intensity and spectral balance in this study.

[Figure 2-8: chart; y-axis: Sound Pressure Level (dB); x-axis: phonatory loudness level (soft, comfortable, loud); series: 4 year old, 8 year old, adult]

Figure 2-8 Output sound pressure levels (dB) in female 4-, 8-year-olds and adults, when they are asked to adjust phonatory loudness for syllable trains /p/ (adopted from Strathopoulos & Sapienza, 1993)

[Figure 2-9: chart; y-axis: Maximum Flow Declination Rate (L/s/s); x-axis: phonatory loudness level (soft, comfortable, loud); series: 4 year old, 8 year old, adult]

Figure 2-9 Maximum flow declination rate (L/s/s) in female 4-, 8-year-olds and adults, when they are asked to adjust phonatory loudness for syllable trains /p/ (adopted from Strathopoulos & Sapienza, 1993)

Traunmüller & Eriksson (2000) compared the relative contribution of acoustic parameters such as sound pressure level, spectral emphasis, f0, F1, F3, duration and pausing to vocal effort. The subjects included adults (n=20) and 7-year-old children (n=8) of both sexes. While the results showed that acoustic output in terms of ‘vocal effort’ is a result of a “synergetic process that involves an increase of subglottal pressure, increase of vocal fold tension, and increase of the openness of the vocal tract”, the best single predictor of vocal effort (when the distance from the microphone is not fixed) was spectral emphasis. This parameter was not affected by speaker’s age and sex. We are not aware of any studies measuring spectral balance as an acoustic correlate of stress or prominence in child speech. However, the above findings confirm the fact that despite the qualitative differences in respiratory and laryngeal control children are able to perform the same linguistic tasks connected to loudness as adults.
Thus, if the additional vocal effort in SSE to mark prominence in short SVLR vowels is linguistically relevant, we could expect that monolingual 4-year-old children acquiring SSE should learn this behaviour, while showing a considerable degree of variability compared to adults (Strathopoulos, 1995).

2.4.2 Bilingual Acquisition

We are not aware of any studies dealing with the acquisition of vocal effort in bilingual children in acoustic, aerodynamic or kinematic terms. However, given the discussion on monolingual acquisition, we can hypothesise that if the vocal effort adjustments are relevant features of Russian and Scottish English sound structure systems, bilingual children should acquire the crosslinguistic differences in vocal effort along with other segmental and suprasegmental properties.

2.5 Summary and Research Questions

In this chapter we explained the crosslinguistic differences between Scottish English and Russian that we would like to consider for this study. The research variables in the crosslinguistic perspective constitute a representative set of structural differences, which are frequent in phonology. The summary of the total of eight research variables in this study is provided in Table 2-11. The table shows the speech production level in assessing each of the vowels sets and the crosslinguistic difference involved. We also include a cross-reference to sections in which we discuss these variables either in terms of language description, or issues of monolingual and bilingual acquisition.

Table 2-11 Summary of the total of 8 research variables for three levels of speech production, vowel sets, crosslinguistic differences and a cross-reference to Section numbers containing discussion for these variables.
Speech production level | Vowels per Language: MSR | Vowels per Language: SSE | Crosslinguistic difference (MSR/SSE) | Discussed in Sections
Vowel Quality | /i/ | /i/ versus // | systemic (lack/presence of contrast) | 2.1.3.3 / 2.2.1 / 2.2.2
Vowel Quality | /u/ | // | realisational (back/central) | 2.1.3.3 / 2.2.1 / 2.2.2
Postvocalic conditioning of Vowel duration | /i/ | /i/ | systemic (minimal/SVLR) | 2.1.4 / 2.3.1 / 2.4.2
Postvocalic conditioning of Vowel duration | - | // | systemic (none/invariably short) | 2.1.4
Postvocalic conditioning of Vowel duration | /u/ | // | systemic (minimal/SVLR) | 2.1.4 / 2.3.1 / 2.4.2
Effect of interaction of vowel duration and prominence on Vocal Effort | /i/ | /i/ | systemic (unsystematic/systematic) | 2.1.4 / 2.4.1 / 2.4.2
Effect of interaction of vowel duration and prominence on Vocal Effort | - | /i/ versus // | systemic (none/differentiated) | 2.1.2.5
Effect of interaction of vowel duration and prominence on Vocal Effort | /u/ | // | systemic (unsystematic/systematic) | 2.1.4 / 2.4.1 / 2.4.2

Thus, given the set of research variables (see Table 2-11) and environmental differences in the language input to our Scottish English – Russian subjects (n=2, aged 3;4 – 4;8), we will address the following questions:
(1) Do bilingual children have differentiated control of their two languages?
(2) Is their SSE ‘native-like’ compared to the monolingual SSE-peers and SSE adults? Is it SSE (or some other ambient language variety) that they acquire?
(3) Is their MSR ‘native-like’ compared to MSR-speaking adults (including mothers)?
(4) Is there any language interaction? If any, what are the patterns?

Additionally we would like to provide data for the monolingual acquisition patterns in the SSE monolingual peers (n=7), for all of the research variables, since there are only limited accounts on their acquisition in this age group. Results from each of the research questions will be analysed in the longitudinal developmental perspective for each subject as well as with regard to the confounding effects of the bilingual’s language input and crosslinguistic structural differences. Specifically we will address the issues of contributions of structural, environmental and longitudinal aspects of language input to language differentiation and possible interaction.
Given the substantial number of sound structure variables and different levels of speech production involved in this study, any observed language interaction patterns should allow us to judge their systematicity and direction in a quite reliable way. The analysis should further help us to explain the observed patterns of bilingual language differentiation and interaction in the light of current views in the area of bilingual language acquisition studies (discussed in Chapter 1) and from the point of view of the need for a unified/separate model of phonological acquisition for bilingual and monolingual acquisition.

3 Methodology

3.1 Introduction

This chapter justifies the methodological choices made in this study. It accounts for the selection of subjects and controls, the choice of materials, and the procedures for recording and analysing the data. As discussed in Chapter 2, this study aims to measure the extent of bilingual subjects’ differentiation of their two languages, and to identify possible language interaction patterns for cross-linguistically different aspects of vowel quality, vowel duration and vocal effort. One way to establish whether a child ‘differentiates’ between the two languages is to find out the extent of the child’s language proximity to the language input they receive, together with identifying the extent of their speech immaturity and language interaction between the two languages. By referring to the ‘extent’ of this linguistic knowledge, we mean to emphasise the continuous and gradient nature of speech production in general. In order to make categorical inferences concerning such non-categorical data, we need to create a representative control framework.
To quantify the extent of a bilingual’s language command, we can:
- minimise pragmatic code-switching in the speech production of the children (Grosjean, 2001), in order to maximise language separation between the bilingual child’s two languages, and, thus, to reduce child speech variability (further reasons are discussed in Chapter 1);
- control for the linguistic input from the direct or a closely matching sociolinguistic environment of the child;
- apply the same methodology to all control groups and subjects, rather than rely on reports in the literature that inevitably differ in methodology;
- use a sufficient number of repetitions of the carrier words to catch a representative sample of intra-subject variation and to be able to perform statistical analyses for the collected data.

The sociolinguistic background of the bilingual subjects is discussed in Section 3.2.2. The control groups in this study differed depending on the language mode (Grosjean, 2001) analysed, and they were defined by the typical individual and social networks of the subjects. In the SSE monolingual language mode, the speech production of the subjects was compared to that of typically developing SSE monolingual children (n=7), to SSE adults (n=5), and SSBE adults (n=4). In the Russian monolingual language mode, the bilinguals’ speech was compared to the speech of their Russian mothers (n=2), and to that of other adult MSR speakers (n=3). Since Russian monolingual children are not part of the social environment of our subjects, we did not gather Russian monolingual child data. However, for the Russian developmental patterns we refer to the available literature in subsequent chapters. The sociolinguistic background of the control groups is discussed in Section 3.3. It is known that when eliciting any structured data from pre-school children, researchers face qualitatively different methodological problems than when working with adults.
Difficulties arise due to specific aspects of the cognitive and social development of children, such as, for example: their attention span, their sensitivity to strangers, the ‘observer effect’ (Crystal, 1997), or the (in-)ability to perform certain tasks at different ages. Consequently, researchers cannot impose the same stringent conditions on the experimental set-up as for adults. However, when tailoring data elicitation techniques, an optimal trade-off should be made between the child’s abilities and the feasibility of subsequent acoustic and statistical analyses. This is discussed in Section 3.5. A major problem concerns the instrumental measurement of child speech production, such as, for example, the difficulties in estimating formant frequencies due to the typically high fundamental frequency in child speech (Kent & Read, 2002). In physiological terms, development also means an increase in vocal tract length. This in turn causes an age-specific decrease in formant frequencies. Such developmental changes in acoustic measurements require inter- and intra-speaker normalisation. The acoustic analysis procedures, normalisation and data validation issues are presented in Section 3.6.

3.2 Subjects

3.2.1 Common Linguistic and Environmental Background

This study investigates the speech production patterns of two bilingual Scottish English/Russian children, BS and AN³, aged 3;4 to 4;5. Both subjects are girls. The girls live in quite a similar linguistic and social environment. Both have grown up in Russian-speaking families, in which the parents are native speakers of Russian, and the children have been acquiring Scottish English in the community. Both girls are firstborn children.

³ All initials used throughout the study have been changed to maintain anonymity.
The socioeconomic background of the families falls into a Middle Class (MC) classification based on the parental occupational background (AN falls in Group 1, and BS⁴ in Group 2 based on SOC 2000 NS-SEC standard) (Bilton et al., 2002). At the start of the recordings, the two families lived in the centre of Edinburgh (Scotland). The girls attended the same nursery close to their homes. The manager of the nursery (in personal communication) described the socioeconomic and language background of the nursery staff and children who were in daily contact with AN and BS as follows. The absolute majority of the children had both parents employed in ‘white collar’ MC jobs in Edinburgh. One child was from a working class family; two children were from upper-middle class background. Out of a total of 14 staff members, seven were born and bred in Scotland. One was born and bred in England. Furthermore, there were two Irish, and four bilingual staff members (2 English/Urdu, 2 Spanish/English). A total of 32 children were in daily contact with the girls. According to the nursery manager (an SSE speaker herself), only four of them had a clear SSE accent, while most children had a less clear mixture of SSE and near-RP accents. There was another Scottish/Russian bilingual boy attending the nursery, as well as a couple of Scottish/Spanish bilingual children. The subjects had regular contact with each other. When BS attended the nursery, the two girls often played together, and they were reported to speak English to each other while playing. The Russian community in Edinburgh lacks any institutional and social networks, such as schools or nurseries. Thus, all contacts between Russian-speaking individuals are established on their personal initiative. The two families had regular contacts with at least three to four other Russian-speaking families living in Scotland⁵. Some of those families also had children.
An officer in the Russian consulate in Edinburgh (in personal communication) was not able to provide exact information on the number of Russian native speakers living in Lothian, since not all the Russian citizens in Scotland register with the Russian consular services, and not all native Russian speakers are citizens of Russia. However, it was noted that by June 2003, the number of such registrations was about 140, and they estimated the real number of the residents in Lothian to be at least two or even three times greater. The University of Edinburgh and local IT-companies attract Russian researchers and IT-specialists. This is also the background of our subjects’ parents: all of them have university degrees, and three of them are employed in Scotland in ‘white collar’ jobs.

⁴ Group 1 includes managers and senior officials; Group 2 includes professional occupations.
⁵ All information about the families was gathered in a language background questionnaire that was filled in by the parents upon the completion of recordings.

Another common denominator in the environment of our subjects is the language varieties to which they are exposed in Edinburgh: i.e. the SSE continuum ranging from Scots to SSE (Aitken, 1981), other English varieties, and other languages. The heterogeneous sociolinguistic background of the nursery staff and children reflects the situation in Edinburgh. It is known that the phonological and phonetic range of SSE in Edinburgh sometimes reaches the near-RP side of the continuum in the speech of Scottish MC speakers (Scobbie et al., 1999a). According to Scotland's Census 2001 for Edinburgh, the Middle Class population (groups 1 and 2, SOC 2000 NS-SEC standard) constitutes 34.09% of the total city population (n=453,430). 12.14% of Edinburgh residents were born in England; this is the biggest population group after Scottish-born residents (77.1%).
The percentage of residents born in England has increased by roughly one third as compared to the 1991 Census, and the percentage is greater than in the rest of Scotland (8.08%) or Glasgow (4.24%). Besides, Scobbie et al. (1999a) report that one fourth of children (23%) born in Edinburgh in MC families have at least one English parent. It is well established in the literature that parental language or dialectal background influences children’s speech. For Edinburgh specifically, Hewlett et al. (1999) found that children (aged 6 to 9) with two non-Scottish British parents implement extrinsic vowel duration conditioning differently from their peers with two Scottish parents or one Scottish parent: i.e. they exhibit more influences from the voicing effect featured in non-SSE English varieties. This raises a question relevant to the sociolinguistic background of our bilingual subjects. Both girls are growing up in Russian-speaking families. The families are not preoccupied with correcting their English. The only choice the parents make in this respect is deciding what nursery or school the child should attend. Given that parental choice, the girls have to make sense of the English varieties spoken in their environment themselves. Given that one fourth of the children in their environment may have some SSBE influences in their English, we cannot exclude the possibility of finding such influences in the speech of the subjects. Therefore, we decided to include adult SSBE speakers, and we also included one monolingual child with a mixed Scottish/English parental background as a control. The control groups are discussed in detail in Section 3.3 of this chapter.

3.2.2 Differences in Linguistic and Environmental Background

3.2.2.1 Subject BS

BS's Russian mother was born and grew up in a suburb of Moscow, Russia. BS's father was born in Ukraine in a Russian-speaking family. He is a Russian/Ukrainian bilingual.
The father’s family moved around throughout his childhood: from Ukraine to Siberia, back and forth. He spent 13 years in Moscow for studies and work. As a result, BS’s father speaks a Moscow variety of Russian, with some minor South Russian and Siberian influences in his speech. Both parents lived in Moscow during their university studies and afterwards, until they moved to Scotland. BS was born in Moscow, and moved with her parents to Edinburgh when she was four months old. BS spent a lot of her time during the day with her mother. During their residence in Edinburgh, the family went on holidays in Scotland, England and Europe. At home, all communication was mainly in Russian. Naturally, the family went out into the community on a daily basis, where Scottish English was spoken, and the family had regular contacts with English-speaking families. BS watched children’s TV programmes and videos on a daily basis in Russian, BBC English and Scottish English (in order of importance). The family had contacts with more than 30 Russians living in the area at least one to three times a week. The contacts included several other Russian-speaking families with children. BS’s exposure to nursery English is summarised in Figure 3-1. The figure is a very conservative estimate of BS’s exposure to English, since it includes only the nursery attendance hours. This figure does not include any personal contacts of the family with English speakers, or daily exposure to community English during family outings or her exposure to mass media. Recordings of BS’ speech production in two languages were made at 3 age samples: from 3;4 to 3;5, from 3;9 to 3;10, from 4;4 to 4;5. BS was enrolled in a local nursery in Edinburgh at the age of 1;3. From the age of 1;3 to 3;0 she attended the nursery quite variably for different periods of time: one day a week (age 1;3 to 1;10), 30 hours a week (age 1;10 to 2;0), two days a week (10 hours a week, age 2;6 to 3;0). 
However, there was a period of four months (2;4 to 2;7) in which she did not attend the nursery at all and stayed at home.

[Figure 3-1: bar chart; y-axis: % (series: English, Russian); x-axis: BS's age in three-month intervals from 0;0 to 4;5]

Figure 3-1 BS’s language exposure pattern (% per 3 months) throughout the pre-school period, based on nursery attendance hours and 336 waking hours/month.

From the age of 3;0 to 3;5 BS attended the nursery two half-days a week (10 hours in total). This period broadly corresponds to the first age sample recorded with BS. During this period, BS’s mother learned from talking to nursery staff that BS understood most spoken English, and that she was speaking spontaneously in sentences. For example, she used to tell the nursery staff about what she had had for breakfast, and where she had gone with her mum the day before. From the age of 3;6 to 4;0, BS continued to attend the nursery for one day a week (5 hours a week). This period broadly corresponds to BS’ second age sample. In this period, she had also started to attend a local community playgroup for 4 days a week (5 hours in total). Thus, she socialised with English-speaking peers for at least five days a week. From the age of 4;0 to 5;0, in addition to attending the nursery for one day a week (5 hours), BS was enrolled in a local nursery school. This period broadly covers BS’ third age sample. BS attended the nursery school for four days a week (10 hours in total), with a two-month break for the summer holidays (3;11 to 4;0).
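The exposure estimate behind Figure 3-1 can be sketched as a simple proportion. The sketch below is an assumption about the arithmetic, not a description of the procedure actually used: it takes the 336 waking hours per month from the figure caption, assumes that this corresponds to four weeks of 12 waking hours a day, and counts only nursery-type attendance hours as English exposure. The function and constant names are hypothetical.

```python
WAKING_HOURS_PER_MONTH = 336  # value given in the caption of Figure 3-1
WEEKS_PER_MONTH = 4           # assumption: 336 = 12 waking hours x 28 days


def english_exposure_percent(attendance_hours_per_week):
    """Share of waking hours spent in English-speaking childcare (in %).

    Conservative by construction: only attendance hours count as English
    exposure; all remaining waking hours are treated as Russian.
    """
    monthly_hours = attendance_hours_per_week * WEEKS_PER_MONTH
    return 100.0 * monthly_hours / WAKING_HOURS_PER_MONTH


# Weekly hours reported in the text for BS:
# 10 hours (age 3;0 to 3;5) and 5 + 10 = 15 hours (from age 4;0).
first_sample = english_exposure_percent(10)
third_sample = english_exposure_percent(15)
print(f"{first_sample:.1f}% and {third_sample:.1f}%")
```

Under these assumptions, 10 attendance hours a week correspond to roughly 12% of waking hours, which illustrates why the figure is described as a very conservative estimate of BS's exposure to English.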
To summarise, while being exposed to Russian on a daily basis in the family from birth and throughout the pre-school period, BS’s exposure to the community English was limited to an average of 10 hours a week from the beginning of her linguistic experiences in Scotland throughout the pre-school period. The language build-up continued with a substantially broadened exposure to the community English from the age of 3;6, when BS started attending the playgroup and nursery school. All the exposure to the community language was on a regular basis. Based on the pattern of BS’s exposure to both languages, she can be classified as a Russian-dominant Russian-Scottish English bilingual.

3.2.2.2 Subject AN

AN was born in Edinburgh. Her Russian parents were born and grew up in Moscow. AN stayed at home in Edinburgh with her mother until the age of 0;7. After that she was enrolled in the local nursery full-time (five days a week, 45 hours in total), and continued to attend the nursery throughout the pre-school period. All communication at home between family members was in Russian. AN’s exposure to Russian and English in Edinburgh continued to be distributed in these proportions throughout this time. Figure 3-2 summarises the monthly percentages of AN’s exposure to English in the nursery based on the language background questionnaire filled in by her parents. The exposure pattern is based on 336 waking hours a month. The figure includes the time spent on holidays to Russia. It does not, however, include familial contacts with English-speaking families, and general daily exposure to the community English or mass media. The overall exposure to English in the nursery for AN has been 43% until the age of 4;5. The family had regular family visits from Russia. AN’s Russian grandmother stayed with the family in Scotland every year for at least six months. At the age of 0;6 AN visited Russia for three weeks.
At the ages of 2;3 and 3;2 AN spent about eight weeks in Moscow on each occasion. While staying in Russia, she was only exposed to Russian. In Edinburgh the family had regular contacts with five to six other Russian-speaking families with or without children, as well as with English-speaking families. The family went on holidays on a yearly basis in Scotland and Europe. During the holidays the family members spoke Russian to each other. AN watched children’s programmes on TV and videos on a daily basis in BBC English, Russian and Scottish English (in order of importance).

[Figure: percentage of exposure to English and Russian (y-axis, 0 to 100%) against AN's age in three-month intervals from 0;0 to 4;5 (x-axis).]

Figure 3-2 AN’s language exposure pattern throughout the pre-school period, based on nursery attendance hours and 336 waking hours/month.

The recordings of AN’s speech production in two languages took place during three age samples: from the age of 3;7 to 3;8, at the age of 4;2, and from the age of 4;5 to 4;6. During all this time she was in childcare in an English-speaking environment for 45 hours a week, and Russian was used in the family home. Based on AN’s exposure pattern to the two languages, she can be classified as a nearly balanced Russian-Scottish English bilingual.

3.3 Control groups

3.3.1 Children

Seven SSE monolingual children were selected as controls for the speech production of the bilingual children. Six children were recruited through staff and students at QMUC. One child was recruited through personal contacts. For their participation the children received a gift voucher from a toy store. No children had a history of speech or language disorders, or any reported hearing problems. The monolingual children were selected so as to match the age of the bilingual subjects.
The age criterion was chosen above such developmental norms as, for example, Mean Length of Utterance (Brown, 1973), or phonological (e.g. PACS) (Grunwell, 1982) or syntactical profiles (e.g. LARSP), since no such developmental norms are available for children acquiring two languages at the same time. It is known that monolingual norms may not be representative of bilingual normal language development, and should only apply to the population from which they were drawn (Crutchley et al., 1997; Stow & Dodd, 2003). Besides, there are no comparable age norms available for Russian. Such frequently used (see e.g. Müller, 1998; Deuchar & Quay, 2000) developmental norms as, for example, mean length of utterance, or MLU (Brown, 1973) have not yet been established for Russian child language development (Tsejtlin, 2002). It is also not clear what type of MLU should be taken, since applying either word-based or morpheme-based MLU is problematic for a crosslinguistic comparison between Russian and English. Typologically, Russian is a more inflective language that involves more derivational and inflectional morphology in grammar, while English is a relatively more isolating language. For example, the six-word English sentence “The girl will ask the boy” is conveyed in Russian by three words, “Devochka sprosit mal’chika” (Girl ask boy), with all grammatical relationships conveyed by inflections. Such typological differences between the languages make it difficult to apply either morpheme-based or word-based MLU. Besides, the MLU-norm was created to measure the development of syntactic and morphological aspects of language, and there is evidence that there may not necessarily be a correspondence between prosodic development and MLU (Lleó, 2002).

Table 3-1 represents all the children (including the subjects) listed by their age. The child control group is listed in a separate column from the subjects. The letter “C” means ‘child controls’.
The digit attached after “C” is the unique number of each control. The digit attached after the underscore is the child’s age at the end of each age sample. Three monolingual children (C3, C7 and C4) were recorded longitudinally in two age samples. All of the children are first-born except for C5. The children come from Scottish families residing in Edinburgh (C7, C6, C2, C4, C1), or close to the city (C3, C8). All of the children attended nurseries or nursery schools, and in all but one case the parents were Scottish-born. One child, C4, had a Scottish mother and an English father. C4’s speech production was a control case for any possible cross-dialectal influences (SSE-SSBE) in the speech of our bilingual subjects. Two of the monolingual subjects (C3 and C9) were boys. It was decided to accept the boys as controls alongside the girls, in order to simplify the control selection procedure. With regard to the effect of gender on the acoustic analyses (formant and f0), it is known that age-related vocal tract length differences are a bigger issue in pre-school children than gender, since the vocal tract grows fast at this age (Kent & Read, 2002). The longitudinal design of this study required a normalisation for vocal tract length differences; therefore, the gender differences were accounted for by the same procedure.

Table 3-1 Identification codes, age and sex of the children who participated in experiments; the children are listed by age.
Subject   Control   Age          Sex   1st-born   Residence
BS_3;5    -         3;4 - 3;5    F     yes        Edinburgh
-         C3_3;5    3;4 - 3;5    M     yes        Rosyth
AN_3;8    -         3;7 - 3;8    F     yes        Edinburgh
-         C4_3;8    3;8          F     yes        Edinburgh
BS_3;10   -         3;9 - 3;10   F     yes        Edinburgh
-         C3_3;11   3;11         M     yes        Rosyth
-         C5_4;0    3;11 - 4;0   F     no         Edinburgh
-         C6_4;0    3;11 - 4;0   F     yes        Edinburgh
-         C4_4;1    4;1          F     yes        Edinburgh
AN_4;2    -         4;2          F     yes        Edinburgh
-         C7_4;2    4;2          F     yes        Edinburgh
-         C8_4;2    4;2          F     yes        Dunbar
BS_4;5    -         4;4 - 4;5    F     yes        Edinburgh
AN_4;8    -         4;7 - 4;8    F     yes        Edinburgh
-         C7_4;8    4;8          F     yes        Edinburgh
-         C9_4;10   4;9 - 4;10   M     yes        Edinburgh

3.3.2 Adults

Adult control groups included five Scottish Standard English, four Southern Standard British English and five Modern Standard Russian speakers. All the adults were of a middle-class social background (group 2 of SOC 2000 NS-SEC standard) (Bilton et al., 2002). The Russian speakers had all learnt RP-based English at school and during university studies in Russia, and had been exposed to different English varieties in Edinburgh for periods ranging from four months to four years. All but one adult (E4) were female. The MSR group included the two bilingual subjects' mothers. Table 3-2 summarises the geographical background, age and sex of the adult participants. Three of the Russian speakers were born in Moscow. One of the adults (R2) was born in Volgograd, in Southern Russia. The speech of R2 had negligible dialectal influences, as is often the case with urban Russian varieties, which are influenced by the language used on Russian central television and by the high rates of migration in Russia among the population with university degrees (Avanesov, 1972). All Russian controls spoke Modern Standard Russian.

Table 3-2 Native language, age, sex of adult participants.
L1     ID   Age   Grew up in     Sex
MSR    R1   26    Moscow         F
MSR    R2   29    Volgograd      F
MSR    R3   32    Tver           F
MSR    R4   31    Moscow         F
MSR    R5   27    Moscow         F
SSE    S1   23    Linlithgow     F
SSE    S2   25    Edinburgh      F
SSE    S3   27    Edinburgh      F
SSE    S4   45    Musselburgh    F
SSE    S5   37    Edinburgh      F
SSBE   E1   31    Surrey         F
SSBE   E2   32    Oxford/Ascot   F
SSBE   E3   52    Yorkshire      F
SSBE   E4   44    Surrey         M

Four of the SSE speakers grew up in either Midlothian (mainly Edinburgh) or West Lothian (Linlithgow). Three of the SSBE speakers grew up in Southern parts of England. E3 grew up in Yorkshire, but spoke SSBE.

3.4 Materials

3.4.1 Children

We compared structurally similar words in both languages, which differ enough crosslinguistically to be diagnostic for possible language interaction in bilingual child speech production. The materials consisted of monosyllabic "consonant-vowel-consonant" (CVC) words. The decision was taken to keep the structure simple, given the constraints that arise from language-specific structural properties:

(1) Russian and English exhibit rather different phonotactic rules for consonant clusters in syllable onsets and codas.

(2) As opposed to CVC-type words, polysyllabic words are difficult to compare between English and Russian, because these languages have very different patterns of vowel reduction (see Section 2.1.3.1), whereby unstressed vowels crosslinguistically differ in vowel quality.

(3) Both languages exhibit variable word-stress location in polysyllabic words, while having rather different patterns of word-stress assignment (Trubetskoy, 1939).

(4) Many polysyllabic words in Russian and English contain consonant clusters. For methodological reasons, there should be no ambiguity in the syllabic structure of target words, and no doubt as to which syllable the consonant clusters should belong. However, in the literature, there is a theoretical incompatibility between the accounts of English and Russian syllable structure.
As Kessler and Treiman (1997) point out, of the many theories of English syllable structure the phonological account of "onset-rhyme" syllable structure (Fudge, 1969; Selkirk, 1982) "is perhaps most widely accepted". The most widely accepted account of the Russian syllable is "CV" structure defined in phonetic rather than phonological terms (Bondarko, 1998). In using CVC-structured words, there is no theoretical or practical ambiguity in the way in which syllabic decomposition of such a word should be performed, since there is only one syllable in this case.

Apart from the above advantages in matching CVC-type words across Russian and English, there are important statistical reasons behind the choice of monosyllabic words for this study. In spontaneous English speech, monosyllabic words are most frequent (Crystal, 1997). The same holds for the vocabulary acquired by English-speaking children in the first two years of life: in the CDI (Dale & Fenson, 1996) 23% of the 659 lexical items have CVC structure. In Russian, monosyllabic words are perhaps less frequent than in English, but they typically belong to the most frequently used vocabulary. Besides, monosyllabic Russian adult targets are as frequent in child speech production as trochaic bi-syllabic ones. In addition to this, trochaic bi-syllabic adult targets are often substituted by monosyllabic templates in child speech (Zharkova, 2002).
Given the above arguments, the materials were chosen to match the following criteria in both languages:

- word-internally voiceless consonants should precede the syllable nucleus, so that phonation is available to define vowel onset
- the consonant following the syllable nucleus should be either a voiced or voiceless stop, or a voiced or voiceless fricative, to trigger language-specific vowel duration conditioning
- the syllable nucleus should contain one of the vowels [i], [ʉ], or [ɪ] for SSE, or [i] or [u] for MSR (vowel [ɪ] is not featured in MSR)
- preference was given to words belonging to a typical child lexicon, or at least easy to learn in games by children aged 3;0
- the words had to be suitable for picture naming
- the words should not contain consonant clusters
- the English words should not be true cognates of the Russian ones and vice versa, to avoid any confusion about the language identity of the word.

Some clarification is needed as to what we mean by ‘voiced’ and ‘voiceless’ obstruents. For British English varieties it seems reasonable to assume that the neutralisation of word-final obstruents like /z/ is not complete. It is phonetically gradual, without neutralising the phonological contrast (Docherty, 1992). For Russian, some phonological accounts consider the contrast between voiced and voiceless obstruents as completely neutralised in word-final positions in favour of voicelessness (Bondarko, 1998), while others (Avanesov, 1972) consider voiced and voiceless counterparts as combinatory variants of the same voiced phonemes. For the neutralisation process to be complete word-finally, there should be no phonetic differences between voiced and voiceless counterparts. However, their behaviour in Russian across word boundaries is an area of disagreement: in some contexts the neutralisation can be obligatory and categorical, while in others it is gradient (Padgett, 2005).
Since in this study we deal with spontaneous speech, in which the tokens can appear in various contexts, we adopt the gradual view of final devoicing in Russian. By ‘following voiced consonant’ we mean then a phonologically voiced one that may have various amounts of phonetic devoicing of the consonant and which may or may not be phonetically neutralised depending on the context. Following these criteria, we chose the target words listed in Table 3-3.

Table 3-3 Elicited target words: orthography and adult target phonetic transcription per language. Utterance final phonetic targets for adults.

Orthography   IPA (SSE)   IPA (SSBE)   IPA (Russian⁶)   Transliteration (Translation)
Sheep         [’ʃip]      [’ʃip]       -                -
Feet          [’fit]      [’fit]       [’kit]           kit (a whale)
Seed          [’sid]      [’sid]       [’fip]           Fib (proper name)
Cheese        [’tʃiz]     [’tʃiz]      [’tʃiʃ]          chizh (a finch)
Peas          [’piz]      [’piz]       -                -
Cook          [’kʉk]      [’kʊk]       [’suk]           suk (a tree branch)
Put           [’pʉt]      [’pʊt]       [’ʃut]           shut (a joker)
Food          [’fʉd]      [’fud]       [’kup]           kub (a cube)
Shoes         [’ʃʉz]      [’ʃuz]       [’tus]           Tuz (proper name)
Pig           [’pɪɡ]      [’pɪɡ]       -                -
Sieve         [’sɪv]      [’sɪv]       -                -
Fish          [’fɪʃ]      [’fɪʃ]       -                -

⁶ Since Russian phonemes for voiced stops and fricatives are fully phonetically devoiced utterance-finally, in the table we use phonetic symbols for voiceless counterparts rather than the devoicing diacritics.

The English carrier words were mainly chosen from the lexical entries from the MacArthur Communicative Development Inventories (CDI) of Lexical Development Norms (Dale & Fenson, 1996). The CDI is based on parental reports on the lexical items acquired by English-speaking monolingual children (aged up to 1;5). For three of the target structures we could not find suitable lexemes in the CDI. Therefore, we chose depictable words, such as "seed", "sieve" and "cook". We anticipated that children aged 3;0 to 5;0 should have no problems acquiring these words (if they hadn't already done so). There is a verb (Table 3-3) in the list that matched the required structure, i.e. "put".
We added it to the list after the data was collected, since almost all the children used it regularly during the recording sessions. It was less easy to find the matching words for Russian, since no Russian CDI was available at that point. The most frequent English CDI items are lexemes denoting animals, toys, clothes and food items. Therefore, the Russian carrier words (Table 3-3) were matched to these categories as much as possible. The word "Fib" was an invented frog’s name, derived from the word "amphibian" to make sense. "Tuz" is a popular dog’s name in Russian. The girls had no problems with remembering these names. The picture for “cube” represented the popular toy, Rubik's cube, and appeared to be easy for the children. The other Russian words, for "a whale", "a finch", "a branch" and "a joker", were already known or quickly learnt by the children. Since the depicted objects remained the same in all experiments, there was no confusion about what lexical item a particular picture represented.

3.4.2 Adults

The materials collected from the adults matched those of the children in both languages. However, the materials were collected with a different procedure than playing games. The procedure is described in Section 3.5.2.

3.5 Data Collection

3.5.1 Children

3.5.1.1 Recording Equipment and Set up

The data were recorded using a Tascam (DA-P1) digital audiotape (DAT) recorder. Each subject was recorded using an MPC-65 Beyerdynamic microphone with increased directionality. After formal testing of different available options, this microphone had the smallest Signal-to-Noise Ratio (SNR) ranges in different environments (studio and office), and, therefore, was the best option for an environment with variable background noise (such as a family flat). Besides, this microphone had the advantage of being small in size, and, thus, was less intrusive than bigger microphones. During the recordings the microphone was connected to the left channel of the DAT recorder.
It was put on a flat surface, with the front surface facing the child. The recording volume settings were kept constant. The microphone was kept as close as possible to the child, with a distance not exceeding one metre. A notebook computer was used for playing two computer games during the recordings. The notebook processor needed to be regularly cooled by a fan. When the fan went on, the ventilation resonated in the computer body, and generated steady narrowband formants in the spectrum of interest. For this reason, the notebook computer was turned off whenever it was not used. Besides, we added an extra step into the acoustic analysis procedure, i.e. noise reduction (described in Section 3.6.3.2), to ensure that the fan noise formants did not interfere with the vowel formant frequency estimation.

3.5.1.2 Procedure

All subjects (or a subject's parent) who volunteered for this study signed a consent letter and received an information sheet about the broad purpose of the experiments. Any details that could influence their subsequent language behaviour were omitted. For the bilingual children most of the recording sessions in both languages took place in the author's own home. This was a flat with relatively good sound insulation located off busy roads, so that the outside noise was reduced to a minimum. However, for practical reasons the recording of the first age sample of AN took place in AN's home. In order to trigger the monolingual language mode (Grosjean, 2001), discussed in Section 1.3.2.1, the bilingual children played games with two different interlocutors: the author in the Russian sessions, and an SSE native speaker, S1⁷, in the SSE sessions. Whenever possible, the Russian parents were not present in the experiment location (especially in the SSE sessions), to ensure that they would not influence the child's language choice.
This worked out very consistently with AN, but proved to be more difficult with BS, since she often refused to let her mother leave the room. In such cases BS's mother stayed in the room, but tried not to interfere with the games.

⁷ S1 grew up in Linlithgow (West Lothian); both of her parents are Scottish. As a speech and language therapy student she was experienced in phonetics and data elicitation from children.

For monolingual children, the location of recordings varied depending on the child. Usually it took place in their homes to ensure the child's comfort and collaboration. For subjects C6, C5, C8 (in the SSE child control group), the recordings were performed in the studios of the Scottish Centre for Research into Speech Disability at QMUC, since it was the easiest arrangement for all parties. At all times, we ensured that the environmental noise at the recording location was reduced as much as possible.

3.5.1.3 Games

Depending on the language mode, Russian or SSE, the children played different sets of games. The games were designed to elicit a sufficient number of repetitions of the target words to make statistical analyses possible, while preserving the spontaneity of the conversation with the experimenter as much as possible. The games were chosen to match the cognitive abilities of the children’s age range. Each game was self-contained, i.e. it included all the language-specific target words. This gave the advantage that the games could be interchanged depending on the child's mood within each session and between different sessions. Each game lasted about 15-20 minutes. Typically, a set of three to four games was played in every session. This arrangement was sufficient to keep the attention span of the children for about 50 minutes. In each session, we elicited ten to twelve repetitions for each target word for SSE, and fifteen to seventeen for Russian (there were fewer Russian target words).
The elicited speech contained a mixture of multi- and single-word utterances, and spontaneous speech. The SSE games were:

"The fishing game": The basic fishing game with the magnets was acquired in a local toy store. Small laminated pictures representing the target words were then attached to the fish, and were caught with a set of fishing rods, interchangeably by the experimenter or by the child, and were collected into cups. The aim was to catch the most fish.

"Snap": This game is very popular with the children of our target age. The cards are shuffled and distributed among the players. The piles are put face down in front of each player. One by one, each player takes the top card off the pile and lays it face up in another pile in-between the two players. The players name the pictures as they appear. When a sequence of the same pictures appears on top of the pile, both players vie to be the first to call out "snap!" The one who shouts it out first takes the upturned pile. Then the next player lays down another card and the game continues. The player with most cards wins.

"Picture-pairs": In this game, the aim is to match the hidden pairs of pictures. The participants turn over two cards, telling what is represented on the picture. If the pairs are not the same, they are placed back in the same position. If they are the same, the player keeps the cards and has another go. The player with the most pairs wins. This activity revolves around remembering where the cards are placed. All children were interested in this game, as long as they managed (or were allowed) to win. All the children were familiar with the game, but the youngest ones in particular (3;2 to 3;4) sometimes had too short an attention span to keep playing the game.

"Mister Cook's Kitchen": This computer game was especially developed for the experiments. As many SSE words were related to the food vocabulary, the story was about cooking.
The story line was about three little friends (Pig, Sheep, and Fish), who visited Mister Cook. Mister Cook had plenty of kitchen cupboards, containing food or non-food items (cheese, sieve, food, peas, shoes and feet). Each cupboard was opened in turn, and the child could decide what items Mister Cook had to put into the soup he was making. The children clicked on the screen buttons themselves, if they wanted to. This game was a good complement to the non-computer games, and was very popular with the children.

In the monolingual Russian language mode experiments, we played a slightly different set of games:

"Catch the ladybirds": The basic magnet game was acquired in a local toy store. It contained a set of colourful ladybirds that were placed on a playing surface ("grass and flower field"). Participants caught the ladybirds in turn with magnet spiders hanging from green branches. Small laminated pictures representing the target words were attached to the bottom of the ladybirds.

"Hide and seek": This computer game was especially developed for the experiments. The story is about a puppy named Tuz, who is hiding away from his mum in one of the four rooms. The child had to look for the puppy in each room. Pictures of the objects representing the Russian carrier words were hidden in the rooms behind red buttons. Children could click on the buttons to see whether they could find the puppy.

In addition to that, we also played "Snap" and "Picture Pairs" in a similar way as in the SSE experiments, but with the pictures depicting Russian target words. An SSE version of the "Hide and seek" game was also adopted for the sessions with the SSE monolingual children. The puppy was called Spud in that version.

3.5.2 Adults

For adults, the CVC carrier words were embedded in two types of carrier sentences covering four prominent positions (referred to as “pos 1 – 4” in Table 3-4).
The four different positions, shown in Table 3-4, were chosen to elicit different degrees of phonetic prominence in the vowels. Introducing such variation in prominence provides a more meaningful basis for comparison (than, for example, reading out word lists), given the extent of variability that is likely to occur in the child speech. Position 1 covered a phrase initial pitch accent in an utterance with several pitch accents. Position 2 covered a non-initial pitch accent before a phrase boundary. Position 3 covered a phrase final pitch accent in an utterance with several pitch accents. Position 4 covered a short full-intonation phrase with one pitch accent.

The subjects were recorded on a DAT-recorder in a soundproof booth using a condenser boundary microphone with a half-spherical response. The recording volume settings were kept constant. The subject's mouth distance from the microphone was 50 – 60 cm. The subjects were instructed to speak clearly. No specific instructions were provided towards the pitch accent placement in the utterances. Subjects read the set of sentences containing the target words five times from the computer screen. We controlled the subject’s speech rate by prompting each sentence at regular time intervals with short pauses in between. In these studio recordings, we gathered 20 renditions of each target word.

Table 3-4 Main type carrier sentences used in the two languages.

English (orthography): It's a [target] (pos 4).
Russian (transliteration): Eto [target] (pos 4).

English (orthography): A [target] (pos 1) is a [target] (pos 2) and nothing but a [target] (pos 3).
Russian (transliteration): Tot [target] (pos 1) – eto [target] (pos 2), i tol'ko tot [target] (pos 3).

In addition to the studio recordings we also analysed child directed speech for those adult subjects who elicited data from children during the recording sessions. These subjects (and languages) were S1 (SSE), R3 (SSE and MSR) and E4 (SSBE).
3.5.3 Summary of the Elicited Data

To extract statistically representative averages, we aimed to collect about 20 repetitions of the target words from each child per language and age sample. Some tokens were excluded during data annotation due to too much background noise. Table 3-5 summarises the number of tokens collected (of all target words) per child and age sample. It also shows the number of sessions needed to collect the tokens.

Table 3-5 Summary of the number of sessions and the total number of elicited tokens per child (and age sample)

Child BS_3;5 C3_3;5 AN_3;8 C4_3;8 BS_3;10 C3_3;11 C5_4;0 C6_4;0 C4_4;1 AN_4;2 C7_4;2 C8_4;2 BS_4;5 AN_4;8 C7_4;8 C9_4;10 Number of sessions per language SSE MSR 5 3 4 1 3 2 2 2 1 1 2 2 3 2 2 2 4 Total Number of Elicited Tokens SSE MSR 282 134 314 3 408 216 101 2 299 266 260 250 237 168 2 163 203 215 342 2 2 390 220 442 231 279 244

The number of tokens collected per session changed from child to child, and from session to session. The children's interest in the games was highest in the first session, and typically somewhat reduced in subsequent sessions. The total number of sessions recorded with children was 52, and a total of 5664 tokens were collected in both languages. As shown in Table 3-5, for the majority of children sufficient amounts of data were collected in two sessions. Younger children had 3 to 5 sessions per age sample. C4 had only one session for each age sample. AN had only one session for SSE due to a technical fault, which resulted in the loss of the data for one session. Despite that, the number of elicited tokens from AN was sufficient for statistical analyses.

3.5.4 Digital Audio Data Formats

The original recordings were digitised at the sampling rate of 44100 Hz and with 16-bit quantisation. The sampling rates used for analyses were 11025 Hz for adults, and 22050 Hz for children (both 16-bit quantisation).
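Both analysis rates are integer divisors of the 44100 Hz digitisation rate. The sketch below only illustrates that relationship and the resulting upper frequency limit of each analysis; the text does not state how the lower rates were obtained, so treating them as integer decimation is an assumption, and the function and constant names are hypothetical.

```python
# Illustrative sketch; integer decimation is an assumption, not a documented step.
ORIGINAL_RATE_HZ = 44100  # digitisation rate of the original recordings
ANALYSIS_RATES_HZ = {"adults": 11025, "children": 22050}


def decimation_factor(original_hz, target_hz):
    """Return the integer factor by which the original rate is reduced."""
    factor, remainder = divmod(original_hz, target_hz)
    if remainder != 0:
        raise ValueError("target rate is not an integer divisor of the original")
    return factor


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


if __name__ == "__main__":
    for group, rate in ANALYSIS_RATES_HZ.items():
        factor = decimation_factor(ORIGINAL_RATE_HZ, rate)
        print(f"{group}: factor {factor}, usable band up to {nyquist_hz(rate)} Hz")
```

The higher rate for children leaves a wider analysis band, which is consistent with the higher formant frequencies expected from shorter vocal tracts.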
The digital audio files with these sampling rates served as input for manual annotation and automatic acoustic analyses.

3.6 Phonetic and Acoustic Measurements

3.6.1 Overview

In Chapter 2 we introduced the cross-linguistic variables that are the focus of this study. The aim is to address theoretical questions on how young bilinguals (balanced and less balanced) cope with sound-structural ambiguities in their languages. The research variables concern vowel quality, duration, and vocal effort in prominent syllables. To enable automatic acoustic measurements, vowel duration was manually annotated. Vowel quality was measured both qualitatively (auditory phonetic analysis) and quantitatively (automatic acoustic analysis). To measure vocal effort, we used the measure of spectral balance (Sluijter & van Heuven, 1996b) with modifications for the purposes of this study. The measure of spectral balance combined intensity level analysis and the formant analysis, since intensity had to be measured in specific frequency bands of the radiated spectrum for a particular vowel token. To enable further normalisation, and to exclude excessively loud or quiet utterances, we needed to estimate both overall intensity and fundamental frequency.

The data annotation and analysis procedure consisted of two parts, and the general process is represented in Figure 3-3. We first annotated the onset and the offset of a target vowel and surrounding consonants, assigned it a broad phonetic label, and annotated a piece of typical silence (noise) for a given fragment of speech. Subsequently, from the annotated duration of the vowel we automatically estimated acoustic parameters, such as formant frequencies (F1-F3, Hz), formant bandwidths (Hz), fundamental frequency (f0, Hz), and RMS-power (for the whole vowel spectrum and for three fixed frequency bands around F1, F2 and F3). The same process was applied to both child and adult speech.
The method of acoustic encoding, however, differed depending on the vocal tract characteristics of adult and child participants. The overview of the acoustic parameters used in this study is given in Table 3-6.

Figure 3-3 Data flow diagram of the encoding process of the acoustic waveform into acoustic parameters and phonetic labels.

Table 3-6 Raw acoustic measurements in this study.

No.   Parameter   Description
1     F0          estimate of fundamental frequency (Hz)
2     F1          estimate of the centre frequency of F1 (Hz)
3     F2          estimate of the centre frequency of F2 (Hz)
4     F3          estimate of the centre frequency of F3 (Hz)
5     OI          overall intensity, measured as RMS-power (dB) from the whole DTFT spectrum in a 23 ms Hamming window
6     A1          RMS-power (dB) measured around estimated F1 in a fixed frequency band
7     A2          RMS-power (dB) measured around estimated F2 in a fixed frequency band
8     A3          RMS-power (dB) measured around estimated F3 in a fixed frequency band

3.6.2 Data Annotation

3.6.2.1 Phonetic Labelling

For each token, the vowel was analysed auditorily and labelled accordingly. All the manual labelling was done in PRAAT (Boersma & Weenink, 2004). One of the following broad phonetic symbols was assigned to a vowel: [i], [], [u], [], [], []. Since all files containing phonetic labels and duration markers were subjected to automatic processing after acoustical analyses, all the phonetic labelling was made with a computer readable version of IPA, i.e. SAM Phonetic Alphabet (or SAMPA) (Wells, 1995). Tokens were excluded from further acoustic analyses if:

- an adult was talking at the same time as the child
- there was too much environmental noise
- the target word could not be identified
- it was pronounced in a whisper
- the vowel was de-accented (cf. Section 3.6.2.3)
- none of the above vowel symbols could be assigned.

Neighbouring consonants were also phonetically labelled.
The transcription for consonants included some diacritics for narrow phonetic transcription:

- a palatalisation marker⁸ was used, since it is a lexically contrastive phonological feature in Russian; but it also often occurred in SSE child speech
- a marker for ejective stops was used, since we identified that some SSE child realisations of voiceless stops in syllable coda were made with a glottalic airstream rather than pulmonic; the duration of the occlusion in such stops often looked significantly longer than more typical realisations made with a pulmonic airstream mechanism
- a marker for aspiration was used when it appeared in the Russian data, since this phonetic property is not featured in Russian either phonologically or phonetically, and could thus be due to language interaction from SSE.

⁸ In SAMPA, the phonetic labels do not contain diacritics, but rather unambiguous sequences of symbols, which we refer to as a “marker”.

3.6.2.2 Annotation of Timing

Duration of the vowel and the surrounding consonants was measured after visual inspection of the waveform and of the spectrogram of each utterance. In defining segmental boundaries for each of the CVC segments, we mainly followed the annotation criteria specified in van Zanten et al. (1991). The criteria concern the shape of the amplitude envelope of an acoustic waveform. Since the consonants surrounding vowels were usually realised as voiceless, the amplitude envelope of the waveform changed dramatically at the CV transition, and such a transition was relatively easy to identify (e.g. Figure 3-4). The same was often true for the VC transitions when a vowel was followed by a voiceless obstruent (Figure 3-5).

Figure 3-4 Timing marker indicating the end of the voiceless fricative [s] and the beginning of the following vowel [ɪ] in “sieve” (annotated in SAMPA).

Figure 3-5 Timing marker indicating the end of the vowel [ʉ] and the beginning of the devoiced stop [t] in “food” (annotated in SAMPA).
Figure 3-6 Timing marker indicating the end of the vowel [] and the beginning of the voiceless stop [k] in “cook” (annotated in SAMPA).

However, in some cases (Figure 3-6) it was more useful to follow spectral cues. For example, in agreement with the patterns described for British English by Gobl and Ní Chasaide (1988; 1999b), in vowels preceding voiceless stops the cessation of voicing could start quite early in the vowel, and the decrease of the vowel amplitude was long and gradual rather than abrupt (see Figure 3-6). In such cases, we identified the boundary in the vowel spectrum at the offset of F2.

A boundary between a vowel and a following voiced fricative was usually annotated at the beginning of visually identifiable friction in the higher frequency partials (Figure 3-7).

Some difficulty for segmentation was caused by the presence of preaspiration of voiceless fricatives in Scottish English (Gordeeva & Scobbie, 2004). Preaspiration also occurred in the child data in the SSE sessions. In the preaspirated VC realisations, usually in “fish” tokens, the [] sequence could contain rather long (sometimes up to 400 ms) [h]-sounding or whispery transitions. Such a sequence was annotated as a separate phonetic entity (as in Figure 3-8).

Figure 3-7 Timing marker indicating the end of the vowel [i] and the beginning of the voiced fricative [z] in “cheese” (annotated in SAMPA).

Figure 3-8 Two timing markers indicating the boundaries between the end of the vowel [] and the preaspirated whispered transition [] and the following voiceless fricative [s] in “fish” (annotated in SAMPA). The duration of [] is 142 ms.

3.6.2.3 Annotation of Prominence and Utterance Type

It is well known that segmental and suprasegmental sound properties, such as voice quality, duration and intensity, vary as a function of prominence and pragmatic meaning of intonation (Lehiste, 1977).
Since the target words were elicited in spontaneous interaction, children produced them in phrases of different length, structure, and intonational meaning. Therefore, for each token the syllable prominence was analysed and labelled. Syllables carrying a pitch accent in a broad or narrow focus (Ladd, 1996) were considered for further analyses, while de-accented syllables were excluded. For each focal syllable, we annotated position in the utterance as:
- phrase initial
- phrase medial
- phrase final (in phrases with more than one pitch accent)
- phrase final single pitch accent

It was beyond the scope of this study to identify pragmatic meaning of intonation expressed by fundamental frequency or by giving a phonological transcription (e.g. by assigning H and L labels to tonal events). However, since pragmatic meaning of intonational events affects other suprasegmental sound properties, it was necessary to broadly classify utterances according to modalities of illocutionary speech acts commonly maintained in the literature, with such basic distinctions as non-emphatic statements, yes/no questions, WH-questions and emphatic statements (see Hirst & Di Cristo (1998)). This enabled us to choose between appropriate intonational modalities for statistical analyses for different subsets of tokens.

3.6.3 Automatic Acoustic Measurements

3.6.3.1 Steady-State of the Vowel

The formant centre frequencies and RMS-power were calculated by averaging the estimates through all frames in a steady state part of the vowel. The steady state was defined as beginning at 15% of the total vowel duration from the vowel onset, and ending at 50% of the total vowel duration. The minimum duration of the steady state was set to 25 ms. In exceptional cases, when the percentage approach resulted in a steady state of less than 25 ms, its duration was defined in an absolute fashion, i.e. as the 25 ms after the initial 15 ms transition.
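The steady-state definition above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the analysis script used in this study: the function and constant names are hypothetical, times are expressed in ms from the vowel onset, and vowels shorter than 40 ms (for which the absolute window would extend past the vowel offset) are not treated specially because the text does not say how they were handled.

```python
from typing import Tuple

MIN_STEADY_MS = 25.0       # minimum steady-state duration given in the text
FALLBACK_ONSET_MS = 15.0   # absolute initial transition used in the fallback case


def steady_state_window(vowel_ms: float) -> Tuple[float, float]:
    """Return (start, end) of the steady state in ms from the vowel onset.

    Hypothetical helper: 15% to 50% of the vowel duration, or the 25 ms
    following an initial 15 ms transition when that span is below 25 ms.
    """
    if vowel_ms <= 0:
        raise ValueError("vowel duration must be positive")
    start = 0.15 * vowel_ms
    end = 0.50 * vowel_ms
    if end - start < MIN_STEADY_MS:
        # Percentage approach gives too short a window: use absolute timing.
        start = FALLBACK_ONSET_MS
        end = FALLBACK_ONSET_MS + MIN_STEADY_MS
    return start, end


if __name__ == "__main__":
    print(steady_state_window(200.0))  # roughly 30 ms to 100 ms
    print(steady_state_window(60.0))   # fallback case: (15.0, 40.0)
```

With this definition the percentage rule applies to vowels longer than about 71 ms, since 35% of the vowel duration must reach 25 ms.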
By performing acoustic analyses in the steady state of the vowel rather than in the transitions, we excluded parts of vowels with possible short-term laryngeal influences from the left and right consonantal contexts. By integrating the RMS-power means through the steady part of the vowel, we also levelled out accidental perceptually irrelevant short-term fluctuations of the intensity curve, which are due to the interaction of harmonic and formant frequency (Ladefoged & McKinney, 1963; Lehiste, 1977).

3.6.3.2 Formant Analysis

Several difficulties have to be taken into account when estimating formant centre frequencies for high fundamental frequencies, such as in child and adult female speech.

First of all, it is known that the estimation can be affected by a bias towards the harmonic closest to the centre frequency. The greater the distance between two harmonics, the greater the bias away from the centre frequency is likely to be (Traunmüller & Eriksson, 1997).

Secondly, since this bias is proportional to f0 (Traunmüller & Eriksson, 1997), it is not constant in a prominent syllable nucleus, because in “stress-accent” languages (like English and Russian) (Beckman, 1986), f0 changes as a function of the pragmatic meaning of intonation throughout the course of prominent syllables. Therefore, in a syllable nucleus, the error in formant estimation will fluctuate depending on how far the poles are from the closest harmonic (both can eventually coincide). This problem can be somewhat levelled out by averaging the estimated centre frequencies of formants through a steady state portion of the vowel, taking the measurements at multiple points.

Thirdly, we had to make naturalistic recordings with variable environmental noise (different homes and computer fan resonance). It is well known that estimation of formants is susceptible to environmental noise, especially when such noise is narrow band.
The formant analysis procedure in this study is designed to address these issues as much as possible. Figure 3-9 represents the process of formant analysis employed in this study. The same process applied to both child and adult data, but the specific method of formant estimation (circle 2 in Figure 3-9) differed for the two types of data based on the best performance. The issues of determining the best performance are discussed in Section 3.6.4.2.

Figure 3-9 Data flow diagram of the formant analysis process of the acoustic waveform and annotated timing of vowels.

Step 1 in Figure 3-9 involved noise reduction. The issues of efficacy of applying this method are discussed in Section 3.6.4.2.3. The timing of a typical piece of silence (or noise) was labelled for each utterance during phonetic segmentation. ‘Typical’ means that the piece of silence had to be more or less steady and reflect the noise level during the utterance. For example, with computer fan noise we could see in the spectrum that the narrow-band formants run throughout the whole utterance. Subsequently, we performed noise reduction (see footnote 9) on the speech file, by subtracting the spectral magnitude (in short term discrete time Fourier transform) in a central frame of the annotated silence (noise) from all frames in the speech signal. The speech signal with subtracted noise thus served as input for further formant analysis (not for RMS-power analysis).

In Step 2 (Figure 3-9), formant centre frequencies of the target vowels were analysed using PRAAT (Boersma & Weenink, 2004). Two different methods were employed for the formant centre frequency estimation in child and adult speech. For adult female speech (sampled at 11025 Hz), we estimated centre frequencies based on the 10th order LPC analysis (autocorrelation, 25 ms analysis width, 10 ms time step, pre-emphasis of +3 dB per octave from 50 Hz).
For child speech, we estimated the centre frequencies based on the LPC (burg) method as implemented in PRAAT (Press et al., 1992). In this method, a speech signal is re-sampled at twice the maximal frequency of interest. Subsequently, formant analysis is applied in a Gaussian window of 51.9 Hz. The following parameters were employed: 5 ms (time step), 4 (number of poles), 6000 Hz (maximal frequency of interest), 25 ms (analysis width), 100 Hz (+3 dB per octave pre-emphasis).

Footnote 9: The algorithm was kindly implemented by Peter Rutten, a speech signal-processing engineer working in text-to-speech technologies.

In Step 3 (Figure 3-9), the extracted formant centre frequencies for F1, F2 and F3 (Hz) were averaged through all the frames in a steady part of the vowel. Since both autocorrelation and burg LPC make residual errors in estimation due to high f0, we included a heuristic in the mean extraction algorithm to exclude spurious formant estimations (‘candidate errors’). The criteria for exclusion (for both children and adults) were established after extensive examination of typical values and errors in the vowels and by subsequently verifying the errors in the FFT-spectra. The following criteria were put forward, and referred to the auditorily verified phonetic symbols:
- if the number of estimated poles was less than 3 (rather than 4)
- if female adult [i] in SSE and MSR had F2 < 1800 Hz (typically F2 > 2500 Hz)
- if female adult [u] in MSR had F2 > 1800 Hz (typically F2 is low)
- if child F1 < 250 Hz
- if child F1 > 1500 Hz (no open vowels in the child data in this study)
- if child [i] in SSE and MSR had F2 < 2000 Hz (typically F2 > 3000 Hz)
- if child [u] in MSR had F2 > 2000 Hz.

The validation of the formant means calculated after applying this procedure is addressed in Section 3.6.4.2.

3.6.3.3 RMS-Power Analysis

RMS-power analysis in specific frequency bands of the DTFT spectrum served as a non-normalised basis for the spectral balance measurement.
Figure 3-10 represents the process of RMS-power analysis employed in this study. The same process applied to both child and adult data.

Figure 3-10 Data flow diagram of the RMS-power analysis of the acoustic waveform in fixed frequency bands.

In Step 1, the RMS-power (dB) was calculated in three frequency (spectral) bands around F1, F2 and F3. The bandwidths of the spectral bands (Hz) were fixed as in Table 3-7. According to the acoustic theory of speech production (Fant, 1960), the formant bandwidths are predictable from the formant frequency, for a normal phonation with resulting spectral tilt of -12 dB per octave. The bandwidths can be derived for a frequency of interest from the following equation (Fant, 1960):

Equation 3-1
$B_n = F_n / 2\pi$
where $B_n$ is a bandwidth, and $F_n$ is a target frequency.

The bandwidths derived from Equation 3-1 were then fixed to a maximum, by rounding off the maximal bandwidth to the nearest hundred in each of the three frequency slices shown in Table 3-7. For example, if F2 of a vowel was 2100 Hz, the RMS-power (dB) would then be calculated between 1801 and 2400 Hz. For each token, the spectral bands did not overlap.

Table 3-7 Summary of the fixed frequency bandwidths for three frequency slices.

Frequency slice (Hz)   Maximal bandwidth (Hz)   Fixed bandwidth (Hz)
1 to 2000              318.3098862              300
2001 to 4000           636.6197724              600
4001 to 5500           875.352187               900

For a given spectral band, the power (dB) was calculated from the short term Discrete Time Fourier Transformed spectrum (stDTFT; see footnote 10). A Hamming window of 23 ms was used to extract the speech signal samples. The power (dB) was then defined as 20 times the base-10 logarithm of the power in a frequency band, relative to the maximum power allowed by 16-bit quantisation (32767).
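As an illustration of Equation 3-1 and Table 3-7, the following Python sketch derives the fixed bandwidth for a formant estimate and the limits of the band placed around it. The function names are hypothetical. The placement of the band limits (lower limit at F - B/2 + 1, upper limit at F + B/2) is an assumption inferred from the single worked example above (F2 = 2100 Hz giving 1801 to 2400 Hz); it is not a reproduction of the implementation used in this study.

```python
import math

# Upper edges (Hz) of the three frequency slices listed in Table 3-7.
SLICE_UPPER_EDGES_HZ = (2000, 4000, 5500)


def fixed_bandwidth_hz(formant_hz):
    """Fixed bandwidth for the slice containing formant_hz.

    The maximal bandwidth of a slice is Equation 3-1 evaluated at the
    slice's upper edge, rounded to the nearest hundred.
    """
    if formant_hz < 1:
        raise ValueError("formant frequency below the first slice")
    for upper in SLICE_UPPER_EDGES_HZ:
        if formant_hz <= upper:
            maximal = upper / (2 * math.pi)        # Equation 3-1
            return int(round(maximal / 100.0)) * 100
    raise ValueError("formant frequency above the last slice (5500 Hz)")


def band_limits_hz(formant_hz):
    """Band limits around a formant (assumed placement, see lead-in)."""
    half = fixed_bandwidth_hz(formant_hz) // 2
    return formant_hz - half + 1, formant_hz + half


if __name__ == "__main__":
    print(fixed_bandwidth_hz(2100))  # 600
    print(band_limits_hz(2100))      # (1801, 2400)
```

Deriving the fixed values from the slice edges, rather than hard-coding 300, 600 and 900 Hz, makes the link between Equation 3-1 and Table 3-7 explicit.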
The power was expressed as RMS following the equation:

Equation 3-2
$\mathrm{RMS}(F_x \rightarrow F_y) = \frac{2}{N^{2}} \sum_{n=x}^{y} |F(n)|^{2}$ (dB)
where: N is the number of samples in a frame; $F_x$ is the start frequency of a frequency band; $F_y$ is the end frequency of a frequency band; F(n) is the short term Discrete Time Fourier Transformed spectrum of the windowed speech segment.

In Step 2 (Figure 3-10), the RMS power values (dB) were extracted for each of the three bands, and from the values the means were calculated throughout the steady part of the vowel. Overall intensity was measured and averaged in the same way as in specific spectral bands, except that the RMS-power measurements covered all spectrum frequencies.

3.6.3.4 Fundamental Frequency Analysis

Fundamental frequency, f0 (Hz), was estimated using the speech analysis package “Wavesurfer” (Sjölander & Beskow, 2000), using the ESPS (XWaves) pitch analysis method. Since both child and female adult speech can have high fundamental frequencies, we employed the same broad parameters covering both groups. The f0 was calculated in frames of 10 ms, with a minimal pitch threshold of 50 Hz, and a maximum pitch threshold of 1500 Hz. The f0 (Hz) values were then extracted for three positions in the vowel (with reference to the total vowel duration, rather than to that of the steady state):
(1) for the frame corresponding to the onset of the vowel duration + 10 ms
(2) for the frame in the middle of the vowel
(3) for the last frame in the vowel.

Footnote 10: The algorithm was kindly implemented by Peter Rutten, a speech signal-processing engineer working in text-to-speech technologies.

3.6.4 Data Validation and Normalisation

3.6.4.1 Validation of Phonetic Labels

We performed a validation of auditory phonetic labelling by testing intra- and intertranscriber reliability. Both tests were based on 5% of the child data pseudorandomly chosen from the data covering two age samples in both languages and from all the child control data available at the time of testing.
A total of 143 utterances from six subjects were selected. Both intra- and intertranscriber reliability tests were conducted following the labelling criteria specified in 3.6.2.1, except that only vowels and no surrounding consonants were annotated in the tests. The transcribers were informed as to the target words collected in this study, since the author was also aware of those during the data annotation process.

The agreement in the labelling results in both tests was measured by means of statistical analysis. The ‘Cohen's kappa’ statistic tests pairwise agreement between transcribers against the degree of agreement expected by chance, and, therefore, it is considered to be a better indication of intra- and intertranscriber reliability than a percentage of agreement (Fleiss, 1971). Pairwise agreement with a kappa value of more than 0.7 (K>0.7) is considered to be a statistically satisfactory agreement. Kappa is an overall index of agreement, and it does not indicate sources of disagreement, so we mention them separately.

In the intratranscriber reliability test, we re-labelled vowels in the selected 143 utterances. The time difference between the original labelling and re-labelling ranged from 2 to 8 months. The overall agreement for phonetic labels was greater than 0.7 (kappa=0.767; n=143), which is a statistically satisfactory pairwise agreement. The major sources of disagreement were the labels for close rounded vowels, specifically the phonetic labels [u],[],[] realised in place of the adult target // or /u/. While it was not difficult to decide that a given target belongs to a close rounded vowel space as such, it was not always obvious what precise label to assign in this variable child vowel space.

In the intertranscriber reliability test, two phonetically trained transcribers other than the author annotated the same set of 143 utterances. Both transcribers (A and B) were native speakers of English, (A) of American English, and (B) of SSE.
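For readers unfamiliar with the statistic, the pairwise kappa computation described above can be sketched in Python as follows. This is a generic textbook formulation, kappa = (Po - Pe) / (1 - Pe), and not the statistical software used in this study; the function name and the example SAMPA labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed proportion of tokens on which both labellings agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each labelling's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[lab] * count_b[lab] for lab in count_a) / (n * n)
    if expected == 1.0:
        # Both labellings use one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    first_pass = ["i", "i", "u", "u"]    # hypothetical SAMPA vowel labels
    second_pass = ["i", "i", "u", "i"]
    print(cohens_kappa(first_pass, second_pass))  # 0.5
```

In this toy example three of four labels agree (75%), but half of that agreement is expected by chance, so kappa is only 0.5; this is the sense in which kappa is stricter than a raw percentage of agreement.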
The intertranscriber agreement with transcriber A was statistically unsatisfactory, i.e. K<0.7 (K=0.611, n=143), even though the kappa value was still at the high end of the agreement. The agreement with transcriber B was statistically satisfactory (K=0.767, n=143). As in the intratranscriber reliability test, the major source of disagreement was the labelling of the close rounded vowels [, u, ].

Since there was statistically satisfactory intratranscriber agreement, and satisfactory intertranscriber agreement with transcriber B (a native speaker of SSE), we consider the overall quality of the auditory phonetic analyses performed in this study to be generally satisfactory for further statistical analyses.

3.6.4.2 Validation of Estimated Formant Frequencies

3.6.4.2.1 Introduction

The method of inferring vocal effort (Sluijter & van Heuven, 1996b) used in this study relies on the goodness of estimation of the formant frequencies. In this study we used two different standard methods of formant frequency estimation (LPC autocorrelation for adults and LPC burg for children). We determined the performance of the formant analysis for each subset of data separately.

For adult recordings, the LPC autocorrelation method performed better than the LPC burg method in estimating the Russian close back vowel [u], and performed similarly for other vowels. We checked this by selective manual examination of the automatic output and by visual inspection of the corresponding FFT spectra. Based on this observation, we chose the autocorrelation method for the adult data. For child speech, the LPC burg method performed substantially better than LPC autocorrelation, and the test of performance is described in Section 3.6.4.2.3 alongside the validation part.

3.6.4.2.2 Adults

Previous studies of vowel formant frequencies in SSBE, SSE and MSR allow us to judge whether the adult vowel formant frequencies are plausibly estimated in this study.
For this comparison, we selected two studies of Russian vowel formants offering typical values for one male (Fant, 1960) and one female speaker (Bondarko, 1981). For SSBE, Wells (1962) studied 25 male subjects, whereas Deterding (1997) made acoustic measurements of five female SSBE BBC broadcasters from the 1980’s. For SSE, Walker (1992) reported formant frequencies of all vowel monophthongs. She analysed F1 to F3 for five female SSE speakers from Edinburgh. Additionally for SSE, we refer to the data on Glaswegian female speakers in Scobbie et al. (1999b). Table 3-8 compares formant estimations (in Hz) between these acoustic studies for different languages and our own measurements using LPC autocorrelation method described in Section 3.6.3.2. The General American data and male data are useful as additional references when female cross-study estimations are in conflict. For the adult data, we discuss all language-specific discrepancies larger than 150 Hz between our data and different sources for female speakers listed in Table 3-8. All the language-specific female data agrees on F1 to F3 for the vowels [i] and []. The measurements between the sources are very similar for all the languages in the table. Russian formant frequencies seem to have been reasonably estimated, given that there is a gender difference between Fant (1960) and this study, so that lower formant values can be expected in Fant’s male data. The difference of 315 Hz in Russian back [u] for female data is acceptable, since it falls within possible F2 ranges reported from different sources by Bondarko (1981). For SSE [], there is a large discrepancy of 401 Hz in F2 between this study and Walker. For SSBE [u], there is an even larger discrepancy of 610 Hz in F2 between this study and Deterding. For SSBE [], there is a discrepancy of 152 Hz in F2 between this study and Deterding. There are methodological differences across the studies. 
Walker and Wells both performed manual measurements from FFT spectra. However, both Deterding and this study used an automatic LPC-based method, and yet show a large discrepancy. Even though different estimation methods are bound to result in somewhat different measurements for the same tokens, they are not likely to explain such big differences, since most sources in Table 3-8 agree to a greater extent for the formants of [i] and [].

It is also possible that the studies in Table 3-8 reported reasonable estimations of F2 for the close rounded vowels, and the big differences are due to diachronic phonetic changes in close rounded vowel quality. There are several reports (Gimson, 1962; Bauer, 1985) of a rapid diachronic change in the quality of RP /u/ since Wells's data (1962) was sampled. A recent study by Hawkins and Midgley (2004) confirms the acoustic fronting of [u] for a group of 20 male RP speakers recorded in 2001, especially in the younger age group. Deterding (1997) based his measurements on the corpus of speech of BBC World Service newsreaders recorded in the 1980’s. The estimated mean F2 for /u/ in Bauer’s (1985) 1982 data was 1704 Hz, which is 369 Hz higher than the means reported by Deterding (1997). Our F2 means for SSBE /u/ are 610 Hz higher than Deterding’s (1997). This shows that in today’s SSBE the traditionally described rounded close back vowel has evolved into a more central or even front rounded vowel.

It is less certain whether a similar phonetic change (401 Hz difference in F2 of []) may have happened in SSE, since the time span between the different sources is only 12 years at most. Data in this study and data reported in Scobbie et al. (1999b) are in agreement, while Walker’s (1992) study is in disagreement with both. Since we have seen that our method has no problems estimating low F2 in back vowels, we can conclude that our estimation of SSE resonating frequencies is acceptable here as well.
Table 3-8 A comparison of different acoustic studies of formant frequencies (Hz), estimated for adult native speakers of SSBE, SSE, MSR and General American.

IPA symbol SSBE n/Sex [i] Wls 25M 285 2373 3088 356 2098 2696 MSR Detr 5F 303 2654 3203 384 2174 2962 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 309 328 F2 939 1437 F3 2320 2674 F1 376 410 F2 950 1310 F3 2440 2697 [] [] [u] [] SSE Gord. Bond. Fant Gord. Sc. Wlk. Gord. 3F 1F 1M 4F F 5F 5F 399.4 300 222 383.4 412 343 376.7 2681 2620 2240 2708 2749 2689 2725.6 3170 * 3140 3220 * 3327 3364.6 * 531 545.1 507.9 * 2255 2110.4 2178 * 3009 3027.9 2976 445 411 393.6 1977 1576 1992.9 * 2727 2818.2 403.9 320 231 405.5 2047 620 730 935.4 2911 * 2230 2889 488.2 1462 2886 Gen.Am. H&al 48F 437 2761 3372 483 2365 3053 459 1105 2735 519 1225 2827

References for the sources and explanations for the codes used in Table 3-8:
- Gord.: this study
- Wls: Wells (1962)
- Detr: Deterding (1997)
- Bond.: Bondarko (1981)
- Fant: Fant (1960)
- Wlk: Walker (1992)
- Sc.: Scobbie et al. (1999b)
- H & al: Hillenbrand et al. (1995)
- *: not available
- F: Female
- M: Male

3.6.4.2.3 Children

We validated the estimated formant frequencies for child speech by manual reannotation of vowel formant frequencies for a subset of child speech. The subset of child data included the same pseudo-random 143 utterances that were originally used for measuring inter- and intratranscriber reliability (Section 3.6.4.1). All manual annotation of formants was performed in PRAAT (Boersma & Weenink, 2004). The F1 to F3 frequencies (Hz) were measured directly from FFT spectra (Hamming window, frequency range 0 to 7000 Hz, window length of 25 ms, and the dynamic range varying from -45 dB to -20 dB depending on the signal quality) and additionally from the spectra of the corresponding spectral slices.

The vowel formant values from the manual annotation were then compared to the automatic output of two formant methods, i.e. LPC burg and LPC autocorrelation.
The comparison was done by calculating the RMS error (Hz) for each formant separately. The input parameters of the LPC burg method were the same as those described for children in Section 3.6.3.2. The parameters for the autocorrelation method were the same as for adults, except that for child speech the downsampling frequency was 12000 Hz.

Additionally, to test the efficacy of the noise reduction component in the formant analysis procedure, we also included for each of the LPC methods the outputs from the acoustic signal with subtracted noise and with the original noise level. So in total we calculated RMS errors of formant estimation (Hz) for four methods: (1) LPC burg with subtracted noise; (2) LPC burg with original recording noise level; (3) LPC autocorrelation with subtracted noise; (4) LPC autocorrelation with original noise level.

The results of this test are presented in Figure 3-11. In terms of the smallest RMS error (Hz), the LPC burg method with subtracted noise outperformed the other three methods for all measured formants (F1 to F3, Hz). Compared to the manual annotation, the autocorrelation method performed much worse than burg, especially for F2 and F3. For all three formants, applying the noise subtraction improved the performance of the automatic LPC formant tracking as compared to the output from the signal with the original noise levels. The improvement was the biggest for F2, where both methods (burg and autocorrelation) had an improvement in estimation of 17%.

This test reinforces: (1) the use of the LPC (burg) method for the formant estimation of child speech in our study, since it gives a more realistic estimation of the formant frequencies than LPC (autocorrelation); (2) the use of the noise subtraction procedure to improve the overall formant estimation.
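The RMS error used in this comparison can be sketched as follows. This is a minimal Python illustration, assuming paired manual and automatic estimates of one formant for the same tokens; the function name and the example values are hypothetical and are not data from this study.

```python
import math


def rms_error_hz(manual_hz, automatic_hz):
    """Root-mean-square difference (Hz) between paired formant estimates."""
    if len(manual_hz) != len(automatic_hz) or len(manual_hz) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - m) ** 2 for m, a in zip(manual_hz, automatic_hz)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical F2 values (Hz) for three tokens: manual reference
    # measurements and the output of two automatic methods.
    manual = [2900.0, 3100.0, 1500.0]
    method_outputs = {
        "method A": [2950.0, 3050.0, 1600.0],
        "method B": [2500.0, 3300.0, 1900.0],
    }
    errors = {name: rms_error_hz(manual, est)
              for name, est in method_outputs.items()}
    best = min(errors, key=errors.get)
    print(errors)
    print("smallest RMS error:", best)
```

Because the differences are squared before averaging, a few gross tracking errors (for example a formant assigned to the wrong pole) weigh heavily in the result, which makes the measure sensitive to exactly the kind of error the comparison is meant to detect.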
Figure 3-11 RMS errors (F1 to F3, Hz) for four automatic formant analysis methods as compared to manual formant measurements from FFT spectra. [Bar chart; x-axis: Formant Number (F1, F2, F3); y-axis: RMS Error (Hz), 0 to 900; series: LPC Burg (reduced noise), LPC Burg (original noise), LPC Autocorrelation (reduced noise), LPC Autocorrelation (original noise).]

3.6.4.3 Normalisation of RMS-Power Measurements

In Section 3.6.3.3 we described the acoustic method used to measure spectral balance. However, raw RMS-power measurements around formants cannot be used without normalisation for other (non-)linguistic effects, such as variation of overall intensity and formant frequency shifts between different instances of the same vowel.

Differences in the overall intensity of a vowel result from several confounding speaker-related effects, such as the amount of exerted effort and differences in vowel quality. They also result from environmental factors such as the distance of the subject from the microphone, speaking volume, recorder volume settings and environmental noise. An appropriate normalisation procedure should separate the speaker effects from the environmental effects. The measures undertaken to reduce environmental noise were discussed in Section 3.5.1.2.

To normalise for the residual differences in overall intensity, we chose what Jessen (2002) calls an ‘intrinsic normalisation method’. An intrinsic normalisation expresses a relationship between two different acoustic measures from the same token, while extrinsic methods involve a comparison of a measure in one token to the same measure in some other related token (e.g. overall intensity in stressed and unstressed syllables of the same word). An extrinsic normalisation method is ill suited to this study, given that the data were gathered in spontaneous play situations, and the elicited phrase structure varied from utterance to utterance, so that no unique comparison point could be found for all the utterances.
The first step involved normalisation for differences in overall intensity. The spectral band level of each vowel token was expressed as a ratio of the RMS-power in a specific frequency band to the overall RMS-power of the same token, as shown in:

Equation 3-3
$A_i = \mathrm{RMSB}_i - OI$ (dB)
where OI is the measured overall RMS-power (dB) averaged through the steady state of the vowel, and $\mathrm{RMSB}_i$ is the RMS power in a specific frequency band i (dB) for the same part of the vowel.

This ratio was calculated for each of the formant frequency bands of the vowel, and was taken as input for the next normalisation step.

The next step involved normalisation for differences in the vocal tract resonance frequencies. The differences are a result of intra- or interspeaker variation in supralaryngeal settings for a given vowel rather than in vocal effort. This problem has been addressed in studies dealing with acoustic correlates of stress (Sluijter & van Heuven, 1996b), ‘syllable-cut’ prosody (Jessen, 2002) and voice quality (Hanson, 1997). According to the acoustic theory of speech production (Fant, 1960), formant frequencies and intensity levels of the spectrum are interrelated, following the ‘low-pass filter’ rule:

    shift in the frequency Fn of a formant brings about an intensity level change of the sound, which is mainly confined to frequencies above Fn and amounts to +12 dB for an increase of one octave in Fn (Fant, 1960, p. 58)

This means that the same speaker can produce two phone instances of the same vowel with a slightly different articulatory setting (e.g. due to a more open articulation). This supralaryngeal difference results in different vocal tract resonance in the radiated spectrum (higher F1 in our example), and consequently in somewhat higher intensity levels in the frequencies above F1 as specified by the ‘low-pass filter’ rule. An appropriate normalisation procedure should thus separate the laryngeal effects from the supralaryngeal ones.
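To make the two ideas above concrete, the following Python sketch computes the band level of Equation 3-3 and the approximate level change implied by the ‘low-pass filter’ rule (12 dB per octave of formant shift, for frequencies well above the shifted formant). It is a rule-of-thumb illustration only: the function names and example values are hypothetical, and the second function is not the full correction factor applied in this study.

```python
import math


def band_level_db(rms_band_db, overall_db):
    """Equation 3-3: band level relative to overall intensity (dB)."""
    return rms_band_db - overall_db


def level_change_above_formant_db(f_new_hz, f_old_hz):
    """Approximate level change (dB) above a formant that shifts from
    f_old_hz to f_new_hz, using +12 dB per octave of upward shift."""
    if f_new_hz <= 0 or f_old_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_new_hz / f_old_hz)


if __name__ == "__main__":
    # Hypothetical token: band around F2 at -30 dB, overall intensity -18 dB.
    print(band_level_db(-30.0, -18.0))                  # -12.0
    # F1 raised by one octave: +12 dB above F1 by the rule.
    print(level_change_above_formant_db(600.0, 300.0))  # 12.0
    # A 10% rise in F1 gives roughly +1.65 dB.
    print(round(level_change_above_formant_db(330.0, 300.0), 2))
```

The last line shows why the correction matters: even a modest 10% difference in F1 between two tokens of the same vowel changes the level above F1 by more than 1.5 dB without any change in vocal effort.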
To separate the laryngeal from the supralaryngeal effects, we followed the normalisation method for formant frequency shifts described in Jessen (2002). As a whole, the method in his study originates from different sources (Sluijter & van Heuven, 1996b; Hanson, 1997). Each formant measurement (F1 to F3, Hz) for a vowel phoneme instance is compared to the formant means for this vowel across speakers. The residual formant frequency difference (Hz) is then transformed into an intensity difference (dB). Subsequently the intensity difference is subtracted from the measured RMS-power (dB) around the formant. The method enables a comparison between different articulatory realisations of the same phonemes, and allows comparison between speakers.

However, in this study we had to adapt the normalisation to enable crosslinguistic comparison. Therefore, the specific formant estimation for each vowel token was compared to the formant mean for this vowel phoneme across languages and speakers.

Originally we intended to apply the normalisation procedure for the formant frequency shifts for both F2 and F3. However, upon the validation of formants in child speech (Section 3.6.4.2.3), it was decided to exclude the data of spectral balance around F3 due to its unreliability, since the formant shift normalisation in F3 requires subtraction of a joint contribution from F1, F2 and F3 (Hanson, 1997). We have seen that the RMS-error in estimation of F2 and F3 was on average 300 Hz (Figure 3-11) for either formant, so that the cumulative RMS-error was greater if we considered both F2 and F3. Therefore, we account for the spectral balance measurement around F2 that requires subtracting a joint contribution from only F1 and F2 (Sluijter & van Heuven, 1996b, p. 2478).

We first calculated the correction factors for intensity level differences that are due to formant frequency differences.
The mean formant frequencies for F1 and F2 (Hz) were averaged for each target phoneme across languages and all speakers (for adults and children separately). The means for F1 and F2 (Hz) were derived separately for adults and children due to big differences in their vocal tract length and resonance. Equation 3-4 (Sluijter & van Heuven, 1996b, p. 2478) calculates the correction factor for intensity level differences of A2 due to shifts in F1 as compared to the mean F1 and F2.

Equation 3-4
$\Delta A2_{a} = 40 \log_{10}(F1_n / F1) - 40 \log_{10}\left( (F2_n^{2} - F1_n^{2}) / (F2^{2} - F1^{2}) \right)$ (dB)
where F1 and F2 are estimates of the first and second resonating frequencies (Hz) for each vowel token. $F1_n$ and $F2_n$ (Hz) are mean frequencies for each formant averaged across speakers (adults and children separately) for each phoneme in each language separately.

The correction factor $\Delta A2_{a}$ (Equation 3-4) was then subtracted from the result of Equation 3-3 as in:

Equation 3-5
$A2^{*}_{a} = A2 - \Delta A2_{a}$ (dB)

Additionally, to allow comparisons between pairs of vowels different in vowel quality we calculated the correction factors based on Equation 3-4 and for $F1_n$ and $F2_n$ (Hz) based on mean frequencies for each of the two formants averaged across speakers (adults and children separately) and languages:
(1) jointly for unrounded vowels /i/ and //, resulting in correction factor $\Delta A2_{b}$
(2) jointly for rounded vowels // and /u/ and across speakers (adults and children separately), resulting in correction factor $\Delta A2_{c}$.

The correction factors were then subtracted from the band-specific result of Equation 3-3, resulting in two additional normalisations:

Equation 3-6
$A2^{*}_{b} = A2 - \Delta A2_{b}$ (dB)

Equation 3-7
$A2^{*}_{c} = A2 - \Delta A2_{c}$ (dB)

To summarise, four normalised measures of spectral balance (A2, A2*a, A2*b, A2*c) were used in this study. All of them normalised for the differences in overall intensity. The normalisation was separate for the child and adult speakers.
Additionally, to allow crosslinguistic comparison for the vowels similar in quality, A2*a normalised for the supralaryngeal differences in F2 within each of the phonemes /i/, //, /u/, // or // across each speaker group (children or adults) across languages. Besides, to allow crosslinguistic comparison and comparison of vowels different in vowel quality, A2*b normalised across the phonemes // and /i/ across each speaker group and language, while A2*c normalised for the differences across /u/, // and // across each speaker group and languages.

4 Acquisition of Vowel Quality

4.1 Introduction

This chapter investigates bilingual patterns of vowel quality. The subjects are two bilingual children, AN and BS, acquiring Russian and Scottish English. The two girls differ in their bilingual input conditions: BS is a Russian-dominant bilingual, while AN had a nearly equal amount of input in the two languages. Both girls had “early, simultaneous, regular, and continued exposure to more than one language” (de Houwer, 1995, p.222) from before the age of two.

The vowels investigated fall into two categories: (1) the close(-mid) unrounded vowel /i/ in Russian versus SSE /i/ and /ɪ/; and (2) the MSR close rounded vowel /u/ versus the SSE close central rounded /ʉ/. The former group forms a systemic crosslinguistic difference; the latter one is realizational: i.e. equally ambiguous from the point of view of either language. The crosslinguistic differences regarding the vowel quality have been addressed in Section 2.1.3.

The chapter is built around two sets of questions:
(1) Does each of the bilingual children have a differentiated control of vowel quality in their two languages? Does this control change longitudinally?
(2) Is there any language interaction? What are the patterns? Can the direction of interaction be explained by the amount of language input or intralanguage factors such as ‘markedness’ or ‘cue strength’?
The aim set out in this and further chapters is to account for bilingual language differentiation and interaction patterns for these research variables, and to test them against two views on language interaction in bilingual language acquisition that were, admittedly, formulated for morpho-syntactic development. The Language Dominance Hypothesis (Petersen, 1988) claims that language dominance determines the direction of language interaction in young simultaneous bilinguals: i.e. transfer is unidirectional towards a less dominant language. As opposed to that, the Cross-language-Competition Hypothesis (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998) both claim that linguistic structure and its complexity (relative ‘cue strength’ or ‘markedness’) determine the direction of interaction, whereby language interaction for a feature is unidirectional towards the language with a more ambiguous (marked) structure.

Furthermore, we provide new SSE monolingual data on the state of acquisition of these vowels at the ages of 3;4 to 4;8. Monolingual results (adults for Russian and adults versus children) are presented first. Then bilingual results for each subject (BS and AN) are compared to the monolingual data. In particular, the subjects’ phonetic ranges of realisations of adult targets in each of the vowel categories are compared to the ranges of the monolingual controls (both children and adults), in an attempt to tease apart the issues of bilinguality and speech immaturity. Then we compare (where possible) each subject’s ranges in SSE to those in their Russian monolingual language mode to address the question of language differentiation. Longitudinal aspects are looked at separately. From all these analyses we attempt to derive the answers to the above research questions.

4.2 Statistical Analysis

The phonetic variation between bilinguals’ languages and between bilingual and monolingual children is compared by means of non-parametric statistics.
Phonetic range for each vowel category is defined by the frequencies of assigned phonetic labels for an adult target phoneme. Chi-square (χ2) tests are appropriate to determine whether such distributional differences in phonetic realisation are significant (at the 95% level of confidence). The tests were performed only for those subsets of data that fulfilled the validity requirements of ‘expected frequencies’.

4.3 Acquisition of Vowel Quality

4.3.1 Scottish English Monolingual Results

4.3.1.1 Acquisition of close(-mid) unrounded vowels

In Section 2.2.1.1, we reviewed the evidence that the SSE vowel /ɪ/ is acquired later than /i/, and that at the age of 3;0 SSE-speaking children produce the lax vowel with a relatively low accuracy rate compared to other vowels (Matthews, 2002). This section explores monolingual production patterns at the age of 3;4 to 4;9.

As we discussed in Section 3.3.1, the control sample for SSE monolingual children in this study consisted of seven children (aged 3;4 to 4;9). Three of the children were recorded twice longitudinally, giving us a total of ten controls. In the following graphs the data of all children (including longitudinal cases) are plotted by their age.

In this section we explore whether the accuracy rate of the SSE vowel /ɪ/ reported in Matthews (2002) has improved in our age group of SSE-speaking children, and whether the children are still different from adults in the range of phonetic variants. Additionally, we consider how the phonetic range of /i/ relates to that of /ɪ/.

With regard to the first question, Matthews’ (2002) data show that the total percentage of /ɪ/ produced as adult-like [ɪ] in the longitudinal data of his seven subjects aged 1;10 to 2;10 is only 54.5% (n=521). The percentages of adult-like forms of [ɪ] for /ɪ/ for each subject at the age of 2;5 to 2;10 ranged from 11 to 93%. The lax vowel belonged to the ‘difficult’ category (see footnote 11).
The phonetic range of /ɪ/ for our SSE child control data is presented in Table 4-1 and in Figure 4-1. The results show that the overall percentage of adult-like realisations of /ɪ/ in our group has increased to a total of 98.9%. Interestingly, the overall 0.9% of all non-adult-like forms is contributed by tongue raising ([i] for /ɪ/). Among all the children, only C3 (aged 3;11) had an instance of /ɪ/ (1.4%) produced as [ʉ] with lip rounding and tongue raising. The cases of tongue raising to [i] (0.9%) belonged to two children, one of whom, C9 (aged 4;9), was the oldest child in our group. The results therefore suggest that segmentally /ɪ/ is acquired by the age of 3;4 in monolingual development. The limited number of non-adult-like realisations is likely to be a sign of residual speech immaturity, rather than a systematic developmental property.

As expected, the overall number of non-adult-like realisations for the vowel /i/ is very low (0.5%), and the results per child are summarised in Appendix A. There was only one non-adult variant involved: [ɪ] for /i/.

Footnote 11: Matthews’ (2002) subdivision into “difficult” and “easy” refers to standard deviations in the number of all individual target vowels for a given session per speaker, so that we cannot use the same criterion for our comparison, since our data involve only a limited subset of the whole vowel space. Here and in the following sections on the rounded vowels we calculated percentages comparable to our own data for each individual vowel from the raw data set kindly provided to us by Ben Matthews.
[Figure 4-1: stacked bar chart. Vertical axis: % tokens per subject (0% to 100%). Horizontal axis: subjects C3_3;4, C4_3;8, C3_3;11, C6_4;0, C5_4;0, C4_4;1, C8_4;2, C7_4;2, C7_4;8, C9_4;9. Legend: [ʉ], [ɪ], [i].]

Figure 4-1 Phonetic range of variation in the production of the lax vowel /ɪ/ by SSE monolingual children (plotted by age on the horizontal axis)

Table 4-1 Phonetic ranges of adult target /ɪ/ produced by SSE monolingual children (tokens per speaker: N and row %)

Speaker    [i]         [ɪ]            [ʉ]         Total
C3_3;4     0 (0.0%)    84 (100.0%)    0 (0.0%)    84 (100.0%)
C7_4;2     0 (0.0%)    46 (100.0%)    0 (0.0%)    46 (100.0%)
C4_3;8     0 (0.0%)    28 (100.0%)    0 (0.0%)    28 (100.0%)
C3_3;11    0 (0.0%)    72 (98.6%)     1 (1.4%)    73 (100.0%)
C6_4;0     0 (0.0%)    56 (100.0%)    0 (0.0%)    56 (100.0%)
C8_4;2     1 (1.1%)    92 (98.9%)     0 (0.0%)    93 (100.0%)
C5_4;0     0 (0.0%)    73 (100.0%)    0 (0.0%)    73 (100.0%)
C4_4;1     0 (0.0%)    45 (100.0%)    0 (0.0%)    45 (100.0%)
C9_4;9     5 (6.2%)    76 (93.8%)     0 (0.0%)    81 (100.0%)
C7_4;8     0 (0.0%)    80 (100.0%)    0 (0.0%)    80 (100.0%)
Total      6 (0.9%)    652 (98.9%)    1 (0.2%)    659 (100.0%)

With regard to the second question of how the amount of non-adult realisations for the target /i/ compares to that of /ɪ/, the answer is summarised in Table 4-2. The low number of non-adult-like forms did not allow us to perform valid chi-square tests (see footnote 12). The overall low number of non-adult-like realisations (0.5% for /i/ versus 0.9% for /ɪ/) shows that both vowels are equally mastered in terms of adult-like production at this stage of monolingual development, and there is no difference between them in terms of ‘difficulty’ at this point.

Table 4-2 Frequencies of adult and non-adult like realisations of /i/ and /ɪ/ for SSE monolingual children (aged 3;4 to 4;9) (tokens per target vowel: N and row %)

Target vowel    Adult-like: No    Adult-like: Yes    Total
/i/             4 (0.5%)          780 (99.5%)        784 (100.0%)
/ɪ/             6 (0.9%)          652 (99.1%)        658 (100.0%)
Total           10 (0.7%)         1432 (99.3%)       1442 (100.0%)

With regard to the third question of children’s production compared to adults’ phonetic variants in auditory terms, the difference between the two groups lies in the fact that some children did marginally produce non-adult-like forms (e.g. [i] for /ɪ/ in Table 4-1), whereas adults did not at all.

Footnote 12: This remark is also true in the following sections in places where we present contingency tables without accompanying statistical results.

4.3.1.2 Acquisition of close rounded vowels

Based on the overall numbers of adult-like and non-adult-like realisations for /ʉ/ in Matthews’ (2002) study, the vowel was labeled as ‘difficult’ compared to the rest of the SSE vowels (see also Section 2.2.1.1). This section explores phonetic ranges for this target vowel produced by the monolingual children aged 3;4 to 4;9. We address two questions: (1) whether the substantial percentage of non-adult-like realisations for /ʉ/ reported in Matthews (2002) for children aged up to 3;0 is reduced in the production of SSE monolingual children aged 3;4 to 4;9; (2) how the phonetic ranges of /ʉ/ produced by the SSE-speaking children relate to those of the adults in this study.

With regard to the first question, Matthews’ (2002) data showed that the percentage of adult-like forms (i.e. produced as [ʉ]) for /ʉ/ in the longitudinal data of his seven subjects aged 1;10 to 2;10 was only 63.8% (n of tokens=537), with the percentages of adult-like forms across subjects (compared to total targets /ʉ/) at the age of 2;5 to 2;10 ranging from 25% to 93%. Therefore, the vowel was classified as ‘difficult’.

Our results for /ʉ/ for the SSE monolingual children aged 3;4 to 4;9 are presented in Table 4-3 and in Figure 4-2. The overall percentage of adult-like realisations of /ʉ/ by the children increased to a total of 83% (compared to 63.8% in Matthews’ study).
However, despite an increase of 19 percentage points in adult-like forms compared to Matthews’ data, all subjects in this study still produced a substantial number of non-adult-like realisations, ranging from 1.9 to 33.8%.

The most important contributor to non-adult-likeness (a total of 11.8% in Table 4-3) is tongue lowering and backing, resulting in a sound like [ʊ]. /ʊ/ does not feature in adult SSE phonology. However, all monolingual children produced [ʊ] to some degree (ranging from 1.9% to 32%). Interestingly, among the tokens labeled as [ʊ], 76.8% of cases come from the SSE carrier words featuring the lax /ʊ/ in the SSBE adult model (see Appendix B).

The second most important contributor to non-adult-likeness of /ʉ/ is the 4.3% of back realisations as [u]. The presence of this variant in the production of the SSE-speaking children is very important for our crosslinguistic perspective involving the Russian back vowel /u/. It means that the mere presence of back realisations of [u] for SSE /ʉ/ in our bilingual children’s speech production cannot be interpreted as a sign of language interaction from Russian. Instead we need to look at the statistical significance of the differences in phonetic ranges, rather than at their mere presence.

[Figure 4-2: stacked bar chart. Vertical axis: 0% to 100% of tokens. Horizontal axis: subjects C3_3;4, C4_3;8, C3_3;11, C6_4;0, C5_4;0, C4_4;1, C8_4;2, C7_4;2, C7_4;8, C9_4;9. Legend: [ʊ], [u], [ʉ], [ɪ], [i].]

Figure 4-2 Phonetic range of the production of the adult target vowel /ʉ/ by SSE monolingual children (sorted by age on the horizontal axis).
Table 4-3 Phonetic range in the realisation of adult target /ʉ/ by SSE monolingual children (tokens per speaker: N and row %)

Speaker    [i]         [ɪ]         [ʉ]            [u]           [ʊ]           Total
C3_3;4     0 (0.0%)    0 (0.0%)    51 (66.2%)     1 (1.3%)      25 (32.5%)    77 (100.0%)
C7_4;2     1 (1.7%)    0 (0.0%)    55 (94.8%)     0 (0.0%)      2 (3.4%)      58 (100.0%)
C4_3;8     0 (0.0%)    1 (6.7%)    13 (86.7%)     0 (0.0%)      1 (6.7%)      15 (100.0%)
C3_3;11    0 (0.0%)    1 (2.1%)    46 (95.8%)     0 (0.0%)      1 (2.1%)      48 (100.0%)
C6_4;0     0 (0.0%)    0 (0.0%)    52 (78.8%)     8 (12.1%)     6 (9.1%)      66 (100.0%)
C8_4;2     0 (0.0%)    1 (1.1%)    71 (77.2%)     5 (5.4%)      15 (16.3%)    92 (100.0%)
C5_4;0     0 (0.0%)    0 (0.0%)    43 (68.3%)     11 (17.5%)    9 (14.3%)     63 (100.0%)
C4_4;1     0 (0.0%)    0 (0.0%)    29 (82.9%)     0 (0.0%)      6 (17.1%)     35 (100.0%)
C9_4;9     0 (0.0%)    0 (0.0%)    52 (98.1%)     0 (0.0%)      1 (1.9%)      53 (100.0%)
C7_4;8     0 (0.0%)    1 (1.4%)    67 (95.7%)     0 (0.0%)      2 (2.9%)      70 (100.0%)
Total      1 (0.2%)    4 (0.7%)    479 (83.0%)    25 (4.3%)     68 (11.8%)    577 (100.0%)

4.3.1.3 Summary of results for the SSE monolingual peers

For the vowel /i/, the SSE monolingual children aged 3;4 to 4;9 had no problems producing adult-like vowel quality. Overall, there were only 0.5% of non-adult-like targets, involving lowering of /i/ to [ɪ]. Only two children contributed to this immature realisation. Even though this is not systematic, the presence of this variant in the monolingual data is worth noting, considering the bilingual aspect of the study, which involves the lack of a tense/lax contrast in Russian.

For the vowel /ɪ/, the SSE monolingual children seem to have resolved the production difficulty reported in Matthews (2002). Immature realisations for this vowel are infrequent, but nevertheless they do occur. They include fronting and raising to [i] and lip rounding and raising to [ʉ].
The occurrence of fronting and raising of /ɪ/ to [i] in monolingual child speech is again important to notice, given that the same phenomenon is frequently observed in the speech of L2-learners whose native languages lack such a contrast (Panasyuk et al., 1995; Escudero, 2000; Piske et al., 2002), and it is possible that a similar effect might manifest itself in the course of SSE acquisition in bilingual children speaking Russian and Scottish English.

For the vowel /ʉ/, the non-adult-like realisations were still systematic, in that all the monolingual children produced them. Overall we measured 17% of non-adult-like variants for /ʉ/. The non-adult-like production involved, in order of importance: lowering and backing to [ʊ] (11.8%), backing to [u] (4.3%), and lip unrounding (0.9%). Among the tokens labeled as [ʊ], 76.8% of cases come from the English carrier words featuring the lax /ʊ/ in the SSBE adult model. This is a very interesting finding given the fact that one fourth of the MC families in Edinburgh have at least one member speaking a non-SSE English variety (Scobbie et al., 1999a), usually featuring the /u/-/ʊ/ distinction. Obviously pre-school children come into contact with these non-SSE English varieties in Edinburgh through the local nurseries, and the variability of their speech production might also reflect the variability of cross-varietal input that they receive in the community, rather than being attributable only to speech immaturity at the age concerned.

Individual children showed different ranges of variation for each of these processes. Important for our crosslinguistic perspective is the process of backing to [u], since it involves a variant similar to Russian /u/. Thus in defining language differentiation in SSE and possible interaction from Russian in our bilingual subjects we need to compare the proportions of phonetic ranges (including [u]) rather than relying on the mere presence of [u] to establish language interaction.
4.3.2 Bilingual Acquisition

4.3.3 Subject AN

4.3.3.1 Acquisition of close unrounded vowels

4.3.3.1.1 Language differentiation

As we discussed in Sections 1.3.2.3.1, 2.1.3.3 and 2.3.2, crosslinguistic differences in tense/lax vowel opposition, such as its presence in one language and its absence in another, constitute a relative difficulty for L2-learners depending on the age of onset of L2-learning (Flege et al., 1995; Guion, 2003), while simultaneous bilinguals ultimately acquire such contrasts in a native-like fashion (Guion, 2003). We investigate whether AN acquired this systemic difference.

The first question is whether the absence of the vowel [ɪ] in MSR affected AN’s production of the lax vowel in SSE in terms of the frequency of occurrence of phonetic variants [i] and [ɪ] for the target /ɪ/ compared to the SSE monolingual peers.

Table 4-4 The effect of factor bilinguality of subject AN compared to the SSE monolingual peers for the production of phonetic variants [i] and [ɪ] for the target /ɪ/ (tokens per label: N and row %)

Bilingual?    [i]          [ɪ]            Total
No            6 (0.9%)     659 (99.1%)    665 (100.0%)
Yes           4 (1.3%)     303 (98.7%)    307 (100.0%)
Total         10 (1.0%)    961 (99.0%)    972 (100.0%)

The set-up of the test is presented in Table 4-4. As we discussed in Section 4.3.1.1, the occurrence of [i] for /ɪ/ in the SSE monolingual data was rather infrequent (in two children out of seven). However, it was present, with a total of 0.9% of [i] for /ɪ/. AN produced 1.3% of [i] for the target /ɪ/. This percentage is comparable to C8’s individual 1.1%, and it is less than C9’s individual 6.2% in Table 4-1. Therefore, AN’s production of the lax vowel in SSE is acquired similarly to the monolingual children despite the absence of the vowel in Russian. The occurrence of [i] for the target /ɪ/ can fully be attributed to speech immaturity, rather than to language interaction from Russian.

The second question is whether the presence of the tense/lax vowel contrast in SSE affected AN’s production of the vowel /i/ in MSR.
We assess this in terms of the number of occurrences of phonetic variants [i] and [ɪ] for the MSR target /i/ versus the SSE /i/. The comparison is presented in Table 4-5. There was a significant association [χ2=20.536; df=1; p<0.01] between the numbers of the phonetic labels [i] and [ɪ] for target /i/ and language, with [ɪ] for /i/ being more common in MSR (10%) than in SSE (2%).

Table 4-5 AN’s production of phonetic variants [i] and [ɪ] for the target /i/ in SSE compared to MSR (across age) (tokens per language: N and row %)

Language    [i]            [ɪ]           Total
SSE         378 (98.2%)    7 (1.8%)      385 (100.0%)
MSR         198 (90.0%)    22 (10.0%)    220 (100.0%)
Total       576 (95.2%)    29 (4.8%)     605 (100.0%)

As we already discussed, the vowel quality of /i/ in both Russian and Scottish English is similar, while Russian features no lax vowel /ɪ/. The direction of the significant association in this test, however, is very surprising, since it appears that AN produced a greater proportion of [ɪ] for the target /i/ in MSR compared to SSE, while we would expect it to be the other way around; and, in fact, we should expect no [ɪ] in Russian at all. The accounted substitutes for the vowel /i/ in Russian children are [] or [] (Zharkova, 2002, p. 72). To our knowledge there are no maturational accounts of the occurrence of [ɪ] for /i/ in stressed syllables in the speech of Russian monolingual children. The unusual use of [ɪ] for /i/ by AN in Russian certainly confirmed our native speaker intuition that these cases sounded non-Russian even for immature child speech. This affected all three Russian carrier words containing target /i/: /’kit/ (n=3), /ti/ (n=15), /’fib/ (n=4).

Thus, we can conclude that the presence of the vowel [ɪ] for the target /i/ in AN’s Russian speech is a sign of language interaction from SSE. This is a surprising direction of language interaction, because AN introduced a marked SSE-sounding vowel [ɪ] for the unmarked Russian target /i/.
This direction of interaction contradicts both CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998), if the influence originated from within the crosslinguistic vowel systems.

A possible explanation for this language interaction in the vowel system in AN’s /i/ could be the phonotactic influence of palatalisation of the preceding consonant. If the effect on the vowel /i/ in AN’s speech had its origin in the acquisition of the consonantal system, rather than in that of vowels, then the argument of the relative markedness of the vowel systems would be irrelevant, and we could not claim that both CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998) are falsified at the level of speech production.

Phonotactically, Russian /i/ is preceded by palatalised consonants, while non-palatalised consonants require following // for this pair of vowels. We also know that Russian-speaking children sometimes produce [] for /i/ (Zharkova, 2002, p. 72). Thus, potentially AN could aim for [], but produce [] for the target /i/, because she had not acquired //. Alternatively, she could not yet have acquired the palatalisation in a similar way to Russian monolingual children, and thus produced non-palatalised consonants followed by []. Since we annotated this detail of consonantal contexts during phonetic labelling, we can investigate this alternative explanation.

Table 4-6 shows the distribution of palatalised and non-palatalised consonants in the preceding context of the vowels [i] and [ɪ] for the Russian target /i/. Overall she produced 87.9% of palatalised consonants before target /i/, which is low, considering the findings that Russian monolingual children acquire palatalisation early (Jakobson, 1941; Zharkova, 2002).

Table 4-6 Distribution of palatalised and non-palatalised consonants in the preceding context of the vowels [i] and [ɪ] for Russian target /i/.
(Tokens per label: N and row %)

Label    Preceding consonant palatalised: No    Yes            Total
[i]      24 (12.1%)                             174 (87.9%)    198 (100.0%)
[ɪ]      1 (4.5%)                               21 (95.5%)     22 (100.0%)
Total    25 (11.4%)                             195 (88.6%)    220 (100.0%)

Thus, this marginal lack of palatalisation might be a sign of language interaction from SSE. However, we also see that 95.5% (n=21) of the [ɪ] variants are actually produced after palatalised consonants; this means that the underlying reason for the use of the lax vowel is not the phonotactic influence of the preceding consonant, but the vowel system itself.

4.3.3.1.2 Longitudinal perspective

Since there was no difference in AN’s production of phonetic variants [i] and [ɪ] for the target /ɪ/ compared to the SSE monolingual peers, we do not expect the longitudinal perspective to reveal any language interaction effects.

Table 4-7 Longitudinal production of [i] and [ɪ] for target /i/ in SSE by AN (tokens per longitudinal moment: N and row %)

Age      [i]             [ɪ]         Total
3;8      182 (100.0%)    0 (0.0%)    182 (100.0%)
4;2      55 (94.8%)      3 (5.2%)    58 (100.0%)
4;5      140 (99.3%)     1 (0.7%)    141 (100.0%)
Total    377 (99.0%)     4 (1.0%)    381 (100.0%)

Indeed, the percentages in Table 4-7 show that AN’s production of [ɪ] for /i/ in SSE split by age is unsystematic. Overall the production of target /i/ in SSE is very similar to the SSE monolingual ranges.

AN’s production of the phonetic variants [i] and [ɪ] for the target /i/ in Russian split by age is presented in Table 4-8.

Table 4-8 Longitudinal production of [i] and [ɪ] for target /i/ in Russian by AN (tokens per longitudinal moment: N and row %)

Age      [i]            [ɪ]           Total
3;8      53 (76.8%)     16 (23.2%)    69 (100.0%)
4;2      52 (89.7%)     6 (10.3%)     58 (100.0%)
4;5      93 (100.0%)    0 (0.0%)      93 (100.0%)
Total    198 (90.0%)    22 (10.0%)    220 (100.0%)

The association between the distribution of phonetic variants [i] and [ɪ] for the target /i/ and age was significant at the 99% level [χ2=23.676; df=2; p<.01].
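The statistic reported for Table 4-8 can be reproduced with a short, self-contained Python sketch of the Pearson chi-square test for a contingency table. The function names are illustrative, and the validity rule used here (every expected frequency at least 5) is an assumption: the thesis refers to the validity requirements of ‘expected frequencies’ without spelling out the exact criterion.

```python
def chi_square(table):
    """Pearson chi-square for a table given as a list of rows of counts.

    Returns (statistic, degrees_of_freedom, expected_frequencies).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count of a cell = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    if any(e == 0 for row in expected for e in row):
        raise ValueError("a row or column has no tokens")
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


def expected_frequencies_ok(expected, minimum=5.0):
    """Assumed validity rule: every expected frequency >= minimum."""
    return all(e >= minimum for row in expected for e in row)


# Table 4-8: rows are ages 3;8, 4;2 and 4;5; columns are [i] and the lax variant.
table_4_8 = [[53, 16], [52, 6], [93, 0]]
statistic, df, expected = chi_square(table_4_8)
valid = expected_frequencies_ok(expected)
# statistic is approximately 23.676 with df = 2, as reported in the text.
```

An observed count of zero (as at age 4;5) is not itself a problem for the test; what matters under the assumed rule is that the expected counts, which here range from 5.8 to 83.7, are large enough.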
The percentages in Table 4-8 show that AN’s production of [ɪ] for /i/ in Russian decreases over time: it is 23.2% at the age of 3;8 and 0% at the age of 4;5. Thus AN’s production of Russian /i/ becomes more adult-like with increasing age.

4.3.3.2 Acquisition of close rounded vowels

4.3.3.2.1 Language differentiation

Unlike the tense/lax vowel contrast for the unrounded vowels, the crosslinguistic difference between Russian /u/ and Scottish English /ʉ/ is realizational. In Russian the vowel is back, while in Scottish English it is front or central. Thus the crosslinguistic speech production for this vowel is directly comparable.

We found that the monolingual SSE children did not produce 100% of adult-like [ʉ] at the age concerned, but rather a range of related variants involving adult-like [ʉ], lowering and backing to [ʊ], and backing as far as [u]. Since [u] for the target /ʉ/ was possible in the SSE monolingual child speech, we cannot assess bilingual language differentiation in terms of the mere presence of the back [u] in bilingual child speech in SSE. Therefore, we assess the ranges rather than the mere presence of certain phonetic realisations.

The first question is whether the proportion of the subset of phonetic variants [ʉ] and [u] for the SSE target /ʉ/ in AN’s production differs from that of the SSE monolingual peers across age. The set-up of the test is summarised in Table 4-9.

Table 4-9 The effect of factor bilinguality of the subject AN on the production of phonetic variants [ʉ] and [u] in SSE in comparison to the SSE monolingual children.
(Label for the target /ʉ/; tokens per label: N and row %)

Bilinguality    [ʉ]            [u]          Total
No              498 (95.0%)    26 (5.0%)    524 (100.0%)
Yes             249 (99.2%)    2 (0.8%)     251 (100.0%)
Total           747 (96.4%)    28 (3.6%)    775 (100.0%)

There was a highly significant association [χ2=8.454; df=1; p<.01] in SSE between the vowel labels [u] and [ʉ] and the factor bilinguality for all the monolingual children (bilinguality=“No” in Table 4-9) in comparison across all age samples of AN (bilinguality=“Yes”). In fact, Table 4-9 shows that AN produced more adult-like SSE targets [ʉ] in comparison to the SSE monolingual children. Thus, the result means that AN produces the SSE vowel [ʉ] language-specifically, that her speech production is more mature than the overall production of the SSE peers, and that the small number of back realisations in her sample cannot be attributed to language interaction from Russian.

The second question is whether the proportion of the phonetic variants for the SSE target /ʉ/ in AN’s production is different from the proportions of the phonetic variants for the MSR target /u/ across age. In order for the MSR and SSE production to be language-specific, AN should produce a majority of the phonetic variant [ʉ] for the SSE target /ʉ/, and a majority of [u] for the MSR target /u/. Table 4-10 summarises all phonetic variants in both languages produced by AN.

To test the question statistically we derived a 2x2 contingency table from Table 4-10 with two factors: “phonetic label” (i.e. [ʉ] and [u]) and “language” (i.e. SSE and MSR), and we tested whether the proportions of these two phonetic labels differed significantly between the two languages. The result showed that the difference between the observed and the expected frequencies in each vowel category and language was highly significant [χ2=336.387; df=1; p<.01]. This result means that overall (across all longitudinal data), AN distinguished the rounded central-back vowel quality in a language-specific way: i.e.
in MSR 75.3% of her [u] is back, while the SSE [ʉ] was central in 83.3% of instances.

Table 4-10 Phonetic ranges of the MSR target /u/ and SSE /ʉ/ produced by the bilingual subject AN (tokens per label: N and row %)

Language    [i]         [ɪ]         [ʉ]            [u]            [ʊ]           Total
SSE         2 (0.7%)    2 (0.7%)    249 (83.3%)    2 (0.7%)       44 (14.7%)    299 (100.0%)
MSR         4 (1.3%)    1 (0.3%)    66 (20.9%)     238 (75.3%)    7 (2.2%)      316 (100.0%)
Total       6 (1.0%)    3 (0.5%)    315 (51.2%)    240 (39.0%)    51 (8.3%)     615 (100.0%)

4.3.3.2.2 Longitudinal perspective

We did not expect the longitudinal perspective for AN’s production of the SSE target /ʉ/ to reveal any developmental trends or any language interaction, since overall we labeled only two instances of the back realisation for the target /ʉ/ (see Table 4-10).

AN’s longitudinal production of the MSR target /u/ is presented in Table 4-11. There was no association [χ2=.198; df=2; p=.906] between the factor age and the observed frequency of phonetic labels [u] and [ʉ]. Their proportions remained stable throughout the three age samples, and the majority of AN’s MSR realisations involved adult-like [u] at all ages. The longitudinal perspective did not reveal more developmental trends than the averaged data in Section 4.3.3.2.1.

Table 4-11 Longitudinal production of phonetic variants [u] and [ʉ] for the MSR target /u/ by AN (tokens per label: N and row %)

Age      [ʉ]           [u]            Total
3;8      25 (22.5%)    86 (77.5%)     111 (100.0%)
3;10     21 (22.3%)    73 (77.7%)     94 (100.0%)
4;5      20 (20.2%)    79 (79.8%)     99 (100.0%)
Total    66 (21.7%)    238 (78.3%)    304 (100.0%)

We were interested to find out whether the overall 21.7% of the [ʉ] variant in AN’s MSR production was a developmental trend or an effect of language interaction from the SSE target /ʉ/. Therefore, we looked at the breakdown of the results for the most frequent variants [ʉ] and [u] for MSR target /u/ per carrier word; the results are summarised in Table 4-12.
There was a highly significant association [χ2=202.614; df=3; p<.01] between the carrier words used in this study and the overall observed frequency of the phonetic labels [ʉ] and [u] in AN’s MSR production. As can be seen in Table 4-12, 92.4% of AN’s [ʉ] vowels were produced in a single Russian carrier word “shut” (a joker), while the percentage of [u] for the other three carrier words is quite close to complete adult-like realisation (94.6% to 100%).

Table 4-12 The effect of carrier words on the proportions of the variants [ʉ] and [u] for the MSR target /u/ produced by the subject AN.

Carrier    Measure      [ʉ]       [u]       Total
tuz        N            2         109       111
           % carrier    1.8%      98.2%     100.0%
           % label      3.0%      46.4%     36.9%
suk        N            3         53        56
           % carrier    5.4%      94.6%     100.0%
           % label      4.5%      22.6%     18.6%
kub        N            0         58        58
           % carrier    0.0%      100.0%    100.0%
           % label      0.0%      24.7%     19.3%
shut       N            61        15        76
           % carrier    80.3%     19.7%     100.0%
           % label      92.4%     6.4%      25.2%
Total      N            66        235       301
           % carrier    21.9%     78.1%     100.0%
           % label      100.0%    100.0%    100.0%

Unlike in English, the Russian postalveolar voiceless fricative // (as in “shut” /ut/) is apical: i.e. it involves the tip of the tongue as active articulator (Bondarko, 1998), rather than laminal (tongue blade articulation) as in English (Ladefoged, 1993, p.161). The laminal articulation is associated with another palatalised postalveolar voiceless fricative in Russian, //. AN’s realisation of // in “shut” typically involved a laminal articulation [] without palatalisation (see footnote 13). Such a realisation is most unlikely for Russian monolingual children. In the course of speech acquisition Russian monolingual children often realize // as [s] or [t] (Zharkova, 2002), so that typical immature realisations of // involve palatalisation, fronting and de-apicalisation at the same time. Both // and // are acquired later than other coronal consonants; they belong to consonants with a substantial proportion of immature realisations (to the age of 3;0) compared to other consonants (Zharkova, 2002). No accounts state that apical // can be realised by children as laminal and palatalised []. Thus, AN’s laminal realisation of // in “shut” is unlikely to be due to speech immaturity similar to that of Russian monolingual children. AN’s articulation of SSE // as in “sheep” was quite mature compared to other monolingual children, so that her laminal production of Russian // could not be explained by a difficulty in acquiring postalveolar fricatives as such.

Footnote 13: Unfortunately, we did not annotate apicality or laminality of the consonants in the transcriptions. However, this statement reflects our general impression of this consonant in AN’s production, and it holds for the five instances of “shut” produced by AN which we rechecked when preparing this section.

We can conclude that AN did acquire the Russian back vowel /u/ in a quite adult-like way, if we look at the carrier words other than “shut” (a joker). The presence of non-adult-like variants [ʉ] in AN’s case was neither due to speech immaturity similar to that of Russian monolingual children, nor was it due to the contacting vowel systems as such. The language interaction occurred on one lexical item rather than being systematic. It had an effect on the vowel but might originate in the phonotactic influence of the preceding consonant. The language interaction possibly happened because the subject was not familiar enough with the carrier word, or because she had not acquired the Russian apical articulation for the sound, or because this Russian word happened to be a false cognate of the English verb “to shoot”. The laminal articulation of // is also likely to be language interaction from SSE, rather than a manifestation of speech immaturity. So in fact, it appears that language interaction might apply to the whole word, demonstrating a clear lexicalisation effect, whatever the cause.
4.3.3.3 Summary of AN’s results

The results show that AN produced language-specific vowel quality for all the vowels concerned in this study, and that she differentiated between her two languages.

First of all, AN acquired the segmental quality of the tense/lax contrast between the vowels /i/ and //. AN produced only 1.3% of // as [i], and this small proportion is comparable with the results of the SSE monolingual peers on average. Her 0.9% is in fact smaller than individual results of some SSE children.

Secondly, she acquired language-specific control of the close rounded vowels. AN’s production of the SSE // was significantly more adult-like than the average results of the SSE peers: i.e. among the [] and [u] tokens AN produced 99.2% of adult-like [] compared to the average of 95% of the SSE peers. Besides, her MSR/SSE production of the close rounded vowels was differentiated in a significant way: i.e. among all phonetic variants in SSE she produced 83% of [] for //, while in MSR she produced 75.3% of the back [u] for /u/.

However, in addition to AN’s language differentiation, we also observed two language interaction effects regarding vowel quality in her speech production. Both effects occurred in AN’s Russian rather than in SSE.

First of all, we found a significant proportion of [] variants (10%) for the Russian target /i/ (as opposed to only 1.8% in her SSE production). The presence of the vowel [] in AN’s Russian speech cannot be explained by the kind of speech immaturity reported for Russian monolingual children in the literature (Jakobson, 1941; Zharkova, 2002). Nor could its presence be explained by the phonotactic influence of a lack of palatalisation in the preceding consonant; it was, thus, due to the vowel system rather than to the acquisition of consonants. [] for /i/ appeared in all MSR carrier words involved in this study.
The production of [] for /i/ in MSR was, thus, an effect of language interaction from the SSE vowel system, and it contradicts the direction of language interaction predicted by both CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998) discussed in Sections 1.3.2.3.2-3. The longitudinal perspective revealed that the number of occurrences of [] for /i/ significantly decreased with age.

The second language interaction effect was also observed in AN’s Russian speech production. It involved the presence of the central vowel [] for the back /u/ in one specific MSR carrier word, “shut” (a joker). As such, the presence of [] for /u/ in Russian child speech should not necessarily be seen as an effect of language interaction, since it is known that Russian children acquire palatalised consonants quite early on, and, in fact, ‘over-palatalisation’ (i.e. the use of palatalised consonants for the non-palatalised ones) with a subsequent change in the following vowel quality is a sign of speech immaturity in Russian child speech (Jakobson, 1941; Zharkova, 2002). For the close back rounded vowel /u/, over-palatalisation of the preceding consonant could involve fronting towards a vowel quality similar to the SSE vowel //. However, as we observed, 92.4% of all instances of [] occurred in only one lexical item, “shut” (a joker), rather than being an overall effect. We argued that the effect on the vowel originated in the phonotactic influence of the preceding consonant //, which AN produced with a laminal articulation (typical for SSE) rather than an apical one (typical for MSR). The laminal articulation could be due to language interaction from SSE.

4.3.4 Subject BS

4.3.4.1 Acquisition of close unrounded vowels

4.3.4.1.1 Language differentiation

In this section on BS’ bilingual acquisition of vowel quality we address the same questions as those addressed for AN.
Recall that the difference between the two bilingual subjects was primarily in the amount of input that they received in the two languages: i.e. AN had a nearly equal amount in both SSE and MSR, while BS received substantially more input in Russian than in SSE.

The first question is whether the absence of the vowel [] in MSR affected BS’ production of the lax vowel in SSE in terms of the number of instances of the phonetic variants [i] and [] for the target // compared to the SSE monolingual peers.

Table 4-13 The effect of the factor bilinguality of the subject BS on the proportion of phonetic variants [i] and [] produced for the target // in comparison to the SSE monolingual children.

Bilingual?   Tokens per label   [i]      []       Total
No           N                  6        659      665
             %                  0.9%     99.1%    100.0%
Yes          N                  144      78       222
             %                  64.9%    35.1%    100.0%
Total        N                  150      737      887
             %                  16.9%    83.1%    100.0%

The set-up of the statistical test is summarised in Table 4-13. The result showed that for BS there was a highly significant association [χ2=484.609; df=1; p<.01] between the factor “bilinguality” and the proportions of phonetic labels [i] and [] for the SSE target //. The table shows that BS (bilingual “Yes” in Table 4-13) produced only 35.1% of adult-like forms compared to 99.1% for the monolingual children (bilingual “No”), while she produced 64.9% of [i] for the SSE //. Such a highly significant difference in the proportion of phonetic labels can only be accounted for by language interaction from Russian, since the lax vowel // is not featured in Russian. However, it is worth noting that BS also produced 35.1% of lax vowels [] overall. This means that she was able to produce the language-specific vowel quality, but did not produce it as systematically as the SSE monolingual children. Longitudinal results may be more revealing here.
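The association tests in this chapter are chi-square tests on contingency tables of token counts. As an illustration, the sketch below computes the Pearson chi-square statistic for the counts in Table 4-13. It assumes that no continuity correction was applied; that assumption is not stated in the text, but the uncorrected statistic should come out very close to the reported value of 484.609. The function name is hypothetical.

```python
def pearson_chi_square(table):
    """Return (chi2, df) for an r x c table of observed counts.

    Uses the plain Pearson statistic, sum((O - E)^2 / E), without any
    continuity correction (an assumption about the original analysis).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Table 4-13: rows = monolingual peers, BS; columns = [i], lax variant.
table_4_13 = [[6, 659], [144, 78]]
chi2, df = pearson_chi_square(table_4_13)
print(f"chi2 = {chi2:.3f}, df = {df}")
```

The same function applies unchanged to the larger tables in this section (for example the three-age tables), since the degrees of freedom follow from the table shape.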
The second question is whether the presence of the tense/lax vowel contrast in SSE affected BS’ production of the vowel /i/ in MSR in terms of the number of occurrences of the phonetic variants [i] and [] for the target /i/ in comparison to her own production in the SSE monolingual language mode.

Table 4-14 The effect of language on the phonetic ranges for the target /i/ produced by BS in SSE compared to MSR language modes across age samples.

Language   Tokens per label   [i]      []      Total
SSE        N                  395      7       402
           %                  98.3%    1.7%    100.0%
MSR        N                  208      0       208
           %                  100.0%   0.0%    100.0%
Total      N                  603      7       610
           %                  98.9%    1.1%    100.0%

The set-up of the test is presented in Table 4-14. 100% of all BS’ /i/ in MSR and 98.3% in SSE involved adult-like [i]. Therefore, the vowel /i/ had an adult-like vowel quality in both languages. This is not surprising, since the vowel is acquired in the second year of life in both languages (Matthews, 2002; Zharkova, 2002). Interestingly, like the SSE monolingual children, BS also produced a small proportion of lax realisations (1.7%) in SSE, comparable to the proportion of [] for /i/ produced by the SSE monolingual children (see Table 4-1). This is despite the fact that BS’ proportions of the phonetic variants [i] and [] for the lax vowel // differed from the monolingual results.

4.3.4.1.2 Longitudinal perspective

Since the proportion of adult-like realisations for target /i/ was 100% for BS, we do not need to assess this aspect longitudinally. It is, however, interesting to view the longitudinal perspective for the phonetic ranges of BS’ SSE target //, since overall they were significantly different from those of the SSE monolingual peers, with BS over-producing [i] for the target //. The question we address here is whether the proportions of phonetic labels [i] and [] for the SSE target // have an association with a specific age sampled in this study for BS (3;4, 3;10 and 4;5). The set-up of the test is presented in Table 4-15.
Table 4-15 Longitudinal production of [i] and [] for target // in SSE by the subject BS.

Age     Tokens per label   [i]      []       Total
3;4     N                  52       25       77
        %                  67.5%    32.5%    100.0%
3;10    N                  57       15       72
        %                  79.2%    20.8%    100.0%
4;5     N                  35       38       73
        %                  47.9%    52.1%    100.0%
Total   N                  144      78       222
        %                  64.9%    35.1%    100.0%

There was a highly significant association [χ2=15.872; df=2; p<.01] between the sampled ages of BS and the observed frequencies for labels [i] and [] for the SSE target // for these ages. As shown in Table 4-15, the observed frequency of tense realisations for // decreases from 67.5% at the age of 3;4 to 47.9% at the age of 4;5. The result shows that with increasing age BS produced fewer instances of [i] for //. However, even at the age of 4;5 BS’ percentage of [i] for // (47.9%) is still highly significantly different [χ2=71.257; df=1; p<.01] from that of the oldest monolingual children in our control group (3.1% in C7 and C9).

4.3.4.2 Acquisition of close rounded vowels

4.3.4.2.1 Language differentiation

The first question is whether the proportion of the phonetic variants [] and [u] for the SSE target // in BS’ production differs from that of the SSE monolingual peers across age. We assess the whole range of phonetic variants of // produced by BS in comparison to her monolingual peers. The proportions of labels [i], [], [u] and [] for // are presented in Table 4-16. Notably, BS’ 77.7% of [] for the target // is comparable to the individual ranges of SSE monolingual children presented in Table 4-3, which vary from 66.2 to 98%. Thus, BS’ proportion of adult-like [] for // in SSE is language-specific.

Table 4-16 Phonetic ranges for the SSE target // produced by BS in comparison to the SSE monolingual children (across age)

Bilingual?   Tokens per label   [i]     []       [u]      []       Total
No           N                  1       468      26       63       558
             %                  0.2%    83.9%    4.7%     11.3%    100.0%
Yes          N                  3       209      46       11       269
             %                  1.1%    77.7%    17.1%    4.1%     100.0%
Total        N                  4       677      72       74       827
             %                  0.5%    81.9%    8.7%     8.9%     100.0%

However, if we look at the most frequent non-adult-like realisations, i.e. [u] and [], we can see a substantial distributional difference in the percentages. Thus, the difference between BS and the SSE monolingual children might be in the distribution of non-adult forms rather than in the percentages of adult-like realisations compared to the non-adult-like [u]. Therefore, we need to deviate from the test that we ran for AN, and compare the two most frequent non-adult-like realisations of // as [u] and [] produced by BS and by the SSE monolingual children. The set-up of the test is presented in Table 4-17. There was a highly significant association [χ2=36.853; df=1; p<.01] between the factor “bilinguality” and the observed frequency of phonetic labels [u] and []. Among non-adult-like targets (excluding marginal [i]), on average the monolingual children produced 70.8% of [] variants, while BS produced 80.7% of back [u]. These data suggest possible language interaction from Russian.

Table 4-17 Contingency table showing the effect of the factor bilinguality on the distribution of two most frequent non-adult phonetic targets for SSE // produced by the subject BS in comparison to SSE monolingual children.

Bilingual?   Tokens per label   [u]      []       Total
No           N                  26       63       89
             %                  29.2%    70.8%    100.0%
Yes          N                  46       11       57
             %                  80.7%    19.3%    100.0%
Total        N                  72       74       146
             %                  49.3%    50.7%    100.0%

However, this highly significant result cannot be seen as definitive proof of language interaction from Russian in BS’ case, since the 17.1% of [u] in BS’ ranges is comparable to the results of individual SSE monolingual children (17.5% for C5 and 12.1% for C6 in Table 4-3).
Therefore, we can conclude that BS’ production of SSE close rounded // was acquired in a native-like way compared to the SSE monolingual children, if we look at the overall data across longitudinal results.

The second question is whether the proportion of the phonetic variants for the SSE target // in BS’ production is different from the proportions of the phonetic variants for the MSR target /u/ across her age samples. In order for the MSR and SSE production to be language-specific, BS should produce a majority of the phonetic variant [] for the SSE target //, and a majority of [u] for the MSR target /u/. Indeed, Table 4-18 shows that 77.7% of BS’ realisations of SSE // are adult-like central [] in comparison to 84% of MSR back [u] for /u/. The data in Table 4-19 showed a highly significant association [χ2=249.700; df=1; p<.01] between the factor language and the observed frequencies of the phonetic labels [u] and []. This means that BS’ production of the rounded vowels /u/ and // was language-specific: i.e. she produced the back vowel in Russian, and a central vowel quality in SSE.

Table 4-18 Phonetic ranges of the MSR adult target /u/ and SSE // produced by the bilingual subject BS.

Language mode   Tokens per label   [i]     []       [u]      []      Total
SSE             N                  3       209      46       11      269
                %                  1.1%    77.7%    17.1%    4.1%    100.0%
MSR             N                  0       41       247      6       294
                %                  0.0%    13.9%    84.0%    2.0%    100.0%
Total           N                  3       250      293      17      563
                %                  0.5%    44.4%    52.0%    3.0%    100.0%

Table 4-19 Contingency table showing the effect of language on the realisations of [] and [u] for subject BS in SSE compared to MSR language modes across her age samples.

Language mode   Tokens per label   []       [u]      Total
SSE             N                  209      46       255
                %                  82.0%    18.0%    100.0%
MSR             N                  41       247      288
                %                  14.2%    85.8%    100.0%
Total           N                  250      293      543
                %                  46.0%    54.0%    100.0%

4.3.4.2.2 Longitudinal results

We assess whether the proportions of phonetic labels [u] and [] for the SSE target // have an association with a specific age sampled for the bilingual subject BS (3;4, 3;10 and 4;5).
The set-up of the test is presented in Table 4-20. There was a highly significant association [χ2=15.210; df=2; p<.01] between the factor “age” of BS and the observed frequency of phonetic labels [u] and [] for the SSE target //. Table 4-20 shows that there was a decrease in the production of the back vowel [u] for the target // from 22.2% at the age of 3;4 to 6.3% at the age of 4;5. Importantly, the 6.3% of [u] at the age of 4;5 falls within the ranges produced by the SSE monolingual children (see Table 4-3).

The longitudinal results for the production of // in the SSE language mode in Table 4-20 also revealed that at the ages of 3;4 and 3;10 BS produced respectively 22.2% and 27.6% of the back variant [u] for // in SSE, somewhat higher than any of the SSE monolingual peers (cf. C5 in Table 4-3). However, by the age of 4;5 her production was within the monolingual range.

Table 4-20 Longitudinal production of [u] and [] for the target // in SSE by the subject BS.

Age     Tokens per label   []       [u]      Total
3;4     N                  56       16       72
        %                  77.8%    22.2%    100.0%
3;10    N                  63       24       87
        %                  72.4%    27.6%    100.0%
4;5     N                  90       6        96
        %                  93.8%    6.3%     100.0%
Total   N                  209      46       255
        %                  82.0%    18.0%    100.0%

Table 4-21 Longitudinal production of [u] and [] for the target /u/ in MSR by the subject BS.

Age     Tokens per label   []       [u]      Total
3;4     N                  21       31       52
        %                  40.4%    59.6%    100.0%
3;10    N                  19       107      126
        %                  15.1%    84.9%    100.0%
4;5     N                  1        109      110
        %                  0.9%     99.1%    100.0%
Total   N                  41       247      288
        %                  14.2%    85.8%    100.0%

The longitudinal results for the Russian monolingual language mode shown in Table 4-21 revealed a highly significant association [χ2=45.196; df=2; p<.01] between the age of BS and her use of central and back vowels in Russian. With increasing age BS reduced the number of central vowels in Russian and steadily increased the proportion of back vowels from 59.6% at the age of 3;4 to 99.1% at the age of 4;5.

All the Russian carrier words containing /u/ used in this study should be preceded by a non-palatalised consonant in the Russian adult model.
As we discussed in Section 2.2.1.1, it is known that over-use of palatalisation for the non-palatalised adult targets in children affects the quality of the following vowel (Jakobson, 1941; Zharkova, 2002). For the close rounded vowels, this means that the back vowel should become more fronted. The presence of the central vowel in BS’ Russian can potentially be explained by the over-use of preceding palatalised consonants rather than by any language interaction from SSE. Therefore, we need to investigate this potential consonantal effect on the vowel.

The results for the effect of age on the palatalisation/non-palatalisation of the consonant preceding target MSR /u/ are shown in Table 4-22. There was a highly significant association [χ2=46.139; df=2; p<.01] between the age of BS and the number of palatalised and non-palatalised consonants preceding the target /u/. There is a developmental decrease in the palatalisation of the preceding consonants in Russian from 42.3% at the age of 3;4 to 0.9% at the age of 4;5.

Table 4-22 The effect of BS’ age on the use of (non-) palatalised consonants preceding the MSR target /u/.

                                    Preceding consonant palatalised?
Age     Tokens per palatalisation   No       Yes      Total
3;4     N                           30       22       52
        %                           57.7%    42.3%    100.0%
3;10    N                           90       36       126
        %                           71.4%    28.6%    100.0%
4;5     N                           109      1        110
        %                           99.1%    0.9%     100.0%
Total   N                           229      59       288
        %                           79.5%    20.5%    100.0%

This percentage of the over-use of palatalisation by children for adult non-palatalised targets is in line with the developmental data reported in Zharkova (2002, p. 75) for a Russian monolingual girl aged 3;0. In that case, 67% of all the processes for the consonants involved the substitution of the non-palatalised consonants by the palatalised ones. This means that BS’ production of the central [] for the MSR /u/ can be explained by BS’ speech immaturity at the age of 3;4 to 3;10, rather than by any language interaction from SSE.
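The longitudinal tests above all have two degrees of freedom (three ages by two categories), and for that case the chi-square p-value has a simple closed form, p = exp(−χ2/2). The sketch below applies it to the counts of Table 4-22. It assumes an uncorrected Pearson statistic, and the helper name is hypothetical; it illustrates the arithmetic and is not a record of the software used for the thesis.

```python
import math


def chi_square(table):
    """Uncorrected Pearson chi-square and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )
    return chi2, (len(table) - 1) * (len(table[0]) - 1)


# Table 4-22: rows = ages 3;4, 3;10, 4;5; columns = not palatalised, palatalised.
table_4_22 = [[30, 22], [90, 36], [109, 1]]
chi2, df = chi_square(table_4_22)

# The closed form below is valid only for df == 2.
if df != 2:
    raise ValueError("exp(-chi2 / 2) is the chi-square p-value only for df = 2")
p_value = math.exp(-chi2 / 2.0)

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.2e}")
```

For tables with other degrees of freedom the closed form does not hold and a general chi-square distribution function would be needed.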
4.3.4.3 Summary of BS’ Results

The results of BS’ acquisition of vowel quality form a mirror image of AN’s patterns of acquisition in many respects. Being a Russian-dominant bilingual, BS had not acquired the tense/lax contrast in a way similar to the monolingual peers or to the more balanced bilingual AN. Overall, BS produced only 35.1% of the adult-like forms of [] for // in comparison to 99.1% of adult-like forms produced by the monolingual peers. The non-adult-like realisations primarily involved [i] for // (64.9%). This pattern, called ‘underdifferentiation of phonemes’ by Weinreich (1953), is frequently reported in the L2 acquisition literature (Panasyuk et al., 1995; Escudero, 2000; Flege, 2002; Piske et al., 2002). Unlike AN’s language interaction pattern for this set of vowels, the direction of the language interaction did not contradict the direction predicted by either CCCH (Döpke, 1998; Döpke, 2000) or the Markedness Hypothesis (Müller, 1998), and it was also compatible with BS’ language input conditions (less exposure to English than AN).

We also showed that the difference between BS and the SSE monolingual peers in the overall production of the vowels /i/ and // was not that BS was unable to produce an adult-like [] for //. In fact, we could reverse the argument and say that she did produce [] in 35% of all // cases. In that sense, we can state that BS did differentiate between her two languages, but she also showed a considerable amount of language interaction from Russian.

Similarly, the longitudinal perspective showed that BS was in the process of gradual acquisition of the tense/lax vowel quality. We showed that with increasing age BS produced more instances of adult-like [] for //, rising from 32.5% at the age of 3;4 to 52.1% at the age of 4;5.
It is very interesting to compare BS’ acquisition pattern for the tense/lax contrast to her production of the close rounded vowels // and /u/, which form a realizational difference between SSE and MSR. Unlike for the systemic tense/lax difference, for these vowels BS did differentiate between the two languages in a native-like way. BS’ overall number of adult-like and non-adult-like forms for the SSE target // was not significantly different from that of the SSE monolingual children. Overall she produced 77.7% of adult-like [] for // in SSE, and 84% of [u] for /u/ in MSR. The 13.9% of [] in MSR was not due to language interaction from SSE, but in fact followed the Russian monolingual pattern of ‘over-palatalisation’ reported in the acquisition literature (Trubetskoy, 1939; Zharkova, 2002). The amount of non-adult forms was comparable to the monolingual ranges in SSE.

Despite the language differentiation we also observed a language interaction pattern in SSE. It concerned BS’ ranges of phonetic variation for the SSE target //, namely the proportion of the back vowel [u] tokens for // compared to that of the SSE monolingual peers. With increasing age BS reduced the percentage of the close back rounded [u] in SSE from 22.2% and 27.6% (at the ages of 3;4 and 3;10) to 6.3% at the age of 4;5, a proportion comparable to the SSE monolingual ranges. We can therefore conclude that BS had acquired the crosslinguistic difference between SSE // and MSR /u/ by the age of 4;5.

This finding seems to reinforce the idea that not all ambiguous sound structural properties are equally prone to language interaction. The realizational phonological difference between SSE // and MSR /u/ seems to be less difficult to acquire than the systemic tense/lax difference. However, the acquisition of this realizational difference by the Russian-dominant bilingual BS may be mediated by the richer phonetic continuum in Russian compared to the number of phonological categories: i.e. MSR features palatalisation which triggers fronting of the main allophone [u] towards a more central vowel similar in quality to the main allophone in SSE.

5 Acquisition of Vowel Duration

5.1 Introduction

This chapter presents results on monolingual and bilingual acquisition of postvocalic conditioning of vowel duration. In this chapter the term ‘postvocalic conditioning’ refers to how the consonant following a vowel systematically conditions the duration of the vowel.

In the literature review on monolingual acquisition (Section 2.3.1) we concluded that despite the evidence suggesting that English-speaking children master postvocalic conditioning by the age of 3;0 to 5;0, there is little empirical evidence to confirm the acquisition of the Scottish Vowel Length Rule for the SSE-speaking children at this age. There is also little empirical evidence on the patterns of bilingual acquisition of postvocalic conditioning involving either language differentiation or interaction. This chapter aims to shed more light on these issues.

We address the acquisition patterns for three variables concerning vowel duration: (1) the acquisition of postvocalic conditioning of the SSE and MSR vowel /i/; (2) the acquisition of SSE ‘invariably short’ postvocalic conditioning for the vowel // compared to the differentiated pattern of the SSE /i/; and (3) the acquisition of postvocalic conditioning for the SSE vowel // and MSR /u/.

First of all, we compare the crosslinguistic differences in the variables between adult speakers: i.e. Russian (n=5), Scottish Standard English (n=5) and Southern Standard British English (n=4). MSR is not included in any tests involving the lax vowel //, since the vowel is not featured there. The idea behind testing the adult models is to pinpoint substantial crosslinguistic differences between SSE and MSR in the postvocalic conditioning of vowel duration before testing the bilingual acquisition of these patterns.
The SSBE adult model is kept in mind for possible cross-varietal influences in the speech of bilingual and monolingual children.

Subsequently, we assess the issue of monolingual acquisition of the postvocalic conditioning of vowel duration by comparing SSE adult speech to the data of the SSE monolingual children.

To account for the bilingual acquisition patterns, we compare the SSE speech of each bilingual subject to that of the SSE monolingual peers (n=10, a cross-section of 7 individual cases plus 3 longitudinal cases C3, C4 and C7) to establish similarities and differences between the two groups. This should allow us to determine any language interaction patterns in SSE. Besides, bilingual language differentiation is assessed by comparing each subject’s speech production in SSE to her own speech production in MSR.

We perform no direct statistical comparison between adults and bilingual subjects, because the two groups differ from each other along several dimensions including age and bilinguality. However, we do consider individual patterns of the bilingual subjects in relation to the patterns of the Russian-speaking mother, the Russian-speaking experimenter (Gordeeva) and the adult group results to assess the native-likeness of the MSR pattern in a descriptive way. The speech of the experimenter should be indicative of how the changing mode of data elicitation (spontaneous play with the child) may affect the adult vowel duration patterns (recorded while reading out utterances from a computer screen).

5.2 Data Analysis

The variables on the postvocalic conditioning of vowel duration form numerical data (annotated vowel duration, ms). Labelling procedures have been described in Section 3.6.2.2.
During data annotation (see Section 3.6.2), we indicated whether or not a token carried a pitch accent and in what position it occurred in the utterance, and we assigned a broad intonational modality to each utterance (non-emphatic versus emphatic statement, yes/no or WH-questions). We also have information on f0 (Hz) and formant frequencies. In order to achieve maximal comparability between adult and child data, we tried to make the data-sets as uniform as possible by selecting the subsets of tokens which:

- carried a pitch accent (were not de-accented);
- occurred in phrase final and medial positions (not phrase initial), or in single word utterances;
- were produced with detectable f0 (i.e. f0 > 0) and valid formant structure (see exclusion criteria in Section 3.6.3.2);
- were produced as non-emphatic statements.

All durational measurements of child speech are averaged based on phonological adult targets (rather than phonetic) unless stated otherwise. The averages are mainly based on median values (if the statistical test used allowed that) for both adults and children to achieve more comparability between the data, since the elicitation method was different (adults read out utterances from a computer screen, and children played games) and the children had more variable speech production, which affected the distributions of the data.

The acquisition of postvocalic conditioning patterns in SSE, MSR and SSBE was tested by means of Analysis of Variance (ANOVA) with different set-ups, namely mixed design or multivariate, depending on the variables and number of subjects. The set-up is explained for each variable and subject group separately. All reported F-values were ‘Greenhouse-Geisser epsilon’ corrected.

The statistical analyses were performed separately for each of the bilingual subjects, since their language input situations were too different to treat them as a group.
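The token selection criteria and the median-based averaging described above can be sketched as a simple filter followed by grouping. This is only an illustration under stated assumptions: the `Token` record, its field names, the category strings and the sample values are all hypothetical and do not reflect the annotation format actually used in the study.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class Token:
    """One annotated vowel token (hypothetical record layout)."""
    speaker: str
    target_vowel: str      # phonological adult target, not the phonetic label
    following: str         # class of the following consonant
    duration_ms: float
    accented: bool         # carries a pitch accent
    position: str          # "initial", "medial", "final" or "single_word"
    f0_hz: float           # 0.0 when no f0 was detected
    valid_formants: bool
    modality: str          # "statement", "emphatic", "yes_no_question", ...


def is_selected(token):
    """Apply the four selection criteria listed in the text."""
    return (
        token.accented
        and token.position in {"medial", "final", "single_word"}
        and token.f0_hz > 0
        and token.valid_formants
        and token.modality == "statement"
    )


def median_durations(tokens):
    """Median duration per (speaker, target vowel, following consonant)."""
    groups = defaultdict(list)
    for token in tokens:
        if is_selected(token):
            key = (token.speaker, token.target_vowel, token.following)
            groups[key].append(token.duration_ms)
    return {key: median(values) for key, values in groups.items()}


# Made-up sample tokens, for illustration only.
sample_tokens = [
    Token("S1", "i", "voiced_fricative", 210.0, True, "final", 220.0, True, "statement"),
    Token("S1", "i", "voiced_fricative", 190.0, True, "medial", 215.0, True, "statement"),
    Token("S1", "i", "voiced_fricative", 400.0, True, "final", 230.0, True, "emphatic"),
    Token("S1", "i", "voiced_fricative", 150.0, False, "final", 200.0, True, "statement"),
    Token("S1", "i", "voiceless_stop", 100.0, True, "initial", 205.0, True, "statement"),
    Token("S1", "i", "voiceless_stop", 105.0, True, "single_word", 210.0, True, "statement"),
]

results = median_durations(sample_tokens)
for key, value in sorted(results.items()):
    print(key, value)
```

Using the median rather than the mean makes the per-speaker value less sensitive to the occasional very long or very short token, which is the motivation given in the text for the more variable child data.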
The group of the SSE monolingual children (10 cases) was split up into three age subgroups for each bilingual child separately, to match more closely the individual age ranges of the subjects in the longitudinal samples. All the individual results of the monolingual children, as well as comparisons of bilingual results and individual adults, are analysed descriptively.

5.3 Acquisition of Vowel Duration

5.3.1 A comparison of adult models

5.3.1.1 Vowel /i/

We examine the crosslinguistic differences in the postvocalic conditioning (by voiceless stop, voiced stop or voiced fricative) of the duration of the vowel /i/ between MSR, SSE and SSBE.

The median values for the duration (ms) of /i/ for each speaker were entered in a mixed design ANOVA with “LANGUAGE” (SSE, SSBE, MSR) as a between-subject factor and the “FOLLOWING CONSONANT” as a within-subject factor. The “FOLLOWING CONSONANT” factor had three levels: i.e. voiced fricative, voiced stop and voiceless stop.

The results showed that there was a highly significant main effect of the following consonant on the duration of the vowel /i/ [F(2,22)=77.152; p<.01]. There was also a highly significant main effect of the factor “LANGUAGE” [F(2,11)=41.133; p<.01], showing that the durational means are different between the languages. Tukey HSD posthoc tests for the factor “LANGUAGE” showed that all three languages were highly significantly different from each other (p<.01). Besides, there was a highly significant interaction between the factor “LANGUAGE” and the “FOLLOWING CONSONANT” [F(4,22)=16.943; p<.01].

The direction of the crosslinguistic differences is plotted in Figure 5-1. The mean duration and standard deviations for /i/ per consonantal context and language averaged for all the speakers are found in Table 5-1. Figure 5-1 shows that the direction of the main effect of the following consonant on the duration of /i/ is parallel in the three languages.
However, as shown by the interaction between the language and following consonants, the extent of postvocalic conditioning differed significantly depending on the language. There were significant differences between SSE and SSBE in the implementation of duration before voiced stops: i.e. this context triggered long duration in SSBE and short duration in SSE. There were significant crosslinguistic differences between SSE and MSR in the implementation of duration before voiced fricatives: i.e. this context triggered long duration in SSE and relatively short duration in MSR. Generally in MSR, the vowels remained relatively short irrespective of the following consonant.

Appendix C sums up the individual results per speaker and language: mean and median duration (ms), the number of tokens and the standard deviation. Appendix D sums up the language results: mean and median duration (ms), the number of tokens and the standard deviation.

The voiceless stop/voiced fricative (VLS/VF) ratio was .5 for SSE, .84 for MSR and .54 for SSBE, and the voiceless stop/voiced stop (VLS/VS) ratio was .84 for SSE, .89 for MSR and .63 for SSBE (see Appendix T for the overview of the ratios for all speakers, adults and children). These results confirm previous reports on the extent of postvocalic conditioning of vowel duration in SSE (McKenna, 1988; Scobbie et al., 1999a; Scobbie et al., 1999b) and MSR (Chen, 1970; Gordeeva et al., 2003), and the cross-varietal differences between SSE and SSBE (Scobbie, 2002).

[Figure: bar chart of duration (ms) + 1 Stdev by language (SSE, MSR, SSBE); series: voiced fricative, voiced stop, voiceless stop]

Figure 5-1 Mean duration and standard deviation of the vowel /i/ in the three languages (SSE, MSR and SSBE) in the contexts before voiced fricatives, voiced stops and voiceless stops produced by monolingual adults.

Table 5-1 Mean duration and standard deviation of the vowel /i/ (ms) for three right consonantal contexts per language averaged for all the adult speakers.
Following consonant   Language   Mean duration (ms)   Std. Deviation   n of subjects
voiced fricative      SSE        209                  24               5
                      MSR        106                  20               5
                      SSBE       285                  44               4
voiced stop           SSE        125                  16               5
                      MSR        100                  10               5
                      SSBE       243                  54               4
voiceless stop        SSE        106                  11               5
                      MSR        89                   11               5
                      SSBE       153                  17               4

5.3.1.2 Vowel //

We examined the crosslinguistic difference in the influence of the right consonantal context on the duration of // in stressed syllables in similar phrase positions. The set-up of the ANOVA was the same as in Section 5.3.1.1, except that the between-subject factor “LANGUAGE” had only two levels (SSE and SSBE), since MSR does not feature //, and the within-subject factor “FOLLOWING CONSONANT” had three levels: i.e. voiceless fricative, voiced stop, voiced fricative.

The results showed that in both English varieties there was a highly significant main effect of the “FOLLOWING CONSONANT” on the duration of the vowel // [F(2,14)=15.826; p<.01]. The result means that postvocalic conditioning for // operates systematically in both SSE and SSBE. Besides, there was a significant main effect of the factor “LANGUAGE” [F(2,7)=9.772; p<.05], indicating that the overall durational means are different in each variety. There was also a significant interaction between the factors “LANGUAGE” and the “FOLLOWING CONSONANT” [F(2,14)=5.363; p<.05]. This interaction means that the conditioning depends on the following consonant and is implemented differently in SSE and SSBE.

Mean duration and standard deviations for // per consonantal context and language averaged for all the speakers are shown in Table 5-2. The direction of the differences between the two English varieties is shown in Figure 5-2. Individual results per speaker are found in Appendix C.

Figure 5-2 shows that SSE and SSBE appear to differ both in the extent and, more clearly, in the contexts of the postvocalic conditioning for //. Voiceless fricatives following the vowel trigger the shortest duration in both varieties.
However, in SSE there is only a slight increase in vowel duration before voiced stops and fricatives compared to that before voiceless fricatives, whereas in SSBE the increase in both voiced contexts is substantial. The voiceless fricative/voiced stop (VLF/VS) ratio is .91 in SSE and .68 in SSBE, while the corresponding voiceless fricative/voiced fricative (VLF/VF) ratios are .9 and .69. Looking at individual variation of the adult speakers (values derived from Appendix C), both VLF/VF and VLF/VS ratios are consistently smaller than 1 for all four SSBE speakers. However, in SSE only the VLF/VF ratio is consistently less than 1 for all individual speakers, while the VLF/VS ratio varies from 0.77 to 1.0, with two speakers having values greater than one.

Figure 5-2 Durational means (ms) in all SSE versus SSBE adults of the vowel // in the contexts before voiced fricatives, voiced stops and voiceless fricatives.
[Chart. y-axis: mean duration (ms); x-axis: following consonant (voiced fricative, voiced stop, voiceless fricative); series: SSE, SSBE.]

Table 5-2 Mean duration and standard deviation of the vowel // (ms) in three right consonantal contexts per language (SSE or SSBE) averaged for all the speakers.

Following consonant   Language   Mean duration (ms)   Std. deviation   n of subjects
voiced fricative      SSE        119                  13               5
voiced fricative      SSBE       172                  37               4
voiced stop           SSE        105                  13               5
voiced stop           SSBE       177                  47               4
voiceless fricative   SSE        95                   19               5
voiceless fricative   SSBE       120                  24               4

The VLF/VS ratio of .91 and the VLF/VF ratio of .9 for SSE in our study are closer to 1, i.e. they indicate a smaller extent of conditioning, than the corresponding .87 and .72 ratios derived from McKenna (1988). The data from Agutter (1988) gave similar results to McKenna’s (1988) study. The length of utterances used for the analysis in these studies can plausibly explain the difference in the ratios: i.e. both McKenna and Agutter recorded words in isolation, while we recorded carrier words embedded in sentences.
In a longer utterance the extent of postvocalic conditioning for // becomes smaller due to speaking rate differences and a more spontaneous mode of data elicitation.

Our results confirm previous reports on the relatively small phonetic extent of postvocalic conditioning in the SSE // (Agutter, 1988; McKenna, 1988), and equally confirm our analysis in Section 2.1.4 on the crosslinguistic difference between SSE and SSBE, based on the data for General American (House, 1961; Peterson & Lehiste, 1960).

Additionally, empirical data on the postvocalic conditioning of duration of // in McKenna (1988), Agutter (1988) and our own data support the fact that, unlike /i/, the lax vowel // is relatively short before the three consonantal contexts considered; but that there is still a small but systematic extent of postvocalic conditioning applicable to the vowel depending on the voicing and the manner of articulation of the following consonant. This means that Aitken’s (1981) definition of the SSE lax vowel as being ‘invariably short’ needs refinement, since the phonological ‘invariability’ does not hold at the phonetic level.

5.3.1.3 Close rounded vowels

We examine the crosslinguistic patterns of the postvocalic conditioning (voiceless stop, voiced stop or voiced fricative) of the duration of the rounded vowels /u/, // and // between MSR, SSE and SSBE. The set-up of the ANOVA was the same as for the vowel /i/ in Section 5.3.1.1. We expected the extent of crosslinguistic differences between the consonantal conditioning of the duration of the close rounded vowels to be similar to that found for /i/.

The results showed that there was a highly significant main effect of the following consonant on the duration of the close rounded vowels [F(2,22)=110.025; p<.01]. The result means that postvocalic conditioning patterns are systematically different in different consonantal contexts. There was a highly significant main effect of the factor “LANGUAGE” [F(2,11)=29.044; p<.01].
This effect showed that the overall durational means are different between the languages. Tukey HSD post-hoc tests for the factor showed that all the language pairs significantly differed from each other (p<.05). There was also a highly significant interaction between the factor “LANGUAGE” and the vowel duration as a function of the “FOLLOWING CONSONANT” [F(4,22)=33.959; p<.01], showing that the duration of the rounded vowel depends on the language in different consonantal contexts. The direction of the crosslinguistic differences is shown in Figure 5-3.

Figure 5-3 Mean duration (ms) and standard deviation of the close rounded vowels in the three languages (SSE, MSR and SSBE) in the contexts before voiced fricatives, voiced stops and voiceless stops produced by monolingual adults.
[Bar chart. y-axis: mean duration + 1 standard deviation (ms); x-axis: language (SSE, MSR, SSBE); series: voiced fricative, voiced stop, voiceless stop.]

Table 5-3 Mean duration and standard deviation of close rounded vowels (ms) as a function of the following consonant averaged for all the SSE, MSR and SSBE adult speakers.

Following consonant   Language   Mean duration (ms)   Std. deviation   n of subjects
voiced fricative      SSE        214                  30               5
voiced fricative      MSR        115                  18               5
voiced fricative      SSBE       269                  45               4
voiced stop           SSE        118                  10               5
voiced stop           MSR        98                   6                5
voiced stop           SSBE       253                  52               4
voiceless stop        SSE        108                  10               5
voiceless stop        MSR        97                   6                5
voiceless stop        SSBE       122                  22               4

The mean duration and standard deviations for the rounded vowels per consonantal context and language averaged for all the speakers are shown in Table 5-3. Individual results per speaker are found in Appendix C.

The results confirm that the crosslinguistic implementation of postvocalic conditioning for close rounded vowels shown in Figure 5-3 was very similar to that of /i/ (Figure 5-1). As expected, the duration of close rounded vowels before voiceless stops was rather short in all three languages. The main crosslinguistic differences in duration between MSR and SSE occurred before voiced stops and voiced fricatives.
Similarly to /i/, there were clear crosslinguistic differences between SSE and SSBE in the implementation of vowel duration before voiced stops: i.e. this context triggered long duration in SSBE and short duration in SSE. In Russian, vowels remained relatively short irrespective of the following consonant, though there was a slight increase of the vowel duration before voiced fricatives.

For the close rounded vowels, the VLS/VF ratio is .5 for SSE, .84 for MSR and .46 for SSBE. The VLS/VS ratio is .92 for SSE, .99 for MSR and .48 for SSBE. Similarly to /i/, these results confirm previous reports on the extent of postvocalic conditioning of vowel duration in SSE (McKenna, 1988; Scobbie et al., 1999a; Scobbie et al., 1999b) and MSR (Chen, 1970; Gordeeva et al., 2003), and the cross-varietal differences between SSE and SSBE (Scobbie, 2002).

5.3.1.4 Summary of results for monolingual adults

The results of the between-language analysis of variance showed that there were significant differences in the implementation of postvocalic conditioning of vowel duration between MSR, SSE and SSBE for all the vowels concerned.

The results for SSE confirm empirical evidence for the operation of the Scottish Vowel Length Rule in SSE (Agutter, 1988; McKenna, 1988; Scobbie et al., 1999a; Scobbie et al., 1999b). In agreement with these studies, our data showed that both monophthongs /i/ and // have long duration before voiced fricatives, as opposed to their short duration in the contexts before voiced and voiceless stops. All adults consistently showed these patterns.

The results for MSR showed only a slight overall increase in vowel duration as a function of the following consonant for the adults. This result confirms the previous report of such an increase in Chen (1970).
However, it is important to note that the context-dependent increase in duration was different for the vowels /i/ and /u/ in our data, and that the individual speakers varied and deviated from this trend in several instances (Appendix D). This shows that postvocalic conditioning is not an obligatory phonetic property in Russian, and that we should be careful when comparing the Russian language results of bilingual children to the mean results of Russian adults as a group. Their speech production can rather be compared crosslinguistically and to the individual speech production of their parents (the mother in this case), and they could still differ from their parents. The biggest difference in postvocalic conditioning between Russian and SSE is not necessarily in the pattern, as both could coincide, but rather in the extent, especially in the context before voiced fricatives.

The results for SSBE primarily confirm data based on American English (Peterson & Lehiste, 1960; House, 1961). The vowels /i/ and /u/ are short before voiceless stops and long before voiced fricatives and stops; thus the increase in duration is conditioned purely by the voicing of consonants rather than by a combination of voice and manner of articulation as in SSE. SSBE has long duration before voiced stops as opposed to short duration in SSE.

The results for the lax vowel // confirm that SSE and SSBE have a differential implementation of postvocalic conditioning before voiced fricatives, voiced stops, and voiceless fricatives. In SSE, the context-dependent increase of duration is very small: the VLF/VS ratio is .91 and the VLF/VF ratio .9. This confirms the previous reports (Aitken, 1981; Scobbie, 2002) that the vowel can be considered as phonologically short regardless of the following consonant. At the phonetic level, however, our data as well as the data from Agutter (1988) and McKenna (1988) show context-dependent variability (Aitken, 1981) of the duration of //.
The duration of // systematically (though marginally) increases in the context of voiced stops and voiced fricatives compared to the voiceless context, and individual SSE adults are consistent in realising this pattern.

5.3.2 SSE monolingual acquisition

5.3.2.1 Vowel /i/

5.3.2.1.1 Group results

This section addresses the question whether the SSE monolingual children participating in this study acquired the SVLR pattern for the vowel /i/ in patterns similar to the SSE adults.

The median values for the duration (ms) of /i/ for each monolingual SSE speaker (15 subjects including the three longitudinal cases) were entered in a mixed design ANOVA with “AGE” (adult; child aged 3;4 to 3;11; child aged 4;0 to 4;4; child aged 4;5 to 4;9) as a between-subject factor and the “FOLLOWING CONSONANT” as a within-subject factor. The factor “FOLLOWING CONSONANT” had three levels: i.e. voiced fricative, voiced stop and voiceless stop.

The results showed that there was a highly significant main effect of the following consonant on the duration of the vowel /i/ [F(2,22)=113.852; p<.01] in all groups. The direction of the main effect was parallel in all age groups and it is shown in Figure 5-4. As expected, the context of voiced fricatives triggered the longest duration of the preceding vowel, while the vowel remained relatively short in the contexts before voiced and voiceless stops. This highly significant main effect means that the SSE monolingual children acquired the SVLR pattern for the vowel /i/.

Furthermore, there was a highly significant main effect of the factor “AGE” [F(3,11)=10.169; p<.01] on the vowel duration. This means that absolute differences between the age groups in vowel duration were systematic. Figure 5-4 shows that the main difference between the groups was contributed by the overall higher duration means in the child groups compared to adults.
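The degrees of freedom reported for this analysis follow from the design, which offers a quick consistency check. The sketch below is an illustration only: the function name `mixed_anova_df` is hypothetical and not part of the original analysis, and it assumes one between-subject factor, one within-subject factor and no sphericity correction.

```python
def mixed_anova_df(n_subjects, n_groups, n_levels):
    """Degrees of freedom for a mixed design with one between-subject
    factor (n_groups) and one within-subject factor (n_levels).
    Assumes no sphericity correction is applied."""
    between = n_groups - 1                 # e.g. AGE
    between_error = n_subjects - n_groups  # subjects within groups
    within = n_levels - 1                  # e.g. FOLLOWING CONSONANT
    interaction = between * within
    within_error = between_error * within
    return {
        "between": (between, between_error),
        "within": (within, within_error),
        "interaction": (interaction, within_error),
    }

# 15 SSE speakers, 4 age groups, 3 consonantal contexts
df = mixed_anova_df(n_subjects=15, n_groups=4, n_levels=3)
for effect, (df_effect, df_error) in df.items():
    print(f"{effect}: F({df_effect},{df_error})")
```

For 15 speakers in four age groups and three consonantal contexts, the sketch gives (2, 22) for the following consonant and (3, 11) for age, in line with the F statistics quoted in this section.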
Figure 5-4 Mean duration of the vowel /i/ (ms) as a function of the following consonant in four age groups of the SSE monolingual speakers.
[Chart. y-axis: duration of /i/ (ms); x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); series: SSE adult, child 3;4 to 3;11, child 4;0 to 4;4, child 4;5 to 4;9.]

Table 5-4 Mean duration and standard deviation for the SSE vowel /i/ as a function of the following consonant in four age groups of the SSE monolingual controls.

Following consonant   SSE age group       Mean duration (ms)   Std. deviation   n of subjects
voiced fricative      adult               209                  24               5
voiced fricative      child 3;4 to 3;11   322                  26               3
voiced fricative      child 4;0 to 4;4    359                  49               5
voiced fricative      child 4;5 to 4;9    295                  51               2
voiced fricative      Total               293                  73               15
voiced stop           adult               125                  16               5
voiced stop           child 3;4 to 3;11   178                  28               3
voiced stop           child 4;0 to 4;4    194                  53               5
voiced stop           child 4;5 to 4;9    153                  13               2
voiced stop           Total               162                  44               15
voiceless stop        adult               106                  11               5
voiceless stop        child 3;4 to 3;11   158                  19               3
voiceless stop        child 4;0 to 4;4    166                  54               5
voiceless stop        child 4;5 to 4;9    132                  9                2
voiceless stop        Total               140                  41               15

To establish which groups contributed to the significance of “AGE” we ran Tukey HSD post-hoc tests for the age effects on the duration of the vowel /i/. The results of the tests are shown in Table 5-5. The significant effects are marked with an asterisk (*) in the column “(J) Age”. The post-hoc tests revealed that the age effect was only significant (p<.05) between adults and the two youngest age groups (3;4 to 3;11 and 4;0 to 4;4). There was no significant difference between adults and the older children (4;5 to 4;9). This means that there was a significant longitudinal effect, and that the SVLR pattern for the vowel /i/ approaches the adult form at the age of 4;5.

Table 5-5 Results of Tukey HSD post-hoc tests for the differences between age groups within SSE monolingual controls.

(I) Age             (J) Age              Mean difference (I-J)   Std. error
adult               child 3;4 to 3;11*   -72.56                  20.21
adult               child 4;0 to 4;4*    -93.28                  17.51
adult               child 4;5 to 4;9     -46.95                  23.15
child 3;4 to 3;11   adult*               72.56                   20.21
child 3;4 to 3;11   child 4;0 to 4;4     -20.72                  20.21
child 3;4 to 3;11   child 4;5 to 4;9     25.61                   25.26
child 4;0 to 4;4    adult*               93.28                   17.50
child 4;0 to 4;4    child 3;4 to 3;11    20.72                   20.21
child 4;0 to 4;4    child 4;5 to 4;9     46.33                   23.15
child 4;5 to 4;9    adult                46.95                   23.15
child 4;5 to 4;9    child 3;4 to 3;11    -25.61                  25.26
child 4;5 to 4;9    child 4;0 to 4;4     -46.33                  23.15
* The mean difference is significant at the .05 level.

Finally, there was a significant interaction between the factors “AGE” and “FOLLOWING CONSONANT” [F(6,22)=2.658; p<.05]. The interaction means that the extent of age differences in vowel duration depended on the following consonant. Figure 5-4 shows that the largest age differences in vowel duration occurred in the context before voiced fricatives, where the three child groups had longer duration compared to adults. There were no other significant main effects or interactions.

5.3.2.1.2 Individual results

Since the speech production of the bilingual children (AN and BS) is assessed individually, it is worthwhile considering the ranges of individual variation of the SSE monolingual children. Individual results of the monolingual children are plotted in Figure 5-5. Individual descriptive statistics are reported in Appendix E.

Figure 5-5 shows that all seven SSE monolingual children (individual children are plotted by increasing age on the x-axis) had an SVLR-like pattern, with /i/ before voiced fricatives being about twice as long as before other consonants. The three monolingual children who were recorded longitudinally, i.e. C3 (3;4 and 3;11), C7 (4;2 and 4;8) and C4 (3;8 and 4;1), showed a stable SVLR pattern in all age samples.
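Because the individual analyses rest on one median duration per speaker and consonantal context, the following sketch shows one way such medians could be derived from token-level measurements. The speaker label `C_hyp` and all durations are invented for illustration and are not taken from the thesis data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical token-level measurements: (speaker, following consonant, duration in ms)
tokens = [
    ("C_hyp", "voiced fricative", 300.0),
    ("C_hyp", "voiced fricative", 320.0),
    ("C_hyp", "voiced fricative", 340.0),
    ("C_hyp", "voiced stop", 150.0),
    ("C_hyp", "voiced stop", 170.0),
    ("C_hyp", "voiceless stop", 100.0),
    ("C_hyp", "voiceless stop", 120.0),
]

def speaker_medians(rows):
    """Group durations by (speaker, context) and return the median of each group."""
    grouped = defaultdict(list)
    for speaker, context, duration_ms in rows:
        grouped[(speaker, context)].append(duration_ms)
    return {key: median(values) for key, values in grouped.items()}

medians = speaker_medians(tokens)
for (speaker, context), value in sorted(medians.items()):
    print(f"{speaker} {context}: {value:.0f} ms")
```

Medians are less sensitive than means to occasional very long or very short tokens, which is one plausible reason for preferring them with small and uneven token counts per child.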
Figure 5-5 Individual results of SSE monolingual children on the duration of /i/ as a function of the following consonant.
[Chart. y-axis: median duration (ms); x-axis: SSE monolingual children by increasing age (C3_3;4, C4_3;8, C3_3;11, C6_4;0, C5_4;0, C4_4;1, C7_4;2, C8_4;2, C7_4;8, C9_4;9); series: voiced fricative, voiced stop, voiceless stop.]

In the context of /i/ before voiced stops the duration was relatively short in most subjects compared to voiced fricatives. However, the relationship of the median duration of /i/ between the contexts before voiced stops compared to that before voiceless stops was quite variable between the 10 cases: i.e. some decrease, some increase, which supports general claims for SSE that both voiced and voiceless stops condition short vowels (Aitken, 1981; Scobbie et al., 1999a; Scobbie et al., 1999b; Scobbie, 2002).

There seems to be no developmental pattern in the SVLR with increasing age, supporting the results in the previous section. The youngest children had already acquired adult-like postvocalic conditioning of the vowel /i/. The VLS/VF ratios of the children ranged from .31 to .62. The figures and patterns of the individual children confirm the significant group results on the acquisition of the SVLR pattern.

5.3.2.2 Vowel //

5.3.2.2.1 Group results

This section addresses the question whether the SSE monolingual children acquired the phonetically short postvocalic conditioning pattern for the lax vowel // in a way similar to the SSE adults.

The set-up of the ANOVA was the same as in Section 5.3.2.1.1, except that the within-subject factor “FOLLOWING CONSONANT” had three levels: voiced fricative, voiced stop and voiceless fricative.

The results showed that there was a highly significant main effect of the factor “FOLLOWING CONSONANT” on the duration of the vowel // [F(2,22)=14.05; p<.01]. The mean results for duration and standard deviations are presented in Table 5-6 for the four age groups.
The direction of the effect of the following consonant on vowel duration was parallel in all age groups (see Figure 5-6). This result means that the SSE monolingual children had acquired the postvocalic conditioning for //.

However, there was also a highly significant main effect of the factor “AGE” on the duration of the vowel across all consonantal contexts [F(3,11)=11.26; p<.01]. This effect means that absolute differences between the age groups in vowel duration were systematic. The difference between the age groups is shown in Figure 5-6. Similarly to /i/, children in all the age groups had higher mean duration values than the adult group.

To establish which groups contributed to the significance of the main effect of the factor “AGE” we ran Tukey HSD post-hoc tests for the age effects on the duration of the vowel //. The results are shown in Table 5-7. The significant effects are marked with an asterisk (*) in the column “(J) Age”. The post-hoc tests revealed that the age effect was only significant (p<.05) between adults and the two youngest groups aged 3;4 to 3;11, and 4;0 to 4;4.

Figure 5-6 Mean duration of the vowel // as a function of the following consonant in four SSE monolingual age groups.
[Chart. y-axis: duration (ms); x-axis: following consonant; series: SSE adult, child 3;4 to 3;11, child 4;0 to 4;4, child 4;5 to 4;9.]

Table 5-6 Mean duration and standard deviation for the SSE vowel // as a function of the following consonant for each age group of the SSE monolingual controls.

Following consonant   Age                 Mean duration (ms)   Std. deviation   n of subjects
voiced fricative      adult               119                  13               5
voiced fricative      child 3;4 to 3;11   226                  38               3
voiced fricative      child 4;0 to 4;4    192                  21               5
voiced fricative      child 4;5 to 4;9    162                  20               2
voiced fricative      Total               170                  47               15
voiced stop           adult               105                  13               5
voiced stop           child 3;4 to 3;11   191                  35               3
voiced stop           child 4;0 to 4;4    178                  59               5
voiced stop           child 4;5 to 4;9    131                  40               2
voiced stop           Total               150                  53               15
voiceless fricative   adult               95                   19               5
voiceless fricative   child 3;4 to 3;11   153                  20               3
voiceless fricative   child 4;0 to 4;4    130                  21               5
voiceless fricative   child 4;5 to 4;9    116                  24               2
voiceless fricative   Total               121                  29               15

Similarly to /i/, there was no significant difference between adults and the older children aged 4;5 to 4;9. The result is similar to that for the acquisition of SVLR in /i/, and means that the short postvocalic conditioning for the vowel // settles in an adult-like form by the age of 4;5. There were no other significant main effects or interactions.

Table 5-7 Results of Tukey HSD post-hoc tests for the age effects for the SSE monolingual speakers.

(I) Age             (J) Age              Mean difference (I-J)   Std. error
adult               child 3;4 to 3;11*   -83.73                  15.88
adult               child 4;0 to 4;4*    -60.30                  13.75
adult               child 4;5 to 4;9     -30.28                  18.20
child 3;4 to 3;11   adult*               83.73                   15.88
child 3;4 to 3;11   child 4;0 to 4;4     23.43                   15.88
child 3;4 to 3;11   child 4;5 to 4;9     53.46                   19.85
child 4;0 to 4;4    adult*               60.30                   13.75
child 4;0 to 4;4    child 3;4 to 3;11    -23.43                  15.88
child 4;0 to 4;4    child 4;5 to 4;9     30.03                   18.20
child 4;5 to 4;9    adult                30.28                   18.20
child 4;5 to 4;9    child 3;4 to 3;11    -53.46                  19.85
child 4;5 to 4;9    child 4;0 to 4;4     -30.03                  18.20
* The mean difference is significant at the .05 level.

5.3.2.2.2 Individual results

Individual results of the monolingual children are presented in Figure 5-7. The descriptive statistics are reported in Appendix G.

Figure 5-7 shows that all the SSE monolingual children consistently produced // before voiceless fricatives shorter than before voiced fricatives. This finding parallels the SSE adult results. Individual VLF/VF ratios range from .49 to .85, which is a somewhat broader range than the adult ratios. VLF/VS ratios vary from .52 to 1.1. Similarly to the SSE adults, the individual child VLF/VS ratios had a broader range than the VLF/VF ratio.
There are two possible explanations for the broader range of variation in children. The first one is the difference in the data elicitation mode used between adults and children: i.e. adults read out utterances from a computer screen, while children produced carrier words playing games. Secondly, the broader range of the ratios could be explained by speech immaturity.

Figure 5-7 Individual results of SSE monolingual children on the duration of // as a function of the following consonant.
[Chart. y-axis: median duration (ms); x-axis: SSE monolingual child by increasing age (C3_3;4, C4_3;8, C3_3;11, C6_4;0, C5_4;0, C4_4;1, C7_4;2, C8_4;2, C7_4;8, C9_4;9); series: voiced fricative, voiced stop, voiceless fricative.]

Both group results and individual results for /i/ and // also confirm that the SSE monolingual children had acquired differential implementation of postvocalic conditioning for these two vowels similarly to the SSE adults.

5.3.2.3 Close rounded vowel

5.3.2.3.1 Group results

In Section 4.3.1.2 we found a broad range of phonetic variation in the production of vowel quality by the SSE monolingual children: i.e. the production of // was less adult-like than that of /i/ even at the age of 3;4 to 4;9. This section assesses the acquisition of postvocalic conditioning of duration (SVLR) for this vowel.

We used the same set-up for the ANOVA as that described in Section 5.3.1.1 dealing with SVLR for /i/. Similarly to /i/, the results show that there was a highly significant main effect of the following consonant on the duration of the vowel // irrespective of the other factors [F(1.092,22)=26.896; p<.01]. The direction of the main effect was parallel in all age groups (see Figure 5-8). In all groups, the context before voiced fricatives triggered the longest duration of //, while the vowel remained relatively short in the contexts before voiced and voiceless stops. The results are consistent with the acquisition of SVLR for the vowel /i/ in our data set.
Furthermore, there was a highly significant main effect of the factor “AGE” [F(3,11)=10.169; p<.01] on vowel duration. This effect means that absolute differences in vowel duration between the age groups were systematic. Figure 5-8 shows that the main difference between the groups is contributed by the overall higher duration means in the child groups as compared to the adult group.

To establish which groups contributed to the main effect of “AGE”, we ran Tukey HSD post-hoc tests for the age effects. The results of the test are shown in Table 5-9. The significant effects are marked with an asterisk (*) in the column “(J) Age”. The post-hoc tests revealed that, as for /i/, there was no significant difference in the implementation of SVLR between adults and the older children aged 4;5 to 4;9. However, there was also no significant difference between adults and children aged 3;4 to 3;11, while there was a significant difference between adults and children aged 4;0 to 4;4. This may indicate that the significance of the main effect “AGE” can also be influenced by the variability of individual children contributing to the specific age groups in addition to developmental trends. Individual results in the next section may help to clarify this issue.

Figure 5-8 Mean duration of the vowel // (ms) as a function of the following consonant in four age groups of SSE monolingual speakers.
[Chart. y-axis: duration (ms); x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); series: SSE adult, child 3;4 to 3;11, child 4;0 to 4;4, child 4;5 to 4;9.]

Table 5-8 Mean duration and standard deviation for the SSE vowel // as a function of the following consonant for each age group of the SSE monolingual controls.

Following consonant   Age                 Mean duration (ms)   Std. deviation   n of subjects
voiced fricative      adult               214                  30               5
voiced fricative      child 3;4 to 3;11   292                  53               3
voiced fricative      child 4;0 to 4;4    421                  89               5
voiced fricative      child 4;5 to 4;9    270                  74               2
voiced fricative      Total               306                  106              15
voiced stop           adult               118                  10               5
voiced stop           child 3;4 to 3;11   213                  40               3
voiced stop           child 4;0 to 4;4    235                  135              5
voiced stop           child 4;5 to 4;9    129                  32               2
voiced stop           Total               178                  93               15
voiceless stop        adult               108                  10               5
voiceless stop        child 3;4 to 3;11   107                  33               3
voiceless stop        child 4;0 to 4;4    119                  15               5
voiceless stop        child 4;5 to 4;9    96                   39               2
voiceless stop        Total               110                  20               15

There were no other significant main effects or interactions. The average VLS/VF ratio for the children was .47, and it is similar to .5 in the adult data. The average VLS/VS ratio for the children was .87, and it is again comparable to the adult ratio of .84.

Table 5-9 Results of Tukey HSD post-hoc tests for the differences in the duration of // between age groups within SSE monolingual controls.

(I) Age             (J) Age              Mean difference (I-J)   Std. error
adult               child 3;4 to 3;11    -57.26                  21.30
adult               child 4;0 to 4;4*    -111.56                 18.44
adult               child 4;5 to 4;9     -18.06                  24.40
child 3;4 to 3;11   adult                57.26                   21.30
child 3;4 to 3;11   child 4;0 to 4;4     -54.30                  21.30
child 3;4 to 3;11   child 4;5 to 4;9     39.20                   26.62
child 4;0 to 4;4    adult*               111.56                  18.44
child 4;0 to 4;4    child 3;4 to 3;11    54.30                   21.30
child 4;0 to 4;4    child 4;5 to 4;9*    93.50                   24.40
child 4;5 to 4;9    adult                18.06                   24.40
child 4;5 to 4;9    child 3;4 to 3;11    -39.20                  26.62
child 4;5 to 4;9    child 4;0 to 4;4*    -93.50                  24.40
* The mean difference is significant at the .05 level.

5.3.2.3.2 Individual results

Individual results of the monolingual children are shown in Figure 5-9. The descriptive statistics for each child (and age) are reported in Appendix F.

As shown in Figure 5-9, all monolingual children except for C4 (aged 3;8 and 4;1) had an SVLR-like pattern, with // before voiced fricatives having longer duration than in the other two contexts. In both age samples, C4 produced an SSBE-like postvocalic conditioning pattern rather than SVLR, with // in the context before voiced fricatives and voiced stops having much longer duration in comparison to the voiceless stop context. This can be explained by this subject’s language background: C4 was the only child with a mixed parental background, i.e. an SSE-speaking mother and an SSBE-speaking father, even though she attended a largely SSE-speaking community nursery.

Figure 5-9 Individual results of SSE monolingual children on the duration of // as a function of the following consonant.
[Chart. y-axis: median duration (ms); x-axis: SSE monolingual children by increasing age (C3_3;4, C4_3;8, C3_3;11, C6_4;0, C5_4;0, C4_4;1, C7_4;2, C8_4;2, C7_4;8, C9_4;9); series: voiced fricative, voiced stop, voiceless stop.]

This subject was deliberately included in our monolingual sample, because it was not obvious which variety of English would be preferred by the bilingual children from Russian-speaking families, given the fact that they grow up in the cross-varietally heterogeneous English community of Edinburgh.

Interestingly, this SSBE pattern showed up only for C4’s vowel // and not for /i/, which had an SVLR-like pattern (see Figure 5-5). As we discussed in Section 2.3.1, similar results were reported in Hewlett et al. (1999), where two children with a non-SSE English parental background acquired an SVLR-like pattern for the vowel /i/ and the non-SVLR English vowel duration pattern (similar to SSBE) for the rounded vowel. According to Hewlett et al. (1999), this fact suggested that there were competing influences from the two varieties at work in the children’s speech production patterns. The acquisition pattern of subject C4 agrees with the pattern reported in Hewlett et al. (1999), with the difference that C4 comes from a mixed parental background (SSE-speaking mother and SSBE-speaking father) as opposed to the non-SSE English background of both parents in Hewlett et al. (1999).

The other two monolingual children recorded longitudinally, i.e. C3 (3;4 and 3;11) and C7 (4;2 and 4;8), showed stable SVLR patterns in all age samples. The VLS/VF ratios of the children (across all ages) range from .31 to .62, while VLS/VS ratios range from .52 to 1.19.
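The duration ratios used throughout this section divide the duration in the voiceless stop context by the duration in a voiced context, so that values near 1 indicate little conditioning and values near .5 indicate roughly a doubling. As a minimal sketch, the adult group means for the close rounded vowels in Table 5-3 can be turned into VLS/VF and VLS/VS ratios as follows. The function and variable names are illustrative, and small deviations from the ratios quoted in the text are possible because the table means are rounded.

```python
# Adult group means (ms) for the close rounded vowels, copied from Table 5-3.
TABLE_5_3 = {
    "SSE":  {"voiced fricative": 214, "voiced stop": 118, "voiceless stop": 108},
    "MSR":  {"voiced fricative": 115, "voiced stop": 98,  "voiceless stop": 97},
    "SSBE": {"voiced fricative": 269, "voiced stop": 253, "voiceless stop": 122},
}

def vls_ratios(means):
    """Return the VLS/VF and VLS/VS duration ratios for one language."""
    vls = means["voiceless stop"]
    return {
        "VLS/VF": vls / means["voiced fricative"],
        "VLS/VS": vls / means["voiced stop"],
    }

ratios = {language: vls_ratios(means) for language, means in TABLE_5_3.items()}
for language, r in ratios.items():
    print(f"{language}: VLS/VF = {r['VLS/VF']:.2f}, VLS/VS = {r['VLS/VS']:.2f}")
```

Read this way, an SVLR-like system shows a low VLS/VF ratio together with a VLS/VS ratio close to 1, whereas an SSBE-like system shows low values for both ratios.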
5.3.2.4 Summary of results for the SSE monolingual children

The results of the ANOVA comparing different age groups within the SSE monolingual speakers confirmed that the SSE monolingual children aged 3;4 to 4;9 firmly acquired the SVLR pattern for the close vowels /i/ and //. This supports Matthews’ (2002) suggestive evidence that Scottish children might be in the process of acquisition of SVLR by the age of 2;6 to 2;8. The SVLR pattern for the target // was acquired in an adult-like form despite the fact that segmental production of the monolingual children still showed broad (non-adult-like) ranges of phonetic variability.

Furthermore, the ANOVA results confirmed that the short postvocalic conditioning pattern for the lax ‘invariably short’ (Aitken, 1981) vowel // is also established at the age concerned. The results for child groups for // parallel those of the SSE adults. Since the SSE adult model of the postvocalic conditioning is different for the lax vowel compared to the tense one, the results for both vowels in children also mean that the duration of the SVLR tense vowel and the lax vowel is differentiated at the age concerned.

Concerning age effects, it appears that at the age of 4;5 to 4;9, the SSE monolingual children were getting closer to adult-like production patterns for all the vowels concerned. At no point were there significant differences between this age group and the SSE adults, while there were significant differences between the younger age groups and the adults.

5.3.3 Bilingual acquisition

5.3.3.1 Subject AN

5.3.3.1.1 SSE /i/

This section addresses the question whether the bilingual subject AN (who received a nearly equal input in SSE and MSR) acquired the SVLR pattern for the vowel /i/ in a similar way to the SSE monolingual children.
The median values for the duration (ms) of /i/ for each monolingual SSE child and AN were entered in a mixed design ANOVA with “BILINGUALITY” (yes, no) and “AGE” (3;4 to 3;11; 4;0 to 4;4; 4;5 to 4;9) as between-subject factors and the “FOLLOWING CONSONANT” as a within-subject factor. The factor “FOLLOWING CONSONANT” had three levels: voiced fricative, voiced stop and voiceless stop.

The result showed that there was a highly significant main effect of the following consonant on the duration of the vowel /i/ [F(2,14)=47.019; p<.01] irrespective of age and bilinguality. The median results for the duration of /i/ and number of tokens for subject AN for each age sample are presented in Table 5-10. The corresponding values for the SSE monolingual children (per age) are found in Table 5-4. The direction of the main effect of the following consonant on the duration of the vowel /i/ was parallel in all age groups, and it is shown in Figure 5-10.

There were no significant main effects of the factors “AGE” or “BILINGUALITY”, and no significant interactions. This result means that AN acquired an SVLR pattern for /i/ in a way similar to the SSE monolingual peers. Additionally, the result means that AN acquired the SSE majority model of SVLR rather than a non-SSE English one (SSBE-like) despite their co-occurrence in the community of Edinburgh.

Figure 5-10 Median duration of the vowel /i/ (ms) as a function of the following consonant for subject AN compared to age-matched SSE monolingual children in three age samples.
[Three panels: SSE child 3;4 to 3;11 vs AN 3;8; SSE child 4;0 to 4;4 vs AN 4;2; SSE child 4;5 to 4;9 vs AN 4;5. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop).]
Note 14 (to Figure 5-10): SSE children’s group values are means of individual children’s median values in this Figure and in all subsequent Figures comparing SSE of bilingual and monolingual children.

Table 5-10 Number of tokens and duration of the vowel /i/ (ms) as a function of the following consonant for subject AN in three age samples.

Speaker   Following consonant   Median duration (ms)   n of tokens
AN_3;8    voiced fricative      228                    118
AN_3;8    voiced stop           184                    33
AN_3;8    voiceless stop        164                    66
AN_3;8    Total                 196                    217
AN_4;2    voiced fricative      271                    20
AN_4;2    voiced stop           168                    12
AN_4;2    voiceless stop        97                     25
AN_4;2    Total                 151                    57
AN_4;5    voiced fricative      267                    50
AN_4;5    voiced stop           167                    28
AN_4;5    voiceless stop        128                    49
AN_4;5    Total                 181                    127

Despite the fact that AN’s production of SVLR for this vowel is not significantly different from that of the peers, it is worth noting that at the youngest age of 3;8 AN’s VLS/VF ratio is .71, as opposed to the average of .37 for the SSE children and the highest individual value of .62 (C5 aged 4;0). Recall that the VLS/VF ratio of the monolingual children was close to that of the SSE adults, while AN’s ratio at the age of 3;8 is much higher than in the monolingual sample. We shall return to this pattern later in the results.

5.3.3.1.2 SSE //

The main question addressed in this section is whether AN acquired the short postvocalic conditioning of duration of // in a way similar to the SSE monolingual children.

The ANOVA had the same design as in Section 5.3.3.1.1, except that the within-subject factor “FOLLOWING CONSONANT” had different levels: i.e. voiced fricative, voiced stop and voiceless fricative.

The result showed that there was a significant main effect of the following consonant on the duration of the vowel // [F(2,14)=5.540; p<.05] irrespective of the other factors. The median results for duration of // and number of tokens for AN are presented in Table 5-11 for each age. The corresponding values for the SSE monolingual children (per age) are found in Table 5-6.
The direction of the main effect of the following consonant on the duration of /ɪ/ was similar in all age groups despite AN’s bilinguality (see Figure 5-11). The result showed that AN’s production of postvocalic conditioning for /ɪ/ was similar to that of the monolingual children. This pattern of the duration of /ɪ/ was different from AN’s SVLR pattern for the vowel // (compare Figure 5-10 and Figure 5-11). There was no significant main effect of the factors “AGE” or “BILINGUALITY” and no significant interactions. It is interesting to note that despite the non-significance of the factor “AGE”, AN’s patterns of postvocalic conditioning in Figure 5-11 are somewhat different from the averaged results of the monolingual children at the ages of 3;8 and 4;2. Similarly to the monolingual children, this pattern becomes more SSE-child-like and, thus, also more adult-like at the age of 4;5.

Figure 5-11 Median duration of the vowel /ɪ/ (ms) as a function of the following consonant produced by the subject AN in comparison to the SSE monolingual peers in three age samples (plotted from left to right). [Three panels: AN 3;8 vs SSE child 3;4 to 3;11; AN 4;2 vs SSE child 4;0 to 4;4; AN 4;5 vs SSE child 4;5 to 4;9. x-axis: following consonant (voiced fricative, voiced stop, voiceless fricative); y-axis: duration, 0 to 300 ms.]

Table 5-11 Number of tokens and duration of the vowel /ɪ/ as a function of the following consonant produced by the subject AN in three age samples.
AN’s age    Following consonant    Median duration (ms)    n of tokens
3;8         voiced fricative       180                     29
            voiced stop            241                     37
            voiceless fricative    194                     40
            Total                  203                     106
4;2         voiced fricative       121                     9
            voiced stop            182                     12
            voiceless fricative    95                      19
            Total                  122                     40
4;5         voiced fricative       188                     33
            voiced stop            144                     27
            voiceless fricative    133                     91
            Total                  166                     151

However, we should not forget that the individual results for the monolingual children were also somewhat less consistent for this vowel, and in fact AN’s pattern at the ages of 3;8 and 4;2 is very similar to the pattern of C5 at age 4;0 (see Figure 5-7). This might explain why the difference between AN’s production and the averaged results for the monolingual children in Figure 5-11 is not significant. The result shows that by the age of 3;8 AN produced the short vowel /ɪ/ in a way similar to the SSE monolingual children. This result equally means that she differentiated between the postvocalic conditioning for the vowels /ɪ/ and /i/.

5.3.3.1.3 SSE /ʉ/

This section investigates whether the bilingual subject AN acquired the SVLR pattern for the vowel /ʉ/ in a similar way to the SSE monolingual children. The design of the ANOVA was the same as for AN’s vowel /i/ in Section 5.3.3.1.1. The result of the test showed that there was a significant main effect of the following consonant on the duration of the vowel /ʉ/ [F(2,14)=9.03; p<.05] irrespective of age and bilinguality. The descriptive statistics for the subject AN for each age are presented in Table 5-12. The corresponding values for the SSE monolingual children (per age) are found in Table 5-8. The direction of the main effect of the following consonant on the duration of /ʉ/ is parallel in all age groups (see Figure 5-12). There was no significant main effect of “AGE” or “BILINGUALITY”, and no significant interactions. This result means that AN acquired the SVLR for /ʉ/ similarly to the SSE peers. The result for AN is consistent with her own results for /i/.
The longitudinal results for /ʉ/ revealed a statistically insignificant trend which was nonetheless comparable to /i/. AN’s realisation of SVLR for /ʉ/ had a rather small extent at the age of 3;8, with a VLS/VF ratio of .72. The ratio is substantially greater than the largest VLS/VF ratio of .41 among the SSE monolingual peers (C3 aged 3;4). Similarly to /i/, this smaller extent of SVLR for /ʉ/ at the youngest age might indicate language interaction from AN’s Russian vowel duration system.

Figure 5-12 Median duration of the vowel /ʉ/ (ms) as a function of the following consonant for subject AN compared to age matched SSE monolingual children in three age samples. [Three panels: SSE child 3;4 to 3;11 vs AN 3;8; SSE child 4;0 to 4;4 vs AN 4;2; SSE child 4;5 to 4;9 vs AN 4;5. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 450 ms.]

Table 5-12 Number of tokens and median duration of the vowel /ʉ/ as a function of the following consonant for subject AN in three age samples

AN’s age    Following consonant    Median duration (ms)    n of tokens
3;8         voiced fricative       234                     38
            voiced stop            215                     37
            voiceless stop         168                     47
            Total                  205                     122
4;2         voiced fricative       183                     9
            voiced stop            154                     5
            voiceless stop         112                     13
            Total                  146                     27
4;5         voiced fricative       325                     27
            voiced stop            204                     31
            voiceless stop         117                     51
            Total                  159                     109

5.3.3.1.4 MSR/SSE differentiation for /i/

In Section 5.3.1.1 we showed a substantial crosslinguistic difference in the postvocalic conditioning of vowel duration for the MSR and SSE adults. The difference was most obvious in the context before voiced fricatives (short in MSR and long in SSE), while in the other two consonantal contexts /i/ remained relatively short (see Figure 5-1).
If AN differentiates between her two languages for this variable we would expect to see a substantial crosslinguistic difference in the vowel duration before voiced fricatives. To establish this crosslinguistic difference in AN’s speech and any age effects, we entered all of the subject’s renditions of the carrier words with target /i/ in a multivariate ANOVA. We applied the exclusion criteria specified in Section 5.2 to the individual renditions. The ANOVA had mean vowel duration as a dependent variable and three fixed factors: i.e. “FOLLOWING CONSONANT” (voiced fricative, voiced and voiceless stop), “LANGUAGE” (SSE and MSR) and “AGE” (3;8, 4;2 and 4;5). The results of the ANOVA showed that there was a highly significant main effect [F(2,602)=17.059; p<.01] of the factor “FOLLOWING CONSONANT” on the duration of the vowel /i/. The direction of this effect per age and language is shown in Figure 5-13. The descriptive statistics for each consonantal context, language and age are reported in Appendix H. This result paralleled the main effect between MSR and SSE adult models in Section 5.3.1.1. We ran Tukey HSD post hoc tests to determine which of the three consonantal contexts contributed to the effect of the “FOLLOWING CONSONANT”. The results revealed a significant difference (p<.05) between the duration of /i/ before voiced fricatives compared to voiced and voiceless stops. Thus, this result replicated the adult results: i.e. for both languages there was a parallel direction of postvocalic conditioning before voiced fricatives compared to the other contexts. With regard to the language differentiation, there was no significant main effect of language or age. However, there was a highly significant interaction [F(2,602)=19.165; p<.01] between the factors “FOLLOWING CONSONANT” and “LANGUAGE”. This interaction suggests a differential implementation of the duration of /i/ between AN’s two languages depending on the following consonant.
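The degrees of freedom reported above can be checked against the design with a short Python sketch. It assumes a full-factorial model in which every combination of factor levels contains data. The function `factorial_df` is a hypothetical helper, and the total of 620 renditions is not reported in the text: it is simply the number implied by an error term of 602 degrees of freedom under that assumption (602 plus 18 cells).

```python
from itertools import combinations


def factorial_df(levels, n_observations):
    """Degrees of freedom for a full-factorial ANOVA with fixed factors.

    levels maps each factor name to its number of levels. Returns a dict of
    effect df (keyed by tuples of factor names) and the error df, assuming
    every cell of the design contains observations.
    """
    names = list(levels)
    effect_df = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            df = 1
            for name in combo:
                df *= levels[name] - 1  # each factor contributes (levels - 1)
            effect_df[combo] = df
    n_cells = 1
    for count in levels.values():
        n_cells *= count
    return effect_df, n_observations - n_cells


design = {"FOLLOWING CONSONANT": 3, "LANGUAGE": 2, "AGE": 3}
# 620 is an assumed total, inferred from the reported error df of 602.
effect_df, error_df = factorial_df(design, n_observations=620)

print(effect_df[("FOLLOWING CONSONANT",)])             # numerator df, main effect
print(effect_df[("FOLLOWING CONSONANT", "LANGUAGE")])  # numerator df, interaction
print(error_df)                                        # denominator df
```

Under these assumptions both the main effect of the following consonant and its interaction with language have 2 numerator degrees of freedom, which matches the F(2,602) values reported above.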
Such an interaction can be expected given that adult SSE and MSR models in Figure 5-1 showed a differential implementation of duration before voiced fricatives. In addition, the ANOVA showed a highly significant interaction [F(4,602)=5.231; p<.01] between the “FOLLOWING CONSONANT”, “LANGUAGE” and “AGE”. This interaction can be seen in Figure 5-13. In SSE, AN showed a fairly consistent SVLR pattern irrespective of her age, while AN’s MSR pattern for /i/ differs between the three age samples. In MSR, AN increased the duration of /i/ depending on the following consonant. Nevertheless, the increase between the contexts of voiced and voiceless stops in MSR is inconsistent between the three age conditions. There were no other significant main effects or interactions.

Figure 5-13 Mean duration of the vowel /i/ (ms) as a function of the following consonant produced by the subject AN in MSR and SSE in three longitudinal age samples (from left to right). [Three panels: AN 3;8 SSE vs AN 3;8 MSR; AN 4;2 SSE vs AN 4;2 MSR; AN 4;5 SSE vs AN 4;5 MSR. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 400 ms.]

The difference between AN’s two languages is not obvious at the age of 3;8, when the VLS/VF ratios are almost equal (.72 in SSE and .73 in MSR). If we also consider the fact that her VLS/VF ratio in SSE exceeds the maximal monolingual child ratio of .62, the possibility of language interaction from MSR becomes clearer. At no age is AN’s MSR pattern consistent with the SVLR pattern in SSE. Generally across age samples, it seems that the crosslinguistic difference for AN’s /i/ is substantial. It is consistent with a systematic pattern of vowel duration conditioning in SSE, and with a less systematic one in MSR.
Recall that in Section 5.3.1.4 we concluded that the adult pattern of postvocalic conditioning in Russian was small in extent and varied in individual speakers, showing its non-obligatory nature. Let us consider a possible explanation for the lesser systematicity in AN’s MSR consonantal conditioning of duration. The difference between AN’s longitudinal results, her Russian-speaking mother’s pattern and that of the Russian-speaking experimenter (in child directed speech) is presented in Figure 5-14.

Figure 5-14 A comparison of AN’s longitudinal results for the mean duration of /i/ (ms) to that of her mother speaking Russian and of the principal investigator (subject R3) in child directed speech. [Three panels, one per age sample (AN 3;8 MSR, AN 4;2 MSR, AN 4;5 MSR), each also plotting “MSR mother” and “R3 CDS”. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 400 ms.]

The figure shows that at the age of 4;2 AN’s contextual increase in the duration of /i/ parallels that of her mother, but not at other ages. However, there was a difference in the data elicitation used for AN and her mother: i.e. AN’s mother read out utterances from the computer screen, while AN was recorded in a more spontaneous speech elicitation situation involving structured games. The principal investigator’s speech during structured games (“R3 CDS” in Figure 5-14; “CDS”: child directed speech) might be more representative of Russian duration patterns given the elicitation situation. A comparison of the patterns of “R3 CDS” and AN’s mother shows the overall longer absolute duration of “R3 CDS” compared to that of AN’s mother. Thus, AN’s overall longer duration of /i/ in Russian compared to her mother might be due to the differences in the data elicitation procedure.
To conclude, the results of AN’s realisation of postvocalic conditioning for /i/ suggest that she differentiated between her two languages from the age of 4;2.

5.3.3.1.5 MSR/SSE differentiation for /u/ and /ʉ/

To establish language differentiation in AN’s production of postvocalic conditioning of the SSE /ʉ/ and MSR /u/ and age effects, we entered all AN’s individual renditions of the carrier words with targets /ʉ/ and /u/ in a multivariate ANOVA. The ANOVA had the same design as for /i/ in Section 5.3.3.1.4. The results showed that there was a highly significant main effect [F(2,516)=57.960; p<.01] of the factor “FOLLOWING CONSONANT” on the duration of /ʉ/ and /u/. The direction of this effect per age and language is shown in Figure 5-15. The descriptive statistics for the close rounded vowels per consonantal context, language and age are reported in Appendix I. This result paralleled the main effect between MSR and SSE adult models in Section 5.3.1.1. We ran Tukey HSD post hoc tests to determine which of the three consonantal contexts contributed to the main effect of the “FOLLOWING CONSONANT”. The results revealed that the main effect was due to the significant difference (p<.05) in the duration of /ʉ/ and /u/ in the context of voiced fricatives compared to either voiced or voiceless stops. Thus, as for /i/, AN’s result paralleled the results of the SSE and MSR adults in that both languages had some extent of postvocalic conditioning before voiced fricatives compared to the other contexts. Unlike for /i/, there was also a highly significant main effect of the factor “AGE” [F(2,516)=4.785; p<.01]. Tukey HSD post hoc tests for the age effects showed that there was a significant difference (p<.05) between the ages of 3;8 and 4;5. The effect can be seen in Figure 5-15. The crosslinguistic patterns of the postvocalic conditioning appear to be the opposite between the ages of 3;8 and 4;5.
The difference between AN’s longitudinal results and her mother’s pattern is shown in Figure 5-16. Similarly to the production of /i/, the overall higher duration values irrespective of the context in AN compared to her mother’s can be attributed to the difference in the data elicitation procedure.

Figure 5-15 Mean duration of the close rounded vowels (ms) as a function of the following consonant for the subject AN in MSR and SSE. [Three panels: AN 3;8 SSE vs AN 3;8 MSR; AN 4;2 SSE vs AN 4;2 MSR; AN 4;5 SSE vs AN 4;5 MSR. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 450 ms.]

Figure 5-16 A comparison of AN’s longitudinal results for the mean duration of MSR /u/ (ms) compared to that of her mother speaking Russian, and of the principal investigator (subject R3) in Russian child directed speech. [Three panels, one per age sample (AN 3;8 MSR, AN 4;2 MSR, AN 4;5 MSR), each also plotting “MSR mother” and “R3_CDS”. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 450 ms.]

The MSR pattern at AN’s younger age of 3;8 is dissimilar to both adult speakers, and shows some potential influence from SVLR. At the age of 4;5, AN shows a similar pattern of postvocalic conditioning of /u/ to both her mother and her interlocutor “R3 CDS”. Thus her production of duration of the close rounded vowels becomes more adult-like at the age of 4;5. Altogether towards the age of 4;5 AN produced a differentiated pattern for MSR (and SSE) which is close to both adult models.
5.3.3.1.6 Summary of AN’s results

The results of acquisition of postvocalic conditioning of vowel duration by the bilingual subject AN suggest that overall she differentiated between her two languages in a way similar to the monolingual speakers. First of all, AN acquired the postvocalic conditioning of vowel duration for the vowels /i/ and /ʉ/ in a way similar to the SVLR pattern of the SSE monolingual peers. She produced a consistent SVLR pattern for both vowels in all age samples gathered. The factors “bilinguality” or “age” were consistently not significant compared to the SSE peers. However, despite the non-significance of the main effect of age or bilinguality for both vowels, there was a clear longitudinal trend which is unlikely to be a coincidence. At the age of 3;8, AN produced consonant-dependent vowel duration with VLS/VF ratios exceeding the maximal monolingual ratios. This means that her average increase of duration as a function of the following consonant was smaller compared to the SSE monolingual children at the age of 3;8. As we discussed in Section 2.3.2, the Markedness Hypothesis (Müller, 1998) had been invoked to explain the reduced intrinsic vowel duration conditioning of the Spanish-German bilingual children reported in Kehoe (2002). Kehoe’s study showed that the bilingual children aged 2;3 to 2;6 produced a much smaller extent of the durational difference between short and long vowels than the German monolingual children. AN had a similar pattern. Such a difference in the implementation of the SVLR of /i/ and /ʉ/ between AN and the SSE-speaking peers could, thus, be an effect of language interaction from the Russian unmarked (less ambiguous) system of postvocalic conditioning of vowel duration. At the age of 4;5, AN produced consonant-dependent duration values for both SVLR vowels in a way very similar to the SSE monolingual peers.
Given the fact that we observed a significant developmental trend in the data of the SSE monolingual children, whereby the oldest group of children aged 4;5 to 4;9 showed no significant differences to the SSE adult data, we can conclude that AN’s SVLR patterns for both vowels also became more like the SSE adult model. Concerning language differentiation for the two SVLR vowels, generally AN did differentiate between her two languages. The crosslinguistic differences for both vowel sets were substantial and consistent. AN produced a systematic pattern of vowel duration conditioning in SSE in the three age samples and a less systematic one in MSR. It is, thus, possible that more variable longitudinal patterns produced by AN in MSR were compatible with a non-obligatory system of postvocalic conditioning in Russian (even though there were significant trends in the adult data in Section 5.3.1.4). AN’s MSR data was quite similar to the patterns observed for the mother. They both had a tendency to produce somewhat longer duration in vowels before voiced fricatives compared to the voiceless stop context. However, this was not true at the age of 3;8. Comparing AN’s crosslinguistic pattern for the vowels /ʉ/ and /u/ at the age of 3;8 as opposed to 4;5, we saw the reversal of the language patterns (which contributed to the significance of the 3-way interaction between the factors “FOLLOWING CONSONANT”, “AGE” and “LANGUAGE” in the ANOVA). The direction of the significant interaction was surprising, since it appeared that at the age of 3;8 AN produced a more SVLR-like pattern in her Russian than in SSE, and the pattern reversed at the age of 4;5 towards an adult-like model for both languages. This pattern suggests a bi-directional influence of the systems of the postvocalic conditioning in MSR and SSE in AN’s speech production at the age of 3;8.
However, it is a puzzling finding, because for a native speaker of Russian a transfer of an SVLR-like system into Russian is totally irrelevant. Increased duration of a stressed vowel in Russian would primarily be perceived as a prominence related event, even though the event could turn out to be pragmatically odd. We used a sufficient number of repetitions in this study to derive the means (see Appendix I), so AN’s reversed crosslinguistic patterns at the ages of 3;8 and 4;5 should be representative of the subject’s speech production given the elicitation mode. The bi-directional language interaction is not compatible with the directions predicted by the CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998), or with the language dominance hypothesis (Petersen, 1988), since all of them are unidirectional. We further address these issues in the discussion chapter.

No language differentiation could be assessed for the postvocalic conditioning of the lax vowel, since it is not featured in Russian. The result of the comparison of AN’s production in SSE showed that AN acquired a relatively short conditioning of the lax vowel /ɪ/ in a way similar to the SSE monolingual children. This result also means that AN produced two different patterns of postvocalic conditioning of vowel duration in SSE: (1) an SVLR pattern for the vowels /i/ and /ʉ/, and (2) a relatively short conditioning of the lax vowel /ɪ/ irrespective of the following consonant. AN’s acquisition of SVLR in SSE means that she acquired the Scottish variety of English in favour of other SSBE-like varieties co-occurring in Edinburgh.

5.3.3.2 Subject BS

5.3.3.2.1 SSE /i/

We investigate whether the bilingual subject BS acquired the postvocalic conditioning of the duration of /i/ in a way similar to the SSE monolingual peers, and whether the pattern had any significant age effect. The set up for ANOVA was the same as for subject AN described in Section 5.3.3.1.1.
However, the between-subject factor “AGE” had three levels (3;4 to 3;8, 3;9 to 4;1 and 4;2 to 4;9). The age levels of the SSE monolingual children matched BS’ ages of 3;4, 3;10 and 4;5. The results showed that there was a highly significant main effect of the following consonant on the duration of the vowel /i/ [F(2,14)=27.812; p<.01]. We observed the same main effect in the comparison of the adults (Figure 5-1) in Section 5.3.1.1. The descriptive statistics of this test for BS are presented in Table 5-13. The corresponding values for each of the SSE monolingual children are found in Appendix E. The test also showed a highly significant interaction between “FOLLOWING CONSONANT” and “BILINGUALITY” [F(2,14)=10.361; p<.01]. There were no other significant effects or interactions. The direction of the main effect of the following consonant on the duration of /i/ and its interaction with BS’ bilinguality is shown in Figure 5-17. The figure shows that unlike the SSE peers BS did not produce a sufficiently long duration of /i/ before voiced fricatives. Recall that this is exactly the context and direction in which the crosslinguistic difference between MSR and SSE manifested itself in the speech of the adult controls (Figure 5-1, where we had a significant interaction between “LANGUAGE” and “FOLLOWING CONSONANT”). This result means that BS had not acquired the SVLR for /i/ like the SSE monolingual peers, and that this difference was due to her bilinguality. It seems that BS followed the MSR model of postvocalic conditioning of vowel duration for this vowel, rather than the SSE one. BS’ VLS/VF ratios were .94 (age 3;4), .91 (age 3;10) and .73 (age 4;5). In all age samples the ratios exceeded the maximal ratio of .65 produced by C5 at the age of 4;0.
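The VLS/VF ratios quoted above can be recomputed from the median durations in Table 5-13 by dividing the median before voiceless stops by the median before voiced fricatives. The following Python sketch shows the calculation. The function and variable names are hypothetical, and the threshold is the maximal monolingual ratio as cited in the preceding paragraph.

```python
def vls_vf_ratio(median_voiceless_stop, median_voiced_fricative):
    """Ratio of vowel duration before voiceless stops to that before voiced
    fricatives; values closer to 1 indicate a smaller extent of SVLR."""
    return median_voiceless_stop / median_voiced_fricative


# Median durations (ms) for BS' /i/ from Table 5-13:
# age -> (voiced fricative, voiceless stop)
bs_i_medians = {"3;4": (224, 211), "3;10": (211, 193), "4;5": (249, 182)}

# Maximal monolingual child ratio as cited in the text above.
MONOLINGUAL_MAX = 0.65

ratios = {
    age: round(vls_vf_ratio(vls, vf), 2)
    for age, (vf, vls) in bs_i_medians.items()
}

for age, ratio in ratios.items():
    status = "above" if ratio > MONOLINGUAL_MAX else "within"
    print(f"BS {age}: VLS/VF = {ratio:.2f} ({status} the monolingual range)")
```

Because the ratio is a proportion of two medians from the same speaker and sample, it is not affected by overall differences in speech rate between samples, which makes it convenient for comparing children across ages.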
Figure 5-17 Median duration of the vowel /i/ (ms) as a function of the following consonant produced by the subject BS compared to the SSE monolingual peers in three age samples. [Three panels: SSE child 3;4 to 3;8 vs BS 3;4; SSE child 3;9 to 4;1 vs BS 3;10; SSE child 4;2 to 4;9 vs BS 4;5. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 400 ms.]

Table 5-13 Median duration and number of tokens of the vowel /i/ as a function of the following consonant produced by the subject BS in three age samples.

BS’ age    Following consonant    Median duration (ms)    n of tokens
3;4        voiced fricative       224                     69
           voiced stop            197                     28
           voiceless stop         211                     61
           Total                  211                     158
3;10       voiced fricative       211                     47
           voiced stop            156                     22
           voiceless stop         193                     43
           Total                  182                     112
4;5        voiced fricative       249                     68
           voiced stop            175                     22
           voiceless stop         182                     64
           Total                  206                     154

Despite the non-significance of the factor “AGE” and the lack of interactions with this factor, Figure 5-17 shows that at the age of 4;5 BS does produce a more SVLR-like pattern than the patterns at the younger ages. Her VLS/VF ratio of .73 at the age of 4;5 is comparable to .72 of AN at the age of 3;8. It is possible that BS’ MSR system had an influence on her SSE production. However, this influence is in line with both unmarkedness of the Russian postvocalic conditioning system and BS’ language exposure patterns. Besides, a stronger influence of the Russian model in BS’ case compared to AN suggests that the amount of language interaction between the bilingual child’s languages can be affected by the individual language exposure patterns.

5.3.3.2.2 SSE /ɪ/

We investigate BS’ acquisition of postvocalic conditioning of duration of the SSE vowel /ɪ/ in comparison to the SSE monolingual peers and whether there is any significant age effect.
The ANOVA had the same design as in Section 5.3.3.1.1, except that the within-subject factor “FOLLOWING CONSONANT” had different levels: i.e. voiced fricative, voiced stop and voiceless fricative. The factor “AGE” had three levels: 3;4 to 3;8, 3;9 to 4;1 and 4;2 to 4;9. The results showed no significant main effects. However, there was a highly significant interaction [F(2,14)=12.112; p<.01] between the factors “FOLLOWING CONSONANT” and “BILINGUALITY”. The descriptive statistics for BS are presented in Table 5-14. The corresponding values for each of the SSE monolingual children are found in Appendix G. The direction of the differences between BS and the SSE monolingual children per age is shown in Figure 5-18. The figure shows that BS produced an opposite postvocalic conditioning pattern compared to the SSE peers: i.e. she realised longer duration before voiceless fricatives than before voiced fricatives irrespective of her age. This interaction shows that similarly to the results for /i/, BS’ production of postvocalic conditioning of the duration of /ɪ/ is different from the SSE monolingual peer group.

Figure 5-18 Median duration of the target vowel /ɪ/ (ms) as a function of the following consonant produced by the subject BS compared to the SSE monolingual peers in three age samples. [Three panels: SSE child 3;4 to 3;8 vs BS 3;4; SSE child 3;9 to 4;1 vs BS 3;10; SSE child 4;2 to 4;9 vs BS 4;5. x-axis: following consonant (voiced fricative, voiced stop, voiceless fricative); y-axis: duration, 0 to 400 ms.]
Figure 5-19 Median duration of all phonetic realisations of [ɪ] (ms) as a function of the following consonant produced by the subject BS compared to the SSE monolingual peers in three age samples. [Three panels: SSE child 3;4 to 3;8 vs BS 3;4; SSE child 3;9 to 4;1 vs BS 3;10; SSE child 4;2 to 4;9 vs BS 4;5. x-axis: following consonant (voiced fricative, voiced stop, voiceless fricative); y-axis: duration, 0 to 400 ms.]

Table 5-14 Number of tokens and median duration of the vowel /ɪ/ as a function of the following consonant produced by the subject BS in three age samples.

BS’ age    Following consonant    Median duration (ms)    n of tokens
3;4        voiced fricative       199                     14
           voiceless fricative    259                     28
           voiced stop            198                     35
           Total                  207                     77
3;10       voiced fricative       183                     6
           voiceless fricative    229                     37
           voiced stop            223                     31
           Total                  234                     74
4;5        voiced fricative       137                     19
           voiceless fricative    212                     37
           voiced stop            211                     22
           Total                  192                     78

At this point it is worth considering the segmental aspect of BS’ acquisition of the vowel /ɪ/ in its relation to the vowel duration. Recall that regarding vowel quality across all age samples BS produced only 35% of adult-like [ɪ] as opposed to 99.1% of the SSE peers. 98.3% of her non-adult-like realisations involved production of the vowel [i] for the SSE target /ɪ/. It is then worth considering how BS’ phonetically adult-like [ɪ] vowels compare to those produced by the SSE monolingual peers. The duration of /ɪ/ in Figure 5-18 refers to all the adult targets produced by BS, while Figure 5-19 plots only the ones that were auditorily labelled as [ɪ]. The same ANOVA based on the median duration of vowels phonetically realised as [ɪ] (rather than all adult targets) produced an almost significant effect [F(2,14)=3.336; p=0.065] for the factor “FOLLOWING CONSONANT”, and an almost significant interaction [F(2,14)=3.310; p=0.067] between “FOLLOWING CONSONANT” and “BILINGUALITY”.
Thus, given this phonetically motivated set-up, the importance of BS’ bilinguality decreased, while the significance of the main effect of the following consonant increased. BS’ realisation of the duration of phonetic [ɪ] was therefore less different from that of the monolingual peers. However, the effect was only near significant and concerned only 35% of BS’ attempts to produce the lax vowel /ɪ/, so that overall BS’ vowel was mature neither at the segmental nor at the suprasegmental (durational) level. Once again a possible explanation for this effect in BS’ speech is her greater exposure to Russian than to SSE.

5.3.3.2.3 SSE /ʉ/

To establish how BS’ production of postvocalic conditioning of the duration of the SSE vowel /ʉ/ compares to that of the SSE monolingual peers, and whether there was any observable age effect, we entered all median values of duration for the SSE target /ʉ/ in different consonantal contexts in a mixed design ANOVA. The test had the same set up as in Section 5.3.3.2.1. The results showed that there was a highly significant main effect of the following consonant on the duration of the vowel /ʉ/ [F(2,14)=6.714; p<.01]. There were no other significant effects or interactions. The descriptive statistics for the subject BS are presented in Table 5-15. The values for each of the SSE monolingual children are found in Appendix F. The direction of the main effect of the following consonant on the duration of /ʉ/ is shown in Figure 5-20. Unlike for /i/, the test showed no significant interactions of the factors “BILINGUALITY” and “FOLLOWING CONSONANT”. In fact, the results paralleled AN’s test for this vowel. However, longitudinally there was a substantial difference between the subjects AN and BS (compare Figure 5-15 and Figure 5-20). AN’s VLS/VF ratio for /ʉ/ decreased with age towards more SSE adult-like values. For BS, there was little longitudinal change in the VLS/VF ratio throughout the considered ages: i.e.
the VLS/VF ratio was .89 at the age of 3;4, it decreased to .72 at the age of 3;10, and it was .81 at the age of 4;5. In the three age samples considered, BS’ VLS/VF ratio was substantially greater than the monolingual upper boundary of .62. It was more similar to the adult Russian ratio of .85, and it did not change much in the time considered.

Figure 5-20 Median duration of the vowel /ʉ/ (ms) as a function of the following consonant for subject BS compared to age matched SSE monolingual children for three longitudinal moments. [Three panels: SSE child 3;4 to 3;8 vs BS 3;4; SSE child 3;9 to 4;1 vs BS 3;10; SSE child 4;2 to 4;9 vs BS 4;5. x-axis: following consonant (voiced fricative, voiced stop, voiceless stop); y-axis: duration, 0 to 400 ms.]

Table 5-15 Number of tokens and median duration of the vowel /ʉ/ as a function of the following consonant for subject BS for three longitudinal moments.

BS’ age    Following consonant    Median duration (ms)    n of tokens
3;4        voiced fricative       227                     41
           voiced stop            196                     18
           voiceless stop         203                     13
           Total                  226                     72
3;10       voiced fricative       255                     20
           voiced stop            229                     18
           voiceless stop         185                     33
           Total                  224                     71
4;5        voiced fricative       271                     31
           voiced stop            219                     39
           voiceless stop         219                     21
           Total                  241                     91

This comparison suggests that even though the ANOVA showed a statistically significant main effect of the following consonant on the duration of the vowel /ʉ/, and no effect of BS’ bilinguality, the subject was still different from the monolingual children in the reduced extent of the consonantal conditioning throughout the period considered. The non-significance of the factor “BILINGUALITY” was most probably due to the parallel direction of the main effect of the “FOLLOWING CONSONANT” in BS’ case combined with the relatively low number of subjects in the test.
Altogether we can conclude that BS’ pattern of the postvocalic conditioning for /ʉ/ looked more SVLR-like than the pattern observed for /i/. However, BS’ VLS/VF ratios underwent little longitudinal change and were beyond the monolingual ranges throughout the study. BS’ pattern of postvocalic conditioning of /ʉ/ had a smaller extent (VLS/VF ratios closer to 1) compared to the SSE monolingual children. A similar “reduced” SVLR pattern was also observed for AN at the age of 3;8. Both AN’s and BS’ patterns were consistent with the results for German-Spanish bilinguals (Kehoe, 2002) discussed in Section 2.3.2 in connection with the relative markedness of the two languages in contact in a bilingual child. However, the variable extent of the difference in vowel duration conditioning between the bilingual subjects compared to the SSE monolingual children suggests that factors other than relative language structure might also be at work.

5.3.3.2.4 MSR/SSE differentiation for /i/

To establish whether BS differentiated between her MSR and SSE production of postvocalic conditioning of duration for the vowel /i/ and whether there was an age effect, we entered all BS’ individual renditions of the carrier words with target /i/ (after applying exclusion criteria specified in Section 5.2) in a multivariate ANOVA. The ANOVA had vowel duration as a dependent variable with three fixed factors: i.e. “FOLLOWING CONSONANT” (voiced fricative, voiced and voiceless stop), “LANGUAGE” (SSE and MSR) and “AGE” (3;4, 3;10 and 4;5). The results showed neither a significant main effect of the factor “FOLLOWING CONSONANT” on the duration of /i/, nor any interactions with this factor. This result means that overall BS did not differentiate the postvocalic conditioning between her two languages. The descriptive statistics for BS’ production of vowel duration per consonantal context, language and age are reported in Appendix J.
The direction of the crosslinguistic differences is shown in Figure 5-21. There was a highly significant main effect of the factor “AGE” [F(2,610)=6.655; p<.01]. There were no other significant main effects or interactions. Since there were no significant interactions between language and age and the following consonant, which would show language-specific implementation of the postvocalic conditioning, this main effect of “AGE” is not relevant. In fact, the results of Tukey HSD post hoc tests showed that there was a significant [p<.05] age effect between the age of 3;10 compared to the ages of 3;4 and 4;5. Thus, this pattern is not linear longitudinally. In fact, Figure 5-21 shows that at the age of 3;4 and 3;10 BS produced quite similar postvocalic conditioning patterns in both SSE and MSR. This observation agrees with the results of the comparison of BS’ speech production to that of the SSE monolingual peers, for which we found a highly significant main effect of the factor “BILINGUALITY” contributed by the relatively shorter duration of /i/ before voiced fricatives compared to the SSE monolingual peers. 
[Figure 5-21: three bar-chart panels (ages 3;4, 3;10 and 4;5); y-axis 0 to 450 ms; x-axis categories: voiced fricative, voiced stop, voiceless stop; legend entries per panel: BS SSE, BS MSR.]

Figure 5-21 Longitudinal results for the mean duration of the vowel /i/ (ms) as a function of the following consonant produced by the subject BS in MSR and SSE.

[Figure 5-22: three bar-chart panels (ages 3;4, 3;10 and 4;5); y-axis 0 to 450 ms; x-axis categories: voiced fricative, voiced stop, voiceless stop; legend entries per panel: BS SSE, BS MSR, MSR mother, R3 CDS.]

Figure 5-22 A comparison of BS’ longitudinal results for the mean duration of /i/ in SSE and MSR to those of her mother speaking Russian, and those of the principal investigator (subject R3) in child directed MSR.

The results indicated that overall BS did not differentiate between the two languages according to the adult models of postvocalic conditioning. This pattern of language interaction agrees with both unmarkedness of the Russian model and BS’ language exposure pattern. However, the postvocalic conditioning at the age of 4;5 suggests that BS’ crosslinguistic production of the patterns started to look more language-specific. The latter point becomes more obvious if we compare BS’ longitudinal results in both languages to her mother’s MSR pattern, and to that of the principal investigator in child directed speech. This difference is shown in Figure 5-22. Similarly to AN’s pattern in Section 5.3.3.1.4, the overall higher duration values in BS’ production compared to her mother’s might be attributed to the difference in the data elicitation procedure.
At the age of 4;5, BS’ MSR production of postvocalic conditioning of /i/ showed a similar pattern to those of her mother and of the principal investigator (“R3 CDS” in Figure 5-22), while her SSE pattern looked more SVLR-like.

5.3.3.2.5 MSR/SSE differentiation for /u/ and /ʉ/

To establish whether BS differentiated between her MSR and SSE production of the postvocalic conditioning of duration for the vowels /u/ and /ʉ/ and whether there was an age effect for this crosslinguistic difference, we entered all BS’ individual renditions of the carrier words with adult targets /u/ and /ʉ/ (after applying exclusion criteria specified in Section 5.2) in a multivariate ANOVA. The test had the same design as in Section 5.3.3.2.4. The results showed a highly significant main effect [F(2,491)=5.992; p<.01] of the “FOLLOWING CONSONANT” on the duration of these vowels. There were no other significant main effects or interactions. To determine the direction of the consonantal effect, we ran Tukey HSD post hoc tests. The tests revealed that the main effect was due to a significant difference [p<.05] between duration of the vowels before voiced fricatives compared either to the context before voiced stops or to that before voiceless stops. This result agrees with the direction of the crosslinguistic differences between the SSE and MSR adult models. The descriptive statistics of the test are reported in Appendix K. The direction and the extent of the crosslinguistic differences per language and age are shown in Figure 5-23.
[Figure 5-23: three bar-chart panels (ages 3;4, 3;10 and 4;5); y-axis 0 to 450 ms; x-axis categories: voiced fricative, voiced stop, voiceless stop; legend entries per panel: BS SSE, BS MSR.]

Figure 5-23 Mean duration of the close rounded vowels (ms) as a function of the following consonant produced by the subject BS in MSR and SSE in three age samples.

[Figure 5-24: three bar-chart panels (ages 3;4, 3;10 and 4;5); y-axis 0 to 450 ms; x-axis categories: voiced fricative, voiced stop, voiceless stop; legend entries per panel: BS SSE, BS MSR, MSR mother, R3_CDS.]

Figure 5-24 A comparison of BS’ longitudinal results for the mean duration of /u/ and /ʉ/ (ms) in SSE and MSR to those of her mother speaking Russian, and those of the principal investigator (subject R3_CDS) in child directed MSR speech.

Despite the significance of the factor “FOLLOWING CONSONANT”, the test showed neither a significant main effect of “LANGUAGE”, nor significant interactions between the factors (which could be expected given the extent of the differences between SSE and MSR in the contexts before voiced fricatives). The result shows that though BS produced the language-specific direction of the postvocalic conditioning for the close rounded vowels in both languages, she did not produce the language-specific extent of it to achieve sufficient language differentiation. However, BS’ crosslinguistic pattern of postvocalic conditioning emerging at the age of 4;5 suggests that BS’ speech production became more differentiated and more language-specific. Recall that a similar trend was found for BS’ /i/ at the age of 4;5 in Section 5.3.3.2.4. In that sense the two sub-tests agree.
The difference between BS’, her mother’s and the interlocutor’s speech production is shown in Figure 5-24. As for /i/, the overall higher duration values in BS’ production compared to her mother’s can be attributed to the difference in the data elicitation procedure. Apart from the absolute differences between BS’ and her mother’s production, BS’ MSR production showed a similar pattern of postvocalic conditioning of the close rounded vowels to both her mother and the interlocutor (“R3 CDS”).

5.3.3.2.6 Summary of BS’ results

In this section we investigated bilingual SSE/MSR patterns of postvocalic conditioning of vowel duration produced by the subject BS, who by the age of 4;5 had received substantially more input in Russian than in SSE. We addressed the question of language differentiation and considered the possibility of language interaction. The results suggest an overall lack of language differentiation for the postvocalic vowel duration conditioning patterns. The patterns produced by BS seemed to follow a Russian model of postvocalic conditioning in SSE irrespective of the vowel, while her MSR production was similar to that of her mother.

First of all, we addressed BS’ acquisition of the postvocalic conditioning of duration for the vowels /i/ and /ʉ/ requiring application of SVLR in the SSE adult model. We compared BS’ production for these vowels to that of the SSE monolingual peers. The results for both vowels showed that BS’ production of the postvocalic conditioning was very different from the SSE monolingual peers. The bilinguality of BS had a highly significant effect on the production of SVLR in /i/: i.e. BS produced a reduced extent of the postvocalic conditioning. The subject produced greater VLS/VF ratios (.73 to .94) compared to the SSE peers (.65 maximally, .47 on average), and her patterns were more variable.
For the vowel /ʉ/, the interaction between BS’ bilinguality and SVLR did not play a statistically significant role, possibly because BS’ patterns were more systematic. However, as for /i/, BS produced a reduced extent of the postvocalic conditioning of /ʉ/ compared to the monolingual children, again mainly due to the fact that the vowels before voiced fricatives were not long enough. Irrespective of BS’ age, her VLS/VF ratios were nearer to 1 than the maximal individual VLS/VF ratios produced by the SSE monolingual children, and closer to the Russian adult model (VLS/VF of .85 for either vowel).

The results of the crosslinguistic comparison of BS’ production of postvocalic conditioning between SSE and MSR showed no significant main effects or interactions. This means that BS did not differentiate between her languages in the age samples considered. For both unrounded and rounded close vowels there was no significant crosslinguistic difference in vowel duration before voiced fricatives compared to voiceless stops, which we expected, given the differences between the adult models. At the same time the postvocalic conditioning patterns for both vowels in MSR showed similarities to both the production of BS’ mother and that of the principal investigator. Since BS’ production was different from the SSE monolingual peers and similar to the Russian adult model, we can conclude that BS followed the Russian model in her SSE production.

The comparison of BS’ production of postvocalic conditioning for the vowel /ɪ/ to the SSE monolingual peers showed a highly significant interaction of the factors “FOLLOWING CONSONANT” and “BILINGUALITY”. In fact, the patterns were the opposite of the averaged monolingual child results: i.e. BS produced longer duration of vowels before voiceless fricatives compared to all other contexts. Consequently BS’ production of duration for the SSE vowel /ɪ/ was not SSE-specific.
6 Acquisition of Vocal Effort

6.1 Introduction

This chapter presents data on the acquisition of vocal effort by bilingual and monolingual pre-school children. In Section 2.4.1, we showed that despite qualitative physiological differences in respiratory and laryngeal control between adults and children (Titze, 1994; Mackenzie Beck, 1997), children are able to control their respiratory and laryngeal mechanisms sufficiently to achieve a fine-grained control of phonatory loudness similarly to adults (Strathopoulos & Sapienza, 1993; Strathopoulos, 1995; Traunmüller & Eriksson, 2000). In this chapter, we address the acquisition patterns (language differentiation and interaction) for the phonetic variables involving vocal effort. The variables have been discussed in detail in Sections 2.1.2 and 2.1.4. Vocal effort has been measured acoustically as “spectral balance” as outlined in 3.6.3.3. We look at the three variables:

(1) vocal effort patterns for the SSE/MSR vowel /i/ compared across three postvocalic consonantal contexts triggering different vowel length in SSE, all in prominent positions (see Table 3-3 for the list of carrier words). The short SSE vowel is produced with more vocal effort than the long one to achieve sufficient prominence. To achieve language-specific results in vocal effort, the bilingual subjects should produce higher spectral balance values (less breathy laryngeal configuration) for the short vowel (before voiced or voiceless stops) than for the long vowel (before voiced fricatives) in SSE, and have a variable pattern in MSR.

(2) vocal effort differences between the SSE tense/lax vowels /i/ and /ɪ/ across all consonantal contexts in prominent positions (see discussion in Section 2.1.2.5). This potential difference between the tense/lax vowels is hypothesised to be due to differentiated laryngeal configuration adopted for the vowels (Stevens, 1998). If proved, the involvement of the laryngeal level should be seen as a separate phonetic dimension differentiating the tense/lax contrast in addition to vowel quality and duration. To achieve language-specific results in SSE, the bilingual subjects should produce substantially higher spectral balance values (less breathy laryngeal configuration involving more vocal effort) for the lax vowel than for the tense vowel.

(3) vocal effort patterns of the close rounded vowel /ʉ/ in SSE and /u/ in MSR compared across the three postvocalic consonantal contexts triggering SVLR in SSE (see Table 3-3 for the carrier words), all in prominent positions. The short SSE vowel is produced with more vocal effort than the long one to achieve sufficient prominence. The difference in Russian across the three consonantal contexts might not be systematic. To achieve language-specific results, the bilingual subjects should produce patterns very similar to /i/.

In order to present data on bilingual acquisition we need to create a reference for the exact patterns of vocal effort for these variables produced by the appropriate monolingual control groups. First of all, we perform a crosslinguistic comparison for the three research variables between the adult speakers of MSR (n=5), SSE (n=5) and SSBE (n=4). The comparisons are performed with a similar set up as in Chapter 5 dealing with vowel duration. The SSBE adult set serves as additional control for possible cross-varietal influences in the child data. Secondly, we present data on the SSE monolingual acquisition of vocal effort for each of the variables by the pre-school children (n=7 plus three longitudinal cases). Finally, we present the bilingual patterns of vocal effort. The structure of the tests is similar to that in Chapter 5 on vowel duration. Each bilingual child’s SSE speech is first compared to that of the SSE monolingual peers. Subsequently, we present a comparison between the MSR and SSE patterns for each of the subjects.
Additionally, we descriptively compare each subject’s Russian pattern to that of her mother and the investigator for the reasons outlined in Section 5.1.

6.2 Data Analysis

We present statistical analysis for all the normalisation methods of spectral balance around F2 to allow assessment of their coherence. The measures A2, A2*a, A2*b, A2*c (explained in Section 3.6.4.3) are all normalised for differences in overall intensity. The measures represent the RMS-power (dB) of the steady-state of each vowel measured around mean F2 of the vowel in a fixed frequency band of 600 Hz (see Table 3-7 for the definitions). In addition to that, A2*a normalised for formant frequency shifts within each of the targets /i u / across all speakers and languages. This measure is suitable for comparing vowels similar in vowel quality within or between languages (as within /i/). A2*b normalised for formant frequency shifts across the targets /i /, and across all the speakers and languages. A2*c normalised for formant frequency shifts across the targets / u /, and across all speakers and languages. The measures A2*b or A2*c are suitable for comparing vowels differing in vowel quality (such as between /i / or between /u /) within or between languages. Interspeaker normalisation was applied separately for children and adults for all the measures, since the two groups of speakers differed in their vocal tract size. We applied the same data selection criteria as in Section 5.2. The statistical analyses performed also had a set up similar to that in Chapter 5.

We hypothesised that the short SVLR context requires more vocal effort due to the conflict between the short SVLR-conditioning and durational lengthening required by the word-prosodic system. It is thus not an anticipatory effect of the following consonant. For this reason it would be useful to present the vocal effort measurements in their strength of association with vowel duration (rather than as a function of the following consonant).
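To make the kind of measure described in this section more concrete, the sketch below computes the power in a 600 Hz band centred on a given frequency and expresses it in dB relative to the total power of the frame. It is only an illustration under stated assumptions: the exact definitions of A2 and its variants are given in Section 3.6.4.3 and Table 3-7, which are not reproduced here, so normalising by total frame power, the naive DFT, the synthetic test signal and all function names are assumptions made for this example.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_level_db(samples, sample_rate, centre_hz, bandwidth_hz=600.0):
    """Power in a band around centre_hz, in dB relative to total frame power.

    Dividing by the total power is an assumed stand-in for the thesis's
    normalisation for overall intensity.
    """
    spectrum = power_spectrum(samples)
    bin_hz = sample_rate / len(samples)
    low = centre_hz - bandwidth_hz / 2.0
    high = centre_hz + bandwidth_hz / 2.0
    band = sum(p for k, p in enumerate(spectrum) if low <= k * bin_hz <= high)
    total = sum(spectrum)
    return 10.0 * math.log10(band / total)


# Synthetic 50 ms frame (hypothetical, not thesis data): a strong component
# at 500 Hz plus a component at 2000 Hz with one tenth of its amplitude.
SAMPLE_RATE = 8000
FRAME = [
    math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    + 0.1 * math.sin(2 * math.pi * 2000 * t / SAMPLE_RATE)
    for t in range(400)
]

LEVEL_DB = band_level_db(FRAME, SAMPLE_RATE, centre_hz=2000.0)
print(f"Band level around 2000 Hz: {LEVEL_DB:.2f} dB re total power")
```

With this convention a higher (less negative) value means relatively more energy around the chosen frequency, which is the direction the text associates with a less breathy laryngeal configuration and more vocal effort.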
However, exploratory data analysis suggested that it is only sensible to perform tests based on measures of statistical association for stringent and prosodically homogenous subsets of data. The variable and more spontaneous child datasets do not satisfy these criteria. Therefore, we chose measures of difference (ANOVAs) rather than of association (correlation analysis). However, we do present a bivariate correlation test in Section 6.3.1.1, based on one subset of the adult data for the vowel /i/, to exemplify our hypothesis.

6.3 Acquisition of Vocal Effort

6.3.1 A comparison of adult models

6.3.1.1 Unrounded vowel /i/

We examine the crosslinguistic differences in the implementation of the vocal effort pattern for the vowel /i/ compared between three postvocalic consonantal contexts triggering differential patterns of vowel duration in MSR, SSE and SSBE. The median values of three normalisation methods of RMS-power, i.e. A2, A2*a and A2*b (dB), of /i/ for each speaker were entered in a mixed design ANOVA with “LANGUAGE” (SSE, SSBE, MSR) as a between-subject factor and the “FOLLOWING CONSONANT” as a within-subject factor. The factor “FOLLOWING CONSONANT” had three levels: i.e. voiced fricative, voiced stop and voiceless stop. Since the crosslinguistic comparison involves the vowel /i/ with similar formant structure (see Table 3-8), the normalisation method A2*a of RMS-power is most suitable for this test.

The results of the ANOVA are presented in Table 6-1. There was a significant main effect of the following consonant on the measures of A2*a and A2*b for the vowel /i/, and A2 almost reached significance. This effect showed that overall the following consonant influences vocal effort applied for /i/. The factor “LANGUAGE” showed no significant main effect. However, there was a significant interaction between the factors “LANGUAGE” and “FOLLOWING CONSONANT” which showed that the direction of the contextual effect on vocal effort depends on the language.
Table 6-1 Summary of the ANOVA results for adult controls for the vocal effort measures in the vowel /i/.

Normalisation   Main effect:                  Main effect:   Interaction:
Method          Following Consonant           Language       Language * Following Consonant
A2              F(1.2,13.98)=4.152, p=.053    ns             F(2.5,13.98)=3.866; p<.05
A2*a            F(1.3,22)=4.152; p<.05        ns             F(2.5,22)=3.866; p<.05
A2*b            F(1.3,22)=6.277; p<.05        ns             F(2.5,22)=6.755; p<.01

The mean values of A2, A2*a and A2*b (dB) and standard deviations for /i/ per consonantal context and language averaged for all the speakers are summarised in Appendix L. The direction of the interaction between consonantal context and languages is shown in Figure 6-1. Russian monolingual speakers showed the opposite contextual effect compared to both SSE and SSBE. In the two English varieties, the context before voiced fricatives was produced with lower A2*a values (and accordingly less vocal effort) compared to that before voiced and voiceless stops. The difference between the two contexts for /i/ is on average 5.6 dB in SSE, and 4.2 dB in SSBE. This crosslinguistic difference between MSR and SSE is of importance, because it shows that bilingual children have to acquire a differentiated control of the underlying vocal effort applied to the vowel in addition to the durational differences due to the postvocalic conditioning of vowel duration.

[Figure 6-1: bar chart; y-axis: RMS-power around F2 (dB), -45 to 0; groups: SSE adult, MSR adult, SSBE adult; legend entries: voiced fricative, voiced stop, voiceless stop.]

Figure 6-1 Crosslinguistic effect on vocal effort (based on A2*a measure, dB) produced by adults for the vowel /i/ as a function of the following consonant.

The statistical results (in Table 6-1) across the three normalisation methods used for the analysis were somewhat different and yet consistent, since significant effects were obtained for the same factors and interactions. The intermeasure consistency was expected, given that the vowel /i/ is not much different in formant structure between the three languages.
Besides, the adult group was quite homogeneous and was recorded in studio conditions. Therefore, the difference in significance levels between the method A2 (normalising for the intra- and interspeaker differences in overall intensity) and A2*a or A2*b (normalising for both formant frequency shifts and overall intensity) can be explained by the intra- and interspeaker variation in vocal tract length and slight articulatory changes in the production of the vowel /i/.

Figure 6-2 shows the association between the method A2*a (dB, on the y-axis) inferring vocal effort and vowel duration (ms, on the x-axis) in SSE and MSR. In MSR (left panel), there was a highly significant positive correlation [r=.225, N=486, p<.01] between vocal effort and vowel duration. This means that the MSR speakers spent more vocal effort to produce vowels of longer duration. In SSE (right panel) there was a highly significant negative correlation [r=-.337; N=396, p<.01] between vocal effort and vowel duration, meaning that the highest vocal effort was spent to produce the short SVLR vowel /i/.

Figure 6-2 Correlation between the measure A2*a (dB) and vowel duration (ms) for MSR (left panel) and SSE (right panel) adult speakers.

Furthermore, the individual results for SSE and MSR adult speakers in Figure 6-3 show that the SSE speakers were consistent in producing less vocal effort for long /i/ (before voiced fricatives) and in producing more effort for the short vowels. As opposed to that, the Russian speakers were much less consistent in their pattern (compare R2, R3 and R5). As for vowel duration, for vocal effort there is a system in SSE, and a system is less obvious in MSR. This is the main crosslinguistic difference to keep in mind for the bilingual acquisition part of the study.

Figure 6-3 Individual results for SSE and MSR adults for the production of measure A2*a of vocal effort for the vowel /i/ as a function of the following consonant.
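The significance levels quoted for the two correlations can be checked from r and N alone. The sketch below uses the standard t transformation of a Pearson correlation with N - 2 degrees of freedom; it treats the tokens as independent observations, which is an assumption of this example (the thesis does not state how the test was computed), and the function name and the rounded critical value are illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, given Pearson r from n pairs (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Correlations between A2*a and vowel duration as reported in the text.
REPORTED = {"MSR": (0.225, 486), "SSE": (-0.337, 396)}

# Two-tailed critical value at alpha = .01. The value 2.6 is a slightly
# conservative stand-in for the exact t quantile at several hundred
# degrees of freedom.
T_CRIT_01 = 2.6

for language, (r, n) in REPORTED.items():
    t = t_from_r(r, n)
    significant = abs(t) > T_CRIT_01
    print(f"{language}: r = {r:+.3f}, N = {n}, t({n - 2}) = {t:.2f}, p < .01: {significant}")
```

Both statistics lie well beyond the critical value, with opposite signs, in line with the positive association in MSR and the negative one in SSE.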
6.3.1.2 Vowel /i/ compared to /ɪ/

Stevens (1998, p.297) pointed out that there are more than just segmental differences to the tense/lax contrast. There are also differences in the laryngeal configuration involved for non-low vowels. This implies that the amplitude of the spectrum above F1 tends to be higher for the lax American English vowels. The more breathy laryngeal configuration for tense vowels reduces spectrum intensity in midfrequencies (meaning that less vocal effort is spent to produce tense vowels), whereas less breathy laryngeal configuration for the lax vowels enhances the intensity of midfrequencies. Jessen (2002) found such an acoustic correlate for the German tense/lax contrast and attributed it to “syllable-cut” (phonotactic “free” versus “checked”) differences between the vowels. It is the aim of this section to investigate whether the same change in laryngeal configuration (and vocal effort) applies to the SSE and SSBE tense/lax contrast for /i ɪ/. No crosslinguistic comparison is drawn to Russian, since the language does not feature the contrast. A background comparison to SSBE is interesting because the SSBE tense/lax contrast is of greater importance in the number of vowel pairs involved compared to SSE.

The median values of the measures A2, A2*a, A2*b (dB) of /i/ and /ɪ/ for each speaker were entered in a mixed design ANOVA with “LANGUAGE” (SSE and SSBE) as a between-subject factor and the “TENSE/LAX VOWEL” as a within-subject factor. The vowels /i/ and /ɪ/ differ in formant structure, therefore A2*b normalisation for formant frequency shifts across the two vowels was most suitable for this test. No consonantal context effects were taken into account (see the list of carriers in Table 3-3), and all median values were calculated across the consonantal contexts.

The result of the ANOVA is summarised in Table 6-2. There were no significant main effects and no significant interactions for the measures of A2 and A2*a.
However, there was a highly significant main effect of the vowel tenseness/laxness on the acoustic parameter A2*b which was based on the normalisation for formant frequency shifts performed across the two vowels. The direction of the differences between the vowels in the two English varieties is shown in Figure 6-4. There were no other significant main effects or interactions. The result confirms Stevens’ (1998, p.297) point above, as well as replicating Jessen’s (2002) finding of laryngeal correlates of the tense/lax contrast for the German vowels /i/ and /ɪ/. This means that apart from the vowel quality differences adult speakers in SSE and SSBE produced similar laryngeal changes: i.e. they adopted a less breathy laryngeal configuration (spent more vocal effort) for the lax vowel /ɪ/, as opposed to a more breathy configuration for the tense vowel /i/.

Table 6-2 Summary of the ANOVA results for the three normalisation methods of vocal effort for the tense/lax vowel pair /i ɪ/ in adult SSE/SSBE speakers.

Normalisation   Main effect:            Main effect:   Interaction:
Method          Tense/lax vowel         Language       Language * tense/lax
A2              ns                      ns             ns
A2*a            ns                      ns             ns
A2*b            F(1,7)=52.335, p<.01    ns             ns

[Figure 6-4: bar chart; y-axis: RMS-power around F2 (dB), -40 to 0; groups: SSE, SSBE; legend entries: tense, lax.]

Figure 6-4 Differences between vocal effort spent (based on mean A2*b, dB) to produce lax vowel /ɪ/ and tense vowel /i/ for 5 SSE and 4 SSBE adult speakers.

Table 6-3 SSE and SSBE adult means and standard deviations for three normalisation methods of vocal effort for the vowels /i/ versus /ɪ/.

Normalisation method   Vowel   Language   Mean (dB)   Std. Deviation   n of subjects
A2                     /i/     SSE        -24.60      1.71             5
                               SSBE       -23.96      2.69             4
                               Total      -24.32      2.07             9
                       /ɪ/     SSE        -23.11      4.82             5
                               SSBE       -19.04      3.62             4
                               Total      -21.30      4.59             9
A2*a                   /i/     SSE        -25.07      1.85             5
                               SSBE       -24.29      3.55             4
                               Total      -24.72      2.57             9
                       /ɪ/     SSE        -23.00      4.02             5
                               SSBE       -19.15      2.77             4
                               Total      -21.29      3.88             9
A2*b                   /i/     SSE        -28.83      1.85             5
                               SSBE       -26.70      3.29             4
                               Total      -27.88      2.65             9
                       /ɪ/     SSE        -15.97      4.30             5
                               SSBE       -13.11      2.39             4
                               Total      -14.70      3.70             9

Whether this change of laryngeal configuration is an intrinsic property of the tense/lax contrast (Stevens, 1998, p.297) or is due to language phonotactics and prosody (‘syllable-cut’) (Jessen, 2002) or has any other reason, monolingual and bilingual children have to acquire the segmental differences between tense/lax vowels and also the appropriate laryngeal configuration accompanying the contrast.

6.3.1.3 Rounded vowels

We examine the crosslinguistic differences in the implementation of vocal effort for the rounded vowels /u / compared between three postvocalic consonantal contexts triggering differential patterns of vowel duration in MSR, SSE and SSBE. Since similar postvocalic consonantal conditioning applies to this set of vowels as for /i/, we expect to find similar crosslinguistic results as for the vowel /i/. The set up of the ANOVA was the same as for the vowel /i/ in Section 6.3.1.1. The methods involved were A2, A2*a, A2*c. Since the vowels /u / are different in formant structure, the A2*c method is most relevant in this test.

The results of the ANOVA are presented in Table 6-4. There was a significant main effect of the “FOLLOWING CONSONANT” on A2 and A2*c, and an almost significant effect for the measure of A2*a. This shows that overall the following consonant plays a significant role in the production of vocal effort for the close rounded vowel. For the measure A2*a there was no significant main effect of “LANGUAGE”.
However, there was a highly significant interaction between the factors “LANGUAGE” and “FOLLOWING CONSONANT” for all three normalisation methods, showing that the direction of the contextual effect depends on the language (Figure 6-5). Unlike /i/, there was also a highly significant main effect of the factor “LANGUAGE” on the normalisation methods of A2 and A2*c (see Table 6-4). This effect is most plausibly due to the absolute difference in intensity levels between MSR and the other two languages (see Figure 6-5 and Figure 6-6) and might be a methodological side effect of comparing vowels crosslinguistically different in formant structure (see Table 3-8 comparing formants). There were no other significant main effects or interactions. The mean values for A2, A2*a, A2*c (dB) and standard deviations for the close rounded vowels per consonantal context and language averaged for all the speakers are summarised in Appendix M. The direction of the interaction is shown in Figure 6-5. The figure shows that the crosslinguistic pattern between MSR and SSE was very similar to that of /i/, i.e. the effect seems to be the opposite. In SSE, the speakers spent less effort in producing /ʉ/ before voiced fricatives and stops compared to that before voiceless stops. There was also a substantial cross-varietal difference between SSE and SSBE vowels in the vocal effort based on A2*c between the contexts of voiced fricatives and voiceless stops. This is not surprising, since in SSE the vowel /ʉ/ in the two contexts differs only in duration and the consonant following the vowel, while in SSBE there is an additional tense/lax /u / vowel contrast. The overall difference in A2*c between the context before voiced fricatives and that before voiceless stops is 6.52 dB in SSE (a difference similar to that for the SSE /i/), as opposed to the more substantial 22.3 dB in SSBE. Thus the tense/lax difference in A2*c does explain the extent of the difference in SSBE compared to SSE.
However, since neither vowel quality differences nor “syllable-cut” bounding are involved in the SSE /ʉ/ (the vowel is “free”: it can occur in an open syllable without coda), the differences in the adjustments of the laryngeal configuration must be due to some other reason, such as prominence cueing.

Table 6-4 Summary of the ANOVA results for the three normalisation methods of vocal effort for the close rounded vowels in adults.

Normalisation   Main effect:             Main effect:             Interaction:
Method          Following Consonant      Language                 Language * Following Consonant
A2              F(2,22)=4.597; p<.05     F(2,11)=5.963, p<.01     F(4,22)=6.245; p<.01
A2*a            F(2,22)=3.346; p=.054    ns                       F(4,22)=4.542; p<.01
A2*c            F(2,22)=15.567; p<.01    F(2,11)=38.246, p<.01    F(4,22)=13.739; p<.01

[Figure 6-5: bar chart; y-axis: RMS-power around F2 (dB), -45 to 0; groups: SSE, MSR, SSBE; legend entries: voiced fricative, voiced stop, voiceless stop.]

Figure 6-5 Crosslinguistic effect on vocal effort (based on mean A2*c, dB) in the adult production of close rounded vowels as a function of the following consonant.

As for /i/, individual results for the SSE and MSR adult speakers in Figure 6-6 show that the SSE speakers were consistent in producing less vocal effort for the long /ʉ/ (before voiced fricatives) and in producing more effort for the short ones before voiceless stops. However, the results for the context before voiced stops were less consistent than for /i/. As opposed to SSE, the MSR speakers were much less consistent in their patterns (compare R2, R1 and R5). Once again, there is a system in vocal effort for the SSE vowel /ʉ/, and a system is lacking in MSR /u/. This is the main crosslinguistic difference to keep in mind for the bilingual acquisition part of the study.

Figure 6-6 Individual results for SSE and MSR adults for the production of vocal effort (based on median A2*c, dB) for the close rounded vowels as a function of the following consonant.
6.3.1.4 Summary of results for monolingual adults

The results of between-language analysis of variance showed that there were significant differences in the systematicity of changes of laryngeal configuration between MSR, SSE and SSBE for the vowel /i/. In the two English varieties, the context before voiced fricatives (long vowel) was produced with lower A2*a values (and accordingly less vocal effort) than the short vowel /i/ before voiceless stops. The difference between the two contexts is on average 5.6 dB in SSE, and 4.2 dB in SSBE. The context before voiced stops was usually produced with intermediate values of A2*a between the other two contexts in SSE, and was somewhat lower than the context before voiced fricatives in SSBE. All adults consistently showed these patterns. A very similar vocal effort pattern was found for the vowel /ʉ/, which in SSE features the same SVLR conditioning as for the vowel /i/.

For these two vowels, the average results for MSR showed a pattern of vocal effort opposite to SSE. Similarly to vowel duration patterns, individual MSR speakers varied and deviated from this trend in several instances for the vowel /i/, and showed substantial variation for the rounded vowel /u/. This shows that vocal effort in MSR is not connected to the postvocalic conditioning system, and probably serves exclusively for the purposes of increasing prominence or phonatory loudness. In fact for the MSR /i/ we observed a positive correlation between vocal effort and vowel duration: i.e. the increase in duration was associated with increasing vocal effort, while in SSE the opposite pattern was observed. Similarly to crosslinguistic duration patterns, the main difference between Russian and SSE is that SSE features a fine-grained system of laryngeal contrasts depending on the duration of the vowel, while MSR seems to lack a system, since the adults produced very variable results.
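To give a feel for the size of the differences summarised here, a level difference in dB can be converted into a ratio of powers. The sketch below assumes the conventional definition of a power level, 10 * log10(power ratio); whether the thesis's RMS-power values follow exactly this convention is an assumption of the example, and the function name is illustrative.

```python
def db_to_power_ratio(delta_db):
    """Convert a level difference in dB to the corresponding power ratio."""
    return 10.0 ** (delta_db / 10.0)


# Average A2*a differences for /i/ between the contexts before voiceless
# stops and before voiced fricatives, as reported in the text.
REPORTED_DIFFERENCES_DB = {"SSE": 5.6, "SSBE": 4.2}

for variety, delta in REPORTED_DIFFERENCES_DB.items():
    ratio = db_to_power_ratio(delta)
    print(f"{variety}: {delta} dB corresponds to a power ratio of about {ratio:.2f}")
```

Under this convention the short vowel carries roughly three to four times the band power of the long vowel in SSE, and somewhat less than three times in SSBE.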
There are some issues that we would like to address in the discussion of these results. One of them is whether the observed systematicity in the vocal effort for /i/ and /u ʉ/ should be attributed to segmental influences from the consonantal contexts (which admittedly differed in this study) or to the systems of prominence and their acoustic correlates involved.

With regard to the laryngeal contrast between the tense and lax vowels /i ɪ/ we showed that adult speakers of both SSE and SSBE produced similar laryngeal changes: i.e. they adopted a less breathy laryngeal configuration (spent more vocal effort) to produce the lax vowel /ɪ/, as opposed to a more breathy configuration (involving less vocal effort) for the tense vowel /i/. This result replicates Jessen’s (2002) findings for laryngeal correlates of the German tense/lax contrast. A similar laryngeal adjustment is found for the SSBE tense/lax pair /u/ and /ʊ/ in Section 6.3.1.3. For SSE monolingual and SSE/Russian bilingual acquisition this means that appropriate laryngeal configuration changes should be acquired alongside language-specific vowel quality differences and the system of duration.

6.3.2 SSE monolingual children

6.3.2.1 Vowel /i/

6.3.2.1.1 Group results

We investigated whether the SSE monolingual children acquired the adult-like fine-grained differences in vocal effort of the SVLR vowel /i/ before voiced fricatives and voiced and voiceless stops. In adult speech, long vowels (before voiced fricatives) were produced with less vocal effort in prominent positions, while the short ones were produced with a relatively less breathy laryngeal configuration (boosting RMS-power levels in mid-frequencies).
The intensity levels are represented by the values A2 (normalised for the overall intensity differences), A2*a (normalised for the overall intensity and formant frequency shifts within the target /i/ across speakers and languages), and A2*b (normalised for the overall intensity and formant frequency shifts across the targets /i ɪ/ and across child speakers and languages).

To address the SSE monolingual acquisition (for the ages 3;4 to 4;9) we entered the median values of A2, A2*a and A2*b (dB) as dependent variables in a mixed design ANOVA. The between-subject factor “AGE” had four levels: i.e. adult, child aged 3;4 to 3;11, child aged 4;0 to 4;4, child aged 4;5 to 4;9. The within-subject factor “FOLLOWING CONSONANT” had three levels: i.e. voiced fricative, voiced stop and voiceless stop. The results of the ANOVA are presented in Table 6-5.

The results showed a highly significant main effect of the factor “FOLLOWING CONSONANT” irrespective of the other factors. There were no other significant main effects or interactions. All three normalisation methods showed the same level of significance. The direction of the main effect is plotted for the four age groups in Figure 6-7. The descriptive statistics for each of the groups are reported in Appendix N.

Table 6-5 Summary of the ANOVA results for the three normalisation methods of vocal effort of the vowel /i/ in four SSE monolingual age groups.

Normalisation Method   Main effect: Following Consonant   Main effect: Age   Interaction: Age * Following Consonant
A2                     F(2,22)=10.777; p<.01              ns                 ns
A2*a                   F(2,22)=18.758; p<.01              ns                 ns
A2*b                   F(2,22)=18.757; p<.01              ns                 ns

[Figure 6-7 plot: y-axis A2*a (dB) of /i/; x-axis following consonant (voiced fricative, voiced stop, voiceless stop); series: adults, child 3;4 to 3;11, child 4;0 to 4;4, child 4;5 to 4;9.]

Figure 6-7 Context dependent vocal effort pattern (based on mean A2*a, dB) for the vowel /i/ produced by the SSE adults compared to three groups of children aged 3;4 to 4;9.
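The aggregation behind the group comparisons (one median per speaker and context entered as the dependent variable, and group means of these medians plotted) can be sketched as follows. This is a minimal illustration with invented token values and hypothetical names; it is not the analysis script used in the study.

```python
from statistics import mean, median


def group_means_of_medians(tokens):
    """Collapse token-level values to one median per speaker and context,
    then average those medians within each age group.

    tokens: iterable of (group, speaker, context, value_db) tuples.
    Returns a dict mapping (group, context) to the mean of speaker medians.
    """
    per_speaker = {}
    for group, speaker, context, value_db in tokens:
        per_speaker.setdefault((group, speaker, context), []).append(value_db)

    per_group = {}
    for (group, _speaker, context), values in per_speaker.items():
        per_group.setdefault((group, context), []).append(median(values))

    return {key: mean(medians) for key, medians in per_group.items()}


# Invented A2*a values (dB), for illustration only.
example_tokens = [
    ("adult", "A1", "voiced fricative", -30.0),
    ("adult", "A1", "voiced fricative", -28.0),
    ("adult", "A1", "voiced fricative", -20.0),
    ("adult", "A2", "voiced fricative", -26.0),
    ("adult", "A1", "voiceless stop", -23.0),
    ("adult", "A2", "voiceless stop", -21.0),
]

for key, value in sorted(group_means_of_medians(example_tokens).items()):
    print(key, value)
```

Using the median within a speaker limits the influence of single outlying tokens (here the -20.0 dB token of speaker A1), while the group mean gives each speaker equal weight.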
The result showed that by the age of 3;4 the SSE monolingual children had acquired the same fine-grained difference in producing vocal effort for /i/ in different consonantal contexts as the SSE adults: the short vowel /i/ in prominent positions was produced with an adjustment of laryngeal configuration towards a less breathy phonation resulting from an increase in vocal effort, while the long vowel before voiced fricatives was produced with a more breathy laryngeal configuration. The result shows that, despite non-linear physiological differences, the SSE children had acquired the same vocal effort pattern as the SSE adults already at the age of 3;4.

6.3.2.1.2 Individual results

Individual results of the children for the context dependent pattern of vocal effort in /i/ are shown in Figure 6-8. The individual results are plotted on the x-axis. The patterns differed in extent, but like the SSE adults in Figure 6-3, the children produced a very consistent direction of the contextual differences in vocal effort.

[Figure 6-8 plot: SSE children; y-axis median A2*a (dB); x-axis individual children labelled by subject and age (from C3 at 3;4 to C9 at 4;9); series: voiced fricative, voiced stop, voiceless stop.]

Figure 6-8 Individual SSE child results of vocal effort (based on median A2*a, dB) for the vowel /i/ as a function of the following consonant.

6.3.2.2 Vowel /i/ compared to /ɪ/

6.3.2.2.1 Group results

In Section 6.3.1.2 we showed that the adult speakers of both SSE and SSBE produced similar laryngeal differences between tense and lax vowels: i.e. they adopted a less breathy laryngeal configuration (spent more vocal effort) for the lax vowel /ɪ/, as opposed to a more breathy configuration for the tense vowel /i/. This section investigates whether the SSE monolingual children acquired a similar difference in the vocal effort patterns for the tense and lax vowels /i/ and /ɪ/ as the adults. We report the results for the three normalisation methods A2, A2*a, A2*b.
However, since we compare two vowels that differ in formant structure, the method A2*b is most suitable for this test. We entered the median values of A2, A2*a and A2*b (dB) as dependent variables in a mixed design ANOVA. The between-subject factor “AGE” had four levels: i.e. adult, child aged 3;4 to 3;11, child aged 4;0 to 4;4, child aged 4;5 to 4;9. The within-subject factor “TENSE/LAX VOWEL” had two levels: i.e. /i/ and /ɪ/.

The result of the ANOVA is reported in Table 6-6. The test showed a highly significant main effect of the factor “TENSE/LAX VOWEL”. All the normalisation methods showed the same level of significance. There was no significant main effect of “AGE”. The direction of the main effect is plotted for the four age groups in Figure 6-9. The descriptive statistics for each of the groups are reported in Appendix O.

Table 6-6 Summary of the ANOVA results for the three normalisation methods of vocal effort of the vowels /i/ versus /ɪ/ produced by four SSE monolingual age groups.

Normalisation Method   Main effect: Tense/Lax Vowel   Main effect: Age   Interaction: Age * Tense/Lax Vowel
A2                     F(1,11)=33.778; p<.01          ns                 F(3,11)=3.717; p<.05
A2*a                   F(1,11)=47.244; p<.01          ns                 F(3,11)=5.658; p<.05
A2*b                   F(1,11)=163.763; p<.01         ns                 F(3,11)=4.465; p<.05

[Figure 6-9 plot: y-axis mean A2*b (dB); x-axis tense/lax vowel (/i/, /ɪ/); series: adult, child 3;4 to 3;11, child 4;0 to 4;4, child 4;5 to 4;9.]

Figure 6-9 Vowel dependent vocal effort (based on mean A2*b, dB) for the vowels /i/ versus /ɪ/ in SSE adults compared to three groups of children aged 3;4 to 4;9.

Figure 6-9 shows that the SSE speakers of all age groups produced a very similar difference in the A2*b measure between the tense /i/ and lax /ɪ/. This result means that the SSE speakers of all ages adjusted their laryngeal configuration towards a less breathy phonation for the lax vowel compared to the tense one, and that this difference was highly significant.
However, there was also a significant interaction (for all three normalisation methods) between the factors “AGE” and “TENSE/LAX VOWEL”. This interaction suggests that the extent of the vocal effort difference between tense and lax vowels depended on the factor “AGE”. We consider the individual child-by-child results to determine whether this interaction showed a linear age pattern or was due to individual variation among the subjects. There were no other significant main effects or interactions.

6.3.2.2.2 Individual results

Figure 6-10 plots the individual results of all the SSE children (by age on the x-axis) on the acquisition of the difference in vocal effort between the tense and lax vowel pair /i/ and /ɪ/. The individual results show that there was no age-dependent pattern in the difference between the tense and lax vowels for the cross-section of the children concerned: i.e. the vocal effort difference between the two vowels does not seem to increase (or decrease) as a function of age. It is likely that the significant interaction between the factors “AGE” and “TENSE/LAX VOWEL” observed in the previous section is due to individual variation among the children contributing to the three age groups. However, as with the adults, the direction of the observed pattern is consistent throughout the individual results.

Therefore, we can conclude that the SSE monolingual children acquired the laryngeal distinction between the tense and lax vowels in addition to the differences in vowel quality and duration: i.e. similarly to the adults they produced a more breathy laryngeal configuration for the tense /i/, and a less breathy configuration for the lax counterpart. The laryngeal adjustment resulted from applying different vocal effort patterns for the tense and lax vowels.
[Figure 6-10 plot: SSE children; y-axis median A2*b (dB); x-axis individual children labelled by subject and age (from C3 at 3;4 to C9 at 4;9); series: tense, lax.]

Figure 6-10 Individual results for SSE children for the vocal effort differences (based on median A2*b, dB) between the tense/lax vowels /i ɪ/.

6.3.2.3 Vowel /ʉ/

6.3.2.3.1 Group results

We investigated whether the SSE monolingual children acquired adult-like differences in the vocal effort pattern for the SVLR vowel /ʉ/ in the contexts before voiced fricatives and voiced and voiceless stops. The expected differences were similar to those for the vowel /i/: i.e. the long vowels (before voiced fricatives) should be produced with a more breathy laryngeal configuration, the short ones with a less breathy configuration.

To address the SSE monolingual acquisition we entered the median values of A2, A2*a and A2*c (dB) as dependent variables in a mixed design ANOVA. The set up of the test was the same as that in Section 6.3.2.1.1. We report on three normalisation methods A2, A2*a and A2*c, but expect the method A2*a to be most suitable for this test.

The results of the ANOVA are presented in Table 6-7. There were no significant main effects or interactions for the measure of A2. For the measures A2*a and A2*c, the test showed a highly significant main effect of the factor “FOLLOWING CONSONANT”. The two normalisation methods showed the same level of significance. There was a significant main effect of the factor “AGE” for the measure A2*a. The direction of the main effects for the measure A2*a is plotted per consonantal context for the four age groups in Figure 6-11. The descriptive statistics for all three normalisation methods for each of the SSE age groups are reported in Appendix P.

Table 6-7 Summary of the ANOVA results for the three normalisation methods of vocal effort for the vowel /ʉ/ in four SSE monolingual age groups.
Normalisation Method   Main effect: Following Consonant   Main effect: Age        Interaction: Age * Following Consonant
A2                     ns                                 ns                      ns
A2*a                   F(2,22)=10.415; p<.01              F(1,11)=4.231; p<.05    ns
A2*c                   F(2,22)=10.415; p<.01              ns                      ns

[Figure 6-11 plot: y-axis mean A2*a (dB); x-axis following consonant (voiced fricative, voiced stop, voiceless stop); series: adult, child 3;4 to 3;11, child 4;0 to 4;4, child 4;5 to 4;9.]

Figure 6-11 Context dependent vocal effort pattern (based on mean A2*a, dB) for the vowel /ʉ/ in the SSE adults compared to three groups of children aged 3;4 to 4;9.

There were no other significant main effects or interactions. The direction of the main effect of the “FOLLOWING CONSONANT” is similar in all age groups, and it is fairly consistent with the results of the acquisition of this pattern for the vowel /i/. The result suggests that the SSE monolingual children acquired a similar fine-grained control of laryngeal adjustments resulting in differentiated RMS-power levels around F2 of the vowel /ʉ/: i.e. the children adopted a more breathy laryngeal configuration for the long vowel /ʉ/ before voiced fricatives, and spent more vocal effort to produce short vowels before voiceless stops.

We ran Tukey HSD post-hoc tests to determine which groups contributed to the significant main effect of the factor “AGE” for the measure A2*a. The results indicated that the age differences were non-linear and that there was a significant (p<.05) difference only between the adults and the children aged 4;0 to 4;4.

6.3.2.3.2 Individual results

Individual results for the SSE children are presented in Figure 6-12.

[Figure 6-12 plot: SSE children; y-axis median A2*a (dB); x-axis individual children labelled by subject and age (from C3 at 3;4 to C9 at 4;9); series: voiced fricative, voiced stop, voiceless stop.]

Figure 6-12 Individual SSE child results of vocal effort (based on median A2*a, dB) for the vowel /ʉ/ as a function of the following consonant.
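Whether an individual speaker follows the adult direction of the contextual effect (higher A2*a values before voiceless stops than before voiced fricatives) can be checked with a simple difference. The sketch below assumes that each speaker is summarised by one median per context; the speaker labels and dB values are invented for illustration and are not data from this study.

```python
def vls_minus_vf(medians_db):
    """Difference (dB) between the voiceless-stop and voiced-fricative contexts."""
    return medians_db["voiceless stop"] - medians_db["voiced fricative"]


def is_adult_like(medians_db):
    """True if more vocal effort (a higher dB value) is found before voiceless stops."""
    return vls_minus_vf(medians_db) > 0


# Hypothetical per-speaker medians of A2*a (dB).
speakers = {
    "child_A": {"voiced fricative": -28.0, "voiced stop": -26.5, "voiceless stop": -24.0},
    "child_B": {"voiced fricative": -22.0, "voiced stop": -25.0, "voiceless stop": -23.5},
}

for name, medians in speakers.items():
    label = "adult-like" if is_adult_like(medians) else "not adult-like"
    print(f"{name}: VLS - VF = {vls_minus_vf(medians):+.1f} dB ({label})")
```

The check deliberately ignores the voiced-stop context, which was the least consistent context in the adult data as well.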
The individual results in Figure 6-12 show somewhat less consistent patterns than those for the vowel /i/. Subject C3 did not produce an adult-like pattern at either 3;4 or 3;11, and neither did C6 at 4;0. However, all other subjects did produce an adult-like pattern. The lesser consistency for the vowel /ʉ/ could be due to several reasons. First of all, in the SSE children the vowel /ʉ/ was less adult-like in quality than /i/. Secondly, the children differed qualitatively from the SSE adults in producing a broader phonetic range of vowel qualities. Thirdly, we did not provide a joint normalisation for the formant frequency shifts across adults and children. Therefore, a part of the differences in consistency (as well as the age effect) might be due to differences in the methodology used. However, despite all these potential reasons for the lesser consistency and for the age effect observed for the A2*a measure, the child results are largely in agreement with the acquisition pattern for the vowel /i/ and with the SSE adult results.

6.3.2.4 Summary of results for the SSE monolingual children

We investigated whether the SSE children acquired the same fine-grained context-dependent differences in vocal effort as the SSE adults for the SVLR vowels /i/ and /ʉ/, and the differences in laryngeal configuration between tense/lax /i/ and /ɪ/. The results showed that, despite non-linear physiological differences of the respiratory and laryngeal systems between children and adults, the SSE children produced the patterns of vocal effort in a way similar to the adults at the age of 3;4 for all vowels concerned in this study. They systematically produced the short SVLR vowel /i/ (before voiceless stops) with an adjustment of laryngeal configuration towards a less breathy phonation, resulting in 3-4 dB higher intensity levels compared to the long /i/ (before voiced fricatives). A similar fine-grained phonetic system was acquired for the close rounded vowel /ʉ/.
However, the individual patterns of the children for /ʉ/ were more variable than those for the vowel /i/, since in three out of ten cases the children did not reproduce the adult pattern (while all of them produced it for /i/).

The children acquired a substantial laryngeal distinction between the tense and lax vowel in addition to the difference in vowel quality and duration. Like the adults, the SSE monolingual children produced a more breathy laryngeal configuration for the tense vowel /i/, and a less breathy configuration (involving more vocal effort) for the lax counterpart. The less breathy configuration in the child data was reflected in a substantial boost of intensities (RMS-power) by 16 to 27 dB in the acoustic spectrum depending on age, compared to the average of 13 dB produced by the adults.

The result is significant because it shows for the first time that pre-school children also perform fine-grained speech motor control in varying vocal effort in a way similar to adults, in addition to the increases in phonatory loudness shown in previous studies (Stathopoulos & Sapienza, 1993; Stathopoulos, 1995; Traunmüller & Eriksson, 2000), and that this fine-grained speech motor control at the laryngeal level of speech production was used for linguistic tasks.

6.3.3 Bilingual Acquisition

6.3.3.1 Subject AN

6.3.3.1.1 SSE /i/

We assess whether the bilingual subject AN acquired the fine-grained differences in vocal effort in the SVLR vowel /i/ before voiced fricatives and voiced and voiceless stops in a way similar to the SSE monolingual peers. The set up of the ANOVA was the same as in Section 5.3.3.1.1. The dependent variables for vocal effort were represented by the median values of the normalisation methods A2, A2*a, A2*b (dB). Since we assessed differences in vocal effort within the target /i/, the measure of A2*a is most suitable for this test. The results of the ANOVA are summarised in Table 6-8.
The test showed a highly significant main effect of the “FOLLOWING CONSONANT” irrespective of the other factors for all three normalisation methods. There were no other significant main effects or interactions. This result shows that the bilingual subject AN acquired the system of vocal effort for the vowel /i/ in a way similar to the SSE monolingual peers.

Table 6-8 Summary of the ANOVA results for the three normalisation methods of vocal effort for the SSE vowel /i/ produced by the bilingual subject AN as compared to the SSE monolingual peers.

Normalisation Method   Main effect: Following Consonant   Main effect: Age   Main effect: Bilinguality
A2                     F(2,14)=17.715; p<.01              ns                 ns
A2*a                   F(2,14)=18.426; p<.01              ns                 ns
A2*b                   F(2,14)=18.425; p<.01              ns                 ns

[Figure 6-13 plot: three panels (SSE child 3;4 to 3;11 and AN 3;8; SSE child 4;0 to 4;4 and AN 4;2; SSE child 4;5 to 4;9 and AN 4;5); y-axis A2*a (dB); x-axis following consonant (voiced fricative, voiced stop, voiceless stop).]

Figure 6-13 Vocal effort for the vowel /i/ (based on A2*a, dB) as a function of the following consonant produced by the subject AN as compared to the SSE monolingual peers in three age samples.[15]

The direction of the main context-dependent effect on vocal effort in /i/ is shown in Figure 6-13. The descriptive statistics for AN’s production are reported in Appendix Q. Similarly to the SSE adults and children, on average AN spent less vocal effort to produce the long /i/ before voiced fricatives than the short ones before voiced and voiceless stops. The results are consistent with AN’s language differentiation patterns for the vowel quality. There seem to be no language interaction effects for this variable, as opposed to AN’s vowel duration pattern for this vowel at the age of 3;8. Recall that for the postvocalic conditioning of the duration of /i/ at the age of 3;8 AN produced a reduced range of VLS/VF ratio compared to the monolingual peers.
For the vocal effort pattern at different ages she produced VLS/VF ratios (based on A2*a, dB) of 3, 7 and 5 dB, similar to the monolingual 3, 4 and 4 dB.

[15] SSE children’s group values are means of individual children’s median values in this Figure and in all subsequent Figures comparing SSE of bilingual and monolingual children, while the bilingual child’s results are represented by the median value.

6.3.3.1.2 SSE /i/ compared to /ɪ/

In Section 6.3.2.2 we showed that in addition to the vowel quality and duration differences, the SSE monolingual children also acquired a laryngeal contrast specific to the tense/lax vowels /i/ and /ɪ/. The contrast involved producing a less breathy laryngeal configuration for the lax /ɪ/ and a more breathy configuration for the tense /i/. In this section we assess whether AN acquired this contrast in a way similar to the SSE monolingual peers.

The ANOVA set up was similar to that in Section 6.3.2.2.1. There was an additional between-subject factor “BILINGUALITY” with two levels: i.e. “bilingual” and “monolingual”. The factor “AGE” had three levels: i.e. “3;4 to 3;11”, “4;0 to 4;4”, “4;5 to 4;9”. The normalisation A2*b is most suitable for this test, since it involves a comparison of two vowels different in quality. The results of the ANOVA are summarised in Table 6-9. The descriptive statistics of AN’s production are reported in Appendix R. The test showed a highly significant main effect of “TENSE/LAX VOWEL”. There were no other significant main effects.

Table 6-9 Summary of the ANOVA results for the three normalisation methods of vocal effort for the SSE vowels /i/ and /ɪ/ produced by the bilingual subject AN compared to the SSE monolingual peers.
Normalisation Method   Main effect: Tense/lax vowel   Main effect: Age   Main effect: Bilinguality   Interaction: Tense/lax vowel * Age
A2                     F(1,7)=55.076; p<.01           ns                 ns                          F(2,7)=7.744; p<.05
A2*a                   F(1,7)=39.735; p<.01           ns                 ns                          ns
A2*b                   F(1,7)=122.503; p<.01          ns                 ns                          F(2,7)=4.908; p<.05

There was, however, a significant interaction between the factors “TENSE/LAX VOWEL” and “AGE”. The direction of the main effect and of the interaction is shown in Figure 6-14. AN produced a consistent tense/lax pattern that was very similar to the pattern of the SSE monolingual peers in all age samples. AN’s tense-lax ratios were 26.2 dB at the age of 3;8, 9.39 dB at the age of 4;2 and 21.6 dB at the age of 4;5. The interaction of the laryngeal contrast in the “TENSE/LAX VOWEL” with “AGE” is due to the sample at the age of 4;2. The interaction with age is not linear in time, and thus does not seem to be age-related. It shows that the laryngeal pattern is acquired: it is consistent in direction but can vary in its extent. There were no other significant interactions.

[Figure 6-14 plot: three panels (SSE child 3;4 to 3;11 and AN 3;8; SSE child 4;0 to 4;4 and AN 4;2; SSE child 4;5 to 4;9 and AN 4;5); y-axis A2*b (dB); x-axis vowel (/i/, /ɪ/).]

Figure 6-14 Vocal effort applied to /i/ and /ɪ/ (based on A2*b, dB, across all consonantal contexts) produced by the bilingual subject AN and by the SSE monolingual peers of three age groups.

Once again, AN produced a consistent language-specific pattern for SSE comparable to that of the monolingual peers. The similarity of AN’s speech production in this test is in line with her acquisition of vowel quality and vowel duration for the tense/lax contrast. It is also in line with the quantity of SSE input that she received in the community. This research variable does not seem to show any language interaction effects.
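The statement that the interaction with age is not linear in time can be illustrated with AN’s three tense-lax values reported above (26.2, 9.39 and 21.6 dB). The following sketch simply tests whether a sequence of values ordered by age changes in one direction only; the function name is hypothetical and the check is an illustration, not a statistical test.

```python
def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    never_decreases = all(a <= b for a, b in pairs)
    never_increases = all(a >= b for a, b in pairs)
    return never_decreases or never_increases


# AN's tense-lax differences (dB) at the three age samples, as reported in the text.
an_tense_lax_db = {"3;8": 26.2, "4;2": 9.39, "4;5": 21.6}

values_by_age = list(an_tense_lax_db.values())
print("Monotonic over age:", is_monotonic(values_by_age))
print("All differences positive:", all(v > 0 for v in values_by_age))
```

The values drop at 4;2 and rise again at 4;5, so the extent of the contrast does not develop in one direction, while its sign stays the same in all three samples.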
6.3.3.1.3 SSE /ʉ/

We assess whether AN acquired the patterns of vocal effort for the vowel /ʉ/ before voiced fricatives and voiced and voiceless stops in a way similar to the SSE monolingual peers. The set up of the ANOVA was similar to that in Section 5.3.3.1.1. The dependent variables were vocal effort represented by the median values of the methods A2, A2*a, A2*c (dB). Since we assessed only the SSE target /ʉ/ and we already know that AN produced vowel quality ranges similar to the SSE monolingual peers (see Section 4.3.3.2.1), there are no substantial vowel quality changes involved in this comparison (apart from the issue of non-adult-like segmental variability in SSE children). Therefore, the method A2*a is most relevant for this test.

The results of the ANOVA are summarised in Table 6-10. As opposed to /i/, the test revealed no significant main effect of the “FOLLOWING CONSONANT” on the vocal effort pattern for /ʉ/. However, there was a significant main effect of the factor “BILINGUALITY” for the measure of A2*a (and an almost significant effect for A2*c). There were no other significant main effects or interactions.

Table 6-10 Summary of the ANOVA results for the three normalisation methods of vocal effort for the SSE vowel /ʉ/ produced by the bilingual subject AN in comparison to the SSE monolingual peers.

Normalisation Method   Main effect: Following Consonant   Main effect: Age   Main effect: Bilinguality
A2                     ns                                 ns                 ns
A2*a                   ns                                 ns                 F(1,7)=5.802; p<.05
A2*c                   ns                                 ns                 F(1,7)=4.992; p=.061

The descriptive statistics for AN’s production of vocal effort are summarised in Appendix S. A comparison of AN’s production to that of the monolingual peers is plotted in Figure 6-15. The figure shows that despite the non-significance of the factor “FOLLOWING CONSONANT” AN produced an SSE-like pattern at the age of 3;8 and at the age of 4;2, whereby the RMS-power levels of the vowel /ʉ/ are lower in the long vowels before voiced fricatives compared to the short ones before voiceless stops.
AN’s vocal effort pattern before voiced stops is less consistent, which was also the case in the SSE monolingual children. However, AN’s pattern at the age of 4;5 is unlike the monolingual pattern: in fact it is the opposite, and it more closely resembles the MSR adult pattern shown in Figure 6-5.

The result might be due to child speech variability. We already discussed in Section 6.3.2.4 that the SSE vowel /ʉ/ is produced by monolingual (and bilingual) children with a greater range of phonetic variability in vowel quality than the target produced by the adults. There was also less consistency among the SSE monolingual children in producing the vocal effort pattern for /ʉ/ compared to /i/. We also do not know the exact effect of greater differences in vowel quality (SSE /ʉ/ versus MSR /u/) on the precision of the normalisation used in this study.

[Figure 6-15 plot: three panels (SSE child 3;4 to 3;11 and AN 3;8; SSE child 4;0 to 4;4 and AN 4;2; SSE child 4;5 to 4;9 and AN 4;5); y-axis mean A2*a (dB); x-axis following consonant (voiced fricative, voiced stop, voiceless stop).]

Figure 6-15 Vocal effort for the vowel /ʉ/ (based on mean A2*a, dB) as a function of the following consonant produced by AN in comparison to the SSE monolingual peers in the three age samples.

6.3.3.1.4 MSR/SSE differentiation for /i/

In Section 6.3.1.1 we showed that there is a fine-grained crosslinguistic difference in the realisation of vocal effort between Russian and Scottish English: i.e. in SSE prominent syllables containing an extrinsically short vowel /i/ are produced with somewhat higher vocal effort (higher A2*a, dB) than the long vowels before voiced fricatives. Russian seemed to lack a system of vocal effort, since individual speakers varied more in their patterns than the SSE speakers, and the average effect of the following consonant on the vocal effort seemed to show a variable pattern.
To establish whether AN produced the crosslinguistic difference in vocal effort for the vowel /i/ in different consonantal contexts, and whether there is any age effect for this difference, we entered all AN’s renditions of the words with the target /i/ in a multivariate ANOVA. The ANOVA had A2, A2*a and A2*b (dB) as dependent variables and three fixed factors: i.e. “FOLLOWING CONSONANT” (voiced fricative, voiced and voiceless stop), “LANGUAGE” (SSE and MSR) and “AGE” (3;8, 4;2 and 4;5). Since /i/ is crosslinguistically similar in vowel quality, the method A2*a is most relevant for this test. The set up of the ANOVA required the use of mean values rather than the medians used in the test of the monolingual children.

The results of the ANOVA are summarised in Table 6-11. The results showed a highly significant main effect of the factor “AGE”. We ran Tukey HSD post-hoc tests for the measure A2*a to determine which age contributed to this effect. The result showed that there was a significant (p<.05) difference between AN’s results at the age of 4;2 compared to the ages of 3;8 and 4;5. Therefore, this age effect was not linear in time. There were no other significant main effects.

However, there was a highly significant interaction between the factors “FOLLOWING CONSONANT” and “LANGUAGE”. The interaction means that for each of her languages AN produced a different vocal effort pattern depending on the following consonant, and that she differentiated between SSE and MSR in producing vocal effort for the vowel /i/. There were also highly significant interactions between the factors “FOLLOWING CONSONANT” and “AGE” on the one hand, and “AGE” and “LANGUAGE” on the other. Both interactions reflect a relative instability of AN’s vocal effort patterns throughout the age samples. There were no other significant interactions.
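The degrees of freedom reported for this test in Table 6-11 can be related to the design described above. The sketch below assumes a full factorial between-token design in which all 3 x 2 x 3 cells are populated; under that assumption the error degrees of freedom equal the number of tokens minus the number of cells. The token count it derives is therefore an implied value, not a figure reported in the thesis, and the function names are hypothetical.

```python
from math import prod


def effect_df(*levels):
    """Degrees of freedom of a main effect or interaction: product of (levels - 1)."""
    return prod(k - 1 for k in levels)


def error_df(n_tokens, *levels):
    """Error degrees of freedom of a full factorial design with all cells filled."""
    return n_tokens - prod(levels)


def implied_n(reported_error_df, *levels):
    """Number of tokens implied by a reported error df (same assumption as above)."""
    return reported_error_df + prod(levels)


CONSONANT, LANGUAGE, AGE = 3, 2, 3  # factor levels described in the text

print("Age:", effect_df(AGE))                                 # compare F(2,602)
print("Consonant * Language:", effect_df(CONSONANT, LANGUAGE))  # compare F(2,602)
print("Consonant * Age:", effect_df(CONSONANT, AGE))            # compare F(4,602)
print("Age * Language:", effect_df(AGE, LANGUAGE))              # compare F(2,602)
print("Implied tokens:", implied_n(602, CONSONANT, LANGUAGE, AGE))
```

The numerator degrees of freedom obtained this way agree with those in Table 6-11, which is consistent with the three-factor design described in the text.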
Table 6-11 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*b, dB) for the vowel /i/ as a function of the following consonant produced by the bilingual subject AN in MSR and SSE.

Normalisation Method   Main effect: Age          Interaction: Following Consonant * Language   Interaction: Following Consonant * Age   Interaction: Age * Language
A2                     F(2,602)=14.2; p<.01      F(2,602)=6.08; p<.01                          F(4,602)=2.8; p<.05                      ns
A2*a                   F(2,602)=13.38; p<.01     F(2,602)=8.51; p<.01                          F(4,602)=4.5; p<.01                      F(2,602)=4.4; p<.01
A2*b                   F(2,602)=13.38; p<.01     F(2,602)=8.51; p<.01                          F(4,602)=4.5; p<.01                      F(2,602)=4.4; p<.01

The direction of the crosslinguistic difference in AN’s production is plotted in Figure 6-16. Descriptive statistics for this test are reported in Appendix Q. The figure shows that at the ages of 4;2 and 4;5 AN differentiated between her MSR and SSE vocal effort in a way quite similar to the crosslinguistic pattern that we reported for the adult production in Figure 6-1. However, at the age of 3;8 the VLS-VF ratio in SSE was only 1.5 dB.

The significance of the main effect “AGE” could potentially be explained by factors other than age. First of all, the variability of AN’s Russian pattern throughout time is greater than that of the SSE pattern. Therefore, the significant effect of “AGE” and all interactions with this factor could be a side effect of the variability in the MSR pattern in connection with the following consonant. In Figure 6-17, we compare AN’s MSR vocal effort pattern to that of her mother producing read speech and to that of the experimenter during the games (spontaneous speech). Despite the absolute differences in intensity levels, AN’s patterns of vocal effort in different consonantal contexts are quite similar to those of both adults in different elicitation modes.

We conclude that AN differentiated the vocal effort pattern for the vowel /i/ between SSE and MSR throughout the three age samples.
Based on the median results, she produced a language-specific pattern for SSE comparable to the SSE peers. There were no language interaction patterns observed for this variable.

[Figure 6-16 plot: three panels (SSE 3;8 and MSR 3;8; SSE 4;2 and MSR 4;2; SSE 4;5 and MSR 4;5); y-axis mean A2*a (dB); x-axis following consonant (voiced fricative, voiced stop, voiceless stop).]

Figure 6-16 AN’s crosslinguistic production of vocal effort for the vowel /i/ (based on mean A2*a, dB) as a function of the following consonant (age is plotted from left to right).

[Figure 6-17 plot: three panels (AN’s MSR at 3;8, 4;2 and 4;5, each compared with mother MSR and R3 CDS MSR); y-axis median A2*a (dB); x-axis following consonant (voiced fricative, voiced stop, voiceless stop).]

Figure 6-17 A comparison of AN’s vocal effort for /i/ in different consonantal contexts in MSR (based on median A2*a, dB) to that of her mother and the experimenter (R3 in child directed speech).

6.3.3.1.5 MSR/SSE differentiation for /u/ and /ʉ/

To establish whether AN produced a crosslinguistic difference in vocal effort applied to SSE /ʉ/ and MSR /u/ before different consonants and whether there was any age effect for this crosslinguistic difference, we entered all AN’s individual renditions of the carrier words with the targets /ʉ/ and /u/ in a multivariate ANOVA. The ANOVA had A2, A2*a and A2*c (dB) as dependent variables and three fixed factors: i.e. “FOLLOWING CONSONANT” (voiced fricative, voiced and voiceless stop), “LANGUAGE” (SSE and MSR) and “AGE” (3;8, 4;2 and 4;5). Since /ʉ/ and /u/ are crosslinguistically dissimilar in vowel quality, the normalisation method A2*c should be most relevant for this test.
The set up of the ANOVA required the use of mean values for each condition rather than the medians used in the comparison to the SSE monolingual children. The results of the test are presented in Table 6-12. The test showed highly significant main effects for the factors “FOLLOWING CONSONANT”, “LANGUAGE” and “AGE” for the methods A2*a and A2*c. There was also a highly significant interaction between the factors “FOLLOWING CONSONANT” and “LANGUAGE” for the same methods. There were no other significant main effects or interactions. The direction of the main effects of the postvocalic conditioning on vocal effort in AN’s crosslinguistic production for /ʉ/ and /u/ is shown in Figure 6-18. The descriptive statistics for AN’s crosslinguistic production of vocal effort are found in Appendix S.

Table 6-12 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*c, dB) for the SSE vowel /ʉ/ and MSR /u/ as a function of the following consonant produced by the bilingual subject AN in MSR and SSE.

Method   Following Consonant       Language                   Age                       Language * Following Consonant
A2       F(2,516)=4.382, p<.05     F(1,516)=36.467, p<.01     F(2,516)=3.523, p<.05     ns
A2*a     F(2,516)=9.510, p<.01     F(1,516)=48.546, p<.01     F(2,516)=6.222, p<.01     F(2,516)=11.557, p<.01
A2*c     F(2,516)=9.458, p<.01     F(1,516)=44.363, p<.01     F(2,516)=6.139, p<.01     F(2,516)=11.649, p<.01

Figure 6-18 AN’s crosslinguistic production of vocal effort for SSE /ʉ/ and MSR /u/ (based on mean A2*c, dB) as a function of the following consonant (age is plotted from left to right).
[Figure: three panels comparing SSE and MSR at 3;8, 4;2 and 4;5; x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to 0 dB.]
The results showed that overall at age 3;8 and 4;2 AN differentiated between SSE and MSR in context-dependent vocal effort for the vowels /ʉ/ and /u/. The patterns for these age samples seem to be acquired in the language-specific direction similar to that of the adults in Figure 6-5. The significant effects of the factor “LANGUAGE” and the significant interaction between “FOLLOWING CONSONANT” and “LANGUAGE” confirm the pattern in the figure. The vocal effort patterns in the three MSR age samples are all different, which is consistent with the lack of a systematic Russian pattern in AN’s production of vocal effort. However, AN’s crosslinguistic pattern at the age of 4;5 does not seem to show any language differentiation, and seems to follow the Russian pattern (see Figure 6-19). This fact and the variability of the Russian patterns possibly contributed to the significant main effect of the factor “AGE”. We showed in Section 6.3.3.1.3 that AN’s SSE pattern at the age of 4;5 differed from the monolingual results. From the longitudinal perspective this pattern at the age of 4;5 is unexpected, since at the earlier ages of 3;8 and 4;2 AN did differentiate between the two languages. This pattern could be explained by the individual variability in the production of vocal effort for this vowel set, since the pattern also varied between the individual SSE monolingual children. Therefore, we do not consider the possibility of language interaction from Russian in this age sample. Overall, AN’s data at age 3;8 and 4;2 for the close rounded vowels are in agreement with this subject’s language differentiation of vocal effort patterns for the vowel /i/ and for the vowels /i/ and /ɪ/.
Figure 6-19 A comparison of AN’s vocal effort for /u/ in different consonantal contexts in MSR (based on median A2*c, dB) to that of her mother (reading) and experimenter (R3 in spontaneous speech).
[Figure: three panels for AN’s MSR at 3;8, 4;2 and 4;5, each with mother MSR and R3 CDS MSR; x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to 0 dB.]

6.3.3.1.6 Summary of AN’s results

The results of the acquisition of crosslinguistic vocal effort patterns for the bilingual subject AN suggest that overall she differentiated between her two languages in a way similar to the monolingual speakers. We observed no language interaction effects for this variable. First of all, AN acquired the vocal effort pattern connected to the interaction of SVLR and prominence for the vowel /i/ in a way similar to the SSE monolingual peers. Like the monolingual children, she produced VLS-VF ratios (based on median A2*a, dB) of 3, 7 and 5 dB in the longitudinal age samples, similar to the monolingual 3, 4, 4 dB. At the same time, AN produced significantly different and language-specific patterns of vocal effort for the vowel /i/ in the direction similar to that in the crosslinguistic adult data. Unlike for the acquisition of SVLR (at age 3;8), no language interaction effects were observed for this variable. There were also no age effects. This means that AN produced the vocal effort patterns by the age of 3;8. Secondly, AN acquired a language-specific SSE pattern of laryngeal adjustment for the tense/lax contrast between the vowels /i/ and /ɪ/, whereby AN produced significantly higher vocal effort (based on A2*b, dB) for the lax vowel /ɪ/ compared to the tense /i/.
In physiological terms, the acoustic results indicate that the subject produced a less breathy phonation for the lax vowel and a more breathy phonation for the tense vowel. Her results closely matched the results of the SSE monolingual children. AN’s tense-lax ratios were 26.2 dB at the age of 3;8, 9.39 dB at the age of 4;2 and 21.6 dB at the age of 4;5 compared to 26.5, 16 and 21 dB of the monolingual peers. The results are consistent with AN’s language differentiation patterns for the vowel quality of the tense/lax contrast. Unlike the segmental quality of /i/ and /ɪ/, and similarly to context-dependent vocal effort for /i/, this suprasegmental variable showed no language interaction effects. With regard to AN’s acquisition of the context-dependent vocal effort pattern for the close rounded vowels /ʉ/ and /u/, the results showed that at age 3;8 and 4;2 AN differentiated between her two languages. The patterns in these age samples seem to be acquired in the language-specific direction similarly to adults and to the patterns observed for AN’s production of vocal effort for the vowel /i/. Significant effects of the factor “LANGUAGE” and a significant interaction between “FOLLOWING CONSONANT” and “LANGUAGE” confirmed the adult pattern. The vocal effort patterns in the three MSR age samples were different, which is consistent with the lack of a systematic Russian pattern in AN’s production of vocal effort. However, AN’s crosslinguistic pattern at the age of 4;5 did not seem to show any language differentiation. It was unlike the overall SSE pattern of the monolingual children, and it seemed to follow the Russian pattern. Nevertheless, it did fall within the ranges of individual variation of the SSE monolingual children. Several considerations arose around AN’s vocal effort pattern at the age of 4;5 for the close rounded vowels.
First of all, the pattern did not make sense from the longitudinal perspective, since at the earlier ages of 3;8 and 4;2 AN did produce a crosslinguistic difference, and her SSE pattern did not significantly differ from that of the monolingual children. Secondly, the SSE monolingual results for the SSE vowel /ʉ/ showed that not all the children produced the pattern (see Figure 6-12): i.e. subject C3 aged 3;4 and 3;11 consistently did not produce it in the two age samples, neither did C6 aged 4;0. Therefore, the question arises whether this variability of results reveals a non-obligatory nature of this vocal effort pattern (i.e. is it just a tendency?), or whether it reveals methodological problems in connection with measuring spectral balance in child speech generally, or with normalising for formant frequency shifts between vowels too different in formant structure, in particular. After all, the pattern of vocal effort did seem to be more systematic for the unrounded vowel /i/, which is also crosslinguistically similar in formant structure. We return to these questions to some extent in the discussion chapter. At this point suffice it to state that given the uncertainty about the methodological issues, we cannot accept AN’s vocal effort pattern at the age of 4;5 as evidence for language interaction from Russian in SSE.

6.3.3.2 Subject BS

6.3.3.2.1 SSE /i/

We assess whether the bilingual subject BS acquired the system of vocal effort connected to the interaction of SVLR of the vowel /i/ and prominence in a way similar to the age-matched SSE monolingual children. The set up of the ANOVA was the same as in Section 6.3.3.1.1 for the bilingual subject AN. Similarly, the dependent variables were vocal effort represented by the median values of the normalisation methods A2, A2*a, A2*b (dB). Since we assessed only target /i/ and there are no vowel quality changes involved in this comparison, the A2*a measure is most suitable for this test.
The results of the ANOVA are summarised in Table 6-13. The test showed a highly significant main effect of the factor “FOLLOWING CONSONANT” for all three normalisation methods similarly to the tests of the SSE monolingual children. There was also a significant main effect of the factor “BILINGUALITY”, but no significant interaction between the factors “FOLLOWING CONSONANT” and “BILINGUALITY” for the measures of A2*a and A2*b. This lack of interaction showed that the direction of the main effect of the “FOLLOWING CONSONANT” was the same in all age groups irrespective of the factor “BILINGUALITY”, and there was only a difference in the absolute level of the RMS-power observed after normalisation for the formant frequency shifts. The difference between BS’ and the SSE peers’ production for this variable per age is shown in Figure 6-20. Descriptive statistics for BS are reported in Appendix Q. There were no other significant main effects or interactions. The results showed that despite the differences in absolute RMS-power levels measured between BS and the SSE monolingual children (accounting for the significant effect of the factor “BILINGUALITY”), BS acquired the SSE pattern of vocal effort for the vowel /i/ in a way similar to the SSE monolingual peers. The results are consistent with BS’ acquisition of the vowel quality for /i/. However, we showed in Section 5.3.3.2.1 that BS did not start to acquire the SVLR pattern for the vowel duration until the age of 4;5, when she produced a non-significant SVLR-like difference in the language-specific direction (but not yet in extent).

Table 6-13 Summary of the ANOVA results for the normalisation methods of vocal effort for the SSE vowel /i/ produced by the bilingual subject BS as compared to the SSE monolingual peers.
Method   Following Consonant       Bilinguality            Age   Following Consonant * Bilinguality
A2       F(2,14)=10.798, p<.01     F(1,7)=5.845, p<.05     ns    F(2,14)=7.061, p<.01
A2*a     F(2,14)=19.351, p<.01     F(1,7)=5.668, p<.05     ns    ns
A2*b     F(2,14)=19.351, p<.01     F(1,7)=6.936, p<.05     ns    ns

Figure 6-20 Vocal effort for the vowel /i/ (based on mean A2*a, dB) as a function of the following consonant produced by the subject BS in comparison to the age-matched SSE monolingual children in three age samples.
[Figure: three panels comparing BS with the age-matched SSE children (3;4 to 3;8, 3;9 to 4;1, 4;2 to 4;9); x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to -10 dB.]

Despite that, it seems that at the age of 3;4 BS produced the SSE vocal effort pattern in a way similar to the monolingual peers: she produced a less breathy phonation mode for the vowel before voiceless stops as opposed to two other contexts. Recall that BS did not produce language-specific patterns of postvocalic consonantal conditioning of duration compared to the monolingual peers. For the vocal effort pattern at different ages she produced VLS-VF ratios (based on median A2*a, dB) of 4, 3 and 6 dB similar to the monolingual 3, 4, 4 dB. This fact suggests that BS acquired the vocal effort pattern prior to the SVLR pattern for this vowel.

6.3.3.2.2 SSE /i/ compared to /ɪ/

We investigated whether BS acquired the laryngeal configuration for the SSE tense/lax vowels /i ɪ/ in a way similar to the monolingual peers. The laryngeal contrast involves producing a less breathy laryngeal configuration for the lax vowel /ɪ/ and more breathy one for the tense vowel /i/. In acoustic terms it involves higher A2*b levels (dB) for the lax vowels and lower A2*b levels for the tense vowels.
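The acoustic side of this contrast can be illustrated with a minimal sketch. It assumes that a tense-lax ratio in dB is the difference between the median A2*b level of the lax vowel and that of the tense vowel; the function name and the sample levels are hypothetical and are not measurements from this study.

```python
from statistics import median


def tense_lax_difference_db(lax_levels_db, tense_levels_db):
    """Median A2*b of the lax vowel minus median A2*b of the tense vowel (dB).

    Assumption: the inputs are already normalised A2*b levels in dB, so a
    difference of levels stands for a ratio of the underlying powers.
    A positive result means higher A2*b for the lax vowel.
    """
    return median(lax_levels_db) - median(tense_levels_db)


# Hypothetical A2*b levels (dB) for one speaker; illustrative values only.
lax = [-12.0, -10.5, -11.0]
tense = [-30.0, -33.5, -31.0]

difference = tense_lax_difference_db(lax, tense)
print(f"tense-lax difference: {difference:.1f} dB")
```

Under this assumed definition, a larger positive value corresponds to a clearer laryngeal distinction between the lax and the tense vowel.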
The ANOVA set up was the same as in Section 6.3.3.1.2 except that the factor “AGE” was changed to match BS’ age samples: i.e. “3;4 to 3;11”, “4;0 to 4;4”, “4;5 to 4;9”. We report the results for the three normalisation methods A2, A2*a and A2*b (dB). The measure A2*b is most relevant for this test, since it involves comparison between two vowels different in formant structure. The results of the ANOVA are summarised in Table 6-14. The descriptive statistics of BS’ production are reported in Appendix R. The test showed a highly significant main effect of the “TENSE/LAX VOWEL” for the measures of A2 and A2*b (and an almost significant effect for A2*a). This means that BS acquired the tense/lax pattern of vocal effort in a way similar to the monolingual peers. However, there was a highly significant interaction between the factors “TENSE/LAX VOWEL” and “BILINGUALITY”. There were no other significant main effects or interactions.

Table 6-14 Summary of the ANOVA results for vocal effort for the SSE vowels /i/ and /ɪ/ produced by the bilingual subject BS in comparison to SSE monolingual peers.

Method   Tense/lax vowel          Age   Bilinguality   Tense/lax vowel * Bilinguality
A2       F(1,7)=25.998, p<.01     ns    ns             F(2,7)=11.166, p<.05
A2*a     F(1,7)=5.311, p=.055     ns    ns             F(2,7)=20.532, p<.01
A2*b     F(1,7)=45.575, p<.01     ns    ns             F(2,7)=6.186, p<.01

The direction of the interaction is shown in Figure 6-21. The figure shows that even though BS produced a difference in laryngeal configuration between the tense and lax vowels in the language-specific direction, the difference did not reach the same extent as in the monolingual children. This explains the significant interaction between the factors “TENSE/LAX VOWEL” and “BILINGUALITY”. BS produced the tense-lax ratio for A2*b between 2 and 6 dB compared to 17 to 28 dB produced by monolingual peers.
Considering the results of BS’ acquisition of segmental quality of the lax vowel /ɪ/ (Section 4.3.4.1), this finding for its vocal effort is not surprising, since BS produced only 35% of adult targets /ɪ/ as [ɪ], while the rest as [i]. The dotted line in Figure 6-21 shows BS’ median A2*b values of the 35% of the vowels auditorily labeled as phonetically tense and lax (compared to the 100% phonological targets represented by the solid line). Surprisingly, there is little difference in the realisation of the vocal effort measures between the vowels [i] and [ɪ] and all adult targets /i/ and /ɪ/ produced by BS. In fact, the ANOVA based on the vocal effort measures for phonetic labels [i] and [ɪ] showed the same levels of significance as in Table 6-14. This shows that BS started to acquire the laryngeal contrast irrespective of the vowel quality, and that the two properties are not necessarily intrinsically bound to each other.

Figure 6-21 Vocal effort of the vowel /i/ and /ɪ/ (based on mean A2*c, dB) produced by the bilingual subject BS compared to the SSE monolingual peers in three age samples (BS’ target /i/ and /ɪ/ are plotted separately from the phonetic labels [i] [ɪ]).
[Figure: three panels (SSE child 3;4 to 3;8 with BS 3;4, SSE child 3;9 to 4;1 with BS 3;10, SSE child 4;2 to 4;9 with BS 4;5), each showing BS’ targets /i/ /ɪ/ and phonetic labels [i] [ɪ]; x-axis: i, ɪ; y-axis: -40 to 0 dB.]

The reduced extent of BS’ differentiation of vocal effort between the SSE tense and lax vowels parallels her patterns of postvocalic vowel duration conditioning in SSE, and is in line with her substantially greater exposure to Russian. Yet it is interesting to note that BS did produce a systematic difference in the laryngeal configuration between the two targets /i/ and /ɪ/, despite the relative lack of differentiation in duration at the age of 3;4.
It may indicate that BS was in the process of acquisition of the language-specific laryngeal configuration of tense and lax vowels, and that this happened prior to the acquisition of the postvocalic conditioning of duration.

6.3.3.2.3 SSE /ʉ/

We assess whether BS acquired the system of vocal effort connected to the interaction of SVLR of the vowel /ʉ/ and prominence in a way similar to the SSE monolingual peers. The set up of the ANOVA was the same as in Section 6.3.3.1.3, except that the factor “AGE” had levels specifically matched to BS’ age samples “3;4 to 3;11”, “4;0 to 4;4”, “4;5 to 4;9”. The test showed no significant main effects or interactions for any of the normalisation methods reflecting vocal effort (A2, A2*a and A2*c, dB). The descriptive statistics for BS’ speech production are reported in Appendix S. The median values of A2*a of /ʉ/ for each of the consonantal contexts are plotted in Figure 6-22. The lack of significance for the factor “FOLLOWING CONSONANT”, which we systematically observed in the monolingual tests for the SSE adults, monolingual children and the subject AN, can be explained by a joint effect of (1) BS’ quite variable longitudinal production (see Figure 6-22), (2) the fact that the monolingual children were also variable in producing the vocal effort pattern for /ʉ/; and (3) by a relatively small sample of children tested in this study.

Figure 6-22 Vocal effort for the vowel /ʉ/ (based on mean A2*a, dB) as a function of the following consonant produced by the subject BS in comparison to the SSE monolingual peers in three age samples.
[Figure: three panels (BS 3;4 with SSE child 3;4 to 3;8, BS 3;10 with SSE child 3;9 to 4;1, BS 4;5 with SSE child 4;2 to 4;9); x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to -10 dB.]
The lack of significant effects in this test revealed that BS had not acquired the fine-grained SSE vocal effort pattern for /ʉ/. This is in line with the fact that she had not yet acquired the SVLR pattern for this vowel, and with the fact that she produced a relatively high percentage (17%) of phonetically back vowels for SSE /ʉ/ compared to the monolingual children. However, we should also remember that not all of the monolingual children produced this vocal effort pattern by the age of 4;0, and that generally this vowel was less adult-like in vowel quality than the vowels /i/ or /ɪ/. Thus, the result of this test does not necessarily prove that BS was different in producing vocal effort if we keep in mind the individual results of the monolingual peers.

6.3.3.2.4 MSR/SSE differentiation for /i/

To establish whether BS produced a crosslinguistic difference in vocal effort for the vowel /i/ in SSE and MSR in different consonantal contexts and whether there was any age effect for this crosslinguistic difference, we entered all BS’ renditions of the words with the target /i/ in a multivariate ANOVA. The ANOVA had A2, A2*a and A2*b (dB) as dependent variables and three fixed factors: i.e. “FOLLOWING CONSONANT” (voiced fricative, voiced and voiceless stop), “LANGUAGE” (SSE and MSR) and “AGE” (3;4, 3;10 and 4;5). Since the SSE and MSR targets /i/ are similar in formant structure, the normalisation method A2*a is most suitable for this test. As in AN’s tests, the multivariate ANOVA required the use of mean values. The results of the tests are summarised in Table 6-15. The results showed a highly significant main effect of the factor “AGE” and significant main effects for the factors “LANGUAGE” and “FOLLOWING CONSONANT” for the methods A2*a and A2*b. There were no significant interactions between any of the factors.
Since there was no significant interaction between the “FOLLOWING CONSONANT” and BS’ languages, the two languages were not as clearly differentiated as in the monolingual adults and AN. Even though the factor “LANGUAGE” was significant, the significant main effect of the factor “FOLLOWING CONSONANT” showed that the consonantal effect on vocal effort of the vowel /i/ was largely in the same direction in both of BS’ languages. The direction of the main effects per age and language is plotted in Figure 6-23. The descriptive statistics are presented in Appendix R. The figure shows that BS’ MSR vocal effort in the three age samples had variable patterns, while BS’ SSE vocal effort pattern systematically showed lower A2*a levels for the vowel /i/ before voiced fricatives as opposed to both contexts before voiced and voiceless stops. The crosslinguistic difference became greater at the age of 4;5, which should at least partly account for the highly significant age effect.

Table 6-15 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*b, dB) for the vowel /i/ produced by the bilingual subject BS in MSR compared to SSE.

Method   Following Consonant       Age                        Language
A2       ns                        F(1,610)=5.909, p<.05      ns
A2*a     F(2,610)=3.066, p<.05     F(1,610)=10.455, p<.01     F(2,610)=3.313, p<.05
A2*b     F(2,610)=3.066, p<.05     F(1,610)=10.717, p<.01     F(2,610)=3.313, p<.05

Figure 6-23 BS’s crosslinguistic production of vocal effort for the vowel /i/ (based on mean A2*a, dB) as a function of the following consonant (age is plotted from left to right).
[Figure: three panels comparing SSE and MSR at 3;4, 3;10 and 4;5; x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to 0 dB.]
The vocal effort patterns in Figure 6-23 show a longitudinal progression towards greater language differentiation from the age of 3;4 to the age of 4;5, explaining the significance of the main effect of “AGE”. Like AN, irrespective of her age BS seemed to spend less vocal effort to produce the vowels before voiced fricatives compared to the contexts before voiced and voiceless stops in SSE, while in Russian she had quite variable patterns in the three age samples. A comparison of BS’ production of context-specific vocal effort shown in Figure 6-24 to that of her mother and of the Russian-speaking investigator in child-directed speech shows that at age 4;5 BS’ vocal effort pattern was quite similar to that of the adults.

Figure 6-24 A comparison of BS’s vocal effort for /i/ in different consonantal contexts in MSR (based on A2*a, dB) to that of her mother (read speech) and experimenter (R3 spontaneous speech).
[Figure: three panels for BS’ MSR at 3;4, 3;10 and 4;5, each with mother MSR and R3 CDS MSR; x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to 0 dB.]

Overall, despite the lack of significant interaction between “FOLLOWING CONSONANT” and “LANGUAGE”, the patterns and results of the tests observed in this section suggest that BS differentiated to a certain degree between her two languages from the age of 3;4 for the production of vocal effort based on mean A2*a, and that this differentiation became more substantial at the age of 4;5.
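The kind of per-language comparison described in this section can be sketched as follows. The sketch assumes that the context effect is summarised as the mean A2*a level before voiceless stops minus the mean level before voiced fricatives (an assumed reading of the VLS-VF ratio in dB); the function names and all token values are hypothetical and do not come from the data of this study.

```python
from statistics import mean

CONTEXTS = ("voiced fricative", "voiced stop", "voiceless stop")


def context_means(tokens):
    """Mean A2*a (dB) per following-consonant context.

    `tokens` is a list of (context, level_db) pairs for one language
    and one age sample.
    """
    return {
        context: mean(level for ctx, level in tokens if ctx == context)
        for context in CONTEXTS
    }


def vls_vf_difference_db(means):
    """Assumed definition: voiceless-stop level minus voiced-fricative level."""
    return means["voiceless stop"] - means["voiced fricative"]


# Hypothetical tokens for one age sample; illustrative values only.
sse_tokens = [
    ("voiced fricative", -30.0), ("voiced fricative", -28.0),
    ("voiced stop", -24.0),
    ("voiceless stop", -25.0), ("voiceless stop", -23.0),
]
msr_tokens = [
    ("voiced fricative", -26.0),
    ("voiced stop", -27.0),
    ("voiceless stop", -25.0),
]

sse_effect = vls_vf_difference_db(context_means(sse_tokens))
msr_effect = vls_vf_difference_db(context_means(msr_tokens))
print(f"SSE: {sse_effect:.1f} dB, MSR: {msr_effect:.1f} dB")
```

A larger context effect in one language than in the other would be one descriptive sign of differentiation; the significance of any such difference is established by the ANOVA and not by this summary.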
6.3.3.2.5 MSR/SSE differentiation for /u/ and /ʉ/

To establish whether BS produced a crosslinguistic difference in vocal effort between the SSE vowel /ʉ/ and MSR /u/ in different consonantal contexts, and whether there was any age effect for this crosslinguistic difference, we entered all BS’ individual renditions of the carrier words with the close rounded targets in a multivariate ANOVA. The ANOVA had A2, A2*a and A2*c (dB) as dependent variables and three fixed factors: i.e. “FOLLOWING CONSONANT” (voiced fricative, voiced and voiceless stop), “LANGUAGE” (SSE and MSR) and “AGE” (3;4, 3;10 and 4;5). Since the SSE /ʉ/ and MSR /u/ are dissimilar in formant structure, the normalisation method A2*c is most relevant for this test. The results of the test are summarised in Table 6-16. The test showed significant main effects for the factors “FOLLOWING CONSONANT”, “LANGUAGE” and “AGE” for the methods A2*a and A2*c. There was a highly significant interaction between the factors “FOLLOWING CONSONANT” and “LANGUAGE”. However, it was only observed for the method A2, and not A2*a or the more relevant A2*c. The direction of the main effects of the postvocalic conditioning on vocal effort in BS’ crosslinguistic production for /ʉ/ and /u/ is shown in Figure 6-25. The descriptive statistics for BS’ crosslinguistic production of vocal effort are reported in Appendix S. The age-specific language differentiation patterns showed that BS seemed to differentiate between her languages in absolute levels of A2*a, but not in the direction depending on the following consonant. This pattern is different from the crosslinguistic patterns in the adult data in Figure 6-5. Besides, both languages seemed to vary considerably depending on BS’ age; therefore, we cannot speak of a system in either language’s pattern.
The significant main effect of the factor “LANGUAGE”, and the lack of interaction between the factors “LANGUAGE” and “FOLLOWING CONSONANT” supported the pattern observed in Figure 6-25 that BS seemed to differentiate between the languages in the absolute levels of vocal effort but not in the direction depending on the following consonant. This result contradicts BS’ language differentiation in vocal effort for the vowel /i/, as well as it parallels the non-differentiation pattern of vocal effort observed for the subject AN for the same vowel set. Table 6-16 Summary of the ANOVA results for the normalisation methods of vocal effort (A2, A2*a, A2*c, dB) for the SSE vowel // and MSR /u/ as a function of the following consonant produced by the bilingual subject BS in MSR and SSE. Normal isation Main Effects Metho Following d Consonant Age F(2,491)=5.517, A2 p<.01 ns A2*a A2*c Interactions of Following Consonant with Language Age Language F(2,491)=9.348 p<.01 Ns ns F(1,491)=132.978, F(2,491)=5.251, F(2,491)=3.8, p<.01 F(4,491)=2.50 F(1,491)=124.598, 5, p<.05 p<.01 p<.05 ns p<.01 244 -40 -40 SSE 3;4 MSR 3;4 -35 -40 SSE 3;10 MSR 3;10 -35 -30 -30 -25 -25 -25 -20 -20 -20 -15 -15 -15 -10 -10 -10 -5 -5 -5 0 0 voiced fricative voiced stop voiceless stop MSR 4;5 -35 -30 0 SSE 4;5 voiced fricative voiced stop voiced fricative voiceless stop voiced stop voiceless stop Figure 6-25 BS’ crosslinguistic production of vocal effort for SSE // and MSR /u/ (based on mean A2*c, dB) as a function of the following consonant (age is plotted from left to right). 
Figure 6-26 A comparison of BS’s vocal effort for /u/ in different consonantal contexts in MSR (based on A2*a, dB) to that of her mother (read speech) and experimenter (R3, in spontaneous speech).
[Figure: three panels for BS’ MSR at 3;4, 3;10 and 4;5, each with mother MSR and R3 CDS MSR; x-axis: voiced fricative, voiced stop, voiceless stop; y-axis: -40 to 0 dB.]

A comparison of BS’ production of the context-dependent vocal effort in MSR to her mother’s patterns and to those of the Russian-speaking investigator in more spontaneous data elicitation mode is shown in Figure 6-26. BS’ pattern is quite dissimilar to that of her mother, but it is very similar to that of the investigator. This discrepancy shows again that the Russian pattern of vocal effort might be more variable than the SSE one. Both BS and the investigator produced the utterances in the same elicitation mode, and that might explain the similarity of their patterns.

6.3.3.2.6 Summary of BS’ results

The results of the acquisition of crosslinguistic vocal effort patterns for the bilingual subject BS were variable depending on the vowel set concerned. First of all, BS acquired the context-dependent vocal effort pattern for the vowel /i/ in a way similar to the SSE monolingual peers. Like the SSE children, in the three age samples she produced VLS-VF ratios (based on median A2*a, dB) of 4, 3 and 6 dB similar to the monolingual 3, 4, 4 dB. At the same time, BS seemed to differentiate between her two languages to a less significant extent, though she produced the SSE and MSR patterns of vocal effort for the vowel /i/ in the direction similar to that observed in the crosslinguistic adult data: i.e. less breathy laryngeal configuration for the /i/ before voiceless stops as opposed to two other contexts.
No language interaction effects were observed for this variable. This means that BS produced language-specific vocal effort patterns for the vowel /i/ by the age of 3;4. The fact that BS seemed to have acquired the vocal effort pattern for the vowel /i/ is surprising, given that BS did not start differentiating between the crosslinguistic postvocalic vowel duration conditioning pattern until the age of 4;5, and she seemed to produce the postvocalic conditioning of vowel duration in SSE according to the Russian model. Therefore, this result may suggest that BS’ acquisition of the suprasegmental laryngeal contrast in SSE precedes her acquisition of the language-specific timing. For the laryngeal difference between the SSE tense and lax vowels /i/ and /ɪ/, BS produced a difference between the two vowels in the language-specific direction, but she had not reached the same extent of the difference as the monolingual children. BS produced the tense-lax ratio for A2*b between 2 and 6 dB compared to 17 to 28 dB produced by the monolingual peers. Considering the results of BS’ acquisition of vowel quality discussed in Section 4.3.4.1, this finding at the laryngeal level of the tense/lax vowel contrast is not surprising, since BS produced only 35% of adult targets /ɪ/ as [ɪ], while the rest as [i]. This means that BS fully differentiated neither the vowel quality nor the laryngeal configuration accompanying it. Her results showed that she was in the process of acquisition of both segmental and laryngeal differences between the vowels, but she had not yet fully acquired either of them. The question arises whether the acquisition of segmental vowel quality is a necessary condition for the acquisition of the accompanying laryngeal difference or whether it is the other way around. Analysis of a subset of BS’ vowels actually produced as [i] and [ɪ] (rather than all targets /i/ and /ɪ/) did not seem to change the picture at the laryngeal level.
This suggests that the two levels of BS’ tense/lax vowels, vowel quality and accompanying vocal effort, are acquired separately, and that acquisition of postvocalic conditioning of duration is yet to start. The fact that BS did not reach the same extent of tense/lax vowel differentiation at the laryngeal level as the SSE monolingual children is reminiscent of the patterns for the vowel duration observed in Kehoe’s (2002) study, which showed that the German/Spanish bilingual children aged 2;3 to 2;6 produced a significantly smaller extent of the durational difference between short and long vowels than the German monolingual children. The third acquisition pattern was observed between the consonant-dependent vocal effort for the SSE vowel /ʉ/ and the MSR /u/. The ANOVA showed no language differentiation for this variable, based on the lack of significant interaction between BS’ languages and the factor “FOLLOWING CONSONANT”. Besides, the comparison of BS’ patterns of vocal effort for /ʉ/ (based on median A2*c, dB) to those of the SSE monolingual children showed no significant effects or interactions. The lack of significance revealed that BS had not acquired the fine-grained SSE vocal effort pattern. This pattern is in line with the fact that BS had not yet acquired the durational SVLR pattern for this SSE vowel. However, as in AN’s data, interpreting this apparent lack of language differentiation in favour of language interaction is problematic, since not all the SSE monolingual children seemed to produce the pattern of vocal effort for the vowel /ʉ/.

7 Discussion and Conclusion

7.1 Overview of the main findings

7.1.1 Language differentiation and interaction patterns

The study accounted for the language differentiation and interaction patterns in the speech of two early simultaneous bilinguals: i.e. BS (aged 3;4 to 4;5) and AN (aged 3;8 to 4;5).
The bilingual girls were acquiring Russian and Scottish English in Edinburgh in Russian-speaking families with a similar sociolinguistic background. However, the subjects differed in the amount of language input received by the start of recordings: i.e. BS (Figure 3-1) had substantially less input in Scottish English than AN (Figure 3-2). We addressed the detail of their production of the prominent syllable nuclear vowels /i ɪ ʉ/ in Scottish English versus the vowels /i u/ in Russian for one segmental aspect (vowel quality) and two suprasegmental aspects (vowel duration and vocal effort). The set-up of the study was varied to trigger potential language differentiation and interaction effects based on crosslinguistic structure, language input conditions and longitudinal effects. The subjects produced variable degrees of language differentiation and interaction depending on their age, language exposure, crosslinguistic structure and the variable concerned. We formulated four research questions for this study, namely: (1) Are the languages differentiated? (2) Is their SSE native-like compared to the SSE-speaking children and adults? (3) Is their MSR native-like compared to the MSR-speaking adults (including mothers)? (4) Is there language interaction? (What are the patterns?) The results of the study are summarised in Table 7-1. In the table we give the “yes/no” answers to the above questions based on a combination of statistical results in three comparisons: (1) of each bilingual child’s speech to that of the SSE monolingual children (based on ANOVAs), (2) of each bilingual child’s speech to that of the MSR adults (based on descriptive statistics), and (3) of each subject’s two languages to each other (based on ANOVAs). The table gives the answers to the research questions for the total of eight research variables across different vowel sets and levels of speech production (vowel quality, duration and vocal effort). The results are shown per subject, age sample, research variable and vowel set considered.
The language differentiation effects can be split into three groups based on Table 7-1: (1) total differentiation, when a subject’s speech production was within the range of the SSE monolingual peers and MSR adults, and both languages differed from each other in the expected direction; (2) partial differentiation, when the subject’s languages differed from each other in the expected direction, but one of the languages differed from the monolingual controls; (3) lack of differentiation, when neither language differed from the other in the expected direction, and either one or both languages differed from the controls. Language interaction (accounted for in Chapters 4-6) appeared in the sound structures with partial or lacking language differentiation.

For AN, Table 7-1 shows that out of the eight variables, at the age of 3;8 she lacked language differentiation for the two variables involving postvocalic conditioning of vowel duration: for SSE/MSR /i/, and for SSE /ʉ/ versus MSR /u/. At the same age, there were four language interaction patterns, two of which were due to partial language differentiation. At the age of 4;2 only one pattern, the postvocalic conditioning of duration of SSE /ʉ/ and MSR /u/, lacked language differentiation, while there were three language interaction patterns. At the age of 4;5, AN fully differentiated between her MSR and SSE for all eight variables, and thus no more language interaction for these variables was observed. Overall for AN, the language interaction patterns involved only two research variables: i.e. vowel quality and duration, of which vowel duration was affected most, since its language differentiation was lacking rather than partial. No language interaction effects were found for AN’s vocal effort patterns.
She differentiated between her languages for all three variables involving vocal effort, except that no definite answer could be given for her vocal effort pattern of SSE /ʉ/ and MSR /u/ at the age of 4;5 (“?” in the table) due to potential methodological problems (see Section 7.1.2.6). Table 7-1 also shows for this subject, who was ‘balanced’ with regard to language exposure, three quite divergent directions of language interaction, even though the extent of this interaction was quite marginal (especially for the vowel quality variables). The uni-directional interaction from SSE to MSR involved AN’s production of vowel quality. AN introduced a non-existent lax vowel [ɪ] in her MSR, splitting the MSR vowel phoneme /i/ into two phones [i] (90%) and [ɪ] (10%), while acquiring the SSE tense/lax contrast similarly to the monolingual peers. To the best of our knowledge, this is the first report of systematic language interaction involving tense/lax vowels in this direction (apart from a note of a similar phenomenon in Keshavarz and Ingram (2002); see Section 2.2.2 for discussion). The uni-directional interaction from MSR to SSE involved postvocalic conditioning of vowel duration of /i/. The pattern was very similar to the ‘reduced’ pattern (compared to monolingual peers) of intrinsically short-long vowel duration in German-Spanish bilinguals in Kehoe’s (2002) study (see discussion in Section 2.3.2). The bi-directional interaction from SSE to MSR and MSR to SSE involved the postvocalic conditioning of vowel duration of SSE /ʉ/ and MSR /u/. We have not found reports of this type of interaction in early simultaneous bilingual acquisition studies. However, similar bi-directional effects were reported for pitch alignment and intonation in proficient L1 Dutch learners of L2 Greek (Mennen, 2004). There are also studies reporting bi-directional cross-language effects in VOT (Caramazza et al., 1973; Flege, 1987; Williams, 1980).
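The three differentiation groups defined above and the interaction-pattern codes used in Table 7-1 can be restated as a small decision rule. The sketch below is illustrative only and is not part of the original analysis: the function name `classify` and the mapping from the three yes/no answers to a pattern code are assumptions inferred from the coded cells of Table 7-1, and cells marked "/" (not applicable) or "?" (unable to determine) are deliberately not handled.

```python
from typing import Tuple

# Interaction-pattern codes as listed in the legend of Table 7-1.
NONE, MSR_TO_SSE, SSE_TO_MSR, BIDIRECTIONAL = 0, 1, 2, 3


def classify(differentiated: bool, sse_native: bool, msr_native: bool) -> Tuple[str, int]:
    """Return (differentiation group, interaction code) for one table cell.

    Hypothetical helper: the mapping is an assumed reading of Table 7-1,
    not a procedure stated in the thesis.
    """
    # The language that departs from its monolingual controls is taken
    # to be the one affected by interaction from the other language.
    if not sse_native and not msr_native:
        code = BIDIRECTIONAL
    elif not sse_native:
        code = MSR_TO_SSE
    elif not msr_native:
        code = SSE_TO_MSR
    else:
        code = NONE

    if differentiated:
        # Both languages native-like: total; otherwise partial.
        group = "total" if code == NONE else "partial"
    else:
        if code == NONE:
            # Group (3) requires at least one language to differ from controls.
            raise ValueError("undifferentiated but native-like in both languages")
        group = "lacking"
    return group, code


# Example: AN at 3;8, vowel duration of SSE /ʉ/ versus MSR /u/ (No, No, No).
print(classify(False, False, False))  # ('lacking', 3)
```

Under these assumptions the rule reproduces the coded cells of Table 7-1: for instance, the answers "Yes, No, Yes" give partial differentiation with interaction from MSR to SSE (code 1).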
Despite the internal divergence of the direction of language interaction within AN’s speech production, all these patterns are backed up by the literature to some extent, and in this sense they are coherent.

For the Russian-‘dominant’ subject BS (Table 7-1) there was also a longitudinal tendency for the eight research variables. However, she differed substantially from AN in that she had a lesser extent of language differentiation. She also produced more language interaction effects and an overall different direction of language interaction. Like AN, at the age of 3;4 BS lacked language differentiation for two variables involving postvocalic conditioning of vowel duration: SSE/MSR /i/, and SSE /ʉ/ versus MSR /u/. At the same age, she produced six language interaction effects (involving partial or lacking differentiation). The language interaction involved all three research variables (vowel quality, duration and vocal effort), of which vowel duration was affected most due to the total lack of language differentiation. At the age of 3;10 the situation did not change either in systematicity or in extent. At the age of 4;5, BS differentiated between her languages. However, she still produced five language interaction effects due to partial differentiation. Table 7-1 shows that the subject BS did not show any language interaction effects in MSR. All the interaction effects were unidirectional from MSR to SSE. The effects in SSE were quite extensive in that they affected all three research variables. Each of the variables (vowel quality, duration and vocal effort) was affected to a variable degree. BS’ results for the acquisition of SVLR for the vowels /i/ and /ʉ/ both differ from and overlap with AN’s results for these vowels. In statistical terms, the difference was in the greater significance of the factor ‘bilinguality’ in BS’ case both in comparison to AN and to the SSE peers. The overlap was in the direction of language interaction observed for these vowels: i.e.
at a younger age both subjects produced a somewhat reduced extent of SVLR (compared to the monolingual peers) between the contexts before voiceless stops and voiced fricatives: i.e. their VLS/VF ratios were greater than either the maximal or the average values of the SSE children. For AN, the reduced extent was only observed at the youngest age of 3;8. For BS, the reduced pattern persisted throughout the three age samples 3;4, 3;10 and 4;5. Once again the ‘reduced’ SVLR in the speech production of both subjects agrees with Kehoe’s (2002) study, which showed that German/Spanish bilingual children aged 2;3 to 2;6 produced a significantly smaller extent of the durational difference between short and long vowels than German monolingual children. Figure 7-1 exemplifies the empirical findings for the bilingual subjects in a more abstract way. The two languages of each subject are presented as a cross-section, which comprises the subject-specific extent of speech immaturity and the language interaction effects (their extent and direction for the same set of variables). In the following sections we shall address the differences and similarities between the subjects with regard to different conditioning factors.

[Figure 7-1: two panels, “AN's model of SSE/MSR representation” and “BS's model of SSE/MSR representation”. Each panel shows SSE-like and MSR-like sound structures with regions labelled “Speech Immaturity”; AN's panel has regions “Interaction from MSR” and “Interaction from SSE”, BS's panel only “Interaction from MSR”.]

Figure 7-1 Visual footprint of BS’ and AN’s language differentiation in their two languages, speech immaturity and the direction of language interaction based on the results in this study.

Table 7-1 Patterns of language differentiation and interaction observed for the two bilingual subjects (BS and AN) in different age samples, for three research variables and two vowel sets.
                                             Vowel Quality           Vowel Duration                  Vocal Effort
Subject   Research question                  /i ɪ/ /i/   /ʉ/ /u/     /i/    /ʉ/ /u/   /ɪ/ /i/     /i/    /ʉ/ /u/   /i/ /ɪ/
and age

BS 3;4    Languages differentiated?          Yes         Yes         No     No        /           Yes    Yes       /
          SSE native-like?                   No          No          No     No        No          Yes    Yes       No
          MSR native-like?                   Yes         Yes         Yes    Yes       /           Yes    Yes       /
          Pattern of language interaction    1           1           1      1         1           0      0         1

BS 3;10   Languages differentiated?          Yes         Yes         No     No        /           Yes    Yes       /
          SSE native-like?                   No          No          No     No        No          Yes    Yes       No
          MSR native-like?                   Yes         Yes         Yes    Yes       /           Yes    Yes       /
          Pattern of language interaction    1           1           1      1         1           0      0         1

BS 4;5    Languages differentiated?          Yes         Yes         Yes    Yes       /           Yes    ?         /
          SSE native-like?                   No          Yes         No     No        No          Yes    ?         No
          MSR native-like?                   Yes         Yes         Yes    Yes       /           Yes    ?         /
          Pattern of language interaction    1           0           1      1         1           0      ?         1

AN 3;8    Languages differentiated?          Yes         Yes         No     No        /           Yes    Yes       /
          SSE native-like?                   Yes         Yes         No     No        Yes         Yes    Yes       Yes
          MSR native-like?                   No          No          Yes    No        /           Yes    Yes       /
          Pattern of language interaction    2           2           1      3         0           0      0         0

AN 4;2    Languages differentiated?          Yes         Yes         Yes    No        /           Yes    Yes       /
          SSE native-like?                   Yes         Yes         Yes    No        Yes         Yes    Yes       Yes
          MSR native-like?                   No          No          Yes    No        /           Yes    Yes       /
          Pattern of language interaction    2           2           0      3         0           0      0         0

AN 4;5    Languages differentiated?          Yes         Yes         Yes    Yes       /           Yes    ?         /
          SSE native-like?                   Yes         Yes         Yes    Yes       Yes         Yes    ?         Yes
          MSR native-like?                   Yes         Yes         Yes    Yes       /           Yes    ?         /
          Pattern of language interaction    0           0           0      0         0           0      ?         0

Patterns of language interaction:
0  none
1  uni-directional interaction from MSR to SSE
2  uni-directional interaction from SSE to MSR
3  bi-directional interaction from SSE to MSR and from MSR to SSE

Abbreviations:
/  not applicable
?  unable to determine

7.1.2 Conditioning Factors of Language Differentiation and Interaction

7.1.2.1 The role of language input conditions versus language structure

In the introduction to the methods used in studies of Bilingual First Language Acquisition (BFLA) De Houwer (1998, p.
258) questions the use of the term ‘language dominance’ (Petersen, 1988; Lanza, 1992), because it is often couched in terms of another concept, ‘proficiency’, usually referring to assessment of adult language skills. De Houwer rightly points out that the link between ‘dominance’ and ‘proficiency’ is problematic with regard to immature child speech, and that assessment of ‘dominance’ is often performed using monolingual solutions, like word- or morpheme-based MLU (Brown, 1973), given the lack of a baseline for crosslinguistic comparison. De Houwer further states that “it remains to be considered whether and to what extent the notion of ‘dominance’ is at all needed either as a descriptive or an explanatory concept with regard to very young bilingual children” (1998, p. 258). Like Lanza (2000), we do not agree with the latter statement, for the following reason. Whatever form ‘dominance’ takes in the mental representation of a bilingual, it has an environmental source: it should be shaped by the amount of exposure to the two languages and by the need to “communicate with people in the immediate environment” (Grosjean, 1982, p. 189). Obviously there may be as many variable situations with regard to the language input as there are bilingual children. One of the restrictions of studies doubting the usefulness of the concept of ‘dominance’ (De Houwer, 1990; Döpke, 1998; Döpke, 2000; Müller, 1998) is that they have looked at the acquisition of morphosyntax only. As we have seen, phonological studies that have considered environmental conditioning of ‘dominance’ conclude that the factor may play a role in language differentiation and interaction (e.g. in production of prosodic properties such as VOT or rhythm: Kehoe et al., 2001; Paradis, 2001). This conclusion has found support in this study. We defined the potential ‘language dominance’ of the subjects from the amount of exposure to the two languages rather than from the output ‘proficiency’.
We asked the question whether the observed patterns of language differentiation and interaction differ along this ‘language exposure’ dimension and how this relates to the structural properties of the two languages in contact. In BS’ case, with her substantial exposure to MSR (Figure 3-1), overall we observed unidirectional language interaction from her MSR into SSE (Table 7-1, Figure 7-1). Qualitatively, the language interaction effects were similar to the ‘transfer’ accounted for in L2-acquisition studies. For vowel quality there is abundant evidence that L2-learners “under-differentiate” (Weinreich, 1953) phonological contrasts such as tense/lax vowels if they are absent (or represented by one phoneme) in their L1 (Panasyuk et al., 1995; Markus & Bond, 1999; Escudero, 2000; Guion, 2003; Piske et al., 2002). The pattern of language interaction in BS’ case was similar, since it involved the overuse of the tense vowel [i] for the SSE lax vowel /ɪ/. The direction of language interaction in BS’ case was compatible with the direction predicted by CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998) for simultaneous bilingual acquisition. According to both hypotheses it is directed unidirectionally from a structurally simpler into a structurally more complex language (irrespective of ‘dominance’). The pattern of BS’ language interaction also agrees with her individual pattern of language exposure (more Russian than English). However, the pattern observed for AN for the vowel quality in this set of vowels was just the opposite of BS’: in AN’s case it was directed from SSE to MSR, where she “over-differentiated” (Weinreich, 1953) the tense/lax contrast. AN introduced a phonologically irrelevant tense/lax contrast in her Russian. Recall AN’s more ‘balanced’ language exposure pattern (Figure 3-2).
The mirror-image language interaction for the same structural ambiguity between the two subjects is then not explainable in terms of the simplicity or complexity of the sound structures involved, but can rather be explained by the subjects’ different language exposure patterns. Since both CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998) predict unidirectional language interaction for one and the same language structure (though it can be bi-directional for different ones), these hypotheses thus find no support in this study at the level of sound structure. As opposed to the partial language differentiation in BS’ production of vowel quality for the SSE /i ɪ/ and MSR /i/, we observed an overall lack of language differentiation for the subject’s vowel duration (if we do not consider a non-significant longitudinal change at the age of 4;5). Her patterns of language differentiation and interaction again resembled the “under-differentiation” effects observed in L2-learners of languages with complex vowel duration conditioning patterns. In the studies of vowel duration conditioning in L1 French learners of L2 English (Mack, 1982) and L1 Russian learners of L2 Latvian (Markus & Bond, 1999), the L2-learners produced some ‘intermediate’ results for phonetically or phonologically long and short vowels, given the lack of such a system in L1, or failed to produce them at all. Similarly, BS’ SSE SVLR targets /i/ generally were close to those in her MSR, i.e. she did not produce the long vowel duration before voiced fricatives the same way as the SSE monolingual children. Consequently, BS did not produce a differentiated postvocalic pattern for the lax vowel /ɪ/ compared to her production of SSE /i/. The pattern of language interaction again was clearly directed from her Russian into SSE. However, AN’s cross-linguistic patterns of vowel duration conditioning for /i/ and /ɪ/ were only quantitatively different from those of BS: i.e.
while BS did not seem to produce the SVLR pattern at all (at least not in the first two age samples), AN had a ‘reduced’ extent of SVLR for /i/ at the age of 3;8 similar to the patterns reported in Kehoe (2004). Besides, in SSE AN did produce a differentiated postvocalic conditioning, with SVLR for /i/ and invariably short conditioning for /ɪ/. We can make two further comments regarding these findings. First, the difference in language interaction between the two subjects for this variable and set of vowels is quantitative (since the direction of interaction is the same). Therefore, plausibly, the difference in the amount of language exposure between the two subjects is reflected in the different extent of language differentiation for the SSE vowel duration patterns in their speech production. Secondly, there is the striking fact that AN produced the vowel quality interaction between tense/lax vowels unidirectionally from SSE into MSR, while her SVLR vowel duration pattern for /i/ showed the opposite unidirectional interaction from MSR to SSE. This means that in AN’s case, for the two variables of vowel quality and duration, the language interaction effects were bi-directional within the speech production of the same subject. This finding is problematic for the Language Dominance Hypothesis (Petersen, 1988) in simultaneous bilingual acquisition, since it predicts unidirectional language interaction in the linguistic output of the same individual. The third language interaction effect, observed in AN’s postvocalic conditioning of duration of the SSE vowel /ʉ/ and MSR /u/ (Table 7-1), was a bi-directional transfer from SSE to MSR and MSR to SSE. Recall that she produced a more SVLR-like pattern in her Russian, while producing a ‘reduced’ (compared to the monolingual children) SVLR difference in SSE. This effect was observed at the ages of both 3;8 and 4;2, while at the age of 4;5 she produced language-specific patterns.
This means that for the postvocalic vowel duration pattern the language interaction effects were bi-directional within the same subject and sound structure variable. This conclusion is problematic for the Language Dominance Hypothesis (Petersen, 1988), for CCCH (Döpke, 1998; Döpke, 2000), and for the Markedness Hypothesis (Müller, 1998), since all of them predict only unidirectional language interaction for the same language property. However, such bi-directional interaction is in line with findings of bi-directional transfer of the timing of intonation patterns in proficient L1 Dutch learners of L2 Greek (Mennen, 2004). In this sense, the bi-directional interaction is not confined to early simultaneous bilingual acquisition, and on this basis should not be used as an argument for a functional distinction between early simultaneous bilinguals and L2-learners. This finding shows that in more ‘balanced’ bilinguals, the direction of interaction can be ‘fuzzy’, and both languages can be affected. Such fine-grained phonetic interaction in variable speech production may not even be perceivable in the context of less mature child speech. To summarise so far: for more ‘balanced’ bilingual children (like AN), at the level of sound structure, there seems to be no necessary direction of ‘dominance’, since the language balance can be blurred for some less categorical variables like vowel duration. The balance of acquired sound structures is generally language-specific (and mainly differentiated). It depends on the sound structure in question. In some cases language interaction can affect either of the languages and can even be bi-directional. Surface-structural ‘markedness’ or ‘cue strength’ do not necessarily determine the direction of language interaction at the level of sound structure. Our data support Lanza’s suggestion that the two possible conditioning factors of language interaction: i.e.
structural properties and environmental factors such as language exposure need not be exclusive arguments (Lanza, 2000, p. 233). In that sense language interaction can be considered to be a ‘normal’, but not obligatory feature of the simultaneous bilingual acquisition of sound structure. It is important to emphasise that CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis were formulated on the basis of morphosyntactic acquisition studies, while we dealt with the altogether different linguistic level of sound structure. Paradis & Genesee (1996) also provided evidence of autonomous development based on studies of syntactic structures. However, in two later studies Paradis (2000; 2001) refined the claim of autonomous development, based on further evidence from French-English truncation patterns at the level of prosodic structure, which did show language interaction effects as in this study. She attributed the differences between the two studies to methodological issues such as observation versus experimental manipulation, and to the differences in the language pairs involved. We used two methods (auditory labelling and instrumental acoustic analysis), yet most language interaction effects consistently showed up for the same variables and subjects irrespective of the method used. One obvious difference between morphosyntactic and sound structures is that the physical manifestation of speech production is dual. It embraces the discrete ‘phonologised’ mental hierarchy of language units and continuous speech motor control, whereas this dichotomy is absent in the production of morphosyntactic structures. It is possible that some language interaction effects observed in this study are bound to the level of sound structure due to this dual nature of speech production.
However, this should not mean that studies of non-speech levels of language should not look at the possibility of such bi-directional interaction in the discrete language properties.

7.1.2.2 Sound-structural effects

In the previous section we showed that, even though some language interaction effects were compatible with the structural arguments proposed by the CCCH (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998), structural factors such as ‘markedness’ or ‘cue strength’ do not predict the direction of language interaction at the level of sound structure. In addition to possible problems with the language level involved (morphosyntax versus sound structure), there is another issue that explains why these structural factors must be questioned. The problem at the level of sound structure may lie in determining markedness for segments in isolation. Segments or their suprasegmental properties (like postvocalic vowel duration conditioning) are intertwined with other conditioning factors, such as crosslinguistic differences in final devoicing of the phonologically voiced obstruents (and their own relative markedness). For example, BS may not produce the SSE language-specific vowel duration pattern (with the lack of SVLR in SSE), because she might have produced complete final devoicing phonetically (or in a phonologically neutralising way) more than the monolingual children (see also Section 3.4.1). This is because, even if not neutralising, a phonetically voiceless final consonant makes the preceding vowel shorter, if the vowel is by definition phonated. Neutralisation of final voicing might be complete (or more complete) in Russian compared to SSE in utterance-final positions (see e.g. Burton & Robblee, 1997), since SSE is like American English or other British English varieties (Docherty, 1992; Smith, 1997).
Determining the ‘completeness/gradualness’ of the neutralisation is not a trivial issue (especially in child speech), since it potentially requires the use of further instrumental techniques such as airflow measurements or electroglottography in addition to auditory or acoustic analysis, which we could not perform within the scope of this thesis. We have already discussed the fact (Section 2.1.2.6) that SVLR in SSE is conditioned by both voicing and manner of articulation, and thus completeness of voicing alone would not have explained the large extent of the difference in duration produced by either subject. However, this argument may apply to bilingual acquisition studies of postvocalic conditioning dealing with languages that differ in postvocalic conditioning based on the voicing effect, as in German and SSBE (Whitworth, 2003), especially when such studies involve markedness in the discussion (Section 2.3.2). We have seen in AN’s case that not all language interaction effects observed in the vowels originate in the vowels (AN’s [ʉ] for MSR /u/ in Section 4.3.3.2.2); they may be due to the phonotactic influence of the preceding consonant. This is another reason why it can be misleading to study segments in isolation without considering the contextual effects. Sound structure did matter in the sense that some research variables (Table 7-1) seemed to be more prone to language interaction than others, depending both on the level of speech (vowel quality, duration and vocal effort) and on the vowel set concerned. For example, within the segmental level the crosslinguistic systemic difference in the vowel pair /i ɪ/ showed more language interaction effects than the realisational difference between SSE /ʉ/ and MSR /u/ for both subjects.
Our assessment of the SSE vowel set /i ɪ/ versus MSR /i/ was more versatile compared to other monolingual, L2 or bilingual acquisition studies (Kehoe & Stoel-Gammon, 2001; Buder & Stoel-Gammon, 2002; Kehoe, 2002; Stoel-Gammon et al., 1995). It was new in that, in addition to vowel quality and/or duration, we also assessed the production of laryngeal effects connected to the vocal effort accompanying these vowels. For this contrast, AN produced all vowel quality, duration and vocal effort effects in a native-like way in SSE, and the contrast only affected her vowel quality in Russian. At the same time, Russian-dominant BS did not fully differentiate the SSE tense/lax contrast in vowel quality, duration and vocal effort, although she produced the contrast to some extent at the level of vowel quality and vocal effort. Therefore in BS’ case, the lack of contrast in Russian did seem to affect her SSE speech production. Both subjects seemed to have less difficulty acquiring the realisational difference between SSE /ʉ/ and MSR /u/ (Table 7-1), since they had a greater extent of language differentiation for this variable. From the bilingual point of view, it is possible that a systemic tense/lax contrast like /i ɪ/ is more difficult to acquire than realisational differences like SSE /ʉ/ and MSR /u/, not because it is more complex in surface sound structure, but because the tense/lax contrast involves a more complicated speech motor control of both supralaryngeal and laryngeal levels on top of timing, as opposed to the /ʉ u/ difference (which mainly involves supralaryngeal differences). On the other hand, for monolingual SSE acquisition this study (Section 4.3.1.3) and Matthews (2002) have shown that the SSE monolingual children’s production of the lax vowel /ɪ/ was more ‘adult-like’ than that of the vowel /ʉ/, despite the more complicated speech motor control in the lax vowels compared to /ʉ/.
This discrepancy may not be a matter of the acquisition of speech motor control of the target sounds at the age considered in this study. Both /i ɪ/ had been acquired. The phone [ʉ] had been acquired and is produced alongside other less frequent phonetic variants [] and [u] for /ʉ/. After all, the children are systematically exposed to other non-SSE English varieties in Edinburgh in addition to SSE. Studies involving a sociolinguistic perspective on phonological acquisition (Docherty & Foulkes, 1999; Khattab, 2004; Scobbie, 2005) have convincingly shown that “aspects of variable performance must be learned alongside reflexes of the system of lexical contrast” (Docherty et al., in press). It is thus possible that this pattern of variation in /ʉ/ in the SSE monolingual children reflects the cross-varietal variability in Edinburgh, and will possibly be reduced at later school age towards a more adult-like range, when the linguistic background of the majority SSE peers becomes more important through their socialisation pattern. As Chevrot et al. (2000, p. 297) put it: “it is probable that stylistic skills precede stylistic awareness of the social meaning of variants”, meaning that this type of sociolinguistic cross-varietal variation is encoded in the mental representation through language input, but the metalinguistic awareness of the appropriate social meaning of a variety is yet to be acquired and applied. Another sound-structural effect emerging from our bilingual data is the apparent discrepancy in language differentiation patterns between the three gross variables regarding sound structure in this study. In the results of both subjects we only observed patterns apparently lacking language differentiation for vowel duration, while for vowel quality and vocal effort the languages were either fully or partially differentiated (Table 7-1).
This mainly concerned the postvocalic conditioning of vowel duration involving SVLR in SSE, and the lack of such an extent of conditioning in Russian. As a result, most language interaction effects (resulting from total lack of or partial language differentiation), both uni- and bi-directional, appeared for this variable. At the same time we concluded that their monolingual SSE peers had already acquired the extrinsic and intrinsic vowel duration patterns. We should clearly note here that “lack of language differentiation” for this variable does not imply a categorical statement, since we dealt with continuous speech production reflecting variability ranges, in which some of the produced tokens fell within the monolingual production ranges. Besides, it is very much an open question how categorical perception of this timing parameter works in both child and adult cases, in either a bilingual or a monolingual context (Macken, 1986). Our data do not allow evaluation of this, not least because it is difficult to separate different (supra-)segmental levels of speech from each other in accent judgment experiments based on real speech production. We were not aware of the language interaction in the vowel duration component in AN’s speech during data annotation, even though it was quite clear to us for BS’s long vowels. One conclusion arising from the comparison to the SSE monolingual children is that, plausibly, the language interaction effects in the vowel duration of the two bilingual children cannot be accounted for by speech immaturity. Another conclusion is that ‘markedness’ or ‘cue strength’ resulting from the surface structure of vowel duration do not seem to play a role in the way proposed for the acquisition of morphosyntax (Döpke, 1998; Döpke, 2000; Müller, 1998), since we have observed bi-directional effects for the same variables. It is plausible that the bilingual input conditions make the two bilingual subjects different from the monolingual SSE peers.
It is possible that the crosslinguistic difference in the input structure of postvocalic vowel duration conditioning has affected the bilingual children’s ability to ‘phonologise’ (Keating, 1984) the versatility of cross-linguistic input rules, and has affected their categorical perception of these continuous variables and their acquisition. The bi-directionality of the language interaction in AN’s data does not support the view that it always works in a specific direction, as in BS’s case. It only suggests that in the case of postvocalic conditioning of duration the categoricalness of perception might be somewhat ‘blurred’ due to the versatility of bilingual input.

7.1.2.3 Lexicalisation effects

Proponents of the segment-sized basis of phonological acquisition (Wode, 1992, p. 622) have argued that in a bilingual context variation due to crosslinguistic influence occurs in all targets containing a given segment, and that there is no evidence that phonological variation in early bilingual acquisition is lexically based. We have seen a clear example in AN’s data in Section 4.3.3.2.2 that a language interaction effect can be lexically bound. 92.4% of instances of AN’s [] for the MSR /u/ were confined to one MSR carrier word [’ut] (a joker), while the rest of the MSR tokens with this vowel were quite adult-like. AN had some difficulty in producing this particular lexical item, either because of the influence of the preceding consonant, which she consistently produced with laminal SSE articulation instead of apical as in MSR, or because this lexical item happened to be a false cognate of the English verb “to shoot”. In either case, this example illustrates that language interaction can be lexically bound, and that it can be misleading to adopt a strict segmental view of phonological acquisition without accounting for the other systemic influences on the segments.
This finding thus supports the views proposing broader units of phonological acquisition, which involves “formalization of the strategies that a particular child has adopted to represent words and classes of phonetically similar words” (Macken, 1986, p.264), either lexically (Ferguson & Farwell, 1975), or “word template” based (Vihman, 1996; Vihman, 2002).

7.1.2.4 Maturation and age effects

In Section 1.3.2.5 we discussed the Bilingual Bootstrapping Hypothesis (Gawlitzek-Maiwald & Tracy, 1996). The hypothesis views bilingual language acquisition in a maturational perspective. Under this view, language interaction in the syntactic acquisition of young simultaneous bilinguals is a relief strategy involving a temporary use of child expertise in one domain of LA to solve similar problems in LB. One of the falsifiable predictions from the hypothesis was that language interaction with regard to a structure should cease once the structure is acquired. However, two patterns in AN’s data are problematic for the Bilingual Bootstrapping Hypothesis (Gawlitzek-Maiwald & Tracy, 1996), at least at the level of sound structure. The patterns were: (1) introducing the SSE lax vowel [] for the Russian /i/ and (2) producing an SVLR-like durational difference in the otherwise ‘simple’ Russian model of postvocalic conditioning of vowel duration. If bilingual language interaction is due to the fact that a property has not yet been acquired, then language interaction regarding a property in LB should cease once the similar property in LB is acquired. However, in the case of AN introducing a non-existent phoneme // for the Russian /i/ it is not the question of acquisition that seems to cause language interaction. We showed that AN acquired and used both phonemes in SSE in a way similar to that of the SSE monolingual children.
In the light of ‘bilingual bootstrapping’ it should be no problem for AN to produce Russian /i/, if she can produce both SSE /i/ and // in appropriate contexts in a native-like way, since Russian and SSE /i/ are similar in vowel quality. It thus appears that at the level of sound structure language interaction does not necessarily cease when the language structures involved in this interaction are acquired. From this perspective acquisition of a single sound-structural property does not seem to be the only condition for its appropriate use, nor does completeness of its acquisition explain language interaction. This bilingual pattern concerns the nature of phonological development in general: “If a child’s business is to construct, within the boundaries of UG, the simplest account of the input data why does a child impose additional complexity on the grammar?” (Mohanan, 1992). We showed in Section 4.3.3.1.1 that this (admittedly marginal) pattern involved all carrier words; it did not seem to have phonotactic explanations from the preceding palatalised consonants, and was longitudinally coherent (ceasing at the age of 4;5). Therefore, the process was systematic. Yet it is still possible that the appearance of [] in AN’s Russian may have a phonotactic explanation from the influence of the following consonants in the carrier /ti/ [ti] (a finch). For example, phonetically devoiced // in the syllable coda may have been ‘perceptually assimilated’ to // word-finally. In English, there are only a few infrequent words ending in /i/ (‘sleesh’, ‘creesh’, ‘sneesh’, ‘quiche’, ‘niche’) and many more ending with //, including really frequent ones (such as ‘fish’ and ‘dish’) (Rockey, 1973) that a child is likely to know. In Russian, monosyllabic words ending with /i/ or /i/ are relatively infrequent too. The greater frequency of the SSE // may therefore have cross-linguistically affected the production of the Russian low-frequency targets ending with /i/.
This possibly explains the appearance of [] for /i/ in one MSR carrier word out of three. However, this shows the enormous complexity of the task of eliciting data from children, the multidimensionality of crosslinguistic sound-structural (in-)compatibility, and the strength of the distributional characteristics of the input language. In that sense, the pattern of language interaction is compatible with the input to the child rather than with surface structural ‘markedness’. This frequency effect suggests that bilingual acquisition of sound structure is lexically (Ferguson & Farwell, 1975) or “word template” based (Vihman, 1996; Vihman, 2002), rather than instantiated segmentally.

With regard to age effects it is further worth noting here that most syntactic acquisition studies claiming autonomous development also studied children of a younger age (usually 1;5 to 3;0). The two subjects in this study were older (3;4 to 4;5) and thus had had more time to practise their speech motor routines and phonologise them in both languages. Despite this, both subjects showed signs of systematic language interaction. This finding supports the idea that not all language levels might be equally prone to language interaction, with the level of sound structure in general being more prone to it than some regular morphosyntactic properties (Paradis, 2000) (if we do not consider, for example, irregular or infrequent morphosyntactic subtleties), and this should encourage more research into the acquisition of bilingual speech.

Figure 7-2 Abstract representation of the longitudinal effect for the bilingual subjects AN and BS on their bilingual language differentiation based on the number of sound structure variables involved in total and partial language differentiation across their two languages.

We observed systematic longitudinal effects in language differentiation for both subjects. The longitudinal effect is shown in Figure 7-2.
The language differentiation effects in Figure 7-2 are split into three groups based on Table 7-1: (1) total differentiation, when a subject’s speech production was within the range of the SSE monolingual peers and MSR adults, and both languages differed from each other in the expected direction; (2) partial differentiation, when a subject’s languages differed from each other in the expected direction, but one of the languages differed from the monolingual controls; (3) lack of differentiation, when the languages did not differ from each other in the expected direction, and either one or both languages differed from the controls. The width of each type of differentiation is determined by the number of variables (out of eight in Table 7-1) showing each type of differentiation. All amounts of language differentiation are drawn across the two languages. Several tendencies are apparent from Figure 7-2. First of all, both subjects show a systematic progression towards more differentiation with increasing age, which shows up in the amount of total and partial differentiation. This suggests that their bilingual speech production for all variables becomes more and more language-specific, and that the observed amounts of language interaction still reflect some more ‘initial’ stage of language acquisition, which may eventually cease with growing linguistic experience. In this sense their state of bilingual language acquisition is not necessarily different from L2 acquisition, for which it is known that ultimate attainment is proportional to the amount of language exposure (Flege et al., 1995; Birdsong, 2004) with a confounding effect of age of acquisition, and where ‘transfer’ is known to manifest itself most obviously in the initial stages of L2 acquisition. Secondly, the amount of differentiation lacking seems to be nearly equal in both subjects (despite their language exposure differences) and it does not affect the majority of their speech production.
Recall that the lack of differentiation was mostly contributed by the vowel duration component. In that sense both girls’ patterns are different from predictions made for adult L2-learners. For example, the Competition Model (Bates & MacWhinney, 1989; MacWhinney, 1997) predicts for beginning L2 learners that everything that can transfer (given ‘cue strength’ differences) will transfer. Thirdly, the girls do substantially differ in the amount of ‘partial differentiation’ compared to total language differentiation. By the age of 4;5, AN achieved total language differentiation for all the variables considered, while BS reached a stage where she had no patterns lacking differentiation. It seems that given their different language exposure patterns, the parameters of exposure and age co-vary (or perhaps accumulate) and with increasing age affect the output language differentiation patterns. Once again, the interdependence of language preference in adulthood, age of onset (amount of exposure in years) and ultimate attainment is not new in L2-acquisition studies. For example, Flege et al. (1995) assessed the relation between non-native subjects' age of learning (AOL) English and the overall degree of perceived foreign accent in their production of English sentences. The 240 native Italian subjects had begun learning English in Canada between the ages of 2 and 23, and had lived in Canada for an average of 32 years. Native English-speaking listeners used a continuous scale to rate sentences spoken by the native Italian subjects and by subjects in an L1 English comparison group. Age of onset accounted for an average of 59% of variance in the foreign accent ratings. Language use factors, such as dominance, accounted for an additional 15% of variance. Thus, also in that study the amount and length of language exposure determined ultimate attainment.
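The three-way grouping behind Figure 7-2 (total, partial and lacking differentiation) can be restated as a simple decision rule over the variables of Table 7-1. The following is a minimal illustrative sketch only, not the procedure used in this study: the names `VariableResult`, `classify` and `band_widths` are hypothetical, and the "unclassified" outcome is an assumption added here for combinations that the definitions do not explicitly cover (for example, languages differing in the expected direction while both differ from the controls).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableResult:
    """Outcome for one sound-structure variable in one age sample (hypothetical layout)."""
    name: str
    differ_as_expected: bool    # the two languages differ in the expected direction
    sse_within_controls: bool   # SSE production within the range of the SSE monolingual peers
    msr_within_controls: bool   # Russian production within the range of the MSR adults


def classify(result: VariableResult) -> str:
    """Apply the three definitions given for Figure 7-2."""
    n_off = [result.sse_within_controls, result.msr_within_controls].count(False)
    if result.differ_as_expected and n_off == 0:
        return "total"
    if result.differ_as_expected and n_off == 1:
        return "partial"
    if not result.differ_as_expected and n_off >= 1:
        return "lack"
    # Assumption: combinations not covered by the three definitions are flagged, not guessed.
    return "unclassified"


def band_widths(results):
    """Count variables per differentiation type (the width of each band in the figure)."""
    return Counter(classify(r) for r in results)


# Invented example values, for illustration only (not data from Table 7-1).
example = [
    VariableResult("variable A", True, True, True),
    VariableResult("variable B", True, False, True),
    VariableResult("variable C", False, False, False),
]
widths = band_widths(example)
```

With the invented values above, each of the three bands has a width of one variable; with the actual outcomes of Table 7-1 the same counts would give the band widths per subject and age sample.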
To conclude, in a longitudinal perspective the language differentiation patterns primarily showed their dependence on the amount of language exposure, and a structural effect of vowel duration on the lack of differentiation. Language interaction seems to be part of normal bilingual phonological development, and it may eventually cease with growing linguistic experience. Maturationally, bilingual acquisition of sound structure cannot be accounted for in terms of bilingual bootstrapping based on segment-sized phonology. Some ‘unnecessarily complex’ sound structures in adult terms can potentially be explained by lexicalisation, frequency and/or phonotactic effects. The data support hypotheses claiming that children acquire phonology in units of larger size than segments.

7.1.2.5 Other environmental effects

This study was set up to account for cross-varietal influence on the English acquired by the bilingual subjects. The majority of the Edinburgh population speaks broad Scottish Standard English varieties. However, there is a substantial proportion of non-SSE English speakers in Edinburgh, especially in middle-class families. We addressed the population statistics in more detail in Section 3.2.1. The design of this study included four SSBE adult speakers and a monolingual child (C4) from a mixed SSE/SSBE parental background. Their data allowed us to determine the non-SSE British English patterns for the variables in this study. Despite the presence of input from non-SSE English varieties in the girls’ nursery and the fact that both of them were regularly exposed to RP-based mass media, our data show that the subjects acquired the Scottish English sound structures rather than those of the non-SSE English varieties. Considering the fact that no English input was provided in the family, the proportion of the English varieties in bilingual children’s input (with SSE being the majority variety) seems to determine the English variety acquired.
Additionally, our data for the monolingual subject C4 showed that a child exposed to two varieties of English (SSE and SSBE) in the parental input acquired an SSE SVLR-like pattern for the vowel /i/, and an SSBE pattern for the vowel //. This result replicated results observed for two older children with non-SSE English parents growing up in Edinburgh in Hewlett et al. (1999). Hewlett et al. (1999) suggested that the additional vowel quality difference (SSBE /u/ and // as opposed to the lack of such contrast for the SSE //) can mediate the acquisition of a variety-specific pattern of postvocalic conditioning (i.e. the SSBE-like voicing effect for /u/ in that case). While we do consider this a possible explanation, we additionally suggest that this might also be a result of differing input conditions for the two vowels. For example, it is possible that an SSBE-speaking parent gives more explicit attention to the child’s acquisition of the SSBE segmental /u / difference, because it is perceptually more salient and phonologically more relevant for an SSBE parent than the mere durational one involving /i/. This may encourage ‘explicit learning’ (Vihman, 2002) (on top of incidental learning) of the segmental contrast between /u/ and // and subsequent acquisition of the SSBE voicing effect rather than of SSE SVLR.

7.1.2.6 Methodological issues

In this study we used two methods: observation (auditory labelling) of vowel quality, and quantitative instrumental measurement of vowel duration and vocal effort. Two conclusions arise from the use of these methods. First of all, we observed language interaction effects for all three sound structure variables despite the methodological differences. The language interaction effects measured for the subject BS were coherently in the same direction, and were in agreement with the L2- and bilingualism literature.
This allows us to state that the observed patterns of language differentiation / interaction (including those of subject AN) are not a methodological artefact. Secondly, in the instrumental measurements there were three language interaction patterns that could not be easily detected by observation. The first one involved the reduced extent of SVLR-conditioning in /i / produced by both subjects compared to the monolingual peers (all in the earliest age samples). The second one involved the bi-directional influence between AN’s SSE and MSR systems in the production of the postvocalic conditioning of duration of // (at the same youngest age). The third one involved the ‘reduced’ laryngeal difference in vocal effort between the SSE tense and lax vowels produced by BS compared to the monolingual peers. This means that instrumental methods of speech analysis are better suited to capturing fine-grained phonetic details that would not necessarily be detected by observation. Measuring vocal effort patterns for different vowel sets produced quite coherent results across the different subject groups. The analysis of the intrinsic laryngeal contrast between SSE tense/lax vowels in children and adults also replicated results for German in Jessen (2002). This shows that the methodology applied was largely sound. Some problematic issues arose in explaining the non-differentiated vocal effort patterns for the close rounded vowels in the bilingual data at the age of 4;5. In this age sample, both subjects produced patterns of vocal effort for the SSE close rounded vowels that differed from their own production at earlier ages, and from the SSE monolingual results of either children or adults. The pattern was not explainable in longitudinal terms. It could not be explained by the larger variability in their vowel quality for //, since both subjects showed language differentiation in vowel quality.
It is important to note here that upon finding this problematic result we thoroughly cross-checked all the data, looking for potential data analysis errors at different levels, but this did not yield an explanation. One plausible explanation would be child speech variability. However, we still have doubts as to why this should happen for both subjects, in the same age sample out of three, and not occur in the monolingual results. We had two age samples for 3 SSE monolingual subjects (Figure 6-12): subjects C3 and C4 produced similar patterns in both age samples, while subject C7 produced a longitudinally coherent change. Thus, for the bilingual subjects we decided to discard this vocal effort variable for // at age 4;5 from the further discussion as unreliable (hence the question marks in Table 7-1).

7.1.3 Implications of the bilingual findings

7.1.3.1 Language differentiation/interaction patterns and their mental representation

In the review of bilingual issues (Chapter 1) we considered some pros and cons for ‘autonomous or interdependent’ bilingual language acquisition (Paradis & Genesee, 1996), and potential manifestations of the interdependence. According to the ‘autonomous development hypothesis’ bilingual children acquire grammatical systems which are not functionally different to monolingual development, while the ‘interdependent development hypothesis’ claims that a bilingual’s language systems develop differentially, “causing a bilingual child to look different from monolingual children” (Paradis & Genesee, 1996, p.2). Upon the finding of language interaction effects at the prosodic level of speech, J. Paradis (2000; 2001) stepped aside from this categorical view on autonomy/interdependence, and moved the discussion towards ‘degrees of separation’ and interaction in its relation to different subcomponents of language grammar, its structure and language dominance.
In the light of our results here, it seems that the ‘interdependent development hypothesis’ could indeed be interpreted in such a gradual fashion: i.e. rather than postulating language interaction as an obligatory property of bilingual language development, we can say that a bilingual’s language systems develop differentially from each other and may (though need not) interact. We have seen that at the level of sound structure language interaction took place systematically in both subjects regardless of their language exposure patterns. Both subjects differentiated between their languages, and showed patterns of language interaction to variable degrees which primarily depended on their language exposure, but also depended on structural characteristics of the contrasts involved. Therefore, there is a need for the synthesis of these factors. There was no evidence in this study that language interaction effects work exclusively unidirectionally. On the contrary, it seems that the mental footprint of language balance at the level of sound structure can produce quite ‘fuzzy’ directions of language interaction based on the language exposure of the bilingual child and on the structures involved. The effects of language interaction observed in this study are compatible with the unidirectional and bi-directional effects observed in L2-acquisition. From this point of view, there seems to be no need for a separate model of language interaction effects in simultaneous bilingual acquisition of sound structure. In our view, the gradual interpretation of the ‘interdependent development hypothesis’ is compatible with the postulations of the ‘neurolinguistic theory of bilingualism’ (Paradis, 2004; Paradis, 1993; Paradis, 1981). M. Paradis did not specifically develop the theory for language acquisition, but rather for some typical ‘end product’ state of adult bilinguals, but we do see its implications for the developmental perspective.
Besides, the ‘subsystems hypothesis’ does not predict language interaction in bilinguals, though its combination with the ‘activation threshold’ makes language interaction possible. For example, according to the two above hypotheses, representations (such as the lexicon) and automatic routines involved in the production of (supra-)segmental structures can be stored within the appropriate modules of the two language subsystems of a single neurofunctional language system. The subsystems use different neural paths, but are stored intertwined amongst each other. The environmental situation of bilingual children is usually quite volatile in terms of the amount and quality of language exposure, and should affect the development of speech and language skills (their perception and production). In the process of language acquisition, variable language input conditions may differentially affect ‘the activation threshold’ level for the components in each of the two language subsystems, enabling the selection of elements of LB for the LA and/or vice versa if ‘the activation threshold’ is sufficiently lowered. Since the ‘activation threshold’ is “operative in all higher cognitive representations”, and is “not associated with any particular anatomical area” (Paradis, 2004, p.28), the same mechanism should be available in all sub-modules of the two linguistic systems, including modules encoding phonology and speech motor control. Since we have dealt with continuous prosodic variables (vowel duration and vocal effort) in bilingual children’s speech production, and the observed language interaction patterns were systematic (and coherent) across the two subjects, different variables and methods, we can assume that the bilingual children produced variable patterns of language interaction unconsciously and automatically. This means that generally we were not dealing with pragmatic ‘code-switching’ issues (Muysken, 2000; Grosjean, 2001).
The systematicity of language interaction also means that we were not dealing with occasional ‘unrepaired slips of the tongue’ (de Houwer, 1995). In fact, such systematicity has been claimed to be a sign of ‘static interference’ (Tomioka, 2002; Paradis, 2004): i.e. being part of the representation in the non-target language. ‘Static’ in this sense presupposes that no other representation is available, or that there is no difference between two representations of the two language structures in the subsequent language submodules. However, we argue that potentially, if two language-specific speech production options are available, but one of them has a greater ‘activation threshold’ in the non-target language due to increased environmental exposure to this pattern, perhaps we are not dealing with a ‘static interference’, but with the ‘dynamic’ one (Grosjean, 2001; Paradis, 2004). In Section 2.3.2 we discussed Kehoe’s (2002) study revealing bilingual German-Spanish patterns of language interaction in vowel duration in the speech of early simultaneous bilinguals acquiring two systems, one featuring a length contrast (German) and one lacking it (Spanish). There, we raised the problem that, given the structure of the crosslinguistic difference, it is difficult to decide whether a given pattern should be attributed to ‘transfer’ or to ‘delay’ (Paradis & Genesee, 1996). We argued that attributing the ‘reduced’ vowel length contrast in bilinguals’ German speech production to ‘delay’ should be accompanied by evidence that the difference in the extent of short and long vowels compared to German monolingual children eventually ceased (that was not the case in Kehoe’s (2002) study). In fact, our data showed that a similar ‘reduced’ SVLR conditioning for both /i/ and // compared to the monolingual peers did longitudinally cease in the case of AN, and reduce in the case of BS. This may suggest the possibility of a ‘delay’ in the narrow sense suggested by Kehoe for this particular feature.
However, there were other problems with the term ‘delay’. As we pointed out, Paradis & Genesee (1996) proposed some general systemic delay, rather than a delay for a feature. So there is a discrepancy in the specification of the term between the two studies. Besides, the term ‘delay’ potentially means ‘disordered’ in the speech and language therapy context, and should be avoided for this reason. Therefore, we keep to our previous position that this apparently ‘delayed’ pattern could rather be viewed as a systematic and normal bilingual language interaction for this particular feature resulting from the mutual structural influence of the two languages in contact in a bilingual’s mental representation.

7.1.3.2 Implications of the findings for the theory and models of language acquisition

Our data on language interaction showed that the patterns observed were similar to those reported in second language acquisition studies (Weinreich, 1953; Mack, 1982; de Silva, 1999; Markus & Bond, 1999; Piske et al., 2002; Mennen, 2004). The pattern of ‘over-differentiation’ (Weinreich, 1953) of tense/lax vowels in AN’s Russian can plausibly be explained by language input factors. As far as this data is concerned, there is no need for a separate model of language interaction in sound structure for early simultaneous bilinguals as opposed to L2-learners. However, in applying concepts like ‘markedness’ to bilingual and general phonological acquisition in child speech, it is questioned here whether the concept can be applied at all to stand-alone segments (Jakobson, 1941; Wode, 1992), since we found clear lexicalisation and frequency effects, as well as phonotactic explanations for some apparent ‘markedness’-related effects on the vowels. One of the important findings in this study is the fact that the amount of language differentiation (and subsequently language interaction) systematically differs with changing language exposure conditions.
This is confirmed either by looking at the speech production of two subjects with very different (yet systematic) exposure to two languages, or by the longitudinal perspective within each of the subjects. This means that hypotheses such as the Cross-Language Cue Competition Hypothesis (Döpke, 1998; Döpke, 2000) and the Markedness Hypothesis (Müller, 1998), which postulate structural characteristics of languages as a single primary source of differentiation/interaction, cannot disregard environmental factors as a cause of this changing differentiation; on the contrary, they should consider them at least as two primary confounding sources. In Section 1.3.2.3.2 we discussed the fact that the formulation of Döpke’s CCCH was based on the Competition Model of monolingual and second language acquisition (Bates & MacWhinney, 1989; MacWhinney, 1997). Unfortunately, the Competition Model provided no explanation of mechanisms of language interaction in simultaneous bilingual language acquisition. The model emphasised that language acquisition is driven by input, both in environmental and language-structural terms. Acquisition is driven by ‘cue strength’: the stronger the cue, the earlier it is acquired. ‘Cue strength’ is determined by four dimensions: ‘task frequency’, ‘cue availability’, ‘cue reliability’ and ‘conflict reliability’ (MacWhinney, 1997, p.122). One important dimension of the model, which is intended to account for the kind of environmental effects found in our study, is overlooked in Döpke’s hypothesis (1998; 2000). The factor ‘task frequency’ comprises language internal frequencies of properties, but also environmental frequency (no input means there is nothing to acquire).
MacWhinney notes that in the context of SLA and simultaneous bilingual language acquisition, the factor ‘task frequency’ may be of greater importance, because if one of the languages is infrequently used, “task frequency could become a factor determining a general slowdown of acquisition” (MacWhinney, 1997, p.122). We suggest, thus, that a dimension similar to ‘task frequency’ should play a more important role in accounts of language interaction effects in order to explain the environmental effects on language interaction such as in this study, and similar environmentally based morphosyntactic language interaction effects observed by Petersen (1988) and Lanza (1992).

7.1.4 Implications of vocal effort findings

There is limited evidence that ‘stress-accent’ languages (Beckman, 1986) with structural contrasts involving vowel length may have a differential implementation of the acoustic cues to the accentual systems other than duration. For example, both Fónagy (1966) for Hungarian and Berinstein (1979) for K’ekchi reported that vowel peak intensity played a secondary role in a paradigmatic contrast in distinguishing short and long vowels in words with the same structure and utterance position: i.e. in both languages short vowels had 1-2 dB higher overall intensities than the long counterparts. Importantly, if proven systematic, such evidence could empirically fortify the dynamic view of word-prosodic systems taken in the Stress-Accent Hypothesis (Beckman, 1986), which claims that phonological categories of accentual systems are not necessarily phonetically uniform within a language. Unfortunately the differences in overall intensity in both studies (Fónagy, 1966; Berinstein, 1979) were negligibly small, and could be due to chance. Alternatively, these studies (Fónagy, 1966; Berinstein, 1979) could just have looked at less relevant acoustic cues (overall intensity).
More recent empirical studies on the acoustic correlates of stress and prominence (Sluijter & van Heuven, 1996a; Sluijter & van Heuven, 1996b; Sluijter et al., 1997; Traunmüller & Eriksson, 2000; Heldner, 2003) have emphasised the importance of the laryngeal level of vocal effort (in addition to the pulmonic one) in conveying linguistic information about stress and prominence in speech production. These studies have shown that overall intensity is an unreliable cue to stress and prominence, while intensity in spectral mid-frequencies (‘spectral balance’, ‘spectral emphasis’ or ‘spectral tilt’: i.e. different methods in different studies) seems to reliably reflect the laryngeal contribution to vocal effort, stress and prominence. Our own data for the differentiated vocal effort patterns accompanying durational SVLR suggest that indeed the two studies (Fónagy, 1966; Berinstein, 1979) might not have looked at the most relevant acoustic cues. We have several reasons to think that this differentiated vocal effort pattern accompanying the Scottish Vowel Length Rule vowels under strong sentence accent (Section 6.3.1.1) is due to the interaction between the SSE word-prosodic system and SVLR, rather than to anticipatory effects of the following consonants. Accentual lengthening cued by duration is a known macroprosodic effect, and it usually affects the various parts of the whole prosodic word, depending on the language (Cambier-Langeveld & Turk, 1999; Turk & Sawusch, 1999), but its domain usually includes the stressed syllable nucleus (short or long). The presence of short/long phonological conditioning of vowel length in a language should impose certain restrictions on how much duration can be used for other functions than phonological length (such as SVLR). Specifically, the short vowel cannot be infinitely lengthened without trespassing the acoustic boundaries of phonological length, as this would be pragmatically odd (if it is not lexically contrastive).
This is a known effect from L2 acquisition studies, where L2 learners fail to achieve language-specific vowel length for long or short vowels (Mack, 1982; de Silva, 1999; Markus & Bond, 1999). Increasing vocal effort for the short vowels, as in the Scottish SVLR vowels, might be a strategy to compensate for the load on duration from the accentual system of prominence: it thus may be viewed as an additional word-prosodic means (next to duration) to achieve sufficient prominence for the short vowel. We have shown that this SSE pattern of vocal effort is not accidental, since it is systematic for both vowels /i/ and //; monolingual children at age 3;4 have acquired adult-like performance, and so have bilingual children (at least for the unrounded vowels in this study).

However, our claim about the SSE SVLR vocal effort pattern and its relation to prominence is subject to a confound: the varying right consonantal context (voiced fricatives as opposed to other contexts). Yet we have reason to believe that this pattern is not due to anticipatory effects of the following consonant, such as those described in the ‘timing’ model of glottal control (Gobl & Ní Chasaide, 1988). In the model, phonation of the vowel (in voice source parameters) varies as a function of the voicing and manner of articulation of the following consonant. In English, a breathy phonation type is only anticipated before voiceless fricatives, and sometimes voiceless stops (Gobl & Ní Chasaide, 1988; Gobl & Ní Chasaide, 1999b; Ní Chasaide & Gobl, 1999), and not before voiced fricatives, as was the case in this study. This holds for both short and long SSE vowels /i/ and //. Additionally, the context before voiceless stops in our data is instead compatible with a less breathy (more tense) mode of vowel phonation.
This apparent contradiction can be explained by the fact that the changes in mode of phonation in this particular short/long context vary as a function of prominence rather than of the following consonant.

Another argument in favour of a word-prosodic system/SVLR interaction comes from our pilot analysis of the empirical data gathered in the ongoing SVLR project (Scobbie et al., 1999a; Scobbie et al., 1999b; Scobbie, 2002). We analysed vocal effort in three adult SSE speakers producing morphophonemically contrastive pairs, such as “rude” and “rued”, which differ only in vowel length. In this case, the confounding effect of the following consonant was absent. The preliminary results showed an effect on vocal effort similar in direction and extent to that observed in the SVLR data in this study.

We are not aware of similar studies of paradigmatic contrasts in vowel length and vocal effort in relation to stress accent. Such studies could be carried out for languages (such as Aleut, K’ekchi or Finnish) featuring vowel length with no confounding consonantal effects or vowel quality differences. Additionally, a syntagmatic comparison of stressed and unstressed short/long vowels may provide more evidence for our hypothesis.

Given this limitation, in the context of bilingual acquisition suffice it to state that it was not a trivial task to assess this novel monolingual finding in bilingual children, since neither aspect has been addressed before. Whichever argument proves correct for the laryngeal distinction accompanying the SVLR vowels (an effect of prominence or anticipatory effects), the high systematicity of the data both in the SSE monolingual and in the crosslinguistic context persuaded us to include the vocal effort variables in this study. Indeed, we showed that the SSE monolingual children had acquired the pattern in the age samples concerned, as had the bilingual children.
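To make concrete the kind of spectral measure referred to in this section, the following minimal Python sketch contrasts the level of a low-frequency band with the level of the remaining mid and high frequencies in a short frame. It is an illustration only and does not reproduce the analysis procedure used in this study: the 500 Hz band split, the naive DFT, the synthetic test signal and the names band_level_db, spectral_balance_db and two_tone are all assumptions introduced here.

```python
import cmath
import math


def band_level_db(samples, rate, lo_hz, hi_hz):
    """Level in dB (arbitrary reference) of DFT power with lo_hz <= f < hi_hz."""
    n = len(samples)
    power = 0.0
    # Naive DFT over the non-negative frequency bins; adequate for short frames.
    for k in range(n // 2 + 1):
        freq = k * rate / n
        if lo_hz <= freq < hi_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            power += abs(coeff) ** 2
    return 10.0 * math.log10(power) if power > 0 else float("-inf")


def spectral_balance_db(samples, rate, split_hz=500.0):
    """Mid/high-band level minus low-band level (hypothetical 500 Hz split)."""
    low = band_level_db(samples, rate, 0.0, split_hz)
    high = band_level_db(samples, rate, split_hz, rate / 2)
    return high - low


def two_tone(rate=8000, n=400, high_amp=0.1):
    """Synthetic frame: a 200 Hz tone plus a 1500 Hz tone of amplitude high_amp."""
    return [math.sin(2 * math.pi * 200 * t / rate)
            + high_amp * math.sin(2 * math.pi * 1500 * t / rate)
            for t in range(n)]


weak = spectral_balance_db(two_tone(high_amp=0.1), 8000)
strong = spectral_balance_db(two_tone(high_amp=0.5), 8000)
print(round(weak, 2), round(strong, 2))
```

In this sketch, a frame with relatively more energy above the split frequency yields a higher balance value, which is the direction one would expect for greater laryngeal vocal effort; overall intensity plays no role in the comparison because both bands are taken from the same frame.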
The monolingual vocal effort results involved another novel finding for the phonological acquisition of the tense/lax contrast. It has been shown in the literature (so far to a quite limited extent) for American English and for German (Stevens, 1998; Jessen, 2002) that the tense/lax contrast not only involves phonetic differences in vowel quality and duration (as it is traditionally treated), but also requires an adjustment in laryngeal configuration. Similarly, in our monolingual and bilingual SSE child and adult data, the lax vowels were realised with a less breathy glottal source configuration, and the tense vowels with a more breathy one (or more ‘lax’ in more conventional glottal source terms). This study has shown that the segmental tense/lax difference involves at least a triple phonetic distinction (arguably depending on the language), and crosslinguistic studies assessing the contribution of the phonetic properties involved in the tense/lax contrast and their acquisition should not overlook the vocal effort difference.

7.2 Suggestions for further research

Regarding the bilingual ‘interdependence/autonomy debate’, the phonological and more specifically prosodic level of speech seems to be systematically prone to language interaction effects which are variable in extent, both in proficient L2-learners (Caramazza et al., 1973; Williams, 1980; Flege, 1987; Mennen, 2004) and young simultaneous bilinguals (Kehoe et al., 2001; Paradis, 2001; Lleó, 2002; Kehoe, 2002; Kehoe, 2004 and this study). Therefore, it remains to be proven whether fully ‘autonomous’ development of sound structures is possible at all.

One of the findings in this study, to which we have no clear answer, is the apparent discrepancy in the bilingual acquisition of vowel duration as opposed to both vowel quality and vocal effort patterns.
As far as we are aware, there are no claims in the general phonological development literature that vowel duration is more difficult to acquire than other suprasegmental aspects. It seems that the type of crosslinguistic differences in vowel duration (such as in Russian and Scottish English) may be difficult to acquire in the context of simultaneous bilingual acquisition. However, so far only a few studies have dealt with these issues (Kehoe, 2002; Whitworth, 2003).

Further, since this study supports the importance of language exposure patterns and sound structure differences, it seems reasonable to look further at both of these aspects, to gain more insight into how the input/structure interface may operate.

We have shown that some of the language interaction patterns in bilingual child speech can be bi-directional. There seems to be common ground between the bi-directional patterns in the speech of proficient L2 learners (Mennen, 2004) and those of the young bilingual children in this study. However, this option is not yet seriously considered in accounts of the sources of language interaction or L2 ‘transfer’ (Bates & MacWhinney, 1989; Petersen, 1988; Müller, 1998; Döpke, 1998; Döpke, 2000; Flege, 2002).

Monolingual or bilingual studies looking into phonetic aspects of tense/lax contrasts and their phonological acquisition should not overlook the importance of another phonetic contributor, ‘laryngeal configuration’, in addition to vowel quality and duration.

7.3 General Conclusion

The results from this study offer new insights into the extent of language differentiation and interaction in bilingual phonological acquisition. In studies of simultaneous bilingual acquisition there is a seeming consensus that children acquire their languages as separate entities. However, we showed that there is evidence that bilingual children’s languages may interact.
The systematicity of language interaction in our data shows that language interaction in early simultaneous acquisition cannot be discarded as slips of the tongue, and that the development is not fully autonomous (even in a bilingual child who is more ‘balanced’ with regard to the language input). So far these two types of evidence, of ‘autonomous’ and of ‘interdependent’ development, have largely been treated as mutually exclusive. This study shows that this cannot be assumed.

We showed that at the level of sound structure, the development of a bilingual’s languages does not appear to be fully autonomous: language differentiation can be partial, or even be missing at certain developmental stages. The extent of differentiation varies depending on the sound structure involved, but importantly it also depends on the amount of language exposure in both languages. Longitudinal patterns and comparison to monolingual peers suggest that the bilingual differentiation of sound structures mainly increases as a function of age (and possibly as a function of accumulated exposure), and as a function of maturation processes similar to those in monolingual children.

Evidence of language interaction at the level of sound structure production considered in this study provides some support for a unified model of acquisition. The processes of language interaction observed in our data are largely in line with the types of language interaction observed in L2 learners. The directionality of interaction did not necessarily depend on the relative markedness of the crosslinguistic structures, and in some cases was bidirectional for the same properties. We showed that some structurally complex processes, which are potentially explainable by such concepts as ‘markedness’ (with regard to isolated segments), can, upon closer investigation, rather be explained by lexical, distributional and phonotactic conditioning.

References

Agutter, A. (1988). The not-so-Scottish Vowel Length Rule.
In Edinburgh Studies in the English Language, eds. Anderson, J. M. & MacLeod, N., John Donald Publishers, Edinburgh.
Aitken, A. J. (1981). The Scottish Vowel Length Rule. In So Many People, Longages, and Tongues, Edinburgh: Middle English Dialect Project, ed. Benskin, M. L., pp. 131-157.
Avanesov, R. I. (1972). Russkoe literaturnoe proiznoshenie, Moscow.
Bates, E. & MacWhinney, B. (1989). Functionalism and the Competition Model. In The cross-linguistic study of sentence processing, eds. Bates, E. & MacWhinney, B., Cambridge University Press, Cambridge.
Bauer, L. (1985). Tracing phonetic change in the received pronunciation of British English. Journal of Phonetics 13, pp. 61-81.
Beckman, M. E. (1986). Stress and Non-Stress Accent, Foris Publications, Dordrecht.
Berinstein, A. E. (1979). A cross-linguistic study on the contribution of duration to the perception of stress, UCLA Working Papers in Phonetics ed. UCLA, Los Angeles.
Bilton, T., Bonnett, K., Jones, P., Lawson, T., Skinner, D., Stanworth, M., & Webster, A. (2002). Introductory Sociology, 4th ed.
Birdsong, D. (2004). Second Language Acquisition and Ultimate Attainment. In Handbook of Applied Linguistics, eds. Davies, A. & Elder, C., pp. 82-105. Blackwell, London.
Bloomfield, L. (1933). Language, George Allen & Unwin Ltd, London.
Boersma, P. & Weenink, D. (2004). PRAAT, a system for doing phonetics by computer. www.praat.org version 4.3.04.
Bondarko, L. V. (1981). Foneticheskoe opisanie yazyka, fonologicheskoe opisanie rechi, pp. 1-192. Izdatel'stvo Leningradskogo universiteta, Leningrad.
Bondarko, L. V. (1998). Fonetika sovremennogo russkogo yazyka, pp. 1-276. Izdatel'stvo Sankt Peterburgskogo universiteta, St.-Petersburg.
Brown, R. (1973). A First Language: The early stages, Harvard University Press, Cambridge, MA.
Buder, E. H. & Stoel-Gammon, C. (2002). American and Swedish children's acquisition of vowel duration: effects of vowel identity and final stop voicing.
Journal of Acoustical Society of America 111, pp. 1854-1864.
Burton, M. B. & Robblee, K. E. (1997). A phonetic analysis of voicing assimilation in Russian. Journal of Phonetics 25, pp. 97-114.
Cambier-Langeveld, T. & Turk, A. (1999). A cross-linguistic study of accentual lengthening: Dutch vs. English. Journal of Phonetics 27, pp. 171-206.
Campbell, W. N. (1995). Loudness, spectral tilt, and perceived prominence in dialogues. In Proceedings of the XIIIth International Congress of Phonetic Science, eds. Elenius, K. & Branderud, P., pp. 676-679. KTH and Stockholm University, Stockholm.
Caramazza, A., Yeni-Komshian, G., Zurif, E., & Carbone, E. (1973). The acquisition of a new phonological contrast: the case of stop consonants in French-English bilinguals. Journal of Acoustical Society of America 54, pp. 421-428.
Chambers, J. (2002). Dynamics of Dialect Convergence. In Investigating Change and Variation through Dialect Contact, ed. Milroy, L., pp. 117-130.
Chen, M. (1970). Vowel length variation as a function of the voicing of the consonant environment. Phonetica 22, pp. 129-159.
Chevrot, J.-P., Beaud, L., & Varga, R. (2000). Developmental data on a French sociolinguistic variable: Post-consonantal word-final /R/. Language Variation and Change 12, pp. 295-319.
Chomsky, N. (1986). Knowledge of Language: its Nature, Origin and Use, Praeger, New York.
Clyne, M. (1967). Transference and Triggering, Martinus Nijhoff, The Hague.
Corbett, J., McClure, J. D., & Stuart-Smith, J. (2003). A Brief History of Scots. In The Edinburgh Companion to Scots, eds. Corbett, J., McClure, J. D., & Stuart-Smith, J., pp. 116. Edinburgh University Press, Edinburgh.
Crutchley, A., Conti-Ramsden, G., & Botting, N. (1997). Bilingual children with specific language impairment and standardised assessments: preliminary findings from a study of children in language units. International Journal of Bilingualism 6, pp. 117-134.
Crystal, D. (1997).
The Cambridge Encyclopedia of Language, Cambridge University Press, Cambridge.
Dale, P. S. & Fenson, L. (1996). Lexical development norms for young children. Behavior Research Methods, Instruments, & Computers 28, pp. 125-127.
de Houwer, A. (1995). Bilingual Language Acquisition. In The Handbook of Child Language, eds. Fletcher, P. & MacWhinney, B., pp. 219-250. Blackwell, Oxford.
de Houwer, A. (1998). By way of introduction: Methods in studies of bilingual first language acquisition. International Journal of Bilingualism 2, pp. 249-264.
de Houwer, A. (1990). The Acquisition of Two Languages from Birth: a Case Study, Cambridge University Press, Cambridge.
de Silva, V. (1999). Interference of a Quantity Language in Rhythmic Structure of a Stress Language. In Proceedings of the 14th International Congress of Phonetic Sciences pp. 559-562. San Francisco.
Deterding, D. (1997). The formants of monophthong vowels in Standard Southern British English pronunciation. Journal of the International Phonetic Association 27, pp. 47-55.
Deuchar, M. & Quay, S. (2000). Bilingual Acquisition: Theoretical Implications of a case study, Oxford University Press, Oxford.
Docherty, G. J. (1992). The timing of voicing in British English Obstruents, Netherlands Phonetics Archives, 9, Foris, Berlin.
Docherty, G. J. & Foulkes, P. (1999). Derby and Newcastle: instrumental phonetics and variationist studies. In Urban Voices: Accent Studies in the British Isles, eds. Foulkes, P. & Docherty, G., pp. 47-71. Arnold, London, UK.
Docherty, G. J., Foulkes, P., Tillotson, J., & Watt, D. (2005). On the scope of phonological learning: issues arising from socially structured variation. In Labphon 8.
Döpke, S. (1998). Competing language structures: the acquisition of verb placement by bilingual German-English children. Journal of Child Language 25, pp. 555-584.
Döpke, S. (2000). The Interplay Between Language-Specific Development and Crosslinguistic Influence.
In Cross-linguistic structures in simultaneous language acquisition, ed. Döpke, S., pp. 79-104. John Benjamins, Amsterdam.
Ellis, R. (1994). The Study of Second Language Acquisition, Oxford University Press, Oxford.
Escudero, P. (2000). The Perception of English Vowel Contrasts: Acoustic Cue Reliance in the Development of New Contrasts. New Sounds 2000, the Fourth International Symposium on the Acquisition of Second-Language Speech.
Fant, G. (1960). Acoustic Theory of Speech Production, 2nd ed., pp. 1-328. Mouton, The Hague - Paris.
Ferguson, C. A. & Farwell, C. (1975). Words and sounds in early language acquisition: English initial consonants in the first fifty words. Language 51, pp. 419-439.
Finnegan, E. M., Lushei, E. S., & Hoffman, H. T. (2000). Modulations of respiratory and laryngeal activity associated with changes in vocal intensity during speech. Journal of Speech, Language and Hearing Research 43, pp. 934-950.
Flege, J. E. (1987). The production of "new" and "similar" phones in a foreign language: evidence for the effect of equivalence classification. Journal of Phonetics 15, pp. 47-65.
Flege, J. E. (2002). Interactions between the Native and Second Language Phonetic Systems. In An Integrated View of Language Development: Papers in Honor of Henning Wode, eds. Burmeister, P., Piske, T., & Rohde, A., Wissenschaftlicher Verlag, Trier.
Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Factors affecting strength of perceived foreign accent in a second language. Journal of Acoustical Society of America 97, pp. 3125-3134.
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin 76, pp. 378-382.
Fletcher, H. F. & Munson, W. A. (1933). Loudness, its definition, measurement and calculation. Journal of Acoustical Society of America 5, pp. 82-108.
Fónagy, I. (1966). Electro-physiological and acoustic correlates of stress and stress perception. Journal of Speech and Hearing Research pp. 231-244.
Fry, D. B. (1955). Duration and intensity as physical correlates of linguistic stress. Journal of Acoustical Society of America 27, pp. 765-768.
Fry, D. B. (1976). Experiments in the perception of stress. In Acoustic Phonetics: a Course of Basic Readings, ed. Fry, D. B., Cambridge University Press, Cambridge.
Fudge, E. C. (1969). Syllables. In Phonological Theory: The Essential Readings, ed. Goldsmith, J. A., pp. 370-391. Blackwell, Malden, Massachusetts - Oxford.
Gauffin, J. & Sundberg, J. (1989). Spectral correlates of glottal voice source waveform characteristics. Journal of Speech and Hearing Research 32, pp. 556-565.
Gawlitzek-Maiwald, I. & Tracy, R. (1996). Bilingual Bootstrapping. Linguistics 34, pp. 901-926.
Genesee, F. (1989). Early bilingual development: one language or two? Journal of Child Language 16, pp. 161-179.
Genesee, F., Nicoladis, E., & Paradis, J. (1995). Language differentiation in early bilingual development. Journal of Child Language 22, pp. 611-631.
Gimson, A. C. (1962). An Introduction to the Pronunciation of English, Edward Arnold (Publishers) Ltd, London, UK.
Gobl, C. & Ní Chasaide, A. (1988). The effects of adjacent voiced/voiceless consonants on the vowel voice source: a cross-language study. STL-QPSR 2-3.
Gobl, C. & Ní Chasaide, A. (1999b). Voice source variation in the vowel as a function of consonantal context. In Coarticulation: Theory, Data, Techniques, eds. Hardcastle, W. J. & Hewlett, N., pp. 122-143. Cambridge University Press.
Gobl, C. & Ní Chasaide, A. (1999a). Perceptual correlates of source parameters in breathy voice. In Proceedings of the 14th International Congress of Phonetic Sciences pp. 2437-2440. San Francisco.
González-Bueno, M. (2002). Dental versus Alveolar Articulation of L2 Spanish Stops as Perceived by Native Speakers of Malayalam. In Proceedings of "Linguistics and Phonetics 2002" (LP2002), eds. Haraguchi Shoshuke, Palek Bohumil, & Fujimura Osamu, Charles University Press and Meikai University, Japan.
Gordeeva, O. B., Mennen, I., & Scobbie, J. M. (2003). Vowel Duration and Spectral Balance in Scottish English and Russian. In Proceedings of the 15th International Congress of Phonetic Sciences, eds. Solé, M. J., Recasens, D., & Romero, J., pp. 3193-3196. Barcelona.
Gordeeva, O. B. & Scobbie, J. M. (2004). Non-normative preaspiration of voiceless fricatives in Scottish English. Paper presented at the Colloquium of the British Association of Academic Phoneticians, University of Cambridge, Cambridge.
Grosjean, F. (1982). Life with Two Languages: An Introduction to Bilingualism, reprint 2002 ed., pp. 1-370. Harvard University Press, Cambridge, Massachusetts, London.
Grosjean, F. (2001). Bilingual's Language Modes. In One Mind, Two Languages: Bilingual Language Processing, ed. Nicol, J. L., pp. 1-23. Blackwell, Oxford.
Grunwell, P. (1982). Clinical Phonology, 2nd ed. Churchill Livingstone, London.
Guion, S. G. (2003). The Vowel Systems of Quichua-Spanish Bilinguals: Age of Acquisition Effects on the Mutual Influence of the First and Second Languages. Phonetica 60, pp. 98-128.
Hamers, J. F. & Blanc, M. H. A. (2000). Bilinguality and Bilingualism, Cambridge University Press, Cambridge.
Hanson, H. M. (1997). Glottal characteristics of female speakers: Acoustic correlates. Journal of Acoustical Society of America 101, pp. 466-481.
Hawkins, S. & Midgley, J. (2004). Formant frequencies of RP monophthongs in four age-groups of speakers. Journal of the International Phonetic Association.
Heldner, M. (2001). Spectral emphasis as a perceptual cue to prominence. Fonetik 42, pp. 51-57.
Heldner, M. (2003). On the reliability of overall intensity and spectral emphasis as acoustic correlates of focal accents in Swedish. Journal of Phonetics 31, pp. 39-62.
Hewlett, N., Matthews, B. M., & Scobbie, J. M. (1999). Vowel Duration in Scottish English Speaking Children. pp. 2157-2160. 14th International Congress of Phonetic Sciences, San Francisco.
Hillenbrand, J., Getty, L.
A., Clark, M. J., & Wheeler, K. (1995). Acoustic Characteristics of American English Vowels. Journal of Acoustical Society of America 97, pp. 3099-3111.
Hirose, H. (1999). Investigating the Physiology of Laryngeal Structures. In The Handbook of Phonetic Sciences, eds. Hardcastle, W. J. & Laver, J., Blackwell, Oxford/Massachusetts.
Hirst, D. & Di Cristo, A. (1998). Intonation Systems. In A Survey of Intonation Systems, eds. Hirst, D. & Di Cristo, A., pp. 1-44. Cambridge University Press, Cambridge, U.K.
House, A. S. (1961). On Vowel Duration in English. Journal of Acoustical Society of America 33, pp. 1174-1178.
Jakobson, R. (1941). Child Language, Aphasia and Phonological Universals, reprint in English 1968 ed. Mouton, The Hague, Paris.
Jessen, M. (2002). Spectral Balance in German and its relevance for syllable cut theory. In Silbenschnitt und Tonakzente, eds. Auer, P., Gilles, P., & Spiekermann, H., pp. 153-179. Max Niemeyer, Tübingen.
Johnson, C. E. & Lancaster, P. (1998). The Development of More Than One Phonology: A Case Study of a Norwegian-English Bilingual Child. International Journal of Bilingualism 2, pp. 265-300.
Jones, D. (1918). An Outline of English Phonetics, 9th (1972) ed. W. Heffer & Sons, Cambridge.
Kavitskaya, D. (2002). Perceptual Salience and Palatalization in Russian. Oral Paper at the Eighth Conference on Laboratory Phonology "Varieties of Phonological Competence".
Keating, P. A. (1984). Phonetic and Phonological Representation of stop consonant voicing. In Phonetic Linguistics, ed. Fromkin, V., Academic Press, New York.
Kehoe, M. M. (2002). Developing vowel systems as a window to bilingual phonology. International Journal of Bilingualism 6, pp. 315-334.
Kehoe, M. M. (2004). Voice Onset time in bilingual German-Spanish children. Bilingualism: Language and Cognition 7, pp. 71-88.
Kehoe, M. M. & Stoel-Gammon, C. (2001). Development of syllable structure in English-speaking children with particular reference to rhymes.
Journal of Child Language 28, pp. 393-432.
Kehoe, M. M., Trujillo, C., & Lleó, C. (2001). Phonological acquisition of bilingual children: An analysis of syllable structure and Voice Onset Time. In Proceedings of the Colloquium on Structure, Acquisition, and Change of Grammars: Phonological and Syntactic Aspects, eds. Cantone, K. & Hinzelin, M., pp. 38-54.
Kent, R. D. & Read, C. (2002). Acoustic Analysis of Speech, pp. 1-311. Thomson Learning, Albany.
Keshavarz, M. H. & Ingram, D. (2002). The early phonological development of a Farsi-English bilingual child. International Journal of Bilingualism 6, pp. 255-269.
Kessler, B. & Treiman, R. (1997). Syllable Structure and the Distribution of Phonemes in English Syllables. Journal of Memory and Language 37, pp. 295-311.
Khattab, G. (2000). VOT Production in English and Arabic Bilingual and Monolingual Children. Leeds Working Papers in Linguistics and Phonetics 8, pp. 95-122.
Khattab, G. (2002). Sociolinguistic Competence and the Bilingual's Adoption of Phonetic Variants: Auditory and Instrumental Data from English-Arabic Bilinguals, Unpublished Ph.D. Thesis. The University of Leeds, Leeds.
Khattab, G. (2004). Variation in vowel production by English-Arabic bilinguals. Paper at the 9th Conference of Laboratory Phonology, June 24-26.
Kuznetsov, V. I. (1997). Vokalizm russkoj rechi, Izdatel'stvo Sankt Peterburgskogo universiteta, St-Petersburg.
Ladd, D. R. (1996). Intonational Phonology, Cambridge University Press, Cambridge.
Ladefoged, P. (1971). Preliminaries in Linguistic Phonetics, University of Chicago Press, Chicago.
Ladefoged, P. (1993). A Course in Phonetics, pp. 1-300. Harcourt Brace College Publishers.
Ladefoged, P. & McKinney, N. P. (1963). Loudness, sound pressure, and subglottal pressure in speech. Journal of Acoustical Society of America pp. 454-460.
Lado, R. (1957). Linguistics Across Cultures: Applied Linguistics for Language Teachers, Ann Arbor, Michigan: University of Michigan.
Lanza, E. (1992).
Can bilingual two-year olds code-switch? Journal of Child Language 19, pp. 633-658.
Lanza, E. (2000). Concluding Remarks: Language Contact -- A Dilemma for the bilingual Child or for the Linguist? In Cross-linguistic structures in simultaneous language acquisition, ed. Döpke, S., pp. 227-246. John Benjamins, Amsterdam.
Laver, J. (1994). Principles of Phonetics, Cambridge University Press, Cambridge.
Lehiste, I. (1977). Suprasegmentals, pp. 1-194. The Massachusetts Institute of Technology, Massachusetts.
Lenneberg, E. H. (1967). Biological Foundations of Language, Wiley, New York.
Lindblom, B. (1998). Systemic constraints and adaptive change in the formation of sound structure. In Approaches to the Evolution of Language: Social and Cognitive Bases, eds. Hurford, J. R., Studdert-Kennedy, M., & Knight, C., pp. 242-264. Cambridge University Press, Cambridge.
Lisker, L. (1974). On "Explaining" Vowel Duration Variation. Glossa: An International Journal of Linguistics 8, pp. 233-245.
Lleó, C. (2002). The role of markedness in the acquisition of complex prosodic structures by German-Spanish bilinguals. International Journal of Bilingualism 6, pp. 291-313.
Lüdi, G. (1987). Les marques transcodiques: regards nouveaux sur le bilinguisme. In Devenir bilingue-parler bilingue. Actes du 2e colloque sur le bilinguisme, Université de Neuchâtel, 20-22 Septembre, 1984, ed. Lüdi, G., pp. 1-21. Max Niemeyer Verlag, Tübingen.
Lyon, J. (1996). Becoming Bilingual: Language acquisition in a bilingual community, Multilingual Matters, Clevedon, England; Philadelphia, PA.
Mack, M. (1982). Voicing-dependent vowel duration in English and French: monolingual and bilingual production. Journal of Acoustical Society of America 71, pp. 173-178.
Macken, M. A. (1986). Phonological development: a crosslinguistic perspective. In Language Acquisition, eds. Fletcher, P. & Garman, M., pp. 251-268. Cambridge University Press, Cambridge.
Mackenzie Beck, J. (1997). Organic Variation of the Vocal Apparatus.
In The Handbook of Phonetic Sciences, eds. Hardcastle, W. J. & Laver, J., Blackwell, Oxford/Massachusetts.
MacWhinney, B. (1997). Second Language Acquisition and the Competition Model. In Tutorials in Bilingualism: Psycholinguistic Perspectives, eds. De Groot, A. M. B. & Kroll, J. F., pp. 113-144. Lawrence Erlbaum Associates, Mahwah, New Jersey.
MacWhinney, B. (2004). A Unified Model of Language Acquisition. In Handbook of bilingualism: Psycholinguistic approaches, eds. Kroll, J. & De Groot, A., Oxford University Press, Oxford.
Markus, D. & Bond, D. (1999). Stress and Length in Learning Latvian. In 14th International Congress of Phonetic Sciences pp. 563-566. San Francisco.
Matthews, B. M. (2002). On Variability and the Acquisition of Vowels in Normally Developing Scottish Children (18-36 months), Unpublished Ph.D. thesis. Queen Margaret University College, Edinburgh.
McKenna, G. (1988). Vowel Duration in the Standard English of Scotland, unpublished MSc thesis, University of Edinburgh, Edinburgh.
McLaughlin, B. (1984). Second language acquisition in childhood, 2nd ed. Erlbaum, Hillsdale, NJ.
Meisel, J. (1989). Early differentiation of languages in bilingual children. In Bilingualism across the life span. Aspects of acquisition, maturity and loss, eds. Hyltenstam, K. & Obler, L., pp. 13-40. Cambridge University Press, Cambridge.
Meisel, J. (2003). The Bilingual Child. In The Handbook of Bilingualism, eds. Bhatia, T. K. & Ritchie, W. C., Blackwell Publishing, Oxford (UK) - Cambridge (USA).
Menn, L. & Stoel-Gammon, C. (1995). Phonological Development. In The Handbook of Child Language, eds. Fletcher, P. & MacWhinney, B., pp. 335-360. Blackwell, Oxford.
Mennen, I. (2004). Bi-directional interference in the intonation of Dutch speakers of Greek. Journal of Phonetics 32, pp. 543-563.
Mohanan, K. P. (1992). Emergence of Complexity in Phonological Development. In Phonological Development: Models, Research, Implications, eds. Ferguson, C.
A., Menn, L., & Stoel-Gammon, C., pp. 635-662. Timonium, Maryland.
Müller, N. (1998). Transfer in bilingual first language acquisition. Bilingualism: Language and Cognition 1, pp. 151-171.
Muysken, P. (2000). Bilingual speech: a typology of code-mixing, Cambridge University Press, Cambridge.
Netsell, R., Lotz, W. K., Peters, J. E., & Schulte, L. (1994). Developmental patterns of Laryngeal and Respiratory Function for Speech Production. Journal of Voice 8, pp. 123-131.
Ní Chasaide, A. & Gobl, C. (1999). Voice Source Variation. In The Handbook of Phonetic Sciences, eds. Hardcastle, W. J. & Laver, J., pp. 427-461. Blackwell, Oxford/Massachusetts.
Odlin, T. (1989). Language Transfer, Cambridge University Press, Cambridge.
Otomo, K. & Stoel-Gammon, C. (1992). The acquisition of unrounded vowels in English. Journal of Speech and Hearing Research 35, pp. 604-616.
Padgett, J. (2005). Russian voicing assimilation, final devoicing, and the problem of [v] (or, the mouse that squeaked). Natural Language and Linguistic Theory to appear.
Panasyuk, A. Y., Panasyuk, I. V., Gorlovsky, A. L., & Anfimova, O. V. (1995). Perception of Tense-Lax Vowels and Fortis-Lenis Consonants by Russian Learners of English. In Proceedings of XIIIth International Congress of Phonetic Sciences, eds. Elenius, K. & Branderud, P., pp. 566-569. KTH and Stockholm University, Stockholm.
Paradis, J. (2001). Do bilingual two-year olds have separate phonological systems? International Journal of Bilingualism 5, pp. 19-38.
Paradis, J. (2000). Beyond "One System or Two?": Degrees of Separation Between the Languages of French-English Bilingual Children. In Cross-linguistic structures in simultaneous language acquisition, ed. Döpke, S., pp. 175-200. John Benjamins, Amsterdam.
Paradis, J. & Genesee, F. (1996). Syntactic acquisition in bilingual children: Autonomous or interdependent? Studies in Second Language Acquisition 18, pp. 1-25.
Paradis, M. (2004).
A neurolinguistic Theory of Bilingualism, John Benjamins Publishing Company, Amsterdam/Philadelphia.
Paradis, M. (1993). Linguistic, psycholinguistic, and neurolinguistic aspects of "interference" in bilingual speakers: The Activation Threshold Hypothesis. International Journal of Psycholinguistics 9, pp. 133-145.
Paradis, M. (1981). Neurolinguistic Organisation of Bilingualism. LACUS Forum 7, pp. 486-494.
Paradis, M. (1998). Aphasia in Bilinguals: How Atypical is it? In Aphasia in Atypical Populations, eds. Coppens, P., Lebrun, Y., & Basso, A., pp. 35-66. Lawrence Erlbaum Associates, London.
Pater, J. (2003). The Perceptual Acquisition of Thai Phonology by English Speakers: Task and Stimulus Effects. Second Language Research 19, pp. 209-223.
Petersen, J. (1988). Word-internal code-switching constraints in a bilingual child's grammar. Linguistics 26, pp. 479-493.
Peterson, G. E. & Lehiste, I. (1960). Duration of Syllable Nuclei in English. Journal of Acoustical Society of America 32, pp. 693-703.
Petitto, L. A. (2001). Bilingual signed and spoken language acquisition from birth: implications for the mechanisms underlying early bilingual language acquisition. Journal of Child Language 28, pp. 453-496.
Piske, T., Flege, J. E., & MacKay, I. R. A. (2002). The Production of English Vowels by Fluent Early and Late Italian-English Bilinguals. Phonetica 59, pp. 49-71.
Potisuk, S., Gandour, J., & Harper, M. P. (1996). Acoustic Correlates of Stress in Thai. Phonetica 53, pp. 200-220.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1992). Numerical Recipes in C: the Art of Scientific Computing, 2nd ed. Cambridge University Press, Cambridge.
Redlinger, W. E. & Park, T.-Z. (1980). Language mixing in young bilinguals. Journal of Child Language 7, pp. 337-352.
Remijsen, B. (2002). Word-prosodic Systems of Raja Ampat Languages, Universiteit Leiden Centre of Linguistics, Leiden.
Rietveld, A. C. M. & van Heuven, V. J. (1997). Algemene Fonetiek, pp. 1-420.
Dick Coutinho, Bussum. Robinson, D. W. & Dadson, R. S. (1956). A redetermination of the equal-loudness relations for pure tones. British Journal of Applied Physics 7, pp. 166-181. Rockey, D. (1973). Phonetic lexicon of monosyllabic and some disyllabic words, with homophones, arranged according to their phonetic structure, Heyden & Son LTD, London, New York, Rheine. Schlyter, S. (1993). The weaker language in bilingual Swedish-French children. In Progression and Regression in Language, eds. K.Hyltenstam & A.Viberg, Cambridge University Press, Cambridge. Schnitzer, M. L. & Krasinski, E. (1994). The development of segmental phonological production in a bilingual child. Journal of Child Language 21, pp. 585-622. Schnitzer, M. L. & Krasinski, E. (1996). The development of segmental phonological production in a bilingual child: a contrasting second case. Journal of Child Language 23, pp. 547-571. Scobbie, J. M. (2005). Flexibility in the face of incompatible English VOT systems. In Papers in Laboratory Phonology 8: Varieties of Phonological Competence, eds. Goldstein, L. M., Best, C., & Whalen, D.. 287 Scobbie, J. M. (2002). Fuzzy contrasts, fuzzy inventories, fuzzy systems: Thoughts on quasi-phonemic contrasts, the phonetics/phonology interface and sociolinguistic variation. Second International Conference of Contrast in Phonology, University of Toronto (oral paper), Toronto. Scobbie, J. M., Hewlett, N., & Turk, A. (1999a). Standard English in Edinburgh and Glasgow: the Scottish Vowel Length Rule revealed. In Urban Voices: Accent Studies in the British Isles, eds. P.Foulkes & G.Docherty, pp. 230-245. Arnold, London. Scobbie, J. M., Turk, A., & Hewlett, N. (1999b). Morphemes, Phonetics and Lexical Items: The Case of the Scottish Vowel Length Rule. In Proceedings of the 14th International Congress of Phonetic Sciences pp. 1617-1620. San Francisco. Selkirk, E. (1982). The Syllable. In Phonological Theory: The Essential Readings, ed. Goldsmith, J. 
A., Blackwell, Malden,Massachusetts - Oxford. Shvachkin, N. K. (1948). The Development of Phonemic Speech Perception in Early Childhood. In Studies in Child Language Development, eds. Ferguson, C. A. & Slobin, D. I., pp. 91-127. Holt, Rinehart and Winston, Inc., New York. Sjölander, K. & Beskow, J. WaveSurfer - an Open Source Speech Tool. 2000. Bejing, China, International Conference of Speech and Language Processing 2000. Sluijter, A. M. C. & van Heuven, V. J. (1996b). Spectral balance as an acoustic correlate of linguistic stress. Journal of Acoustical Society of America 100, pp. 2471-76. Sluijter, A. M. C. & van Heuven, V. J. (1996a). Acoustic correlates of linguistic stress and accent in Dutch and American English. In ICSLP'96 Philadelphia. Sluijter, A. M. C., van Heuven, V. J., & Pacilly, J. J. A. (1997). Spectral Balance as a cue in the perception of linguistic stress. Journal of Acoustical Society of America 101, pp. 503-513. Smith, C. L. (1997). The devoicing of /z/ in American English: effects of local and prosodic context. Journal of Phonetics 25, pp. 471-500. Stevens, K. N. (1998). Acoustic Phonetics, The MIT Press, Cambridge, Massachusetts. Stoel-Gammon, C. & Buder, E. H. (1999). Vowel Length, Post-Vocalic Voicing and VOT in the Speech of Two-Year Olds. In Proceedings of the 14th International Congress of Phonetic Sciences pp. 2485-2488. San Francisco. Stoel-Gammon, C., Buder, E. H., & Kehoe, M. M. (1995). Acquisition of vowel duration: a comparison of Swedish and English. Proceesings of the 13th International Congress of Phonetic Sciences, Stockholm. 288 Stoel-Gammon, C. & Herrington, P. B. (1990). Vowel systems of normally developing and phonologically disordered children. Clinical Linguistics and Phonetics 4, pp. 145160. Stow, C. & Dodd, B. (2003). Providing an equitable service to bilingual children in the UK: a review. International Journal of Language and Communication Disorders 38, pp. 351-378. Strathopoulos, E. T. (1995). 
Variability revisited: an acoustic, aerodynamic, and respiratory kinematic comparison of children and adults during speech. Journal of Phonetics 23, pp. 67-80. Strathopoulos, E. T. & Sapienza, C. (1993). Respriratory and laryngeal measures of children during vocal intensity variation. Journal of Acoustical Society of America 94, pp. 2531-2543. Svetozarova, N. (1998). Intonation in Russian., eds. Hirst, D. & Di Christo, A., pp. 261274. Cambridge University Press, Cambridge. Taff, A., Rozelle, L., Cho, T., Ladefoged, P., Dirks, M., & Wegelin, J. (2004). Phonetic Structures of Aleut. Journal of Phonetics 29, pp. 231-271. Titze, I. R. (1994). Principles of Voice Production, Prentice-Hall; Englewood Cliffs, N.J., USA. Titze, I. R. & Sundberg, J. (1992). Vocal internsity in speakers and singers. Journal of Acoustical Society of America 91, pp. 2936-2946. Tomioka, N. (2002). A bilingual language production model. Paper presented at the International Symposium on the Multimodality of Human Communication, University of Toronto, 5 May. Traunmüller, H. & Eriksson, A. (1997). A method of measuring formant frequencies at high fundamental frequencies. Proceedings of EuroSpeech '97 1, pp. 477-480. Traunmüller, H. & Eriksson, A. (2000). Acoustic effects of variation in vocal effort by men, women, and children. Journal of Acoustical Society of America 107, pp. 3438-3451. Trubetskoy, N. S. (1939). Gründzuge der Phonologie, Moscow, 2000. Tsejtlin, S. V. (2002). Yazyk i rebenok: lingvistika detskoj rechi, pp. 1-239. Vlados, Moscow. Turk, A. & Sawusch, J. R. (1999). The domain of accentual lengthening in American English. Journal of Phonetics 25, pp. 25-41. 289 van Zanten, E., Damen, L., & van Houten, E. The ASSP Speech Database. SPIN/ASSPreport 41. 1991. Utrecht, Speech Technology Foundation. Vihman, M. M. (1996). Phonological Development: The Origins of Language in the Child, Blackwell Publishers, Cambridge, Massachusetts - Oxford. Vihman, M. M. (2002). 
Getting started without a system: from phonetics to phonology in bilingual development. International Journal of Bilingualism 6, pp. 239-254. Volterra, V. & Taeschner, T. (1978). The acquisition and development of language in bilingual children. Journal of Child Language 5, pp. 311-326. Walker, V. (1992). The Formant Frequencies of Scottish Vowels, Unpublished BSc dissertation. Queen Margeret University College, Edinburgh. Weinreich, U. (1953). Languages in Contact: Findings and Problems, 9th 1979 ed. Mouton, The Hague. Wells, J. Computer-coding the IPA: a proposed extension of SAMPA. 1995. Wells, J. A study of the formants of the pure vowels of British English. 1962. University of London, London. Wells, J. (1982). Accents of English, pp. 1-673. Cambridge University Press, Cambridge. Whitworth, N. (2003). Bilingual Acquisition of Speech Timing: Aspects of Rhythm Production by German-English Families, Unpublished Ph.D. thesis. The University of Leeds, Leeds. Williams, L. (1980). Phonetic variation as a function of second-language learning. In Child Phonology: perception, eds. Yeni-Komshian, G., Ferguson, C., & Kavanagh, J., pp. 185-216. Academic Press, New York. Wode, H. (1992). Categorical Perception and Segmental Coding. In Phonological Development: Models, Research, Implications, eds. Ferguson, C. A., Menn, L., & StoelGammon, C., pp. 605-631. Timonium, Maryland. Zharkova, N. N. (2002). Razvitie fonologicheskoj sistemy detskoy rechi (eksperimental'no-foneticheskoe issledovanie), unpublished M.Sc thesis. St. Petersburg State University, St. Petersburg. 290 Appendix A Phonetic ranges of the production of the target /i/ by the SSE monolingual children. 
Tokens per speaker, by phonetic label (N and % of the speaker's tokens):

Speaker     [i] N    [i] %    [] N    [] %   Total N   Total %
C3_3;4         87   100.0%       0    0.0%        87    100.0%
C7_4;2         74   100.0%       0    0.0%        74    100.0%
C4_3;8         36   100.0%       0    0.0%        36    100.0%
C3_3;11        78   100.0%       0    0.0%        78    100.0%
C6_4;0         68   100.0%       0    0.0%        68    100.0%
C8_4;2        122   100.0%       0    0.0%       122    100.0%
C5_4;0         52   100.0%       0    0.0%        52    100.0%
C4_4;1         54   100.0%       0    0.0%        54    100.0%
C9_4;9        105    99.1%       1    0.9%       106    100.0%
C7_4;8        108    97.3%       3    2.7%       111    100.0%
Total         784    99.5%       4    0.5%       788    100.0%

Appendix B

Distributions of the three most frequent phonetic labels (per carrier word) for the target // produced by the SSE monolingual children.

Label   Carrier     N   % within carrier   % within label
[]      shoes     193              98.0%            38.8%
        soup        6              85.7%             1.2%
        cook      126              69.2%            25.4%
        food      136              81.9%            27.4%
        put        28              93.3%             5.6%
        foot        6              85.7%             1.2%
        took        2              66.7%             0.4%
        Total     497              84.0%           100.0%
[u]     shoes       4               2.0%            15.4%
        soup        0               0.0%             0.0%
        cook        6               3.3%            23.1%
        food       15               9.0%            57.7%
        put         0               0.0%             0.0%
        foot        1              14.3%             3.8%
        took        0               0.0%             0.0%
        Total      26               4.4%           100.0%
[]      shoes       0               0.0%             0.0%
        soup        1              14.3%             1.4%
        cook       50              27.5%            72.5%
        food       15               9.0%            21.7%
        put         2               6.7%             2.9%
        foot        0               0.0%             0.0%
        took        1              33.3%             1.4%
        Total      69              11.7%           100.0%
Total   shoes     197             100.0%            33.3%
        soup        7             100.0%             1.2%
        cook      182             100.0%            30.7%
        food      166             100.0%            28.0%
        put        30             100.0%             5.1%
        foot        7             100.0%             1.2%
        took        3             100.0%             0.5%
        Total     592             100.0%           100.0%

Appendix C

Duration of the close(-mid) vowels produced by the adult subjects as a function of the following consonant in SSE, MSR and SSBE.
Median, Mean and Std. Dev. are durations in ms; n is the number of tokens.

Language Speaker  Vowel  Consonant      Median    Mean Std.Dev.    n
SSE      S2       /i/    fric voice+    218.02  212.59   39.55   28
                         stop voice+    135.89  136.11   13.49   28
                         stop voice-    105.13  105.36   22.20   25
                         Total          138.11  153.06   52.81   81
                  //     fric voice+    112.84  115.18   14.22   12
                         stop voice+     90.10   93.96   16.51   12
                         stop voice-     70.72   71.48    6.88    6
                         Total           97.25   97.95   21.51   30
                  //     fric voice+    203.01  197.68   28.96   28
                         stop voice+    119.19  118.08   14.47   26
                         stop voice-    107.32  110.09   20.83   13
                         Total          130.95  149.80   46.70   67
         S1       /i/    fric voice+    165.29  173.28   25.15   30
                         stop voice+    114.91  114.62   12.72   30
                         stop voice-     99.41   97.97   12.71   30
                         Total          117.27  128.63   36.97   90
                  //     fric voice+    119.23  118.56   15.34   14
                         stop voice+    117.09  115.90    7.70   15
                         stop voice-     99.19  100.65   10.25   13
                         Total          113.41  112.07   13.66   42
                  //     fric voice+    169.96  177.11   27.69   30
                         stop voice+    113.43  114.17    9.56   30
                         stop voice-    104.31  105.15    9.46   15
                         Total          121.27  137.54   37.71   75
         S5       /i/    fric voice+    219.32  223.86   25.85   27
                         stop voice+    115.73  116.80   15.21   28
                         stop voice-    100.36  101.19    9.97   25
                         Total          117.55  148.05   57.74   80
                  //     fric voice+    104.05  106.69   10.22   15
                         stop voice+    105.39  105.63   12.68   15
                         stop voice-     92.87   91.02   12.61   14
                         Total          102.66  101.34   13.62   44
                  //     fric voice+    223.11  225.00   31.79   26
                         stop voice+    106.28  108.81   13.84   29
                         stop voice-    109.51  107.35   11.69   14
                         Total          121.45  152.30   60.99   69
         S4       /i/    fric voice+    216.78  218.43   59.97   21
                         stop voice+    111.92  116.43   22.13   22
                         stop voice-     99.73   99.52   27.91   20
                         Total          119.09  145.06   65.94   63
                  //     fric voice+    118.15  111.77   26.37    9
                         stop voice+     92.15  105.90   32.59   11
                         stop voice-     95.79  100.41   26.93   14
                         Total          102.21  105.20   28.24   34
                  //     fric voice+    249.05  225.09   73.49   28
                         stop voice+    118.54  114.60   25.83   23
                         stop voice-     96.83  104.73   29.09   11
                         Total          134.81  162.75   77.69   62
         S3       /i/    fric voice+    223.97  225.18   22.46   29
                         stop voice+    147.68  148.59   18.01   28
                         stop voice-    124.08  124.00   19.18   25
                         Total          151.69  168.18   47.85   82
                  //     fric voice+    139.25  139.83   13.28   14
                         stop voice+    118.67  119.61   13.69   15
                         stop voice-    112.40  111.29   21.59   14
                         Total          125.77  123.49   20.13   43
                  //     fric voice+    224.93  225.79   24.56   30
                         stop voice+    132.66  135.35   18.73   30
                         stop voice-    124.19  123.24   16.28   14
                         Total          150.54  169.72   51.17   74
MSR      R3       /i/    fric voice+    104.85  106.71   19.03   44
                         stop voice+     99.83  103.71   20.87   45
                         stop voice-     90.86   88.05   13.19   30
                         Total           98.59  100.87   19.87  119
                  /u/    fric voice+    116.16  115.63   18.64   29
                         stop voice+     99.62  104.84   17.46   29
                         stop voice-    106.37  113.98   23.71   15
                         Total          106.37  111.00   19.72   73
         R4       /i/    fric voice+     84.78   88.19   14.51   45
                         stop voice+     93.78   94.32    9.89   39
                         stop voice-     80.72   81.45   12.86   29
                         Total           87.69   88.58   13.49  113
                  /u/    fric voice+    100.52  100.10   14.41   29
                         stop voice+     90.48   89.12   11.49   29
                         stop voice-     95.11   91.47   15.27   15
                         Total           93.02   93.96   14.26   73
         R2       /i/    fric voice+     90.76   92.61   19.69   40
                         stop voice+     90.48   92.19   15.14   39
                         stop voice-     83.91   84.17    9.88   26
                         Total           87.97   90.36   16.29  105
                  /u/    fric voice+     96.95   99.34   12.37   26
                         stop voice+    101.96  102.83   15.57   28
                         stop voice-     95.71   92.63    9.87   10
                         Total           98.34   99.82   13.81   64
         R1       /i/    fric voice+    135.93  134.03   16.35   41
                         stop voice+    116.52  117.10   12.45   20
                         stop voice-    107.46  104.13   13.95   44
                         Total          117.79  118.27   19.86  105
                  /u/    fric voice+    142.10  142.64   20.17   21
                         stop voice+    104.29  103.78   18.12   40
                         stop voice-     97.60  100.48   16.41   40
                         Total          107.61  110.55   24.28  101
         R5       /i/    fric voice+    114.35  116.44   15.75   15
                         stop voice+     97.08  100.59   12.55   15
                         stop voice-     82.15   81.26   16.06   14
                         Total           97.54   99.84   20.46   44
                  /u/    fric voice+    121.63  120.89   15.70   15
                         stop voice+     92.06   99.28   17.65   15
                         stop voice-     89.77   88.19    8.66   30
                         Total           92.92   99.14   18.72   60
SSBE     E2       /i/    fric voice+    246.41  245.96   27.32   30
                         stop voice+    229.88  223.55   22.19   14
                         stop voice-    138.42  134.31   16.18   29
                         Total          214.65  197.31   56.66   73
                  //     fric voice+    146.65  149.06   17.95   15
                         stop voice+    145.81  144.76   15.25   15
                         stop voice-    126.54  125.91   16.19   15
                         Total          142.87  139.91   19.06   45
                  /u/    fric voice+    235.30  234.01   23.96   29
                         stop voice+    214.94  222.02   26.73   15
                         Total          227.73  229.92   25.29   44
                  //     stop voice-    110.52  112.42   15.39   22
                         Total          110.52  112.42   15.39   22
         E1       /i/    fric voice+    250.11  246.64   25.19   30
                         stop voice+    228.31  231.96   19.89   15
                         stop voice-    155.65  151.76   23.96   29
                         Total          217.17  206.48   50.34   74
                  //     fric voice+    143.12  150.55   22.79   15
                         stop voice+    132.71  133.65   22.01   15
                         stop voice-    113.08  118.16   25.91   15
                         Total          133.43  134.12   26.68   45
                  /u/    fric voice+    228.96  233.48   26.19   27
                         stop voice+    221.59  229.00   32.16   14
                         Total          226.61  231.95   28.04   41
                  //     stop voice-    111.68  112.64   20.17   30
                         Total          111.68  112.64   20.17   30
         E3       /i/    fric voice+    336.38  333.81   37.86   30
                         stop voice+    320.51  313.11   32.83   14
                         stop voice-    176.34  174.84   17.98   30
                         Total          288.21  265.45   81.36   74
                  //     fric voice+    224.15  218.79   26.04   15
                         stop voice+    193.57  190.62   24.09   15
                         stop voice-    159.26  163.16   20.30   15
                         Total          193.37  190.86   32.54   45
                  /u/    fric voice+    323.06  318.58   43.82   30
                         stop voice+    327.39  315.46   44.43   15
                         Total          324.11  317.54   43.54   45
                  //     stop voice-    155.56  158.43   18.63   28
                         Total          155.56  158.43   18.63   28
         E5       /i/    fric voice+    289.12  294.37   29.97   30
                         stop voice+    254.49  256.51   13.75   15
                         stop voice-    145.64  143.99   15.12   28
                         Total          253.27  228.91   72.34   73
                  //     fric voice+    172.54  169.96   17.27   15
                         stop voice+    153.78  155.78   16.17   15
                         stop voice-    126.96  128.39   14.45   15
                         Total          152.04  151.37   23.43   45
                  /u/    fric voice+    288.46  295.55   35.65   28
                         stop voice+    247.77  253.99   16.97   15
                         Total          275.63  281.05   36.26   43
                  //     stop voice-    112.09  115.86   14.90   30
                         Total          112.09  115.86   14.90   30

Appendix D

Duration of the close(-mid) vowels produced by the adult subjects averaged per language (SSE, MSR and SSBE) and speaker as a function of the following consonant.
Median, Mean and Std. Dev. are durations in ms; n is the number of tokens.

Language Vowel  Consonant      Median    Mean Std.Dev.    n
SSE      /i/    fric voice+    211.20  209.72   40.33  135
                stop voice+    123.60  126.78   21.12  136
                stop voice-    102.72  105.55   20.86  125
                Total          132.79  148.35   53.55  396
         //     fric voice+    118.57  118.84   19.35   64
                stop voice+    108.97  108.96   19.21   68
                stop voice-     96.37   97.96   21.06   61
                Total          108.01  108.76   21.47  193
         //     fric voice+    208.54  209.68   45.34  142
                stop voice+    118.62  118.46   19.15  138
                stop voice-    108.55  110.28   18.81   67
                Total          132.16  154.21   56.54  347
MSR      /i/    fric voice+    102.72  106.00   24.39  185
                stop voice+     97.81   99.95   17.27  158
                stop voice-     89.72   90.29   16.10  143
                Total           97.07   99.41   20.96  486
         /u/    fric voice+    111.18  113.73   22.48  120
                stop voice+     97.77  100.31   17.08  141
                stop voice-     94.51   97.03   17.26  110
                Total           99.78  103.68   20.27  371
SSBE     /i/    fric voice+    271.66  280.20   47.53  120
                stop voice+    248.62  255.87   41.40   58
                stop voice-    149.99  151.49   23.86  116
                Total          227.88  224.61   71.06  294
         //     fric voice+    165.25  172.09   35.21   60
                stop voice+    150.52  156.20   28.87   60
                stop voice-    130.62  133.90   25.95   60
                Total          148.99  154.06   33.94  180
         /u/    fric voice+    265.60  271.26   50.34  114
                stop voice+    245.31  255.56   48.37   59
                Total          255.55  265.90   50.10  173
         //     stop voice-    119.12  125.13   26.13  110
                Total          119.12  125.13   26.13  110

Appendix E

Individual results of the SSE monolingual children for the duration of the vowel /i/ as a function of the following consonant.
SSE monolingual child   Following consonant   Median duration of /i/ (ms)   n of tokens
C3_3;4                  fric voice+                                299.68            38
                        stop voice+                                165.27            21
                        stop voice-                                173.51            47
                        Total                                      191.99           106
C7_4;2                  fric voice+                                435.95            38
                        stop voice+                                163.42            10
                        stop voice-                                182.98            26
                        Total                                      268.53            74
C4_3;8                  fric voice+                                350.06            17
                        stop voice+                                209.38             7
                        stop voice-                                163.28            19
                        Total                                      209.38            43
C3_3;11                 fric voice+                                316.29            32
                        stop voice+                                158.36            16
                        stop voice-                                136.17            34
                        Total                                      199.37            82
C6_4;0                  fric voice+                                346.64            34
                        stop voice+                                172.37            18
                        stop voice-                                127.72            41
                        Total                                      173.86            93
C8_4;2                  fric voice+                                304.05            56
                        stop voice+                                180.09            25
                        stop voice-                                 94.60            41
                        Total                                      201.99           122
C5_4;0                  fric voice+                                367.13            33
                        stop voice+                                289.11            15
                        stop voice-                                227.93            27
                        Total                                      301.91            75
C4_4;1                  fric voice+                                339.46            27
                        stop voice+                                166.81             8
                        stop voice-                                199.17            22
                        Total                                      248.44            57
C9_4;9                  fric voice+                                258.44            48
                        stop voice+                                144.53            13
                        stop voice-                                125.69            42
                        Total                                      174.01           103
C7_4;8                  fric voice+                                331.08            39
                        stop voice+                                162.33            20
                        stop voice-                                138.90            49
                        Total                                      187.27           108

Appendix F

Individual results of the SSE monolingual children for the duration of the vowel // as a function of the following consonant.
SSE child   Following consonant   Median vowel duration (ms)   n of tokens
C3_3;4      fric voice+                               334.12            27
            stop voice+                               210.24            15
            stop voice-                               136.55            28
            Total                                     198.75            70
C7_4;2      fric voice+                               539.71            14
            stop voice+                               142.39            14
            stop voice-                               130.97            20
            Total                                     151.32            48
C4_3;8      fric voice+                               232.29             7
            stop voice+                               255.14             6
            stop voice-                                71.77             2
            Total                                     247.67            15
C3_3;11     fric voice+                               310.43            16
            stop voice+                               174.94            13
            stop voice-                               111.26            18
            Total                                     174.94            47
C6_4;0      fric voice+                               382.75            19
            stop voice+                               192.70            15
            stop voice-                               114.95            27
            Total                                     187.15            61
C8_4;2      fric voice+                               375.94            25
            stop voice+                               157.37            26
            stop voice-                                98.54            39
            Total                                     150.65            90
C5_4;0      fric voice+                               483.88            21
            stop voice+                               211.10            21
            stop voice-                               136.80            21
            Total                                     235.53            63
C4_4;1      fric voice+                               321.89            14
            stop voice+                               471.61            11
            stop voice-                               115.15            10
            Total                                     249.73            35
C9_4;9      fric voice+                               217.14            21
            stop voice+                               106.52            15
            stop voice-                                68.42            15
            Total                                     136.67            51
C7_4;8      fric voice+                               322.19            21
            stop voice+                               151.56            23
            stop voice-                               123.46            22
            Total                                     187.25            66

Appendix G

Individual results of the SSE monolingual children for the duration of the vowel // as a function of the following consonant.
SSE child   Following consonant   Median duration (ms)   n of tokens
C3_3;4      fric voice+                         269.28            22
            fric voice                          139.31            26
            stop voice+                         228.52            25
            Total                               262.40            73
C7_4;2      fric voice+                         182.75            11
            fric voice                          130.46            17
            stop voice+                         156.42            10
            Total                               179.61            38
C4_3;8      fric voice+                         206.57             7
            fric voice                          176.12            12
            stop voice+                         157.96             7
            Total                               195.87            26
C3_3;11     fric voice+                         202.19            16
            fric voice                          142.87            41
            stop voice+                         186.87            15
            Total                               179.47            72
C6_4;0      fric voice+                         171.74            19
            fric voice                          109.38             9
            stop voice+                         134.47            21
            Total                               153.40            49
C8_4;2      fric voice+                         219.20            30
            fric voice                          108.38            35
            stop voice+                         128.24            29
            Total                               157.16            94
C5_4;0      fric voice+                         208.81            22
            fric voice                          112.57            23
            stop voice+                         270.59            24
            Total                               239.62            69
C4_4;1      fric voice+                         177.26            12
            fric voice                          157.64            21
            stop voice+                         200.09            11
            Total                               183.26            44
C9_4;9      fric voice+                         148.25            26
            fric voice                           99.35            32
            stop voice+                         102.82            22
            Total                               145.66            80
C7_4;8      fric voice+                         176.57            21
            fric voice                          132.69            32
            stop voice+                         159.37            21
            Total                               158.75            74

Appendix H

Duration of the vowel /i/ as a function of the following consonant produced by the bilingual subject AN: longitudinal results for MSR and SSE.

Following consonant  Language  Age     Mean duration (ms)  Std. Dev.  n of tokens
voiced fricative     SSE       3;7                 247.45     117.92          118
                               4;2                 315.22     235.80           20
                               4;5                 287.41     127.60           50
                               Total               265.29     138.61          188
                     MSR       3;7                 245.35      94.60           22
                               4;2                 183.33      88.11           23
                               4;5                 219.24      45.72           26
                               Total               215.70      80.60           71
voiced stop          SSE       3;7                 179.31      59.42           33
                               4;2                 169.33      49.36           12
                               4;5                 181.42      67.71           28
                               Total               178.48      60.65           73
                     MSR       3;7                 248.82     128.60           27
                               4;2                 145.42      91.08           26
                               4;5                 145.30      47.38           31
                               Total               178.61     104.19           84
voiceless stop       SSE       3;7                 176.48      94.87           66
                               4;2                 105.43      41.21           25
                               4;5                 143.30      71.02           49
                               Total               152.18      83.34          140
                     MSR       3;7                 213.38     113.44           25
                               4;2                 296.22     454.53           12
                               4;5                 201.37      79.32           27
                               Total               223.85     211.73           64

Appendix I

Duration of the vowels // and /u/ as a function of the following consonant produced by the bilingual subject AN: longitudinal results for MSR and SSE.
Following consonant  Language  Age     Mean duration (ms)  Std. Dev.  n of tokens
voiced fricative     SSE       3;7                 271.33     134.86           38
                               4;2                 240.17     127.98           11
                               4;5                 343.85     186.21           27
                               Total               292.58     157.34           76
                     MSR       3;7                 307.92     136.30           25
                               4;2                 308.72     144.95           30
                               4;5                 226.71      69.42           37
                               Total               275.52     122.97           92
voiced stop          SSE       3;7                 225.76      92.48           37
                               4;2                 168.32      37.37            5
                               4;5                 222.66      94.66           31
                               Total               220.51      91.07           73
                     MSR       1st                 177.17      67.98           20
                               2nd                 109.56      50.06           19
                               3rd                 187.54     126.47           14
                               Total               155.67      88.22           53
voiceless stop       SSE       3;7                 196.75     107.05           47
                               4;2                 109.41      46.21           13
                               4;5                 123.45      57.48           51
                               Total               152.84      89.30          111
                     MSR       3;7                 198.89      70.85           48
                               4;2                 186.86      99.82           41
                               4;5                 169.83      71.03           40
                               Total               186.06      81.48          129

Appendix J

Duration of the vowel /i/ as a function of the following consonant produced by the bilingual subject BS: longitudinal results for MSR and SSE.

Following consonant  Age     Language  Mean duration (ms)  Std. Dev.  n of tokens
voiced fricative     3;4     SSE                   255.18     116.91           69
                             MSR                   239.99      71.37           14
                             Total                 252.62     110.34           83
                     3;10    SSE                   232.34     110.68           47
                             MSR                   207.50      91.65           33
                             Total                 222.09     103.38           80
                     4;5     SSE                   266.59     107.33           68
                             MSR                   241.87      66.94            6
                             Total                 264.59     104.53           74
                     Total   SSE                   253.56     112.05          184
                             MSR                   219.97      84.47           53
                             Total                 246.05     107.26          237
voiced stop          3;4     SSE                   201.26      76.73           28
                             MSR                   232.07      94.55           16
                             Total                 212.46      83.90           44
                     3;10    SSE                   175.48      76.02           22
                             MSR                   189.15     117.63           41
                             Total                 184.38     104.54           63
                     4;5     SSE                   189.36      92.30           22
                             MSR                   281.79      97.85           15
                             Total                 226.83     103.97           37
                     Total   SSE                   189.75      81.14           72
                             MSR                   217.99     113.83           72
                             Total                 203.87      99.51          144
voiceless stop       3;4     SSE                   251.73     155.58           61
                             MSR                   272.35     119.85           27
                             Total                 258.06     145.18           88
                     3;10    SSE                   233.43     120.85           43
                             MSR                   170.83      73.93           28
                             Total                 208.74     108.72           71
                     4;5     SSE                   221.06     143.88           64
                             MSR                   249.31     161.01           24
                             Total                 228.76     148.34           88
                     Total   SSE                   235.36     142.68          168
                             MSR                   229.37     127.74           79
                             Total                 233.44     137.84          247

Appendix K

Duration of the vowels /u/ and // as a function of the following consonant produced by the bilingual subject BS: longitudinal results for MSR and SSE.
Following Consonant voiced fricative Age 3;4 3;10 4;5 Total voiced stop 3;4 3;10 4;5 Total voiceless stop 3;4 3;10 4;5 Total Language SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total SSE MSR Total Mean duration (ms) 259.91 259.43 259.76 278.58 257.99 265.47 343.03 244.83 301.21 291.98 254.35 274.95 252.69 247.19 250.29 247.23 203.17 216.84 244.07 260.52 250.17 246.90 228.31 237.48 237.06 272.13 259.46 205.27 189.60 195.55 238.04 225.38 229.41 221.71 218.36 219.55 Std. Dev. 99.27 85.61 94.57 96.44 127.38 116.57 220.21 114.76 187.93 153.79 113.67 138.00 158.71 59.98 123.81 115.23 132.28 127.89 130.94 128.71 129.31 132.87 122.90 127.83 112.20 70.73 88.05 108.36 84.03 93.70 111.87 73.35 86.76 109.73 83.04 93.10 n of tokens 41 18 59 20 35 55 31 23 54 92 76 168 18 14 32 18 40 58 39 23 62 75 77 152 13 23 36 33 54 87 21 45 66 67 122 189 303 Appendix L Mean RMS-power around F2 (dB) for the adult subjects averaged per language (SSE, MSR and SSBE) for the vowel /i/ as a function of the following consonant. Acoustic Following measure Consonant A2 A2*a A2*b Mean n of Language (dB) Std. Dev. 
subjects voiced fricative SSE MSR SSBE Total voiced stop SSE MSR SSBE Total voiceless stop SSE MSR SSBE Total voiced fricative SSE MSR SSBE Total voiced stop SSE MSR SSBE Total voiceless stop SSE MSR SSBE Total voiced fricative SSE MSR SSBE Total voiced stop SSE MSR SSBE Total voiceless stop SSE MSR SSBE Total -26.80 -26.49 -23.81 -25.83 -24.52 -28.19 -25.52 -26.12 -22.37 -27.98 -22.07 -24.29 -27.60 -26.44 -25.57 -26.60 -24.67 -28.52 -25.50 -26.28 -22.03 -29.04 -21.41 -24.35 -31.36 -29.63 -27.97 -29.77 -28.43 -31.71 -27.91 -29.45 -25.79 -32.23 -23.81 -27.52 3.09 3.85 3.87 3.57 1.24 5.35 1.92 3.59 2.76 5.30 2.75 4.57 3.42 5.14 4.52 4.14 1.96 6.14 2.93 4.23 3.09 5.68 3.54 5.38 3.42 5.14 3.97 4.16 1.96 6.14 2.83 4.21 3.09 5.68 3.74 5.48 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 304 Appendix M Mean RMS-power around F2 (dB) for the adult subjects averaged per language (SSE, MSR and SSBE) for the close rounded vowels as a function of the following consonant. Acoustic Following n of measure Consonant Language Mean Std. Dev. 
subjects A2 voiced fricative SSE -28.97 4.97 MSR -23.58 3.48 SSBE -27.19 2.24 Total -26.54 4.27 voiced stop SSE -31.89 4.00 MSR -25.39 4.15 SSBE -23.71 2.13 Total -27.23 4.97 voiceless stop SSE -28.22 5.85 MSR -26.48 3.61 SSBE -17.00 2.06 Total -24.39 6.30 A2*a voiced fricative SSE -31.29 4.45 MSR -23.39 4.97 SSBE -28.37 3.41 Total -27.63 5.35 voiced stop SSE -31.32 3.40 MSR -25.87 6.04 SSBE -23.58 3.94 Total -27.17 5.44 voiceless stop SSE -24.77 5.44 MSR -28.42 6.42 SSBE -18.31 1.35 Total -24.23 6.32 A2*c voiced fricative SSE -36.43 4.45 MSR -13.11 4.97 SSBE -34.46 2.42 Total -27.54 11.84 voiced stop SSE -36.46 3.40 MSR -15.59 6.04 SSBE -27.10 3.09 Total -26.33 10.05 voiceless stop SSE -29.91 5.44 MSR -18.14 6.42 SSBE -12.20 2.44 Total -20.65 8.98 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 5 5 4 14 305 Appendix N Mean RMS-power around F2 (dB) produced by the SSE subjects of different ages for the vowel /i/ as a function of the following consonant. Acoustic Measure A2 A2*a A2*b Following n of Consonant Age Mean (dB) Std. Dev. 
subjects voiced fricative adult -26.80 3.09 child 3;4 to 3;11 -30.70 2.66 child 4;0 to 4;4 -29.84 4.28 child 4;5 to 4;9 -27.97 4.80 Total -28.75 3.65 voiced stop adult -24.52 1.24 child 3;4 to 3;11 -30.11 1.49 child 4;0 to 4;4 -29.63 3.32 child 4;5 to 4;9 -27.10 5.76 Total -27.68 3.54 voiceless stop adult -22.37 2.76 child 3;4 to 3;11 -28.46 3.59 child 4;0 to 4;4 -27.75 3.10 child 4;5 to 4;9 -26.71 5.52 Total -25.96 4.01 voiced fricative adult -27.60 3.42 child 3;4 to 3;11 -28.84 3.76 child 4;0 to 4;4 -27.64 5.77 child 4;5 to 4;9 -26.47 6.21 Total -27.71 4.26 voiced stop adult -24.67 1.96 child 3;4 to 3;11 -26.50 2.87 child 4;0 to 4;4 -25.91 4.77 child 4;5 to 4;9 -23.57 6.97 Total -25.30 3.64 voiceless stop adult -22.03 3.09 child 3;4 to 3;11 -25.82 3.60 child 4;0 to 4;4 -23.66 6.21 child 4;5 to 4;9 -22.68 7.02 Total -23.42 4.60 voiced fricative adult -31.36 3.42 child 3;4 to 3;11 -33.59 3.76 child 4;0 to 4;4 -32.91 5.41 child 4;5 to 4;9 -31.21 6.21 Total -32.30 4.18 voiced stop adult -28.43 1.96 child 3;4 to 3;11 -31.24 2.87 child 4;0 to 4;4 -31.18 4.85 child 4;5 to 4;9 -28.31 6.97 Total -29.89 3.82 voiceless stop adult -25.79 3.09 child 3;4 to 3;11 -30.57 3.60 child 4;0 to 4;4 -28.93 6.01 child 4;5 to 4;9 -27.42 7.02 Total -28.01 4.68 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 5 3 5 2 15 306 Appendix O Mean RMS-power around F2 (dB) produced by the SSE subjects of different ages for the vowels /i/ and // across all consonantal contexts. Acoustic Measure Vowel /i/ A2 // A2*a /i/ // A2*b /i/ // n of Age Mean (dB)Std. Dev. 
subjects adult -24.60 1.71 5 child 3;4 to 3;11 -29.72 2.44 3 child 4;0 to 4;4 -29.14 4.04 5 child 4;5 to 4;9 -27.27 5.21 2 Total -27.49 3.65 15 adult -23.11 4.82 5 child 3;4 to 3;11 -19.29 1.09 3 child 4;0 to 4;4 -23.90 2.41 5 child 4;5 to 4;9 -20.07 3.96 2 Total -22.21 3.66 15 adult -25.07 1.85 5 child 3;4 to 3;11 -27.08 3.14 3 child 4;0 to 4;4 -25.49 5.43 5 child 4;5 to 4;9 -24.38 6.97 2 Total -25.52 3.88 15 adult -23.00 4.02 5 child 3;4 to 3;11 -8.08 6.25 3 child 4;0 to 4;4 -14.78 5.94 5 child 4;5 to 4;9 -11.85 1.83 2 Total -15.79 7.37 15 adult -28.83 1.85 5 child 3;4 to 3;11 -31.83 3.14 3 child 4;0 to 4;4 -30.76 5.26 5 child 4;5 to 4;9 -29.13 6.97 2 Total -30.11 3.91 15 adult -15.97 4.30 5 child 3;4 to 3;11 -5.32 4.83 3 child 4;0 to 4;4 -14.79 5.84 5 child 4;5 to 4;9 -8.14 3.56 2 Total -12.40 6.26 15 307 Appendix P Mean RMS-power around F2 (dB) produced by the SSE subjects of different ages for the vowel // as a function of the following consonant. Acoustic Following Mean n of Measure Consonant Age (dB) Std. Dev. 
subjects A2 voiced fricative adult -28.97 4.97 5 child 3;4 to 3;11 -29.03 2.81 3 child 4;0 to 4;4 -29.76 4.55 5 child 4;5 to 4;9 -31.96 0.13 2 Total -29.64 3.89 15 voiced stop adult -31.89 4.00 5 child 3;4 to 3;11 -27.96 2.32 3 child 4;0 to 4;4 -29.00 6.20 5 child 4;5 to 4;9 -27.72 0.90 2 Total -29.58 4.41 15 voiceless stop adult -28.22 5.85 5 child 3;4 to 3;11 -28.35 3.36 3 child 4;0 to 4;4 -27.67 4.00 5 child 4;5 to 4;9 -28.09 1.71 2 Total -28.04 4.03 15 A2*a voiced fricative adult -31.29 4.45 5 child 3;4 to 3;11 -24.20 5.82 3 child 4;0 to 4;4 -24.33 5.18 5 child 4;5 to 4;9 -26.18 2.44 2 Total -26.87 5.43 15 voiced stop adult -31.32 3.40 5 child 3;4 to 3;11 -19.95 5.25 3 child 4;0 to 4;4 -19.00 7.35 5 child 4;5 to 4;9 -20.81 1.57 2 Total -23.54 7.46 15 voiceless stop adult -24.77 5.44 5 child 3;4 to 3;11 -19.41 3.98 3 child 4;0 to 4;4 -17.71 5.32 5 child 4;5 to 4;9 -19.76 1.98 2 Total -20.68 5.36 15 A2*b voiced fricative adult -36.43 4.45 5 child 3;4 to 3;11 -33.87 5.82 3 child 4;0 to 4;4 -34.97 6.04 5 child 4;5 to 4;9 -35.86 2.44 2 Total -35.36 4.73 15 voiced stop adult -36.46 3.40 5 child 3;4 to 3;11 -29.63 5.25 3 child 4;0 to 4;4 -29.63 8.26 5 child 4;5 to 4;9 -30.49 1.57 2 Total -32.02 6.13 15 voiceless stop adult -29.91 5.44 5 child 3;4 to 3;11 -29.09 3.98 3 child 4;0 to 4;4 -28.35 5.58 5 child 4;5 to 4;9 -29.44 1.98 2 Total -29.16 4.51 15 308 Appendix Q Descriptive statistics of SSE/MSR bilingual production of vocal effort for the vowel /i/ as a function of the following consonant based on three acoustic measures A2, A2*a, A2*b (dB) per speaker, language and age. Language SSE Speaker AN_3;7 Following Consonant fric voice+ stop voice+ stop voice- BS_3;4 fric voice+ stop voice+ stop voice- AN_4;2 fric voice+ stop voice+ stop voice- BS_3;10 fric voice+ stop voice+ stop voice- Median Mean Std. Dev N Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. 
n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. A2 -26.98 -26.36 7.65 118 -27.73 -27.81 6.43 33 -23.87 -24.92 7.39 66 -25.61 -26.61 6.94 69 -24.03 -23.97 5.15 28 -25.39 -23.77 7.06 61 -23.93 -23.93 4.59 20 -19.81 -18.94 6.97 12 -19.68 -19.92 5.22 25 -25.12 -23.91 8.09 47 -23.20 -22.31 8.90 22 -22.62 -22.60 7.46 A2*a -25.01 -23.19 10.19 118 -24.79 -25.13 7.50 33 -22.32 -22.32 9.03 66 -21.85 -21.63 8.22 69 -20.18 -20.16 6.66 28 -17.61 -16.05 9.14 61 -22.61 -22.33 4.90 20 -18.78 -14.45 14.27 12 -15.56 -14.12 7.94 25 -20.30 -17.26 12.14 47 -18.40 -18.16 11.19 22 -17.46 -15.52 10.28 A2*b -29.76 -27.94 10.19 118 -29.54 -29.88 7.50 33 -27.06 -27.07 9.03 66 -26.59 -26.38 8.22 69 -24.93 -24.90 6.66 28 -22.36 -20.80 9.14 61 -27.36 -27.08 4.90 20 -23.53 -19.20 14.27 12 -20.31 -18.87 7.94 25 -25.05 -22.01 12.14 47 -23.14 -22.90 11.19 22 -22.21 -20.27 10.28 309 AN_4;5 fric voice+ stop voice+ stop voice- BS_4;5 fric voice+ stop voice+ stop voice- MSR AN_3;7 fric voice+ stop voice+ stop voice- BS_3;4 fric voice+ stop voice+ stop voice- AN_4;2 fric voice+ stop voice+ n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. 
n of tokens Median Mean 43 -24.03 -23.88 5.38 50 -21.98 -22.04 6.64 28 -20.80 -20.22 6.51 49 -26.37 -25.81 7.90 68 -20.58 -21.94 7.14 22 -24.77 -25.30 6.73 64 -21.42 -21.50 5.16 22 -26.07 -27.59 6.87 27 -25.80 -24.95 6.13 25 -23.70 -25.06 7.11 14 -22.91 -24.22 6.33 16 -24.61 -25.23 8.21 27 -19.94 -20.36 7.90 23 -21.46 -23.04 43 -24.03 -22.92 6.99 50 -19.92 -19.33 7.73 28 -19.25 -18.29 7.80 49 -24.34 -22.35 8.24 68 -15.06 -16.05 8.25 22 -18.19 -19.21 7.28 64 -13.65 -14.75 6.13 22 -26.13 -23.31 8.84 27 -22.74 -22.24 7.45 25 -20.55 -20.33 6.58 14 -15.52 -18.82 8.21 16 -20.35 -20.15 9.37 27 -17.10 -16.23 12.10 23 -17.20 -16.49 43 -28.78 -27.66 6.99 50 -24.67 -24.08 7.73 28 -24.00 -23.04 7.80 49 -29.09 -27.10 8.24 68 -19.81 -20.80 8.25 22 -22.94 -23.95 7.28 64 -18.43 -19.53 6.13 22 -30.91 -28.09 8.84 27 -27.53 -27.03 7.45 25 -25.33 -25.11 6.58 14 -20.30 -23.60 8.21 16 -25.13 -24.93 9.37 27 -21.88 -21.02 12.10 23 -21.99 -21.28 310 BS_3;10 AN_4;5 BS_4;5 Std. Dev. n of tokens stop voice- Median Mean Std. Dev. n of tokens fric voice+ Median Mean Std. Dev. n of tokens stop voice+ Median Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. n of tokens fric voice+ Median Mean Std. Dev. n of tokens stop voice+ Median Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. n of tokens fric voice+ Median Mean Std. Dev. n of tokens stop voice+ Median Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. 
n of tokens 9.85 26 -20.54 -22.54 10.71 12 -23.52 -24.59 7.68 33 -26.74 -26.84 8.17 41 -24.37 -23.15 6.56 28 -22.77 -23.92 5.64 26 -22.67 -22.55 5.87 31 -25.29 -24.00 8.08 27 -27.50 -29.18 4.83 6 -30.14 -28.72 7.26 15 -24.93 -24.86 6.84 24 12.76 26 -14.96 -15.29 15.21 12 -22.64 -21.22 8.10 33 -20.59 -21.51 8.82 41 -21.31 -18.73 8.44 28 -18.48 -21.02 6.19 26 -21.31 -21.72 6.43 31 -25.13 -23.47 8.03 27 -22.84 -25.80 8.16 6 -24.15 -21.98 9.24 15 -23.42 -23.03 8.69 24 12.76 26 -19.75 -20.07 15.21 12 -27.42 -26.00 8.10 33 -25.37 -26.30 8.82 41 -26.09 -23.52 8.44 28 -23.27 -25.80 6.19 26 -26.09 -26.50 6.43 31 -29.91 -28.26 8.03 27 -27.63 -30.59 8.16 6 -28.93 -26.77 9.24 15 -28.20 -27.81 8.69 24

Appendix R

Descriptive statistics of bilingual SSE production of vocal effort for the tense/lax vowels /i/ and /ɪ/ based on three acoustic measures A2, A2*a, A2*b (dB) per speaker and age.

Speaker   SSE vowel   Statistic        A2       A2*a     A2*b
AN_3;7    /i/         Median         -26.23   -24.44   -29.19
                      Mean           -26.14   -23.22   -27.97
                      Std. Dev.        7.43     9.48     9.48
                      n of tokens       217      217      217
          /ɪ/         Median         -18.11    -5.89    -2.98
                      Mean           -17.86    -6.19    -3.28
                      Std. Dev.        5.50     8.35     8.35
                      n of tokens       106      106      106
BS_3;4    /i/         Median         -25.01   -19.28   -24.03
                      Mean           -25.05   -19.22   -23.96
                      Std. Dev.        6.81     8.68     8.68
                      n of tokens       158      158      158
          /ɪ/         Median         -22.07   -21.90   -18.99
                      Mean           -22.49   -21.70   -18.79
                      Std. Dev.        6.53     7.54     7.54
                      n of tokens        77       77       77
AN_4;2    /i/         Median         -21.44   -19.13   -23.88
                      Mean           -21.12   -17.07   -21.82
                      Std. Dev.        5.73     9.51     9.51
                      n of tokens        57       57       57
          /ɪ/         Median         -20.97   -17.40   -14.49
                      Mean           -21.62   -17.08   -14.17
                      Std. Dev.        8.21     9.39     9.39
                      n of tokens        40       40       40
BS_3;10   /i/         Median         -23.90   -18.82   -23.57
                      Mean           -23.06   -16.75   -21.50
                      Std. Dev.        7.95    11.17    11.17
                      n of tokens       113      113      113
          /ɪ/         Median         -24.28   -25.70   -22.79
                      Mean           -24.36   -25.49   -22.58
                      Std. Dev.        6.85     8.43     8.43
                      n of tokens        74       74       74
AN_4;5    /i/         Median         -22.67   -20.42   -25.17
                      Mean           -22.06   -20.34   -25.09
                      Std. Dev.        6.28     7.71     7.71
                      n of tokens       127      127      127
          /ɪ/         Median         -17.67    -6.45    -3.54
                      Mean           -17.36    -7.18    -4.27
                      Std. Dev.        5.63     7.37     7.37
                      n of tokens       151      151      151
BS_4;5    /i/         Median         -24.96   -20.64   -25.39
                      Mean           -25.05   -20.15   -24.89
                      Std. Dev.        7.39     8.11     8.11
                      n of tokens       154      154      154
          /ɪ/         Median         -26.29   -25.47   -22.56
                      Mean           -26.55   -25.52   -22.61
                      Std. Dev.        6.76     9.64     9.64
                      n of tokens        78       78       78

Appendix S

Descriptive statistics of SSE/MSR bilingual production of vocal effort for the close rounded vowels as a function of the following consonant based on three acoustic measures A2, A2*a, A2*c (dB) per speaker, language and age.

Language SSE Speaker AN_3;7 Following Consonant fric voice+ stop voice+ stop voice- BS_3;4 fric voice+ stop voice+ stop voice- AN_4;2 fric voice+ stop voice+ stop voice- BS_3;10 fric voice+ stop voice+ stop voice- Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev.
n of tokens Median A2 -19.62 -20.52 6.69 38 -22.18 -22.41 5.97 37 -21.00 -21.08 5.58 47 -23.39 -23.97 7.31 41 -27.94 -27.93 6.63 18 -29.24 -30.00 5.99 13 -19.17 -19.71 4.50 11 -20.88 -22.17 7.56 5 -20.75 -21.40 5.74 13 -21.47 -21.05 8.07 20 -30.46 -31.46 6.99 18 -27.12 A2*a -16.89 -17.28 8.96 38 -16.84 -15.26 6.77 37 -12.93 -13.13 6.68 47 -19.46 -20.10 10.63 41 -14.43 -14.27 11.42 18 -21.87 -21.10 9.67 13 -12.48 -13.70 6.06 11 -11.97 -13.17 5.53 5 -12.50 -12.44 3.89 13 -13.51 -14.49 13.29 20 -13.32 -18.46 10.30 18 -15.40 A2*c -26.56 -26.95 8.96 38 -26.52 -24.93 6.77 37 -22.61 -22.80 6.68 47 -29.13 -29.78 10.63 41 -24.10 -23.94 11.42 18 -31.55 -30.77 9.67 13 -22.16 -23.59 5.87 11 -21.65 -22.85 5.53 5 -22.18 -22.12 3.89 13 -23.19 -24.16 13.29 20 -23.00 -28.14 10.30 18 -25.07 313 AN_4;5 fric voice+ stop voice+ stop voice- BS_4;5 fric voice+ stop voice+ stop voice- MSR AN_3;7 fric voice+ stop voice+ stop voice- BS_3;4 fric voice+ stop voice+ stop voice- AN_4;2 fric voice+ stop voice+ Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. n of tokens Median Mean Std. Dev. 
n of tokens Median -27.05 6.84 33 -20.85 -22.24 7.80 27 -21.17 -20.81 4.82 31 -21.36 -22.18 5.24 51 -24.27 -25.36 8.39 31 -28.19 -27.80 6.55 39 -26.90 -27.29 6.20 21 -23.21 -23.75 7.07 25 -25.63 -24.13 10.36 20 -33.55 -31.91 6.35 48 -27.61 -27.95 6.54 18 -25.63 -24.20 7.88 14 -28.02 -28.84 5.50 23 -22.13 -23.37 9.25 30 -15.62 -15.51 8.89 33 -12.43 -14.58 11.48 27 -12.34 -12.32 7.14 31 -15.62 -16.07 7.47 51 -21.43 -21.66 8.68 31 -19.96 -19.31 8.13 39 -21.51 -20.99 8.23 21 -2.92 -2.87 13.42 25 -7.00 -2.08 16.92 20 -23.19 -18.75 13.64 48 -11.75 -10.89 10.20 18 -3.18 -3.23 13.40 14 -11.75 -14.63 8.25 23 3.00 0.90 12.39 30 8.31 -25.19 8.89 33 -22.11 -24.26 11.48 27 -22.02 -22.00 7.14 31 -25.29 -25.75 7.47 51 -31.10 -31.34 8.68 31 -29.64 -28.98 8.13 39 -31.18 -30.66 8.23 21 -13.00 -12.95 13.42 25 -17.08 -12.16 16.92 20 -33.27 -28.83 13.64 48 -21.83 -20.97 10.20 18 -13.26 -13.31 13.40 14 -21.83 -24.71 8.25 23 -7.08 -9.19 12.39 30 -1.77 314 BS_3;10 AN_4;5 BS_4;5 Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. n of tokens fric voice+ Median Mean Std. Dev. n of tokens stop voice+ Median Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. n of tokens fric voice+ Median Mean Std. Dev. n of tokens stop voice+ Median Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. n of tokens fric voice+ Median Mean Std. Dev. n of tokens stop voice+ Median Mean Std. Dev. n of tokens stop voice- Median Mean Std. Dev. 
n of tokens -22.03 11.46 19 -24.77 -24.45 7.68 41 -23.43 -23.94 8.64 35 -23.69 -24.36 8.27 40 -24.43 -25.60 7.31 54 -26.28 -26.94 7.89 37 -30.70 -27.88 10.18 14 -28.61 -29.09 7.88 40 -26.64 -25.95 7.78 23 -22.34 -23.18 10.15 23 -26.94 -26.19 6.37 45 6.14 18.73 19 -9.79 -6.08 14.08 41 -2.61 -3.14 12.71 35 -3.63 -3.56 14.03 40 -5.56 -6.32 11.06 54 -9.20 -7.63 15.04 37 -9.11 -6.64 17.98 14 -14.99 -13.64 14.19 40 -7.39 -5.80 11.50 23 -3.02 0.72 19.46 23 -8.26 -5.56 10.89 45 -3.94 18.73 19 -19.87 -16.16 14.08 41 -12.69 -13.22 12.71 35 -13.71 -13.64 14.03 40 -15.64 -16.40 11.06 54 -19.28 -17.71 15.04 37 -19.19 -16.72 17.98 14 -25.07 -23.73 14.19 40 -17.47 -15.88 11.50 23 -13.10 -9.36 19.46 23 -18.34 -15.64 10.89 45

Appendix T

Durational ratios for the postvocalic conditioning of vowel duration for all subjects by language, age and bilinguality.

Ratios for Ratios for /u/ // Ratios for /i/

Language   Bilinguality   Subject ID   VLS/VF   VLS/VS   VLS/VF   VLS/VS   VLF/VF
SSE        monolingual    S2           0.48     0.77     0.53     0.90     0.62
                          S1           0.60     0.87     0.61     0.92     0.90
                          S5           0.46     0.87     0.49     1.03     0.82
                          S4           0.46     0.89     0.39     0.82     0.80
                          S3           0.55     0.84     0.55     0.94     0.86
                          C3_3;4       0.58     1.05     0.41     0.65     0.52
                          C4_3;8       0.47     0.78     0.31     0.28     0.85
                          C3_3;11      0.43     0.86     0.36     0.64     0.71
                          C6_4;0       0.37     0.74     0.30     0.60     0.64
                          C8_4;2       0.31     0.53     0.26     0.63     0.49
                          C7_4;2       0.42     1.12     0.24     0.92     0.71
                          C5_4;0       0.62     0.79     0.28     0.65     0.68
                          C4_4;1       0.59     1.19     0.36     0.24     0.89
                          C9_4;9       0.49     0.87     0.32     0.64     0.67
                          C7_4;8       0.42     0.86     0.38     0.81     0.75
           bilingual      AN_3;7       0.72     0.89     0.72     0.78     1.08
                          BS_3;4       0.94     1.07     0.89     1.03     1.30
                          AN_4;2       0.36     0.58     0.61     0.73     0.79
                          BS_3;10      0.91     1.23     0.72     0.81     1.25
                          AN_4;5       0.48     0.77     0.36     0.57     0.71
                          BS_4;5       0.73     1.04     0.81     1.00     1.54
MSR        monolingual    R3           0.87     0.91     0.92     1.07
                          R4           0.95     0.86     0.95     1.05
                          R2           0.92     0.93     0.99     0.94
                          R1           0.79     0.92     0.69     0.94
                          R5           0.72     0.85     0.74     0.98
           bilingual      AN_3;7       0.73     0.72     0.63     1.09
                          BS_3;4       1.08     1.12     1.06     1.09
                          AN_4;2       0.61     0.76     0.67     1.73
                          BS_3;10      0.85     0.98     0.83     1.01
                          AN_4;5       0.90     1.38     0.71     0.89
                          BS_4;5       0.75     0.76     0.94     0.90
SSBE       monolingual    E2           0.56     0.60     0.47     0.51     0.69
                          E1           0.62     0.68     0.49     0.50     0.69
                          E3           0.52     0.55     0.48     0.48     0.66
                          E4           0.46     0.72     0.39     0.45     0.74