Publisher: Psychology Press. Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t775653668

Online Publication Date: 01 July 2009

To cite this Article: Trapman, Mirjam and Kager, René (2009) 'The Acquisition of Subset and Superset Phonotactic Knowledge in a Second Language', Language Acquisition, 16:3, 178–221
To link to this Article: DOI: 10.1080/10489220903011636 URL: http://dx.doi.org/10.1080/10489220903011636
Language Acquisition, 16:178–221, 2009
Copyright © Taylor & Francis Group, LLC
ISSN: 1048-9223 print/1532-7817 online
DOI: 10.1080/10489220903011636

The Acquisition of Subset and Superset Phonotactic Knowledge in a Second Language

Mirjam Trapman
University of Amsterdam

René Kager
Utrecht University

Can second language (L2) learners acquire a grammar that allows a subset of the structures allowed by their native grammar? This question is addressed here with respect to acquisition of phonotactics. On the assumption that the L2 initial state equals the native grammar’s final state, learnability theory would predict that a lack of negative evidence for phonotactic structures that are illegal in the target language precludes acquisition of the target grammar. This prediction is tested for L1-Russian (superset) and L1-Spanish (subset) L2 learners of Dutch by means of word-likeness judgments and lexical decision experiments. Participants responded to nonwords containing consonant clusters in onsets and codas that are legal (1) only in Russian, (2) only in Russian and Dutch, or (3) in all three languages. The results converge to show that advanced L1-Russian and L1-Spanish L2 learners possess native-like phonotactic knowledge. Analysis shows that this knowledge cannot be attributed to transfer of lexical statistics from the native language. The results suggest that L2 phonotactic acquisition is not affected by subset/superset relations between the native language and target language. Some possible explanations for our findings are discussed.

1. INTRODUCTION

Phonologies of natural languages differ not only in terms of phoneme inventories, but also in terms of phoneme distributions. Phonotactic constraints state positional restrictions on speech sounds, typically with respect to the syllable.
Constraints may vary in strength or ranking, providing a major source of cross-linguistic variation in syllable inventories. Studies of phonological markedness show that the presence of marked structures in a language strongly predicts the presence of less-marked structures (Jakobson 1941/1968; Greenberg 1965; Prince & Smolensky 1993; Blevins 1995). For example, languages that allow complex onsets universally also allow simple onsets, languages that allow plosive-nasal onsets also allow plosive-liquid onsets, etc. Syllable typology is an area particularly rich in inclusion relations between inventories. Inclusion relations between inventories can be recursive, so that the inventory of Language A may be a proper subset of the inventory of Language B, which itself is a proper subset of the inventory of Language C. For example, the small set of syllable onsets that are legal in Spanish is properly included in the somewhat larger set of Dutch legal onsets, which itself is properly included in the even larger Russian set, to be shown later.

Correspondence should be sent to Mirjam Trapman, Amsterdam Center for Language and Communication, University of Amsterdam, Spuistraat 210, 1012 VT Amsterdam, The Netherlands. E-mail: [email protected]. René Kager, Utrecht Institute of Linguistics/OTS, Utrecht University, Janskerkhof 13, 3512 BL Utrecht, The Netherlands.

The general issue that we address here is whether such ‘stringency relations’ between grammars affect naturalistic phonotactic acquisition. In particular, we investigate whether phonotactic acquisition in a second language depends on the learner’s initial state, which may either define a proper subset or a superset of the phonotactic structures allowed by the target language.
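The inclusion chain between the three inventories can be made concrete with a small sketch. The onset sets below are illustrative fragments assumed for this example only (they are not the full inventories, and the names `SPANISH`, `DUTCH`, `RUSSIAN`, and `scenario` are hypothetical); the sketch merely shows how proper inclusion between an L1 inventory and a target inventory determines which of the two learning scenarios applies.

```python
# Toy onset inventories: small illustrative fragments, NOT the full sets.
# Each larger set is built by adding clusters to the smaller one, so the
# chain of proper inclusions holds by construction.
SPANISH = frozenset({"fl", "dr", "kr", "bl"})
DUTCH = SPANISH | {"sp", "st", "sn", "sm", "kn"}
RUSSIAN = DUTCH | {"zd", "kt", "tk"}


def scenario(l1, target):
    """Name the L2 learning scenario from the L1 and target inventories."""
    if l1 < target:        # target properly includes the L1 inventory
        return "superset"
    if target < l1:        # target is properly included in the L1 inventory
        return "subset"
    if l1 == target:
        return "identical"
    return "no inclusion"  # e.g. partially overlapping inventories


print(SPANISH < DUTCH < RUSSIAN)   # proper inclusion is transitive
print(scenario(SPANISH, DUTCH))    # L1-Spanish learners of Dutch
print(scenario(RUSSIAN, DUTCH))    # L1-Russian learners of Dutch
```

Representing each inventory as a set keeps the comparison purely extensional; it says nothing about how a grammar generates the inventory, which is a separate question.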
Learnability theory predicts a strong asymmetry between these two initial states, such that successful phonotactic acquisition should only be possible starting from a subset initial state. This is because only learners who acquire phonotactic superset grammars receive positive evidence in their input. Before developing this prediction, we will first discuss phonotactic knowledge from the dual perspective of language processing and acquisition.

Phonotactic constraints are not merely notational devices helpful for language description; they possess psychological reality. The assumption that native speakers possess phonotactic knowledge of their language is supported by classical types of evidence: loanword adaptations and well-formedness ratings of nonwords. Loanwords tend to be ‘repaired’ by processes such as vowel epenthesis, making the resulting forms conform to native phonotactics (Silverman 1992; Yip 1993; Peperkamp & Dupoux 2003; Davidson 2007). Native speakers’ ability to judge the phonotactic well-formedness (word-likeness) of nonwords is documented by many studies (Scholes 1966; Berent & Shimron 1997; Bailey & Hahn 2001; Frisch & Zawaydeh 2001; Coetzee 2004, 2008, 2009). Well-formedness judgments have been found to be gradient and to correlate with measures of phonotactic probability in the lexicon (Coleman & Pierrehumbert 1997; Frisch, Large & Pisoni 2000; Bailey & Hahn 2001; Hay, Pierrehumbert & Beckman 2004; Coetzee 2008; Albright 2009), suggesting that phonotactic knowledge, partially or entirely, emerges from distributions in the lexicon. However, native speakers are able to distinguish degrees of well-formedness between phonotactic structures that are illegal in the native language (Pertz & Bever 1975; Berent, Steriade, Lennertz & Vaknin 2007; Coetzee 2008; Berent, Lennertz, Smolensky & Vaknin 2009).
This finding cannot be explained by the hypothesis that phonotactic knowledge is learned only from exposure to lexical distributions, and suggests that such knowledge also has a basis in universal constraints.

The psychological reality of phonotactic knowledge is supported by a range of evidence from speech production and perception. For production, the classical finding is that speech errors tend to result in structures that are phonotactically legal (Dell, Reed, Adams & Meyer 2000; Goldrick 2004). Phonotactically legal nonwords are also produced faster and more accurately than phonotactically illegal nonwords (Vitevitch & Luce 1998). Turning to the domain of speech perception, an increasing number of studies show phonotactic influences on native listeners’ responses in tasks such as lexical decision (Praamstra, Meyer & Levelt 1994; Vitevitch & Luce 1999; Berent, Marcus, Shimron & Gafos 2002; Kager & Shatzman 2007; Coetzee 2008) and word-spotting (McQueen 1998; Suomi, McQueen & Cutler 1997; Vroomen, Tuomainen & de Gelder 1998; McQueen, Otake & Cutler 2001; Weber & Cutler 2006). Moreover, native phonotactic constraints shape perception by filtering out segmental sequences that are illegal in the native language, affecting phoneme identification (Massaro & Cohen 1983; Hallé, Segui, Frauenfelder & Meunier 1998; Pitt 1998; Moreton & Amano 1999; Moreton 2002; Coetzee 2005) and the perception of syllables, involving ‘perceptual epenthesis’ (Dupoux, Kakehi, Hirose, Pallier & Mehler 1999; Dupoux, Pallier, Kakehi & Mehler 2001; Berent, Steriade, Lennertz & Vaknin 2007; Kabak & Idsardi 2007). This growing body of studies offers converging evidence that native speakers and listeners possess implicit knowledge of the phonotactic constraints of their native language and use this knowledge for processing.
Phonotactic knowledge of the native language develops early, taking off during the first year of life. In perception experiments, 9-month-old infants respond differently to speech which contains phoneme sequences conforming to native phonotactics than to speech containing phonotactically illegal sequences (Friederici & Wessels 1993; Jusczyk, Friederici, Wessels, Svenkerud & Jusczyk 1993; Jusczyk, Luce & Charles-Luce 1994). Younger infants, aged 6 months, do not show differential responses to phonotactically legal and illegal sequences. These studies suggest that native phonotactic knowledge begins developing during the first year, possibly to assist the segmentation of speech. It has been hypothesized that infants use phonotactics in continuous speech to start tackling the problem of where word boundaries fall, which would assist them in setting up an initial lexicon (Mattys, Jusczyk, Luce & Morgan 1999; Mattys & Jusczyk 2001).

Much research has addressed the question of how the acquisition of novel phonotactic knowledge is affected by native phonotactic knowledge, by studying phonotactic acquisition in a second language (henceforth, L2). A crucial difference from native (henceforth, L1) phonotactic acquisition resides in the circumstance that at the onset of L2 acquisition, a full-fledged phonotactic grammar has already been acquired. Many studies have found pervasive effects of L1 phonotactics on the L2. L2 learners repair nonnative structures in their productions, resulting in outputs meeting phonotactic constraints of their L1. For example, Korean L2 learners of English simplify consonant clusters that are illegal in their L1 (Broselow & Finer 1991), while English L2 learners of Russian simplify complex syllable onsets (Ostapenko 2005). Strategies to adjust L1-illegal consonant clusters are multiple, and include vowel epenthesis, consonant deletion, and metathesis.
Vowel epenthesis is often applied by L2 learners to adjust L1-illegal complex onsets (Broselow 1987; Hancin-Bhatt & Bhatt 1997). However, advanced learners also apply consonant deletion (Ostapenko 2005). The repair of L1-illegal structures has been found to depend on their degree of markedness. A number of studies show that more production errors occur in L1-illegal forms that are more marked than in L1-illegal forms that are less marked (Eckman 1987; Carlisle 1988, 1998).1 For example, Carlisle (1988) showed that relatively unmarked obstruent-liquid onsets are modified less often than more marked obstruent-nasal clusters even when neither type of structure was present in the learner’s L1. Carlisle (1998) examined the acquisition of English onsets by native speakers of Spanish in a longitudinal study, comparing CC-onsets (/sp/ and /sk/) with more marked CCC-onsets (/spr/ and /skr/) and found that the less-marked onsets /sp/ and /sk/ were produced correctly more often than more marked tri-consonantal onsets. Ostapenko (2005) investigated influences of markedness through the sonority sequencing principle, finding that Russian consonant clusters that violate this principle are difficult to acquire for English learners.

1 This result corresponds to similar findings by Davidson (2003) and Haunz (2002), who found that native speakers of English had more difficulties pronouncing more marked than less marked nonattested clusters. The similarity between these native speakers and beginning L2 learners is that both groups have had hardly any input in the foreign language.

Most studies of L2 phonotactics are based on production data (Eckman 1977; Broselow 1987; Weinberger 1988; Broselow & Finer 1991; Hancin-Bhatt & Bhatt 1997; Broselow, Chen & Wang 1998; Carlisle 1998; Hancin-Bhatt 2000; Haunz 2002; Davidson 2003; Ostapenko 2005, among others).
Although production data are indicative of phonological development, exclusive reliance on such data carries a certain risk of failure to distinguish the acquisition of representational phonotactic knowledge from learning the motoric skills which are needed to produce nonnative sound sequences. Hence, incomplete development of motoric skills may be misinterpreted as a lack of development of native-like phonological representations. For this reason, L2 production studies may underestimate the development of phonological representations. For a finer-grained understanding of L2 phonological development, perception studies seem to be called for. As compared to the large number of L2 production studies, the number of studies addressing L2 phonotactic knowledge in perception is limited. Most address the influence of phonotactic contexts on segmental perception (Rochet & Rochet 1999; Harnsberger 2001; Levy & Strange 2008). In a study focusing on the role of native phonotactic constraints in nonnative listening, Weber & Cutler (2006) showed that native constraints influence the segmentation of a nonnative language even in highly advanced learners. Using a word-spotting task, Weber & Cutler found that advanced L2 learners of English whose L1 was German used native phonotactic constraints (such as the ban on */sp/ in word onsets) to locate word boundaries in spoken English. This finding suggests that native phonotactic constraints are difficult to suppress in nonnative listening. At the same time, Weber & Cutler’s study offers evidence of L2 phonotactic development. The L2 listeners used not only L1 constraints for segmenting English, but also constraints of the target language. More specifically, constraints that hold for English, not for German, such as the ban on */ʃl/ in word onsets, facilitated the spotting of English words. This suggests that advanced L2 learners acquire phonotactic constraints banning structures that are legal in their L1. 
This is a remarkable finding, since L2 learners receive no direct negative evidence that English words cannot start in */ʃl/.

Altenberg (2005) investigated well-formedness judgments, perception, and production of English consonant clusters by L1-Spanish L2 learners of English. Participants rated written nonwords in two versions, which were presented to them as new words of English and Spanish, respectively. Three types of nonwords occurred: type A contained initial clusters that are grammatical in both English and Spanish (e.g., /fl, dr, kr, bl/); type B, initial clusters that are grammatical in English but not in Spanish (e.g., /sp, sm, sn, sl/), while type C contained initial clusters that are grammatical in neither English nor Spanish (e.g., /sr, zn, dl, fn/). Native participants and L2 learners made highly similar judgments in the English version, suggesting that L2 learners had acquired native-like phonotactic knowledge. The L2 learners rated nonwords in the Spanish version according to their phonotactic well-formedness in the L1. In a subsequent perception task, no significant differences emerged between native participants and L2 learners on the orthographic identification of initial clusters in type A and B items, and hence, no evidence for transfer was found. In Altenberg’s study, the phonotactic target grammar (English) was a superset of the native grammar (Spanish). From the viewpoint of learnability, this result should not come as a surprise because learners receive abundant positive evidence for the well-formedness of /sC/ initial clusters in English. Since nonwords in the well-formedness judgment task were presented orthographically, it remains unclear to what extent well-formedness responses reflected the acquisition of clusters in orthography, rather than perception.
Hence, the question of whether L2 learners are able to acquire native-like phonotactic responses to spoken language data remains unanswered. In sum, only a few studies have addressed the role of L2 phonotactic constraints in perception and virtually none have addressed the development of L2 phonotactics on the basis of perception data or well-formedness judgments. Most studies take only the L2 final state into consideration, without considering the issue of development. Also relevant, but less directly so, are studies showing that infants and adults can learn novel phonotactic constraints from exposure to artificial languages (Onishi, Chambers & Fisher 2002; Chambers, Onishi & Fisher 2003; Saffran & Thiessen 2003).2 Target constraints in these studies are novel to learners as they rule out structures that are legal in the participants’ native language. Although these studies seldom explicitly address relations between the target patterns in the artificial language and L1 phonotactics, their results may be interpreted as offering some evidence that novel phonotactic constraints can be learned which define a phonotactic subset of the native grammar. However, since they involve individual constraints rather than phonotactic grammars, artificial language learning studies have only remote significance for the naturalistic acquisition of L2 phonotactic knowledge. Two specific scenarios of phonotactic L2 acquisition will be addressed here, which arise when the structures that are phonotactically legal under the native and target grammar are related by a proper inclusion. The target language structures are phonotactically a superset or a subset of native language structures. We will call these the superset and subset scenarios, respectively, after Berwick (1985).3 The first scenario is that of the phonotactic structures allowed by the target language being a superset of those allowed by the L1, so that the target language is phonotactically more lenient than the L1. 
More precisely, phonotactic structures that are illegal in the native language are legal in the target language, while no phonotactic structures that are legal in the native language are illegal in the target language. Studies reviewed above show that L2-learners are able to overcome the effects of a more restrictive L1 on production, and eventually suppress the phonotactic repairs that are characteristic of early stages of L2 production. This interpretation is supported by the limited amount of evidence available from well-formedness judgments (Altenberg 2005). A different interpretation arises from L2 perception, since Weber & Cutler (2006) found that even advanced learners display an influence of L1-specific phonotactic constraints on their segmentation of the nonnative language. In sum, while production studies and well-formedness judgments suggest that successful phonotactic acquisition in a superset scenario is possible, evidence from perception offers a more nuanced picture.

2 Artificial language learning studies have also revealed effects on learners’ production (Taylor & Houghton 2005) or perception (Davidson, Shaw & Adams 2007).

3 Escudero & Boersma (2002) address the subset scenario in L2 segmental acquisition, focusing on the special case of two vowels in the L2 (Spanish) being mapped onto three vowels in the L1 (Dutch).

The subset scenario, that of the target language being phonotactically more stringent than the L1, has received virtually no attention in the phonotactic acquisition literature. This scenario occurs when the native grammar allows a set of legal structures (for example, consonant clusters in syllable onsets), which are illegal in the target grammar, while structures that are legal in the target grammar are also legal in the native grammar.
Hence, the set of phonotactic structures that is allowed by the target language is a subset of those allowed by the L1. It should be noted a priori that the acquisition of subset phonotactics may have fewer observable consequences in production and perception than the acquisition of superset phonotactics. On the production side, there will usually be no overt behavioral evidence showing that learners have developed a target grammar that is more stringent than their L1, because the surplus of L1 structures will not impede their L2 production. This invisibility problem has a counterpart on the perception side, where L1-based ability to perceive a superset of structures should have few if any detrimental effects on L2 perception. In sum, the effects of a subset scenario on L2 production and perception may appear to be marginal, and on top of that, difficult to observe; hence it should not come as a surprise that few studies have addressed them. Nevertheless, Weber & Cutler (2006) show that advanced L2 learners segment the target language using phonotactic constraints which hold for the target language, but not the L1. Phonotactically, German and English do not stand in a subset-superset relation in the specific sense previously defined, and hence these results cannot be interpreted as showing that L2 phonotactic acquisition occurs in a subset scenario. Hence, the issue of whether L2 learners are able to acquire a phonotactic subset grammar is still open. This issue is interesting from a learnability viewpoint, as will be argued later.

Against this background, we can now state the following research questions:

Q1a. Can L2 learners acquire phonotactic knowledge under a superset scenario, that is, in case the target grammar defines a superset of the phonotactic structures which are legal in the L1?
Q1b. Do superset learners show development, such that advanced learners possess more native-like phonotactic knowledge of the target language than beginning learners?
Q2a. Can L2 learners acquire phonotactic knowledge under a subset scenario, that is, in case the target grammar defines a proper subset of the phonotactic structures which are legal in the L1?
Q2b. Do subset learners show development, such that advanced learners possess more native-like phonotactic knowledge of the target language than beginning learners?

In line with many studies on L2 phonology, we make the important assumption of transfer, which is supported by a wide range of evidence from production and perception, as previously reviewed. We adopt a grammatical interpretation, such that the initial state of the L2 grammar equals the final state of the L1 grammar (Broselow, Chen & Wang 1998; Escudero 2005).

Given the assumption of transfer, specific hypotheses about the L2 acquisition of phonotactics in the superset and subset scenarios can be derived from the theory on learnability of grammars. The subset problem (Baker 1979; Angluin 1980) states that a learner who adopts a superset grammar, which allows a superset of structures allowed by the target grammar, will be unable to return to a more restrictive grammar unless corrected by negative evidence about the target language. Since it is assumed that negative evidence is not available to learners, or at least rarely so under naturalistic conditions of language acquisition, the subset problem is interpreted in the learnability literature as implying that “the misstep of choosing a superset grammar makes the subset grammar unreachable (from positive evidence)” (Prince & Tesar 2004).4 When one adopts the additional assumption that phonotactic knowledge is grammatical in nature (like syntactic or semantic knowledge), the subset problem naturally extends to phonotactic acquisition, as has been explicitly argued in the phonological learnability literature (Smolensky 1996; Prince & Tesar 2004; Hayes 2004).
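The logic of the subset problem can be illustrated with a deliberately simplified sketch. It assumes, for illustration only, that a grammar is nothing more than the set of clusters it licenses (rather than a ranking of constraints), and that the learner changes its grammar only when an observed form is not yet licensed; the cluster sets and the function name `learn_from_positive_evidence` are hypothetical and not taken from the study.

```python
def learn_from_positive_evidence(initial_grammar, observed_forms):
    """Widen the grammar whenever an observed form is not yet licensed.

    Positive evidence can only add structures; nothing in the input ever
    signals that a licensed structure should be removed.
    """
    grammar = set(initial_grammar)
    for form in observed_forms:
        grammar.add(form)
    return grammar


# Toy target grammar and two toy initial states (assumed for illustration).
TARGET = {"bl", "sp", "kn"}
L1_SUBSET_START = {"bl"}                            # stricter than target
L1_SUPERSET_START = {"bl", "sp", "kn", "zd", "kt"}  # more lenient

# The learner only ever hears forms that are legal in the target language.
observed = ["sp", "bl", "kn", "sp"]

from_subset = learn_from_positive_evidence(L1_SUBSET_START, observed)
from_superset = learn_from_positive_evidence(L1_SUPERSET_START, observed)

print(from_subset == TARGET)            # True: the target is reached
print(sorted(from_superset - TARGET))   # ['kt', 'zd']: the surplus remains
```

Under these assumptions the learner starting from the more lenient state never retreats, because no target-language input contradicts the surplus structures.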
The particular case that we address, of an L2 initial state that happens to constitute a superset of the target grammar, should, likewise, be an unsurmountable obstacle to the learner. Hence, we can state this as our first hypothesis:

H1 L2 learners can attain the target phonotactic grammar when starting from an L1 subset initial state, but not from an L1 superset initial state.

Our second hypothesis spells out the consequences of successful L2 phonotactic acquisition in terms of development:

H2 Successful L2 acquisition of phonotactics should be subject to development between the initial and final states of the grammar; consequently, L2 learners should become more native-like in their phonotactic responses.

Answers to our research questions can be predicted from our two hypotheses as follows. Superset learners face the task of acquiring a target grammar which allows a superset of the phonotactic structures of their native language. These learners receive positive evidence in the form of words containing L1-illegal structures. Positive evidence allows superset learners to adjust their initial state by relaxing or demoting the relevant phonotactic constraints, moving their grammar closer to the target grammar. Successful acquisition of the target grammar should eventually occur under these circumstances; hence, Hypothesis 1 predicts that question 1a should be answered positively. Likewise, question 1b should be positively answered as Hypothesis 2 states that phonotactic development should occur. In contrast, subset learners face the task of acquiring a target grammar that allows a subset of the phonotactic structures of their native language. These learners will receive no negative evidence about the ill-formedness of phonotactic structures that are illegal in the target language.
Due to a lack of relevant input, these learners should not be able to adjust their initial state and hence, acquisition of the target grammar is expected to be impossible, rendering the answer to question 2a negative. Since there is no evidence available, subset learners should not get closer to the L2 grammar. So question 2b should be negatively answered as well.

In order to test our hypotheses, we set up a study with L2 learners of Dutch. As section 2 will show, consonant clusters in Dutch word onsets and codas form a proper subset of those of Russian, and hence, Russian L2 learners of Dutch phonotactics face a subset scenario. Since consonant clusters that are legal in word onsets and codas in Dutch form a superset of those in Spanish, it follows that Spanish L2 learners of Dutch phonotactics face the superset scenario.

4 In response to the subset problem, the Subset Principle was proposed by Berwick (1985) stating that if the learner is faced with a choice between a set of different grammars that all account for the input data seen thus far, the learner should always adopt the most restrictive grammar (i.e., the subset grammar), which is the one that is most easily falsified by positive input data.

Our study includes two groups of L2 learners of Dutch: superset learners with Spanish as their L1 and subset learners with Russian as their L1, plus a control group of Dutch native speakers. Each of the two learner groups is divided into two subgroups of advanced and beginning learners. Proficiency levels were included for the two language groups for different reasons. In the case of L1-Spanish superset learners, for whom successful phonotactic acquisition is predicted, a comparison of proficiency groups allows us to test the case for phonotactic acquisition through evidence from phonotactic development.
If development occurs as predicted, this might rule out alternative accounts of native-like responses of L2 learners that might be based on an overlap between the final states of the phonotactic grammars of Spanish and Dutch. For Russian subset learners, for whom we predict no acquisition, proficiency groups were included only as a consistency check on the data: no phonotactic differences should occur between low-proficient and high-proficient learners. If we were to find that L1-Russian learners show no native-like responses, this would not suffice to rule out that they were acquiring Dutch phonotactics as it might be the case that development takes place, albeit slowly. The data from participants of two proficiency levels may serve to monitor development. Finally, it deserves mentioning that this study was not designed to compare the phonotactic acquisition of subset and superset learners directly but only to test predictions about the phonotactic acquisition for each group of learners separately. A comparative aim would be more ambitious, but also necessitate controls of proficiency levels between participants in the subset and superset conditions, which did not occur in the present study. We selected two methods to test the predictions. First, we elicited word-likeness judgments, in which participants rated spoken nonwords with varying degrees of phonotactic legality on a seven-point scale. Second, we included a lexical decision task, measuring response latencies and error rates for nonwords. This task reflects online lexical processing, and for this reason it has the advantage of being less vulnerable to being dominated by (semi)-conscious response strategies that participants might develop than the classical word-likeness task, which is essentially a meta-linguistic assessment. This article is organized as follows. 
Section 2 contains a detailed overview of the word margin (onset and coda) consonant cluster phonotactics of the three languages that figure in this study: Dutch, Russian, and Spanish. This section has the main goal of demonstrating that with respect to consonant clusters in syllable margins, the three languages are in subset-superset relations, with Spanish margins being a subset of the margins in the other two languages, and Dutch margins being a subset of Russian margins. Section 3 presents the results of an experiment in which word-likeness ratings of nonwords were elicited from Dutch native listeners, as well as from (more and less advanced) L1-Russian and L1-Spanish second language learners of Dutch. Section 4 presents a lexical decision experiment in which nonwords (now mixed with real words) were presented to the same five groups of participants. Section 5 contains a general discussion of the results.

2. WORD MARGINS IN DUTCH, RUSSIAN, AND SPANISH

This section has two goals: to present the spectrum of consonant clusters allowed in word initial and final position in Dutch, and to show that these form a subset of Russian and a superset of Spanish.

2.1. Dutch

The Dutch syllable conforms to a basic CCVCC template, in which the onset and coda each consist of maximally two consonants (Trommelen 1983). In word initial and final position, the positions that we focus on, coronal obstruents can be appended, creating initial clusters of maximally three consonants (sCC), and final clusters of maximally four consonants (CCst). Because of their exceptional distribution, initial /s/ and final coronal obstruents have been analyzed as extrasyllabic, licensed by an appendix to the word (Trommelen 1983).
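The size limits of the template can be written down as a small check. This is a sketch under stated assumptions, not an analysis from the study: clusters are given as lists of segment symbols, the word-final appendix is assumed to hold at most two segments drawn from {t, s}, and only cluster size is checked (not sonority or co-occurrence restrictions). The names `initial_cluster_fits`, `final_cluster_fits`, and `CORONAL_APPENDIX` are hypothetical.

```python
# Assumption for this sketch: the word-final appendix consists of at most
# two coronal obstruents, represented here by the symbols "t" and "s".
CORONAL_APPENDIX = {"t", "s"}
MAX_APPENDED = 2
MAX_CORE = 2  # the onset and the coda proper each hold at most two consonants


def initial_cluster_fits(cluster):
    """Word-initial clusters: at most CC, or sCC with /s/ as an appendix."""
    if len(cluster) <= MAX_CORE:
        return True
    return len(cluster) == MAX_CORE + 1 and cluster[0] == "s"


def final_cluster_fits(cluster):
    """Word-final clusters: at most CC plus up to two appended coronals."""
    core = list(cluster)
    appended = 0
    # Peel appended coronal obstruents off the right edge, but only as
    # long as the remaining cluster is still too large for the coda.
    while (len(core) > MAX_CORE and appended < MAX_APPENDED
           and core[-1] in CORONAL_APPENDIX):
        core.pop()
        appended += 1
    return len(core) <= MAX_CORE


print(initial_cluster_fits(["s", "t", "r"]))     # sCC shape
print(initial_cluster_fits(["k", "s", "t"]))     # three consonants, no /s/ first
print(final_cluster_fits(["r", "f", "s", "t"]))  # CCst shape
print(final_cluster_fits(["l", "m", "p"]))       # three consonants, no appendix
```

Treating the appendix separately from the syllable core mirrors the extrasyllabic analysis cited above: the core limit stays at two consonants, and the extra length comes only from the word edges.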
Table 1 lists two-consonantal word onset clusters occurring in the Dutch CELEX lemmas database (Baayen, Piepenbrock & Gulikers 1995).[5] Marginal clusters (with type frequency below 25)[6] occur in italics.[7] Two-consonantal word onsets consist of an obstruent followed by a sonorant or, marginally, by another obstruent. In terms of their sonority profiles, word onset clusters are rising or level, not falling; a single apparent counterexample, /wr/ (CELEX transcription), is commonly pronounced as /vr/.[8] Sonorant-initial word onsets (nasal-glide and liquid-glide) are marginal. Among the obstruent-sonorant clusters, those having a liquid in second position are numerous (18 attested clusters altogether), and most have high type frequencies. Nevertheless, coronal obstruents (/t, d, s, z, ʃ, ʒ/) before liquids are severely restricted by OCP-COR (*/tl, dl, zl, ʒl; sr, ʃr, zr, ʒr/, with /tr, dr/ being positive exceptions).[9] Obstruent-glide clusters are equally numerous but most are marginal at best, partly due to OCP-LAB (/pw, bw, fw, vw/). Obstruent-nasal clusters are marginal except /kn, sn, sm/. (The latter two once more show word initial /s/ as an escape hatch.) Obstruent-nasal clusters are strongly restricted by OCP-COR (*/tn, dn, zn, ʒn/, with /sn/ as a true positive exception and /ʃn/ marginal) and by OCP-LAB (*/pm, bm, fm, vm/).

Word onsets consisting of obstruents are marginal, with a major exception: word-initial /s/ freely combines with voiceless nonsibilants (/sp, st, sk, sx, sf/), supporting its analysis as a word appendix (Trommelen 1983). Table 1 reveals three restrictions on obstruent clusters which will become relevant in the comparison with Russian: (i) Such clusters are voiceless throughout (Zonneveld 1983; e.g., */zb, zd, dz/; for /dʒ ~ dj/, see footnote 7); (ii) No all-fricative clusters (e.g., */fs, fx/; with once more, /s/ being exceptional in /sf, sx/); (iii) No all-plosive clusters (e.g., */tk, kt/).[10]

Word-initial clusters of three consonants uniformly start with /s/ followed by a legal obstruent plus liquid/glide cluster (e.g., /spl, spr, str, sxr/, plus marginal /skl, skr, skw, stj/).[11] The fact that ternary clusters all start with /s/ accords with its status as an appendix.

TABLE 1
Dutch Two-Consonantal Word Onset Clusters
Note: Clusters printed in italics have type frequencies N < 25. The shaded areas indicate manner and voicing combinations which are generally unattested.

[5] Type frequencies are based on the CELEX DPL (Dutch Phonology Lemma) file, which contains 124,136 lemmas. A sublexicon was created of 69,245 types, by eliminating all lemmas with null frequency and collapsing all homophones.
[6] A cut-off point of 25 for marginal status was chosen because it forms a natural division in the frequency distribution in the database. All onset clusters treated as marginal by this criterion are experienced as ‘foreign’ by native speakers. Moreover, marginal clusters tend to occur in low-frequency words: their average type-to-token ratio is much lower than that of nonmarginal clusters (46.5, s.d. 72, versus 215.0, s.d. 111).
[7] The velar nasal, which generally cannot occur in Dutch onsets, is not represented in Table 1. Three phonotactically legal clusters of the type coronal-obstruent-plus-/j/ were added: /tj, dj, sj/, which occur in free variation with /tʃ, dʒ, ʃ/, respectively. However, only the latter realizations occur in CELEX. Examples are /tj ~ tʃ/ tjilpen ‘to chirp’, /dj ~ dʒ/ djati ‘jati wood’, /sj ~ ʃ/ sjaal ‘shawl’.
[8] For example, wrat ‘wart’ and vrat ‘fed’ are commonly neutralized. In most dialects of Dutch, /w/ is realized as a labio-dental approximant [ʋ] in word onset position.
[9] Two exceptions with /sr/, not represented in CELEX, are Sri Lanka and Sranan tongo (the Creole language spoken in Surinam).
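The counting procedure described in footnotes 5 and 6 (drop zero-frequency lemmas, collapse homophones, count cluster types, apply the cut-off of 25) can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy lexicon, the one-character-per-segment transcription, and all function names are hypothetical and are not taken from CELEX or from the authors' own scripts.

```python
from collections import Counter

# Hypothetical toy lexicon of (transcription, corpus frequency) pairs.
# Illustrative only; these are not CELEX entries.
TOY_LEMMAS = [
    ("strat", 120), ("strat", 15),   # homophones, collapsed into one type
    ("stop", 300), ("stal", 42),
    ("knal", 9), ("snor", 4), ("plan", 77),
    ("fnuk", 0),                     # zero frequency, eliminated
]

VOWELS = set("aeiou")                # assumption: one character per segment
MARGINAL_THRESHOLD = 25              # cut-off reported in footnote 6


def word_onset(transcription):
    """Return the consonants that precede the first vowel."""
    for i, segment in enumerate(transcription):
        if segment in VOWELS:
            return transcription[:i]
    return transcription


def onset_type_frequencies(lemmas):
    """Type frequency of each onset cluster of two or more consonants."""
    # Eliminate zero-frequency lemmas and collapse homophones into types.
    types = {form for form, freq in lemmas if freq > 0}
    counts = Counter(word_onset(form) for form in types)
    return {cluster: n for cluster, n in counts.items() if len(cluster) >= 2}


def is_marginal(type_frequency, threshold=MARGINAL_THRESHOLD):
    """A cluster counts as marginal when its type frequency is below the cut-off."""
    return type_frequency < threshold


frequencies = onset_type_frequencies(TOY_LEMMAS)
```

With a real lexicon, `is_marginal(frequencies[cluster])` would correspond to the italicized cells of Table 1; in the toy data every cluster is trivially marginal because the lexicon is tiny.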
Table 2 lists two-consonantal word codas that occur in the CELEX lemmas database. Due to final devoicing, no voiced obstruents are represented here. Again, marginal clusters with type frequencies below 25 are italicized.[12] The sonority profile of word codas is falling (sonorant-obstruent, liquid-nasal, glide-nasal) or level (obstruent-obstruent).

TABLE 2
Dutch Two-Consonantal Word Coda Clusters
Note: Clusters printed in italics have type frequencies N < 25. The shaded areas indicate manner and voicing combinations which are generally unattested.

[10] An isolated exception, not in CELEX, is /pt/ pterodactylus ‘pterodactyl.’ On the basis of the weak evidence from the remaining obstruent clusters, a further restriction seems to hold: the right-hand obstruent must be a coronal (e.g., */fp, fk, xp, xk, pf, px, kf, kx/), where once again, initial /s/ is exceptional (/sp, sk, sf, sx/).
[11] Word onsets /sfl, sfr, sxl, skw, stw/ are unattested, and presumably accidental gaps.
[12] The average type-to-token ratio for marginal word coda clusters falls well below that of nonmarginal ones (151.8 versus 446.6).

A small number of rising sonority clusters in shaded areas occur in CELEX transcriptions but are nevertheless phonotactically illegal. These clusters are subject to repairs, such as obligatory schwa epenthesis in /tl/ (axolotl) and /rl/ (Karl), while /rw/ (murw) is commonly pronounced as [rf]. Sonorant-obstruent word codas occur in most logically possible combinations. Among these, liquid-obstruent clusters are virtually unconstrained; /lʃ/ must be an accidental gap.[13] Liquid plus noncoronal obstruent codas are optionally broken up by schwa epenthesis; yet the fact that epenthesis is only optional supports their phonotactic legality.
Nasal-obstruent clusters are homorganic (*/mk, mx, np, nk, nf, nx, ŋp, ŋf/) except that word-finally, a coronal obstruent can follow any nasal (e.g., /mt, ms, mʃ, ŋs/). The unattested clusters /ŋt, ŋʃ/ are accidental gaps since /ŋt/ is freely derived by affixation of the 3.sg.pres. suffix /t/ to /ŋ/-final verbs. The free distribution of coronal obstruents, occurring after virtually any consonant, has earned them the status of word appendix (Trommelen 1983; Kager & Zonneveld 1986). Combinations of glides and consonants are mostly marginal. All except two (/jt, js/) occur only in loanwords (/jp/ hype, /jk/ spike, /jf/ live, /jm/ time, /jn/ online, /jl/ file). Obstruent clusters are more severely restricted, somewhat similarly to word onsets, and contain at least one coronal member (except /pf/ in German loanwords), while sibilant clusters are ruled out. Sonorant-sonorant clusters are restricted to liquid-nasal (/rm, rn, lm/, excluding */ln/ by OCP-COR) and a few more occurring only in loanwords (see above). The velar nasal never occurs as a second element in clusters.

[13] A potential loanword such as Welsh (the name of the language) fails to undergo phonotactic repairs.

Word-final clusters of three consonants uniformly end in a coronal obstruent, again supporting a word appendix analysis. All 60 attested ternary clusters combine a legal binary cluster with /t/ or /s/ (e.g., /nst, nts, rst, rts, rmt, rkt, ŋkt, kst, tst/). A coronal cluster /st/ can be appended to a binary cluster, which produces (14 different) maximal word offsets of four consonants (e.g., /ntst, rtst, xtst, rnst/).

2.2. Russian

The goal of this section is to show that Russian word onsets and codas are a superset of Dutch. Hence, we do not aim at an exhaustive overview of Russian (three and four consonant) clusters.
Counts are based on the large Uppsala corpus (Lönngren 1993), from which a phonetically transcribed lexicon of 32,459 words was created. The data were checked against secondary sources (Kucera & Monroe 1968; Kempgen 1995;[14] Chew 2000; Scheer 2000; Ostapenko 2005). The demonstration that Russian word margins constitute a superset of Dutch has two parts. Here we show that every Dutch word margin cluster has a Russian counterpart. In Section 2.4, we will show that the inclusion relation also holds at the level of major class features, place of articulation, and voicing. When determining the Russian counterparts of Dutch consonants, we ignored palatalization, which is contrastive in Russian but not in Dutch. For example, /p′j/ and /pj/ were both assumed to be counterparts of Dutch /pj/. Furthermore, we judged Russian /v/ to be phonetically closer to Dutch /v/ than to /w/. We included /ts/ and /tʃ/ as consonant clusters due to their cluster status in Dutch, although these are single segments (affricates) in Russian.

Russian word onsets of two consonants consist of any combination of major classes (Table 3). Clusters printed in boldface have Dutch counterparts. Clusters appearing in italics have type frequencies below 5.[15] All clusters attested in Dutch have Russian counterparts, with the single exception of /fn/, which is marginal in Dutch and presumably just an accidental gap in Russian given that /vn/ and /ft/ are fully legal. All logically possible combinations of manners occur in Russian word onsets, including clusters of falling sonority, except that the glide /j/ is unattested in initial position. The set of obstruent-sonorant clusters is relatively similar to Dutch.
The main difference is that, as compared to Dutch, Russian relaxes OCP-COR, as evidenced by its wide range of coronal clusters (/dn, ʃn, zn, ʒn, ln; tl, dl, ʃl, zl, ʒl, rl; sr, ʃr, zr, ʒr, nr/).[16] Apparently, like Dutch, Russian restricts labial clusters by OCP-LAB (*/pm, bm, fm/), but exceptions (e.g., /vm/) occur by prefixation of /v/. Russian possesses a large number of double obstruent onsets, subject to assimilation of voicing, which excludes shaded cells in Table 3. As compared to Dutch, Russian allows voiced obstruent clusters (e.g., /vb, vd, zb, zd, bd, dv, dz/), as well as clusters of mixed voicing ending in /v/ (e.g., /tv, kv, sv, xv/), due to voicing assimilation properties of Russian /v/ (Hayes 1984).

TABLE 3
Russian Two-Consonantal Word Onsets Compared to Dutch
Note: Palatal and nonpalatal consonants have been collapsed. Clusters printed in boldface have Dutch counterparts. Shaded areas exclude manner and voicing combinations which are generally unattested, as well as any combinations with /w/, which is not a phoneme of Russian. Clusters printed in italics have type frequencies N < 5 in a lexicon based on the Uppsala Corpus.

[14] Kempgen’s (1995) data were based on two Russian dictionaries: Orfograficeskij Slovar′ (Barchudarova, Ožegova & Šapiro 1967), and Obratnyj Slovar′ (Ševeleva 1974).
[15] A lower threshold value for marginality was chosen than for Dutch because the Russian lexicon was considerably smaller than the CELEX database for Dutch. The average type-to-token ratio for marginal word onset clusters falls well below the ratio of nonmarginal ones (7.8 versus 18.9).
[16] Some examples are /tl′et/ ‘decay’, /dl′ina/ ‘length’, /zl′it/ ‘to anger’, /znat′/ ‘know’.
Moreover, fricative clusters (e.g., /fx, ʃx, zv, vz/) and plosive clusters (e.g., /tk, kt/) are more freely allowed than in Dutch.[17] Sonorant-initial word onset clusters, including clusters of falling sonority, are numerous in Russian, whereas they are illegal in Dutch.[18] Mild effects of sonority sequencing occur, but without causing phonotactic illegality: sonorant-obstruent clusters are numerous despite having marginal type frequencies, and sonorant-sonorant clusters of rising sonority (specifically nasal-liquid and nasal-glide) are well attested.

[17] Examples of words with voiced clusters are /dver′/ ‘door’, /vd′es′/ ‘here’, /zbor/ ‘collection’, /zduru/ ‘stupid, daft’; fricative-fricative clusters: /fxodit′/ ‘enter’, /xvalit′/ ‘praise’, /vzamen/ ‘instead’, /zvezda/ ‘star’, /ʒvatʃka/ ‘chewing gum’; plosive-plosive clusters: /kto/ ‘who’, /tkan′/ ‘fabric, tissue’.
[18] Examples of sonorant-initial clusters are /mlatʃij/ ‘younger’, /rtut′/ ‘mercury’, /lba/ ‘forehead (gen.sg.)’, /mnogo/ ‘much’.

TABLE 4
Russian Two-Consonantal Word Codas Compared to Dutch
Note: Palatal and nonpalatal consonants have been collapsed. Clusters printed in boldface have Dutch counterparts. Shaded areas exclude manner and voicing combinations which are generally unattested, as well as any combinations with /w/ and /ŋ/, which are not phonemes of Russian. Clusters printed in italics have type frequencies N < 5 in a lexicon based on the Uppsala Corpus.

Russian allows all ternary word onsets that are legal in Dutch (/s/ plus a legal obstruent-liquid cluster), in addition to large numbers of ternary onsets starting with other consonants
(e.g., /fkl, fpr, ftr, fsp, fst, fsk; vzb, vzd, vzm, vzn, vzl, vzr, vzv, vgl, vbr, vsk; zbl, zbr, zdr, zdv, zgl, zgn, zgr; ʃtr, mst, mgl/).[19] Russian allows word onsets of four consonants, uniformly starting with /f, v/ followed by a fricative+obstruent+liquid triplet (e.g., /fstr, fskr, fspr, fspl, fsxl, vzbr, vzdr, vzgr, vzgl/).[20]

Turning to word coda clusters, it is evident that Russian once again forms a phonotactic superset of Dutch, placing no categorical bans on any combination of manners (Table 4).[21]

[19] Examples are /fpravo/ ‘to the right’, /fsp′at′/ ‘back’, /fstavat′/ ‘get up’, /vzriv/ ‘explosion’, /vrdug/ ‘suddenly’, /vznos/ ‘payment’, /vzlom/ ‘breaking in’, /kstat′/ ‘to the point’, /sklat/ ‘store, stock’, /zdrav/ ‘sound’, /mstit′/ ‘to revenge one self’, /mgla/ ‘haze’.
[20] Examples are /vzbros/ ‘upthrow’, /fspl′esk/ ‘splash’, /vzgl′at/ ‘glance’.
[21] The average type-to-token ratio for marginal word coda clusters falls well below the ratio of nonmarginal ones (2.8 versus 8.5).

In sonorant-obstruent clusters, Russian is maximally similar to Dutch. Liquid-obstruent clusters have a single omission /lx/, probably an accidental gap because /rx/ and /lk/ are both attested. Nasal-obstruent clusters lack the velar nasal, which is not a phoneme of Russian. For obstruent-obstruent coda clusters, Russian is highly similar to Dutch: clusters are voiceless, while most clusters end in coronals /s, t/. There are a number of (apparent) omissions, however. /ts, tʃ/ correspond to (single-phoneme) affricates in Russian, while /pf, fs, sp, xs/ are missing. However, it should be noted that all four omissions are only marginal in Dutch, while presumably the clusters are accidental gaps in Russian, given the occurrence of clusters in neighboring cells (e.g., /tf, fx, sk/). To balance affairs, Russian possesses clusters that are disallowed in Dutch (/tf, kx, fk, fx/).
Russian allows a fair number of sonorant-sonorant word coda clusters. Although most of these are marginal, the attested clusters form a superset of Dutch, most notably by including codas of rising sonority (nasal-liquid) and level sonority (nasal-nasal, liquid-liquid). Rising-sonority word codas occur in the form of obstruent-sonorant clusters (of all logically possible types: plosive-nasal, plosive-liquid, fricative-nasal, fricative-liquid), none of which are legal in Dutch. Just as in word onsets, OCP-COR is relaxed (e.g., /dn, sn, sl, zn, zl/), but OCP-LAB is not (*/pm, bm, fm/, plus marginal /vm/).

Russian allows ternary word codas of any type that is legal in Dutch (existing binary obstruent-liquid cluster plus /s/ or /t/), in addition to ternary codas ending in other consonants (e.g., /stf, rtf; tsk, fsk, nsk, rsk/). No sonority restrictions hold in word codas, as clusters of rising sonority are legal (e.g., /str, ktr, ntr, ndr/). Word codas of four consonants occur, most ending in /tf/ (e.g., /rstf, jstf, tstf, nstf, pstf, mstf/).

2.3. Spanish

The goal of this brief section is to show that Spanish word onsets and word codas form a proper subset of Dutch. We will not provide an exhaustive description of Spanish syllable structure (for a discussion, see Harris 1983; Hualde 1991; Quilis & Fernández 1992), but focus on word margin consonant clusters in comparison to Dutch.

Spanish word onset clusters are maximally binary, and uniformly of the type obstruent-liquid (Table 5). Only 12 clusters are attested. Much as in Dutch, OCP-COR restricts clusters (*/tl, dl, sl, sr/; exceptions /tr, dr/).[22] In contrast to Dutch, Spanish generally disallows obstruent-nasal word onsets, as well as obstruent-obstruent onsets. Moreover, /s/-consonant clusters in loanwords are repaired by /e/-prothesis (for example, slalom [eslalon], smoking [esmokin], snob [esnob], stereo [estereo], stress [estres]), documented for L1-Spanish L2 learners of English by Carlisle (1998).
The status of /xr, xl/ is unclear.[23]

TABLE 5
Spanish Binary Obstruent-Sonorant Word Onset Clusters
Note: Shaded areas indicate manner combinations which are generally unattested.

[22] Coronal onset clusters of the languages differ minimally, as Spanish lacks /sl/. Moreover, /tl/ is allowed in Mexican Spanish, e.g. tlapería ‘paint/hardware store.’
[23] Harris (1983) claims that these are accidental gaps, offering an example Jruschev ‘Khrushchev’, which is disputed by Pensado (1985).

Our assumption that Spanish lacks consonant-glide onsets is based on distributional arguments (Harris 1983; Hualde 1991; Quilis & Fernández 1992) that post-consonantal prevocalic glides are part of the nucleus, members of a set of rising diphthongs /wa, we, wo, wi, ja, je, jo, ju/. The alternative, to analyze glides as parts of complex onsets, meets with a number of problems. First, diphthongization, as seen in the alternation of /e/ ~ /je/ and /o/ ~ /we/ (e.g., sierra ~ serrano, buen ~ bondad), becomes a process that involves the syllable (onset + nucleus), rather than strictly involving the nucleus. Second, diphthongization occurs after liquids (e.g., ruego ~ rogar), resulting in liquid-glide clusters which would violate the minimal sonority distance requirements holding for Spanish onsets, as motivated by the ill-formedness of both obstruent-nasal and clusters. Third, sonorant-initial onset clusters are generally illegal but sonorant-glide clusters (nasal-glide, liquid-glide) would be exceptional. Fourth, /s/-glide clusters (/sw/ suerte, /sj/ sierra) would become exceptions to the otherwise general prohibition against /s/-consonant clusters.[24] Fifth, glides following clusters (e.g., prieto [prj], triunfo [trj]) would imply ternary clusters, which are otherwise excluded.
Sixth, postconsonantal prevocalic glides render the syllable heavy, which can only be explained from a nuclear analysis (Harris 1983).

Spanish codas are highly restricted. In singleton word codas, only /d, s, n, l, r/ occur in native words, and /b, g/ in loanwords (e.g., club, bistec).[25] Complex word codas only occur in loanwords, and uniformly have the structure consonant-/s/ (e.g., /ps, ks, ns, ls/; Harris 1983; Hualde 1991).[26] All attested two-consonantal codas occur in Dutch.

[24] A possible argument against a nuclear analysis of obstruent-glide sequences could be that it is difficult to imagine what would rule them out as complex onsets, as they are the best of all clusters in terms of their sonority distance, which is maximal. However, sources such as Greenberg (1965) present no implicational universal such that the presence of obstruent-liquid clusters in a given language implies the presence of obstruent-glide clusters in that language.
[25] /d/ is realized as a continuant [ð] in coda, where it is easily devoiced to [θ], thus undergoing neutralization of voicing and continuancy. Similar neutralizations apply to /b/ (club) and /g/ (bistec).
[26] Examples are bíceps, tórax, Máyans, vals.

2.4. Proper Inclusions Among the Three Languages

Proper inclusions among the word margins of the three languages will now be represented at the level of natural classes: sonority, place of articulation, and voicing (for obstruent clusters). When representing the word onset clusters in terms of sonority classes, Spanish is the most restricted language of the three, allowing only obstruent-plus-liquid clusters (Figure 1). Dutch is more lenient, adding to the Spanish-legal set three more types of binary onsets: obstruent-obstruent, obstruent-nasal, and obstruent-glide. Russian is least restricted of all, adding to the Dutch-legal set nasal-consonant and liquid-consonant clusters.
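The nesting of sonority classes just described, and its consequence for the kind of evidence available to each learner group, can be made concrete with a small sketch. It uses only the class labels given in the text; the variable and function names are illustrative assumptions, and the set-difference view of "positive evidence" is a simplification of the learnability argument rather than the authors' formal model.

```python
# Sonority classes of binary word onsets, as summarized in the text.
SPANISH = frozenset({"obstruent-liquid"})
DUTCH = SPANISH | {"obstruent-obstruent", "obstruent-nasal", "obstruent-glide"}
RUSSIAN = DUTCH | {"nasal-consonant", "liquid-consonant"}


def is_proper_subset_chain(*inventories):
    """True if each inventory is a proper subset of the next one."""
    return all(a < b for a, b in zip(inventories, inventories[1:]))


def positive_evidence(l1, target):
    """Structures the learner can hear in the target that the L1 disallows."""
    return target - l1


def needs_negative_evidence(l1, target):
    """Structures the L1 allows that the target bans (never heard, never refuted)."""
    return l1 - target


chain_holds = is_proper_subset_chain(SPANISH, DUTCH, RUSSIAN)
spanish_learner = positive_evidence(SPANISH, DUTCH)
russian_learner = positive_evidence(RUSSIAN, DUTCH)
```

Under these assumptions the L1-Spanish learner of Dutch encounters three onset types that contradict the initial state, whereas the L1-Russian learner encounters none and would have to unlearn the two sonorant-initial types without direct counterevidence.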
In terms of restrictions on voicing in obstruent clusters in word onset, Spanish and Dutch are equally restrictive (Figure 2). Dutch satisfies agreement of voice as well as a ban against voiced obstruent clusters. Spanish allows no obstruent clusters, satisfying both constraints vacuously. Russian allows two more cluster types: voiced clusters, as well as mixed voiceless-voiced clusters, the latter arising as a consequence of the properties of /v/ in voicing assimilation.

For interactions of place of articulation in word onset clusters, the three languages differ mainly in their satisfaction of OCP-PLACE (Figure 3). Spanish word onsets respect OCP-COR (with exceptions); labials never occur in second position of a cluster. Dutch satisfies OCP-LAB and OCP-COR (the latter with exceptions). Russian satisfies OCP-LAB (with exceptions), but not OCP-COR.

FIGURE 1 Subset relations in sonority structure of binary word onset clusters in three languages: Spanish ⊂ Dutch ⊂ Russian.
FIGURE 2 Subset relations in voicing structure of binary obstruent clusters in word onset in three languages: Spanish ⊂ Dutch ⊂ Russian. (Spanish has no obstruent clusters in word onset.)
FIGURE 3 Subset relations in place of articulation structure of binary word onset clusters in three languages: Spanish ⊂ Dutch ⊂ Russian.

Ternary word onset clusters are illegal in Spanish. Dutch allows only ternary clusters beginning with /s/. Russian allows ternary clusters starting with other consonants than /s/, as well as quaternary clusters of which the initial consonant is /f, v/ (Figure 4).

Turning to word coda clusters, sonority structure once again shows a subset relation between the three languages (Figure 5). Russian allows all logical possibilities (with the exclusion of consonant-glide). Dutch allows only falling sonority clusters (sonorant-obstruent, liquid-nasal,
glide-nasal, glide-liquid) in addition to obstruent-obstruent clusters. Spanish has virtually no word coda clusters (which occur in loanwords only), but the ones that occur always end in an obstruent (more specifically, /s/).

FIGURE 4 Subset relations in word onset clusters of two, three, and four consonants in three languages: Spanish ⊂ Dutch ⊂ Russian.
FIGURE 5 Subset relations in sonority structure of binary word coda clusters in three languages: Spanish ⊂ Dutch ⊂ Russian.

The three languages are highly similar in the interactions of voicing in obstruent clusters in the word coda. Apart from the fact that such clusters are marginal in Spanish but frequent in Dutch and Russian, all three languages neutralize the contrast to voiceless in word-final obstruent clusters. This is the only case in which no subset relation holds between the languages. Finally, for place of articulation in word coda clusters, Spanish only allows /s/ in second position, possibly violating OCP-COR in loanwords (/ns, ls, rs/), but vacuously respecting OCP-LAB, whereas Dutch and Russian both disrespect OCP-LAB and OCP-COR (Figure 6).

FIGURE 6 Subset relations in place of articulation structure of binary word coda clusters in three languages: Spanish ⊂ Dutch, Russian.

In sum, this overview of the phonotactic possibilities in word margins at the level of natural classes has established an overall subset structure between the three languages: Spanish ⊂ Dutch ⊂ Russian for sonority, voicing, and place of articulation. In two specific cases, no strict subsets occur: Dutch and Russian match in terms of the place of articulation structure of word codas, while all three languages match in terms of voicing structure of word codas.
Nevertheless, Spanish word margins never form a superset of Dutch and/or Russian margins, and Dutch word margins never form a superset of Russian margins. Hence, the overall subset structure between the three languages is not violated by these cases.

2.5. Summary and Predictions

In this section, it was established that in terms of the consonant clusters allowed in the word onset and coda, Russian qualifies as a superset of Dutch, and Spanish as a subset. Hence, L1-Russian learners of Dutch face the task of acquiring a subset target grammar, one which is more restrictive than their native grammar, while L1-Spanish learners of Dutch face the opposite task of acquiring a superset grammar, one which is less restrictive than their native language.

On the basis of these results, we can now make specific predictions about the L2 acquisition of Dutch phonotactics by L1-Russian and L1-Spanish learners of Dutch, on the basis of the hypotheses stated in section 1. The first hypothesis, which was derived from L2 theory (transfer) in combination with learnability theory (the subset problem), stated that L2 learners can acquire a target grammar only when starting from an L1 subset initial state, but not from an L1 superset initial state. By the additional assumption that phonotactics involves grammatical knowledge, this predicts that L1-Russian learners of Dutch should be impeded by a lack of negative evidence, while L1-Spanish learners should benefit from the availability of positive evidence against their initial state. Our second hypothesis was that L2-acquisition of phonotactics is subject to development, in the sense that advanced learners should move closer to the target grammar, and thus become more native-like in their responses to Dutch nonwords.
Hence, it is now predicted that advanced L1-Spanish learners of Dutch should be more native-like than beginning L1-Spanish learners of Dutch in terms of their phonotactic competence. In contrast, because L2 phonotactic acquisition should not occur under a subset scenario, advanced L1-Russian learners of Dutch should not be more native-like than beginning L1-Russian learners of Dutch.

3. EXPERIMENT 1: WORD-LIKENESS JUDGMENT TASK

In the word-likeness judgment task, participants gave ratings to spoken nonwords based on their perceived word-likeness in Dutch. Stimuli that contained Dutch-illegal clusters were expected to have lower scores than stimuli containing Dutch-legal clusters, for native speakers and the L1-Spanish L2 learners, because both Dutch and Spanish disallow the Dutch-illegal clusters. For the L1-Russian learners, however, differences in ratings between Dutch-legal and Dutch-illegal clusters were not expected because of the subset problem regarding phonotactic learning: none of the stimuli have consonant clusters disallowed by the native grammar.

3.1. Participants

Three groups of subjects participated in the experiment. The first group of participants consisted of 30 adult Dutch monolinguals.[27] The other two groups contained adult L2 learners of Dutch: 18 native speakers of Russian and 13 native speakers of Spanish. In the L2 learner groups, some participants had more than one native language. Among the L1-Russian participants, five were bilinguals: two Russian/Azerbaijani, one Russian/Romanian, one Russian/Ukrainian, and one Russian/Belarusian. All reported Russian to be their dominant language except one (the Russian/Ukrainian bilingual). In the L1-Spanish group, there were three bilingual participants, all Spanish/Catalan.[28] Each of the groups of L2 learners was subdivided into two equally large subgroups, based on the participants’ proficiency in Dutch.
An estimate of the Dutch proficiency level of the participants was determined by their performance on a C-test (Taylor 1953). This test consists of five unrelated texts. In each text, a number of words are incomplete. Participants were asked to fill out the missing second half of these words. The maximum score on the C-test was 100, the minimum score 0. The mean score for the L1-Russian beginning learners (N = 9) was 39.9 (sd = 10.6) and 70.3 (sd = 11.5) for advanced L1-Russian learners (N = 9). The mean score for the L1-Spanish beginning learners (N = 7) was 12.1 (sd = 7.6) and 58.3 (sd = 13.7) for advanced L1-Spanish learners (N = 6). The differences between the scores on the C-test of beginning and advanced learners were significant for the L1-Russian (t-test, two-tailed, t = 12.317, df = 17, p < .001), as well as for the L1-Spanish participants (t-test, t = 4.624, df = 12, p = .001). Biographical data of the participants were obtained by a brief oral interview.

[27] Most participants had mastered English to a certain degree. However, Dutch and English are highly similar on the phonotactic well-formedness of the stimuli selected.
[28] Catalan allows more final clusters than Spanish (Wheeler 1979).

3.2. Materials

Stimuli presented in the word-likeness judgment task were nonwords (monosyllabic and bisyllabic) that contained different types of consonant clusters (Table 6). Some clusters occurred in word onset position, others in word coda position. No stimuli contained more than one consonant cluster. The clusters were in a strict subset/superset relationship with respect to each other. That is, all of the 37 target consonant clusters are legal in Russian. A proper subset of these clusters are legal in Dutch, and a proper subset of the Dutch-legal clusters are legal in Spanish. Accordingly, stimuli were subdivided into three classes.
The first class of clusters (Type 1; N = 20) contained those that are legal in Russian but not in the other two languages under investigation. The second class of clusters (Type 2; N = 12) consisted of those that are legal in Dutch and Russian, but not in Spanish. The final class of clusters (Type 3; N = 5) contained those that are legal in all three languages, including Spanish. Both onset and coda clusters were used.

TABLE 6
Consonant Clusters Used in the Nonword Stimuli in the Word-Likeness Judgment Task
Type 1, onset: CC kt-, rt-, tk-, xm-, zb-, zd-, zl-, zn-; CCC fpr-, fsp-, fst-, skl-, zdr-; CCCC fspl-, fstr-
Type 1, coda: CC -zm; CCC -nsk, -rsk, -stf, -str
Type 2, onset: CC sl-, sm-, sn-, st-; CCC spl-, str-
Type 2, coda: CC -kt, -nt, -rk, -rm, -rs, -rt
Type 3, onset: CC fl-, pr-, tr-
Type 3, coda: CC -ls, -ns
Note: Type 1 is legal in Russian only; Type 2 is legal in Russian and Dutch; Type 3 is legal in Russian, Dutch, and Spanish.

TABLE 7
Mean Type Frequency and Observed/Expected Values for Type 1, Type 2, and Type 3 Clusters in the Three Languages
Note: Type 1 is legal in Russian only; Type 2 is legal in Russian and Dutch; Type 3 is legal in Russian, Dutch, and Spanish.

All consonants used in the stimuli were phonemes of both the L1 and the target language, that is, consonants belonging to the intersection of the inventories of the three languages.[29] As far as vowels are concerned, no diphthongs or reduced vowels were included in the stimuli. Furthermore, parts of the nonwords other than the consonant cluster all contained phoneme combinations that have a relatively high frequency in Dutch in the word position in which they occur. This can be illustrated by the stimulus /fpro.ˈlan/. The target cluster in this nonword is /fpr/, which is illegal in Dutch.
The rhyme /o/ in the initial (weak) syllable as well as the onset /l/ and rhyme /an/ of the final (strong) syllable are relatively frequent in the Dutch lexicon.30

Clusters were selected so as to highlight structural differences between the languages in terms of phonotactic constraints. For example, Type 1 clusters violated the sonority sequencing principle (/rt, zm/), OCP-COR (/zn, zl/), or the constraint against voiced obstruent clusters (/zb, zd/). Type 1 ternary/quaternary clusters had initial consonants other than /s/ (/fpr, fsp, fst, zdr/; /fspl, fspr/). Type 2 clusters included /s/-initial onsets (all illegal in Spanish, but under the exception clause for initial /s/ in Dutch), as well as coda clusters ending in consonants other than /s/. Each cluster type has an average type frequency of at least 18.8 in each language in which it occurs legally (Types 1-2-3 for Russian; Types 2-3 for Dutch; Type 3 for Spanish)31 (Table 7). As an additional check on the phonotactic legality of the stimuli, average observed/expected ratios for each of the three cluster types were calculated and found to be above 1.0, confirming that none of the legal types are underrepresented in the three languages.32

Nevertheless, it turned out to be impossible to fully balance the frequencies of cluster types for each of the three languages. In Dutch, Type 2 clusters were more frequent than Type 3 clusters, mainly due to the high frequency of /s/-consonant word onsets. In Russian, the reverse situation holds, with Type 3 > Type 2 > Type 1 clusters in terms of frequency. Since lexical frequency is known to influence participants' responses in word-likeness rating and the lexical decision task, the possibility arises that L1-Russian participants rate stimuli according to the frequencies of the clusters in Russian, which might interfere with the prediction from Hypothesis 1 that these participants should not distinguish Type 1 and Type 2/3 stimuli by their phonotactic well-formedness. Hence, it was decided that possible effects of Russian frequencies on responses would be addressed afterward by means of a correlation analysis. If L1-Russian participants based their judgments of nonwords on Russian lexical frequencies, then a correlation analysis should reveal this; if instead they based their judgments on acquired phonotactic constraints of Dutch, then responses should be more strongly correlated with the Type 1 versus Type 2/3 distinction.

None of the nonword stimuli were existing words of Russian or Spanish. Two native speakers of Spanish and two native speakers of Russian were asked to check the stimuli. The stimuli were read by a phonetically trained native speaker of Dutch who was unaware of the purpose of this study, and digitally stored.

29 Spanish has no phoneme /z/, but /s/ becomes voiced before a voiced consonant, as in rasgo 'feature,' jazmín 'jasmine' (Martínez-Celdrán, Fernández-Planas & Carrera-Sabaté 2003).
30 As before, type frequencies are based on a sublexicon of 69,245 lemmas based on Dutch CELEX.
31 Statistics for Spanish were taken from a computerized lexicon of 35,162 word types derived from a text corpus of 5,000,000 words in the Corpus del Español (Davies 2002).
32 Table 13 shows a high average O/E value (O/E = 11.85) for Russian Type 1 clusters. This might be an artifact of the average length of Type 1 clusters: with increased segment number, expected values naturally drop, as these are based on the product of the segmental probabilities: $E(C_1 C_2 C_3) = p_{C_1}\, p_{C_2}\, p_{C_3}\, N(CCC)$. As a result, even with relatively low observed values, O/E values will naturally rise.
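The observed/expected calculation described in footnote 32 can be illustrated with a small sketch. The article does not spell out how the segmental probabilities were estimated, so the code below makes an assumption: each probability is the relative frequency of a segment in a given cluster position among all onsets of the same length, and N is the number of such onsets. The function name `observed_expected` and the toy onset list are hypothetical and serve only as an illustration.

```python
from math import prod


def observed_expected(onsets, cluster):
    """Return (observed, expected, O/E) for one cluster.

    onsets:  list of tuples of segments, one per word type (toy data here).
    cluster: tuple of segments, e.g. ("s", "t").
    Assumption: p(C_i) is the relative frequency of segment C_i in
    position i among all onsets with the same number of segments.
    """
    same_len = [o for o in onsets if len(o) == len(cluster)]
    n = len(same_len)
    if n == 0:
        return 0, 0.0, None
    observed = same_len.count(cluster)
    # E(C1...Ck) = p(C1) * ... * p(Ck) * N, following the footnote's formula.
    expected = n * prod(
        sum(1 for o in same_len if o[i] == seg) / n
        for i, seg in enumerate(cluster)
    )
    ratio = observed / expected if expected > 0 else None
    return observed, expected, ratio


# Hypothetical toy lexicon of word-onset clusters (not the CELEX data).
toy_onsets = [
    ("s", "t"), ("s", "t"), ("s", "l"), ("t", "r"), ("p", "r"), ("f", "l"),
    ("s", "t", "r"), ("s", "p", "l"),
]
obs, exp, ratio = observed_expected(toy_onsets, ("s", "t"))
print(obs, round(exp, 3), round(ratio, 3))
```

Because the expected count is a product of probabilities, it shrinks quickly as clusters get longer, which is the reason the footnote gives for the high O/E values of the longer Type 1 clusters.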
In the stimulus list of the word-likeness judgment task, each target consonant cluster occurred four times: twice in a monosyllable, once in an iamb (a bisyllable with final stress), and once in a trochee (a bisyllable with initial stress). The stimulus list contained fillers as well. These fillers met the same criteria as the test items with respect to their stress patterns and the phonemes that were included. The filler items lacked clusters entirely. The stimulus list of the word-likeness judgment task contained 216 items: 156 test items and 60 fillers. The test items are all included in the Appendix.

3.3. Procedure

All participants in this study took part in two experiments: the word-likeness judgment task and the lexical decision task (which will be presented in section 4). The participants were tested individually. Half of the participants of each native language group performed the word-likeness judgment task before the lexical decision task. For the other half of each group, the reverse order applied. Assignment to the two orders was random. There was a short break between the two tasks. Afterwards, the nonnative speakers were asked some questions in a short interview and they performed the C-test. The subjects were paid for their participation in the experiments.

In the word-likeness judgment task, the 216 spoken stimuli were played over headphones. The experiment took place in a sound-proof booth. The participants were instructed orally. The instructions were stated in Dutch in order to move (or to keep) the participants in the right language mode. The participants were told that they were to listen to words that do not exist in Dutch, but that nevertheless, some of the nonwords would sound more typically Dutch than others. Participants were instructed to judge the extent to which a given nonword was more or less typically Dutch on a seven-point scale. On the screen, this scale was indicated by numbers 1 to 7.
At the extreme left and right ends of the scale, the words slecht ('bad') and goed ('good') appeared. After each stimulus, participants entered a score on the scale on the screen by using a mouse. The word-likeness judgment task was self-paced, taking 10 minutes on average.

3.4. Results

The scores that the participants assigned to the stimuli were analyzed for each group of participants. The set of target items was subdivided into two main categories, namely, items that are phonotactically legal in Dutch and items that contain Dutch-illegal consonant clusters. The statistical test that was used to detect significant differences between the responses to the two main categories was the Mann-Whitney U-test. A nonparametric test was used because the data set was not normally distributed.

3.4.1. Dutch Participants

As expected, the Dutch participants discriminated nonwords with phonotactically ill-formed clusters (Type 1) from those with phonotactically well-formed clusters (Types 2 and 3). This difference was significant for the onset cluster items (Mann-Whitney U-test, U = 232171.5, p < .001) as well as for the coda cluster items (Mann-Whitney U-test, U = 138580.5, p < .001) (see Figure 7).

Furthermore, the native speakers of Dutch made a distinction within the class of legal onset clusters. Type 3 clusters received significantly higher scores (Mann-Whitney U-test, U = 111376.5, p < .001) than Type 2 clusters. This finding suggests that phonotactic judgments of native speakers are gradient rather than categorical. No such distinction was made within the class of legal coda clusters between Type 2 and 3 clusters. Gradience was also found for nonwords with phonotactically illegal (Type 1) clusters.
Within this category, nonwords starting with an obstruent-obstruent cluster (/tk, kt, zb, zd/) received lower scores than nonwords starting with obstruent-sonorant clusters (/zl, zn, xm/), 1.65 versus 2.10. This difference is significant (Mann-Whitney U-test, U = 66877, p < .001).

FIGURE 7 Average word-likeness judgments of non-words containing Type 1, Type 2, and Type 3 word onset and word coda clusters by Dutch native speakers.

3.4.2. L1-Russian Learners of Dutch

Contrary to the predictions, both groups of L1-Russian learners of Dutch discriminated between the Dutch-legal (Types 2-3) and the Dutch-illegal (Type 1) onset clusters (Figure 8). Nonwords that have illegal onset clusters received significantly lower scores than those that have legal onset clusters by beginning learners (Mann-Whitney U-test, U = 48070.0, p < .001) and advanced learners (Mann-Whitney U-test, U = 42083.0, p < .001). Both groups of the L1-Russian participants also discriminated between L2-legal (Types 2-3) and L2-illegal (Type 1) coda clusters (Mann-Whitney U-test, beginning learners: U = 18974.0, p < .001; advanced learners: U = 14922.0, p < .001).

The difference between beginning and advanced L1-Russian learners of Dutch was that the responses of the advanced learners were more like those of Dutch native speakers: they assigned significantly lower scores to the Dutch-illegal coda clusters (Type 1) than the beginners (Mann-Whitney U-test, U = 12877.0, p = .001). Furthermore, the advanced learners discriminated between Types 2 and 3 Dutch-legal onset clusters (Mann-Whitney U-test, U = 9276.0, p = .002), whereas the beginning learners failed to make this distinction.

3.4.3. L1-Spanish Learners of Dutch

As expected, the advanced L1-Spanish learners of Dutch discriminated between the legal and illegal onset clusters (Figure 9).
This group assigned significantly lower scores to nonwords containing illegal onset clusters (Type 1) than to nonwords with legal onset clusters (Types 2 and 3) (Mann-Whitney U-test, U = 21071, p < .001). The less advanced L1-Spanish learners did not make such a distinction, which shows that these learners had not yet acquired the relevant knowledge of Dutch, suggesting that there is development in the acquisition of the target language phonotactic knowledge.

Although the beginning learners did not distinguish the illegal onset clusters from the legal ones, both groups of Spanish participants discriminated between legal and illegal coda clusters (Mann-Whitney U-test, beginning learners: U = 13067.5, p = .007; advanced learners: U = 8426.0, p < .001). That is, the beginning learners did not discriminate L2-illegal onsets from legal ones, but they did discriminate between legal and illegal coda clusters.

Surprisingly, no significant difference was found between L1-Spanish learners' judgments of Type 2 and 3 Dutch-legal consonant clusters. Although in the L1 of these learners, there is a difference between these two types of clusters (Type 2 clusters being illegal in Spanish, and Type 3 clusters legal), this difference is not visible in their judgments of Dutch nonwords. Within the class of illegal onset clusters, the advanced L1-Spanish learners discriminated between obstruent-obstruent and obstruent-sonorant clusters, 1.81 versus 3.01 (Mann-Whitney U-test, U = 2291.5, p < .001), as the native speakers of Dutch did.

FIGURE 8 Average word-likeness judgments of non-words containing Type 1, Type 2, and Type 3 word onset and word coda clusters by beginning and advanced L1-Russian learners of Dutch.

FIGURE 9 Average word-likeness judgments of non-words containing Type 1, Type 2, and Type 3 word onset and word coda clusters by beginning and advanced L1-Spanish learners of Dutch.

3.5. Discussion

As expected, the Dutch native listeners assigned low ratings to nonwords containing phonotactically illegal consonant clusters. This was the case for onset as well as for coda clusters. L2 learners of Dutch also showed sensitivity to distinctions between Dutch-legal and Dutch-illegal onset and coda clusters. Both beginning and advanced L1-Russian learners of Dutch assigned lower scores to Dutch-illegal onset and coda clusters than to Dutch-legal clusters. The results of the advanced L1-Spanish learners showed the same pattern. The beginning L1-Spanish learners, however, only distinguished between L2-legal and L2-illegal coda clusters; they did not make a similar distinction for onset clusters. Hence, the advanced L1-Spanish learners displayed more native-like responses than the beginners, suggesting that there is development of phonotactic knowledge in L2 acquisition.

Native speakers distinguished degrees of word-likeness between the classes of phonotactically legal clusters (differentiating Type 2 and Type 3 clusters), as well as within the class of illegal clusters (differentiating Type 1 clusters). Not all groups of L2 learners displayed such gradience. Only in the advanced groups, not in the beginning groups, could some significant differences be detected: the advanced L1-Russian learners distinguished within the class of Dutch-legal clusters (Types 2 and 3), while the advanced L1-Spanish learners distinguished within the class of the Dutch-illegal clusters (Type 1). This finding adds further evidence to our second hypothesis, that L2 phonotactic knowledge is subject to development.

As we observed earlier, the response patterns for L1-Russian learners might be influenced by native language statistics, possibly obscuring the effects of acquired phonotactic knowledge of Dutch.
This might introduce a confounding factor, which needs to be addressed. To assess the influence of Russian lexical statistics on L1-Russian learners' responses, we conducted analyses with two types of Russian lexical statistics data: the type frequencies of the individual clusters and overall bi-phone probabilities of the nonword stimuli.

First, we measured the Pearson correlation (two-tailed) between the average word-likeness rating per item by the L1-Russian participants and the logarithm of the Russian type frequency for individual clusters. This relationship turned out to be significant (r = .37, p < .001), albeit rather weak. In addition, we measured the correlation between the average L1-Russian responses and a binary legal/illegal distinction between clusters (Type 2-3 versus Type 1). It turned out that word-likeness ratings correlated considerably more highly with the legal/illegal distinction (r = .63, p < .001) than with the frequency statistics measure. Moreover, a multiple linear regression analysis reveals that the frequency measure does not explain significant unique variance once the legal/illegal distinction is included in the analysis (Table 8).

Second, overall bi-phone probabilities of the items calculated over the Russian lexicon have no predictive effect as to L1-Russian word-likeness ratings. The correlation between the bi-phone probabilities of the items and average word-likeness judgments of the items is not significant (r = .04, p = .657). By contrast, the Dutch bi-phone probabilities correlate significantly with the L1-Russian well-formedness ratings (r = .38, p < .001), suggesting that the effect of L1 lexical statistics can be suppressed in the L2.

TABLE 8
Regression Analyses on the L1-Russian Word-Likeness Ratings
R-Square Step 1 Constant Legal/illegal distinction Step 2 Constant Legal/illegal distinction Log type frequency Russian
B SE B Beta 1.043 .12 .229 .08 .633 1.046 1.384 .084 .242 .181 .138 .599 .048 .392 .400
Significant at .001.

The above analyses suggest that the similarities in word-likeness judgments between the L1-Russian learners of Dutch and the native speakers of Dutch were more strongly based on a shared phonotactic knowledge of Dutch than on similarities in terms of lexical statistics between Russian and Dutch.

The results of the word-likeness rating experiment suggest that L2 learners have phonotactic knowledge of the target language which is similar to native listeners' knowledge. Native-like phonotactic knowledge of the target language was found for superset learners as well as for subset learners. This means that our first hypothesis is confirmed for the L1-Spanish superset learners, but not for the L1-Russian subset learners. Our second hypothesis, stating that development should take place in the case of successful phonotactic acquisition, was confirmed for the L1-Spanish learners; for the L1-Russian learners, who unexpectedly showed native-like phonotactic knowledge, evidence for development was found as well.

Nevertheless, it may have been the case that the participants in the word-likeness judgment task were guided by some (semi-)conscious awareness of Dutch consonant cluster legality, which may have resulted in a response strategy during the experiment. If so, the results of the word-likeness experiment may have reflected a kind of meta-linguistic knowledge which was different from the subconscious grammatical knowledge that we intended to assess. In order to minimize the possible effects of semi-conscious knowledge of phonotactics on participants' responses, we conducted a lexical decision experiment with the same groups of participants.

4. EXPERIMENT 2: LEXICAL DECISION

Phonotactic knowledge of native and nonnative listeners of Dutch was also measured in an online task, namely a lexical decision task. In this task, reaction times and accuracy scores are measured and analyzed. Generally, phonotactically illegal nonwords take less time to be rejected by native speakers than legal ones (Stone & Van Orden 1993; Vitevitch & Luce 1999; Berent, Marcus, Shimron & Gafos 2002; Coetzee 2004, 2008, 2009; Kager & Shatzman 2007), since phonotactic knowledge will assist listeners in determining that a nonword is not a word of the L1's lexicon. Hence, stimuli that have Dutch-illegal word onset or coda clusters are expected to have shorter reaction times than stimuli that are phonotactically legal in Dutch. Furthermore, accuracy rates for nonwords with Dutch-illegal clusters are expected to be higher for the Dutch native group and the L1-Spanish learners of Dutch. Among the L1-Spanish listeners, this difference is expected to be larger for the advanced learners than for the beginners. For the L1-Russian learners, L2-legal and L2-illegal onset and coda clusters are expected to have approximately the same reaction times and accuracy scores, because both types of clusters are hypothesized to satisfy the interlanguage phonotactic grammar.

4.1. Participants

Participants in the lexical decision task were the same as in Experiment 1. However, the results of one of the beginning L1-Spanish participants were excluded because he reacted too slowly and therefore many responses were missing.

4.2. Materials

The stimuli in the lexical decision experiment were of the same type as the stimuli in Experiment 1. The stimulus lists of the two experimental tasks did not contain the same test stimuli, although a small number of fillers occurred in both experiments. Based on the same set of target consonant clusters as in Experiment 1, a new list of stimuli was recorded.
In this list, each target consonant cluster occurred three times: once in a monosyllable, once in an iamb, and once in a trochee. Moreover, existing words and nonword fillers (identical to those used in the word-likeness judgment task) were added. The stimulus list of the lexical decision experiment contained 280 items: 117 test items, 113 existing words, and 50 nonword fillers.

4.3. Procedure

The lexical decision task took place in the same booth as the word-likeness judgment task. Again, headphones were used to listen to the stimuli. The participants were seated behind a button box with a yes-button and a no-button. Because the no-responses are most important for this experiment, as these recorded the responses to the nonwords, the no-button was under the dominant hand: for right-handed people under the right hand, for left-handed people under the left hand. The order of stimuli was randomized for each participant in order to avoid ordering effects. The subjects were instructed to press the no-button when they heard a nonexisting word and the yes-button when they heard an existing word of Dutch. They were also instructed to respond as quickly and as accurately as possible.

Before the real task, a short training session of five stimuli was presented. In this exercise session, no target items were included. After this exercise the subjects had the opportunity to ask questions. The real test session contained 280 trials. After each trial the participant had to press either the JA (yes) or NEE (no) button on the button box. The participants had to respond within 2400 msec after the beginning of each stimulus; otherwise no response was registered. After these 2400 msec, the next trial was presented. The trials were presented in random order and after 140 trials, the subjects had the opportunity to have a short break. After this break, the other 140 trials were presented.
The lexical decision task took about 20 minutes.

4.4. Results

Accuracy rates and reaction times were analyzed for each group of participants. The set of target items was subdivided into two main categories: items that are phonotactically legal in Dutch (Types 2 and 3) and items that contain Dutch-illegal consonant clusters (Type 1). The corrected reaction times were used for the analysis. These were calculated by subtracting the stimulus duration from the total reaction time measured from the onset of the stimulus. The statistical tests used to detect significant differences between the responses were t-tests and ANOVAs (item analyses).

Native speakers and L1-Spanish learners of Dutch were expected to have lower accuracy scores and slower responses to the Dutch-legal (Types 2-3) than to the Dutch-illegal (Type 1) consonant clusters. The responses of the L1-Russian learners of Dutch were not expected to show this difference. The native speakers of Dutch were also expected to discriminate between different levels of well-formedness (based on frequency effects) and ill-formedness (based on markedness effects).

The overall accuracy scores for the experimental groups were as follows: native speakers of Dutch: 91%; beginning L1-Russian learners: 74%; advanced L1-Russian learners: 84%; beginning L1-Spanish learners: 64%; advanced L1-Spanish learners: 83%. These data show that the more advanced L2 learners had higher accuracy scores than the less advanced L2 learners.

4.4.1. Dutch Participants

The accuracy scores reveal that the native speakers of Dutch are more accurate on the Dutch-illegal (Type 1) onsets than on the Dutch-legal (Types 2 and 3) onsets: 99.3% versus 94.1% (Mann-Whitney U-test, U = 518474.0, p < .001). The accuracy scores of the Dutch participants did not distinguish the Dutch-illegal and Dutch-legal codas (98.2% versus 97.9%).
Furthermore, the native speakers needed significantly more time to reject nonwords with legal onset clusters than to reject nonce words with illegal onset clusters (one-way ANOVA, F(2, 2099) = 94.642, p < .001). For the coda clusters, however, this difference is not significant (one-way ANOVA, F(2, 1144) = 1.117, p = .291). Within the classes of legal and illegal consonant clusters, no significant differences are observed for the native speakers in the lexical decision task.

4.4.2. L1-Russian L2 Learners of Dutch

The accuracy scores reveal that the beginning and advanced L1-Russian learners of Dutch were more accurate on the Dutch-illegal (Type 1) clusters than on the other clusters (Types 2 and 3). This was the case for the onset clusters (beginning learners: 89.9 versus 60.5%; advanced learners: 96.8 versus 76.1%; Mann-Whitney U-test, beginning learners: U = 35280.0, p < .001; advanced learners: U = 39146.5, p < .001) as well as for the coda clusters (beginning learners: 82.2 versus 52.7%; advanced learners: 85.2 versus 68.1%; Mann-Whitney U-test, beginning learners: U = 10670.0, p < .001; advanced learners: U = 12176.5, p = .001).

In general, native speakers responded faster than the L1-Russian learners of Dutch (Figure 10) and the advanced learners responded more quickly than beginning learners (Figure 11). This generalization held for nonwords with onset clusters as well as for nonce words with coda clusters. Both groups of L1-Russian learners of Dutch were sensitive to the phonotactic illegality of Type 1 onset clusters. They needed significantly less time to reject items with illegal onsets than legal onsets (one-way ANOVA, beginning learners: F(2, 508) = 12.899, p < .001; advanced learners: F(2, 574) = 45.474, p < .001). For the coda clusters, neither group made this distinction. Within the classes of legal and illegal consonant clusters, the L1-Russian learners did not show significant differences.
The same was found for the native speakers of Dutch in the lexical decision task. In short, the accuracy scores suggest sensitivity to both illegal onset and coda clusters for both groups of L1-Russian learners of Dutch. The reaction times reveal only a difference between legal and illegal onset clusters (and not for the different types of coda clusters) for the beginning and advanced learners.

FIGURE 10 Average reaction times (RTs) to non-words containing Type 1, Type 2, and Type 3 word onset and word coda clusters by Dutch native speakers. RTs were measured by subtracting the stimulus duration from the total RT measured from the onset of the stimulus.

FIGURE 11 Average reaction times (RTs) to non-words containing Type 1, Type 2, and Type 3 word onset and word coda clusters by beginning and advanced L1-Russian learners of Dutch. RTs were measured by subtracting the stimulus duration from the total RT measured from the onset of the stimulus.

FIGURE 12 Average reaction times (RTs) to non-words containing Type 1, Type 2, and Type 3 word onset and word coda clusters by beginning and advanced L1-Spanish learners of Dutch. RTs were measured by subtracting the stimulus duration from the total RT measured from the onset of the stimulus.

4.4.3. L1-Spanish Learners of Dutch

The accuracy scores of the beginning L1-Spanish learners did not distinguish the Dutch-legal (Types 2 and 3) from the Dutch-illegal (Type 1) consonant clusters. The advanced learners were more accurate on the Dutch-illegal onset clusters than on the other onset clusters: 87.9 versus 82.1% (Mann-Whitney U-test, U = 19195.0, p = .001). For the coda clusters, there was no difference between the accuracy scores for the Dutch-legal and illegal clusters.
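The Mann-Whitney U-test reported for the rating and accuracy comparisons can be sketched in a few lines. The version below is an illustration only, not the authors' analysis script: it assigns midranks to tied values and uses a tie-corrected normal approximation without continuity correction, which is an assumption suited to large samples. The function name `mann_whitney_u` and the toy ratings are hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value).

    Uses midranks for ties and a tie-corrected normal approximation
    (no continuity correction); an exact test is preferable for
    very small samples.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        midrank = (i + 1 + j) / 2      # mean of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / sqrt(var_u)
    return u_x, erfc(abs(z) / sqrt(2))


# Hypothetical seven-point ratings for Dutch-illegal and Dutch-legal items.
illegal_ratings = [1, 2, 2, 1, 3]
legal_ratings = [5, 6, 4, 7, 6]
u, p = mann_whitney_u(illegal_ratings, legal_ratings)
print(u, p)
```

A U of 0 for the first sample means that every rating in it is lower than every rating in the second sample; the large U values reported in the article reflect the much larger number of individual responses per condition.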
An analysis of the reaction times of the L1-Spanish learners (Figure 12) revealed the following pattern: only the advanced learners showed sensitivity to the phonotactic illegality of the Type 1 onset clusters (one-way ANOVA, F(2, 361) = 26.176, p < .001). The reaction times of this group did not show a difference between legal and illegal coda clusters. Beginning learners did not show sensitivity to the distinction between legal and illegal consonant clusters at all in their reaction times to different stimulus types. Like the other groups of participants, the L1-Spanish learners did not show significant differences within the classes of legal and illegal consonant clusters.

4.5. Discussion

The results of the lexical decision task, which are summarized in Table 9, show that the native speakers of Dutch discriminated between legal and illegal consonant clusters. That is, correct responses to nonwords containing illegal clusters were faster than correct responses to phonotactically well-formed nonwords.

TABLE 9
Significant Differences Between the Reaction Times of the Responses to the Different Cluster Types

                          Distinction between legal     Distinction within      Distinction within
                          and illegal clusters          the class of            the class of
Native language   Level   Onset        Coda             legal clusters          illegal clusters
Dutch                     Yes          No               No                      No
Russian           1       Yes          No               No                      No
Russian           2       Yes          No               No                      No
Spanish           1       No           No               No                      No
Spanish           2       Yes          No               No                      No

The L1-Russian learners of Dutch, both the beginning and advanced learners, showed the same pattern: they made a distinction between consonant clusters that are legal and illegal in Dutch, although both types are legal in their L1. The accuracy scores and reaction times of both groups of L1-Russian participants also showed sensitivity to finer-grained word-likeness differences within the class of illegal clusters.
Moreover, the accuracy scores revealed a difference between legal and illegal onset and coda clusters, whereas the reaction times only differed between legal and illegal onset clusters, not between legal and illegal coda clusters.

For the L1-Spanish learners of Dutch, results of the lexical decision task were different. Only the advanced learners distinguished between legal and illegal consonant clusters of Dutch. They only made this distinction for onset clusters, not for coda clusters. In contrast with the word-likeness judgment task, gradience within the class of legal clusters or within the class of illegal clusters cannot be shown for any group by the responses in the lexical decision task.

Table 10 shows that the general results of the different tasks follow the same pattern for the onset consonant clusters. However, for the coda consonant clusters, the pattern is different. In the word-likeness judgment task (an offline task), the participants made distinctions that they did not make in the lexical decision task (an online task). Possibly, the decision for rejecting a nonword is already made on the basis of the initial part of the word (Marslen-Wilson & Welsh 1978; Marslen-Wilson & Zwitserlood 1989) before the coda has been processed.

TABLE 10
Significant Differences Between Legal and Illegal Consonant Clusters in Both Tasks: Word-Likeness Judgments (WLJ) and Lexical Decision (LD) for Five Experimental Groups

                          Legal versus illegal onset clusters          Legal versus illegal coda clusters
Native language   Level   WLJ   LD Accuracy   LD Reaction Times        WLJ   LD Accuracy   LD Reaction Times
Dutch                     Yes   Yes           Yes                      Yes   No            No
Russian           1       Yes   Yes           Yes                      Yes   Yes           No
Russian           2       Yes   Yes           Yes                      Yes   Yes           No
Spanish           1       No    No            No                       Yes   No            No
Spanish           2       Yes   Yes           Yes                      Yes   No            No

The same finding holds for distinctions within the class of legal and illegal clusters, which are presented in Table 11. The native speakers of Dutch and a number of L2 learners discriminated between different levels of well-formedness and ill-formedness in the word-likeness judgment task, but in the lexical decision task there was no such significant difference. This difference between the tasks suggests that the word-likeness judgment task measures finer-grained differences in phonotactic well-formedness than the lexical decision task.

TABLE 11
Significant Differences Within the Main Classes in Both Tasks: Word-Likeness Judgments (WLJ) and Lexical Decision (LD) for Five Experimental Groups

                          Within the class of legal clusters     Within the class of illegal clusters
Native language   Level   WLJ   LD Reaction Times                WLJ   LD Reaction Times
Dutch                     Yes   No                               Yes   No
Russian           1       No    No                               No    No
Russian           2       Yes   No                               No    No
Spanish           1       No    No                               No    No
Spanish           2       No    No                               Yes   No

5. GENERAL DISCUSSION

The L2 acquisition of phonotactic knowledge was examined by means of two experimental tasks reflecting such knowledge: word-likeness judgments and lexical decision. L2 learners of Dutch whose native language phonotactics is either a subset or a superset of Dutch phonotactics took part in these experiments, in addition to a control group of native speakers. Since most literature on L2 acquisition of phonotactics is based on production tasks, and since production data may (partially) reflect production difficulties rather than pure tacit grammatical knowledge, very little was known about tacit phonotactic knowledge of L2 learners and its development.

The results reveal that native speakers assigned higher word-likeness ratings to nonwords that have Dutch-legal onsets and needed more time to reject these phonotactically well-formed nonwords in the lexical decision task. The difference between nonwords containing Dutch-legal and Dutch-illegal codas was only significant in the word-likeness judgment task, which was found to reflect more fine-grained phonotactic knowledge than lexical decision. Beginning and advanced L1-Russian learners of Dutch also discriminated between Dutch-legal and Dutch-illegal word onset and coda clusters (although no significant difference between legal and illegal codas occurred in the lexical decision task). The advanced, but not the beginning, L1-Spanish learners of Dutch differentiated between Dutch-illegal and Dutch-legal onset clusters in the word-likeness judgment task as well as in the lexical decision task. Both the beginning and advanced L1-Spanish learners distinguished Dutch-legal from Dutch-illegal codas in the word-likeness judgment task, but not in the lexical decision task. That is, both groups of learners discriminated between legal and illegal consonant clusters.
Moreover, the more advanced learners appeared more native-like than the beginning learners.
Furthermore, the results of the word-likeness judgment task indicate that native speakers have gradient phonotactic knowledge distinguishing within the broad classes of legal and illegal consonant clusters; the lexical decision task elicited less fine-grained results. Here, not only native speakers exhibited gradient phonotactic knowledge within the two main classes of consonant clusters, but advanced L2 learners also showed gradient responses to some extent. In particular, the advanced L1-Russian learners discriminated degrees of word-likeness within the class of Dutch-legal (Types 2 and 3) onsets, whereas the advanced L1-Spanish learners only discriminated within the class of the Dutch-illegal (Type 1) onsets. L1-Russian learners did not distinguish levels of word-likeness within Dutch-illegal clusters (which are Russian-legal) since they receive no input of these clusters in Dutch. Possibly, a perception experiment in which more similar Dutch-illegal clusters are included can shed more light on gradient judgments by L2 speakers.
Since L1-Russian learners of Dutch distinguished the Dutch-illegal clusters from the Dutch-legal clusters, the results of the experiments presented in this study suggest that L2 learners are able to acquire phonotactic knowledge of a subset grammar. These findings contradict the strong hypothesis that learners of a superset language cannot achieve target grammars that are more restrictive.
A possibly confounding factor in this study is that the stimuli were pronounced by a native speaker of Dutch. In order to verify whether the stimuli with Dutch-illegal clusters sounded natural, a naturalness task might be added. Another option to avoid such an effect would be to repeat this experiment with stimuli that are pronounced by a Russian/Dutch bilingual.
The question that remains to be addressed at the end of this study is what might explain our finding that phonotactic acquisition is possible in a ‘subset scenario’ despite the prediction from learnability theory. We will briefly discuss four logically possible explanations, based on universal markedness, the L2 initial state, indirect negative evidence, and learning mechanisms that are not vulnerable to the subset problem.
First, L1-Russian learners of Dutch might derive implicit knowledge about the relative well-formedness of word margins from universal markedness (Pertz & Bever 1975; Berent, Steriade, Lennertz & Vaknin 2007). Most Type 1 (Dutch-illegal) clusters violate markedness constraints such as the sonority sequencing principle, OCP-PLACE, and the ban against voiced obstruent clusters, constraints which are all satisfied by Type 2/3 (Dutch-legal) clusters. Under a markedness account, no exposure to Dutch input is needed for L1-Russian learners of Dutch to represent the relative ill-formedness of Type 1 as compared to Type 2/3 clusters. Although all three cluster types are legal in Russian, the L1, markedness constraints assess Type 1 clusters as less well-formed than Type 2/3 clusters. This account correctly predicts that Type 1 and Type 2/3 clusters differ in word-likeness ratings for both subgroups of L1-Russian participants, beginning and advanced learners. Since our experiments were not designed to monitor learners’ representations of the absolute illegality of Type 1 clusters, but only the perceived differences in word-likeness between Types 1-2-3, we cannot rule out the interpretation, consistent with the markedness account, that L1-Russian learners represent all three cluster types, including Dutch-illegal Type 1, as legal based on their L1.
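To make the sonority sequencing principle mentioned above concrete, the following minimal Python sketch checks whether sonority rises through an onset cluster or falls through a coda cluster. The four-step sonority scale, the function name obeys_ssp, and the one-character-per-segment encoding are assumptions introduced purely for illustration; they are not part of the experimental materials. Clusters of /s/ plus stop, which are commonly analyzed as containing an extrasyllabic /s/, are deliberately left out of the example.

```python
# Hypothetical coarse sonority scale (an assumption for illustration only).
SONORITY = {
    **dict.fromkeys("ptkbdg", 1),  # stops
    **dict.fromkeys("fsxvz", 2),   # fricatives
    **dict.fromkeys("mn", 3),      # nasals
    **dict.fromkeys("lr", 4),      # liquids
}


def obeys_ssp(cluster: str, position: str) -> bool:
    """Return True if sonority strictly rises through an onset
    or strictly falls through a coda (one character per segment)."""
    values = [SONORITY[segment] for segment in cluster]
    pairs = list(zip(values, values[1:]))
    if position == "onset":
        return all(first < second for first, second in pairs)
    if position == "coda":
        return all(first > second for first, second in pairs)
    raise ValueError("position must be 'onset' or 'coda'")


# A few clusters taken from the stimulus list in the appendices.
examples = [("rt", "onset"), ("kt", "onset"), ("pr", "onset"),
            ("rt", "coda"), ("zm", "coda")]
for cluster, position in examples:
    print(cluster, position, obeys_ssp(cluster, position))
```

Under this coarse scale, onset /rt/ and /kt/ and coda /zm/ fail the check, whereas onset /pr/ and coda /rt/ pass. An onset such as /zl/ also passes, which is consistent with the statement above that most, rather than all, Type 1 clusters violate constraints of this kind.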
Although a markedness account is compatible with L1-Russian learners’ responses to Type 1 versus Types 2/3 clusters, it is nevertheless incomplete for two reasons. First, it cannot account for our finding that only the advanced Russian learners, not the beginners, discriminated degrees of word-likeness within the class of Dutch-legal onsets. This finding suggests that phonotactic knowledge of the target language becomes more native-like and fine-grained in the course of acquisition. Although a markedness account does not rule out phonotactic development, it also makes no predictions in this respect. Second, a markedness account fails to explain the correlation between L1-Russian learners’ word-likeness judgments of the nonword stimuli and their phonotactic probabilities based on the Dutch lexicon. In sum, this account ultimately falls short of explaining how fine-grained phonotactic knowledge of the target language might develop under the subset scenario.
A second logical possibility is that our assumption about the initial state of L2 equaling the final state of the L1 is incorrect. For example, the initial state of the L2 phonotactic grammar might be identical to the initial state of L1, involving no transfer from the native language (see Epstein, Flynn & Martohardjono (1996) on the ‘no-transfer/full-access-to-UG’ hypothesis). This would immediately solve the subset problem for L1-Russian learners. However, the no-transfer account is highly unlikely to hold for L2 phonotactic acquisition, on the basis of a large body of results from L2 production and perception, reviewed in section 1. Although our experiments did not adduce evidence for transfer in L2 learners’ responses to nonwords, the overall evidence for transfer from L2 phonotactic acquisition studies is too substantial to ignore.
A third possibility is that our assumption that learners receive no negative input about clusters that are illegal in Dutch word margins may simply be too strong. For example, learners might derive negative evidence against the legality of syllable margins from the syllabification of word-medial clusters. Many clusters that are illegal in Dutch word margins, including all our Type 1 clusters, are phonotactically legal when occurring in intervocalic position, where they span a syllable boundary (for example, /zd/ in esdoorn ‘maple tree’ or /xm/ in stigma ‘stigma’). Such syllabifications might offer indirect negative evidence against these clusters in their role as syllable margins, under the assumption that the learner can compare syllabification candidates of intervocalic clusters ([Vz.dV] > [V.zdV]) (Tesar & Smolensky 2000). This account faces two difficulties. First, syllabification cues are often subtle, difficult to detect, and ambiguous. This causes problems especially for superset learners, since ambiguity in the learners’ input patterns reinforces superset grammars. Second, the fact alone that a cluster is obligatorily heterosyllabic fails to rule it out as a legal syllable margin. For example, a cluster such as /nt/, legal as a word coda, is heterosyllabic in intervocalic position. For these reasons, it is highly uncertain whether this source of indirect negative evidence might avoid the subset problem.
A final possibility to be considered is that the acquisition of phonotactic knowledge may rely strongly on learning mechanisms that are relatively invulnerable to the subset problem, in particular statistical learning. Correlations between phonotactic distributions in the lexicon and subjects’ responses to nonwords are well-attested (see again references in section 1).
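As an illustration of what a lexical statistic of this kind might look like, the minimal Python sketch below estimates biphone probabilities from a toy lexicon and averages them over a nonword. The lexicon, the function names, and the scoring by unweighted mean are hypothetical choices made for illustration; the sketch is not the probability measure used in this study.

```python
from collections import Counter


def biphone_probs(lexicon):
    """Relative frequency of each adjacent segment pair across the
    lexicon (type counts, one character per segment)."""
    counts = Counter(
        word[i:i + 2] for word in lexicon for i in range(len(word) - 1)
    )
    total = sum(counts.values())
    return {biphone: n / total for biphone, n in counts.items()}


def mean_biphone_prob(word, probs):
    """Average biphone probability of a word; unseen biphones count as 0.
    Assumes the word has at least two segments."""
    biphones = [word[i:i + 2] for i in range(len(word) - 1)]
    return sum(probs.get(bp, 0.0) for bp in biphones) / len(biphones)


# Toy lexicon of made-up strings (hypothetical data, not real lexical counts).
toy_lexicon = ["stat", "stal", "slat", "trat"]
probs = biphone_probs(toy_lexicon)

# A nonword with an attested onset scores higher than one with an unattested onset.
print(mean_biphone_prob("stan", probs) > mean_biphone_prob("ktan", probs))
```

A purely distributional score of this kind is gradient by construction: it separates attested from unattested sequences, but it also ranks sequences within each class by how often their parts occur.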
Moreover, adults and infants are able to learn phonotactic patterns from relatively short exposure based on distributional cues in artificial language learning (Onishi, Chambers & Fisher 2002; Chambers, Onishi & Fisher 2003). This accords with our finding that L1-Russian learners’ word-likeness judgments are correlated with Dutch bi-phone probabilities. However, probabilistic accounts of phonotactic acquisition fail to explain why L1-Russian learners’ word-likeness judgments show even stronger correlations with the legality/illegality of clusters in Dutch, which may tentatively be interpreted in favor of a more coarse-grained, abstract representation of phonotactic well-formedness.33 In light of the results of the current study, it is highly unlikely that the acquisition of phonotactic knowledge is based on a single learning mechanism.
33 This is in line with recent computational models of phonotactic learning (Hayes & Wilson 2008; Albright 2009; Adriaans & Kager submitted), which combine statistical learning with feature-based generalization.

ACKNOWLEDGMENTS
The authors wish to thank three anonymous reviewers and the associate editor for their helpful comments and suggestions, and Tom Lentz for commenting on a previous version of this article. This research was supported by a grant from the Netherlands Organisation for Scientific Research (NWO) (277-70-001) to the second author.

REFERENCES
Adriaans, Frans & René Kager. To appear. Adding generalization to statistical learning: The induction of phonotactics from continuous speech.
Albright, Adam. 2009. Feature-based generalisation as a source of gradient acceptability. Phonology 26(1), to appear.
Altenberg, E. P. 2005. The judgement, perception, and production of consonant clusters in a second language. International Review of Applied Linguistics in Language Teaching 43. 53–80.
Angluin, Dana. 1980. Inductive inference of formal languages from positive data. Information and Control 45. 117–135.
Baayen, Harald, Richard Piepenbrock & Leon Gulikers. 1995. The CELEX lexical database. Philadelphia: Linguistic Data Consortium, University of Pennsylvania.
Bailey, Todd M. & Ulrike Hahn. 2001. Determinants of word-likeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language 44. 568–591.
Baker, C. Lee. 1979. Syntactic theory and the projection problem. Linguistic Inquiry 10. 533–581.
Barchudarova, S. G., S. I. Ožegova & A. B. Šapiro. 1967. Orfografičeskij Slovar’ Russkogo Jazyka. Moscow: Sovetskaja Enciklopedija.
Berent, Iris, T. Lennertz, P. Smolensky & V. Vaknin. 2009. Listeners’ knowledge of phonological universals: Evidence from nasal clusters. Phonology 26(1), to appear.
Berent, Iris, Gary F. Marcus, Joseph Shimron & Adamantios I. Gafos. 2002. The scope of linguistic generalizations: Evidence from Hebrew word formation. Cognition 83. 113–139.
Berent, Iris & Joseph Shimron. 1997. The representation of Hebrew words: Evidence from the obligatory contour principle. Cognition 64. 39–72.
Berent, Iris, Donca Steriade, Tracy Lennertz & Vered Vaknin. 2007. What we know about what we have never heard: Evidence from perceptual illusions. Cognition 104. 591–630.
Berwick, Robert. 1985. The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Bhatt, Rakesh M. & Barbara Hancin-Bhatt. 1997. Optimal L2 syllables: Interactions of transfer and developmental effects. Studies in Second Language Acquisition 19. 331–378.
Blevins, Juliette. 1995. The syllable in phonological theory. In J. Goldsmith (ed.), Handbook of phonological theory, 206–244. Cambridge, MA: Blackwell.
Broselow, Ellen. 1987. An investigation of transfer in second language phonology. In G. Ioup & S. Weinberger (eds.), Interlanguage phonology, 261–278. Cambridge, MA: Newbury House.
Broselow, Ellen, Su-I Chen & Chilin Wang. 1998. The emergence of the unmarked in second language phonology. Studies in Second Language Acquisition 20. 261–280.
Broselow, Ellen & Daniel Finer. 1991. Parameter setting in second language phonology and syntax. Second Language Research 7. 35–59.
Carlisle, Robert S. 1988. The effects of markedness on epenthesis in Spanish/English interlanguage phonology. Issues and Developments in English and Applied Linguistics 3. 15–23.
Carlisle, Robert S. 1998. The acquisition of onsets in a markedness relationship: A longitudinal study. Studies in Second Language Acquisition 20. 245–260.
Chambers, Kyle E., Kristine H. Onishi & Cynthia Fisher. 2003. Infants learn phonotactic regularities from brief auditory experience. Cognition 87. 69–77.
Chew, Peter A. 2000. A computational phonology of Russian. PhD dissertation, University of Oxford.
Coetzee, Andries W. 2004. What it means to be a loser: Non-optimal candidates in optimality theory. PhD dissertation, University of Massachusetts, Amherst, MA.
Coetzee, Andries W. 2005. The OCP in the perception of English. In S. Frota, M. Vigario & M. J. Freitas (eds.), Prosodies, 223–245. New York: Mouton de Gruyter.
Coetzee, Andries W. 2008. Grammaticality and ungrammaticality in phonology. Language 84. 218–257.
Coetzee, Andries W. 2009. Grammar is both categorical and gradient. In S. Parker (ed.), Phonological argumentation: Essays on evidence and motivation. London: Equinox Publishers.
Coleman, John & Janet B. Pierrehumbert. 1997. Stochastic phonological grammars and acceptability. In Proceedings of the Third Meeting of the ACL Special Interest Group in Computational Phonology. Somerset, NJ: Association for Computational Linguistics.
Davidson, Lisa. 2003. The atoms of phonological representation: Gestures, coordination and perceptual features in consonant cluster phonotactics. PhD dissertation, Johns Hopkins University, Baltimore.
Davidson, Lisa. 2007.
The relationship between the perception of non-native phonotactics and loanword adaptation. Phonology 24. 261–286.
Davidson, Lisa, Jason Shaw & Tuuli Adams. 2007. The effect of word learning on the perception of non-native consonant sequences. Journal of the Acoustical Society of America 122. 3697–3709.
Davies, Mark. 2002. Corpus del Español (100 million words, 1200s–1900s). Available online at http://www.corpusdelespanol.org/
Dell, Gary S., Kristopher D. Reed, David R. Adams & Antje S. Meyer. 2000. Speech errors, phonotactic constraints, and implicit learning: A study of the role of experience in language production. Journal of Experimental Psychology: Learning, Memory, and Cognition 26. 1355–1367.
Dupoux, Emmanuel, Kazuhiko Kakehi, Yuki Hirose, Christophe Pallier & Jacques Mehler. 1999. Epenthetic vowels in Japanese: A perceptual illusion? Journal of Experimental Psychology: Human Perception and Performance 25. 1568–1578.
Dupoux, Emmanuel, Christophe Pallier, Kazuhiko Kakehi & Jacques Mehler. 2001. New evidence for prelexical phonological processing in word recognition. Language and Cognitive Processes 5. 491–505.
Eckman, Fred R. 1977. Markedness and the contrastive analysis hypothesis. Language Learning 27. 315–330.
Eckman, Fred R. 1987. On the reduction of word-final consonant clusters in interlanguage. In A. James & J. Leather (eds.), The sound pattern of second language acquisition, 143–162. Dordrecht: Foris Publications.
Epstein, Samuel D., Suzanne Flynn & Gita Martohardjono. 1996. Second language acquisition: Theoretical and experimental issues in contemporary research. Behavioral and Brain Sciences 19. 677–758.
Escudero, Paola. 2005. Linguistic perception and second language acquisition: Explaining the acquisition of optimal phonological categorization. PhD dissertation, Utrecht University.
Escudero, Paola & Paul Boersma. 2002. The subset problem in L2 perceptual development: Multiple-category assimilation by Dutch learners of Spanish. In B.
Skarabela, S. Fish & H.-J. Do (eds.), Proceedings of the 26th Annual Boston University Conference on Language Development, 208–219. Somerville, MA: Cascadilla Press.
Friederici, Angela D. & Jeanine M. I. Wessels. 1993. Phonotactic knowledge and its use in infant speech perception. Perception and Psychophysics 54. 287–295.
Frisch, Stephan A., Nathan R. Large & David B. Pisoni. 2000. Perception of word-likeness: Effects of segment probability and length on the processing of nonwords. Journal of Memory and Language 42. 481–496.
Frisch, Stephan A. & Bushra A. Zawaydeh. 2001. The psychological reality of OCP-Place in Arabic. Language 77. 91–106.
Goldrick, Matthew. 2004. Phonological features and phonotactic constraints in speech production. Journal of Memory and Language 51. 586–603.
Greenberg, Joseph. 1965. Some generalizations concerning initial and final consonant sequences. Linguistics 18. 5–34.
Hallé, Pierre, Juan Segui, Uli Frauenfelder & Christine Meunier. 1998. The processing of illegal consonant clusters: A case of perceptual assimilation? Journal of Experimental Psychology: Human Perception and Performance 24. 592–608.
Hancin-Bhatt, Barbara. 2000. Optimality in second language phonology: Codas in Thai ESL. Second Language Research 16. 201–232.
Hancin-Bhatt, Barbara & Rakesh M. Bhatt. 1997. Optimal L2 syllables: Interactions of transfer and developmental effects. Studies in Second Language Acquisition 19. 331–378.
Harnsberger, James D. 2001. On the relationship between identification and discrimination of non-native nasal consonants. Journal of the Acoustical Society of America 110. 489–503.
Harris, James. 1983. Syllable structure and stress in Spanish: A nonlinear analysis. Cambridge, MA: MIT Press.
Haunz, Christine. 2002. Speech perception in loanword adaptation.
Talk presented at the Postgraduate Conference of the Edinburgh University, Department of Theoretical and Applied Linguistics, May 27–28, 2002.
Hay, Jennifer, Janet Pierrehumbert & Mary Beckman. 2004. Speech perception, wellformedness and the statistics of the lexicon. In J. Local, R. Ogden & R. Temple (eds.), Phonetic interpretation: Papers in laboratory phonology VI, 58–74. Cambridge: Cambridge University Press.
Hayes, Bruce. 1984. The phonetics and phonology of Russian voicing assimilation. In M. Aronoff & R. T. Oehrle (eds.), Language Sound Structure, 318–328. Cambridge, MA: MIT Press.
Hayes, Bruce. 2004. Phonological acquisition in Optimality Theory: The early stages. In R. Kager, J. Pater & W. Zonneveld (eds.), Constraints on phonological acquisition. Cambridge: Cambridge University Press.
Hayes, Bruce & Colin Wilson. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39. 379–440.
Hualde, Jose I. 1991. On Spanish syllabification. In H. Campos & F. Martínez-Gil (eds.), Current studies in Spanish linguistics, 475–493. Washington, DC: Georgetown University Press.
Jakobson, Roman. 1941/1968. Child language, aphasia and phonological universals. The Hague: Mouton.
Jusczyk, Peter W., Angela D. Friederici, Jeanine M. Wessels, Vigdis Y. Svenkerud & Ann Marie Jusczyk. 1993. Infants’ sensitivity to the sound patterns of native language words. Journal of Memory and Language 32. 402–420.
Jusczyk, Peter W., Paul A. Luce & Jan Charles-Luce. 1994. Infants’ sensitivity to phonotactic patterns in the native language. Journal of Memory and Language 33. 630–645.
Kabak, Baris & William J. Idsardi. 2007. Perceptual distortions in the adaptation of English consonant clusters: Syllable structure or consonantal contact constraints? Language and Speech 50. 23–52.
Kager, René & Wim Zonneveld. 1986. Schwa, syllables, and extrametricality in Dutch. The Linguistic Review 5. 197–221.
Kager, René & Keren Shatzman. 2007.
Phonological constraints in speech processing. In B. Los & M. van Koppen (eds.), Linguistics in the Netherlands 2007, 99–111.
Kempgen, Sebastian. 1995. Phonemcluster und Phonemdistanzen (im Russischen). In D. Weiss (ed.), Slavische Linguistik 1994, 197–221. München.
Kucera, Henry & George K. Monroe. 1968. A comparative quantitative phonology of Russian, Czech, and German. New York: American Elsevier Publishing Company.
Levy, Erika S. & Winifred Strange. 2008. Perception of French vowels by American English adults with and without French language experience. Journal of Phonetics 36. 141–157.
Lönngren, Lennart. 1993. Chastotnyj Slovar’ Sovremennogo Russkogo Jazyka (A frequency dictionary of modern Russian with a summary in English). Acta Universitatis Upsaliensis, Studia Slavica Upsaliensia 32, Uppsala.
Marslen-Wilson, William D. & Alan Welsh. 1978. Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology 10. 29–63.
Marslen-Wilson, William D. & Pienie Zwitserlood. 1989. Accessing spoken words: The importance of word onsets. Journal of Experimental Psychology: Human Perception and Performance 15. 576–585.
Martínez-Celdrán, Eugenio, Ana M. Fernández-Planas & Josefina Carrera-Sabaté. 2003. Castilian Spanish. Journal of the International Phonetic Association 33. 255–259.
Massaro, Dominic W. & Michael M. Cohen. 1983. Phonological constraints in speech perception. Perception and Psychophysics 34. 338–348.
Mattys, Sven L. & Peter W. Jusczyk. 2001. Phonotactic cues for segmentation of fluent speech by infants. Cognition 78. 91–121.
Mattys, Sven L., Peter W. Jusczyk, Paul A. Luce & James L. Morgan. 1999. Phonotactic and prosodic effects on word segmentation in infants. Cognitive Psychology 38. 465–494.
McQueen, James. 1998. Segmentation of continuous speech using phonotactics. Journal of Memory and Language 39. 21–46.
McQueen, James, Takashi Otake & Anne Cutler. 2001.
Rhythmic cues and possible-word constraints in Japanese speech segmentation. Journal of Memory and Language 45. 103–132.
Moreton, Elliott. 2002. Structural constraints in the perception of English stop-sonorant clusters. Cognition 84. 55–71.
Moreton, Elliott & Shigeaki Amano. 1999. Phonotactics in the perception of Japanese vowel length: Evidence for long-distance dependencies. In Proceedings of the 6th European Conference on Speech Communication and Technology, Budapest, Hungary.
Onishi, Kristine H., Kyle E. Chambers & Cynthia Fisher. 2002. Learning phonotactic constraints from brief auditory exposure. Cognition 83. 13–23.
Ostapenko, Olesya. 2005. The optimal L2 Russian syllable onset. In Linguistics Students Organization Working Papers in Linguistics 5: Proceedings of the Workshop in General Linguistics 2005, 140–151. Madison, WI: Department of Linguistics, University of Wisconsin-Madison.
Pensado, Carmen. 1985. On the interpretation of the non-existent: Nonoccurring syllable types in Spanish phonology. Folia Linguistica 19. 313–320.
Peperkamp, Sharon & Emmanuel Dupoux. 2003. Reinterpreting loanword adaptations: The role of perception. Proceedings of the 15th International Congress of Phonetic Sciences, 367–370.
Pertz, Doris L. & Thomas G. Bever. 1975. Sensitivity to phonological universals in children and adolescents. Language 51. 149–162.
Pitt, Mark A. 1998. Phonological processes and the perception of phonotactically illegal consonant clusters. Perception and Psychophysics 60. 941–951.
Praamstra, Peter, Antje S. Meyer & Willem J. M. Levelt. 1994. Neurophysiological manifestations of phonological processing: Latency variations of a negative ERP component time-locked to phonological mismatch. Journal of Cognitive Neuroscience 6. 204–219.
Prince, Alan & Paul Smolensky. 1993. Optimality theory: Constraint interaction in generative grammar.
Technical Report, Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ, and Computer Science Department, University of Colorado, Boulder.
Prince, Alan & Bruce Tesar. 2004. Learning phonotactic distributions. In R. Kager, J. Pater & W. Zonneveld (eds.), Constraints on phonological acquisition. Cambridge: Cambridge University Press.
Quilis, Antonio & Joseph A. Fernández. 1992. Curso de Fonética y Fonología Españolas. Madrid: Consejo Superior de Investigaciones Científicas.
Rochet, Bernard L. & Anne Putnam Rochet. 1999. Effects of L1 phonotactic constraints on L2 speech perception. In J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville & A. Baily (eds.), Proceedings of the 14th International Congress of Phonetic Sciences, 1443–1446. Berkeley, CA: University of California.
Saffran, Jenny R. & Erik D. Thiessen. 2003. Pattern induction by infant language learners. Developmental Psychology 39. 484–494.
Scheer, Tobias. 2000. De la localité, de la morphologie et de la phonologie en phonologie. Thèse d’Habilitation, Université de Nice.
Scholes, Robert J. 1966. Phonotactic grammaticality. The Hague: Mouton and Co.
Ševeleva, M. S. 1974. Obratnyj Slovar’ Russkogo Jazyka (Reverse dictionary of the Russian language). Moscow: Sovetskaja Enciklopedija.
Silverman, Daniel. 1992. Multiple scansions in loanword phonology: Evidence from Cantonese. Phonology 9. 289–328.
Smolensky, Paul. 1996. The initial state and ‘richness of the base’ in optimality theory. Technical Report, Department of Cognitive Science, Johns Hopkins University.
Stone, Gregory O. & Guy C. van Orden. 1993. Strategic control of processing in word recognition. Journal of Experimental Psychology: Human Perception and Performance 19. 744–774.
Suomi, Kari, James M. McQueen & Anne Cutler. 1997. Vowel harmony and speech segmentation in Finnish. Journal of Memory and Language 36. 422–444.
Taylor, Conrad F. & George Houghton. 2005.
Learning artificial phonotactic constraints: Time course, durability, and relationships to natural constraints. Journal of Experimental Psychology: Learning, Memory, and Cognition 31. 1398–1416.
Taylor, W. L. 1953. Cloze procedure: A new tool for measuring readability. Journalism Quarterly 30. 414–438.
Tesar, Bruce & Paul Smolensky. 2000. Learnability in Optimality Theory. Cambridge, MA: MIT Press.
Trommelen, Mieke. 1983. The syllable in Dutch; with special reference to diminutive formation. Dordrecht: Foris.
Vitevitch, Michael S. & Paul A. Luce. 1998. When words compete: Levels of processing in perception of spoken words. Psychological Science 9. 325–329.
Vitevitch, Michael S. & Paul A. Luce. 1999. Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language 40. 374–408.
Vroomen, Jean, Jyrki Tuomainen & Beatrice de Gelder. 1998. The roles of word stress and vowel harmony in speech segmentation. Journal of Memory and Language 38. 133–149.
Weber, Andrea & Anne Cutler. 2006. First-language phonotactics in second-language listening. Journal of the Acoustical Society of America 119. 597–607.
Weinberger, Steven. 1988. Theoretical foundations of second language phonology. PhD dissertation, University of Washington.
Wheeler, Max. 1979. The Phonology of Catalan. Oxford: Blackwell.
Yip, Michel C. W. 1993. Cantonese loanword phonology and optimality theory. Journal of East Asian Linguistics 2. 261–291.
Zonneveld, Wim. 1983. Lexical and phonological properties of Dutch voicing assimilation. In M. Van den Broeke, V. Van Heuven & W. Zonneveld (eds.), Sound structures, studies for Antonie Cohen, 297–312. Dordrecht: Foris.
Submitted 10 June 2008
Final version accepted 28 April 2009

APPENDIX A
Target Nonwords in the Word-Likeness Judgment Task

Target Part  Cluster Type  Cluster  Monosyllable  Monosyllable  iamb      trochee
Onset        1             kt       ktɑm          ktɔl          ktope     ktela
Onset        1             rt       rtɑn          rtεk          rtomun    rtono
Onset        1             tk       tke           tkɔl          tkoman    tkari
Onset        1             xm       xmɑt          xmεn          xmotun    xmado
Onset        1             zb       zbal          zbɔt          zbomεl    zbeli
Onset        1             zd       zdεk          zdur          zdaman    zdolu
Onset        1             zl       zlεn          zlɔm          zlaton    zlara
Onset        1             zn       znεr          znus          znilɔn    znuri
Onset        1             fpr      fprɑn         fprel         fpriton   fprani
Onset        1             fsp      fspɑm         fspe          fspatan   fspudo
Onset        1             fst      fstɑm         fstur         fstiman   fstano
Onset        1             skl      sklɑn         sklir         sklomεl   sklida
Onset        1             zdr      zdrɑn         zdre          zdromun   zdromo
Onset        1             fspl     fsplar        fsplur        fsploda   fsplonir
Onset        1             fstr     fstrɑn        fstruk        fstreni   fstrinεl
Onset        2             sl       slɑt          slεn          slomɔn    sladi
Onset        2             sm       smn           smɔt          smatan    smoni
Onset        2             sn       snɑt          snɔk          snamɔn    snoda
Onset        2             st       stim          stuf          stame     stano
Onset        2             spl      splir         splur         spliton   spledi
Onset        2             str      strɔl         strun         striman   strano
Onset        3             fl       flɔs          fle           flatun    flira
Onset        3             pr       prɔn          prεn          proman    prano
Onset        3             tr       trεl          trun          trilan    traka
Coda         1             zm       pɑzm          tεzm          larɔzm    lirɑzm
Coda         1             nsk      lɑnsk         mɔnsk         tolεnsk   dolɑnsk
Coda         1             rsk      mɔrsk         pεrsk         pimεrsk   teprsk
Coda         1             stf      lɑstf         mɔstf         kapɔstf   pakɔstf
Coda         1             str      kεstr         lɔstr         tonεstr   molɑstr
Coda         2             kt       kεkt          mɑkt          rotɔkt    rotɑkt
Coda         2             nt       dɔnt          rnt           litεnt    melɔɔnt
Coda         2             rk       fεrk          trk           palɔrk    latrk
Coda         2             rm       lεrm          tɔrm          tilɔrm    kedɔrm
Coda         2             rs       dɑrs          lɔrs          tanεrs    makɔrs
Coda         2             rt       kεrt          pεrt          monɔrt    rokart
Coda         3             ls       mɔls          rεls          sirɔls    tarils
Coda         3             ns       mɔns          tɔns          ramεns    dalons

APPENDIX B
Target Nonwords in the Lexical Decision Task

Target Part  Cluster Type  Cluster  Monosyllable  iamb      trochee
Onset        1             kt       kte           ktilon    ktepo
Onset        1             rt       rtɑl          rtamεl    rtado
Onset        1             tk       tkεn          tkamun    tkali
Onset        1             xm       xmɑn          xmitan    xmopa
Onset        1             zb       zbi           zbatan    zbemo
Onset        1             zd       zde           zdalun    zdeno
Onset        1             zl       zlɑp          zlinal    zlita
Onset        1             zn       znɑp          znimon    znoki
Onset        1             fpr      fprɔs         fprolan   fpreda
Onset        1             fsp      fspεn         fspiran   fspako
Onset        1             fst      fstɔs         fstamεl   fstumi
Onset        1             skl      sklεn         sklaton   skloda
Onset        1             zdr      zdren         zdronal   zdrena
Onset        1             fspl     fsplɔt        fsplimεl  fsplamo
Onset        1             fstr     fstrak        fstromir  fstrola
Onset        2             sl       slam          sliman    sloko
Onset        2             sm       smεr          smilan    smida
Onset        2             sn       snεp          snitan    snado
Onset        2             st       stun          stolir    stoda
Onset        2             spl      splɔt         spliret   splino
Onset        2             str      strɔs         strale    stropa
Onset        3             fl       flar          flaman    flemi
Onset        3             pr       prɔt          pramɔl    praki
Onset        3             tr       tros          trone     tralu
Coda         1             zm       dɑzm          tilεzm    pelzm
Coda         1             nsk      rεnsk         satɔnsk   ratεnsk
Coda         1             rsk      trsk          molεrsk   kodεrsk
Coda         1             stf      lɔstf         tonεstf   tatɔstf
Coda         1             str      dεstr         rimɔstr   dolɑstr
Coda         2             kt       rɑkt          lamɔkt    lotɑkt
Coda         2             nt       dɑnt          kolɔnt    pokɑnt
Coda         2             rk       dɔrk          timɑrk    kemrk
Coda         2             rm       pεrm          minɔrm    dedrm
Coda         2             rs       pɑrs          tilɔrs    takɔrs
Coda         2             rt       lεrt          pamεrt    lodεrt
Coda         3             ls       kεls          panɔls    merls
Coda         3             ns       kɔns          palɔns    ladɔns