Publisher Psychology Press
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Language Acquisition
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t775653668
The Acquisition of Subset and Superset Phonotactic Knowledge in a Second
Language
Mirjam Trapman (a); René Kager (b)
(a) University of Amsterdam; (b) Utrecht University
Online Publication Date: 01 July 2009
To cite this Article: Trapman, Mirjam and Kager, René (2009) 'The Acquisition of Subset and Superset Phonotactic Knowledge in a
Second Language', Language Acquisition, 16:3, 178–221
To link to this Article: DOI: 10.1080/10489220903011636
URL: http://dx.doi.org/10.1080/10489220903011636
Language Acquisition, 16:178–221, 2009
Copyright © Taylor & Francis Group, LLC
ISSN: 1048-9223 print/1532-7817 online
DOI: 10.1080/10489220903011636
ARTICLE
The Acquisition of Subset and Superset
Phonotactic Knowledge in a Second Language
Mirjam Trapman
University of Amsterdam
René Kager
Utrecht University
Can second language (L2) learners acquire a grammar that allows a subset of the structures allowed
by their native grammar? This question is addressed here with respect to acquisition of phonotactics.
On the assumption that the L2 initial state equals the native grammar’s final state, learnability theory
would predict that a lack of negative evidence for phonotactic structures that are illegal in the target
language precludes acquisition of the target grammar. This prediction is tested for L1-Russian
(superset) and L1-Spanish (subset) L2 learners of Dutch by means of word-likeness judgments and
lexical decision experiments. Participants responded to nonwords containing consonant clusters
in onsets and codas that are legal (1) only in Russian, (2) only in Russian and Dutch, or (3) in
all three languages. The results converge to show that advanced L1-Russian and L1-Spanish L2
learners possess native-like phonotactic knowledge. Analysis shows that this knowledge cannot
be attributed to transfer of lexical statistics from the native language. The results suggest that L2
phonotactic acquisition is not affected by subset/superset relations between the native language and
target language. Some possible explanations for our findings are discussed.
1. INTRODUCTION
Phonologies of natural languages differ not only in terms of phoneme inventories, but also
in terms of phoneme distributions. Phonotactic constraints state positional restrictions on
speech sounds, typically with respect to the syllable. Constraints may vary in strength or
Correspondence should be sent to Mirjam Trapman, Amsterdam Center for Language and Communication,
University of Amsterdam, Spuistraat 210, 1012 VT Amsterdam, The Netherlands. E-mail: [email protected]. René
Kager, Utrecht Institute of Linguistics/OTS, Utrecht University, Janskerkhof 13, 3512 BL Utrecht, The Netherlands.
ranking, providing a major source of cross-linguistic variation in syllable inventories. Studies
of phonological markedness show that the presence of marked structures in a language strongly
predicts the presence of less-marked structures (Jakobson 1941/1968; Greenberg 1965; Prince
& Smolensky 1993; Blevins 1995). For example, languages that allow complex onsets universally
also allow simple onsets, languages that allow plosive-nasal onsets also allow plosive-liquid
onsets, etc. Syllable typology is an area particularly rich in inclusion relations between
inventories. Inclusion relations between inventories can be recursive, so that the inventory of
Language A may be a proper subset of the inventory of Language B, which itself is a proper
subset of the inventory of Language C. For example, the small set of syllable onsets that are
legal in Spanish is properly included in the somewhat larger set of legal Dutch onsets, which
is itself properly included in the even larger Russian set, as will be shown later. The general issue
that we address here is whether such ‘stringency relations’ between grammars affect naturalistic
phonotactic acquisition. In particular, we investigate whether phonotactic acquisition in a second
language depends on the learner’s initial state, which may either define a proper subset or
a superset of the phonotactic structures allowed by the target language. Learnability theory
predicts a strong asymmetry between these two initial states, such that successful phonotactic
acquisition should only be possible starting from a subset initial state. This is because only
learners who acquire phonotactic superset grammars receive positive evidence in their input.
Before developing this prediction, we will first discuss phonotactic knowledge from the dual
perspective of language processing and acquisition.
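The inclusion relations described in this paragraph can be made concrete with a small sketch. The inventories below are toy fragments chosen purely for illustration (an assumption, not the full onset inventories of the three languages), and the function name `initial_state_type` is hypothetical.

```python
# Toy onset inventories: illustrative fragments only, not complete inventories.
SPANISH = frozenset({"fl", "dr", "kr", "bl"})
DUTCH = SPANISH | {"sp", "sn", "sm", "kn"}   # adds onsets absent from Spanish
RUSSIAN = DUTCH | {"tl", "dl"}               # adds onsets absent from Dutch


def initial_state_type(l1, target):
    """Classify the L1 initial state by its inclusion relation to the target."""
    if l1 < target:        # L1 allows a proper subset of the target structures
        return "subset initial state"
    if target < l1:        # L1 allows a proper superset of the target structures
        return "superset initial state"
    if l1 == target:
        return "identical inventories"
    return "no inclusion relation"


print(initial_state_type(SPANISH, DUTCH))   # subset initial state
print(initial_state_type(RUSSIAN, DUTCH))   # superset initial state
```

Under these toy inventories, an L1-Spanish learner of Dutch starts from a subset initial state and an L1-Russian learner of Dutch from a superset initial state.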
Phonotactic constraints are not merely notational devices helpful for language description;
they possess psychological reality. The assumption that native speakers possess phonotactic
knowledge of their language is supported by classical types of evidence: loanword adaptations
and well-formedness ratings of nonwords. Loanwords tend to be ‘repaired’ by processes such as
vowel epenthesis making the resulting forms conform to native phonotactics (Silverman 1992;
Yip 1993; Peperkamp & Dupoux 2003; Davidson 2007). Native speakers’ ability to judge
the phonotactic well-formedness (word-likeness) of nonwords is documented by many studies
(Scholes 1966; Berent & Shimron 1997; Bailey & Hahn 2001; Frisch & Zawaydeh 2001;
Coetzee 2004, 2008, 2009). Well-formedness judgments have been found to be gradient and to
correlate with measures of phonotactic probability in the lexicon (Coleman & Pierrehumbert
1997; Frisch, Large & Pisoni 2000; Bailey & Hahn 2001; Hay, Pierrehumbert & Beckman
2004; Coetzee 2008; Albright 2009), suggesting that phonotactic knowledge, partially or
entirely, emerges from distributions in the lexicon. However, native speakers are able to
distinguish degrees of well-formedness between phonotactic structures that are illegal in the
native language (Pertz & Bever 1975; Berent, Steriade, Lennertz & Vaknin 2007; Coetzee 2008;
Berent, Lennertz, Smolensky & Vaknin 2009). This finding cannot be explained by the
hypothesis that phonotactic knowledge is learned only from exposure to lexical distributions,
and suggests that such knowledge also has a basis in universal constraints.
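To illustrate what a lexicon-based measure of phonotactic probability might look like, the following is a minimal sketch that scores a nonword by the smoothed mean log probability of its biphones. It is a deliberately simplified stand-in, not the measure used in any of the studies cited above; the toy lexicon, the one-character-per-segment spelling, the add-one smoothing, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter


def biphones(word):
    """Return the biphones of a word, with '#' marking the word edges."""
    padded = "#" + word + "#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def biphone_counts(lexicon):
    """Count biphone type frequencies over a list of word forms."""
    counts = Counter()
    for word in lexicon:
        counts.update(biphones(word))
    return counts


def score_nonword(nonword, counts, alpha=1.0):
    """Mean log probability of the nonword's biphones, with add-alpha smoothing."""
    total = sum(counts.values())
    n_types = len(counts) + 1          # one extra slot for unseen biphones
    logps = [
        math.log((counts[b] + alpha) / (total + alpha * n_types))
        for b in biphones(nonword)
    ]
    return sum(logps) / len(logps)


# Hypothetical toy lexicon, one character per segment.
counts = biphone_counts(["blok", "bril", "snel", "smal", "knal"])
print(score_nonword("blal", counts) > score_nonword("tlal", counts))  # True
```

A purely lexical model of this kind assigns every unattested biphone the same smoothed score, which is why graded judgments among unattested structures are hard to derive from lexical statistics alone.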
The psychological reality of phonotactic knowledge is supported by a range of evidence
from speech production and perception. For production, the classical finding is that speech
errors tend to result in structures that are phonotactically legal (Dell, Reed, Adams & Meyer
2000; Goldrick 2004). Phonotactically legal nonwords are also produced faster and more
accurately than phonotactically illegal nonwords (Vitevitch & Luce 1998). Turning to the
domain of speech perception, an increasing number of studies show phonotactic influences
on native listeners’ responses in tasks such as lexical decision (Praamstra, Meyer & Levelt
1994; Vitevitch & Luce 1999; Berent, Marcus, Shimron & Gafos 2002; Kager & Shatzman
2007; Coetzee 2008) and word-spotting (McQueen 1998; Suomi, McQueen & Cutler 1997;
Vroomen, Tuomainen & de Gelder 1998; McQueen, Otake & Cutler 2001; Weber & Cutler
2006). Moreover, native phonotactic constraints shape perception by filtering out segmental
sequences that are illegal in the native language, affecting phoneme identification (Massaro &
Cohen 1983; Hallé, Segui, Frauenfelder & Meunier 1998; Pitt 1998; Moreton & Amano 1999;
Moreton 2002; Coetzee 2005) and the perception of syllables, involving ‘perceptual epenthesis’
(Dupoux, Kakehi, Hirose, Pallier & Mehler 1999; Dupoux, Pallier, Kakehi & Mehler 2001;
Berent, Steriade, Lennertz & Vaknin 2007; Kabak & Idsardi 2007). This growing body of
studies offers converging evidence that native speakers and listeners possess implicit knowledge
of the phonotactic constraints of their native language and use this knowledge for processing.
Phonotactic knowledge of the native language develops early, taking off during the first year
of life. In perception experiments, 6-month-old infants respond differently to speech which
contains phoneme sequences conforming to native phonotactics than to speech containing
phonotactically illegal sequences (Friederici & Wessels 1993; Jusczyk, Friederici, Wessels,
Svenkerud & Jusczyk 1993; Jusczyk, Luce & Charles-Luce 1994). Younger infants, aged
3 months, do not show differential responses to phonotactically legal and illegal sequences.
These studies suggest that native phonotactic knowledge begins developing during the first
year, possibly to assist the segmentation of speech. It has been hypothesized that infants use
phonotactics in continuous speech to start tackling the problem of where word boundaries fall,
which would assist them in setting up an initial lexicon (Mattys, Jusczyk, Luce & Morgan
1999; Mattys & Jusczyk 2001).
Much research has addressed the question of how the acquisition of novel phonotactic
knowledge is affected by native phonotactic knowledge, by studying phonotactic acquisition
in a second language (henceforth, L2). A crucial difference from native (henceforth, L1)
phonotactic acquisition resides in the circumstance that at the onset of L2 acquisition, a full-fledged phonotactic grammar has already been acquired. Many studies have found pervasive
effects of L1 phonotactics on the L2. L2 learners repair nonnative structures in their productions,
resulting in outputs meeting phonotactic constraints of their L1. For example, Korean L2
learners of English simplify consonant clusters that are illegal in their L1 (Broselow & Finer
1991), while English L2 learners of Russian simplify complex syllable onsets (Ostapenko 2005).
Strategies to adjust L1-illegal consonant clusters are multiple, and include vowel epenthesis,
consonant deletion, and metathesis. Vowel epenthesis is often applied by L2 learners to adjust
L1-illegal complex onsets (Broselow 1987; Hancin-Bhatt & Bhatt 1997). However, advanced
learners also apply consonant deletion (Ostapenko 2005). The repair of L1-illegal structures
has been found to depend on their degree of markedness. A number of studies show that more
production errors occur in L1-illegal forms that are more marked than in L1-illegal forms that
are less marked (Eckman 1987; Carlisle 1988, 1998).1 For example, Carlisle (1988) showed
that relatively unmarked obstruent-liquid onsets are modified less often than more marked
obstruent-nasal clusters even when neither type of structure was present in the learner’s L1.
1 This result corresponds to similar findings by Davidson (2003) and Haunz (2002), who found that native speakers
of English had more difficulties pronouncing more marked than less marked nonattested clusters. The similarity
between these native speakers and beginning L2 learners is that both groups have had hardly any input in the foreign
language.
Carlisle (1998) examined the acquisition of English onsets by native speakers of Spanish in a
longitudinal study, comparing CC-onsets (/sp/ and /sk/) with more marked CCC-onsets (/spr/
and /skr/) and found that the less-marked onsets /sp/ and /sk/ were produced correctly more
often than more marked tri-consonantal onsets. Ostapenko (2005) investigated influences of
markedness through the sonority sequencing principle, finding that Russian consonant clusters
that violate this principle are difficult to acquire for English learners.
Most studies of L2 phonotactics are based on production data (Eckman 1977; Broselow
1987; Weinberger 1988; Broselow & Finer 1991; Hancin-Bhatt & Bhatt 1997; Broselow, Chen
& Wang 1998; Carlisle 1998; Hancin-Bhatt 2000; Haunz 2002; Davidson 2003; Ostapenko
2005, among others). Although production data are indicative of phonological development,
exclusive reliance on such data carries a certain risk of failure to distinguish the acquisition of
representational phonotactic knowledge from learning the motoric skills which are needed
to produce nonnative sound sequences. Hence, incomplete development of motoric skills
may be misinterpreted as a lack of development of native-like phonological representations.
For this reason, L2 production studies may underestimate the development of phonological
representations. For a finer-grained understanding of L2 phonological development, perception
studies seem to be called for.
As compared to the large number of L2 production studies, the number of studies addressing
L2 phonotactic knowledge in perception is limited. Most address the influence of phonotactic
contexts on segmental perception (Rochet & Rochet 1999; Harnsberger 2001; Levy & Strange
2008). In a study focusing on the role of native phonotactic constraints in nonnative listening,
Weber & Cutler (2006) showed that native constraints influence the segmentation of a nonnative
language even in highly advanced learners. Using a word-spotting task, Weber & Cutler
found that advanced L2 learners of English whose L1 was German used native phonotactic
constraints (such as the ban on */sp/ in word onsets) to locate word boundaries in spoken
English. This finding suggests that native phonotactic constraints are difficult to suppress in
nonnative listening. At the same time, Weber & Cutler’s study offers evidence of L2 phonotactic
development. The L2 listeners used not only L1 constraints for segmenting English, but also
constraints of the target language. More specifically, constraints that hold for English, not for
German, such as the ban on */ʃl/ in word onsets, facilitated the spotting of English words.
This suggests that advanced L2 learners acquire phonotactic constraints banning structures that
are legal in their L1. This is a remarkable finding, since L2 learners receive no direct negative
evidence that English words cannot start in */ʃl/.
Altenberg (2005) investigated well-formedness judgments, perception, and production of
English consonant clusters by L1-Spanish L2 learners of English. Participants rated written
nonwords in two versions, which were presented to them as new words of English and
Spanish, respectively. Three types of nonwords occurred: type A contained initial clusters that
are grammatical in both English and Spanish (e.g., /fl, dr, kr, bl/); type B, initial clusters
that are grammatical in English but not in Spanish (e.g., /sp, sm, sn, sl/), while type C
contained initial clusters that are grammatical in neither English nor Spanish (e.g., /sr, zn,
dl, fn/). Native participants and L2 learners made highly similar judgments in the English
version, suggesting that L2 learners had acquired native-like phonotactic knowledge. The L2
learners rated nonwords in the Spanish version according to their phonotactic well-formedness
in the L1. In a subsequent perception task, no significant differences emerged between native
participants and L2 learners on the orthographic identification of initial clusters in type A and
B items, and hence, no evidence for transfer was found. In Altenberg’s study, the phonotactic
target grammar (English) was a superset of the native grammar (Spanish). From the viewpoint
of learnability, this result should not come as a surprise because learners receive abundant
positive evidence for the well-formedness of /sC/ initial clusters in English. Since nonwords in
the well-formedness judgment task were presented orthographically, it remains unclear to what
extent well-formedness responses reflected the acquisition of clusters in orthography, rather
than perception. Hence, the question of whether L2 learners are able to acquire native-like
phonotactic responses to spoken language data remains unanswered.
In sum, only a few studies have addressed the role of L2 phonotactic constraints in perception and virtually none have addressed the development of L2 phonotactics on the basis of
perception data or well-formedness judgments. Most studies take only the L2 final state into
consideration, without considering the issue of development.
Also relevant, but less directly so, are studies showing that infants and adults can learn
novel phonotactic constraints from exposure to artificial languages (Onishi, Chambers & Fisher
2002; Chambers, Onishi & Fisher 2003; Saffran & Thiessen 2003).2 Target constraints in these
studies are novel to learners as they rule out structures that are legal in the participants’ native
language. Although these studies seldom explicitly address relations between the target patterns
in the artificial language and L1 phonotactics, their results may be interpreted as offering some
evidence that novel phonotactic constraints can be learned which define a phonotactic subset of
the native grammar. However, since they involve individual constraints rather than phonotactic
grammars, artificial language learning studies have only remote significance for the naturalistic
acquisition of L2 phonotactic knowledge.
Two specific scenarios of phonotactic L2 acquisition will be addressed here, which arise
when the structures that are phonotactically legal under the native and target grammar are
related by a proper inclusion. The target language structures are phonotactically a superset or
a subset of native language structures. We will call these the superset and subset scenarios,
respectively, after Berwick (1985).3
The first scenario is that of the phonotactic structures allowed by the target language being a
superset of those allowed by the L1, so that the target language is phonotactically more lenient
than the L1. More precisely, phonotactic structures that are illegal in the native language
are legal in the target language, while no phonotactic structures that are legal in the native
language are illegal in the target language. Studies reviewed above show that L2-learners are
able to overcome the effects of a more restrictive L1 on production, and eventually suppress the
phonotactic repairs that are characteristic of early stages of L2 production. This interpretation
is supported by the limited amount of evidence available from well-formedness judgments
(Altenberg 2005). A different interpretation arises from L2 perception, since Weber & Cutler
(2006) found that even advanced learners display an influence of L1-specific phonotactic
constraints on their segmentation of the nonnative language. In sum, while production studies
and well-formedness judgments suggest that successful phonotactic acquisition in a superset
scenario is possible, evidence from perception offers a more nuanced picture.
2 Artificial language learning studies have also revealed effects on learners’ production (Taylor & Houghton 2005)
or perception (Davidson, Shaw & Adams 2007).
3 Escudero & Boersma (2002) address the subset scenario in L2 segmental acquisition, focusing on the special case
of two vowels in the L2 (Spanish) being mapped onto three vowels in the L1 (Dutch).
The subset scenario, that of the target language being phonotactically more stringent than the
L1, has received virtually no attention in the phonotactic acquisition literature. This scenario
occurs when the native grammar allows a set of legal structures (for example, consonant
clusters in syllable onsets), which are illegal in the target grammar, while structures that are
legal in the target grammar are also legal in the native grammar. Hence, the set of phonotactic
structures that is allowed by the target language is a subset of those allowed by the L1. It
should be noted a priori that the acquisition of subset phonotactics may have fewer observable
consequences in production and perception than the acquisition of superset phonotactics. On
the production side, there will usually be no overt behavioral evidence showing that learners
have developed a target grammar that is more stringent than their L1, because the surplus of
L1 structures will not impede their L2 production. This invisibility problem has a counterpart
on the perception side, where L1-based ability to perceive a superset of structures should have
few if any detrimental effects on L2 perception. In sum, the effects of a subset scenario on L2
production and perception may appear to be marginal, and on top of that, difficult to observe;
hence it should not come as a surprise that few studies have addressed them. Nevertheless,
Weber & Cutler (2006) show that advanced L2 learners segment the target language using
phonotactic constraints which hold for the target language, but not the L1. Phonotactically,
German and English do not stand in a subset-superset relation in the specific sense previously
defined, and hence these results cannot be interpreted as showing that L2 phonotactic acquisition
occurs in a subset scenario. Hence, the issue of whether L2 learners are able to acquire a
phonotactic subset grammar is still open. This issue is interesting from a learnability viewpoint,
as will be argued later.
Against this background, we can now state the following research questions:
Q1a. Can L2 learners acquire phonotactic knowledge under a superset scenario, that is, in
case the target grammar defines a superset of the phonotactic structures which are
legal in the L1?
Q1b. Do superset learners show development, such that advanced learners possess more
native-like phonotactic knowledge of the target language than beginning learners?
Q2a. Can L2 learners acquire phonotactic knowledge under a subset scenario, that is, in
case the target grammar defines a proper subset of the phonotactic structures which
are legal in the L1?
Q2b. Do subset learners show development, such that advanced learners possess more
native-like phonotactic knowledge of the target language than beginning learners?
In line with many studies on L2 phonology, we make the important assumption of transfer,
which is supported by a wide range of evidence from production and perception, as previously
reviewed. We adopt a grammatical interpretation, such that the initial state of the L2 grammar
equals the final state of the L1 grammar (Broselow, Chen & Wang 1998; Escudero 2005).
Given the assumption of transfer, specific hypotheses about the L2 acquisition of phonotactics in the superset and subset scenarios can be derived from the theory on learnability of
grammars. The subset problem (Baker 1979; Angluin 1980) states that a learner who adopts a
superset grammar, which allows a superset of structures allowed by the target grammar, will
be unable to return to a more restrictive grammar unless corrected by negative evidence about
the target language. Since it is assumed that negative evidence is not available to learners, or
at least rarely so under naturalistic conditions of language acquisition, the subset problem is
interpreted in the learnability literature as implying that “the misstep of choosing a superset
grammar makes the subset grammar unreachable (from positive evidence)” (Prince & Tesar
2004).4 When one adopts the additional assumption that phonotactic knowledge is grammatical
in nature (like syntactic or semantic knowledge), the subset problem naturally extends to
phonotactic acquisition, as has been explicitly argued in the phonological learnability literature
(Smolensky 1996; Prince & Tesar 2004; Hayes 2004). The particular case that we address, of
an L2 initial state that happens to constitute a superset of the target grammar, should, likewise,
be an insurmountable obstacle to the learner. Hence, we can state this as our first hypothesis:
H1 L2 learners can attain the target phonotactic grammar when starting from an L1 subset
initial state, but not from an L1 superset initial state.
Our second hypothesis spells out the consequences of successful L2 phonotactic acquisition
in terms of development:
H2 Successful L2 acquisition of phonotactics should be subject to development between
the initial and final states of the grammar; consequently, L2 learners should become
more native-like in their phonotactic responses.
Answers to our research questions can be predicted from our two hypotheses as follows.
Superset learners face the task of acquiring a target grammar which allows a superset of
the phonotactic structures of their native language. These learners receive positive evidence in
the form of words containing L1-illegal structures. Positive evidence allows superset learners
to adjust their initial state by relaxing or demoting the relevant phonotactic constraints, moving
their grammar closer to the target grammar. Successful acquisition of the target grammar
should eventually occur under these circumstances; hence, Hypothesis 1 predicts that question 1a should be answered positively. Likewise, question 1b should be positively answered as
Hypothesis 2 states that phonotactic development should occur.
In contrast, subset learners face the task of acquiring a target grammar that allows a subset
of the phonotactic structures of their native language. These learners will receive no negative
evidence about the ill-formedness of phonotactic structures that are illegal in the target language.
Due to a lack of relevant input, these learners should not be able to adjust their initial state and
hence, acquisition of the target grammar is expected to be impossible, rendering the answer to
question 2a negative. Since there is no evidence available, subset learners should not get closer
to the L2 grammar. So question 2b should be negatively answered as well.
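The asymmetry predicted by Hypothesis 1 can be sketched with a toy learner that starts from its L1 inventory and can only add structures on the basis of positive evidence. This illustrates the learnability prediction only, not the outcome of the experiments; the inventories are illustrative fragments and the function name is hypothetical.

```python
def learn_from_positive_evidence(initial_state, target_input):
    """Return the grammar reached when the learner can only add structures."""
    grammar = set(initial_state)
    for cluster in target_input:
        if cluster not in grammar:
            grammar.add(cluster)   # positive evidence licenses a new structure
    # Nothing in the input signals that a structure is illegal,
    # so no structure is ever removed.
    return grammar


# Toy onset inventories (illustrative fragments, not complete inventories).
spanish = {"fl", "dr", "kr", "bl"}
dutch = spanish | {"sp", "sn", "sm", "kn"}
russian = dutch | {"tl", "dl"}

# The input is idealized as one exposure to every legal Dutch cluster.
superset_learner = learn_from_positive_evidence(spanish, sorted(dutch))
subset_learner = learn_from_positive_evidence(russian, sorted(dutch))

print(superset_learner == dutch)   # True: L1-Spanish learner reaches the target
print(subset_learner == dutch)     # False: L1-Russian learner keeps /tl, dl/
```

The L1-Russian learner in this sketch ends where it started, because the Dutch input contains nothing that contradicts the surplus L1 structures.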
In order to test our hypotheses, we set up a study with L2 learners of Dutch. As section 2
will show, consonant clusters in Dutch word onsets and codas form a proper subset of those of
Russian, and hence, Russian L2 learners of Dutch phonotactics face a subset scenario. Since
consonant clusters that are legal in word onsets and codas in Dutch form a superset of those in
Spanish, it follows that Spanish L2 learners of Dutch phonotactics face the superset scenario.
4 In response to the subset problem, the Subset Principle was proposed by Berwick (1985) stating that if the learner
is faced with a choice between a set of different grammars that all account for the input data seen thus far, the
learner should always adopt the most restrictive grammar (i.e., the subset grammar), which is the one that is most
easily falsified by positive input data.
Our study includes two groups of L2 learners of Dutch: superset learners with Spanish as
their L1 and subset learners with Russian as their L1, plus a control group of Dutch native
speakers. Each learner group is divided into two subgroups of advanced and beginning
learners. Proficiency levels were included for the two language groups for different reasons.
In the case of L1-Spanish superset learners, for whom successful phonotactic acquisition is
predicted, a comparison of proficiency groups allows us to test the case for phonotactic acquisition through evidence from phonotactic development. If development occurs as predicted,
this might rule out alternative accounts of native-like responses of L2 learners that might be
based on an overlap between the final states of the phonotactic grammars of Spanish and
Dutch. For Russian subset learners, for whom we predict no acquisition, proficiency groups
were included only as a consistency check on the data: no phonotactic differences should
occur between low-proficient and high-proficient learners. If we were to find that L1-Russian
learners show no native-like responses, this would not suffice to rule out that they were acquiring
Dutch phonotactics as it might be the case that development takes place, albeit slowly. The
data from participants of two proficiency levels may serve to monitor development. Finally, it
deserves mentioning that this study was not designed to compare the phonotactic acquisition
of subset and superset learners directly but only to test predictions about the phonotactic
acquisition for each group of learners separately. A comparative aim would be more ambitious,
but also necessitate controls of proficiency levels between participants in the subset and superset
conditions, which did not occur in the present study.
We selected two methods to test the predictions. First, we elicited word-likeness judgments,
in which participants rated spoken nonwords with varying degrees of phonotactic legality
on a seven-point scale. Second, we included a lexical decision task, measuring response
latencies and error rates for nonwords. This task reflects online lexical processing, and for this
reason it has the advantage of being less vulnerable to being dominated by (semi)-conscious
response strategies that participants might develop than the classical word-likeness task, which
is essentially a meta-linguistic assessment.
This article is organized as follows. Section 2 contains a detailed overview of the word
margin (onset and coda) consonant cluster phonotactics of the three languages that figure in
this study: Dutch, Russian, and Spanish. This section has the main goal of demonstrating
that with respect to consonant clusters in syllable margins, the three languages are in subset-superset relations, with Spanish margins being a subset of the margins in the other two
languages, and Dutch margins being a subset of Russian margins. Section 3 presents the
results of an experiment in which word-likeness ratings of nonwords were elicited from Dutch
native listeners, as well as from (more and less advanced) L1-Russian and L1-Spanish second
language learners of Dutch. Section 4 presents a lexical decision experiment in which nonwords
(now mixed with real words) were presented to the same five groups of participants. Section 5
contains a general discussion of the results.
2. WORD MARGINS IN DUTCH, RUSSIAN, AND SPANISH
This section has two goals: to present the spectrum of consonant clusters allowed in word initial
and final position in Dutch, and to show that these form a subset of Russian and a superset of
Spanish.
2.1. Dutch
The Dutch syllable conforms to a basic CCVCC template, in which the onset and coda each
consist of maximally two consonants (Trommelen 1983). In word initial and final position,
the positions that we focus on, coronal obstruents can be appended, creating initial clusters
of maximally three consonants (sCC), and final clusters of maximally four consonants (CCst).
Because of their exceptional distribution, initial /s/ and final coronal obstruents have been
analyzed as extrasyllabic, licensed by an appendix to the word (Trommelen 1983). Table 1
lists two-consonantal word onset clusters occurring in the Dutch CELEX lemmas database
(Baayen, Piepenbrock & Gulikers 1995).5 Marginal clusters (with type frequency below 25)6
occur in italics.7
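The type-frequency criterion for marginal status can be made concrete with a small sketch. The Python below is illustrative only: the toy lexicon, the one-character-per-phoneme transcription, the vowel set, and the function names are assumptions made for the example, not the CELEX data or the counting procedure actually used. It counts each onset cluster once per distinct lemma and flags clusters whose count falls below a cutoff.

```python
from collections import Counter

# Assumption: one character per phoneme, with a crude vowel set for the toy data.
VOWELS = set("aeiou")

def onset_of(lemma):
    """Return the consonants that precede the first vowel of a lemma."""
    for i, segment in enumerate(lemma):
        if segment in VOWELS:
            return lemma[:i]
    return lemma

def onset_type_frequencies(lemmas, size=2):
    """Count onset clusters of the given size, once per distinct lemma (type)."""
    counts = Counter()
    for lemma in set(lemmas):  # collapsing duplicates gives type, not token, counts
        onset = onset_of(lemma)
        if len(onset) == size:
            counts[onset] += 1
    return counts

def marginal_clusters(counts, cutoff=25):
    """Return the clusters whose type frequency falls below the cutoff."""
    return {cluster for cluster, n in counts.items() if n < cutoff}

# Hypothetical toy lexicon; the duplicate "trap" is counted only once.
toy_lexicon = ["trap", "tram", "trek", "klas", "klim", "pter", "snel", "trap"]
toy_counts = onset_type_frequencies(toy_lexicon)
toy_marginal = marginal_clusters(toy_counts, cutoff=2)
```

The toy run uses a cutoff of 2 rather than 25 only because the example lexicon has a handful of entries.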
Two-consonantal word onsets consist of an obstruent followed by a sonorant or marginally,
by another obstruent. In terms of their sonority profiles, word onset clusters are rising or
level, not falling; a single apparent counterexample, /wr/ (CELEX transcription) is commonly pronounced as /vr/.8 Sonorant-initial word onsets (nasal-glide and liquid-glide) are
marginal.
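The sonority profiles described above can be sketched in a few lines of Python. This is a hedged illustration: the four-step scale (obstruent < nasal < liquid < glide), the segment inventory, and the function name are assumptions chosen for the example rather than definitions taken from the article.

```python
# Assumed sonority scale: obstruent (0) < nasal (1) < liquid (2) < glide (3).
SONORITY = {}
for segments, rank in (("pbtdkgfvszxʃʒ", 0), ("mnŋ", 1), ("lr", 2), ("jw", 3)):
    for segment in segments:
        SONORITY[segment] = rank

def onset_profile(cluster):
    """Classify a two-consonant onset as 'rising', 'level', or 'falling'."""
    first, second = (SONORITY[segment] for segment in cluster)
    if second > first:
        return "rising"
    if second == first:
        return "level"
    return "falling"

# Obstruent + liquid rises, /s/ + plosive is level, and the CELEX
# transcription /wr/ comes out as the apparent falling counterexample.
profiles = {c: onset_profile(c) for c in ("pl", "kn", "sp", "mj", "wr")}
```

Under this scale, the generalization in the text amounts to saying that Dutch word onsets classified as "falling" do not occur, apart from the transcription /wr/.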
Among the obstruent-sonorant clusters, those having a liquid in second position are numerous (18 attested clusters altogether), and most have high type frequencies. Nevertheless,
coronal obstruents (/t, d, s, z, ʃ, ʒ/) before liquids are severely restricted by OCP-COR (*/tl, dl,
zl, ʒl; sr, ʃr, zr, ʒr/, with /tr, dr/ being positive exceptions).9 Obstruent-glide clusters are equally
numerous but most are marginal at best, partly due to OCP-LAB (/pw, bw, fw, vw/). Obstruent-nasal clusters are marginal except /kn, sn, sm/. (The latter two once more show word initial
/s/ as an escape hatch.) Obstruent-nasal clusters are strongly restricted by OCP-COR (*/tn, dn,
zn, ʒn/, with /sn/ as a true positive exception and /ʃn/ marginal) and by OCP-LAB (*/pm, bm,
fm, vm/).
Word onsets consisting of obstruents are marginal, with a major exception: word-initial
/s/ freely combines with voiceless nonsibilants (/sp, st, sk, sx, sf/) supporting its analysis as
a word appendix (Trommelen 1983). Table 1 reveals three restrictions on obstruent clusters
which will become relevant in the comparison with Russian: (i) Such clusters are voiceless
throughout (Zonneveld 1983; e.g., */zb, zd, dz/; for /dʒ ~ dj/, see footnote 7); (ii) No all-fricative
clusters (e.g., */fs, fx/; with once more, /s/ being exceptional in /sf, sx/); (iii) No
all-plosive clusters (e.g., */tk, kt/).10
5 Type frequencies are based on the CELEX DPL (Dutch Phonology Lemma) file, which contains 124,136
lemmas. A sublexicon was created of 69,245 types, by eliminating all lemmas with null frequency and collapsing
all homophones.
6 A cut-off point of 25 for marginal status was chosen because it forms a natural division in the frequency distribution
in the database. All onset clusters treated as marginal by this criterion are experienced as ‘foreign’ by native speakers.
Moreover, marginal clusters tend to occur in low-frequency words: their average type-to-token ratio is much lower
than that of nonmarginal clusters (46.5 s.d. 72 versus 215.0 s.d. 111).
7 The velar nasal, which generally cannot occur in Dutch onsets, is not represented in Table 1. Three phonotactically
legal clusters of the type coronal-obstruent-plus-/j/ were added: /tj, dj, sj/, which occur in free variation with /tʃ, dʒ,
ʃ/, respectively. However, only the latter realizations occur in CELEX. Examples are /tj ~ tʃ/ tjilpen ‘to chirp’, /dj ~ dʒ/ djati ‘jati wood’, /sj ~ ʃ/ sjaal ‘shawl’.
8 For example, wrat ‘wart’ and vrat ‘fed’ are commonly neutralized. In most dialects of Dutch, /w/ is realized as
a labio-dental approximant [ʋ] in word onset position.
9 Two exceptions with /sr/, not represented in CELEX, are Sri Lanka and Sranan tongo (the Creole language spoken
in Surinam).
TABLE 1
Dutch Two-Consonantal Word Onset Clusters
Note: Clusters printed in italics have type frequencies N < 25. The shaded areas indicate manner and voicing
combinations which are generally unattested.
Word-initial clusters of three consonants uniformly start with /s/ followed by a legal obstruent
plus liquid/glide cluster (e.g., /spl, spr, str, sxr/, plus marginal /skl, skr, skw, stj/).11 The fact
that ternary clusters all start with /s/ accords with its status as an appendix.
Table 2 lists two-consonantal word codas that occur in the CELEX lemmas database. Due
to final devoicing, no voiced obstruents are represented here. Again, marginal clusters with
type frequencies below 25 are italicized.12
The sonority profile of word codas is falling (sonorant-obstruent, liquid-nasal, glide-nasal)
or level (obstruent-obstruent). A small number of rising sonority clusters in shaded areas occur
in CELEX transcriptions but are nevertheless phonotactically illegal. These clusters are subject
to repairs, such as obligatory schwa epenthesis in /tl/ (axolotl) and /rl/ (Karl), while /rw/ (murw)
is commonly pronounced as [rf].
10 An isolated exception, not in CELEX, is /pt/ pterodactylus ‘pterodactyl.’ On the basis of the weak evidence from
the remaining obstruent clusters, a further restriction seems to hold: the right-hand obstruent must be a coronal (e.g.,
*/fp, fk, xp, xk, pf, px, kf, kx/), where once again, initial /s/ is exceptional (/sp, sk, sf, sx/).
11 Word onsets /sfl, sfr, sxl, skw, stw/ are unattested, and presumably accidental gaps.
12 The average type-to-token ratio for marginal word coda clusters falls well below that of nonmarginal ones (151.8
versus 446.6).
TABLE 2
Dutch Two-Consonantal Word Coda Clusters
Note: Clusters printed in italics have type frequencies N < 25. The shaded areas indicate manner and voicing
combinations which are generally unattested.
Sonorant-obstruent word codas occur in most logically possible combinations. Among these,
liquid-obstruent clusters are virtually unconstrained; /lʃ/ must be an accidental gap.13 Liquid
plus noncoronal obstruent codas are optionally broken up by schwa epenthesis; yet the fact
that epenthesis is only optional supports their phonotactic legality. Nasal-obstruent clusters are
homorganic (*/mk, mx, np, nk, nf, nx, ŋp, ŋf/) except that word-finally, a coronal obstruent
can follow any nasal (e.g., /mt, ms, mʃ, ŋs/). The unattested clusters /ŋt, ŋʃ/ are accidental
gaps since /ŋt/ is freely derived by affixation of the 3.sg.pres. suffix /t/ to /ŋ/-final verbs. The
free distribution of coronal obstruents, occurring after virtually any consonant, has earned them
the status of word appendix (Trommelen 1983; Kager & Zonneveld 1986). Combinations of
glides and consonants are mostly marginal. All except two (/jt, js/) occur only in loanwords
(/jp/ hype, /jk/ spike, /jf/ live, /jm/ time, /jn/ online, /jl/ file).
Obstruent clusters are more severely restricted, somewhat similarly to word onsets, and
contain at least one coronal member (except /pf/ in German loanwords), while sibilant clusters
are ruled out. Sonorant-sonorant clusters are restricted to liquid-nasal (/rm, rn, lm/, excluding
*/ln/ by OCP-COR) and a few more occurring only in loanwords (see above). The velar nasal
never occurs as a second element in clusters.
13 A potential loanword such as Welsh (the name of the language) fails to undergo phonotactic repairs.
Word-final clusters of three consonants uniformly end in a coronal obstruent, again supporting a word appendix analysis. All 60 attested ternary clusters combine a legal binary cluster
with /t/ or /s/ (e.g., /nst, nts, rst, rts, rmt, rkt, ŋkt, kst, tst/). A coronal cluster /st/ can be
appended to a binary cluster, which produces (14 different) maximal word offsets of four
consonants (e.g., /ntst, rtst, xtst, rnst/).
2.2. Russian
The goal of this section is to show that Russian word onsets and codas are a superset of Dutch.
Hence, we do not aim at an exhaustive overview of Russian (three and four consonant) clusters.
Counts are based on the large Uppsala corpus (Lönngren 1993), from which a phonetically
transcribed lexicon of 32,459 words was created. The data were checked against secondary
sources (Kucera & Monroe 1968; Kempgen 1995;14 Chew 2000; Scheer 2000; Ostapenko
2005). The demonstration that Russian word margins constitute a superset of Dutch has two
parts. Here we show that every Dutch word margin cluster has a Russian counterpart. In
Section 2.4, we will show that the inclusion relation also holds at the level of major class
features, place of articulation, and voicing. When determining the Russian counterparts of
Dutch consonants, we ignored palatalization, which is contrastive in Russian but not in Dutch.
For example, /p′j/ and /pj/ were both assumed to be counterparts of Dutch /pj/. Furthermore,
we judged Russian /v/ to be phonetically closer to Dutch /v/ than to /w/. We included /ts/
and /tʃ/ as consonant clusters due to their cluster status in Dutch, although these are single
segments (affricates) in Russian.
Russian word onsets of two consonants consist of any combination of major classes (Table 3).
Clusters printed in boldface have Dutch counterparts. Clusters appearing in italics have type
frequencies below 5.15
All clusters attested in Dutch have Russian counterparts, with the single exception of /fn/,
which is marginal in Dutch and presumably just an accidental gap in Russian given that /vn/
and /ft/ are fully legal. All logically possible combinations of manners occur in Russian word
onsets, including clusters of falling sonority, except that the glide /j/ is unattested in initial
position.
The set of obstruent-sonorant clusters is relatively similar to Dutch. The main difference
is that as compared to Dutch, Russian relaxes OCP-COR as evidenced by its wide range of
coronal clusters (/dn, ʃn, zn, ʒn, ln; tl, dl, ʃl, zl, ʒl, rl; sr, ʃr, zr, ʒr, nr/).16 Apparently like
Dutch, Russian restricts labial clusters by OCP-LAB (*/pm, bm, fm/), but exceptions (e.g.,
/vm/) occur by prefixation of /v/.
Russian possesses a large number of double obstruent onsets, subject to assimilation of
voicing, which excludes shaded cells in Table 3. As compared to Dutch, Russian allows voiced
obstruent clusters (e.g., /vb, vd, zb, zd, bd, dv, dz/), as well as clusters of mixed voicing ending
in /v/ (e.g., /tv, kv, sv, xv/), due to voicing assimilation properties of Russian /v/ (Hayes 1984).
Moreover, fricative clusters (e.g., /fx, ʃx, zv, vz/) and plosive clusters (e.g., /tk, kt/) are more
freely allowed than in Dutch.17
14 Kempgen’s (1995) data were based on two Russian dictionaries: Orfograficeskij Slovar′ (Barchudarova, Ožegova
& Šapiro 1967), and Obratnyj Slovar′ (Ševeleva 1974).
15 A lower threshold value for marginality was chosen than for Dutch because the Russian lexicon was considerably
smaller than the CELEX database for Dutch. The average type-to-token ratio for marginal word onset clusters falls
well below the ratio of nonmarginal ones (7.8 versus 18.9).
16 Some examples are /tl′et/ ‘decay’, /dl′ina/ ‘length’, /zl′it/ ‘to anger’, /znat′/ ‘know’.
TABLE 3
Russian Two-Consonantal Word Onsets Compared to Dutch
Note: Palatal and nonpalatal consonants have been collapsed. Clusters printed in boldface have Dutch counterparts.
Shaded areas exclude manner and voicing combinations which are generally unattested, as well as any combinations
with /w/, which is not a phoneme of Russian. Clusters printed in italics have type frequencies N < 5 in a lexicon
based on the Uppsala Corpus.
Sonorant-initial word onset clusters, including clusters of falling sonority, are numerous in
Russian, whereas they are illegal in Dutch.18 Mild effects of sonority sequencing occur, but
without causing phonotactic illegality: sonorant-obstruent clusters are numerous despite having
marginal type frequencies, and sonorant-sonorant clusters of rising sonority (specifically nasal-liquid and nasal-glide) are well attested.
Russian allows all ternary word onsets that are legal in Dutch (/s/ plus a legal obstruent-liquid cluster), in addition to large numbers of ternary onsets starting with other consonants
(e.g., /fkl, fpr, ftr, fsp, fst, fsk; vzb, vzd, vzm, vzn, vzl, vzr, vzv, vgl, vbr, vsk; zbl, zbr, zdr,
zdv, zgl, zgn, zgr; ʃtr, mst, mgl/).19 Russian allows word onsets of four consonants, uniformly
starting with /f, v/ followed by a fricative+obstruent+liquid triplet (e.g., /fstr, fskr, fspr, fspl,
fsxl, vzbr, vzdr, vzgr, vzgl/).20
17 Examples of words with voiced clusters are /dver′/ ‘door’, /vd′es′/ ‘here’, /zbor/ ‘collection’, /zduru/ ‘stupid, daft’;
fricative-fricative clusters: /fxodit′/ ‘enter’, /xvalit′/ ‘praise’, /vzamen/ ‘instead’, /zvezda/ ‘star’, /ʒvatʃka/ ‘chewing
gum’; plosive-plosive clusters: /kto/ ‘who’, /tkan′/ ‘fabric, tissue’.
18 Examples of sonorant-initial clusters are /mlatʃij/ ‘younger’, /rtut′/ ‘mercury’, /lba/ ‘forehead (gen.sg.)’, /mnogo/
‘much’.
TABLE 4
Russian Two-Consonantal Word Codas Compared to Dutch
Note: Palatal and nonpalatal consonants have been collapsed. Clusters printed in boldface have Dutch counterparts.
Shaded areas exclude manner and voicing combinations which are generally unattested, as well as any combinations
with /w/ and /ŋ/, which are not phonemes of Russian. Clusters printed in italics have type frequencies N < 5 in a
lexicon based on the Uppsala Corpus.
Turning to word coda clusters, it is evident that Russian once again forms a phonotactic
superset of Dutch, placing no categorical bans on any combination of manners (Table 4).21
19 Examples are /fpravo/ ‘to the right’, /fsp′at′/ ‘back’, /fstavat′/ ‘get up’, /vzriv/ ‘explosion’, /vrdug/ ‘suddenly’,
/vznos/ ‘payment’, /vzlom/ ‘breaking in’, /kstat′/ ‘to the point’, /sklat/ ‘store, stock’, /zdrav/ ‘sound’, /mstit′/ ‘to revenge
one self’, /mgla/ ‘haze’.
20 Examples are /vzbros/ ‘upthrow’, /fspl′esk/ ‘splash’, /vzgl′at/ ‘glance’.
21 The average type-to-token ratio for marginal word coda clusters falls well below the ratio of nonmarginal ones
(2.8 versus 8.5).
In sonorant-obstruent clusters, Russian is maximally similar to Dutch. Liquid-obstruent
clusters have a single omission /lx/, probably an accidental gap because /rx/ and /lk/ are both
attested. Nasal-obstruent clusters lack the velar nasal, which is not a phoneme of Russian.
For obstruent-obstruent coda clusters, Russian is highly similar to Dutch: clusters are
voiceless, while most clusters end in coronals /s t/. There are a number of (apparent) omissions,
however. /ts, tʃ/ correspond to (single-phoneme) affricates in Russian, while /pf, fs, sp, xs/ are
missing. However, it should be noted that all four omissions are only marginal in Dutch, while
presumably the clusters are accidental gaps in Russian, given the occurrence of clusters in
neighboring cells (e.g., /tf, fx, sk/). To balance affairs, Russian possesses clusters that are
disallowed in Dutch (/tf, kx, fk, fx/).
Russian allows a fair number of sonorant-sonorant word coda clusters. Although most of
these are marginal, the attested clusters form a superset of Dutch, most notably by including
codas of rising sonority (nasal-liquid) and level sonority (nasal-nasal, liquid-liquid). Rising-sonority word codas occur in the form of obstruent-sonorant clusters (of all logically possible
types: plosive-nasal, plosive-liquid, fricative-nasal, fricative-liquid), none of which are legal in
Dutch. Just as in word onsets, OCP-COR is relaxed (e.g., /dn, sn, sl, zn, zl/), but OCP-LAB is
not (*/pm, bm, fm/, plus marginal /vm/).
Russian allows ternary word codas of any type that is legal in Dutch (existing binary
obstruent-liquid cluster plus /s/ or /t/), in addition to ternary codas ending in other consonants
(e.g., /stf, rtf; tsk, fsk, nsk, rsk/). No sonority restrictions hold in word codas, as clusters of
rising sonority are legal (e.g., /str, ktr, ntr, ndr/). Word codas of four consonants occur, most
ending in /tf/ (e.g., /rstf, jstf, tstf, nstf, pstf, mstf/).
2.3. Spanish
The goal of this brief section is to show that Spanish word onsets and word codas form a proper
subset of Dutch. We will not provide an exhaustive description of Spanish syllable structure
(for a discussion, see Harris 1983; Hualde 1991; Quilis & Fernández 1992), but focus on word
margin consonant clusters in comparison to Dutch.
Spanish word onset clusters are maximally binary, and uniformly of the type obstruent-liquid
(Table 5). Only 12 clusters are attested.
Much as in Dutch, OCP-COR restricts clusters (*/tl, dl, sl, sr/; exceptions /tr, dr/).22 In
contrast to Dutch, Spanish generally disallows obstruent-nasal word onsets, as well as obstruent-obstruent onsets. Moreover, /s/-consonant clusters in loanwords are repaired by /e/-prothesis (for
example, slalom [eslalon], smoking [esmokin], snob [esnob], stereo [estereo], stress [estres]),
documented for L1-Spanish L2 learners of English by Carlisle (1998). The status of /xr, xl/ is
unclear.23
22 Coronal onset clusters of the language minimally differ as Spanish lacks /sl/. Moreover, /tl/ is allowed in Mexican
Spanish, e.g. tlapería ‘paint/hardware store.’
23 Harris (1983) claims that these are accidental gaps, offering an example Jruschev ‘Khrushchev’, which is disputed
by Pensado (1985).
TABLE 5
Spanish Binary Obstruent-Sonorant Word Onset Clusters
Note: Shaded areas indicate manner combinations which are generally unattested.
Our assumption that Spanish lacks consonant-glide onsets is based on distributional arguments (Harris 1983; Hualde 1991; Quilis & Fernández 1992) that post-consonantal prevocalic
glides are part of the nucleus, members of a set of rising diphthongs /wa, we, wo, wi,
ja, je, jo, ju/. The alternative, to analyze glides as parts of complex onsets, meets with a
number of problems. First, diphthongization, as seen in the alternation of /e/ ~ /je/ and
/o/ ~ /we/ (e.g., sierra ~ serrano, buen ~ bondad), becomes a process that involves the
syllable (onset + nucleus), rather than strictly involving the nucleus. Second, diphthongization
occurs after liquids (e.g., ruego ~ rogar), resulting in liquid-glide clusters which would violate
the minimal sonority distance requirements holding for Spanish onsets, as motivated by the
ill-formedness of both obstruent-nasal and clusters. Third, sonorant-initial onset clusters are
generally illegal but sonorant-glide clusters (nasal-glide, liquid-glide) would be exceptional.
Fourth, /s/-glide clusters (/sw/ suerte, /sj/ sierra) would become exceptions to the otherwise
general prohibition against /s/-consonant clusters.24 Fifth, glides following clusters (e.g., prieto
[prj], triunfo [trj]) would imply ternary clusters, which are otherwise excluded. Sixth, post-consonantal prevocalic glides render the syllable heavy, which can only be explained from a
nuclear analysis (Harris 1983).
Spanish codas are highly restricted. In singleton word codas, only /d, s, n, l, r/ occur in
native words, and /b, g/ in loanwords (e.g., club, bistec).25 Complex word codas only occur in
loanwords, and uniformly have the structure consonant-/s/ (e.g., /ps, ks, ns, ls/; Harris 1983;
Hualde 1991).26 All attested two-consonantal codas occur in Dutch.
24 A possible argument against a nuclear analysis of obstruent-glide sequences could be that it is difficult to imagine
what would rule them out as complex onsets, as they are best of all clusters in terms of their sonority distance, which
is maximal. However, sources such as Greenberg (1965) present no implicational universal such that the presence of
obstruent-liquid clusters in a given language implies the presence of obstruent-glide clusters in that language.
25 /d/ is realized as a continuant [ð] in coda, where it is easily devoiced to [θ], thus undergoing neutralization of
voicing and continuancy. Similar neutralizations apply to /b/ (club) and /g/ (bistec).
26 Examples are bíceps, tórax, Máyans, vals.
2.4. Proper Inclusions Among the Three Languages
Proper inclusions among the word margins of the three languages will now be represented at
the level of natural classes: sonority, place of articulation, and voicing (for obstruent clusters).
When representing the word onset clusters in terms of sonority classes, Spanish is the
most restricted language of the three, allowing only obstruent-plus-liquid clusters (Figure 1).
Dutch is more lenient, adding to the Spanish-legal set three more types of binary onsets:
obstruent-obstruent, obstruent-nasal, and obstruent-glide. Russian is least restricted of all,
adding to the Dutch-legal set nasal-consonant and liquid-consonant clusters.
In terms of restrictions on voicing in obstruent clusters in word onset, Spanish and Dutch are
equally restrictive (Figure 2). Dutch satisfies agreement of voice as well as a ban against voiced
obstruent clusters. Spanish allows no obstruent clusters, satisfying both constraints vacuously.
Russian allows two more cluster types: voiced clusters, as well as mixed voiceless-voiced
clusters, the latter arising as a consequence of the properties of /v/ in voicing assimilation.
For interactions of place of articulation in word onset clusters, the three languages differ
mainly in their satisfaction of OCP-PLACE (Figure 3). Spanish word onsets respect OCP-COR
(with exceptions); labials never occur in second position of a cluster. Dutch satisfies OCP-LAB
and OCP-COR (the latter with exceptions). Russian satisfies OCP-LAB (with exceptions), but
not OCP-COR.
FIGURE 1 Subset relations in sonority structure of binary word onset clusters in three languages: Spanish ⊂ Dutch ⊂ Russian.
FIGURE 2 Subset relations in voicing structure of binary obstruent clusters in word onset in three languages:
Spanish ⊂ Dutch ⊂ Russian. (Spanish has no obstruent clusters in word onset.)
Ternary word onset clusters are illegal in Spanish. Dutch allows only ternary clusters
beginning with /s/. Russian allows ternary clusters starting with other consonants than /s/,
as well as quaternary clusters of which the initial consonant is /f, v/ (Figure 4).
Turning to word coda clusters, sonority structure once again shows a subset relation between
the three languages (Figure 5). Russian allows all logical possibilities (with the exclusion of
consonant-glide). Dutch allows only falling sonority clusters (sonorant-obstruent, liquid-nasal,
glide-nasal, glide-liquid) in addition to obstruent-obstruent clusters. Spanish has virtually no
word coda clusters (which occur in loanwords only), but the ones that occur always end in an
obstruent (more specifically, /s/).
FIGURE 3 Subset relations in place of articulation structure of binary word onset clusters in three languages:
Spanish ⊂ Dutch ⊂ Russian.
FIGURE 4 Subset relations in word onset clusters of two, three, and four consonants in three languages:
Spanish ⊂ Dutch ⊂ Russian.
FIGURE 5 Subset relations in sonority structure of binary word coda clusters in three languages: Spanish ⊂ Dutch ⊂ Russian.
FIGURE 6 Subset relations in place of articulation structure of binary word coda clusters in three languages:
Spanish ⊂ Dutch, Russian.
The three languages are highly similar in the interactions of voicing in obstruent clusters in the word coda. Apart from the fact that such clusters are marginal in Spanish but
frequent in Dutch and Russian, all three languages neutralize the contrast to voiceless in
word-final obstruent clusters. This is the only case in which no subset relation holds between
the languages.
Finally, for place of articulation in word coda clusters, Spanish only allows /s/ in second position, possibly violating OCP-COR in loanwords (/ns, ls, rs/), but vacuously respecting
OCP-LAB, whereas Dutch and Russian both disrespect OCP-LAB and OCP-COR (Figure 6).
In sum, this overview of the phonotactic possibilities in word margins at the level of natural
classes has established an overall subset structure between the three languages: Spanish ⊂ Dutch ⊂ Russian for sonority, voicing, and place of articulation. In two specific cases, no
strict subsets occur: Dutch and Russian match in terms of the place of articulation structure
of word codas, while all three languages match in terms of voicing structure of word codas.
Nevertheless, Spanish word margins never form a superset of Dutch and/or Russian margins,
and Dutch word margins never form a superset of Russian margins. Hence, the overall subset
structure between the three languages is not violated by these cases.
2.5. Summary and Predictions
In this section, it was established that in terms of the consonant clusters allowed in the word
onset and coda, Russian qualifies as a superset of Dutch, and Spanish as a subset. Hence,
L1-Russian learners of Dutch face the task of acquiring a subset target grammar, one which
is more restrictive than their native grammar, while L1-Spanish learners of Dutch face the
opposite task of acquiring a superset grammar, one which is less restrictive than their native
language.
On the basis of these results, we can now make specific predictions about the L2 acquisition
of Dutch phonotactics by L1-Russian and L1-Spanish learners of Dutch, on the basis of the
hypotheses stated in section 1. The first hypothesis, which was derived from L2 theory (transfer)
in combination with learnability theory (the subset problem), stated that L2 learners can acquire
a target grammar only when starting from an L1 subset initial state, but not from an L1 superset
initial state. By the additional assumption that phonotactics involves grammatical knowledge,
this predicts that L1-Russian learners of Dutch should be impeded by a lack of negative
evidence, while L1-Spanish learners should benefit from the availability of positive evidence
against their initial state. Our second hypothesis was that L2-acquisition of phonotactics is
subject to development, in the sense that advanced learners should move closer to the target
grammar, and thus become more native-like in their responses to Dutch nonwords. Hence, it
is now predicted that advanced L1-Spanish learners of Dutch should be more native-like than
beginning L1-Spanish learners of Dutch in terms of their phonotactic competence. In contrast,
because L2 phonotactic acquisition should not occur under a subset scenario, advanced L1-Russian learners of Dutch should not be more native-like than beginning L1-Russian learners
of Dutch.
3. EXPERIMENT 1: WORD-LIKENESS JUDGMENT TASK
In the word-likeness judgment task, participants gave ratings to spoken nonwords based on their
perceived word-likeness in Dutch. Stimuli that contained Dutch-illegal clusters were expected
to have lower scores than stimuli containing Dutch-legal clusters, for native speakers and the
L1-Spanish L2 learners, because both Dutch and Spanish disallow the Dutch-illegal clusters.
For the L1-Russian learners, however, differences in ratings between Dutch-legal and Dutch-illegal clusters were not expected because of the subset problem regarding phonotactic learning:
none of the stimuli have consonant clusters disallowed by the native grammar.
3.1. Participants
Three groups of subjects participated in the experiment. The first group of participants consisted
of 30 adult Dutch monolinguals.27 The other two groups contained adult L2 learners of Dutch:
18 native speakers of Russian and 13 native speakers of Spanish. In the L2 learner groups,
some participants had more than one native language. Among the L1-Russian participants, five
were bilinguals: two Russian/Azerbaijani, one Russian/Romanian, one Russian/Ukrainian, and
one Russian/Belarusian. All reported Russian to be their dominant language except one (the
Russian/Ukrainian bilingual). In the L1-Spanish group, there were three bilingual participants,
all Spanish/Catalan.28 Each of the groups of L2 learners was subdivided into two equally
large subgroups, based on the participants’ proficiency in Dutch. An estimate of the Dutch
proficiency level of the participants was determined by their performance on a C-test (Taylor
1953). This test consists of five unrelated texts. In each text, a number of words are incomplete.
Participants were asked to fill out the missing second half of these words. The maximum score
on the C-test was 100, the minimum score 0. The mean score for the L1-Russian beginning
learners (N = 9) was 39.9 (sd = 10.6) and 70.3 (sd = 11.5) for advanced L1-Russian learners
(N = 9). The mean score for the L1-Spanish beginning learners (N = 7) was 12.1 (sd = 7.6)
and 58.3 (sd = 13.7) for advanced L1-Spanish learners (N = 6). The differences between the
scores on the C-test of beginning and advanced learners were significant for the L1-Russian
(t-test, two-tailed, t = 12.317, df = 17, p < .001), as well as for the L1-Spanish participants
(t-test, t = 4.624, df = 12, p = .001). Biographical data of the participants were obtained by
a brief oral interview.
27 Most participants had mastered English to a certain degree. However, Dutch and English are highly similar on
the phonotactic well-formedness of the stimuli selected.
28 Catalan allows more final clusters than Spanish (Wheeler 1979).
3.2. Materials
Stimuli presented in the word-likeness judgment task were nonwords (monosyllabic and bisyllabic) which contained different types of consonant clusters (Table 6). Some clusters occurred
in word onset position, others in word coda position. No stimuli contained more than one
consonant cluster. The clusters were in a strict subset/superset relationship with respect to each
other. That is, all of the 37 target consonant clusters are legal in Russian. A proper subset
of these clusters are legal in Dutch, and a proper subset of the Dutch-legal clusters are legal
in Spanish. Accordingly, stimuli were subdivided into three classes. The first class of clusters
(Type 1; N = 20) contained those that are legal in Russian but not in the other two languages
under investigation. The second class of clusters (Type 2; N = 12) consisted of those that are
legal in Dutch and Russian, but not in Spanish. The final class of clusters (Type 3; N = 5)
contained those that are legal in all three languages, including Spanish. Both onset and coda
clusters were used.
TABLE 6
Consonant Clusters Used in the Nonword Stimuli in the Word-Likeness Judgment Task
      Type 1                                                          Type 2                                            Type 3
      Onset                                   Coda                    Onset               Coda                          Onset          Coda
CC    kt-, rt-, tk-, xm-, zb-, zd-, zl-, zn-  -zm                     sl-, sm-, sn-, st-  -kt, -nt, -rk, -rm, -rs, -rt  fl-, pr-, tr-  -ls, -ns
CCC   fpr-, fsp-, fst-, skl-, zdr-            -nsk, -rsk, -stf, -str  spl-, str-
CCCC  fspl-, fstr-
Note: Type 1 is legal in Russian only; Type 2 is legal in Russian and Dutch;
Type 3 is legal in Russian, Dutch, and Spanish.
TABLE 7
Mean Type Frequency and Observed/Expected Values for Type 1, Type 2, and Type 3 Clusters
in the Three Languages
Note: Type 1 is legal in Russian only; Type 2 is legal in Russian and Dutch; Type 3 is legal in Russian, Dutch,
and Spanish.
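The nesting of the three stimulus classes can be checked mechanically. The sketch below is only an illustration: the cluster inventories are transcribed from Table 6 (a trailing hyphen marks an onset, a leading hyphen a coda), while the variable and function names are assumptions introduced for the example. It builds the set of clusters treated as legal in each language and tests whether the sets form a chain of proper subsets.

```python
# Clusters as listed in Table 6; "kt-" is an onset, "-kt" a coda.
TYPE1 = {  # legal in Russian only
    "kt-", "rt-", "tk-", "xm-", "zb-", "zd-", "zl-", "zn-", "-zm",
    "fpr-", "fsp-", "fst-", "skl-", "zdr-", "-nsk", "-rsk", "-stf", "-str",
    "fspl-", "fstr-",
}
TYPE2 = {  # legal in Russian and Dutch
    "sl-", "sm-", "sn-", "st-", "-kt", "-nt", "-rk", "-rm", "-rs", "-rt",
    "spl-", "str-",
}
TYPE3 = {"fl-", "pr-", "tr-", "-ls", "-ns"}  # legal in all three languages

# Stimulus clusters treated as legal in each language.
LEGAL = {
    "Spanish": TYPE3,
    "Dutch": TYPE2 | TYPE3,
    "Russian": TYPE1 | TYPE2 | TYPE3,
}

def is_proper_chain(*sets):
    """True if each set is a proper subset of the next one."""
    return all(a < b for a, b in zip(sets, sets[1:]))

chain_holds = is_proper_chain(LEGAL["Spanish"], LEGAL["Dutch"], LEGAL["Russian"])
```

Keeping onset and coda clusters as distinct strings matters here, because the same consonant sequence can differ in status by position (for example, coda -rt is Type 2 while onset rt- is Type 1).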
All consonants used in the stimuli were phonemes of both the L1 and the target language,
that is, consonants belonging to the intersection of the inventories of the three languages.29 As
far as vowels are concerned, no diphthongs or reduced vowels were included in the stimuli.
Furthermore, parts of the nonwords other than the consonant cluster all contained phoneme
combinations that have a relatively high frequency in Dutch in the word position in which they
occur. This can be illustrated by the stimulus /fpro.ˈlan/. The target cluster in this nonword is
/fpr/, which is illegal in Dutch. The rhyme /o/ in the initial (weak) syllable as well as the onset
/l/ and rhyme /an/ of the final (strong) syllable are relatively frequent in the Dutch lexicon.30
Clusters were selected so as to highlight structural differences between the languages in
terms of phonotactic constraints. For example, Type 1 clusters violated the sonority sequencing
principle (/rt, zm/), OCP-COR (/zn, zl/), or the constraint against voiced obstruent clusters (/zb,
zd/). Type 1 ternary/quaternary clusters had initial consonants other than /s/ (/fpr, fsp, fst, zdr/;
/fspl, fspr/). Type 2 clusters included /s/-initial onsets (all illegal in Spanish, but under the
exception clause for initial /s/ in Dutch), as well as coda clusters ending in consonants other
than /s/.
Each cluster type has an average type frequency of minimally 18.8 in each language in which
it occurs legally (Types 1-2-3 for Russian; Types 2-3 for Dutch; Type 3 for Spanish)31 (Table 7).
As an additional check on the phonotactic legality of the stimuli, average observed/expected
ratios for each of the three cluster types were calculated and found to be above 1.0, confirming
that none of the legal types are underrepresented in the three languages.32
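An observed/expected ratio of this kind can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that each character is one segment, that the segmental probabilities are position-specific relative frequencies among clusters of the same length, and that the toy list stands in for a lexicon with one cluster per word type. The function name and the data are hypothetical.

```python
from collections import Counter
from math import prod


def observed_expected(cluster, clusters):
    """O/E for `cluster`, given same-position clusters (one per word type)."""
    same_len = [c for c in clusters if len(c) == len(cluster)]
    n = len(same_len)  # N(CC), N(CCC), ... depending on cluster length
    observed = same_len.count(cluster)
    # Assumption: p(C_i) is the relative frequency of the segment in slot i.
    slot_counts = [Counter(c[i] for c in same_len) for i in range(len(cluster))]
    expected = n * prod(slot_counts[i][seg] / n for i, seg in enumerate(cluster))
    if expected == 0:
        return float("nan")  # a segment never occurs in that slot
    return observed / expected


toy_onsets = ["st", "st", "st", "sl", "pr", "pr", "tr", "pl"]  # made-up data
ratio_st = observed_expected("st", toy_onsets)
```

A ratio above 1.0 indicates that the cluster occurs more often than the independent combination of its segments would predict; a ratio below 1.0 indicates underrepresentation.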
29 Spanish has no phoneme /z/, but /s/ becomes voiced before a voiced consonant, as in rasgo ‘feature,’ jazmín
‘jasmine’ (Martínez-Celdrán, Fernández-Planas & Carrera-Sabaté 2003).
30 As before, type frequencies are based on a sublexicon of 69,245 lemmas drawn from Dutch CELEX.
31 Statistics for Spanish were taken from a computerized lexicon of 35,162 word types derived from a text corpus
of 5,000,000 words in the Corpus del Español (Davies 2002).
32 Table 13 shows a high average O/E value (O/E = 11.85) for Russian Type 1 clusters. This might be an artifact
of the average length of Type 1 clusters: with increased segment number, expected values naturally drop, as these are
based on the product of the segmental probabilities: $E(C_1 C_2 C_3) = p_{C_1}\, p_{C_2}\, p_{C_3}\, N(CCC)$. As a result, even with
relatively low observed values, O/E values will naturally rise.

Nevertheless, it turned out to be impossible to fully balance the frequencies of cluster
types for each of the three languages. In Dutch, Type 2 clusters were more frequent than
Type 3 clusters, mainly due to the high frequency of /s/-consonant word onsets. In Russian,
the reverse situation holds, with Type 3 > Type 2 > Type 1 clusters in terms of frequency.
Since lexical frequency is known to influence participants’ responses in word-likeness rating
and the lexical decision task, the possibility arises that L1-Russian participants rate stimuli
according to the frequencies of the clusters in Russian, which might interfere with the prediction
from Hypothesis 1 that these participants should not distinguish Type 1 and Type 2/3 stimuli
by their phonotactic well-formedness. Hence, it was decided that possible effects of Russian
frequencies on responses would be addressed afterward by means of a correlation analysis.
If L1-Russian participants based their judgments of nonwords on Russian lexical frequencies,
then a correlation analysis should reveal this; if instead, they based their judgments on acquired
phonotactic constraints of Dutch, then responses should be more strongly correlated with the
Type 1 versus Type 2/3 distinction.
None of the nonword stimuli were existing words of Russian or Spanish. Two native speakers
of Spanish and two native speakers of Russian were asked to check the stimuli. The stimuli
were read by a phonetically trained native speaker of Dutch who was unaware of the purpose
of this study, and digitally stored.
In the stimulus list of the word-likeness judgment task, each target consonant cluster occurred
four times: twice in a monosyllable, once in an iamb (a bisyllable with final stress), and once
in a trochee (a bisyllable with initial stress). The stimulus list contained fillers as well. These
fillers met the same criteria as the test items with respect to their stress patterns and phonemes
that were included. The filler items lacked clusters entirely. The stimulus list of the word-likeness judgment task contained 216 items: 156 test items and 60 fillers. The test items are
all included in the Appendix.
3.3. Procedure
All participants in this study took part in two experiments: the word-likeness judgment task
and the lexical decision task (which will be presented in section 4). The participants were
tested individually. Half of the participants of each native language group performed the
word-likeness judgment task before the lexical decision task. For the other half of each
group, the reverse order applied. This was done at random. There was a short break between
the two tasks. Afterwards, the nonnative speakers were asked some questions in a short
interview and they performed the C-test. The subjects were paid for their participation in
the experiments.
In the word-likeness judgment task, the 216 spoken stimuli were played over headphones.
The experiment took place in a sound-proof booth. The participants were instructed orally. The
instructions were stated in Dutch in order to move (or to keep) the participants in the right
language mode. The participants were told that they were to listen to words that do not exist
in Dutch, but that nevertheless, some of the nonwords would sound more typically Dutch than
others. Participants were instructed to judge the extent to which a given nonword was more
or less typically Dutch on a seven-point scale. On the screen, this scale was indicated by
numbers 1 to 7. At the extreme left and right ends of the scale, the words slecht (‘bad’) and
goed (‘good’) appeared. After each stimulus, participants entered a score on the scale on the
screen by using a mouse. The word-likeness judgment task was self-paced, taking 10 minutes
on average.
3.4. Results
The scores that the participants assigned to the stimuli were analyzed for each group of
participants. The set of target items was subdivided into two main categories, namely, items
that are phonotactically legal in Dutch and items that contain Dutch-illegal consonant clusters.
The statistical test that was used to detect significant differences between the responses to the
two main categories was the Mann-Whitney U-test. A nonparametric test was used because the
data set was not normally distributed.
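For readers unfamiliar with the test, the U statistic can be computed from rank sums as in the sketch below. This is a generic textbook formulation with made-up ratings, not the authors' analysis script; the function names are hypothetical. Tied ratings receive averaged ranks, which is why U can take half-integer values on a seven-point scale.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); U_a + U_b equals len(sample_a) * len(sample_b)."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


illegal = [1, 2, 2, 1, 3]  # made-up ratings for Dutch-illegal items
legal = [5, 6, 4, 7, 5]    # made-up ratings for Dutch-legal items
u_illegal, u_legal = mann_whitney_u(illegal, legal)
```

The p-value is then obtained from the distribution of U under the null hypothesis (exactly for small samples, or via a normal approximation for large ones), a step omitted here.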
3.4.1. Dutch Participants
As expected, the Dutch participants discriminated nonwords with phonotactically ill-formed
clusters (Type 1) from those with phonotactically well-formed clusters (Types 2 and 3). This
difference was significant for the onset cluster items (Mann-Whitney U-test, U = 232171.5,
p < .001) as well as for the coda cluster items (Mann-Whitney U-test, U = 138580.5,
p < .001) (see Figure 7).
Furthermore, the native speakers of Dutch made a distinction within the class of legal
onset clusters. Type 3 clusters received significantly higher scores (Mann-Whitney U-test, U =
111376.5, p < .001) than Type 2 clusters. This finding suggests that phonotactic judgments
of native speakers are gradient rather than categorical. No such distinction was made within
the class of legal coda clusters between Type 2 and 3 clusters. Gradience was also found for
nonwords with phonotactically illegal (Type 1) clusters. Within this category, nonwords starting
with an obstruent-obstruent cluster (/tk, kt, zb, zd/) received lower scores than nonwords
starting with obstruent-sonorant clusters (/zl, zn, xm/), 1.65 versus 2.10. This difference is
significant (Mann-Whitney U-test, U = 66877, p < .001).
FIGURE 7 Average word-likeness judgments of non-words containing Type 1, Type 2, and Type 3 word
onset and word coda clusters by Dutch native speakers.
3.4.2. L1-Russian Learners of Dutch
Contrary to the predictions, both groups of L1-Russian learners of Dutch discriminated
between the Dutch-legal (Types 2-3) and the Dutch-illegal (Type 1) onset clusters (Figure 8).
Nonwords that have illegal onset clusters received significantly lower scores than those that
have legal onset clusters by beginning learners (Mann-Whitney U-test, U = 48070.0, p < .001)
and advanced learners (Mann-Whitney U-test, U = 42083.0, p < .001). Both groups of the L1-Russian participants also discriminated between L2-legal (Types 2-3) and L2-illegal (Type 1)
coda clusters (Mann-Whitney U-test, beginning learners: U = 18974.0, p < .001; advanced
learners: U = 14922.0, p < .001).
The difference between beginning and advanced L1-Russian learners of Dutch was that the
responses of the advanced learners were more like those of Dutch native speakers: they assigned
significantly lower scores to the Dutch-illegal coda clusters (Type 1) than the beginners (Mann-Whitney U-test, U = 12877.0, p = .001). Furthermore, the advanced learners discriminated
between Types 2 and 3 Dutch-legal onset clusters (Mann-Whitney U-test, U = 9276.0, p =
.002), whereas the beginning learners failed to make this distinction.
3.4.3. L1-Spanish Learners of Dutch
As expected, the advanced L1-Spanish learners of Dutch discriminated between the legal
and illegal onset clusters (Figure 9). This group assigned significantly lower scores to nonwords
containing illegal onset clusters (Type 1) than to nonwords with legal onset clusters (Types 2
and 3) (Mann-Whitney U-test, U = 21071, p < 0.001). The less advanced L1-Spanish learners
did not make such a distinction, which shows that these learners had not yet acquired the relevant
knowledge of Dutch, suggesting that there is development in the acquisition of the target
language phonotactic knowledge.

FIGURE 8 Average word-likeness judgments of non-words containing Type 1, Type 2, and Type 3 word
onset and word coda clusters by beginning and advanced L1-Russian learners of Dutch.
FIGURE 9 Average word-likeness judgments of non-words containing Type 1, Type 2, and Type 3 word
onset and word coda clusters by beginning and advanced L1-Spanish learners of Dutch.

Although the beginning learners did not distinguish the illegal onset clusters from the legal
ones, both groups of Spanish participants discriminated between legal and illegal coda clusters
(Mann-Whitney U-test, beginning learners: U = 13067.5, p = .007; advanced learners: U =
8426.0, p < .001). That is, the beginning learners did not discriminate L2-illegal onsets from
legal ones, but they did discriminate between legal and illegal coda clusters.
Surprisingly, no significant difference was found between L1-Spanish learners’ judgments
of Type 2 and 3 Dutch-legal consonant clusters. Although in the L1 of these learners, there is
a difference between these two types of clusters (Type 2 clusters being illegal in Spanish, and
Type 3 clusters legal), this difference is not visible in their judgments of Dutch nonwords.
Within the class of illegal onset clusters, the advanced L1-Spanish learners discriminated
between obstruent-obstruent and obstruent-sonorant clusters, 1.81 versus 3.01 (Mann-Whitney
U-test, U = 2291.5, p < .001), like the native speakers of Dutch did.
3.5. Discussion
As expected, the Dutch native listeners assigned low ratings to nonwords containing phonotactically illegal consonant clusters. This was the case for onset as well as for coda clusters. L2
learners of Dutch also showed sensitivity to distinctions between Dutch-legal and Dutch-illegal
onset and coda clusters. Both beginning and advanced L1-Russian learners of Dutch assigned
lower scores to Dutch-illegal onset and coda clusters than to Dutch-legal clusters. The results of
the advanced L1-Spanish learners showed the same pattern. The beginning L1-Spanish learners,
however, only distinguished between L2-legal and L2-illegal coda clusters; they did not make a
similar distinction for onset clusters. Hence, the advanced L1-Spanish learners displayed more
native-like responses than the beginners, suggesting that there is development of phonotactic
knowledge in L2 acquisition.
Native speakers distinguished degrees of word-likeness between the classes of phonotactically legal clusters (differentiating Type 2 and Type 3 clusters), as well as within the class of
illegal clusters (differentiating Type 1 clusters). Not all groups of L2 learners displayed such
gradience. Only in the advanced groups, not in the beginning groups, could some significant
differences be detected: the advanced L1-Russian learners distinguished within the class of
Dutch-legal clusters (Types 2 and 3), while the advanced L1-Spanish learners distinguished
within the class of the Dutch-illegal clusters (Type 1). This finding adds further evidence to
our second hypothesis, that L2 phonotactic knowledge is subject to development.
As we observed earlier, the response patterns for L1-Russian learners might be influenced
by native language statistics, possibly obscuring the effects of acquired phonotactic knowledge
of Dutch. This might introduce a confounding factor, which needs to be addressed. To assess
the influence of Russian lexical statistics on L1-Russian learners’ responses, we conducted
analyses with two types of Russian lexical statistics data: the type frequencies of the individual
clusters and overall bi-phone probabilities of the nonword stimuli. First, we measured the
Pearson correlation (two-tailed) between the average word-likeness rating per item by the L1-Russian participants and the logarithm of the Russian type frequency for individual clusters.
This relationship turned out to be significant (r = .37, p < .001), albeit rather weak. In addition,
we measured the correlation between the average L1-Russian responses and a binary legal/illegal
distinction between clusters (Types 2-3 versus Type 1). It turned out that word-likeness ratings
correlated considerably more highly with the legal/illegal distinction (r = .63, p < .001) than
with the frequency statistics measure. Moreover, a multiple linear regression analysis reveals
that the frequency measure does not explain significant unique variance once the legal/illegal
distinction is included in the analysis (Table 8).
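The logic of this comparison can be sketched with a plain Pearson correlation computed against each predictor in turn. The data below are invented for illustration and the variable names are assumptions; with a binary predictor, the Pearson coefficient is the point-biserial correlation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up per-item data: mean rating, legality in the target language
# (0 = Type 1, 1 = Type 2/3), and log type frequency of the cluster in the L1.
ratings = [1.8, 2.1, 2.4, 4.9, 5.2, 5.6]
legal = [0, 0, 0, 1, 1, 1]
log_freq = [1.2, 0.4, 1.9, 1.0, 2.2, 1.5]

r_legal = pearson_r(ratings, legal)
r_freq = pearson_r(ratings, log_freq)
```

In this toy data set the ratings track legality far more closely than frequency, which is the pattern the comparison is designed to detect. Whether the frequency measure explains unique variance beyond legality requires the regression step described in the text, which this sketch does not reproduce.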
Second, overall bi-phone probabilities of the items calculated over the Russian lexicon
have no predictive effect as to L1-Russian word-likeness ratings. The correlation between
the bi-phone probabilities of the items and average word-likeness judgments of the items
is not significant (r = .04, p = .657). On the contrary, the Dutch bi-phone probabilities
correlate significantly with the L1-Russian well-formedness ratings (r = .38, p < .001),
suggesting that the effect of L1 lexical statistics can be suppressed in the L2. The above analyses
suggest that the similarities in word-likeness judgments between the L1-Russian learners of
Dutch and the native speakers of Dutch were more strongly based on a shared phonotactic
knowledge of Dutch than on similarities in terms of lexical statistics between Russian and
Dutch.

TABLE 8
Regression Analyses on the L1-Russian Word-Likeness Ratings
Columns: B, SE B, Beta, R-Square
Rows: Step 1 (Constant; Legal/illegal distinction); Step 2 (Constant; Legal/illegal distinction; Log type frequency Russian)
Values: 1.043, .12, .229, .08, .633, 1.046, 1.384, .084, .242, .181, .138, .599, .048, .392, .400
Note: Significant at .001.

The results of the word-likeness rating experiment suggest that L2 learners have phonotactic
knowledge of the target language which is similar to native listeners’ knowledge. Native-like phonotactic knowledge of the target language was found for superset learners as well
as for subset learners. This means that our first hypothesis is confirmed for the L1-Spanish
superset learners, but not for the L1-Russian subset learners. Our second hypothesis, stating that
development should take place in the case of successful phonotactic acquisition, was confirmed
for the L1-Spanish learners; for the L1-Russian learners, who unexpectedly showed native-like
phonotactic knowledge, evidence for development was found as well. Nevertheless, it may have
been the case that the participants in the word-likeness judgment task were guided by some
(semi)-conscious awareness of Dutch consonant cluster legality, which may have resulted in a
response strategy during the experiment. If so, the results of the word-likeness experiment may
have reflected a kind of meta-linguistic knowledge which was different from the subconscious
grammatical knowledge that we intended to assess. In order to minimize the possible effects of
semi-conscious knowledge of phonotactics on participants’ responses, we conducted a lexical
decision experiment with the same groups of participants.
4. EXPERIMENT 2: LEXICAL DECISION
Phonotactic knowledge of native and nonnative listeners of Dutch was also measured in an
online task—a lexical decision task. In this task, reaction times and accuracy scores are
measured and analyzed. Generally, phonotactically illegal nonwords take less time to be rejected
by native speakers than legal ones (Stone & Van Orden 1993; Vitevitch & Luce 1999; Berent,
Marcus, Shimron & Gafos 2002; Coetzee 2004, 2008, 2009; Kager & Shatzman 2007) since
phonotactic knowledge will assist listeners in determining that a nonword is not a word of the
L1’s lexicon. Hence, stimuli that have Dutch-illegal word onset or coda clusters are expected to
have shorter reaction times than stimuli that are phonotactically legal in Dutch. Furthermore,
accuracy rates for nonwords with Dutch-illegal clusters are expected to be higher for the
Dutch native group and the L1-Spanish learners of Dutch. Among the L1-Spanish listeners,
this difference is expected to be larger for the advanced learners than for the beginners. For
the L1-Russian learners, L2-legal and L2-illegal onset and coda clusters are expected to have
approximately the same reaction times and accuracy scores, because both types of clusters are
hypothesized to satisfy the interlanguage phonotactic grammar.
4.1. Participants
Participants in the lexical decision task were the same as in Experiment 1. However, the results
of one of the beginning L1-Spanish participants were excluded because he reacted too slowly
and therefore many responses were missing.
4.2. Materials
The stimuli in the lexical decision experiment were of the same type as the stimuli in Experiment 1. The stimulus lists of the two experimental tasks did not contain the same test
stimuli, although a small number of fillers occurred in both experiments. Based on the same
set of target consonant clusters as in Experiment 1, a new list of stimuli was recorded. In
this list, each target consonant cluster occurred three times—once in a monosyllable, once in
an iamb, and once in a trochee. Moreover, existing words and nonword fillers (identical to
those used in the word-likeness judgment task) were added. The stimulus list of the lexical
decision experiment contained 280 items: 117 test items, 113 existing words, and 50 nonword
fillers.
4.3. Procedure
The lexical decision task took place in the same booth as the word-likeness judgment task.
Again, headphones were used to listen to the stimuli. The participants were seated behind a
button box with a yes-button and a no-button. Because the no-responses are most important
for this experiment, as these recorded the responses to the nonwords, the no-button was under
the dominant hand: for right-handed people under the right hand, for left-handed people under
the left hand. The order of stimuli was randomized for each participant in order to avoid
ordering effects.
The subjects were instructed to press the no-button when they heard a nonexisting word
and the yes-button when they heard an existing word of Dutch. They were also instructed to
respond as quickly and as accurately as possible. Before the real task, a short training session
of five stimuli was presented. In this exercise session, no target items were included. After this
exercise the subjects had the opportunity to ask questions.
The real test session contained 280 trials. After each trial the participant had to press either
the JA (yes) or NEE (no) button on the button box. The participants had to respond within
2400 msec after the beginning of each stimulus; otherwise no response was registered. After
these 2400 msec, the next trial was presented. The trials were presented in random order and
after 140 trials, the subjects had the opportunity to have a short break. After this break, the
other 140 trials were presented. The lexical decision task took about 20 minutes.
4.4. Results
Accuracy rates and reaction times were analyzed for each group of participants. The set of target
items was subdivided into two main categories—items that are phonotactically legal in Dutch
(Types 2 and 3) and items that contain Dutch-illegal consonant clusters (Type 1). The corrected
reaction times were used for the analysis. These were calculated by subtracting the stimulus
duration from the total reaction time measured from the onset of the stimulus. The statistical
tests used to detect significant differences between the responses were t-tests and ANOVAs
(item analyses). Native speakers and L1-Spanish learners of Dutch were expected to have
lower accuracy scores and slower responses to the Dutch-legal (Types 2-3) than to the Dutch-illegal (Type 1) consonant clusters. The responses of the L1-Russian learners of Dutch were
not expected to show this difference. The native speakers of Dutch were also expected to
discriminate between different levels of well-formedness (based on frequency effects) and
ill-formedness (based on markedness effects).
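The reaction-time correction described above amounts to a single subtraction per trial, combined with the 2400 msec response window from the procedure. The sketch below uses invented trial values; the function and field names are hypothetical.

```python
DEADLINE_MS = 2400  # response window, measured from stimulus onset


def corrected_rt(total_rt_ms, stimulus_duration_ms):
    """Return RT relative to stimulus offset, or None if no response counted."""
    if total_rt_ms is None or total_rt_ms > DEADLINE_MS:
        return None
    return total_rt_ms - stimulus_duration_ms


trials = [
    {"total_rt": 1150, "duration": 700},  # made-up values
    {"total_rt": 2600, "duration": 820},  # past the deadline
    {"total_rt": None, "duration": 760},  # no button press
]
corrected = [corrected_rt(t["total_rt"], t["duration"]) for t in trials]
```

Note that a corrected value can be negative when a participant responds before the stimulus has finished playing, for instance when an illegal onset cluster already rules out a Dutch word.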
The overall accuracy scores for the experimental groups were as follows: native speakers
of Dutch: 91%; beginning L1-Russian learners: 74%; advanced L1-Russian learners: 84%;
beginning L1-Spanish learners: 64%; advanced L1-Spanish learners: 83%. These data show
that the more advanced L2 learners had higher accuracy scores than the less advanced L2
learners.
4.4.1. Dutch Participants
The accuracy scores reveal that the native speakers of Dutch are more accurate on the
Dutch-illegal (Type 1) onsets than on the Dutch-legal (Types 2 and 3) onsets: 99.3% versus
94.1% (Mann-Whitney U-test, U = 518474.0, p < .001). The accuracy scores of the Dutch
participants did not distinguish the Dutch-illegal and Dutch-legal codas (98.2% versus 97.9%).
Furthermore, the native speakers needed significantly more time to reject nonwords with
legal onset clusters than to reject nonce words with illegal onset clusters (one-way ANOVA,
F(2, 2099) = 94.642, p < .001). For the coda clusters, however, this difference is not
significant (one-way ANOVA, F(2, 1144) = 1.117, p = .291).
Within the classes of legal and illegal consonant clusters, no significant differences are
observed for the native speakers in the lexical decision task.
4.4.2. L1-Russian L2 Learners of Dutch
The accuracy scores reveal that the beginning and advanced L1-Russian learners of Dutch
were more accurate on the Dutch-illegal (Type 1) clusters than on the other clusters (Types 2
and 3). This was the case for the onset clusters (Beginning learners: 89.9 versus 60.5%;
Advanced learners: 96.8 versus 76.1%; Mann-Whitney U-test, Beginning learners: U = 35280.0,
p < .001; Advanced learners: U = 39146.5, p < .001) as well as for the coda clusters
(Beginning learners: 82.2 versus 52.7%; Advanced learners: 85.2 versus 68.1%; Mann-Whitney
U-test, Beginning learners: U = 10670.0, p < .001; Advanced learners: U = 12176.5,
p = .001).
In general, native speakers responded faster than the L1-Russian learners of Dutch (Figure 10) and the advanced learners responded more quickly than beginning learners (Figure 11).
This generalization held for nonwords with onset clusters as well as for nonce words with
coda clusters. Both groups of L1-Russian learners of Dutch were sensitive to the phonotactic
illegality of Type 1 onset clusters. They needed significantly less time to reject items with
illegal onsets than legal onsets (One-way ANOVA, Beginning learners: F(2, 508) = 12.899,
p < .001; Advanced learners: F(2, 574) = 45.474, p < .001). For the coda clusters, neither
group made this distinction.
Within the classes of legal and illegal consonant clusters, the L1-Russian learners did not
show significant differences. The same was found for the native speakers of Dutch in the
lexical decision task. In short, the accuracy scores suggest sensitivity to both illegal onset and
coda clusters for both groups of L1-Russian learners of Dutch. The reaction times reveal only
a difference between legal and illegal onset clusters (and not for the different types of coda
clusters) for the beginning and advanced learners.
FIGURE 10 Average reaction times (RTs) to non-words containing Type 1, Type 2, and Type 3 word onset
and word coda clusters by Dutch native speakers. RTs were measured by subtracting the stimulus duration
from the total RT measured from the onset of the stimulus.
FIGURE 11 Average reaction times (RTs) to non-words containing Type 1, Type 2, and Type 3 word onset
and word coda clusters by beginning and advanced L1-Russian learners of Dutch. RTs were measured by
subtracting the stimulus duration from the total RT measured from the onset of the stimulus.
FIGURE 12 Average reaction times (RTs) to non-words containing Type 1, Type 2, and Type 3 word onset
and word coda clusters by beginning and advanced L1-Spanish learners of Dutch. RTs were measured by
subtracting the stimulus duration from the total RT measured from the onset of the stimulus.
4.4.3. L1-Spanish Learners of Dutch
The accuracy scores of the beginning L1-Spanish learners did not distinguish the Dutch-legal
(Types 2 and 3) from the Dutch-illegal (Type 1) consonant clusters. The advanced learners were
more accurate on the Dutch-illegal onset clusters than on the other onset clusters: 87.9 versus
82.1% (Mann-Whitney U-test, U = 19195.0, p = .001). For the coda clusters, there was no
difference between the accuracy scores for the Dutch-legal and illegal clusters. An analysis
of the reaction times of the L1-Spanish learners (Figure 12) revealed the following pattern:
Only the advanced learners showed sensitivity to the phonotactic illegality of the Type 1 onset
clusters (One-way ANOVA, F(2, 361) = 26.176, p < .001). The reaction times of this group
did not show a difference between legal and illegal coda clusters. Beginning learners did not
show sensitivity to the distinction between legal and illegal consonant clusters at all in their
reaction times to different stimulus types.
Like the other groups of participants, the L1-Spanish learners did not show significant
differences within the classes of legal and illegal consonant clusters.
4.5. Discussion
The results of the lexical decision task, which are summarized in Table 9, show that the
native speakers of Dutch discriminated between legal and illegal consonant clusters. That is,
correct responses to nonwords containing illegal clusters were faster than correct responses to
phonotactically well-formed nonwords. The L1-Russian learners of Dutch, both the beginning
and advanced learners, showed the same pattern: they made a distinction between consonant
clusters that are legal and illegal in Dutch, although both types are legal in their L1. The
accuracy scores and reaction times of both groups of L1-Russian participants also showed sensitivity to finer-grained word-likeness differences within the class of illegal clusters. Moreover,
the accuracy scores revealed a difference between legal and illegal onset and coda clusters,
whereas the reaction times only differed between legal and illegal onset clusters, not between
legal and illegal coda clusters.

TABLE 9
Significant Differences Between the Reaction Times of the Responses
to the Different Cluster Types

                  Distinction Between Legal    Distinction Within the    Distinction Within the
                  and Illegal Clusters         Class of Legal Clusters   Class of Illegal Clusters
Group             Onset        Coda
Dutch             Yes          No              No                        No
Russian 1         Yes          No              No                        No
Russian 2         Yes          No              No                        No
Spanish 1         No           No              No                        No
Spanish 2         Yes          No              No                        No

For the L1-Spanish learners of Dutch, results of the lexical decision task were different. Only
the advanced learners distinguished between legal and illegal consonant clusters of Dutch. They
only made this distinction for onset clusters, not for coda clusters.
In contrast with the word-likeness judgment task, gradience within the class of legal clusters
or within the class of illegal clusters cannot be shown for any group by the responses in the
lexical decision task.
Table 10 shows that the general results of the different tasks follow the same pattern for the
onset consonant clusters. However, for the coda consonant clusters, the pattern is different. In
the word-likeness judgment task (an offline task), the participants made distinctions that they
did not make in the lexical decision task (an online task). Possibly, the decision for rejecting a
nonword is already made on the basis of the initial part of the word (Marslen-Wilson & Welsh
1978; Marslen-Wilson & Zwitserlood 1989) before the coda has been processed. The same
finding holds for distinctions within the class of legal and illegal clusters, which are presented
in Table 11. The native speakers of Dutch and a number of L2-learners discriminated between
different levels of well-formedness and ill-formedness in the word-likeness judgment task, but
in the lexical decision task there was no such significant difference. This difference between
the tasks suggests that the word-likeness judgment task measures finer-grained differences in
phonotactic well-formedness than the lexical decision task.

TABLE 10
Significant Differences Between Legal and Illegal Consonant Clusters in Both Tasks: Word-Likeness
Judgments (WLJ) and Lexical Decision (LD) for Five Experimental Groups

                          Significant Difference Between        Significant Difference Between
                          Legal and Illegal Onset Clusters      Legal and Illegal Coda Clusters
Native                           LD         LD Reaction                LD         LD Reaction
Language   Level          WLJ    Accuracy   Times               WLJ    Accuracy   Times
Dutch                     Yes    Yes        Yes                 Yes    No         No
Russian    1              Yes    Yes        Yes                 Yes    Yes        No
Russian    2              Yes    Yes        Yes                 Yes    Yes        No
Spanish    1              No     No         No                  Yes    No         No
Spanish    2              Yes    Yes        Yes                 Yes    No         No

TABLE 11
Significant Differences Within the Main Classes in Both Tasks: Word-Likeness Judgments (WLJ) and
Lexical Decision (LD) for Five Experimental Groups

                          Significant Difference Within         Significant Difference Within
                          the Class of Legal Clusters           the Class of Illegal Clusters
Native
Language   Level          WLJ    LD Reaction Times              WLJ    LD Reaction Times
Dutch                     Yes    No                             Yes    No
Russian    1              No     No                             No     No
Russian    2              Yes    No                             No     No
Spanish    1              No     No                             No     No
Spanish    2              No     No                             Yes    No

5. GENERAL DISCUSSION
The L2 acquisition of phonotactic knowledge was examined by means of two experimental tasks
reflecting such knowledge: word-likeness judgments and lexical decision. L2 learners of Dutch
whose native language phonotactics is either a subset or a superset of Dutch phonotactics
took part in these experiments, in addition to a control group of native speakers. Since
most literature on L2 acquisition of phonotactics is based on production tasks, and since
production data may (partially) reflect production difficulties rather than pure tacit grammatical
knowledge, very little was known about tacit phonotactic knowledge of L2 learners and its
development.
The results reveal that native speakers assigned higher word-likeness ratings to nonwords
that have Dutch-legal onsets and needed more time to reject these phonotactically well-formed
nonwords in the lexical decision task. The difference between nonwords containing Dutch-legal and Dutch-illegal codas was only significant in the word-likeness judgment task, which
was found to reflect more fine-grained phonotactic knowledge than lexical decision. Beginning
and advanced L1-Russian learners of Dutch also discriminated between Dutch-legal and Dutch-illegal word onset and coda clusters (although no significant difference between legal and illegal
codas occurred in the lexical decision task). The advanced, but not the beginning, L1-Spanish
learners of Dutch differentiated between Dutch-illegal and Dutch-legal onset clusters in the
word-likeness judgment task as well as in the lexical decision task. Both the beginning and
advanced L1-Spanish learners distinguished Dutch-legal from Dutch-illegal codas in the word-likeness
judgment task, but not in the lexical decision task. That is, both groups of learners
discriminated between legal and illegal consonant clusters. Moreover, the more advanced
learners appeared more native-like than the beginning learners.
Furthermore, the results of the word-likeness judgment task indicate that native speakers
have gradient phonotactic knowledge that distinguishes degrees of well-formedness within the
broad classes of legal and illegal consonant clusters; the lexical decision task elicited less
fine-grained results. In the word-likeness judgment task, not only native speakers exhibited
gradient phonotactic knowledge within the two main classes of consonant clusters; advanced
L2 learners also showed gradient responses to some extent.
In particular, the advanced L1-Russian learners discriminated degrees of word-likeness within
the class of Dutch-legal (Types 2 and 3) onsets, whereas the advanced L1-Spanish learners only
discriminated within the class of the Dutch-illegal (Type 1) onsets. L1-Russian learners did not
distinguish levels of word-likeness within Dutch-illegal clusters (which are Russian-legal) since
they receive no input of these clusters in Dutch. Possibly, a perception experiment in which
more similar Dutch-illegal clusters are included can shed more light on gradient judgments by
L2 speakers.
Since L1-Russian learners of Dutch distinguished the Dutch-illegal clusters from the Dutch-legal clusters, the results of the experiments presented in this study suggest that L2-learners
are able to acquire phonotactic knowledge of a subset grammar. These findings contradict the
strong hypothesis that learners of a superset language cannot achieve target grammars that are
more restrictive. A possibly confounding factor in this study is that the stimuli were pronounced
by a native speaker of Dutch, who may have produced the Dutch-illegal clusters less naturally
than the Dutch-legal ones. To verify whether the stimuli with Dutch-illegal clusters
sounded natural, a naturalness task might be added. Another way to avoid such an effect
would be to repeat this experiment with stimuli that are pronounced by a Russian/Dutch
bilingual.
The question that remains to be addressed at the end of this study is what might explain
our finding that phonotactic acquisition is possible in a ‘subset scenario’ despite the prediction
from learnability theory. We will briefly discuss four logically possible explanations, based on
universal markedness, the L2 initial state, indirect negative evidence, and learning mechanisms
that are not vulnerable to the subset problem.
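The learnability prediction behind the ‘subset scenario’ can be made concrete with a minimal sketch. It assumes, for illustration only, that a phonotactic grammar is simply the set of onset clusters it licenses, and that the learner starts from the L1 grammar and updates it from positive evidence alone. The cluster sets are small samples of the stimulus clusters listed in Appendix A; the ‘Spanish-like’ starting set and the function name are hypothetical simplifications, not part of the study.

```python
# Toy illustration of the subset problem; not the authors' model.
# Grammars are modelled as sets of licensed onset clusters (an assumption).

TYPE_1 = {"kt", "tk", "zb", "zd"}   # Dutch-illegal sample (Appendix A)
TYPE_2 = {"sl", "sm", "sn", "st"}   # Dutch-legal sample
TYPE_3 = {"fl", "pr", "tr"}         # Dutch-legal sample

DUTCH_TARGET = TYPE_2 | TYPE_3
RUSSIAN_LIKE_L1 = TYPE_1 | TYPE_2 | TYPE_3   # superset of the target
SPANISH_LIKE_L1 = set(TYPE_3)                # assumed subset of the target


def learn_from_positive_evidence(initial_grammar, observed_clusters):
    """Hypothetical learner: adds every observed cluster, never removes one."""
    grammar = set(initial_grammar)
    for cluster in observed_clusters:
        grammar.add(cluster)
    return grammar


# Dutch input contains only Dutch-legal clusters.
dutch_input = sorted(DUTCH_TARGET)

from_subset = learn_from_positive_evidence(SPANISH_LIKE_L1, dutch_input)
from_superset = learn_from_positive_evidence(RUSSIAN_LIKE_L1, dutch_input)

# Does the subset learner reach the target grammar?
print(from_subset == DUTCH_TARGET)
# Which clusters does the superset learner fail to unlearn?
print(sorted(from_superset - DUTCH_TARGET))
```

Under these assumptions the learner who starts from the superset grammar never retreats to the more restrictive target, which is the prediction that the L1-Russian results appear to contradict.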
First, L1-Russian learners of Dutch might derive implicit knowledge about the relative
well-formedness of word margins from universal markedness (Pertz & Bever 1975; Berent,
Steriade, Lennertz & Vaknin 2007). Most Type 1 (Dutch-illegal) clusters violate markedness
constraints such as the sonority sequencing principle, OCP-PLACE, and the ban against voiced
obstruent clusters, constraints which are all satisfied by Type 2/3 (Dutch-legal) clusters. Under
a markedness account, no exposure to Dutch input is needed for L1-Russian learners of Dutch
to represent the relative ill-formedness of Type 1 as compared to Type 2/3 clusters. Although all
three cluster types are legal in Russian, the L1, markedness constraints assess Type 1 clusters
as less well-formed than Type 2/3 clusters. This account correctly predicts that Type 1 and
Type 2/3 clusters differ in word-likeness ratings for both subgroups of L1-Russian participants,
beginning and advanced learners. Since our experiments were not designed to monitor learners’
representations of the absolute illegality of Type 1 clusters, but only the perceived differences
in word-likeness between Types 1-2-3, we cannot rule out the interpretation, consistent with
the markedness account, that L1-Russian learners represent all three cluster types, including
Dutch-illegal Type 1, as legal based on their L1. Although a markedness account is compatible
with L1-Russian learners’ responses to Type 1 versus Types 2/3 clusters, it is nevertheless
incomplete for two reasons. First, it cannot account for our finding that only the advanced
Russian learners, not the beginners, discriminated degrees of word-likeness within the class of
Dutch-legal onsets. This finding suggests that phonotactic knowledge of the target language
becomes more native-like and fine-grained in the course of acquisition. Although a markedness
account does not rule out phonotactic development, it also makes no predictions in this respect.
Second, a markedness account fails to explain the correlation between L1-Russian learners’
word-likeness judgments of the nonword stimuli and their phonotactic probabilities based on
the Dutch lexicon. In sum, this account ultimately falls short of explaining how fine-grained
phonotactic knowledge of the target language might develop under the subset scenario.
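How a markedness account could rank clusters without any Dutch input can be sketched as a violation count. The sketch below is a rough illustration and not the analysis assumed in this study: the three-step sonority scale, the weak ‘no sonority reversal’ reading of sonority sequencing (plateaus are tolerated), the segment classes, and all function names are assumptions, and OCP-PLACE is left out entirely.

```python
# Toy violation counter for two simplified markedness constraints.
# Segment classes and the sonority scale are illustrative assumptions.

SONORITY = {
    # obstruents (stops and fricatives share one level in this sketch)
    "p": 0, "t": 0, "k": 0, "b": 0, "d": 0,
    "f": 0, "s": 0, "x": 0, "z": 0,
    # nasals
    "m": 1, "n": 1,
    # liquids
    "l": 2, "r": 2,
}
VOICED_OBSTRUENTS = {"b", "d", "z"}


def sonority_reversals(onset):
    """Count adjacent pairs in which sonority falls toward the vowel."""
    return sum(
        1 for left, right in zip(onset, onset[1:])
        if SONORITY[right] < SONORITY[left]
    )


def voiced_obstruent_pairs(onset):
    """Count adjacent pairs in which both segments are voiced obstruents."""
    return sum(
        1 for left, right in zip(onset, onset[1:])
        if left in VOICED_OBSTRUENTS and right in VOICED_OBSTRUENTS
    )


def violations(onset):
    """Total violations of the two toy constraints for one onset cluster."""
    return sonority_reversals(onset) + voiced_obstruent_pairs(onset)


for cluster in ["rt", "zb", "zd", "sl", "st", "pr"]:
    print(cluster, violations(cluster))
```

Several Dutch-illegal clusters, such as /kt/ or /zl/, incur no violation under these two crude constraints. The sketch therefore only shows the mechanism by which universal constraints could separate cluster types without target-language input; it does not reproduce the full classification.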
A second logical possibility is that our assumption about the initial state of L2 equaling the
final state of the L1 is incorrect. For example, the initial state of the L2 phonotactic grammar
might be identical to the initial state of L1, involving no transfer from the native language (see
Epstein, Flynn & Martohardjono (1996) on the ‘no-transfer/full-access-to-UG’ hypothesis).
This would immediately solve the subset problem for L1-Russian learners. However, the no-transfer account is highly unlikely to hold for L2 phonotactic acquisition, on the basis of a
large body of results from L2 production and perception, reviewed in section 1. Although
our experiments did not adduce evidence for transfer in L2-learners’ responses to nonwords,
the overall evidence for transfer from L2 phonotactic acquisition studies is too substantial to
ignore.
A third possibility is that our assumption that learners receive no negative input about clusters
that are illegal in Dutch word margins may simply be too strong. For example, learners might
derive negative evidence against the legality of syllable margins from the syllabification of
word-medial clusters. Many clusters that are illegal in Dutch word margins, including all our
Type 1 clusters, are phonotactically legal when occurring in intervocalic position, where they
span a syllable boundary (for example, /zd/ in esdoorn ‘maple tree’ or /xm/ in stigma ‘stigma’).
Such syllabifications might offer indirect negative evidence against these clusters in their role as
syllable margins, under the assumption that the learner can compare syllabification candidates
of intervocalic clusters ([Vz.dV] > [V.zdV]) (Tesar & Smolensky 2000). This account faces two
difficulties. First, syllabification cues are often subtle, difficult to detect, and ambiguous. This
causes problems especially for superset learners, since ambiguity in the learners’ input patterns
reinforces superset grammars. Second, the fact alone that a cluster is obligatorily heterosyllabic
fails to rule it out as a legal syllable margin. For example, a cluster such as /nt/, legal as a
word coda, is heterosyllabic in intervocalic position. For these reasons, it is highly uncertain
whether this source of indirect negative evidence might avoid the subset problem.
A final possibility to be considered is that the acquisition of phonotactic knowledge may
rely strongly on learning mechanisms that are relatively invulnerable to the subset problem, in
particular statistical learning. Correlations between phonotactic distributions in the lexicon and
subjects’ responses to nonwords are well-attested (see again references in section 1). Moreover,
adults and infants are able to learn phonotactic patterns from relatively short exposure based on
distributional cues in artificial language learning (Onishi, Chambers & Fisher 2002; Chambers,
Onishi & Fisher 2003). This accords with our finding that L1-Russian learners’ word-likeness
judgments are correlated with Dutch bi-phone probabilities. However, probabilistic accounts of
phonotactic acquisition fail to explain that L1-Russian learners’ word-likeness judgments show
even stronger correlations with the legality/illegality of clusters in Dutch, which may tentatively
be interpreted in favor of a more coarse-grained, abstract representation of phonotactic
well-formedness.33 In light of the results of the current study, it is highly unlikely that the acquisition
of phonotactic knowledge is based on a single learning mechanism.
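The statistical-learning account can be illustrated with a small sketch of how bi-phone probabilities might be estimated from a lexicon and compared with word-likeness ratings. Everything concrete here is hypothetical: the six-word toy lexicon, the one-character-per-segment transcription, the add-one smoothing, the ratings, and the function names are assumptions for illustration, not the computation used in this study.

```python
import math
from collections import Counter


def biphone_counts(lexicon):
    """Count adjacent segment pairs; '#' marks word edges."""
    counts = Counter()
    for word in lexicon:
        padded = "#" + word + "#"
        counts.update(zip(padded, padded[1:]))
    return counts


def mean_log_prob(word, counts, alpha=1.0):
    """Mean add-alpha smoothed log probability of a word's bi-phones."""
    total = sum(counts.values())
    n_types = len(counts) + 1  # one extra type reserves mass for unseen pairs
    padded = "#" + word + "#"
    pairs = list(zip(padded, padded[1:]))
    logs = [
        math.log((counts[pair] + alpha) / (total + alpha * n_types))
        for pair in pairs
    ]
    return sum(logs) / len(logs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


toy_lexicon = ["stop", "stam", "slot", "trap", "pram", "flat"]  # hypothetical
nonwords = ["stap", "trat", "ktap", "zbat"]                     # hypothetical
ratings = [6.0, 5.5, 2.0, 1.5]                                  # invented ratings

counts = biphone_counts(toy_lexicon)
scores = [mean_log_prob(word, counts) for word in nonwords]
print(pearson(scores, ratings))
```

A purely probabilistic score of this kind is gradient by construction. The stronger correlation with categorical legality reported above is what such a score alone would leave unexplained.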
ACKNOWLEDGMENTS
The authors wish to thank three anonymous reviewers and the associate editor for their helpful
comments and suggestions, and Tom Lentz for commenting on a previous version of this
article. This research was supported by a grant from the Netherlands Organisation for Scientific
Research (NWO) (277-70-001) to the second author.
REFERENCES
Adriaans, Frans & René Kager. To appear. Adding generalization to statistical learning: The induction of phonotactics
from continuous speech.
Albright, Adam. 2009. Feature-based generalisation as a source of gradient acceptability. Phonology 26(1), to appear.
Altenberg, E. P. 2005. The judgement, perception, and production of consonant clusters in a second language.
International Review of Applied Linguistics in Language Teaching 43. 53–80.
Angluin, Dana. 1980. Inductive inference of formal languages from positive data. Information and Control 45. 117–135.
Baayen, Harald, Richard Piepenbrock & Leon Gulikers. 1995. The CELEX lexical database. Philadelphia: Linguistic
Data Consortium, University of Pennsylvania.
Bailey, Todd M. & Ulrike Hahn. 2001. Determinants of word-likeness: Phonotactics or lexical neighborhoods? Journal
of Memory and Language 44. 568–591.
Baker, C. Lee. 1979. Syntactic theory and the projection problem. Linguistic Inquiry 10. 533–581.
Barchudarova, S. G., S. I. Ožegova & A. B. Šapiro. 1967. Orfografičeskij Slovar’ Russkogo Jazyka. Moscow: Sovetskaja
Enciklopedija.
Berent, Iris, T. Lennertz, P. Smolensky & V. Vaknin. 2009. Listeners’ knowledge of phonological universals: Evidence
from nasal clusters. Phonology 26(1), to appear.
Berent, Iris, Gary F. Marcus, Joseph Shimron & Adamantios I. Gafos. 2002. The scope of linguistic generalizations:
Evidence from Hebrew word formation. Cognition 83. 113–139.
Berent, Iris & Joseph Shimron. 1997. The representation of Hebrew words: Evidence from the obligatory contour
principle. Cognition 64. 39–72.
Berent, Iris, Donca Steriade, Tracy Lennertz & Vered Vaknin. 2007. What we know about what we have never heard:
Evidence from perceptual illusions. Cognition 104. 591–630.
Berwick, Robert. 1985. The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Bhatt, Rakesh M. & Barbara Hancin-Bhatt. 1997. Optimal L2 syllables: Interactions of transfer and developmental
effects. Studies in Second Language Acquisition 19. 331–378.
Blevins, Juliette. 1995. The syllable in phonological theory. In J. Goldsmith (ed.) Handbook of phonological theory,
206–244. Cambridge, MA: Blackwell.
Broselow, Ellen. 1987. An investigation of transfer in second language phonology. In G. Ioup & S. Weinberger (eds.)
Interlanguage phonology, 261–278. Cambridge, MA: Newbury House.
Broselow, Ellen, Su-I Chen & Chilin Wang. 1998. The emergence of the unmarked in second language phonology.
Studies in Second Language Acquisition 20. 261–280.
Broselow, Ellen & Daniel Finer. 1991. Parameter setting in second language phonology and syntax. Second Language
Research 7. 35–59.
Carlisle, Robert S. 1988. The effects of markedness on epenthesis in Spanish/English interlanguage phonology. Issues
and Developments in English and Applied Linguistics 3. 15–23.
33 This is in line with recent computational models of phonotactic learning (Hayes & Wilson 2008; Albright 2009;
Adriaans & Kager submitted), which combine statistical learning with feature-based generalization.
Carlisle, Robert S. 1998. The acquisition of onsets in a markedness relationship: A longitudinal study. Studies in
Second Language Acquisition 20. 245–260.
Chambers, Kyle E., Kristine H. Onishi & Cynthia Fisher. 2003. Infants learn phonotactic regularities from brief auditory
experience. Cognition 87. 69–77.
Chew, Peter A. 2000. A computational phonology of Russian. PhD dissertation, University of Oxford.
Coetzee, Andries W. 2004. What it means to be a loser: Non-optimal candidates in optimality theory. PhD dissertation,
University of Massachusetts, Amherst, MA.
Coetzee, Andries W. 2005. The OCP in the perception of English. In S. Frota, M. Vigario & M. J. Freitas (eds.),
Prosodies, 223–245. New York: Mouton de Gruyter.
Coetzee, Andries W. 2008. Grammaticality and ungrammaticality in phonology. Language 84. 218–257.
Coetzee, Andries W. 2009. Grammar is both categorical and gradient. In S. Parker (ed.), Phonological argumentation:
Essays on evidence and motivation. London: Equinox Publishers.
Coleman, John & Janet B. Pierrehumbert. 1997. Stochastic phonological grammars and acceptability. In Proceedings
of the Third Meeting of the ACL Special Interest Group in Computational Phonology. Somerset, NJ: Association
for Computational Linguistics.
Davidson, Lisa. 2003. The atoms of phonological representation: Gestures, coordination and perceptual features in
consonant cluster phonotactics. PhD dissertation, Johns Hopkins University, Baltimore.
Davidson, Lisa. 2007. The relationship between the perception of non-native phonotactics and loanword adaptation.
Phonology 24. 261–286.
Davidson, Lisa, Jason Shaw & Tuuli Adams. 2007. The effect of word learning on the perception of non-native
consonant sequences. Journal of the Acoustical Society of America 122. 3697–3709.
Davies, Mark. 2002. Corpus del Español (100 million words, 1200s–1900s). Available online at http://www.corpusdelespanol.org/
Dell, Gary S., Kristopher D. Reed, David R. Adams & Antje S. Meyer. 2000. Speech errors, phonotactic constraints,
and implicit learning: A study of the role of experience in language production. Journal of Experimental Psychology:
Learning, Memory, and Cognition 26. 1355–1367.
Dupoux, Emmanuel, Kazuhiko Kakehi, Yuki Hirose, Christophe Pallier, & Jacques Mehler. 1999. Epenthetic vowels
in Japanese: A perceptual illusion? Journal of Experimental Psychology: Human Perception and Performance 25.
1568–1578.
Dupoux, Emmanuel, Christophe Pallier, Kazuhiko Kakehi, & Jacques Mehler. 2001. New evidence for prelexical
phonological processing in word recognition. Language and Cognitive Processes 16(5). 491–505.
Eckman, Fred R. 1977. Markedness and the contrastive analysis hypothesis. Language Learning 27. 315–330.
Eckman, Fred R. 1987. On the reduction of word-final consonant clusters in interlanguage. In A. James & J. Leather
(eds.), The sound pattern of second language acquisition, 143–162. Dordrecht: Foris Publications.
Epstein, Samuel D., Suzanne Flynn & Gita Martohardjono. 1996. Second language acquisition: Theoretical and
experimental issues in contemporary research. Behavioral and Brain Sciences 19. 677–758.
Escudero, Paola. 2005. Linguistic perception and second language acquisition: Explaining the acquisition of optimal
phonological categorization. PhD dissertation, Utrecht University.
Escudero, Paola & Paul Boersma. 2002. The subset problem in L2 perceptual development: Multiple-category assimilation by Dutch learners of Spanish. In B. Skarabela, S. Fish & H.-J. Do (eds.) Proceedings of the 26th Annual
Boston University Conference on Language Development, 208–219. Somerville, MA: Cascadilla Press.
Friederici, Angela D. & Jeanine M. I. Wessels. 1993. Phonotactic knowledge and its use in infant speech perception.
Perception and Psychophysics 54. 287–295.
Frisch, Stephan A., Nathan R. Large & David B. Pisoni. 2000. Perception of word-likeness: Effects of segment
probability and length on the processing of nonwords. Journal of Memory and Language 42. 481–496.
Frisch, Stephan A. & Bushra A. Zawaydeh. 2001. The psychological reality of OCP-Place in Arabic. Language 77.
91–106.
Goldrick, Matthew. 2004. Phonological features and phonotactic constraints in speech production. Journal of Memory
and Language 51. 586–603.
Greenberg, Joseph. 1965. Some generalizations concerning initial and final consonant sequences. Linguistics 18. 5–34.
Hallé, Pierre, Juan Segui, Uli Frauenfelder & Christine Meunier. 1998. The processing of illegal consonant clusters:
a case of perceptual assimilation? Journal of Experimental Psychology: Human Perception and Performance 24.
592–608.
Hancin-Bhatt, Barbara. 2000. Optimality in second language phonology: Codas in Thai ESL. Second Language
Research 16. 201–232.
Hancin-Bhatt, Barbara & Rakesh M. Bhatt. 1997. Optimal L2 syllables: Interactions of transfer and developmental
effects. Studies in Second Language Acquisition 19. 331–378.
Harnsberger, James D. 2001. On the relationship between identification and discrimination of non-native nasal consonants. Journal of the Acoustical Society of America 110. 489–503.
Harris, James. 1983. Syllable structure and stress in Spanish: A nonlinear analysis. Cambridge, MA: MIT Press.
Haunz, Christine. 2002. Speech perception in loanword adaptation. Talk presented at the Postgraduate Conference of
the Edinburgh University, Department of Theoretical and Applied Linguistics, May 27–28, 2002.
Hay, Jennifer, Janet Pierrehumbert & Mary Beckman. 2004. Speech perception, wellformedness and the statistics of
the lexicon. In J. Local, R. Ogden & R. Temple (eds.), Phonetic interpretation: Papers in laboratory phonology VI,
58–74. Cambridge: Cambridge University Press.
Hayes, Bruce. 1984. The phonetics and phonology of Russian voicing assimilation. In M. Aronoff & R. T. Oehrle
(eds.), Language Sound Structure, 318–328. Cambridge, MA: MIT Press.
Hayes, Bruce. 2004. Phonological acquisition in Optimality Theory: The early stages. In R. Kager, J. Pater &
W. Zonneveld (eds.), Constraints on phonological acquisition. Cambridge: Cambridge University Press.
Hayes, Bruce & Colin Wilson. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic
Inquiry 39. 379–440.
Hualde, Jose I. 1991. On Spanish syllabification. In H. Campos & F. Martínez-Gil (eds.), Current studies in Spanish
linguistics, 475–493. Washington, DC: Georgetown University Press.
Jakobson, Roman. 1941/1968. Child language, aphasia and phonological universals. The Hague: Mouton.
Jusczyk, Peter W., Angela D. Friederici, Jeanine M. Wessels, Vigdis Y. Svenkerud & Ann Marie Jusczyk. 1993.
Infants’ sensitivity to the sound patterns of native language words. Journal of Memory and Language 32. 402–420.
Jusczyk, Peter W., Paul A. Luce & Jan Charles-Luce. 1994. Infants’ sensitivity to phonotactic patterns in the native
language. Journal of Memory and Language 33. 630–645.
Kabak, Baris & William J. Idsardi. 2007. Perceptual distortions in the adaptation of English consonant clusters: Syllable
structure or consonantal contact constraints? Language and Speech 50. 23–52.
Kager, René & Wim Zonneveld. 1986. Schwa, syllables, and extrametricality in Dutch. The Linguistic Review 5.
197–221.
Kager, René & Keren Shatzman. 2007. Phonological constraints in speech processing. In B. Los & M. van Koppen
(eds.), Linguistics in the Netherlands 2007, 99–111.
Kempgen, Sebastian. 1995. Phonemcluster und Phonemdistanzen (im Russischen). In D. Weiss (ed.), Slavische Linguistik 1994, 197–221. München.
Kucera, Henry & George K. Monroe. 1968. A comparative quantitative phonology of Russian, Czech, and German.
New York: American Elsevier Publishing Company.
Levy, Erika S. & Winifred Strange. 2008. Perception of French vowels by American English adults with and without
French language experience. Journal of Phonetics 36. 141–157.
Lönngren, Lennart. 1993. Chastotnyj Slovar’ Sovremennogo Russkogo Jazyka (A frequency dictionary of modern
Russian with a summary in English). Acta Universitatis Upsaliensis, Studia Slavica Upsaliensia 32, Uppsala.
Marslen-Wilson, William D. & Alan Welsh. 1978. Processing interactions and lexical access during word recognition
in continuous speech. Cognitive Psychology 10. 29–63.
Marslen-Wilson, William D. & Pienie Zwitserlood. 1989. Accessing spoken words: The importance of word onsets.
Journal of Experimental Psychology: Human Perception and Performance 15. 576–585.
Martínez-Celdrán, Eugenio, Ana M. Fernández-Planas & Josefina Carrera-Sabaté. 2003. Castilian Spanish. Journal of
the International Phonetic Association 33. 255–259.
Massaro, Dominic W. & Michael M. Cohen. 1983. Phonological constraints in speech perception. Perception and
Psychophysics 34. 338–348.
Mattys, Sven L. & Peter W. Jusczyk. 2001. Phonotactic cues for segmentation of fluent speech by infants. Cognition
78. 91–121.
Mattys, Sven L., Peter W. Jusczyk, Paul A. Luce & James L. Morgan. 1999. Phonotactic and prosodic effects on word
segmentation in infants. Cognitive Psychology 38. 465–494.
McQueen, James. 1998. Segmentation of continuous speech using phonotactics. Journal of Memory and Language 39.
21–46.
McQueen, James, Takashi Otake & Anne Cutler. 2001. Rhythmic cues and possible-word constraints in Japanese speech
segmentation. Journal of Memory and Language 45. 103–132.
Moreton, Elliott. 2002. Structural constraints in the perception of English stop-sonorant clusters. Cognition 84. 55–71.
Moreton, Elliott & Shigeaki Amano. 1999. Phonotactics in the perception of Japanese vowel length: Evidence for long-distance dependencies. In Proceedings of the 6th European Conference on Speech Communication and Technology,
Budapest, Hungary.
Onishi, Kristine H., Kyle E. Chambers & Cynthia Fisher. 2002. Learning phonotactic constraints from brief auditory
exposure. Cognition 83. 13–23.
Ostapenko, Olesya. 2005. The optimal L2 Russian syllable onset. In Linguistics Students Organization Working Papers
in Linguistics 5: Proceedings of the Workshop in General Linguistics 2005, 140–151. Madison, WI: Department of
Linguistics, University of Wisconsin-Madison.
Pensado, Carmen. 1985. On the interpretation of the non-existent: Nonoccurring syllable types in Spanish phonology.
Folia Linguistica 19. 313–320.
Peperkamp, Sharon & Emmanuel Dupoux. 2003. Reinterpreting loanword adaptations: The role of perception. Proceedings of the 15th International Congress of Phonetic Sciences, 367–370.
Pertz, Doris L. & Thomas G. Bever. 1975. Sensitivity to phonological universals in children and adolescents. Language
51. 149–162.
Pitt, Mark A. 1998. Phonological processes and the perception of phonotactically illegal consonant clusters. Perception
and Psychophysics 60. 941–951.
Praamstra, Peter, Antje S. Meyer & Willem J. M. Levelt. 1994. Neurophysiological manifestations of phonological
processing: Latency variations of a negative ERP component time-locked to phonological mismatch. Journal of
Cognitive Neuroscience 6. 204–219.
Prince, Alan & Paul Smolensky. 1993. Optimality theory: Constraint interaction in generative grammar. Technical
Report, Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ, and Computer Science
Department, University of Colorado, Boulder.
Prince, Alan & Bruce Tesar. 2004. Learning phonotactic distributions. In R. Kager, J. Pater & W. Zonneveld (eds.),
Constraints on phonological acquisition. Cambridge: Cambridge University Press.
Quilis, Antonio & Joseph A. Fernández. 1992. Curso de Fonética y Fonología Españolas. Madrid: Consejo Superior
de Investigaciones Científicas.
Rochet, Bernard L. & Anne Putnam Rochet. 1999. Effects of L1 phonotactic constraints on L2 speech perception. In
J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville & A. Baily (eds.), Proceedings of the 14th International Congress
of Phonetic Sciences, 1443–1446. Berkeley, CA: University of California.
Saffran, Jenny R. & Erik D. Thiessen. 2003. Pattern induction by infant language learners. Developmental Psychology
39. 484–494.
Scheer, Tobias. 2000. De la localité, de la morphologie et de la phonologie en phonologie. Thèse d’Habilitation,
Université de Nice.
Scholes, Robert J. 1966. Phonotactic grammaticality. The Hague: Mouton and Co.
Ševeleva, M. S. 1974. Obratnyj Slovar’ Russkogo Jazyka (Reverse dictionary of the Russian language). Moscow:
Sovetskaja Enciklopedija.
Silverman, Daniel. 1992. Multiple scansions in loanword phonology: Evidence from Cantonese. Phonology 9. 289–
328.
Smolensky, Paul. 1996. The initial state and ‘richness of the base’ in optimality theory. Technical Report, Department
of Cognitive Science, Johns Hopkins University.
Stone, Gregory O. & Guy C. van Orden. 1993. Strategic control of processing in word recognition. Journal of
Experimental Psychology: Human Perception and Performance 19. 744–774.
Suomi, Kari, James M. McQueen & Anne Cutler. 1997. Vowel harmony and speech segmentation in Finnish. Journal
of Memory and Language 36. 422–444.
Taylor, Conrad F. & George Houghton. 2005. Learning artificial phonotactic constraints: Time course, durability, and
relationships to natural constraints. Journal of Experimental Psychology: Learning, Memory, and Cognition 31.
1398–1416.
Taylor, W. L. 1953. Cloze procedure: A new tool for measuring readability. Journalism Quarterly 30. 414–438.
Tesar, Bruce & Paul Smolensky. 2000. Learnability in Optimality Theory. Cambridge, MA: MIT Press.
Trommelen, Mieke. 1983. The syllable in Dutch; with special reference to diminutive formation. Dordrecht: Foris.
Vitevitch, Michael S. & Paul A. Luce. 1998. When words compete: Levels of processing in perception of spoken
words. Psychological Science 9. 325–329.
Vitevitch, Michael S. & Paul A. Luce. 1999. Probabilistic phonotactics and neighborhood activation in spoken word
recognition. Journal of Memory and Language 40. 374–408.
Vroomen, Jean, Jyrki Tuomainen & Beatrice de Gelder. 1998. The roles of word stress and vowel harmony in speech
segmentation. Journal of Memory and Language 38. 133–149.
Weber, Andrea & Anne Cutler. 2006. First-language phonotactics in second-language listening. Journal of the Acoustical Society of America 119. 597–607.
Weinberger, Steven. 1988. Theoretical foundations of second language phonology. PhD dissertation, University of
Washington.
Wheeler, Max. 1979. The Phonology of Catalan. Oxford: Blackwell.
Yip, Michel C. W. 1993. Cantonese loanword phonology and optimality theory. Journal of East Asian Linguistics 2.
261–291.
Zonneveld, Wim. 1983. Lexical and phonological properties of Dutch voicing assimilation. In M. Van den Broeke,
V. Van Heuven & W. Zonneveld (eds.), Sound structures, studies for Antonie Cohen, 297–312. Dordrecht: Foris.
Submitted 10 June 2008
Final version accepted 28 April 2009
APPENDIX A
Target Nonwords in the Word-Likeness Judgment Task

Target Part  Cluster Type  Cluster  Monosyllable  Monosyllable  Iamb      Trochee
Onset        1             kt       ktɑm          ktɔl          ktope     ktela
Onset        1             rt       rtɑn          rtεk          rtomun    rtono
Onset        1             tk       tke           tkɔl          tkoman    tkari
Onset        1             xm       xmɑt          xmεn          xmotun    xmado
Onset        1             zb       zbal          zbɔt          zbomεl    zbeli
Onset        1             zd       zdεk          zdur          zdaman    zdolu
Onset        1             zl       zlεn          zlɔm          zlaton    zlara
Onset        1             zn       znεr          znus          znilɔn    znuri
Onset        1             fpr      fprɑn         fprel         fpriton   fprani
Onset        1             fsp      fspɑm         fspe          fspatan   fspudo
Onset        1             fst      fstɑm         fstur         fstiman   fstano
Onset        1             skl      sklɑn         sklir         sklomεl   sklida
Onset        1             zdr      zdrɑn         zdre          zdromun   zdromo
Onset        1             fspl     fsplar        fsplur        fsploda   fsplonir
Onset        1             fstr     fstrɑn        fstruk        fstreni   fstrinεl
Onset        2             sl       slɑt          slεn          slomɔn    sladi
Onset        2             sm       smn           smɔt          smatan    smoni
Onset        2             sn       snɑt          snɔk          snamɔn    snoda
Onset        2             st       stim          stuf          stame     stano
Onset        2             spl      splir         splur         spliton   spledi
Onset        2             str      strɔl         strun         striman   strano
Onset        3             fl       flɔs          fle           flatun    flira
Onset        3             pr       prɔn          prεn          proman    prano
Onset        3             tr       trεl          trun          trilan    traka
Coda         1             zm       pɑzm          tεzm          larɔzm    lirɑzm
Coda         1             nsk      lɑnsk         mɔnsk         tolεnsk   dolɑnsk
Coda         1             rsk      mɔrsk         pεrsk         pimεrsk   teprsk
Coda         1             stf      lɑstf         mɔstf         kapɔstf   pakɔstf
Coda         1             str      kεstr         lɔstr         tonεstr   molɑstr
Coda         2             kt       kεkt          mɑkt          rotɔkt    rotɑkt
Coda         2             nt       dɔnt          rnt           litεnt    melɔɔnt
Coda         2             rk       fεrk          trk           palɔrk    latrk
Coda         2             rm       lεrm          tɔrm          tilɔrm    kedɔrm
Coda         2             rs       dɑrs          lɔrs          tanεrs    makɔrs
Coda         2             rt       kεrt          pεrt          monɔrt    rokart
Coda         3             ls       mɔls          rεls          sirɔls    tarils
Coda         3             ns       mɔns          tɔns          ramεns    dalons
APPENDIX B
Target Nonwords in the Lexical Decision Task

Target Part  Cluster Type  Cluster  Monosyllable  Iamb      Trochee
Onset        1             kt       kte           ktilon    ktepo
Onset        1             rt       rtɑl          rtamεl    rtado
Onset        1             tk       tkεn          tkamun    tkali
Onset        1             xm       xmɑn          xmitan    xmopa
Onset        1             zb       zbi           zbatan    zbemo
Onset        1             zd       zde           zdalun    zdeno
Onset        1             zl       zlɑp          zlinal    zlita
Onset        1             zn       znɑp          znimon    znoki
Onset        1             fpr      fprɔs         fprolan   fpreda
Onset        1             fsp      fspεn         fspiran   fspako
Onset        1             fst      fstɔs         fstamεl   fstumi
Onset        1             skl      sklεn         sklaton   skloda
Onset        1             zdr      zdren         zdronal   zdrena
Onset        1             fspl     fsplɔt        fsplimεl  fsplamo
Onset        1             fstr     fstrak        fstromir  fstrola
Onset        2             sl       slam          sliman    sloko
Onset        2             sm       smεr          smilan    smida
Onset        2             sn       snεp          snitan    snado
Onset        2             st       stun          stolir    stoda
Onset        2             spl      splɔt         spliret   splino
Onset        2             str      strɔs         strale    stropa
Onset        3             fl       flar          flaman    flemi
Onset        3             pr       prɔt          pramɔl    praki
Onset        3             tr       tros          trone     tralu
Coda         1             zm       dɑzm          tilεzm    pelzm
Coda         1             nsk      rεnsk         satɔnsk   ratεnsk
Coda         1             rsk      trsk          molεrsk   kodεrsk
Coda         1             stf      lɔstf         tonεstf   tatɔstf
Coda         1             str      dεstr         rimɔstr   dolɑstr
Coda         2             kt       rɑkt          lamɔkt    lotɑkt
Coda         2             nt       dɑnt          kolɔnt    pokɑnt
Coda         2             rk       dɔrk          timɑrk    kemrk
Coda         2             rm       pεrm          minɔrm    dedrm
Coda         2             rs       pɑrs          tilɔrs    takɔrs
Coda         2             rt       lεrt          pamεrt    lodεrt
Coda         3             ls       kεls          panɔls    merls
Coda         3             ns       kɔns          palɔns    ladɔns