January 23:  Categoricity and Variation in Phonology

Schedule

8:30-9:30am:   Breakfast

9:30-9:45am:   Paul Smolensky: Introduction to Today's Symposium

9:45-10:35am:  Bruce Hayes: Stochastic Phonological Knowledge: The Case of Hungarian Vowel Harmony

10:35-11:00am:  Coffee Break

11:00-11:50am: Rachel Hayes-Harb: Learning L2 Phonological Categories

11:50am-12:15pm:  Coffee Break

12:15-1:05pm:  Gary Dell: Phonological speech errors: A psycholinguistic perspective on learning and phonological processing

1:05-2:30pm:  Lunch!

2:30-3:20pm:  Adam Albright: Modeling variation and uncertainty in irregular morphophonology: the case of Spanish diphthongization

3:20-3:45pm:  Coffee Break

3:45-4:35pm:  Diamandis Gafos: Nonlinear links between continuity and discreteness: transparency in vowel harmony

4:35-5:00pm:  Coffee Break

5:00-6:00pm:  Discussion

7:00pm:  Party chez Legendre-Smolensky

Abstracts:

Bruce Hayes

Stochastic Phonological Knowledge: The Case of Hungarian Vowel Harmony

What happens when the language learner encounters a chaotic, unpredictable pattern in the inflectional paradigms of her language? Following Zuraw (2000) and others, I assume that she adopts the following strategy:

I. Memorize the individual inflected forms falling within the chaotic region.
II. Construct a stochastic grammar that predicts the behavior of novel stems within the chaotic region as accurately as possible. Such a grammar can be said to project lexical variation into free variation.

This is arguably the most practical strategy, especially for the young child. (I) maximizes accuracy for known forms, while (II) facilitates word recognition and maximizes accuracy when the child must guess the inflected form for a given base.

For similar reasons, I assume further that the language learner must also:

III. Constrain the chaotic region as much as possible, by locating the complementary regions where inflection is fully predictable.

This talk will support this overall view of paradigm acquisition with data and analysis of a chaotic region of the lexicon of Hungarian. I describe research in progress done in collaboration with Zsuzsa Londe of the UCLA Department of Applied Linguistics.

The chaotic region for Hungarian paradigms involves vowel harmony in stems whose vowel sequence ends with a back vowel followed by one or more front unrounded vowels. For such stems, harmony is not generally predictable, so Hungarians must memorize for each such stem whether it takes front- or back-voweled suffixes. However, good statistical generalizations exist that characterize the chaotic region (Siptár and Törkenczy 2000): (a) The lower the front unrounded vowel a stem contains, the more likely it will take front suffixes; (b) Stems ending with two front unrounded vowels take front suffixes more often than those with just one. We verify and quantify these claims with a corpus of 1100 Hungarian stems inflected in the dative case (front -nek or back -nak) by three native speakers, as well as an automated Google survey of 10,000 dative-case forms as they were used by Hungarians posting on the Internet.

To show that these lexical generalizations are projected to novel stems, we conducted an experiment in which 171 native speakers formed datives by attaching -nek or -nak to made-up stems. Our subjects behaved variably, in a way that statistically reflects both generalizations (a) and (b). Thus, Hungarian speakers do indeed project lexical variation into free variation.

Lastly, we adopt and extend a formal model of this process originally proposed by Zuraw (2000). The model makes use of the stochastic (Boersma 1997) variant of Optimality Theory (Prince and Smolensky 1993). Language learners confronted with inflectional variation memorize unpredictable forms, but from them construct a stochastic constraint ranking which governs their behavior in novel situations. However, some of the constraint rankings are not stochastic, but rigid. This accounts for why not just anything can be memorized; e.g. for Hungarian, no stem with all back vowels ever takes front suffixes.
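
As a rough illustration of how a stochastic constraint ranking can yield variable outputs while a rigid ranking does not, here is a minimal Python sketch. It is not the model presented in the talk: the two constraint names, their violation profiles, the ranking values, and the noise level are all hypothetical choices made for the example. Each evaluation adds Gaussian noise to every ranking value, sorts the constraints, and picks the candidate favored by the highest-ranked constraint.

```python
import random

# Hypothetical constraints for a made-up stem with a back vowel followed by
# a front unrounded vowel. Each maps a candidate suffix to a violation count.
VIOLATIONS = {
    "AGREE-BACK (nonlocal)": {"-nak": 0, "-nek": 1},
    "AGREE-FRONT (local)": {"-nak": 1, "-nek": 0},
}

def evaluate(ranking_values, violations, rng, noise_sd=2.0):
    """One stochastic evaluation: perturb ranking values, then pick a winner."""
    noisy = {c: v + rng.gauss(0.0, noise_sd) for c, v in ranking_values.items()}
    order = sorted(noisy, key=noisy.get, reverse=True)  # highest-ranked first
    candidates = ["-nak", "-nek"]
    for constraint in order:
        fewest = min(violations[constraint][c] for c in candidates)
        candidates = [c for c in candidates if violations[constraint][c] == fewest]
        if len(candidates) == 1:
            break
    return candidates[0]

def back_rate(ranking_values, trials=10000, seed=0):
    """Proportion of evaluations in which the back suffix -nak wins."""
    rng = random.Random(seed)
    wins = sum(evaluate(ranking_values, VIOLATIONS, rng) == "-nak"
               for _ in range(trials))
    return wins / trials

# Hypothetical ranking values: close together (variable) vs. far apart (rigid).
close = {"AGREE-BACK (nonlocal)": 101.0, "AGREE-FRONT (local)": 100.0}
rigid = {"AGREE-BACK (nonlocal)": 200.0, "AGREE-FRONT (local)": 100.0}
print(back_rate(close), back_rate(rigid))
```

With ranking values close together, both suffixes win on some evaluations; with values far apart relative to the noise, the same candidate wins every time, which is the sense in which a ranking can be treated as rigid.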

References:

Boersma, Paul (1997) "How we learn variation, optionality, and probability," Proceedings of the Institute of Phonetic Sciences of the University of Amsterdam 21:43-58.

Prince, Alan and Paul Smolensky (1993). Optimality Theory: Constraint interaction in generative grammar.

Siptár, Peter and Miklos Törkenczy (2000) The Phonology of Hungarian. Oxford: Oxford University Press.

Zuraw, Kie (2000). Patterned exceptions in phonology. PhD dissertation, UCLA.


Rachel Hayes-Harb

"Learning L2 Phonological Categories"

Listeners are sensitive to phonetic differences that correspond to phonemic contrasts in their native language, and they often exhibit difficulty discriminating sounds that are not contrastive in their native language. Although a large literature demonstrates that learners can improve their perception of novel contrasts with exposure to a second language, there is still little understanding of how learners accomplish this. In this talk I will consider two possible sources of information that learners might use to acquire sensitivity to novel phoneme contrasts. It has traditionally been assumed that second language learners use their knowledge of minimal pairs - or the contrastive properties of phonemes - to infer phoneme contrasts. However, it has been demonstrated that adult second language learners can infer novel phoneme contrasts from the distributional properties of acoustic-phonetic information in their linguistic input alone (Maye 2000). Although both are possible sources of the evidence that learners use to acquire novel phoneme contrasts, the relative influence of these two types of information is not yet known. I will report a series of experiments that aim to shed light on this issue, and will conclude with a discussion of models that best account for the findings.


Gary Dell

"Phonological speech errors: A psycholinguistic perspective on learning and phonological processing"

Phonological speech errors, such as saying "calcium, lust, and rime remover" instead of "calcium, rust, and lime remover", have long been the psycholinguist's principal source of evidence for characterizing phonological encoding in production. Two important facts about these errors are (1) the syllable position effect--onsets move to other onset positions and codas move to other coda positions and (2) the phonotactic regularity effect--slips do not result in illegal segment sequences. The first of these is a statistical effect; it is often violated. The second has been characterized as a "law"; it is rarely violated. I will argue that the often violated syllable-position effect and the rarely violated phonotactic regularity effect are two ends of a continuum reflecting the breadth of constraint that is operating. When we say the consonant [h], it must appear in onset position provided that we are speaking English. This is a broad constraint. When we say the onset [k], it must appear in onset position, provided that we are saying the words "king," "cow," and so on. This is a narrow constraint. I will further argue that these constraints are, to some extent, products of learning and are sensitive to recent experience. I'll present data from experiments that create phonological speech errors to illustrate this sensitivity.


Adam Albright

"Modeling variation and uncertainty in irregular morphophonology: the case of Spanish diphthongization"

How do speakers learn to inflect words, in the midst of variable, contradictory, and sometimes sparse evidence? Such irregularity poses two challenges: first, the learner must learn to inflect existing words accurately, so her productions match those of the surrounding speech community. Second, she must be able to generalize to novel words, in order to produce and comprehend forms that she has never encountered before. This talk follows a growing line of research suggesting that generalization over irregular patterns proceeds bottom-up from the lexicon: first, the learner memorizes inflected forms, and then attempts to find significant generalizations about where irregularity is most likely to occur (Skousen 1989, Bybee 1995, Baayen and Schreuder 1999, Zuraw 2002, Albright and Hayes 2003). Generalization is thus a 'fitting' problem: the learner must find generalizations about competing patterns which cover the input data as accurately as possible.

When no "clean" generalizations can be found, variation may ensue. In this talk, I discuss two different sources of variation, using examples from irregularities in Spanish present tense paradigms. In some Spanish verbs, mid vowels ([e, o]) alternate in certain persons and numbers with diphthongs ([je, we]) or with high vowels. In all cases, the alternations are irregular, in the sense that there are also verbs that fail to alternate (e.g., observar 'to observe' vs. observo (*obsiervo) 'I observe', abandonar 'to abandon' vs. abandono (*abandueno, *abandongo) 'I abandon'). Nonetheless, there are numerous generalizations that can be drawn about segmental and morphological environments that encourage diphthongization, raising, or velar insertion, though none of these generalizations is perfect (Brame and Bordelois 1973).

The first type of variation arises when there are overlapping generalizations that compete for a particular word. For example, verbs in the first conjugation tend not to exhibit diphthongization (only about 10% do), but verbs with root-final [st] frequently do diphthongize (roughly 50%). I present results from joint work with Argelia Andrade and Bruce Hayes of UCLA, showing that when speakers are presented with a novel (nonce) verb like fostrar, these generalizations must compete, with the result that some speakers diphthongize (fuestro), while others do not (fostro). The amount of variation in the responses depends on the degree of competition between overlapping generalizations, and is well modeled by a stochastic grammar of competing morphological rules. Furthermore, acceptability ratings show that no matter which form is produced, both are generally judged to be well-formed; variation results from a "kid in a candy store" choice between two equally appealing possibilities.
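
The competition between overlapping generalizations can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the grammar used in the study: it takes the approximate rates mentioned above (about 10% diphthongization in the first conjugation, roughly 50% for root-final [st]) as the support for each outcome, and it assumes, hypothetically, that the predicted rate of each outcome is its support divided by the total support.

```python
def outcome_probabilities(support):
    """Normalize the support for each competing outcome into probabilities.

    The normalization rule is an assumption made for this sketch.
    """
    total = sum(support.values())
    return {outcome: value / total for outcome, value in support.items()}

# Support for each outcome of the nonce verb "fostrar", using the approximate
# figures from the abstract. Treating these rates as rule strengths is an
# illustrative assumption.
support_fostrar = {
    "fuestro (diphthongize)": 0.50,  # root-final [st] generalization
    "fostro (no alternation)": 0.90,  # first conjugation: about 90% do not alternate
}

for outcome, p in outcome_probabilities(support_fostrar).items():
    print(f"{outcome}: {p:.2f}")
```

Under these assumptions both outcomes receive substantial probability, which matches the qualitative picture of speakers splitting between the two forms; the closer the two supports, the more variation the sketch predicts.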

I then turn to a second, less-studied type of variation, which occurs when the data are simply too sparse to support any generalization. The Spanish third conjugation, unlike the first conjugation, is a small class, and exhibits a high degree of irregularity. In this class, all mid-vowel verbs alternate, either by diphthongizing or by raising. There are, however, so few mid-vowel verbs in this class that it is difficult to form any generalization at all about what conditions the alternations. Here, too, speakers are simply unsure whether or not to diphthongize, and experimental participants offer both possibilities. In this case, however, both alternatives are judged to be ill-formed. Experimental results and automated Google search results show that even existing words in this class are produced variably, and are generally underattested (that is, speakers avoid them). This second type of variation results from a "rock and a hard place" choice. I argue that this type of variation is distinct from the more commonly discussed "candy store" type, and I suggest some possible approaches to teasing the two apart empirically.


Diamandis Gafos

"Nonlinear links between continuity and discreteness: transparency in vowel harmony"

A central question in the background of much current work in laboratory phonology is how to relate the continuous dimensions of phonetics to categorical phonological alternations (Pierrehumbert et al. 00). We address an instance of this question from transparency in vowel harmony. We relate our experimental results to the vowel mergers leading to transparency, assuming that diachronic change is shaped by perceptual and articulatory factors alike. We then propose a model of the relation between the observed properties of transparent vowels and their phonology, using basic tools of nonlinear dynamics.

In Hungarian harmony, certain front vowels may intervene between the trigger and target vowels of the harmony even if they bear the opposite value for the harmonizing feature: in kávé-nak 'coffee' the first ([+back]) vowel dictates the backness for the suffix across the so-called transparent [é], which is [-back] (accent denotes length). Using EMMA (Perkell et al. 92), we compared the location of a receiver attached to the rear of the tongue dorsum during transparent vowels (TVs) in pairs like zefír-nek 'zephyr', zafír-nak 'sapphire', with matched consonants (22 pairs, 16 repetitions). For all three subjects, TVs in stems selecting back suffixes are articulated further back than TVs in stems selecting front suffixes, e.g. for one subject the mean difference for the tongue dorsum receiver in the two contexts is .4mm (p<0.001). A similar result obtains for the so-called abstract stems. Monosyllabic stems with TVs usually select front suffixes as in víz 'water(.NOM)', víz-nek. A limited number (about fifty) of these stems, however, select back suffixes as in híd 'bridge(.NOM)', híd-nak. The tongue dorsum for phonemically identical transparent vowels in such stems produced in isolation (no overt suffix) is more retracted in the stems selecting back suffixes than in the stems selecting front suffixes (mean difference = .5mm, p<0.01). The overarching generalization then is that non-distinctive retraction in TVs, either lexically-specified as in the monosyllables or contextually determined as in the cases of zefír-/zafír-, correlates with the phonological alternation in suffix form; the advanced/retracted version of a TV selects the front/back suffix. The diachronic evidence available reasonably suggests that the change leading to transparency was perceptually based, originating from loss of contrast or merger between front and unround back vowels in prosodically weak contexts (Kálmán 72). Our data, however, show that there is a discernible distinction in articulation. We propose that the incompleteness of the merger is due to the articulatory pressures of agreement imposed by harmony, e.g. [í] in zafír- is retracted due to an agreement relation with its preceding vowel.

How then are fine differences in articulation to be linked to the phonological alternations of vowel harmony? Formally, the relation between retraction in the TV and suffix form is nonlinear: small changes in degree of retraction are associated with large changes in suffix form. In the proposed dynamical model, the two discrete forms of a suffix, say FRONT /-nek/ vs. BACK /-nak/ dative, are mapped to attractors of a dynamical system. Building on Gafos (in press), this is modeled by a function with two minima at the values of constriction location corresponding to FRONT /-e/, BACK /-a/. We require that the choice of the attractor, the surface form of the suffix, be determined by variation in R, representing the degree of tongue dorsum retraction for the stem-final vowel. Mathematically, these ideas can be stated succinctly in the form of the equation dx/dt = N(x,R) + F(t), whose solutions correspond to the attractors of constriction location for the suffix (N is a nonlinear function over x, R, and F(t) represents random noise). We show that for a range of small R values, the system is mono-stable with one attractor in the front region of the state space. As R is increased beyond a certain critical value, there is a change from the FRONT to the BACK attractor. In this model, then, a discrete change in suffix form is brought about by a scalar increase in the continuous retraction degree R of the transparent vowel. Effectively, non-contrastive differences in R can result in categorical alternations. This is due to the fundamental property of nonlinearity which allows us to link changes along continuous dimensions to categorical alternations.
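
To make the idea of a continuous parameter producing a discrete switch concrete, here is a minimal Python sketch of an equation of the form dx/dt = N(x,R) + F(t). The abstract does not give N, so the specific choice below, a tilted double well N(x, r) = r + x - x^3, is an assumption made for illustration only, as are the function names, the step size, and the use of a centered, rescaled parameter r in place of R. Negative x stands for the FRONT region and positive x for the BACK region.

```python
import random

def drift(x, r):
    """Hypothetical N(x, r): a tilted double well, r + x - x**3."""
    return r + x - x ** 3

def settle(r, x0=-1.0, noise=0.0, dt=0.01, steps=20000, seed=0):
    """Integrate dx/dt = N(x, r) + F(t) from x0 and return the final state."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        # Euler-Maruyama step: deterministic drift plus scaled Gaussian noise.
        x += drift(x, r) * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
    return x

def suffix(r, **kwargs):
    """Map the attractor the system settles into onto a discrete suffix form."""
    return "BACK -nak" if settle(r, **kwargs) > 0 else "FRONT -nek"

# Sweep the hypothetical retraction parameter, starting in the FRONT region.
for r in (-0.6, -0.2, 0.2, 0.6):
    print(r, suffix(r))
```

With this particular N, the front attractor disappears once r exceeds 2/(3*sqrt(3)), about 0.385, so a state that starts in the front region stays there for r = -0.6 and moves to the back attractor for r = 0.6. A small, continuous change in r near the critical value therefore produces a categorical change in the selected suffix.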

References

Gafos, A. In press. Dynamics in Grammar: Comment on Ladd and Ernestus and Baayen. In Goldstein, L.M., Whalen, D.H., Best, C.T. and Anderson, S.R. (eds.), Varieties of Phonological Competence, Berlin/New York: Mouton de Gruyter.

Kálmán, B. 1972. Hungarian historical phonology. In Benko, L. and Imre, S. (eds.), The Hungarian Language, The Hague: Mouton, 49-83.

Perkell, J.S., Cohen, M., Svirsky, M., Matthies, M., Garabieta, I. and Jackson, M. 1992. Electro-magnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements. Journal of the Acoustical Society of America 92, 3078-3096.

Pierrehumbert, J., Beckman, M. E. and Ladd, D. R. 2001. Conceptual Foundations of Phonology as a Laboratory Science. In Burton-Roberts, N., Carr, P. and Docherty, G. (eds.), Phonological Knowledge, Oxford, UK: Oxford University Press, 273-304.