| Abstract | [Introduction]
Speaking may be our most impressive motor skill. We speak rapidly, and production of each word involves intricate sequencing and temporal interleaving of gestures for the word's ordered component consonants and vowels. The problem of understanding speech production at this level is that of understanding how speakers accomplish the feat of fluent consonant and vowel production. Solving that problem involves solving another one, however: understanding what speaking essentially is, that is, how a series of complicated actions of a vocal tract can serve to convey a message composed of rulefully patterned symbols to members of a language community. In fact, the kind of solution an investigator seeks to the problem of understanding how vocal-tract actions are executed depends on how the investigator looks at the relation between vocal-tract action and the linguistic message itself.
Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.
Consider the first position and the conceptualization of speech production to which it leads. For at least two reasons, the vocal tract is rejected as a natural home for phonological segments of the language. A philosophical reason is that phonemes are not the kinds of things that can occur or exist outside the mind. They are ideas or concepts without real-world actualization. Articulatory gestures or their acoustic consequences can serve as cues to phonological segments.
“[Segments] are abstractions. They are the end result of complex perceptual and cognitive processes in the listener’s brain” (Repp 1981, 1462).
“Phonological representation is concerned with speakers’ implicit knowledge, that is, with information in the mind... [Phonetic] representation... is not cognitive because it concerns events in the world rather than events in the mind” (Pierrehumbert 1990).
A practical reason why phonological segments cannot occur in the vocal tract is that linguistic symbols have other properties, aside from being covert kinds of things, that preclude the vocal tract from representing them veridically or even analogically. In particular, a central and important fact about language is that its messages are composed of discrete symbols. Phonological segments are discrete in the sense that they do not overlap and blend. Moreover, until recently, they have been represented in linguistic theories as if they were composed of lists of coextensive (and, by implication, cotemporal) features (cf. Chomsky/Halle 1968). The features themselves described static postures of the vocal tract or their acoustic consequences; accordingly, the feature lists of a word described a succession of static postures. The vocal-tract actions that somehow convey a message to a listener have none of those properties. Actions associable with a given consonant or vowel do overlap and do appear to blend with actions of neighbors. Actions identifiable with the component features of a consonant or vowel are not cotemporal. Finally, fundamental units of articulation appear to be actions, not postures; accordingly, time is intrinsic to speech, rather than extrinsic as it is to the linguistic message. One interpretation of these mismatches is that they reflect the mismatch between the ideal of linguistic competence and the degraded physical reality of linguistic vocal performance; the latter necessarily is a considerable distortion of the former due to the limitations of mechanico-inertial systems. This way of looking at speech production promotes development of a kind of theory of the ‘how’ of speech production that has been termed a ‘translation theory’ (Fowler/Rubin/Remez/Turvey 1980).
The mismatch between the character of the planned message, presumably a sequence of linguistic symbols, and of its physical, phonetic, realization requires a translation over stages of processing out of the ideal, mental, domain of the plan into the real, physical-nonmental, domain of a vocal tract.
The other extreme perspective on the nature of speaking is that consonants and vowels are actions of the vocal tract that have linguistic, including phonological, significance in a language community. They are, certainly, psychological actions that require knowledge about them to be performed. However, the knowledge is not a superior ‘ideal’ that the actions cannot implement; rather, the knowledge is about the actions, derived from perceptual and articulatory experience with them. From this perspective, the mismatch between linguistic segments and articulation described above is apparent rather than real. It is the product of three kinds of error: 1. a mistaken ascription of primacy to linguistic knowledge (competence) over linguistic activity (performance); 2. an incorrect characterization of phonological segments in linguistic theory; 3. an incorrect characterization of the vocal-tract actions of speech production. As to the first ‘error’, the argument is that we treat language differently from other human creations when we decide that its components exist only in the mind. Other human creations include, for example, automobiles, baseball games and musical pieces. Automobiles definitely exist in the world, and so do baseball games and musical pieces when they are played. What is in the mind of those who know about automobiles, baseball and a musical piece is only what they know about those things; it is not the things themselves. If linguistic concepts are like those other concepts, they are knowledge about real-world objects or events. The events have a psychological nature; in this case, they are actions of the vocal tract, identified as phonological segments. If the phonology in the mind of a language user is what the user knows about the actions that implement a linguistic message, then there need be no mismatch between knowledge and action.
If a phonological theory ascribes properties to phonological segments as known that are impossible to realize in vocal-tract action, it distorts components of linguistic competence. If descriptions of vocal-tract actions include properties, such as coarticulatory blending, that would distort the phonological message, then the first hypothesis should be that the descriptions are wrong. From this perspective, an important aim is to develop a phonology that does not ascribe properties to phonological segments that are unproducible as vocal-tract action (cf. Browman/Goldstein 1986; Browman/Goldstein 1989). A second aim is to find a perspective on vocal-tract action from which macroscopic order is evident that conforms to the phonological structure of spoken utterances (e.g., Fowler/Rubin/Remez/Turvey 1980; Fowler 1990; Saltzman 1986; Saltzman/Munhall 1989).
This theoretical perspective promotes a theory of speech production different from a translation theory as outlined earlier. Speech production does not involve a translation out of an ideal, mental domain into a physical, nonmental, domain. Rather, the plan for a sequence of phonological segments, physically instantiated in the brain, replicates itself in a new physical medium, the moving vocal tract. A speech plan, in some way, brings about vocal-tract actions having linguistic significance.
In the remainder of this chapter, I pursue the different outlooks on a central aspect of speech production, coarticulation, that these different theoretical perspectives promote. I then consider the implications for our understanding of another central aspect of speech production: the coordinated actions of the vocal tract that constitute token phonological segments. |