| Abstract | Studies of speech production and studies of speech perception are seldom theoretically coordinated. As a result, we have no common vocabulary (other than that of abstract linguistic description) for referring to what speakers produce in terms of what they perceive, and vice versa. The lack is obvious when we ask how a child learns to speak. What are the physical units into which a child analyzes the words it hears, and from which it builds the words it speaks? The paper briefly considers four possible units: syllable, phoneme, feature, and gesture. Drawing on examples from the variable forms of words spoken by a two-year-old child, the paper concludes that only the gesture can be given both an articulatory and an acoustic definition as an irreducible unit of perception and action that can account coherently for the continuous transition from prelinguistic vocalizations through babble to speech. |