
The sounds of speech are shaped by the continuously moving articulators: the tongue, the lips, the jaw. As this articulatory dance unfolds, a wide variety of sounds are produced, and if the performance is adequate (as it usually is), the sounds are the sounds of speech.

Although modeling of the speech production system as a whole has been underway for quite some time, the detailed modeling of the individual articulators is relatively new.

For a variety of reasons, including their visibility, their relationship to critical information in the spoken signal, their usefulness in speechreading, and the ease of their graphical representation, the lips have been the most commonly modeled articulatory system. Recent approaches have incorporated the third dimension in the graphical representation of this characteristically human (or, more correctly, primate) articulatory system. Three interviews provide some background on issues related to this work, and on audio-visual speech in general.
   An interview with Christian Benoît.
   An interview with Lionel Reveret.
   An interview with Sumit Basu.

Modeling the tongue presents a number of unique problems. In general, this articulator is hidden from view as it does its main work. In addition, the visco-elastic nature of this uniquely deformable hunk of flesh can present a modeling nightmare. A few researchers have risen to this considerable challenge. One example can be found in the work of Reiner Wilhelms-Tricarico.
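As a very rough illustration of the visco-elastic behavior that makes soft tissue like the tongue hard to model, consider a single one-dimensional mass-spring-damper element. This is only a toy stand-in for the three-dimensional finite-element meshes used in serious tongue models, and every parameter value below is invented for the sketch:

```python
# Toy 1-D visco-elastic element: a point mass attached to a spring
# (elasticity) and a damper (viscosity). Real tongue models couple
# many such elements in a 3-D finite-element mesh; the parameter
# values here are arbitrary and purely illustrative.

def simulate(m=0.01, k=50.0, c=0.4, x0=0.005, steps=2000, dt=1e-4):
    """Semi-implicit Euler integration of m*x'' = -k*x - c*x'.

    m  : mass (kg), k : spring stiffness (N/m),
    c  : damping coefficient (N*s/m), x0 : initial displacement (m).
    Returns the displacement trajectory as a list.
    """
    x, v = x0, 0.0
    trajectory = []
    for _ in range(steps):
        a = (-k * x - c * v) / m   # elastic + viscous restoring forces
        v += a * dt                # update velocity first (semi-implicit)
        x += v * dt                # then position, for better stability
        trajectory.append(x)
    return trajectory

traj = simulate()
# The displacement oscillates and decays toward rest: the damper
# dissipates energy, which is the "visco" in visco-elastic.
print(abs(traj[-1]) < abs(traj[0]))
```

The interplay of the elastic term (which stores energy) and the viscous term (which dissipates it) is what gives visco-elastic tissue its characteristic damped, history-dependent response, and scaling this up to thousands of coupled elements in three dimensions is part of what makes tongue modeling so demanding.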

Additional information related to the modeling of speech articulators is available at a number of sites, including:

    ICP Visual Speech Synthesis

    House Ear Institute KDI Demos

    Multimodal Speech Synthesis (KTH)

    The Visible Human Project

    LipsInk by Ganymedia