

SPEECH TECHNOLOGY
Production of a synthetic utterance begins with drawing the first vocal tract configuration on the graphics screen and superimposing a grid structure on it. The intersections of the grid lines with the tract walls yield the sagittal dimensions, the center line, and the length of the tract. Then, using formulae based on a variety of vocal tract measurements (Heinz & Stevens, 1964; Ladefoged, Anthony & Riley, 1971; Mermelstein, Maeda & Fujimura, 1971), the sagittal cross-sections are converted to a smoothed area function, approximated by a sequence of uniform tubes, each 0.875 cm long. This simplification of the vocal tract shape permits rapid calculation of the vocal tract transfer function. Speech output is then generated, at a sampling rate of 20 kHz, by feeding the glottal waveform through the digital filter representation of the transfer function, which, for voiced sounds, accounts for both the oral and nasal branches of the tract.
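To illustrate why the uniform-tube approximation makes the transfer function quick to compute, the following is a minimal sketch, not the ASY implementation. It assumes lossless tubes, an ideal open termination at the lips, no nasal branch, and a sound speed of 35,000 cm/s; none of these is stated in the text. The names `chain_d` and `formants` are hypothetical. Each 0.875 cm section contributes one 2x2 chain matrix, and the resonances (formants) fall where the lower-right element of the product crosses zero. Under the assumed sound speed, 0.875 cm equals c/(2 fs) at fs = 20 kHz, so a round trip through one section takes one sample period; whether the section length was chosen for that reason is not stated in the text.

```python
import math

SPEED_OF_SOUND = 35000.0  # cm/s; assumed, not given in the text
AIR_DENSITY = 0.00114     # g/cm^3; assumed (it cancels out of the resonances)
SECTION_LENGTH = 0.875    # cm; from the text
SAMPLE_RATE = 20000.0     # Hz; from the text


def chain_d(areas, freq):
    """Lower-right (D) element of the lossless chain matrix, glottis to lips.

    Each section matrix has the form [[a, j*b], [j*c, d]] with real a, b, c, d,
    so the product can be tracked with real arithmetic only. With an ideal
    open end (zero lip pressure), U_glottis = D * U_lips.
    """
    kl = 2.0 * math.pi * freq / SPEED_OF_SOUND * SECTION_LENGTH
    cos_kl, sin_kl = math.cos(kl), math.sin(kl)
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # identity matrix
    for area in areas:
        z = AIR_DENSITY * SPEED_OF_SOUND / area  # characteristic impedance
        a2, b2, c2, d2 = cos_kl, z * sin_kl, sin_kl / z, cos_kl
        a, b, c, d = (a * a2 - b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, d * d2 - c * b2)
    return d


def formants(areas, f_max=5000.0, step=10.0):
    """Frequencies (Hz) below f_max where D changes sign, found by bisection."""
    found = []
    for i in range(int(f_max / step)):
        lo, hi = i * step, (i + 1) * step
        pos_lo = chain_d(areas, lo) > 0
        if pos_lo == (chain_d(areas, hi) > 0):
            continue  # no sign change in this interval
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            if (chain_d(areas, mid) > 0) == pos_lo:
                lo = mid
            else:
                hi = mid
        found.append(0.5 * (lo + hi))
    return found


if __name__ == "__main__":
    # Hypothetical test case: 20 equal sections form a 17.5 cm uniform tube,
    # whose resonances are the odd multiples of c / (4 L) = 500 Hz.
    uniform = [3.0] * 20
    print([round(f) for f in formants(uniform)])  # [500, 1500, 2500, 3500, 4500]
```

Replacing the `uniform` list with a measured or drawn area function (one area in cm^2 per 0.875 cm section, ordered from glottis to lips) gives the resonances of that shape under the same simplifying assumptions.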
An interactive demonstration of the original ASY model is available.

