
 
Talking Heads:
Speechreading

 
Pointers to research and references on speechreading can be found at the pioneering UCSC PSL Speechreading (Lipreading) webpage.
Additional information about speechreading is available at a number of sites, including:

NATO Advanced Study Institute: Speechreading by Man and Machine
"This [1995 NATO Advanced Study Institute was] the first forum on the interdisciplinary study of speechreading (lipreading) -- production, perception and learning by both humans and machines. The central aim [was] to explore and promote the incorporation of visual information into acoustic speech recognizers for improved recognition accuracy (especially in noisy environments), while drawing on and further elucidating knowledge of the psychology of speechreading by humans."

UCSC PSL Speech Perception by Ear and Eye
Includes information about the NSF Challenge Grant, which brings together researchers from the Center for Spoken Language Understanding (Oregon Graduate Institute), the Perceptual Science Laboratory (U. C., Santa Cruz), the Interactive Systems Labs (Carnegie Mellon University), and the Tucker Maxon Oral School. There are also links to Dominic Massaro's new book, "Perceiving Talking Faces: From Speech Perception to a Behavioral Principle", and to further research by this group and others.

Hearing by Eye II. Advances in the Psychology of Speechreading and Auditory-visual Speech.
Edited by Ruth Campbell, Barbara Dodd, and Denis Burnham.
"This book comprises fifteen invited chapters by leading international researchers in psychology, psycholinguistics, experimental and clinical speech science and computer engineering. It gives intriguing answers to theoretical questions (what are the mechanisms by which heard and seen speech combine?) and practical ones (what makes a good speechreader? Can machines be programmed to recognize seen, and seen-and-heard, speech?). The book is written in a non-technical way and starts to articulate a behaviorally-based but cross-disciplinary program of research in understanding how natural language can be delivered by different modalities."

Speechreading Demo Page
(Juergen Luettin, Machine Vision Group at IDIAP)
"The performance of most state-of-the-art speech recognition systems drops considerably in the presence of noise. This limits their use in real world applications, which are basically all subject to some interference from noise. Attempts to reduce the effect of noise in the speech signal have only shown limited success, particularly when the noise was due to crosstalk (cocktail party effect). Humans on the other hand use lip-reading (speechreading) as supplementary information for speech perception, especially in noisy conditions. The main benefit of visual information stems from its complementarity to the acoustic signal, i.e. phonemes that are difficult to distinguish acoustically are easier to distinguish visually. We have developed a speechreading system which locates and tracks the lips of a speaker over an image sequence to extract visual speech information."
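The complementarity described above can be illustrated with a toy late-fusion sketch. All numbers, phoneme labels, and the fusion weight here are hypothetical, and the quoted system's actual combination method may well differ; this only shows why an ambiguous acoustic decision can be resolved by a confident visual one.

```python
import numpy as np

# Hypothetical per-frame phoneme posteriors from two independent classifiers.
# /m/ and /n/ sound alike but look different on the lips, so the acoustic
# stream confuses them while the visual stream separates them.
phonemes = ["m", "n", "a"]
acoustic = np.array([0.40, 0.38, 0.22])   # noisy audio: /m/ vs /n/ ambiguous
visual   = np.array([0.70, 0.10, 0.20])   # lips clearly closed: favors /m/

def fuse(p_audio, p_video, alpha):
    """Log-linear (geometric) late fusion; alpha weights the audio stream.
    As the acoustic SNR drops, alpha can be lowered to trust vision more."""
    log_p = alpha * np.log(p_audio) + (1.0 - alpha) * np.log(p_video)
    p = np.exp(log_p - log_p.max())       # renormalize for numerical stability
    return p / p.sum()

fused = fuse(acoustic, visual, alpha=0.5)
print(phonemes[int(np.argmax(fused))])    # → m (vision resolves /m/ vs /n/)
```

The geometric (log-linear) combination is only one common choice; a simple weighted arithmetic average of the two posteriors behaves similarly in this toy case.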

Visual Acoustic Speech Recognition (Computer Lipreading)
(Chris Bregler)
"In the framework of this project we investigate the utility of using visual information (lip movements) to improve speech recognition. This research was initiated by the Neural Net Speech Group (UKA and CMU), where we showed significant recognition improvement using a visual-acoustic MS-TDNN architecture. At ICSI we specifically focus on conditions where state-of-the-art recognition systems perform poorly, for example in car environments with background noise, or office environments with cross-talk. Using a visual-acoustic MLP/HMM architecture developed at the ICSI Speech Recognition Group we showed significant improvement over pure acoustic performance. Currently we are extending the interactive spontaneous speech system "BeRP" to make use of the additional visual speech modality."
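The hybrid MLP/HMM idea mentioned above can be sketched in miniature: a network emits per-frame state posteriors, dividing by the state priors gives scaled likelihoods, and a Viterbi pass decodes the best state sequence. Everything below is a toy with made-up numbers, not the MS-TDNN or BeRP implementation.

```python
import numpy as np

posteriors = np.array([  # frames x states, as an MLP might output
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
])
priors = np.array([0.5, 0.3, 0.2])        # state priors from training data
trans  = np.array([[0.7, 0.3, 0.0],       # left-to-right HMM transitions
                   [0.0, 0.7, 0.3],
                   [0.0, 0.0, 1.0]])

def viterbi(obs_scaled, trans, init):
    """Standard max-product decoding in the log domain."""
    T, S = obs_scaled.shape
    log_trans = np.log(trans + 1e-12)     # epsilon avoids log(0)
    logd = np.log(init) + np.log(obs_scaled[0])
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + log_trans
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(obs_scaled[t])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

scaled = posteriors / priors              # the hybrid trick: posterior / prior
print(viterbi(scaled, trans, init=np.array([1.0, 1e-6, 1e-6])))
# → [0, 0, 1, 2]
```

In a full audio-visual system the posteriors themselves would already combine both modalities, either at the feature level or by fusing two such networks.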

Interactive Systems Labs NLips
"What are we working on? We're using neural networks for lipreading. As [our] task we use speaker dependent continuous spelling of the German alphabet.
Why are we doing lipreading? We want to improve the recognition rate of acoustic speech recognizers, especially in [non-]optimal conditions (cross-talking ...).
The goal is to get an on-line lipreader that is robust against all on-line conditions like illumination, translation and size without using some additional things like ... lip-markers ... etc. "
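The three nuisances named in that goal (illumination, translation, size) are often handled by normalizing each frame before it reaches the recognizer. The sketch below is entirely hypothetical and far cruder than a real lip tracker: it contrast-normalizes, recenters on the darkest pixel as a stand-in for a tracked mouth, and resamples the crop to a fixed resolution.

```python
import numpy as np

def normalize_frame(frame, out=32):
    """Toy per-frame normalization for a lipreading front end (sketch only)."""
    f = frame.astype(float)
    f = (f - f.mean()) / (f.std() + 1e-8)        # illumination / contrast
    # translation: crop a window centred on the minimum-intensity pixel
    # (a real system would use a trained lip locator instead)
    cy, cx = np.unravel_index(np.argmin(f), f.shape)
    h, w = f.shape
    half = min(cy, cx, h - 1 - cy, w - 1 - cx, 8)
    patch = f[cy - half:cy + half + 1, cx - half:cx + half + 1]
    # size: nearest-neighbour resample of the crop to a fixed out x out grid
    ys = np.linspace(0, patch.shape[0] - 1, out).astype(int)
    xs = np.linspace(0, patch.shape[1] - 1, out).astype(int)
    return patch[np.ix_(ys, xs)]

frame = np.random.default_rng(0).uniform(0, 255, (64, 64))
print(normalize_frame(frame).shape)              # → (32, 32)
```

Whatever appearance or shape features are extracted downstream then see inputs that are roughly invariant to lighting, position, and scale, which is the robustness the quote is after.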

SYS Research Audio-Visual Speech Recognition
"Speechreading (rather than just lipreading) is increasingly being investigated by researchers to enhance current speech recognisers by adding visual information from the face of a talker. It is well known that acoustic-only speech recognisers fail in noisy conditions while a human 'listener' is still able to understand the speech. We are able to do this by exploiting all of the available information - for example our knowledge of language and visual cues such as body and facial gestures including lipreading. Engineers cannot replicate human understanding of speech but we can improve acoustic speech recognition by including visual speech information."

Speechreading Bibliography

 
