Talking Heads:
Articulators
 

An interview with
Lionel Reveret


Q: Please provide a brief overview of your lip modeling work and other audiovisual research.

A: The 3D lip model that I'm currently developing is a parametric surface defined by polynomial interpolation of a set of control points. The model is purely geometric, and the control points correspond to features common to every human lip shape: the lip corners, the Cupid's bow, and the inner and outer contours.
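The idea of a contour interpolated through named feature points can be sketched as follows. This is a hedged illustration only: the interview does not specify the polynomial basis or the surface construction, so the uniform parameterization and `numpy.polyfit` interpolation here are assumptions, and the control-point coordinates are made up.

```python
import numpy as np

def lip_contour(control_points, n_samples=50):
    """Interpolate a smooth 2D contour through lip control points.

    Hypothetical sketch: one polynomial per coordinate, with the
    control points placed at uniform parameter values t in [0, 1].
    control_points: (k, 2) array of (x, y) feature positions,
    e.g. lip corners and points along the Cupid's bow.
    """
    pts = np.asarray(control_points, dtype=float)
    k = len(pts)
    t = np.linspace(0.0, 1.0, k)
    # Degree k-1 gives exact interpolation through all k points.
    cx = np.polyfit(t, pts[:, 0], k - 1)
    cy = np.polyfit(t, pts[:, 1], k - 1)
    ts = np.linspace(0.0, 1.0, n_samples)
    return np.column_stack([np.polyval(cx, ts), np.polyval(cy, ts)])

# Example: five invented control points across an upper lip contour,
# from left corner, over the Cupid's bow, to right corner.
upper = [(-1.0, 0.0), (-0.5, 0.4), (0.0, 0.3), (0.5, 0.4), (1.0, 0.0)]
contour = lip_contour(upper)
```

A full 3D surface model would interpolate several such curves (inner and outer contours) and blend between them; this sketch only shows the one-curve building block.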

A nice feature of the approach is that the construction of the control points allows the model to be applied flexibly -- e.g., to different speakers and head orientations (camera views). Statistical studies on two speakers have shown that only two orthogonal parameters driving the model are enough to track the lips with sufficient accuracy for synthesis.

Although the original purpose of this work was to devise a 3D model for lip tracking regardless of head orientation, it has recently also been used as the lip animation component of a talking-head model at ATR.
 


Q: What drives your work, theoretically?

A: The overall problem of lip tracking is to recover a complex shape from noisy images, since the color of the lips is difficult to separate from that of the surrounding skin.

As a consequence, the more precisely you define a lip model, the better you regularize the lip-tracking problem. This geometric model is a baseline to which higher-level considerations can be added: lip physiology, kinematics, muscle control, and so on. The simplicity of the geometric definition allows the model to be used as a multi-purpose tool for measuring lip shapes.
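The regularizing effect of a constrained shape model can be shown with a toy least-squares fit. This is not the interview's tracking method; the parabolic "contour", the noise level, and the polynomial shape family are all invented for illustration.

```python
import numpy as np

def fit_constrained_contour(noisy_points, degree=3):
    """Fit a low-degree polynomial contour to noisy measurements.

    Illustrative sketch: because the fitted shape must lie in a small
    polynomial family, measurement noise that violates the model's
    smoothness is averaged out by the least-squares fit.
    """
    pts = np.asarray(noisy_points, dtype=float)
    t = np.linspace(0.0, 1.0, len(pts))
    cx = np.polyfit(t, pts[:, 0], degree)
    cy = np.polyfit(t, pts[:, 1], degree)
    return np.column_stack([np.polyval(cx, t), np.polyval(cy, t)])

# Toy demo: a parabolic "lip" contour corrupted by measurement noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
true = np.column_stack([t, 4.0 * t * (1.0 - t)])
noisy = true + rng.normal(scale=0.05, size=true.shape)
fitted = fit_constrained_contour(noisy)
```

The fitted contour lands closer to the true shape than the raw noisy points do, which is the sense in which a more constrained model "regularizes" tracking.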

I try to focus on the idea that analysis and synthesis of speaking lips are two closely related tasks.
 


Q: What is the most difficult issue, or issues, that you face when doing your research?

A: The biggest issue in the lip tracking work is finding the right trade-off between constraints that are so strong they overly reduce the generality of the model, and constraints that are so weak they no longer prevent tracking errors.

For example, I've focused on a very low-dimensional parameterization of the lip model's deformation, with only two parameters. This two-parameter control brings robustness to tracking and seems sufficient for realistic animation. Nevertheless, it is far from able to represent the detailed motion of the lips.
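One standard way to obtain such a low-dimensional deformation subspace is principal component analysis over observed lip shapes. The interview does not say which statistical method was used, so the PCA-via-SVD sketch below, and its toy data, are assumptions for illustration.

```python
import numpy as np

def fit_subspace(shapes, n_params=2):
    """Learn a linear deformation subspace from example lip shapes.

    Hypothetical PCA sketch: shapes is an (n_frames, n_coords) array of
    flattened contour coordinates; returns the mean shape and the top
    n_params orthonormal directions of variation.
    """
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_params]

def synthesize(mean, basis, params):
    """Reconstruct a lip shape from the low-dimensional parameters."""
    return mean + np.asarray(params) @ basis

# Toy data: 100 "frames" of a 10-coordinate shape that truly varies
# along only two directions, so two parameters recover it exactly.
rng = np.random.default_rng(0)
coeffs = rng.normal(size=(100, 2))
directions = np.array([[1.0] * 5 + [0.0] * 5,
                       [0.0] * 5 + [1.0] * 5])
shapes = 3.0 + coeffs @ directions
mean, basis = fit_subspace(shapes)
params = (shapes[0] - mean) @ basis.T   # project one frame onto the subspace
recon = synthesize(mean, basis, params)
```

On real lip data the two retained parameters would capture only the dominant deformations, which matches the interview's point: robust and sufficient for animation, but unable to represent detailed lip motion.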

Another big problem is controlling the recording conditions: differences in camera view and in lighting can have a huge impact on the results.
 


Q: What are your visions for the future of this sort of research?

A: Compared to vocal tract studies, only a few studies of the lips in speech analysis have been reported. Most of the work on lip tracking comes from the computer vision community. Despite its high quality, that work remains focused on image-processing techniques, without particularly detailed lip modeling.

A better understanding of the motor control of the lips could bring important improvements to both lip tracking and realistic animation.
 


Q: Do you have any comments on related work by others that you consider to be exciting?

A: A 3D lip model is currently being developed by S. Basu at the MIT Media Lab. His model uses a finite element model (FEM) to capture the elasticity of the lip surface's deformation. Although no intelligibility tests have been reported so far, it seems an interesting approach to analysis/synthesis modeling.
 

Lip tracking movie (QuickTime, 2.1 Mbytes).


 
Lionel Reveret works at the Institut de la Communication Parlée, in Grenoble, France.
He can be reached via email at: reveret@icp.inpg.fr or visit his website.
 
