Ananova
The world's first virtual newsreader on the internet has been launched in London.
Computer-generated Ananova is programmed to deliver news 24 hours a day.
Here are some newspaper reports.
Digital CRL Facial Animation Project (Keith Waters)
"CRL's talking synthetic face is essentially the visual complement of the speech synthesizer DECtalk.
Where DECtalk provides synthesised speech, CRL's Face provides a synthetic face. By combining
the audio functionality of a speech synthesiser with the graphical functionality of a
computer-generated face, a variety of new applications can be developed. For example, a synthetic
character can give a multimedia presentation, or a synthetic character can monitor a system and report
anomalies as a feedback agent. One of the more intriguing possibilities is the construction of an
interactive face agent capable of assisting and conversing with the user."
Computer Facial Animation (book by Parke and Waters)
"This book is about computer facial models, computer generated facial images, and facial animation. In particular
it concerns the principles of creating face models and the manipulation or control of computer generated facial
attributes. In addition, various sections in the book describe and explain the development of specific computer
facial animation techniques over the past twenty years, as well as those expected in the near future."
Demetri Terzopoulos homepage (with papers & animations)
UCSC Perceptual Science Laboratory (D. Massaro & M. Cohen)
"The Perceptual Science Laboratory is engaged in a variety of experimental and theoretical
inquiries in perception and cognition. A major research area concerns speech perception by
ear and eye, and facial animation. We also have tested a general fuzzy logical model of
perception in a variety of domains, including perception and understanding of language,
memory, object, shape and depth perception, learning, and decision making. Research is also
being carried out in reading."
Speech: A Sight to Behold
An online article by Barbra Rodriguez for Science Notes, Summer 1996.
Focuses on Baldy: the UCSC PSL 3D computerized talking head model.
"When someone talks, you pick up clues about what they're saying from
their facial maneuvers. Scientists are using a computerized talking image
of a human head to learn about their visual language clues. Such talking
heads will also allow new ways of communicating in the future."
ICP Visual Speech Synthesis
"Three-dimensional modelisation of the different organs involved in speech production: lips, jaw, tongue for the
vocal tract and skin.
The animation of the different parts of the face model can be done, either by video analysis of a speaker's face,
or from text by means of a rule-based system."
FaceView (Alex Pentland, MediaLab, MIT)
"The FaceView project is concerned with observing, understanding, and synthesizing actions of the face and
head. The current work on this project is focused on the areas of head-tracking, facial expression recognition,
and non-rigid deformation of head models for animation."
MikeTalk
(Tony Ezzat and Tomaso Poggio, MIT Center for Biological and Computational Learning)
"The goal of this project is to create a videorealistic text-to-audiovisual speech synthesizer. The system should take as input any typed
sentence, and produce as output an audio-visual movie of a face enunciating that sentence. By videorealistic we mean that the final
audiovisual output should look like it was a videocamera recording of a talking human subject."
Multimodal Speech Synthesis (KTH)
"Our approach to audio-visual speech synthesis
is based on parametric descriptions of both the acoustic and visual speech
modalities, in a text-to-speech framework. The visual speech synthesis
uses 3D polygon models, that are parametrically articulated and deformed.
Currently, we are working with two different parametric models for visual
synthesis: "Holger", which is an extended version of a face
model developed by F. Parke (1982), and "Olga", which was developed in
the Olga-project. The auditory synthesis is based on a source-filter formant-based
generation model. Parameter trajectories for both modalities are calculated
by a text-to-speech rule system. In the near future, we are hoping to improve
naturalness and intelligibility of the visual synthesis with the help of
data obtained by optical analysis of a real speaker's articulation."
MIAMI report (Schomaker et al.)
A taxonomy of multimodal interaction in the human information processing system.
A report of the ESPRIT PROJECT 8579.
Video Rewrite: Driving Visual Speech with Audio
(Bregler, Covell & Slaney, Interval Research Corp.)
"Video Rewrite uses existing footage to create
automatically new video of a person mouthing
words that she did not speak in the original
footage. This technique is useful in movie
dubbing, for example, where the movie
sequence can be modified to sync the actors'
lip motions to the new soundtrack."
ATR: Speech Synchronized Human Facial Image Synthesis
(Takaaki Kuratate and Eric Vatikiotis-Bateson)
RED TED Headcase Technology
(commercial facial animation software for Windows)
"Headcase Technology provides a real time 3D graphics system which displays a
realistic animated human face on your desktop. The face can gesture and
automatically mouths the current sound being played by your computer."