SPEECH PERCEPTION BY EAR AND EYE
/ FACIAL ANIMATION
@ PSL
News and Publications
-
PSL On ABC PrimeTIME, Thurs
March 15, 2001
-
NSF Challenge Grant
-
Ron Cole and
Mike Macon at
CSLU, in conjunction with
Dominic Massaro
at the University of California, Santa Cruz, and Alex Waibel at Carnegie Mellon
University, have been awarded a challenge grant for
a project entitled
"Creating Conversational Agents for Language Training: Technologies
for the Next Generation of Interactive Systems." The project goal is
to create a realistic conversational agent for language training,
in particular for the hearing-impaired. The conversational agent
includes the following components:
-
A 3-D model of a talking face with accurate articulator movements
and facial expressions.
-
Natural text-to-speech generation based on concatenative synthesis.
-
Visual speech recognition using a desktop camera in order to provide
information about the student's articulators.
-
Auditory speech recognition to understand what is being said and to
correct mispronunciations.
The conversational agent is initially being used to teach
hearing-impaired
students at the Tucker-Maxon Oral School
in Portland, Oregon.
-
PSL Tools, developed
at UCSC-PSL, has been
released as part of the
CSLU Toolkit.
The PSL Tools package provides a set of tools for designing, conducting and analyzing the results of
perceptual experiments. It allows users to manipulate auditory and visual stimuli for perceptual
experiments; design
interactive protocols for multimedia data presentation and multimodal data capture; transcribe and
analyze subjects'
responses; perform statistical analyses; and summarize and display results using the toolkit's
visualization tools.
- UCSC-PSL is hosting the AVSP '99
Audio-Visual Speech Processing Conference
here in Santa Cruz, August 7-9, 1999.
-
Speech - A Sight to Behold. UCSC Science Notes Summer 1996
-
UCSC psychologist teams up with Oregon school to help deaf
children. UCSC Currents January 1998
-
New Book! A new book by Dominic Massaro entitled
"Perceiving
Talking Faces: From Speech Perception to a Behavioral Principle" (Bradford
Books: Cognitive Psychology Series - The
MIT Press) is a comprehensive resource for those interested in speechreading,
pattern recognition, face-to-face communication, and the Fuzzy Logical
Model of Perception (FLMP). The book presents converging evidence across
several domains of perception that supports the FLMP as a universal law
of perception. Included is a CD-ROM with many examples, stimuli from experiments,
and Baldy (the computer-generated talking head) in action.
-
Speech Recognition and Sensory Integration
by Dominic W. Massaro and David G. Stork.
In May-June 1998 American Scientist.
Consider what goes through your mind when you attempt to identify an
apple's variety - Granny Smith, Golden Delicious, McIntosh or Fuji...
-
Gudrun Schwarzer -
from
Fachbereich Psychologie der Universität Tübingen
is a Visiting Researcher with us during 1997-1998.
-
Jonas Beskow -
Doctoral student from the
KTH Centre for Speech Technology
is a Visiting Researcher with us during 1998-1999.
-
We recently had a visit from Gunnar Fant of KTH,
who talked
about early and recent work in
speech perception.
-
Nigel Ward -
from Tokyo University
is a Visiting Researcher with us during 1998-1999, working on Back-Channeling.
-
Mutsuhiro Terauchi -
from Hiroshima City University
is a Visiting Researcher with us during 1999, working on
bidirectional translation between Sign and written languages.
Demos
MPEG Movies
QuickTime Movies
Papers
-
Massaro, D. W. and Cohen, M. M. (1983)
Integration of visual and auditory information in speech perception.
Journal of Experimental Psychology: Human Perception and Performance,
9, 753-771.
-
Massaro, D. W. (1987).
Speech perception by ear and eye:
A paradigm for psychological inquiry.
Hillsdale, NJ: Lawrence Erlbaum Associates.
-
Massaro, D. W. (1989)
Multiple Book Review of Speech Perception by Ear and Eye:
A Paradigm for Psychological Inquiry.
Behavioral and Brain Sciences, 12, 741-794.
-
Cohen, M. M. & Massaro, D.
W. (1990)
Synthesis of visible speech.
Behavioral Research Methods and Instrumentation, 22, 260-263.
-
Massaro, D. W. and
Cohen, M. M. (1990)
Perception of synthesized audible and visible speech.
Psychological Science, 1, 55-63.
-
Gesi, A. T., Massaro, D. W.
and Cohen, M. M. (1992)
Discovery and expository methods in teaching
visual consonant and word identification.
Journal of Speech and Hearing Research, 35,
1180-1188.
-
Cohen, M. M., &
Massaro, D.
W. (1993)
Modeling coarticulation in synthetic visual speech.
In N. M. Thalmann & D. Thalmann (Eds.) Models and Techniques in
Computer Animation.
Tokyo: Springer-Verlag, 139-156.
-
Massaro, D. W., Tsuzaki, M.,
Cohen, M. M., Gesi, A., & Heredia, R. (1993).
Bimodal speech perception: An examination across languages.
Journal of Phonetics, 445-478.
-
Cohen, M. M., & Massaro, D.
W. (1994)
Development and Experimentation with
Synthetic Visible Speech.
Behavioral Research Methods and Instrumentation, 26, 260-265.
-
Angola, O.,
Le Goff, B.,
Guiard-Marigny, T.,
Adjoudani, A.,
Benoît, C., &
Cohen, M.M. (1994),
Analyse-Synthèse de visages parlants,
Actes des 20èmes Journées d'Étude sur la Parole,
Lannion, France, May 1994
-
Le Goff, B.,
Guiard-Marigny, T.,
Cohen, M.M., &
Benoît, C. (1994)
Real-time analysis-synthesis and intelligibility of talking faces.
Proceedings of the Second ESCA/IEEE Workshop on Speech Synthesis,
New Paltz, New York, U.S.A., Sept. 1994
-
Massaro, D. W. & Cohen, M. M.
(1994)
Auditory/visual speech in multimodal human interfaces.
Paper for special session on "Science and technology for
multimodal human interfaces" at International Conference of
Spoken Language Processing (September 18-22, 1994, Yokohama,
Japan).
-
Cohen, M. M., &
Massaro, D. W. (1994)
What can visual speech synthesis tell visual speech recognition?
28th Annual Asilomar Conference on Signals,
Systems, and Computers, (Oct 31-Nov 2, 1994,
Pacific Grove, CA).
-
Cohen, M. M., &
Massaro, D. W. (1995)
Discover Magazine Awards Finalists: Computer Software
-
Cohen, M. M., &
Massaro, D. W. (1995)
Auditory/visual speech synthesis for lifelike computer characters.
Lifelike Computer Characters '95. Snowbird Utah, Sept 26-29, 1995.
-
Massaro, D. W., &
Cohen, M. M. (1995).
Perceiving talking faces.
Current Directions in Psychology, 4,
104-109.
-
Cohen, M. M., Walker, R. L., & Massaro, D. W. (1995)
Perception of Synthetic Visual Speech,
Speechreading by Man and Machine: Models,
Systems and Applications, NATO Advanced Study Institute 940584,
(Aug 28-Sep 8, 1995, Chateau de Bonas, France).
-
Massaro, D. W. (1998).
Perceiving
Talking Faces: From Speech Perception to a Behavioral Principle.
Cambridge, MA: MIT Press
-
Cohen, M. M. & Massaro, D.
W. (1997)
Speech for Virtual Humans.
Virtual Humans 2 Conference, Hollywood, California, June 17-19.
-
Cohen, M. M. (1997)
Modeling and Evaluation of Synthetic Visual Speech.
Panel on Facial Animation, Past, Present, and Future.
SIGGRAPH '97, Los Angeles, August 3-8.
-
Campbell, R., Zihl, J., Massaro, D.W., Munhall, K., and Cohen, M.M., (1997)
Speechreading in the akinetopsic patient (LM),
Brain, 120, 1793-1803.
-
Smeele, P.M.T., Massaro, D.W., Cohen, M.M., and Sittig, A.C. (1998).
Laterality in visual speech perception. Journal of
Experimental Psychology: Human Perception and Performance 24, 1232-1242.
-
Massaro, D.W. and Cohen, M.M. (1998)
Visible speech and its potential for speech training
for hearing impaired perceivers.
STiLL - ESCA Workshop on Speech Technology in Language Learning
Stockholm, Sweden, May 25-27, 1998
-
Cole, R.A., Carmell, T., Conners, P.,
Macon, M., Wouters, J., de Villiers, J.,
Tarachow, A., Massaro, D.W., Cohen, M.M.,
Beskow, J., Yang, J., Meier, U., Waibel, A.,
Stone, P., Fortier, G., Davis, A.,
Soland, C. (1998)
Intelligent Animated Agents for Interactive Language Training
STiLL - ESCA Workshop on Speech Technology in Language Learning
Stockholm, Sweden, May 25-27, 1998
-
Cohen, M.M., Beskow, J., and Massaro, D.W. (1998).
Recent developments in facial animation: An inside view.
AVSP '98 (Dec 4-6, 1998, Sydney, Australia).
-
Massaro, D.W., Cohen, M.M., Daniel, S. and Cole, R.A. (in press).
Developing and evaluating conversational agents.
In P.A. Hancock (Ed.) Human Performance and Ergonomics.
Handbook of Perception and Cognition, 17
San Diego: Academic Press.
This material is based upon work supported by the National Science Foundation
under Grant Nos.
CDA-9726363,
BCS-9905176, and
IIS-0086107,
the Public Health Service under Grant No.
PHS R01 DC00236,
cooperative grants from the
Intel Corporation
and the University of
California Digital Media Program, and grants from the
University of
California-Santa Cruz.