DOI: 10.1145/958432.958466
Article

A computer-animated tutor for spoken and written language learning

Published: 05 November 2003

Abstract

Baldi, a computer-animated talking head, is introduced. The quality of his visible speech has been repeatedly refined and evaluated to accurately simulate naturally talking humans. Baldi's visible speech can be appropriately aligned with either synthesized or natural auditory speech. Baldi has had great success in teaching vocabulary and grammar to children with language challenges, and in training speech distinctions for children with hearing loss and for adults learning a new language. We demonstrate these learning programs as well as several other potential application areas for Baldi®.
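As a rough illustration of the alignment idea mentioned in the abstract (a hypothetical sketch, not Baldi's actual synthesis pipeline), the snippet below maps timed phonemes, such as those produced by a TTS engine or a forced aligner, onto viseme keyframes that a face renderer could play back in sync with the audio track. The phoneme labels, timings, and the phoneme-to-viseme table are all illustrative assumptions.

```python
# Minimal sketch: schedule viseme (mouth-shape) keyframes from timed phonemes
# so that visible speech stays aligned with synthesized or natural audio.
# All names, timings, and mappings below are hypothetical examples.

from dataclasses import dataclass

# Illustrative, highly simplified phoneme-to-viseme table (assumed).
PHONEME_TO_VISEME = {
    "b": "bilabial_closure",
    "a": "open_jaw",
    "l": "tongue_tip_up",
    "d": "tongue_tip_up",
    "i": "spread_lips",
}

@dataclass
class Keyframe:
    time_s: float   # onset of the mouth shape, in seconds from audio start
    viseme: str     # target articulatory pose for the animation engine

def align_visemes(timed_phonemes):
    """Turn (phoneme, onset_s, duration_s) tuples -- e.g. from a TTS engine
    or a forced aligner -- into time-stamped viseme keyframes."""
    keyframes = []
    for phoneme, onset_s, _duration_s in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        keyframes.append(Keyframe(time_s=onset_s, viseme=viseme))
    return keyframes

if __name__ == "__main__":
    # Hypothetical timing for the word "Baldi" (values are made up).
    phonemes = [("b", 0.00, 0.08), ("a", 0.08, 0.12),
                ("l", 0.20, 0.10), ("d", 0.30, 0.07), ("i", 0.37, 0.15)]
    for kf in align_visemes(phonemes):
        print(f"{kf.time_s:5.2f}s -> {kf.viseme}")
```

A real talking-head system handles coarticulation and many more articulatory parameters than this; the sketch only shows the timing skeleton, in which every auditory segment receives a time-stamped articulatory target.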



    Published In

ICMI '03: Proceedings of the 5th International Conference on Multimodal Interfaces
November 2003, 318 pages
ISBN: 1581136218
DOI: 10.1145/958432


    Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. facial and speech synthesis
    2. language learning

    Qualifiers

    • Article

    Conference

ICMI-PUI03: International Conference on Multimodal User Interfaces
November 5-7, 2003
Vancouver, British Columbia, Canada

    Acceptance Rates

ICMI '03 Paper Acceptance Rate: 45 of 130 submissions (35%)
Overall Acceptance Rate: 453 of 1,080 submissions (42%)


