DOI: 10.1145/3450618.3469176
poster

Silent Speech and Emotion Recognition from Vocal Tract Shape Dynamics in Real-Time MRI

Published: 06 August 2021

Abstract

We propose a novel deep neural network-based learning framework that understands acoustic information in the variable-length sequence of vocal tract shaping during speech production, captured by real-time magnetic resonance imaging (rtMRI), and translates it into text. In an experiment, it achieved a 40.6% phoneme error rate (PER) at the sentence level, substantially better than existing models. We also analyzed variations in the geometry of articulation in each sub-region of the vocal tract with respect to different emotions and genders. Results suggest that the distortion of each sub-region is affected by both emotion and gender.
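
The abstract reports PER without saying how it was computed. Assuming PER means phoneme error rate under its usual definition (the Levenshtein edit distance between predicted and reference phoneme sequences, divided by the number of reference phonemes), a minimal sketch of the metric could look like the following. The function names, the ARPAbet-style symbols, and the corpus-level aggregation are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (hypothetical helper)."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Total edit errors divided by total reference phonemes.

    Assumption: errors are pooled over all sentences before dividing,
    which may differ from how the paper aggregates its score.
    """
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference sequences contain no phonemes")
    return total_errors / total_ref


# Illustrative data only: one reference sentence and one prediction
# that drops a single phoneme.
reference = [["DH", "IH", "S", "IH", "Z"]]
predicted = [["DH", "IH", "S", "Z"]]
per = phoneme_error_rate(reference, predicted)
```

Under this definition a lower value is better, and the score can exceed 100% when the prediction contains many inserted phonemes.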

Supplementary Material

VTT File (3450618.3469176.vtt)
a27-pandey-supplement (a27-pandey-poster.pdf)
Poster
MP4 File (3450618.3469176.mp4)
Presentation.

Information & Contributors

Information

Published In

SIGGRAPH '21: ACM SIGGRAPH 2021 Posters
August 2021
90 pages
ISBN: 9781450383714
DOI: 10.1145/3450618
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. MRI
  2. accessibility
  3. neural network
  4. silent speech
  5. speech
  6. vocal tract

Qualifiers

  • Poster
  • Research
  • Refereed limited

Conference

SIGGRAPH '21
Acceptance Rates

Overall Acceptance Rate 1,822 of 8,601 submissions, 21%

Bibliometrics

  • Downloads (last 12 months): 29
  • Downloads (last 6 weeks): 3

Reflects downloads up to 12 Dec 2024

Cited By

  • (2024) Research Agenda for Speaker Authentication. Human Aspects of Information Security and Assurance, 278-291. DOI: 10.1007/978-3-031-72559-3_19. Online publication date: 28-Nov-2024.
  • (2023) An open-source toolbox for measuring vocal tract shape from real-time magnetic resonance images. Behavior Research Methods 56:3, 2623-2635. DOI: 10.3758/s13428-023-02171-9. Online publication date: 28-Jul-2023.
  • (2023) EchoSpeech: Continuous Silent Speech Recognition on Minimally-obtrusive Eyewear Powered by Acoustic Sensing. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3544548.3580801. Online publication date: 19-Apr-2023.
  • (2022) Design and Evaluation of a Silent Speech-Based Selection Method for Eye-Gaze Pointing. Proceedings of the ACM on Human-Computer Interaction 6:ISS, 328-353. DOI: 10.1145/3567723. Online publication date: 14-Nov-2022.
