ICMI '14 poster
DOI: 10.1145/2663204.2663265

Towards Automated Assessment of Public Speaking Skills Using Multimodal Cues

Published: 12 November 2014

Abstract

Traditional assessments of public speaking skills rely on human scoring. We report an initial study on the development of an automated scoring model for public speaking performances using multimodal technologies. Task design, rubric development, and human rating were conducted according to standards in educational assessment. An initial corpus of 17 speakers, each completing 4 speaking tasks, was collected using audio, video, and 3D motion-capture devices. A scoring model based on basic features of the speech content, speech delivery, and hand, body, and head movements significantly predicts human ratings, suggesting the feasibility of using multimodal technologies in the assessment of public speaking skills.
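The abstract describes the scoring model only at a high level. A minimal sketch of the general approach is to fit a linear model from multimodal features to human holistic scores and check how well the predictions correlate with the ratings; everything below — feature names, dimensions, and data — is a simulated placeholder, not the paper's corpus, feature set, or actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multimodal features per performance (illustrative only):
# e.g. speaking rate, pitch variation, pause ratio, hand-movement energy,
# head-rotation range.
n_performances = 68          # e.g. 17 speakers x 4 tasks
n_features = 5
X = rng.normal(size=(n_performances, n_features))

# Simulated human holistic scores: partly feature-driven, partly noise.
true_w = np.array([0.8, 0.5, -0.6, 0.4, 0.3])
y = X @ true_w + rng.normal(scale=0.5, size=n_performances)

def fit_score_model(X, y):
    """Ordinary least-squares fit with an intercept term."""
    A = np.hstack([np.ones((len(X), 1)), X])   # prepend intercept column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, X):
    return np.hstack([np.ones((len(X), 1)), X]) @ w

w = fit_score_model(X, y)
# Correlation between model predictions and the human scores.
r = np.corrcoef(predict(w, X), y)[0, 1]
```

With only 17 speakers, an in-sample correlation like this would be optimistic; a real evaluation of such a model would use leave-one-speaker-out cross-validation before reporting agreement with human raters.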




    Published In

    ICMI '14: Proceedings of the 16th International Conference on Multimodal Interaction
    November 2014
    558 pages
    ISBN:9781450328852
    DOI:10.1145/2663204

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. body tracking
    2. educational applications
    3. multimodal corpus
    4. multimodal presentation assessment
    5. public speaking

    Qualifiers

    • Poster


    Acceptance Rates

    ICMI '14 Paper Acceptance Rate 51 of 127 submissions, 40%;
    Overall Acceptance Rate 453 of 1,080 submissions, 42%


    Cited By

    • (2024) Decoding the growth of multimodal learning: A bibliometric exploration of its impact and influence. Intelligent Decision Technologies 18(1):151-167. DOI: 10.3233/IDT-230727 (online 20 Feb 2024)
    • (2024) Charting the Future of Assessments. ETS Research Report Series 2024(1):1-62. DOI: 10.1002/ets2.12388 (online 21 Nov 2024)
    • (2023) A digital "flat affect"? Popular speech compression codecs and their effects on emotional prosody. Frontiers in Communication 8. DOI: 10.3389/fcomm.2023.972182 (online 23 Mar 2023)
    • (2023) Machine learning approaches to analyzing public speaking and vocal delivery. London Journal of Social Sciences 6:69-74. DOI: 10.31039/ljss.2023.6.10 (online 17 Sep 2023)
    • (2023) Numerical Ratings and Content Labeling of Speeches in an Educational Public Speaking Program. European Journal of Educational Research 12(2):825-835. DOI: 10.12973/eu-jer.12.2.825 (online 15 Apr 2023)
    • (2023) "I Am a Mirror Dweller": Probing the Unique Strategies Users Take to Communicate in the Context of Mirrors in Social Virtual Reality. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3544548.3581464 (online 19 Apr 2023)
    • (2023) Manifest: Public Speaking Training Using Virtual Reality. 2023 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), 468-473. DOI: 10.1109/ISMAR-Adjunct60411.2023.00102 (online 16 Oct 2023)
    • (2023) Detection Of Public Speaking Anxiety: A New Dataset And Algorithm. 2023 IEEE International Conference on Multimedia and Expo (ICME), 2633-2638. DOI: 10.1109/ICME55011.2023.00448 (online Jul 2023)
    • (2023) Expression and Perception of Stress Through the Lens of Multimodal Signals: A Case Study in Interpersonal Communication Settings. 2023 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), 1-5. DOI: 10.1109/ACIIW59127.2023.10388186 (online 10 Sep 2023)
    • (2023) Multimodal Transfer Learning for Oral Presentation Assessment. IEEE Access 11:84013-84026. DOI: 10.1109/ACCESS.2023.3295832 (2023)
