ICMI '14 poster
DOI: 10.1145/2663204.2663265

Towards Automated Assessment of Public Speaking Skills Using Multimodal Cues

Published: 12 November 2014

Abstract

Traditional assessments of public speaking skills rely on human scoring. We report an initial study on the development of an automated scoring model for public speaking performances using multimodal technologies. Task design, rubric development, and human rating were conducted according to standards in educational assessment. An initial corpus of 17 speakers, each completing 4 speaking tasks, was collected using audio, video, and 3D motion-capture devices. A scoring model based on basic features of the speech content, speech delivery, and hand, body, and head movements significantly predicts human ratings, suggesting the feasibility of using multimodal technologies in the assessment of public speaking skills.
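The abstract describes the scoring model only at a high level. A minimal sketch of the general approach is to fit a linear model from multimodal features to human holistic scores and check how well the predictions correlate with the ratings; everything below — feature names, dimensions, and data — is a simulated placeholder, not the paper's corpus, feature set, or actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multimodal features per performance (illustrative only):
# e.g. speaking rate, pitch variation, pause ratio, hand-movement energy,
# head-rotation range.
n_performances = 68          # e.g. 17 speakers x 4 tasks
n_features = 5
X = rng.normal(size=(n_performances, n_features))

# Simulated human holistic scores: partly feature-driven, partly noise.
true_w = np.array([0.8, 0.5, -0.6, 0.4, 0.3])
y = X @ true_w + rng.normal(scale=0.5, size=n_performances)

def fit_score_model(X, y):
    """Ordinary least-squares fit with an intercept term."""
    A = np.hstack([np.ones((len(X), 1)), X])   # prepend intercept column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, X):
    return np.hstack([np.ones((len(X), 1)), X]) @ w

w = fit_score_model(X, y)
# Correlation between model predictions and the human scores.
r = np.corrcoef(predict(w, X), y)[0, 1]
```

With only 17 speakers, an in-sample correlation like this would be optimistic; a real evaluation of such a model would use leave-one-speaker-out cross-validation before reporting agreement with human raters.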




    Published In

    ICMI '14: Proceedings of the 16th International Conference on Multimodal Interaction
    November 2014
    558 pages
    ISBN:9781450328852
    DOI:10.1145/2663204

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. body tracking
    2. educational applications
    3. multimodal corpus
    4. multimodal presentation assessment
    5. public speaking

    Qualifiers

    • Poster


    Acceptance Rates

    ICMI '14 Paper Acceptance Rate 51 of 127 submissions, 40%;
    Overall Acceptance Rate 453 of 1,080 submissions, 42%


    Cited By

    • (2024) Decoding the growth of multimodal learning: A bibliometric exploration of its impact and influence. Intelligent Decision Technologies 18(1):151-167. DOI: 10.3233/IDT-230727 (online 20 Feb 2024)
    • (2024) Charting the Future of Assessments. ETS Research Report Series 2024(1):1-62. DOI: 10.1002/ets2.12388 (online 21 Nov 2024)
    • (2023) A digital "flat affect"? Popular speech compression codecs and their effects on emotional prosody. Frontiers in Communication 8. DOI: 10.3389/fcomm.2023.972182 (online 23 Mar 2023)
    • (2023) Machine learning approaches to analyzing public speaking and vocal delivery. London Journal of Social Sciences 6:69-74. DOI: 10.31039/ljss.2023.6.10 (online 17 Sep 2023)
    • (2023) Numerical Ratings and Content Labeling of Speeches in an Educational Public Speaking Program. European Journal of Educational Research 12(2):825-835. DOI: 10.12973/eu-jer.12.2.825 (online 15 Apr 2023)
    • (2023) "I Am a Mirror Dweller": Probing the Unique Strategies Users Take to Communicate in the Context of Mirrors in Social Virtual Reality. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3544548.3581464 (online 19 Apr 2023)
    • (2023) Manifest: Public Speaking Training Using Virtual Reality. 2023 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), 468-473. DOI: 10.1109/ISMAR-Adjunct60411.2023.00102 (online 16 Oct 2023)
    • (2023) Detection Of Public Speaking Anxiety: A New Dataset And Algorithm. 2023 IEEE International Conference on Multimedia and Expo (ICME), 2633-2638. DOI: 10.1109/ICME55011.2023.00448 (online Jul 2023)
    • (2023) Expression and Perception of Stress Through the Lens of Multimodal Signals: A Case Study in Interpersonal Communication Settings. 2023 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), 1-5. DOI: 10.1109/ACIIW59127.2023.10388186 (online 10 Sep 2023)
    • (2023) Multimodal Transfer Learning for Oral Presentation Assessment. IEEE Access 11:84013-84026. DOI: 10.1109/ACCESS.2023.3295832 (2023)
