DOI:10.1145/3011263.3011272
short-paper

On data driven parametric backchannel synthesis for expressing attentiveness in conversational agents

Published: 12 November 2016

Abstract

In this study, we use a multi-party recording as a template for building a parametric speech synthesiser that can express different levels of attentiveness in backchannel tokens. This allowed us to investigate (i) whether synthesised backchannels can express the same perceived level of attentiveness as natural backchannels, and (ii) whether the perceived level of attentiveness of backchannels can be increased and decreased beyond the range observed in the original corpus.

Published In

MA3HMI '16: Proceedings of the Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
November 2016
64 pages
ISBN:9781450345620
DOI:10.1145/3011263
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. attentive agents
  2. backchannels
  3. synthesis

Qualifiers

  • Short-paper

Conference

ICMI '16

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 13 Dec 2024

Cited By

  • (2022) Phonetic Convergence in a Prototype Dialogue System. Current Issues in Descriptive Linguistics and Digital Humanities, pages 705-719. DOI:10.1007/978-981-19-2932-8_48. Online publication date: 1-Dec-2022
  • (2020) The Harmonia Corpus – A Dialogue Corpus for Automatic Analysis of Phonetic Convergence. Human Language Technology. Challenges for Computer Science and Linguistics, pages 149-163. DOI:10.1007/978-3-030-66527-2_11. Online publication date: 31-Dec-2020
  • (2017) Using crowd-sourcing for the design of listening agents: challenges and opportunities. Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, pages 37-38. DOI:10.1145/3139491.3139499. Online publication date: 13-Nov-2017
  • (2017) A corpus for experimental study of affect bursts in human-robot interaction. Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, pages 20-21. DOI:10.1145/3139491.3139496. Online publication date: 13-Nov-2017
  • (2017) Nonverbal conversation expressions processing for human-agent interactions. 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), pages 601-605. DOI:10.1109/ACII.2017.8273663. Online publication date: Oct-2017
