DOI: 10.1145/3652988.3673966
Extended abstract · Free access

A Holistic Evaluation Methodology for Multi-Party Spoken Conversational Agents

Published: 26 December 2024

Abstract

While research in multi-party spoken conversation with intelligent embodied agents has made significant progress on sub-tasks such as speaker identification and non-verbal cues, there is still a gap in fully autonomous applications that users can interact with directly. This gap translates into the absence of a standard methodology for evaluating multi-party conversational speech agents that considers both task-based system performance and user experience.
Our research has addressed the former by developing a multi-modal robot receptionist for a hospital waiting room whose multi-party conversational ability, non-verbal behaviour, and dialogue management are implemented using Large Language Models (LLMs). In this paper, we go on to address the issue of evaluation, describing an experimental methodology and a task-based user-experiment design that capture both objective measures of multi-party dialogue performance (such as accurate tracking of user goals) and users' subjective experience of multi-party embodied conversations. This paper therefore presents a holistic methodology for the future evaluation of multi-party spoken conversational agents.
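
To make the combined evaluation concrete, the sketch below shows one way the two kinds of measure could be aggregated over a set of sessions: an objective goal-tracking accuracy computed from annotated dialogue logs, and a mean subjective questionnaire score. This is a minimal illustration under assumed data structures; the class, metric names, questionnaire items, and example numbers are hypothetical and are not taken from the paper or its system.

    from dataclasses import dataclass
    from statistics import mean

    @dataclass
    class DialogueLog:
        """One multi-party session: how many stated user goals the system tracked correctly (hypothetical format)."""
        goals_total: int    # goals stated by any participant (e.g. patient or companion)
        goals_tracked: int  # goals the system attributed to the right speaker and acted on

    def goal_tracking_accuracy(logs: list[DialogueLog]) -> float:
        """Objective measure: fraction of user goals tracked correctly across all sessions."""
        total = sum(log.goals_total for log in logs)
        return sum(log.goals_tracked for log in logs) / total if total else 0.0

    def mean_questionnaire_score(responses: list[dict[str, int]]) -> float:
        """Subjective measure: mean of each participant's Likert ratings (e.g. a 1-5 scale)."""
        return mean(mean(r.values()) for r in responses)

    # Hypothetical example data for two two-party sessions with a robot receptionist.
    logs = [DialogueLog(goals_total=4, goals_tracked=3),
            DialogueLog(goals_total=5, goals_tracked=5)]
    ratings = [{"ease_of_use": 4, "naturalness": 3},
               {"ease_of_use": 5, "naturalness": 4}]

    print(f"Goal-tracking accuracy: {goal_tracking_accuracy(logs):.2f}")          # 0.89
    print(f"Mean questionnaire score: {mean_questionnaire_score(ratings):.2f}")   # 4.00

Reporting the two numbers side by side is the point of the holistic design: a system could score well on goal tracking yet poorly on the subjective scale, or vice versa.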



Information

Published In

IVA '24: Proceedings of the 24th ACM International Conference on Intelligent Virtual Agents
September 2024
337 pages


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 December 2024


Author Tags

  1. HRI
  2. conversational agents
  3. evaluation methodology
  4. large language models
  5. multi-party interaction
  6. social robots
  7. user study

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited


Conference

IVA '24: ACM International Conference on Intelligent Virtual Agents
September 16-19, 2024
Glasgow, United Kingdom

Acceptance Rates

Overall Acceptance Rate 53 of 196 submissions, 27%

