Extending Discrete Verbal Commands with Continuous Speech for Flexible Robot Control

Authors: Naoya Yoshimura, Hironori Yoshida, Fabrice Matulic, Takeo Igarashi

CHI EA '19: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems

Paper No.: LBW0165, Pages 1 - 6

https://doi.org/10.1145/3290607.3312791

Published: 02 May 2019

Abstract

Speech is a direct and intuitive method to control a robot. While natural speech can capture a rich variety of commands, verbal input is poorly suited to finer-grained, real-time control of continuous actions such as short and precise motion commands. For these types of operations, continuous non-verbal speech is more suitable, but it lacks the naturalness and vocabulary breadth of verbal speech. In this work, we propose to combine the two types of vocal input by extending the last vowel of a verbal command to support real-time and smooth control of robot actions. We demonstrate the effectiveness of this novel hybrid speech input method in a beverage-pouring task, where users instruct a robot arm to pour specific quantities of liquid into a cup. A user evaluation reveals that hybrid speech improves on simple verbal-only commands.
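
The abstract does not describe how the system is implemented, so the following is only a minimal Python sketch of the general idea: a discrete verbal command selects the action, and the action continues for as long as the final vowel is sustained. Everything in it is an assumption made for illustration, not the authors' method: the function name `hybrid_control`, the `KNOWN_COMMANDS` vocabulary, the 20 ms frame length, the action rate, and the premise that some upstream detector supplies one "vowel still held" flag per audio frame.

```python
from typing import Iterable, Iterator

# Hypothetical command vocabulary; the actual command set is not given in the abstract.
KNOWN_COMMANDS = {"pour"}


def hybrid_control(
    command: str,
    vowel_held: Iterable[bool],
    frame_s: float = 0.02,     # assumed analysis frame length in seconds
    rate_per_s: float = 10.0,  # assumed action rate, e.g. millilitres per second
) -> Iterator[float]:
    """Yield the cumulative action amount after each frame in which the vowel is held.

    `command` stands for a discrete verbal command already recognized upstream.
    `vowel_held` carries one flag per audio frame, True while the final vowel
    of that command is still being sustained.
    """
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    total = 0.0
    for held in vowel_held:
        if not held:
            break  # the speaker stopped the vowel, so the action stops
        total += rate_per_s * frame_s
        yield total


if __name__ == "__main__":
    # Five held frames (100 ms of sustained vowel) followed by silence.
    flags = [True] * 5 + [False, True]
    for amount in hybrid_control("pour", flags):
        print(f"{amount:.1f}")
```

A generator is used here so that a caller could forward each intermediate amount to a controller as frames arrive, which matches the real-time character of the input described above. A real system would also need vowel detection and tolerance for brief dropouts, neither of which is modelled in this sketch.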

Supplementary Material

MP4 File (lbw0165p.mp4), 2.07 MB

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)
    1. HCI design and evaluation methods

Published In

CHI EA '19: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems
May 2019, 3673 pages
ISBN: 9781450359719
DOI: 10.1145/3290607
General Chairs: Stephen Brewster (University of Glasgow, Scotland, UK), Geraldine Fitzpatrick (TU Wien, Austria)
Program Chairs: Anna Cox (University College London, UK), Vassilis Kostakos (University of Melbourne, Australia)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers: Abstract

Conference

CHI '19: CHI Conference on Human Factors in Computing Systems
Sponsor: SIGCHI
May 4 - 9, 2019
Glasgow, Scotland, UK

Acceptance Rates

Overall Acceptance Rate 6,164 of 23,696 submissions, 26%
