Article

Human-robot speech interface understanding inexplicit utterances using vision

Published: 24 April 2004
DOI: 10.1145/985921.986054

Abstract

Speech interfaces should be able to handle inexplicit utterances, such as those involving ellipsis and deixis, since these are common phenomena in everyday conversation. Resolving them with context and a priori knowledge has been investigated in the fields of natural language and speech understanding. However, some utterances cannot be understood by such symbol processing alone. In this paper, we consider inexplicit utterances that arise from the fact that humans have vision. If we are certain that our listeners share some visual information, we often omit it from our utterances or refer to it only ambiguously. We propose a method of understanding speech with such ambiguities using computer vision. It tracks the human's gaze direction and detects objects in that direction. It also recognizes the human's actions. Based on these pieces of visual information, it understands the human's inexplicit utterances. Experimental results show that the method helps to realize human-friendly speech interfaces.
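
As a rough illustration of the idea, the sketch below shows how gaze and action information of this kind could resolve a deictic word ("that") and complete an elliptical command whose verb or object was omitted. This is not the paper's implementation; all names here (VisualContext, resolve_utterance, the action-to-verb mapping) are hypothetical stand-ins for its vision and language-understanding components.

```python
# Hypothetical sketch of vision-assisted utterance understanding.
# Not the paper's implementation; the vocabulary and mappings are invented.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class VisualContext:
    gazed_object: Optional[str]     # object detected along the user's gaze direction
    observed_action: Optional[str]  # action recognized from the user's motion

DEICTIC_WORDS = {"this", "that", "it"}
KNOWN_VERBS = {"take", "give", "bring", "look"}
ACTION_TO_VERB = {"reaching": "give", "holding_out": "take"}  # invented mapping

def resolve_utterance(tokens: List[str], ctx: VisualContext) -> List[str]:
    # Deixis: replace "this"/"that"/"it" with the object found on the gaze ray.
    out = [ctx.gazed_object if t.lower() in DEICTIC_WORDS and ctx.gazed_object else t
           for t in tokens]
    # Ellipsis: if the verb was omitted (e.g. the user said only "that"),
    # guess it from the recognized action.
    if not any(t in KNOWN_VERBS for t in out) and ctx.observed_action in ACTION_TO_VERB:
        out.insert(0, ACTION_TO_VERB[ctx.observed_action])
    return out

# The user looks at a cup while reaching toward the robot and says only "that":
ctx = VisualContext(gazed_object="the cup", observed_action="reaching")
print(resolve_utterance(["that"], ctx))        # ['give', 'the cup']
print(resolve_utterance(["take", "it"], ctx))  # ['take', 'the cup']
```

In the system the abstract describes, the gazed-at object would come from gaze tracking combined with object detection, and the action from vision-based action recognition; the table lookups above merely stand in for those components.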

Published In

CHI EA '04: CHI '04 Extended Abstracts on Human Factors in Computing Systems
April 2004
975 pages
ISBN: 1581137036
DOI: 10.1145/985921
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computer vision
  2. deixis
  3. ellipsis
  4. gaze
  5. multimodal interface
  6. natural language processing
  7. robot
  8. speech understanding

Qualifiers

  • Article

Conference

CHI04

Acceptance Rates

Overall acceptance rate: 6,164 of 23,696 submissions (26%)


Article Metrics

  • Downloads (last 12 months): 4
  • Downloads (last 6 weeks): 0
Reflects downloads up to 19 Dec 2024

Cited By

  • (2023) A hybrid model for gesture recognition and speech synchronization. 2023 IEEE International Conference on Advanced Robotics and Its Social Impacts (ARSO), 27-32. DOI: 10.1109/ARSO56563.2023.10187512. Online publication date: 5-Jun-2023.
  • (2022) Toward the search for the perfect blade runner: a large-scale, international assessment of a test that screens for “humanness sensitivity”. AI & SOCIETY 38:4, 1543-1563. DOI: 10.1007/s00146-022-01398-y. Online publication date: 1-Mar-2022.
  • (2014) You just do not understand me! Speech Recognition in Human Robot Interaction. The 23rd IEEE International Symposium on Robot and Human Interactive Communication, 637-642. DOI: 10.1109/ROMAN.2014.6926324. Online publication date: Aug-2014.
  • (2013) A Framework Design for Human-Robot Interaction. Advanced Technologies, Embedded and Multimedia for Human-centric Computing, 1043-1048. DOI: 10.1007/978-94-007-7262-5_118. Online publication date: 13-Nov-2013.
  • (2012) Improving Speech Recognition with the Robot Interaction Language. Disruptive Science and Technology 1:2, 79-88. DOI: 10.1089/dst.2012.0010. Online publication date: Jun-2012.
  • (2012) Two-handed gesture recognition and fusion with speech to command a robot. Autonomous Robots 32:2, 129-147. DOI: 10.1007/s10514-011-9263-y. Online publication date: 1-Feb-2012.
  • (2010) Using child-robot interaction to investigate the user acceptance of constrained and artificial languages. 19th International Symposium in Robot and Human Interactive Communication, 588-593. DOI: 10.1109/ROMAN.2010.5598731. Online publication date: Sep-2010.
  • (2009) Object recognition in service robots: Conducting verbal interaction on color and spatial relationship. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 2025-2031. DOI: 10.1109/ICCVW.2009.5457530. Online publication date: Sep-2009.
  • (2007) Recognition of household objects by service robots through interactive and autonomous methods. Proceedings of the 3rd international conference on Advances in visual computing - Volume Part II, 140-151. DOI: 10.5555/1779090.1779107. Online publication date: 26-Nov-2007.
  • (2007) Recognition of Household Objects by Service Robots Through Interactive and Autonomous Methods. Advances in Visual Computing, 140-151. DOI: 10.1007/978-3-540-76856-2_14. Online publication date: 2007.
