
A probabilistic approach to reference resolution in multimodal user interfaces

Published: 13 January 2004

Abstract

Multimodal user interfaces allow users to interact with computers through multiple modalities, such as speech, gesture, and gaze. To be effective, a multimodal user interface must correctly identify all the objects that users refer to in their inputs. To resolve different types of references systematically, we have developed a probabilistic approach that uses a graph-matching algorithm. Our approach identifies the most probable referents by simultaneously optimizing the satisfaction of semantic, temporal, and contextual constraints. Our preliminary user study results indicate that our approach can successfully resolve a wide variety of referring expressions, ranging from simple to complex and from precise to ambiguous.



Published In

IUI '04: Proceedings of the 9th International Conference on Intelligent User Interfaces
January 2004, 396 pages
ISBN: 1581138156
DOI: 10.1145/964442

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. graph matching
  2. multimodal user interfaces
  3. reference resolution

Qualifiers

  • Article

Conference

IUI-CADUI04: Intelligent User Interfaces
January 13 - 16, 2004
Funchal, Madeira, Portugal

Acceptance Rates

IUI '04 Paper Acceptance Rate: 72 of 140 submissions (51%)
Overall Acceptance Rate: 746 of 2,811 submissions (27%)


