
Unification-based multimodal integration

Published: 07 July 1997

Abstract

Recent empirical research has shown conclusive advantages of multimodal interaction over speech-only interaction for map-based tasks. This paper describes a multimodal language processing architecture which supports interfaces allowing simultaneous input from speech and gesture recognition. Integration of spoken and gestural input is driven by unification of typed feature structures representing the semantic contributions of the different modes. This integration method allows the component modalities to mutually compensate for each other's errors. It is implemented in QuickSet, a multimodal (pen/voice) system that enables users to set up and control distributed interactive simulations.
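
To make the integration idea concrete, the following is a minimal sketch, in Python, of how a spoken command and a pen gesture might be fused by unifying typed feature structures. It is not the QuickSet implementation described in the paper; the toy type hierarchy, feature names, and example inputs are illustrative assumptions.

```python
# Minimal sketch of unification-based multimodal integration (illustrative only;
# the type hierarchy, feature names, and inputs below are assumptions, not the
# QuickSet implementation described in the paper).

from typing import Any, Optional

# Toy type hierarchy: each type maps to its immediate supertype.
SUPERTYPE = {
    "point": "location",
    "area": "location",
    "location": "object",
    "unit": "object",
    "object": None,
}

def subsumes(general: str, specific: str) -> bool:
    """True if `general` is `specific` itself or one of its ancestors."""
    t: Optional[str] = specific
    while t is not None:
        if t == general:
            return True
        t = SUPERTYPE.get(t)
    return False

def unify_types(t1: str, t2: str) -> Optional[str]:
    """Return the more specific of two compatible types, else None."""
    if subsumes(t1, t2):
        return t2
    if subsumes(t2, t1):
        return t1
    return None

def unify(fs1: dict, fs2: dict) -> Optional[dict]:
    """Unify two typed feature structures; return None if they conflict."""
    t = unify_types(fs1.get("type", "object"), fs2.get("type", "object"))
    if t is None:
        return None
    result: dict[str, Any] = {"type": t}
    for feat in (fs1.keys() | fs2.keys()) - {"type"}:
        if feat in fs1 and feat in fs2:
            v1, v2 = fs1[feat], fs2[feat]
            if isinstance(v1, dict) and isinstance(v2, dict):
                sub = unify(v1, v2)
                if sub is None:
                    return None
                result[feat] = sub
            elif v1 == v2:
                result[feat] = v1
            else:
                return None          # conflicting atomic values
        else:
            result[feat] = fs1.get(feat, fs2.get(feat))
    return result

# Speech: "medical company" -- the location slot is left underspecified.
speech = {
    "type": "unit",
    "echelon": "company",
    "function": "medical",
    "location": {"type": "location"},
}
# Pen gesture: a point drawn on the map.
gesture = {"location": {"type": "point", "coords": (42.3, -71.1)}}

print(unify(speech, gesture))
# The gesture's point fills the command's underspecified location slot; a gesture
# whose type cannot unify with the command's constraints is ruled out, which is
# how the modes can compensate for each other's recognition errors.
```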

Published In

ACL '97/EACL '97: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
July 1997
543 pages

Sponsors

• Directorate General XIII (European Commission)
• Universidad Complutense de Madrid
• Universidad Autónoma de Madrid
• Universidad Nacional de Educación a Distancia
• Universidad Politécnica de Madrid

Publisher

Association for Computational Linguistics
United States

Acceptance Rates

Overall acceptance rate: 85 of 443 submissions, 19%
