DOI: 10.1145/958432.958460
Article

Pointing gesture recognition based on 3D-tracking of face, hands and head orientation

Published: 05 November 2003

Abstract

In this paper, we present a system capable of visually detecting pointing gestures and estimating the 3D pointing direction in real time. To acquire input features for gesture recognition, we track the positions of a person's face and hands in image sequences provided by a stereo camera. Hidden Markov Models (HMMs), trained on different phases of sample pointing gestures, are used to classify the 3D trajectories and detect the occurrence of a gesture. When analyzing sample pointing gestures, we noticed that humans tend to look at the pointing target while performing the gesture. To exploit this behavior, we additionally measured head orientation by means of a magnetic sensor in a similar scenario. Using head orientation as an additional feature, we observed significant gains in both the recall and the precision of pointing gesture detection. Moreover, the percentage of correctly identified pointing targets improved significantly, from 65% to 83%. For estimating the pointing direction, we compared three approaches: 1) the line of sight between head and hand, 2) the forearm orientation, and 3) the head orientation.
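
As a minimal illustration of the first of the three approaches (the head-hand line of sight), here is a hedged sketch, not code from the paper: given tracked 3D head and hand positions, the head-to-hand ray can be extended and intersected with a known target plane to recover the pointed-at location. All function and parameter names below are hypothetical.

```python
def pointing_target(head, hand, plane_point, plane_normal):
    """Intersect the head-to-hand ray with a plane (line-of-sight approach).

    head, hand: tracked 3D positions (e.g. in metres, from a stereo tracker).
    plane_point, plane_normal: any point on the target plane and its normal.
    Returns the 3D intersection point, or None when the ray is parallel to
    the plane or the plane lies behind the pointing direction.
    """
    # Pointing direction: from the head through the hand.
    direction = tuple(b - a for a, b in zip(head, hand))
    denom = sum(n * d for n, d in zip(plane_normal, direction))
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    t = sum(n * (p - h) for n, p, h in zip(plane_normal, plane_point, head)) / denom
    if t < 0:
        return None  # plane is behind the user
    return tuple(h + t * d for h, d in zip(head, direction))

# Hypothetical example: head at 1.7 m height, hand slightly forward and
# below it, target wall at x = 3 m with its normal along the x-axis.
target = pointing_target(head=(0.0, 1.7, 0.0),
                         hand=(0.5, 1.6, 0.0),
                         plane_point=(3.0, 0.0, 0.0),
                         plane_normal=(1.0, 0.0, 0.0))
```

The same ray-plane intersection applies unchanged to the other two cues; only the ray differs (elbow-to-hand for the forearm orientation, head position plus measured head orientation for the third approach).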

      Published In

      ICMI '03: Proceedings of the 5th international conference on Multimodal interfaces
      November 2003
      318 pages
      ISBN:1581136218
      DOI:10.1145/958432

      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. computer vision
      2. gesture recognition
      3. person tracking
      4. pointing gestures

      Conference

      ICMI-PUI03: International Conference on Multimodal User Interfaces
      November 5-7, 2003
      Vancouver, British Columbia, Canada

      Acceptance Rates

      ICMI '03 paper acceptance rate: 45 of 130 submissions, 35%
      Overall acceptance rate: 453 of 1,080 submissions, 42%


      Cited By

      • (2024) MouseRing: Always-available Touchpad Interaction with IMU Rings. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3613904.3642225. Online: 11-May-2024
      • (2024) Neuro-Symbolic Reasoning for Multimodal Referring Expression Comprehension in HMI Systems. New Generation Computing, 42:4, 579-598. DOI: 10.1007/s00354-024-00243-8. Online: 1-Nov-2024
      • (2024) Testing Usability of Different Implementations for VR Interaction Methods. Advances in Computational Collective Intelligence, 287-300. DOI: 10.1007/978-3-031-70259-4_22. Online: 9-Sep-2024
      • (2023) A Novel Heteromorphic Ensemble Algorithm for Hand Pose Recognition. Symmetry, 15:3, 769. DOI: 10.3390/sym15030769. Online: 21-Mar-2023
      • (2023) Enhancing Human–Robot Collaboration through a Multi-Module Interaction Framework with Sensor Fusion: Object Recognition, Verbal Communication, User of Interest Detection, Gesture and Gaze Recognition. Sensors, 23:13, 5798. DOI: 10.3390/s23135798. Online: 21-Jun-2023
      • (2023) Selecting Real-World Objects via User-Perspective Phone Occlusion. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-13. DOI: 10.1145/3544548.3580696. Online: 19-Apr-2023
      • (2023) Multimodal Error Correction with Natural Language and Pointing Gestures. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 1968-1978. DOI: 10.1109/ICCVW60793.2023.00212. Online: 2-Oct-2023
      • (2023) DeePoint: Visual Pointing Recognition and Direction Estimation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 20520-20530. DOI: 10.1109/ICCV51070.2023.01881. Online: 1-Oct-2023
      • (2023) Effects of Spatial Characteristics on the Human–Robot Communication Using Deictic Gesture in Construction. Journal of Construction Engineering and Management, 149:7. DOI: 10.1061/JCEMD4.COENG-12997. Online: Jul-2023
      • (2023) Interactive Multimodal Robot Dialog Using Pointing Gesture Recognition. Computer Vision – ECCV 2022 Workshops, 640-657. DOI: 10.1007/978-3-031-25075-0_43. Online: 19-Feb-2023
