
Hummer: Text Entry by Gaze and Hum

Published: 07 May 2021
DOI: 10.1145/3411764.3445501

Abstract

Text entry by gaze is a useful means of hands-free interaction in settings where dictation suffers from poor voice recognition or where spoken words and sentences would jeopardize privacy or confidentiality. However, text entry by gaze still shows inferior performance and quickly exhausts its users. We introduce text entry by gaze and hum as a novel hands-free text entry method. From a review of related literature, we converge on word-level text entry based on analysis of gaze paths that are temporally constrained by humming. We develop and evaluate two design choices: “HumHum” and “Hummer.” The first method requires short hums to mark the start and end of a word. The second method interprets one continuous hum as indicating the start and end of a word. In an experiment with 12 participants, Hummer achieved a commendable text entry rate of 20.45 words per minute and outperformed HumHum and the gaze-only method EyeSwipe in both quantitative and qualitative measures.
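
For readers who want a concrete picture of the interaction, the sketch below shows one way the hum-bounded gaze-path analysis described above could work. It is a minimal illustration in Python, not the authors' published implementation: the GazeSample record, the lexicon of "ideal" key-centre paths, and the use of dynamic time warping (DTW) as the matcher are assumptions made for illustration. DTW [42] and the discrete Fréchet distance [4, 6] both appear in the reference list below as trajectory similarity measures, so either could fill the matching step.

    # Illustrative sketch only (not the authors' published implementation).
    # Assumes hypothetical GazeSample records and a lexicon of "ideal" key-centre
    # paths per word; ranks candidate words by DTW distance to the gaze path
    # recorded between a hum's start and end.
    from dataclasses import dataclass
    from typing import Dict, List, Tuple

    Point = Tuple[float, float]  # (x, y) in on-screen keyboard coordinates


    @dataclass
    class GazeSample:
        t: float  # timestamp in seconds
        x: float
        y: float


    def segment_by_hum(gaze: List[GazeSample], hum_start: float, hum_end: float) -> List[Point]:
        """Keep only the gaze samples recorded while the user was humming."""
        return [(s.x, s.y) for s in gaze if hum_start <= s.t <= hum_end]


    def dtw_distance(a: List[Point], b: List[Point]) -> float:
        """Plain O(n*m) dynamic time warping with Euclidean local cost."""
        n, m = len(a), len(b)
        if n == 0 or m == 0:
            return float("inf")
        d = [[float("inf")] * (m + 1) for _ in range(n + 1)]
        d[0][0] = 0.0
        for i in range(1, n + 1):
            ax, ay = a[i - 1]
            for j in range(1, m + 1):
                bx, by = b[j - 1]
                cost = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
        return d[n][m]


    def rank_candidates(gaze: List[GazeSample], hum_start: float, hum_end: float,
                        lexicon_paths: Dict[str, List[Point]], k: int = 5) -> List[Tuple[str, float]]:
        """Return the k lexicon words whose ideal key path best matches the hum-bounded gaze path."""
        observed = segment_by_hum(gaze, hum_start, hum_end)
        scored = [(word, dtw_distance(observed, ideal)) for word, ideal in lexicon_paths.items()]
        return sorted(scored, key=lambda pair: pair[1])[:k]

Under HumHum, hum_start and hum_end would come from two short hums marking the word boundaries; under Hummer, from the onset and offset of a single continuous hum. Word-gesture keyboards such as SHARK2 [20] additionally normalize and resample paths before matching; that preprocessing is omitted here for brevity.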

Supplementary Material

VTT File (3411764.3445501_videofigurecaptions.vtt)
MP4 File (3411764.3445501_videofigure.mp4)
Supplemental video

References

[1]
Jeff Bilmes, Xiao Li, Jonathan Malkin, Kelley Kilanski, Richard Wright, Katrin Kirchhoff, Amarnag Subramanya, Susumu Harada, James Landay, Patricia Dowden, et al. 2005. The Vocal Joystick: A voice-based human-computer interface for individuals with motor impairments. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 995–1002.
[2]
Peter M. Corcoran, Florin Nanu, Stefan Petrescu, and Petronel Bigioi. 2012. Real-time eye gaze tracking for gaming design and consumer electronics systems. IEEE Transactions on Consumer Electronics 58, 2 (2012), 347–355.
[3]
Antonio Diaz-Tula and Carlos H. Morimoto. 2016. AugKey: Increasing foveal throughput in eye typing with augmented keys. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). ACM, New York, 3533–3544. https://doi.org/10.1145/2858036.2858517
[4]
Thomas Eiter and Heikki Mannila. 1994. Computing discrete Fréchet distance. Technical Report. Citeseer.
[5]
Torsten Felzer, I. Scott MacKenzie, Philipp Beckerle, and Stephan Rinderknecht. 2010. Qanti: a software tool for quick ambiguous non-standard text input. In International Conference on Computers for Handicapped Persons. Springer, 128–135.
[6]
Maurice René Fréchet. 1906. Sur quelques points du calcul fonctionnel. Rendiconti del Circolo Matematico di Palermo (1884-1940) 22, 1 (1906), 1–72.
[7]
Masaaki Fukumoto. 2018. SilentVoice: Unnoticeable Voice Input by Ingressive Speech. In Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology (Berlin, Germany) (UIST ’18). Association for Computing Machinery, New York, NY, USA, 237–246. https://doi.org/10.1145/3242587.3242603
[8]
Markus Funk, Vanessa Tobisch, and Adam Emfield. 2020. Non-Verbal Auditory Input for Controlling Binary, Discrete, and Continuous Input in Automotive User Interfaces. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–13.
[9]
John Paulin Hansen, Anders Sewerin Johansen, Dan Witzner Hansen, Kenji Itoh, and Satoru Mashino. 2003. Command without a click: Dwell time typing by mouse and gaze selections. In Proceedings of Human-Computer Interaction–INTERACT. Springer, Berlin, 121–128.
[10]
Anke Huckauf and Mario Urbina. 2007. Gazing with pEYE: New concepts in eye typing. In Proceedings of the 4th Symposium on Applied Perception in Graphics and Visualization (Tübingen, Germany) (APGV ’07). ACM, New York, 141–141. https://doi.org/10.1145/1272582.1272618
[11]
Takeo Igarashi and John F. Hughes. 2001. Voice as sound: Using non-verbal voice input for interactive control. In Proceedings of the 14th Annual ACM Symposium on User Interface Software and Technology. 155–156.
[12]
Robert J. K. Jacob. 1990. What you look at is what you get: Eye movement-based interaction techniques. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (Seattle, WA) (CHI ’90). ACM, New York, 11–18. https://doi.org/10.1145/97243.97246
[13]
Robert J. K. Jacob. 1990. What you look at is what you get: Eye movement-based interaction techniques. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (Seattle, Washington, USA) (CHI ’90). ACM, New York, 11–18. https://doi.org/10.1145/97243.97246
[14]
Jyh-Shing Jang, Chao-Ling Hsu, and Hong-Ru Lee. 2005. Continuous HMM and Its Enhancement for Singing/Humming Query Retrieval. Proc. of ISMIR, 546–551.
[15]
Joseph Jordania. 2009. Times to fight and times to relax: Singing and humming at the beginnings of human evolutionary history. Kadmos 1, 2009 (2009), 272–277.
[16]
Josh Kaufman. 2015. Google 10000 English.
[17]
Suk-Jun Kim. 2018. Humming. Bloomsbury Publishing USA.
[18]
Alexios Kotsifakos, Panagiotis Papapetrou, Jaakko Hollmén, Dimitrios Gunopulos, and Vassilis Athitsos. 2012. A Survey of Query-by-Humming Similarity Methods. In Proceedings of the 5th International Conference on Pervasive Technologies Related to Assistive Environments (Heraklion, Crete, Greece) (PETRA ’12). Association for Computing Machinery, New York, NY, USA, Article 5, 4 pages. https://doi.org/10.1145/2413097.2413104
[19]
Per-Ola Kristensson and Keith Vertanen. 2012. The potential of dwell-free eye-typing for fast assistive gaze communication. In Proceedings of the ACM Symposium on Eye Tracking Research and Applications (ETRA ’12). ACM, New York, 241–244.
[20]
Per-Ola Kristensson and Shumin Zhai. 2004. SHARK 2: a large vocabulary shorthand writing system for pen-based computers. In Proceedings of the ACM Symposium on User Interface Software and Technology (UIST ’04). ACM, New York, 43–52.
[21]
Chandan Kumar, Ramin Hedeshy, I. Scott MacKenzie, and Steffen Staab. 2020. TAGSwipe: Touch Assisted Gaze Swipe for Text Entry. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376317
[22]
Andrew Kurauchi, Wenxin Feng, Ajjen Joshi, Carlos Morimoto, and Margrit Betke. 2016. EyeSwipe: Dwell-free text entry using gaze paths. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). ACM, New York, 1952–1956. https://doi.org/10.1145/2858036.2858335
[23]
Andrew Kurauchi, Wenxin Feng, Ajjen Joshi, Carlos Morimoto, and Margrit Betke. 2016. EyeSwipe: Dwell-free text entry using gaze paths. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (San Jose, CA) (CHI ’16). ACM, New York, 1952–1956. https://doi.org/10.1145/2858036.2858335
[24]
Andrew T. N. Kurauchi. 2018. EyeSwipe: text entry using gaze paths.
[25]
Yi Liu, Chi Zhang, Chonho Lee, Bu-Sung Lee, and Alex Qiang Chen. 2015. Gazetry: Swipe text typing using gaze. In Proceedings of the Annual Meeting of the Australian Special Interest Group for Computer Human Interaction. ACM, 192–196.
[26]
I. Scott MacKenzie and R. William Soukoreff. 2003. Phrase sets for evaluating text entry techniques. In Extended Abstracts of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI ’03). ACM, New York, 754–755.
[27]
Päivi Majaranta. 2012. Communication and text entry by gaze. In Gaze interaction and applications of eye tracking: Advances in assistive technologies. IGI Global, 63–77.
[28]
Päivi Majaranta, Ulla-Kaija Ahola, and Oleg Špakov. 2009. Fast gaze typing with an adjustable dwell time. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). ACM, New York, 357–360. https://doi.org/10.1145/1518701.1518758
[29]
Päivi Majaranta and Andreas Bulling. 2014. Eye tracking and eye-based human–computer interaction. In Advances in physiological computing. Springer, 39–65.
[30]
Raphael Menges, Chandan Kumar, and Steffen Staab. 2019. Improving user experience of eye tracking-based interaction: Introspecting and adapting interfaces. ACM Transactions on Computer-Human Interaction 26, 6, Article 37 (Nov. 2019), 46 pages. https://doi.org/10.1145/3338844
[31]
Carlos H. Morimoto and Arnon Amir. 2010. Context Switching for Fast Key Selection in Text Entry Applications. In Proceedings of the 2010 Symposium on Eye-Tracking Research and Applications (Austin, Texas) (ETRA ’10). Association for Computing Machinery, New York, NY, USA, 271–274. https://doi.org/10.1145/1743666.1743730
[32]
Carlos H. Morimoto, Jose A. T. Leyva, and Antonio Diaz-Tula. 2018. Context switching eye typing using dynamic expanding targets. In Proceedings of the Workshop on Communication by Gaze Interaction (Warsaw, Poland) (COGAIN ’18). ACM, New York, Article 6, 9 pages. https://doi.org/10.1145/3206343.3206347
[33]
Martez E. Mott, Shane Williams, Jacob O. Wobbrock, and Meredith Ringel Morris. 2017. Improving dwell-based gaze typing with dynamic, cascading dwell times. In Proceedings of the ACM CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New York, 2558–2570.
[34]
Steffen Pauws. 2002. CubyHum: A Fully Operational Query by Humming System. In ISMIR 2002 Conference Proceedings. 187–196.
[35]
Diogo Pedrosa, Maria da Graça Pimentel, and Khai N. Truong. 2015. Filteryedping: A dwell-free eye typing technique. In Extended Abstracts of the ACM SIGCHI Conference on Human Factors in Computing Systems (Seoul, Republic of Korea) (CHI ’15). ACM, New York, 303–306. https://doi.org/10.1145/2702613.2725458
[36]
Nathalia Peixoto, Hossein Ghaffari Nik, and Hamid Charkhkar. 2013. Voice Controlled Wheelchairs. Comput. Methods Prog. Biomed. 112, 1 (Oct. 2013), 156–165. https://doi.org/10.1016/j.cmpb.2013.06.009
[37]
Ken Pfeuffer and Hans Gellersen. 2016. Gaze and touch interaction on tablets. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (Tokyo, Japan) (UIST ’16). ACM, New York, 301–311. https://doi.org/10.1145/2984511.2984514
[38]
Ondřej Poláček, Zdeněk Míkovec, Adam J. Sporka, and Pavel Slavík. 2010. New way of vocal interface design: Formal description of non-verbal vocal gestures. Proceedings of the CWUAAT (2010), 137–144.
[39]
Ondrej Polacek, Zdenek Mikovec, Adam J. Sporka, and Pavel Slavik. 2011. Humsher: A predictive keyboard operated by humming. In Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility. 75–82.
[40]
Ondrej Polacek, Adam J. Sporka, and Pavel Slavik. 2017. Text input for motor-impaired people. Universal Access in the Information Society 16, 1 (2017), 51–72.
[41]
Daniel Rough, Keith Vertanen, and Per-Ola Kristensson. 2014. An evaluation of Dasher with a high-performance language model as a gaze communication method. In Proceedings of the 2014 International Working Conference on Advanced Visual Interfaces. ACM, New York, 169–176.
[42]
Hiroaki Sakoe and Seibi Chiba. 1978. Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 26, 1 (February 1978), 43–49. https://doi.org/10.1109/TASSP.1978.1163055
[43]
Sayan Sarcar, Prateek Panwar, and Tuhin Chakraborty. 2013. EyeK: An efficient dwell-free eye gaze-based text entry system. In Proceedings of the 11th Asia Pacific Conference on Computer-Human Interaction. ACM, New York, 215–220.
[44]
Korok Sengupta, Raphael Menges, Chandan Kumar, and Steffen Staab. 2019. Impact of variable positioning of text prediction in gaze-based text entry. In Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications (Denver, Colorado) (ETRA ’19). ACM, New York, Article 74, 9 pages. https://doi.org/10.1145/3317956.3318152
[45]
R. William Soukoreff and I. Scott MacKenzie. 2003. Metrics for text entry research: An evaluation of MSD and KSPC, and a new unified error metric. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Ft. Lauderdale, Florida, USA) (CHI ’03). ACM, New York, 113–120. https://doi.org/10.1145/642611.642632
[46]
Adam J. Sporka, Torsten Felzer, Sri H. Kurniawan, Ondřej Poláček, Paul Haiduk, and I. Scott MacKenzie. 2011. CHANTI: Predictive Text Entry Using Non-Verbal Vocal Input. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). Association for Computing Machinery, New York, NY, USA, 2463–2472. https://doi.org/10.1145/1978942.1979302
[47]
Adam J. Sporka, Sri H. Kurniawan, Murni Mahmud, and Pavel Slavík. 2006. Non-speech input and speech recognition for real-time control of computer games. In Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility. 213–220.
[48]
Melisa Stevanovic. 2013. Managing participation in interaction: The case of humming. Text and Talk 33 (01 2013), 113–137. https://doi.org/10.1515/text-2013-0006
[49]
Kevin Toohey and Matt Duckham. 2015. Trajectory similarity measures. SIGSPATIAL Special 7, 1 (2015), 43–50.
[50]
Outi Tuisku, Päivi Majaranta, Poika Isokoski, and Kari-Jouko Räihä. 2008. Now Dasher! Dash away!: longitudinal study of fast text entry by eye gaze. In Proceedings of the 2008 Symposium on Eye Tracking Research & Applications (ETRA ’08). ACM, New York, 19–26.
[51]
Mario H. Urbina and Anke Huckauf. 2010. Alternatives to single character entry and dwell time selection on eye typing. In Proceedings of the 2010 Symposium on Eye-Tracking Research and Applications (Austin, Texas) (ETRA ’10). Association for Computing Machinery, New York, 315–322. https://doi.org/10.1145/1743666.1743738
[52]
Alex Waibel and Kai-Fu Lee (Eds.). 1990. Readings in Speech Recognition. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[53]
Colin Ware and Harutune H. Mikaelian. 1987. An evaluation of an eye tracker as a device for computer input. In Proceedings of the ACM SIGCHI/GI Conference on Human Factors in Computing Systems and Graphics Interface (Toronto, Ontario, Canada) (CHI+GI ’87). ACM, New York, 183–188. https://doi.org/10.1145/29933.275627
[54]
Richard Watts and Peter Robinson. 1999. Controlling computers by whistling. In Proceedings of Eurographics UK. Cambridge University Press.
[55]
Jacob O. Wobbrock, James Rubinstein, Michael W. Sawyer, and Andrew T. Duchowski. 2008. Longitudinal evaluation of discrete consecutive gaze gestures for text entry. In Proceedings of the 2008 ACM Symposium on Eye Tracking Research & Applications (ETRA ’08). ACM, New York, 11–18.
[56]
Nicole Yankelovich, Gina-Anne Levow, and Matt Marx. 1995. Designing SpeechActs: Issues in Speech User Interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’95). ACM Press/Addison-Wesley Publishing Co., USA, 369–376. https://doi.org/10.1145/223904.223952
[57]
Shumin Zhai and Per-Ola Kristensson. 2012. The word-gesture keyboard: Reimagining keyboard interaction. Commun. ACM 55, 9 (2012), 91–101.
[58]
Xiaoyu (Amy) Zhao, Elias D. Guestrin, Dimitry Sayenko, Tyler Simpson, Michel Gauthier, and Milos R. Popovic. 2012. Typing with Eye-Gaze and Tooth-Clicks. In Proceedings of the Symposium on Eye Tracking Research and Applications (Santa Barbara, California) (ETRA ’12). Association for Computing Machinery, New York, NY, USA, 341–344. https://doi.org/10.1145/2168556.2168632
[59]
Daniel Zielasko, Neha Neha, Benjamin Weyers, and Torsten W. Kuhlen. 2017. A reliable non-verbal vocal input metaphor for clicking. In 2017 IEEE Symposium on 3D User Interfaces (3DUI). IEEE, 40–49.

Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN: 9781450380966
DOI: 10.1145/3411764
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 May 2021

Author Tags

  1. eye tracking
  2. eye typing
  3. hands-free interaction
  4. humming
  5. swipe

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '21

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Article Metrics

  • Downloads (Last 12 months): 158
  • Downloads (Last 6 weeks): 63
Reflects downloads up to 07 Mar 2025

Cited By

  • (2024) TouchEditor. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7(4), 1–29. https://doi.org/10.1145/3631454. Online publication date: 12-Jan-2024
  • (2024) FocusFlow: 3D Gaze-Depth Interaction in Virtual Reality Leveraging Active Visual Depth Manipulation. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–18. https://doi.org/10.1145/3613904.3642589. Online publication date: 11-May-2024
  • (2024) SkiMR: Dwell-free Eye Typing in Mixed Reality. 2024 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 439–449. https://doi.org/10.1109/VR58804.2024.00065. Online publication date: 16-Mar-2024
  • (2024) The Guided Evaluation Method. International Journal of Human-Computer Studies 190(C). https://doi.org/10.1016/j.ijhcs.2024.103317. Online publication date: 1-Oct-2024
  • (2023) Balancing Accuracy and Speed in Gaze-Touch Grid Menu Selection in AR via Mapping Sub-Menus to a Hand-Held Device. Sensors 23(23), 9587. https://doi.org/10.3390/s23239587. Online publication date: 3-Dec-2023
  • (2023) FocusFlow: Leveraging Focal Depth for Gaze Interaction in Virtual Reality. Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1–4. https://doi.org/10.1145/3586182.3615818. Online publication date: 29-Oct-2023
  • (2023) Handwriting Velcro. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6(4), 1–31. https://doi.org/10.1145/3569461. Online publication date: 11-Jan-2023
  • (2022) Improving Finger Stroke Recognition Rate for Eyes-Free Mid-Air Typing in VR. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 1–9. https://doi.org/10.1145/3491102.3502100. Online publication date: 29-Apr-2022
  • (2022) Evaluation of Text Selection Techniques in Virtual Reality Head-Mounted Displays. 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 131–140. https://doi.org/10.1109/ISMAR55827.2022.00027. Online publication date: Oct-2022
  • (2022) An Exploration of Hands-free Text Selection for Virtual Reality Head-Mounted Displays. 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 74–81. https://doi.org/10.1109/ISMAR55827.2022.00021. Online publication date: Oct-2022