
Voiceye: A Multimodal Inclusive Development Environment

Published: 03 July 2020

Abstract

People with physical impairments who are unable to use traditional input devices (i.e. mouse and keyboard) are often excluded from technical professions (e.g. web development). Alternative input methods such as eye gaze tracking and speech recognition have become more readily available in recent years with both being explored independently to support people with physical impairments in coding activities. This paper describes a novel multimodal application ("Voiceye") that combines voice input, gaze interaction, and mechanical switches as an alternative approach for writing code. The system was evaluated with non-disabled participants who have coding experience (N=29) to assess the feasibility of the application in writing HTML and CSS code. Results found that Voiceye was perceived positively and enabled successful completion of coding tasks. A follow-up study with disabled participants (N=5) demonstrated that this method of multimodal interaction can support people with physical impairments in writing and editing code.
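
To make the interaction model described above concrete, here is a minimal, hypothetical TypeScript sketch of how three input channels could be fused in an editor loop: a gaze point places the caret, a spoken command stages an HTML snippet (in the spirit of abbreviation expanders such as Emmet), and a mechanical switch press commits it, guarding against recognition errors. This is an illustration only, not the authors' implementation; the class, command set, and channel APIs are all invented for this sketch.

```typescript
// Hypothetical sketch of multimodal code entry: gaze places the caret,
// voice stages a snippet, a switch press commits it. Not the Voiceye code.

type GazePoint = { line: number; column: number };

// Map spoken phrases to HTML snippets (invented command set).
const VOICE_SNIPPETS: Record<string, string> = {
  "paragraph": "<p></p>",
  "heading one": "<h1></h1>",
  "unordered list": "<ul>\n  <li></li>\n</ul>",
};

class MultimodalEditor {
  private buffer: string[] = [""];
  private caret: GazePoint = { line: 0, column: 0 };
  private pending: string | null = null;

  // Gaze channel: move the caret to where the user is looking.
  onGaze(point: GazePoint): void {
    this.caret = point;
  }

  // Voice channel: stage a snippet for the spoken command, if recognized.
  onVoiceCommand(phrase: string): void {
    this.pending = VOICE_SNIPPETS[phrase.toLowerCase()] ?? null;
  }

  // Switch channel: a switch press commits the staged snippet at the
  // gaze-placed caret, so a misrecognized phrase is never auto-inserted.
  onSwitchPress(): void {
    if (this.pending === null) return;
    const line = this.buffer[this.caret.line] ?? "";
    this.buffer[this.caret.line] =
      line.slice(0, this.caret.column) +
      this.pending +
      line.slice(this.caret.column);
    this.pending = null;
  }

  text(): string {
    return this.buffer.join("\n");
  }
}

// Simulated interaction: look at line 0, say "paragraph", press the switch.
const editor = new MultimodalEditor();
editor.onGaze({ line: 0, column: 0 });
editor.onVoiceCommand("paragraph");
editor.onSwitchPress();
console.log(editor.text()); // -> <p></p>
```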



    Published In

    DIS '20: Proceedings of the 2020 ACM Designing Interactive Systems Conference
    July 2020
    2264 pages
    ISBN: 9781450369749
    DOI: 10.1145/3357236
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 July 2020

    Author Tags

    1. assistive technology
    2. eye gaze tracking
    3. programming tools
    4. speech recognition

    Qualifiers

    • Research-article

    Conference

    DIS '20: Designing Interactive Systems Conference 2020
    July 6-10, 2020
    Eindhoven, Netherlands

    Acceptance Rates

    Overall Acceptance Rate: 1,158 of 4,684 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 45
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 03 Jan 2025.

    Cited By

    • (2024) Jasay: Towards Voice Commands in Projectional Editors. Proceedings of the 1st ACM/IEEE Workshop on Integrated Development Environments, 30-34. DOI: 10.1145/3643796.3648449. Online publication date: 20-Apr-2024.
    • (2023) An End-to-End Review of Gaze Estimation and its Interactive Applications on Handheld Mobile Devices. ACM Computing Surveys 56, 2: 1-38. DOI: 10.1145/3606947. Online publication date: 15-Sep-2023.
    • (2022) Low-Cost Human–Machine Interface for Computer Control with Facial Landmark Detection and Voice Commands. Sensors 22, 23: 9279. DOI: 10.3390/s22239279. Online publication date: 29-Nov-2022.
    • (2022) Inclusive Multimodal Voice Interaction for Code Navigation. Proceedings of the 2022 International Conference on Multimodal Interaction, 509-519. DOI: 10.1145/3536221.3556600. Online publication date: 7-Nov-2022.
    • (2022) GazeBreath: Input Method Using Gaze Pointing and Breath Selection. Proceedings of the Augmented Humans International Conference 2022, 1-9. DOI: 10.1145/3519391.3519405. Online publication date: 13-Mar-2022.
    • (2021) Natural Interaction with Traffic Control Cameras Through Multimodal Interfaces. Artificial Intelligence in HCI, 501-515. DOI: 10.1007/978-3-030-77772-2_33. Online publication date: 3-Jul-2021.
