DOI: 10.1145/3450341.3458491 (ETRA '21 Adjunct, short paper)

Characterizing the Performance of Deep Neural Networks for Eye-Tracking

Published: 25 May 2021

Abstract

Deep neural networks (DNNs) provide powerful tools to identify and track features of interest, and have recently come into use for eye-tracking. Here, we test the ability of a DNN to predict keypoints localizing the eyelid and pupil under the types of challenging image variability that occur in mobile eye-tracking. We simulate varying degrees of perturbation for five common sources of image variation in mobile eye-tracking: rotations, blur, exposure, reflection, and compression artifacts. To compare the relative performance decrease across domains in a common space of image variation, we used features derived from a DNN (ResNet50) to compute the distance of each perturbed video from the videos used to train our DNN. We found that increasing cosine distance from the training distribution was associated with monotonic decreases in model performance in all domains. These results suggest ways to optimize the selection of diverse images for model training.
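The distance measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes feature vectors have already been extracted for each video (e.g., from the penultimate layer of a pretrained ResNet50), and aggregating the training-set features into a centroid is an assumption, since the abstract does not specify how per-video distances are computed against the training distribution.

```python
import numpy as np

def cosine_distance(a, b):
    """1 minus the cosine similarity of two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def distance_from_training(perturbed_feats, training_feats):
    """Cosine distance of each perturbed-video feature vector from the
    centroid of the training-set feature vectors (aggregation scheme is
    an assumption for illustration)."""
    centroid = np.mean(np.asarray(training_feats, dtype=float), axis=0)
    return np.array([cosine_distance(f, centroid) for f in perturbed_feats])
```

Under this sketch, a perturbed video whose features match the training centroid scores 0, and orthogonal features score 1; the paper's finding is that performance drops monotonically as this distance grows.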


Cited By

  • (2023) "A framework for generalizable neural networks for robust estimation of eyelids and pupils." Behavior Research Methods 56:4, 3959–3981. DOI: 10.3758/s13428-023-02266-3. Online publication date: 28-Nov-2023.


    Published In

    ETRA '21 Adjunct: ACM Symposium on Eye Tracking Research and Applications
    May 2021, 78 pages
    ISBN: 9781450383578
    DOI: 10.1145/3450341

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. deep eye tracking
    2. eyelid tracking
    3. pupil tracking

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Funding Sources

    • NSF EPSCoR

    Conference

    ETRA '21

    Acceptance Rates

    Overall Acceptance Rate 69 of 137 submissions, 50%

    Article Metrics

    • Downloads (last 12 months): 22
    • Downloads (last 6 weeks): 3

    Reflects downloads up to 12 Dec 2024

