research-article

Open access

Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods

Authors:

Yang Li

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

Pages 1 - 12

https://doi.org/10.1145/3313831.3376870

Published: 23 April 2020

Abstract

Modeling visual search not only offers an opportunity to predict the usability of an interface before testing it on real users, but also advances scientific understanding of human behavior. In this work, we first conduct a set of analyses on a large-scale dataset of visual search tasks on realistic webpages. We then present a deep neural network that learns to predict the scannability of webpage content, i.e., how easy it is for a user to find a specific target. Our model leverages both heuristic-based features, such as target size, and unstructured features, such as raw image pixels. This approach allows us to model complex interactions that might be involved in a realistic visual search task, which cannot be achieved by traditional analytical models. We analyze the model behavior to offer insights into how the salience map learned by the model aligns with human intuition.
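To make the idea of combining heuristic-based and unstructured features concrete, here is a minimal Python sketch. It is not the architecture from the paper: the layer sizes, the average pooling, the three example features, and every function name (`average_pool`, `dense_relu`, `make_params`, `predict_scannability`) are assumptions chosen for illustration, and the weights are random and untrained.

```python
import random


def average_pool(image, block):
    """Downsample a 2D grayscale image (a list of rows) by averaging block x block tiles."""
    pooled = []
    for r in range(0, len(image) - block + 1, block):
        for c in range(0, len(image[0]) - block + 1, block):
            tile = [image[r + i][c + j] for i in range(block) for j in range(block)]
            pooled.append(sum(tile) / len(tile))
    return pooled


def dense_relu(inputs, weights, biases):
    """Fully connected layer followed by a ReLU nonlinearity."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def make_params(n_pixels, n_heuristics, hidden=4, seed=0):
    """Build parameters for the two branches and the output layer (hypothetical sizes)."""
    rng = random.Random(seed)
    out_weights, _ = init_layer(rng, 2 * hidden, 1)
    return {
        "pixel": init_layer(rng, n_pixels, hidden),
        "heuristic": init_layer(rng, n_heuristics, hidden),
        "out": (out_weights[0], 0.0),
    }


def predict_scannability(image, heuristics, params, block=2):
    """Concatenate the pixel branch and the heuristic branch, then map to one score."""
    pixel_branch = dense_relu(average_pool(image, block), *params["pixel"])
    heuristic_branch = dense_relu(heuristics, *params["heuristic"])
    combined = pixel_branch + heuristic_branch  # list concatenation
    out_weights, out_bias = params["out"]
    return sum(w * x for w, x in zip(out_weights, combined)) + out_bias


# Hypothetical inputs: a 4x4 grayscale "screenshot" and three hand-crafted features
# (target width, target height, normalized vertical position).
screenshot = [[0.0, 0.2, 0.4, 0.6], [0.1, 0.3, 0.5, 0.7],
              [0.9, 0.8, 0.7, 0.6], [0.5, 0.5, 0.5, 0.5]]
features = [0.3, 0.1, 0.25]
params = make_params(n_pixels=4, n_heuristics=3)
score = predict_scannability(screenshot, features, params)
print(f"untrained score: {score:.4f}")
```

The point of the two-branch layout is that structured features and raw pixels are encoded separately and only meet at the concatenation step, so a trained output layer could weigh interactions between them. A real model would replace the pooling with convolutional layers and learn the weights from search-time data.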

Supplementary Material

MP4 File (a741-yuan-presentation.mp4), 40.16 MB

Cited By

Meißner S, Degbelo A (2024). User Performance Modelling for Spatial Entities Comparison with Geodashboards: Using View Quality and Distractor as Concepts. Companion Proceedings of the 16th ACM SIGCHI Symposium on Engineering Interactive Computing Systems, 7-14. Online publication date: 24-Jun-2024.
https://dl.acm.org/doi/10.1145/3660515.3661325
Lee L, Yau Y, Hui P (2024). Perceived User Reachability in Mobile UIs Using Data Analytics and Machine Learning. International Journal of Human–Computer Interaction, 1-24. Online publication date: 25-Mar-2024.
https://doi.org/10.1080/10447318.2024.2327199
Wu J, Krosnick R, Schoop E, Swearngin A, Bigham J, Nichols J (2023). Never-ending Learning of User Interfaces. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-13. Online publication date: 29-Oct-2023.
https://dl.acm.org/doi/10.1145/3586183.3606824

Index Terms

Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
1. Computing methodologies
  1. Artificial intelligence

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

April 2020

10688 pages

ISBN:9781450367080

DOI:10.1145/3313831

General Chairs:
Regina Bernhaupt (Eindhoven University of Technology, Netherlands)
Florian 'Floyd' Mueller (Monash University, Australia)
David Verweij (Newcastle University, UK)
Josh Andres (RMIT, Australia)

Program Chairs:
Joanna McGrenere (University of British Columbia, Canada)
Andy Cockburn (University of Canterbury, New Zealand)
Ignacio Avellino (University of Maryland Baltimore County, USA)
Alix Goguey (Grenoble Alpes University, France)
Pernille Bjørn (University of Copenhagen, Denmark)
Shengdong (Shen) Zhao (National University of Singapore, Singapore)
Briane Paul Samson (Future University Hakodate, Japan & De La Salle University, Philippines)
Rafal Kocielnik (University of Washington, USA)

Copyright © 2020 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CHI '20

Sponsor:

SIGCHI

CHI '20: CHI Conference on Human Factors in Computing Systems

April 25 - 30, 2020

Honolulu, HI, USA

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Bibliometrics

Total citations: 17
Total downloads: 1,403
Downloads (last 12 months): 278
Downloads (last 6 weeks): 62

Reflects downloads up to 10 Dec 2024

Citations

Jokinen J, Oulasvirta A, Howes A (2023). Cognitive Modelling: From GOMS to Deep Reinforcement Learning. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1-3. Online publication date: 19-Apr-2023.
https://dl.acm.org/doi/10.1145/3544549.3574173
Wallace S, Cai T, Le B, Leiva L (2022). Debiased Label Aggregation for Subjective Crowdsourcing Tasks. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, 1-8. Online publication date: 27-Apr-2022.
https://dl.acm.org/doi/10.1145/3491101.3519614
Jokinen J, Oulasvirta A, Howes A (2022). Cognitive Modelling: From GOMS to Deep Reinforcement Learning. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, 1-3. Online publication date: 27-Apr-2022.
https://dl.acm.org/doi/10.1145/3491101.3503771
Bajammal M, Stocco A, Mazinanian D, Mesbah A (2022). A Survey on the Use of Computer Vision to Improve Software Engineering Tasks. IEEE Transactions on Software Engineering 48:5, 1722-1742. Online publication date: 1-May-2022.
https://dl.acm.org/doi/10.1109/TSE.2020.3032986
Pourmemar M, Joshi Y, Poullis C (2022). Predicting Human Performance in Vertical Hierarchical Menu Selection in Immersive AR Using Hand-gesture and Head-gaze. 2022 15th International Conference on Human System Interaction (HSI), 1-8. Online publication date: 28-Jul-2022.
https://doi.org/10.1109/HSI55341.2022.9869495
Kaluarachchi T, Reis A, Nanayakkara S (2021). A Review of Recent Deep Learning Approaches in Human-Centered Machine Learning. Sensors 21:7, 2514. Online publication date: 3-Apr-2021.
https://doi.org/10.3390/s21072514
Todi K, Bailly G, Leiva L, Oulasvirta A, Kitamura Y, Quigley A, Isbister K, Igarashi T, Bjørn P, Drucker S (2021). Adapting User Interfaces with Model-based Reinforcement Learning. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-13. Online publication date: 6-May-2021.
https://dl.acm.org/doi/10.1145/3411764.3445497
