Abstract
Users often browse the web in an exploratory way, inspecting what they find interesting without a specific goal. However, the temporal dynamics of visual attention during such sessions, emerging when users gaze from one item to another, are not well understood. In this paper, we examine how people distribute visual attention among content items when browsing news. Distribution of visual attention is studied in a controlled experiment, wherein eye-tracking data and web logs are collected for 18 participants exploring newsfeeds in a single- and multi-column layout. Behavior is modeled using Weibull analysis of item (article) visit times, which describes these visits via quantities like durations and frequencies of switching focused item. Bayesian inference is used to quantify uncertainty. The results suggest that visual attention in browsing is fragmented, and affected by the number, properties and composition of the items visible on the viewport. We connect these findings to previous work explaining information-seeking behavior through cost-benefit judgments.
1 Introduction
A large proportion of people’s time engaged with computers gets devoted to web browsing [14]. In considerable proportions, this activity includes exploration [4, 46, 47], characterized by lack of an explicit informational goal. Unlike focused search tasks, exploration permits users to review the available content freely and engage with anything they find interesting. Given the information-rich nature of many modern browsing environments (news feeds, social media, catalogues, etc.), understanding how users choose what to attend to and for how long during exploration is a key challenge for behavioral and psychological research.
This paper presents new empirical data and modeling results on visual attention when browsing feed-format news. We focus on “open-ended” browsing tasks, wherein information is gathered without a specific goal. The literature has described this type of browsing with several terms [6], such as “undirected” [13, 19], “unstructured” [13], “casual” [28], “serendipitous” [9, 13], “capricious” [43], and “hedonic” [27], denoting a contrast against directed or semi-directed browsing, which assumes a specified or somewhat specified goal.
At present, it is not well known how users spread their attention across content items when browsing. Previous work has used static spatial representations such as heatmaps for eye movement patterns during browsing. The “F-pattern” [30, 33] and the Golden Triangle [21, 22] are well-known examples. In addition, commercial tools have been developed to predict visual attention to visual stimuli (e.g., Attention Insight and 3M Visual Attention Software). Some research has examined how users distribute attention to a page’s various HTML elements [25], while other studies focused on how people look at a single element type, such as specific features of images [12, 20, 23]. Still, the temporal dynamics of this behavior remain relatively unknown. Exceptions to this are work by Liu et al. [26] and Luo et al. [27]. Modeling page-level dwell times via Weibull analysis, Liu et al. concluded that general browsing exhibits a screening pattern: only some web pages pass an initial screening. They used data on page-visit times without considering visual attention. Similarly, Luo et al. modeled page-level dwell times, using an inverse Gaussian distribution.
This paper presents new findings on the temporal dynamics of visual attention when browsing news. We model gaze behavior in a task where users were given a long newsfeed to explore, and were asked to read what they find interesting. As depicted in Fig. 1, we examined the temporal dynamics of visual attention over a page in terms of two concepts: an item offers a clickable preview of content, here consisting of textual (title and description) and visual (picture) elements, and a visit consists of a continuous sequence of fixations on an item.
We report on the distributions of visit times and examine them with Weibull analysis, a technique employed for analyzing time-to-event data [39], with special attention to the parameters of the Weibull distribution. We use Bayesian inference to quantify the uncertainty in the parameter estimates. Presenting how visit durations are affected by two independent variables – layout type (single- vs. multi-column) and item content (a picture and/or a description with a title), we look in particular at survival (how long the visit lasts, or “survives”) and hazard (the rate at which visits end). Our main finding is that visual attention in news browsing is fragmented. That is, visits are very brief on average, and items may be visited more than once.
In summary, this paper offers two contributions:
- We show that the distribution of visual attention in news browsing is fragmented and depends on the properties of the items and the environment.
- We quantify this fragmentation by means of Weibull analysis, extending the model presented by Liu et al. [26] to item-level dynamics.
We conclude with synthesis, discussing these results and future work. In particular, we analyze the findings in relation to existing theories of information-seeking that take cost–benefit judgments as a basis for behavior.
2 Related Work
Below, we provide a brief overview of literature on visual attention and of empirical results related to browsing. We then discuss theories specifically addressing how people seek information on the web.
2.1 Visual Attention and Web Browsing
Eye-tracking work in information-search studies commonly associates eye movements with attention [7, 15, 25]. Several eye-tracking studies have analyzed the distribution of visual attention over a website during a browsing session, many of these focusing on search-engine result pages (SERPs) [15, 21, 22, 25]. Among the well-known outputs pertaining to web browsing are the depiction of how people “scan” some web pages in a pattern resembling the letter F [30] and Google’s Golden Triangle, wherein the upper-left corner of a SERP attracts the majority of eye-tracking activity [21, 22] (similar patterns have been found for other types of pages too [7]). Most often, static representations of eye movements describe the findings, typically represented via heatmaps. In addition, previous research has examined visual attention in relation to specific content displayed on a website – for instance, tendencies in ignoring advertisement banners [36, 37].
2.2 Empirical Understanding of Browsing Without a Specific Goal
In comparison, browsing without a specific goal has gained little empirical attention. Work using similar methodology to ours consists primarily of the aforementioned studies of general and hedonic browsing, by Liu et al. [26] and Luo et al. [27], both using statistical models. While Liu et al.’s work identified a screening pattern in web browsing, the inverse Gaussian dwell-time distribution found by Luo and colleagues in hedonic content systems reinforces the law-of-surfing results proposed by Huberman et al. [24] in more structured tasks. The “law” suggests that the number of pages visited on a given website follows strong regularities and this can be captured with distinct probability distributions.
2.3 Theories of Browsing Behavior
While there are multiple theoretical models of browsing, views on its fundamental mechanisms differ. Cove and Walsh describe browsing as “an art” wherein individuals know what they want only as they come across it [13]. In fact, many theoretical accounts address this observation, whether focusing on directed, semi-directed, or undirected browsing. White and Roth [47] distinguish exploratory browsing as a part of exploratory search (as opposed to focused search) undertaken to 1) specify information needs and 2) encourage information discovery [40, 47]. Using the “berrypicking” model, Bates [3, 4] draws on the analogy of picking berries in a forest: browsing is an activity comprising a series of glimpses that may lead to closer inspection and acquisition of an object, connecting it to curiosity and exploratory behavior in humans. In this analogy, browsing is undirected behavior [40]. In contrast, information-foraging theory (IFT) [34] is based on biology and anthropology’s theory of optimal foraging; this posits that individuals weigh the costs of performing an action against the potential information gain. The IFT notion of browsing is of a dynamic activity wherein the individual is guided between information items by “information scent,” a subjective measure of item value [10]. Work in SERP and other settings has suggested that cost–benefit judgments may drive users’ browsing behavior [2]. Under the models developed for these settings, the effort and time required to complete the task are the costs while the relevance of the information discovered constitutes the benefits [2]. An alternative is to interpret the benefits as utility, though such notions are seldom used in IR research, due to difficulties in measuring it [2, 11, 41].
3 Method
We report on a controlled experiment where participants were given a newsfeed to explore in two different layouts. They were allowed to inspect, click, and scroll as they found natural, without being asked to perform particular actions during the browsing session. We present aggregated results obtained from 18 participants, with multiple data sources: eye tracking, web logging, and a questionnaire.
3.1 Experimental Design
Participants were advised to read news items they found interesting for an unlimited amount of time. They were told that the experiment is not designed to assess them. All participants were shown a single-column (mobile-like) and multi-column (desktop-like) layout, where news items in various categories were presented with different levels of detail visible (see Fig. 2). The conditions’ presentation order was counterbalanced across the participants: every other person saw the single-column condition first followed by the multi-column condition, and the tasks had the opposite order for the other half of the participants. The logs captured interaction with an item if at least 60% of it was visible on the screen and it remained visible for at least 300 ms. In addition to eye-tracking and log data for each participant, we recorded participants’ interests, via the questionnaire form. Our analysis examined the effects of two independent variables on item (article) visit times: layout (single- or multi-column) and level of detail (accompanying the title, always presented, with an image and/or description).
3.2 Participants
In total, 24 participants (9 male and 15 female university students, mean age 26.13 years with SD=4.27) were enrolled in the study, between December 12 and 20, 2018, from a student mailing list at Aalto University. Five of them wore glasses, and one used contact lenses. On a five-point Likert scale, from participants’ self-reporting, the mean level of their knowledge of the English language was 3.71 (SD=0.62) and of their interest in news related to North America was 3.67 (SD=0.64). All participants received a movie ticket as compensation. The study was conducted in accordance with the principles stated in the Declaration of Helsinki and a local procedure for ethics approval. Each participant signed an informed-consent form before taking part.
3.3 Apparatus and Setup
A custom news-aggregator web application called WebNews was created for the experiment. This application’s purpose was twofold: 1) to present stored news items to participants in a single-column and multi-column layout (see Fig. 2) and 2) to log the participants’ browsing-behavior data. On average, the single-column condition presented three items per viewport, and the multi-column condition had eight. The application was implemented with standard web technologies (front end) and with Python and MongoDB (back end). The experiment was carried out via the Chrome 71 web browser, running on Windows 8.1, with an Intel Core i7-5930K CPU @3.50 GHz and 64 GB of RAM. Other hardware used in the experiment included a 24-inch LCD monitor and a Logitech M100 optical mouse with scroll wheel. The participants’ eye movements were tracked by means of a Tobii EyeX eye-tracker attached to the bottom edge of the monitor. The tracker was calibrated with Tobii Eye Tracking Core Software v2.13.4 for each participant individually, once at the beginning of the experiment and then a second time, between conditions. We collected fixation data by using a custom C# program with Tobii Interaction Library SDK 0.7.3. A video of the participants’ browsing behavior with overlaid eye positions was recorded via Tobii Ghost v1.4 and OBS Studio v22.0.2, while eye positions and fixation data were captured through a custom C# program.
3.4 Materials
Headlines of the top live news articles from the US were obtained each morning of the experiment via News API and stored locally in an empty database. Each news item belonged to a single, specified topic category (“business,” “entertainment,” “health,” “science,” “sports,” or “technology”) and contained data such as title, teaser, publication date, and URL to both an actual news article and a related image. Four categories were presented in the single-column layout (“business,” “entertainment,” “health,” and “technology”), while the multi-column layout covered all six. Out of the stored news articles (approximately 400 pieces), 64 + 64 (no duplicates) news items were randomly sampled for each participant to be shown in the experiment. The previews of the articles varied in their level of detail: all items were presented with a title, but some featured a picture and/or description in addition. In the single-column condition, levels of detail within the given layout were determined randomly. The template applied for the multi-column layout displayed the levels of detail in the same order for all participants, but different articles were assigned to each item position. This design choice was intended to generate realistic-looking layouts in the multi-column condition: had the items’ detail level been allowed to vary randomly, the page may have looked unrealistic, with items of differing size shown side by side. Participants could freely choose to browse the WebNews app or visit the external sites where the articles were hosted. Upon visiting an external site, participants were instructed to return to the WebNews app and not follow any further links. Additionally, they rated how interesting they found the news in each category generally.
3.5 Data Pre-processing
We considered a visit to consist of a continuous dwell on an item. When a participant fixated on an area outside the item and then returned to it, we deemed the subsequent dwelling as a revisit and regarded the item as having been visited twice so far. Visits were calculated from fixation data, obtained using the Tobii software’s fixation filter. For six users, the beginnings and endings of fixations that it calculated were ambiguous, likely on account of a logging error (that is, either a “Begin” or an “End” tag being missing for the gaze points’ associated event type). For consistency in fixation calculations (i.e., comparability across all fixations included in the modeling), we omitted these users’ data from consideration. Roughly following earlier work’s approach [18], we filtered out fixation outliers, which we defined as fixations shorter than 50 ms or longer than 1500 ms; these accounted for 22% of all fixations across conditions. Fixation duration may depend on the type of activity (e.g., reading [45] or visual search, with varying difficulty [35]), so we ran a sensitivity analysis with outliers included, to be sure the definition chosen for outliers did not affect the qualitative modeling results.
We considered only those fixations taking place within the WebNews app, to focus on browsing internal to the newsfeed (rather than on external sites). Likewise, we excluded fixations on areas in the margins or on items that were not included in the logs (i.e., those visible for below 300 ms or with less than 60% of their area visible). Our final dataset consisted of 18 participants’ data, for 7,446 fixations (with a mean of 0.33 s, SD=0.29) in the single-column condition and 7,122 (mean: 0.33 s, SD=0.28) in the multi-column layout. Since the participants were allowed to sit 45–100 cm from the monitor and move their head back and forth (our calculations used a mean of 72.5 cm), we assumed the foveal area to correspond to a diameter of 2.53 cm, or 96 px. If any part of an item fell within the foveal area, the calculation of visits took it into account. We carried out sensitivity analysis to test the effect of different foveal areas (diameters of 56 px, 96 px, and 132 px) and concluded that the qualitative modeling results are not affected by these choices. Our dataset covers 2,200 visits, to 794 articles, in the single-column condition and 3,178 visits, to 898, in the multi-column one.
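The visit definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `Fixation` and `Visit` types and the function name are hypothetical, each fixation is assumed to be already mapped to at most one item (the foveal-overlap rule is not modeled), outlier fixations are assumed to be dropped before segmentation, and a visit's duration is assumed to be the sum of its fixation durations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    duration: float            # seconds
    item_id: Optional[str]     # item under gaze; None for margins or unlogged items


@dataclass
class Visit:
    item_id: str
    duration: float            # summed fixation durations (assumption)
    n_fixations: int


def fixations_to_visits(fixations: List[Fixation],
                        min_dur: float = 0.050,
                        max_dur: float = 1.500) -> List[Visit]:
    """Group consecutive fixations on the same item into visits."""
    # Drop outlier fixations first (assumption: they do not interrupt a visit).
    kept = [f for f in fixations if min_dur <= f.duration <= max_dur]

    visits: List[Visit] = []
    prev_item: Optional[str] = None
    for f in kept:
        if f.item_id is not None:
            if f.item_id == prev_item:
                # Same item as the previous fixation: extend the running visit.
                visits[-1].duration += f.duration
                visits[-1].n_fixations += 1
            else:
                # Gaze arrived from another item or from outside: new visit (or revisit).
                visits.append(Visit(f.item_id, f.duration, 1))
        # A fixation outside all items resets prev_item, so a return counts as a revisit.
        prev_item = f.item_id
    return visits
```

Under these assumptions, a gaze sequence a, a, b, a yields three visits, with item a visited twice.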
4 Modeling Browsing Behavior
We use Weibull analysis to examine browsing behavior. The Weibull distribution has been used in different contexts as it can fit data from a number of different applications (e.g., biology, engineering and economics) [39]. We draw an analogy between system failure and web browsing, in a manner similar to Liu et al.’s [26] but with item-visit times rather than page-dwell times as the time-to-event data. Inherent to this approach is that visiting is considered a random process.
4.1 Visiting as a Random Process
In web browsing, a user can visit an item for one or more fixations, then shift the focus of attention somewhere else. Consider a user who is examining a screen displaying three items as in Fig. 1. In this example, there are five visits (labeled from top to bottom in the figure: item 1 \(\rightarrow \) item 3 \(\rightarrow \) item 1 \(\rightarrow \) item 2 \(\rightarrow \) item 3). We model visiting as a random process. Formalizing phenomena as a random process has proven suitable for application in such fields as general browsing [26], gaming [5, 44], and medical research [8], with survival analysis of time-to-event data. It is plausible that item visits in browsing are affected by multiple latent variables, introducing randomness. For instance, a door suddenly closing during browsing may draw the user’s attention away from the screen, interrupting a visit. We assume that, alongside the random component, browsing behavior is affected by the properties of the items and the browsing environment.
Similarly to Liu et al., we assume that visit durations follow a Weibull distribution, as its parameters can be interpreted with respect to user behavior. We consider a two-parameter Weibull distribution with a shape k and a scale \(\lambda \).
- The distribution’s shape parameter (k) determines whether a process follows negative aging (i.e., the immediate probability of the process ending decreases over time) or positive aging (i.e., the immediate probability of it ending rises over time). Positive aging is associated with \(k>1\), no aging with \(k=1\), and negative aging with \(k<1\).
- The scale parameter (\(\lambda \)) denotes the time by which 63.2% of the processes have ended [31].
One can analyze these parameters by applying two concepts from Weibull analysis: the survival function and the hazard function, for which we use the following formulations. The former, S(t), describes the proportion of processes (visits) that exceed a given duration t. It is the complement of the cumulative distribution function, F(t), and can be written as follows for the Weibull distribution:
\[ S(t) = 1 - F(t) = \exp \left( -\left( \frac{t}{\lambda }\right) ^{k}\right) \]
The hazard function h(t) at time t of a process (instantaneous failure rate) is calculated thus [26]:
\[ h(t) = \frac{k}{\lambda }\left( \frac{t}{\lambda }\right) ^{k-1} \]
Here, h(t) gives the instantaneous rate at which a visit to an item ends.
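The two functions can be written down directly for the standard two-parameter Weibull distribution with shape k and scale \(\lambda \). The sketch below is illustrative; the function names are hypothetical, and the parameter values in the example are arbitrary rather than fitted.

```python
import math


def weibull_survival(t: float, k: float, lam: float) -> float:
    """S(t) = exp(-(t / lam)^k): share of visits lasting longer than t."""
    return math.exp(-((t / lam) ** k))


def weibull_hazard(t: float, k: float, lam: float) -> float:
    """h(t) = (k / lam) * (t / lam)^(k - 1): rate at which visits end at time t (t > 0)."""
    return (k / lam) * (t / lam) ** (k - 1)


# At t = lam the survival is exp(-1), so 1 - exp(-1), about 63.2%, of visits have ended,
# whatever the shape parameter.
ended_by_scale = 1.0 - weibull_survival(2.0, k=0.9, lam=2.0)

# With k < 1 the hazard falls over time (negative aging); with k = 1 it is constant.
falling = weibull_hazard(0.5, k=0.9, lam=2.0) > weibull_hazard(1.5, k=0.9, lam=2.0)
```

Evaluating survival at t equal to the scale makes the 63.2% interpretation of \(\lambda \) concrete.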
4.2 Weibull Model Specification
To obtain more robust estimates of the Weibull model parameters (shape and scale), we use Bayesian inference to obtain their posterior distributions. This allows quantifying the uncertainty in these estimates. Two models are considered: 1) a separate model for single- and multi-column environments and 2) an extension of this that takes into account properties of an item as covariates.
Separate Model. The separate model accounts for single- and multi-column environments having distinct, or “separate,” shape and scale parameters. We assume that k and \(\lambda \) both have a weakly informative prior distribution in the positive domain. Hence, the data y can be modeled via the Weibull distribution:
\[ y \sim \text {Weibull}(k, \lambda ) \]
Separate Model with Covariates. We also can consider adding item properties to the separate model as covariates. The properties we examine are whether the item contains a picture (p), a description (d), or both. All items contain a title in our setting. We assume that all parameters have weakly informative prior distributions. Again, dataset y is assumed to follow a Weibull distribution:
where y denotes the data, \(\beta _0\) is an intercept, \(\beta _p\) and \(\beta _d\) are coefficients, and x is a Boolean indicating whether an item preview included a picture (p) or a description (d). Hence, the covariates x are included as a linear combination for the scale \(\lambda \) parameter. This way of adding covariates to the Weibull distribution is referred to as the accelerated life model or the proportional hazard model [39], and the implementation chosen is based on one from prior work [29, 32].
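To make the role of the covariates concrete, the sketch below computes a Weibull log-likelihood in which the scale depends on the picture and description indicators. It rests on a labeled assumption: the linear combination is passed through an exponential (log link) so that the scale stays positive. The text does not specify the link, and the function names are hypothetical. Priors and sampling are left out.

```python
import math
from typing import Iterable, Tuple


def weibull_logpdf(y: float, k: float, lam: float) -> float:
    """Log density of a two-parameter Weibull (shape k, scale lam) at y > 0."""
    z = y / lam
    return math.log(k) - math.log(lam) + (k - 1) * math.log(z) - z ** k


def scale_from_covariates(beta0: float, beta_p: float, beta_d: float,
                          has_picture: int, has_description: int) -> float:
    """Scale as exp(linear predictor); the log link is an assumption."""
    return math.exp(beta0 + beta_p * has_picture + beta_d * has_description)


def log_likelihood(visits: Iterable[Tuple[float, int, int]],
                   k: float, beta0: float, beta_p: float, beta_d: float) -> float:
    """Sum of log densities over (duration, has_picture, has_description) rows."""
    total = 0.0
    for duration, has_p, has_d in visits:
        lam = scale_from_covariates(beta0, beta_p, beta_d, has_p, has_d)
        total += weibull_logpdf(duration, k, lam)
    return total
```

Under this parameterization, a positive coefficient lengthens the typical visit to items that carry the corresponding element.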
4.3 Model Fitting
To obtain posterior samples for the parameters, a sampler implemented in PyStan3 (Hamiltonian Monte Carlo with No-U-Turn) [38] was run with four chains for 1,000 iterations, with 500 iterations being discarded as warm-up in line with recommendations [16, p. 282]. We achieved good convergence (measured as rank-normalized \(\hat{R}<\) 1.01 [42]). Model fit was evaluated via comparison of the observations to data produced under the posterior. We used a posterior predictive p-value, which is the probability of data drawn from the posterior being more extreme than the observations, as measured by a test quantity [16] (note that this metric is not the commonly used frequentist p-value). We performed a prior sensitivity analysis too, concluding that the qualitative results hold also for both an uninformative and an informative prior.
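The posterior predictive check described above can be sketched as follows, using the mean as the test quantity. This is an illustration under stated assumptions: `posterior_draws` is a hypothetical list of (k, \(\lambda \)) pairs taken from a sampler, one replicated dataset is simulated per draw, and "more extreme" is taken to mean a replicated statistic at least as large as the observed one.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def posterior_predictive_pvalue(observed: Sequence[float],
                                posterior_draws: Sequence[Tuple[float, float]],
                                stat: Callable[[List[float]], float] = statistics.fmean,
                                seed: int = 0) -> float:
    """Share of replicated datasets whose test quantity is >= the observed one."""
    rng = random.Random(seed)
    t_obs = stat(list(observed))
    n = len(observed)
    exceed = 0
    for k, lam in posterior_draws:
        # random.weibullvariate takes (scale, shape), in that order.
        replicated = [rng.weibullvariate(lam, k) for _ in range(n)]
        if stat(replicated) >= t_obs:
            exceed += 1
    return exceed / len(posterior_draws)
```

Values near 0.5 indicate that the replicated statistic sits on either side of the observed one about equally often; values close to 0 or 1 point to misfit on that statistic.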
5 Results
We report both statistics describing the visit durations and the results of model fitting. Since uncertainty is quantified in the Bayesian model (see Subsect. 5.2), we do not provide related measurements for the descriptive results.
5.1 Descriptive Results
The participants were allowed to browse for an unlimited amount of time. On average, the participants spent more time browsing in the single-column than the multi-column condition: 18:01.80 (SD=599 s) vs. 13:15.94 (SD=257 s). Visits to items were, in general, short, and the distribution of visit durations was right-skewed: most visits were very brief, with some extended visits creating a long right tail for the distribution. Mean visit durations were longer in the single-column condition (1.54 s, with SD=1.90, vs. 0.84 s, with SD=1.04). Total dwell times on items (the sums of all visit durations) were higher in the single-column condition, with means of 3.38 s (SD=3.09) and 2.50 s (SD=2.17), respectively. In both conditions they exceeded the mean visit durations, reflecting that items frequently received several visits. That is, users seemed to engage in the following pattern: observe an item, look elsewhere on the screen, then return to an item they had already examined. In the single-column condition, approximately 58% of the items were visited more than once, while 76% of the items in the multi-column condition received several visits. The mean number of visits was higher in the multi-column condition, at 2.01 (SD=2.02) as opposed to 1.25 (SD=1.69) in the single-column condition.
5.2 Modeling Results
To analyze the strategies users may adopt during web browsing, we look at survival and hazard functions for the two fitted models.
Model Fitting Results. We begin by describing the fitting results for the Weibull models’ shape and scale parameters. In the single-column condition, the k parameter suggests that visits follow negative aging, since k is below 1 for both models (90% credibility intervals of \(k \in [0.89,0.93]\) and \(k\in [0.90,0.95]\) for, respectively, the separate model and the separate model with covariates). That is, the immediate probability of a user glancing away from an item decreases over time. For the multi-column condition, k is higher (with corresponding 90% credibility intervals of \(k\in [0.96,1.00]\) and \(k\in [0.97,1.01]\)). This translates to behavior wherein the immediate probability that a user switches between items is more stable in the multi-column condition. The 90% credibility intervals for the two conditions do not overlap for the k parameter. The scale parameter’s value is lower in the multi-column condition for both models (means: \(\lambda \approx 0.8\) vs. \(\lambda \approx 1.5\)) and for items with less information. This result can be interpreted as follows: 63.2% of visits end before reaching a duration of approximately 1.5 s (single-column) or 0.8 s (multi-column). Posterior predictive p-values calculated with the mean as the test statistic indicate a good model fit [16, 17] (\(p\approx 0.43\) for the separate model and \(p\approx 0.47\) for the separate model with covariates), though the fitted model underestimates standard deviation (\(SD\approx 1.3\) vs. \(SD\approx 1.5\)).
Survival. Next, we turn to the survival functions evaluated for the fitted shape and scale parameters. Both models estimate that visits frequently last less than a second, suggesting that visits to items are brief. The models also estimate that visits are longer in the single-column condition. For instance, the separate model estimates that 47–53% of them last over a second in the single-column condition while the equivalent figure for the other condition is only 28–33% (see Fig. 3, pane A). These estimates seem consistent with the empirical observations of the proportions of visits above a given duration (see the dotted lines in Fig. 3, A). The fitted model also shows that visits to items that have less information (e.g., only a title) tend to be shorter (see Fig. 3, B and C).
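As a rough consistency check on these survival estimates, the sketch below evaluates the standard two-parameter Weibull survival function at t = 1 s. The point values for k and \(\lambda \) are assumptions chosen for illustration: approximate midpoints of the reported 90% intervals and the rounded scale means, not the posterior draws themselves.

```python
import math


def weibull_survival(t: float, k: float, lam: float) -> float:
    """Share of visits lasting longer than t seconds."""
    return math.exp(-((t / lam) ** k))


# Illustrative point values (assumed): interval midpoints and rounded scale means.
conditions = {
    "single-column": {"k": 0.91, "lam": 1.5},
    "multi-column": {"k": 0.98, "lam": 0.8},
}

share_over_1s = {
    name: weibull_survival(1.0, p["k"], p["lam"]) for name, p in conditions.items()
}
for name, share in share_over_1s.items():
    print(f"{name}: {share:.2f} of visits exceed 1 s")
```

With these values the shares come out at about 0.50 and 0.29, inside the reported ranges of 47–53% and 28–33%.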
Hazard. The model fitting’s results suggest that the hazard functions for the two conditions differ in shape. Users seem to move between items frequently, with the switching rate being higher in the multi-column condition. The instantaneous rate of switching one’s focus of attention per second (the hazard rate) decreases over time in the single-column condition (the slope of the hazard function is steeper for that condition in Fig. 4’s pane A). In the multi-column condition, on the other hand, this rate stays more stable. Users in the multi-column setting were approximately 50% more likely to switch their focus of attention upon landing on an item than users in the single-column condition (with roughly 1.2 vs. 0.8 switches per second in the first fixation). Users move their attention away more quickly from items with fewer details (as Fig. 4’s panes B and C attest). This pattern is more distinct for the single-column layout, where the average hazard rate decreases as the amount of detail increases (e.g., compare the hazard functions for items with title only vs. with image, title, and description in Fig. 4, B), than for the multi-column layout. In the multi-column condition, the hazard rate is similar between items with a title only and ones with an additional image (t and pt in Fig. 4, C). In addition, items with descriptions in the multi-column condition show similar hazard rates (dt and ptd in Fig. 4, C). These observations arise from the different estimates for the shape k and the scale \(\lambda \) parameters for the different models. The hazard rate in the single-column condition exhibits negative aging (a hazard rate that falls over time), which roughly corresponds to a screening pattern wherein most visits are brief, with some items passing this initial test [26]. A less prominent effect is visible in the multi-column condition.
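The link between the shape parameter and a falling or stable hazard follows from differentiating the standard two-parameter Weibull hazard with respect to time. The short derivation below uses only the symbols k, \(\lambda \), and t defined earlier.

```latex
\[
h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1},
\qquad
\frac{dh}{dt} = \frac{k\,(k-1)}{\lambda^{2}}\left(\frac{t}{\lambda}\right)^{k-2}.
\]
% For t > 0, k > 0 and \lambda > 0, the sign of dh/dt equals the sign of (k - 1):
%   k < 1  =>  dh/dt < 0  (hazard falls over time, negative aging)
%   k = 1  =>  dh/dt = 0  (constant hazard 1/\lambda)
%   k > 1  =>  dh/dt > 0  (hazard rises over time, positive aging)
```

A shape estimate close to 1, as in the multi-column condition, therefore implies a nearly flat hazard, whereas an estimate clearly below 1 implies a hazard that declines as the visit goes on.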
6 Discussion and Conclusions
The main findings of this paper are the following:
- People distribute their attention to items on a screen in a fragmented manner. Instead of making a single, focused visit to an item, users gather information in a sequence of visits.
- We found the “fragmentation” to be more prominent in desktop (multi-column) than mobile-like (single-column) environments in our setting.
We measured fragmentation of attention as the frequency of gaze shifts between items and formalized it by modeling visit durations via Weibull analysis in line with prior work [26]. These results could inform design of content feeds, commonly used in social media and news applications.
Weibull analysis presents the advantage of having parameters (scale and shape) that can be interpreted with respect to user behavior. Our results suggest that mobile-like environments with a single-column layout are more effective at maximizing the attention a user directs toward any single item. If the goal is instead a maximal number of items attended to, desktop-like environments with multi-column layouts are better. For example, the fitted model suggests that when user gaze shifts to a target item, the rate of switching one’s focus of attention (the hazard rate) is higher in a desktop-like environment. The properties of the item matter also: items that contain a title, a description, and an image are given attention longer than those with just a title. This observation is sensible, since items with only a title offer less information and are therefore quicker to process. In addition, we found that a screening pattern wherein items are quickly scanned is more prevalent in the single-column condition, as suggested by the lower value of the Weibull distribution’s shape parameter. This observation parallels that of Liu et al., who find a similar pattern when analyzing page-level data.
One way to interpret the model proposed here is that a user samples item-visit times from Weibull distributions. Our results point to these distributions diverging between the two conditions (mobile- and desktop-like) and with the level of detail visible in an item (picture, title, and/or description). Longer visits to items that are richer in detail may be a natural consequence of there being more information to explore. Similarly, the shorter visits in the multi-column condition may stem from a more complex layout and the larger number of items presented. Additionally, with fewer items being visible on the viewport in the single-column layout than the multi-column one at any given time, longer visits may be explained by the effort it would take to switch viewport.
We hypothesize that a cost–benefit (or utility) lens [2, 11, 41] may aid in interpreting these results. In previous work, the notion of costs and benefits has been used in reference to browsing behavior in more structured search tasks (e.g., with SERPs [1, 2]). Some of this work, building on IFT, suggests that search behavior is determined by a judgment of whether the information sources are relevant for the information diet. We suggest that our results can be viewed through this lens (i.e., in relation to cost–benefit analysis) even though we concern ourselves with an unstructured task. Browsing the newsfeed brings a cost to the user in the form of time and effort. In addition, users may choose items to attend to by gauging some utility to be gained from the activity, even when no specific information need has been defined. This approach ties in with our finding that visit times were shorter in the desktop- than the mobile-like environment and with items showing a title only. The switching cost of glancing at another item may be lower when the target displays only text and in multi-column layouts that position items near one another and make more items visible without a need for scrolling. When switching costs are lower, moving between items more frequently may offer strategic benefits. Related work sometimes characterizes undirected browsing tasks as oriented toward randomness [4, 9, 13]. Were browsing purely random selection and sampling, however, we would not expect the inter-condition differences observed in our study to emerge.
Future work could address certain limitations of the study. Running a similar experiment in a different geographical region, with participants who are not primarily students and using another commercial eye-tracker, could aid in assessing whether the results generalize to the population at large. In addition, future efforts should aim to explore visit durations in more complex conditions, such as the richer interaction scenarios emerging when users follow links.
Data Availability Statement
The data and code for the Weibull analysis are available through the project page: https://userinterfaces.aalto.fi/browsing/.
References
Azzopardi, L., Thomas, P., Craswell, N.: Measuring the utility of search engine result pages: An information foraging based measure. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 605–614. SIGIR ’18, Association for Computing Machinery, New York, NY, USA (2018). https://doi.org/10.1145/3209978.3210027
Azzopardi, L., Zuccon, G.: An analysis of the cost and benefit of search interactions. In: Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval, pp. 59–68. ICTIR ’16, Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2970398.2970412
Bates, M.J.: The design of browsing and berrypicking techniques for the online search interface. Online Rev. 13(5), 407–424 (1989). https://doi.org/10.1108/eb024320
Bates, M.J.: What is browsing - really? A model drawing from behavioural science research. Inform. Res. 12(4) (2007)
Bauckhage, C., Kersting, K., Sifa, R., Thurau, C., Drachen, A., Canossa, A.: How players lose interest in playing a game: an empirical study based on distributions of total playing times. In: 2012 IEEE Conference on Computational Intelligence and Games (CIG), pp. 139–146 (2012). https://doi.org/10.1109/CIG.2012.6374148
Bawden, D.: Encountering on the road to Serendip? Browsing in new information environments, pp. 1–22. Facet (2011). https://doi.org/10.29085/9781856049733.003
Buscher, G., Cutrell, E., Morris, M.R.: What do you see when you’re surfing? Using eye tracking to predict salient regions of Web pages, pp. 21–30. Association for Computing Machinery, New York, NY, USA (2009). https://doi.org/10.1145/1518701.1518705
Carroll, K.J.: On the use and utility of the Weibull model in the analysis of survival data. Control. Clin. Trials 24(6), 682–701 (2003). https://doi.org/10.1016/S0197-2456(03)00072-2
Catledge, L.D., Pitkow, J.E.: Characterizing browsing strategies in the world-wide web. Computer Networks and ISDN Systems 27(6), 1065–1073 (1995). https://doi.org/10.1016/0169-7552(95)00043-7, proceedings of the Third International World-Wide Web Conference
Chi, E.H., Pirolli, P., Chen, K., Pitkow, J.: Using information scent to model user information needs and actions and the Web. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 490–497. CHI ’01, Association for Computing Machinery, New York, NY, USA (2001). https://doi.org/10.1145/365024.365325
Cooper, W.S.: On selecting a measure of retrieval effectiveness. J. Am. Soc. Inform. Sci. 24(2), 87–100 (1973). https://doi.org/10.1002/asi.4630240204
Couture Bue, A.C.: The looking glass selfie: Instagram use frequency predicts visual attention to high-anxiety body regions in young women. Comput. Hum. Behav. 108, 106329 (2020). https://doi.org/10.1016/j.chb.2020.106329
Cove, J., Walsh, B.: Online text retrieval via browsing. Inform. Process. Manage. 24(1), 31–37 (1988). https://doi.org/10.1016/0306-4573(88)90075-1
Crichton, K., Christin, N., Cranor, L.F.: How do home computer users browse the web? ACM Trans. Web 16(1) (2021). https://doi.org/10.1145/3473343
Dumais, S.T., Buscher, G., Cutrell, E.: Individual differences in gaze patterns for web search. In: Proceedings of the Third Symposium on Information Interaction in Context, pp. 185–194. IIiX ’10, Association for Computing Machinery, New York, NY, USA (2010). https://doi.org/10.1145/1840784.1840812
Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., Rubin, D.: Model checking. Chapman Hall/CRC (2013). https://doi.org/10.1201/b16018
Gelman, A.: Two simple examples for understanding posterior p-values whose distributions are far from uniform. Electron. J. Stat. 7, 2595–2602 (2013). https://doi.org/10.1214/13-EJS854
Henderson, J.M., Choi, W., Luke, S.G., Schmidt, J.: Neural correlates of individual differences in fixation duration during natural reading. Quart. J. Exp. Psychol. 71(1), 314–323 (2018). https://doi.org/10.1080/17470218.2017.1329322
Herner, S.: Browsing. In: Kent, A., Lancour, H., Nasri, W. (eds.) Encyclopedia of Library and Information Science, vol. 3, pp. 408–415 (1970)
Ho, H.F.: The effects of controlling visual attention to handbags for women in online shops: Evidence from eye movements. Comput. Hum. Behav. 30, 146–152 (2014). https://doi.org/10.1016/j.chb.2013.08.006
Hotchkiss, G.: Google’s Golden Triangle - Nine Years Later (2014). https://outofmygord.com/2014/10/09/googles-golden-triangle-nine-years-later/
Hotchkiss, G., Alston, S., Edwards, G.: Eye Tracking Study: An In Depth Look at Interactions with Google Using Eye Tracking Methodology. Enquiro Search Solutions Incorporated (Jun 2005)
Huang, Y.T.: The female gaze: content composition and slot position in personalized banner ads, and how they influence visual attention in online shoppers. Comput. Hum. Behav. 82, 1–15 (2018). https://doi.org/10.1016/j.chb.2017.12.038
Huberman, B.A., Pirolli, P.L.T., Pitkow, J.E., Lukose, R.M.: Strong regularities in world wide web surfing. Science 280(5360), 95–97 (1998). https://doi.org/10.1126/science.280.5360.95
Lagun, D., Agichtein, E.: Inferring searcher attention by jointly modeling user interactions and content salience. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 483–492. SIGIR ’15, Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2766462.2767745
Liu, C., White, R.W., Dumais, S.: Understanding web browsing behaviors through Weibull analysis of dwell time. In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 379–386. SIGIR ’10, Association for Computing Machinery, New York, NY, USA (2010). https://doi.org/10.1145/1835449.1835513
Luo, P., Zhou, G., Tang, J., Chen, R., Yu, Z., He, Q.: Browsing regularities in hedonic content systems. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3811–3817. IJCAI’16, AAAI Press (2016)
Marchionini, G.: Information Seeking in Electronic Environments. Cambridge Series on Human-Computer Interaction, Cambridge University Press (1995). https://doi.org/10.1017/CBO9780511626388
Peltola, T. (to-mi): stan-survival-shrinkage (2015). https://github.com/to-mi/stan-survival-shrinkage
Nielsen, J.: F-Shaped Pattern for Reading Web Content (original study) (Apr 2006). https://www.nngroup.com/articles/f-shaped-pattern-reading-web-content-discovered/
Pasha, G., Khan, M.S., Pasha, A.: Empirical analysis of the Weibull distribution for failure data. J. Stat. 13, 33–45 (2006)
Peltola, T., Havulinna, A.S., Salomaa, V., Vehtari, A.: Hierarchical Bayesian survival analysis and projective covariate selection in cardiovascular event risk prediction. In: Proceedings of the Eleventh UAI Conference on Bayesian Modeling Applications Workshop - Volume 1218, pp. 79–88. BMAW’14, CEUR-WS.org, Aachen, DEU (2014)
Pernice, K.: Text Scanning Patterns: Eyetracking Evidence (Aug 2019). https://www.nngroup.com/articles/text-scanning-patterns-eyetracking/
Pirolli, P., Card, S.: Information foraging in information access environments. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 51–58. CHI ’95, ACM Press / Addison-Wesley, USA (1995). https://doi.org/10.1145/223904.223911
Reingold, E.M., Glaholt, M.G.: Cognitive control of fixation duration in visual search: the role of extrafoveal processing. Vis. Cogn. 22(3–4), 610–634 (2014). https://doi.org/10.1080/13506285.2014.881443
Resnick, M., Albert, W.: The impact of advertising location and user task on the emergence of banner ad blindness: An eye-tracking study. Int. J. Human-Comput. Interact. 30(3), 206–219 (2014). https://doi.org/10.1080/10447318.2013.847762
Resnick, M.L., Albert, W.: The influences of design esthetic, site relevancy and task relevancy on attention to banner advertising. Interact. Comput. 28(5), 680–694 (2016). https://doi.org/10.1093/iwc/iwv042
Riddell, A., Hartikainen, A., Carter, M.: pystan (3.0.0). PyPI (Mar 2021)
Rinne, H.: Related distributions. In: The Weibull Distribution. CRC Press (2008). https://doi.org/10.1201/9781420087444.ch3
Savolainen, R.: Berrypicking and information foraging: comparison of two theoretical frameworks for studying exploratory search. J. Inf. Sci. 44(5), 580–593 (2018). https://doi.org/10.1177/0165551517713168
Varian, H.R.: Economics and search. ACM. SIGIR Forum 33(1), 1–5 (1999)
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., Bürkner, P.C.: Rank-normalization, folding, and localization: an improved \(\hat{\text{R}}\) for assessing convergence of MCMC (with discussion). Bayesian Analysis 16(2) (Jun 2021). https://doi.org/10.1214/20-ba1221
Vickery, J.: A note in defense of browsing. BLL Rev. 5(3), 110 (1977)
Viljanen, M., Airola, A., Heikkonen, J., Pahikkala, T.: Playtime measurement with survival analysis. IEEE Trans. Games 10(2), 128–138 (2018). https://doi.org/10.1109/TCIAIG.2017.2727642
Vitu, F., McConkie, G.W., Kerr, P., O’Regan, J.: Fixation location effects on fixation durations during reading: An inverted optimal viewing position effect. Vision. Res. 41(25), 3513–3533 (2001). https://doi.org/10.1016/S0042-6989(01)00166-3
White, R.W., Drucker, S.M.: Investigating behavioral variability in web search. In: Proceedings of the 16th International Conference on World Wide Web, pp. 21–30. WWW ’07, Association for Computing Machinery, New York, NY, USA (2007). https://doi.org/10.1145/1242572.1242576
White, R.W., Roth, R.A.: Exploratory search: beyond the query-response paradigm. Synthesis Lect. Inform. Concepts, Retrieval Serv. 1(1), 1–98 (2009). https://doi.org/10.2200/S00174ED1V01Y200901ICR003
Acknowledgements
This work was supported by the Finnish Center for Artificial Intelligence (FCAI), Business Finland (MINERAL project), the Academy of Finland (projects Human Automata – ID: 328813, and BAD – ID: 318559), as well as the Technology Industries of Finland (project SOWP). We would also like to thank the reviewers for their feedback.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Putkonen, A., Nioche, A., Laine, M., Kuuramo, C., Oulasvirta, A. (2023). Fragmented Visual Attention in Web Browsing: Weibull Analysis of Item Visit Times. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13981. Springer, Cham. https://doi.org/10.1007/978-3-031-28238-6_5
DOI: https://doi.org/10.1007/978-3-031-28238-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28237-9
Online ISBN: 978-3-031-28238-6
eBook Packages: Computer Science (R0)