
1 Introduction

The use of tablets for teaching and learning in schools is now common. In many developed countries, each child often has their own tablet during class [7]. This gives educators a wider set of options for teaching and learning activities. Many tablet devices in schools, such as the Microsoft Surface, can run applications traditionally designed for the desktop. It is often impractical to augment classroom tablets with mice, keyboards, or trackpads due to space constraints or the risk of loss or damage. So, even for traditional desktop activities, the preferred method of interaction is often the touch screen, with the device used in tablet mode. As a result, applications that support educational activities with reduced text input are often preferred [20]. A common method of providing input is drag-and-drop interaction, especially in spatial and visual teaching and learning activities. In this context, children have two main options when interacting with tablets: finger input or stylus input. Because the cost of a stylus is low, it is worth knowing what effect finger or stylus input has on drag-and-drop performance, as schools may wish to provide children with styli to use with tablets if a measurable advantage exists.

2 Related Work

Inkpen [8] compared children’s performance on mouse-based drag-and-drop tasks and point-and-click tasks. Point-and-click interaction was faster and had fewer errors. In contexts where direct selection is possible through touch, it is anticipated that results would be similar to those reported in that study.

Finger and stylus input with children has been compared in a number of studies [2, 3, 22, 25, 26]. However, to our knowledge, there is no research comparing finger and stylus input for drag-and-drop tasks with children.

Drag-and-drop using the finger or stylus has been investigated in older adults (65+) using mobile devices [18], with an increase in accuracy observed for stylus input. As younger children have less experience holding and using a pen, more research on this user group is needed.

McKnight and Cassidy [16] investigated finger and stylus input with children in a qualitative study. Their work did not use drag-and-drop and employed now-dated resistive touch screen devices primarily designed for use with a stylus, where finger input is problematic. Perhaps unsurprisingly, a preference was found for stylus input. Since this study, the technical landscape has changed significantly. Tablets are now common in schools and are primarily designed for finger input. Tablet manufacturers do, however, sell premium pen-based accessories. Low-cost replaceable fiber-tip styli are also commonly available.

Stylus use and comfort were studied by Ren and Zhou [21], who manipulated the length and width of pens for children of different ages. They concluded that a pen-length of 7–13 cm combined with a pen tip width of 1.0–1.5 mm and a pen barrel width of 4 mm was optimal. In a more recent study, Arif and Sylla found the Wacom bamboo pen suitable for children [2].

Concerning finger interaction, Anthony et al. [1] found differences in performance with finger input when comparing children to adults. Children had more difficulty than adults in having their finger gestures recognized and more problems accurately acquiring on-screen targets.

Arif and Sylla [2] investigated touch and pen gestures with both adults and children (the latter aged 8–11 yrs). Pen input was faster and more accurate than touch for adults, but there was no significant difference with children. One possible reason is that gesture input requires accurate input for correct recognition, and this accuracy may only develop as children get older. In addition, no significant user preference for finger or stylus input was reported for the children.

Compared to gestural input, drag-and-drop tasks require less accuracy, and so are appropriate to evaluate with children. Due to the prolific use of tablets in schools and the potential benefits for stylus use with children engaged in learning activities, it is appropriate to revisit stylus input for tasks such as drag-and-drop.

Woodward et al. [26] found that children’s fingers slipped less and responded more accurately when performing pointing and gesture tasks. However, children missed more targets with finger input than with stylus input. While limited in participant numbers (13), their work highlights the complexities of measuring pen or finger performance on tablets with children.

It would be useful for the HCI community to have robust but simplified measures of performance when considering stylus input as an alternative to touch. One such measure is throughput, calculated using the testing methodology associated with Fitts’ law.

2.1 Evaluation Using Fitts’ Law and ISO 9241-9

Fitts’ law [6] provides a well-established protocol for evaluating target selection operations on computing systems [11]. This is particularly true since the mid-1990s with the inclusion of Fitts’ law testing in the ISO 9241-9 standard for evaluating non-keyboard input devices [9, 10, 23]. The most common ISO evaluation procedure uses a two-dimensional (2D) task with targets of width W arranged in a circle. Selections proceed in a sequence moving across and around the circle. See Fig. 1. Each movement covers an amplitude A, the diameter of the layout circle. The movement time (MT, in seconds) is recorded for each trial and averaged over the sequence of trials.

Fig. 1. Standard two-dimensional target selection task in ISO 9241-9 [9, 10].

The difficulty of each trial is quantified using an index of difficulty (ID, in bits) and is calculated from A and W as

$$\begin{aligned} ID = \log _2\big (\frac{A}{W}+1\big ). \end{aligned}$$
(1)

The main performance measure in ISO 9241-9 is throughput (TP, in bits/second or bps) which is calculated over a sequence of trials as the ID:MT ratio:

$$\begin{aligned} TP = \big (\frac{ID_e}{MT}\big ). \end{aligned}$$
(2)

The standard specifies using the effective index of difficulty (ID\(_e\)), which includes an adjustment for accuracy to reflect the spatial variability in responses:

$$\begin{aligned} ID_e = \log _2\big (\frac{A_e}{W_e}+1\big ). \end{aligned}$$
(3)

with

$$\begin{aligned} W_e = 4.133 \times SD_x. \end{aligned}$$
(4)

The term SD\(_x\) is the standard deviation in the selection coordinates computed over a sequence of trials. For the two-dimensional task, selections are projected onto the task axis, yielding a single normalized x-coordinate for each trial. For x = 0, the selection was on a line orthogonal to the task axis that intersects the center of the target. x is negative for selections on the near side of the target centre and positive for selections on the far side. The factor 4.133 adjusts the target width for a nominal error rate of 4% under the assumption that the selection coordinates are normally distributed. The effective amplitude (\(A_e\)) is the actual distance traveled along the task axis. The use of \(A_e\) instead of A is only necessary if there is an overall tendency for selections to overshoot or undershoot the target (see [13] for additional details).
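To make the accuracy adjustment concrete, the following minimal Python sketch computes SD\(_x\), W\(_e\), A\(_e\), ID\(_e\), and throughput for one sequence of logged drop coordinates. The function name, data layout, and units are illustrative assumptions; they are not part of the FittsFarm or GoStats software.

import numpy as np

def sequence_throughput(starts, targets, drops, mt_ms):
    """Throughput (bits/s) for one sequence of trials, per Eqs. 2-4.
    starts, targets, drops: (n, 2) arrays of drag start points, target
    centres, and drop coordinates (pixels); mt_ms: per-trial movement
    times in milliseconds. Illustrative sketch only."""
    starts, targets, drops = (np.asarray(a, float) for a in (starts, targets, drops))
    vec = targets - starts                     # task axis for each trial
    amp = np.linalg.norm(vec, axis=1)          # nominal movement distance
    unit = vec / amp[:, None]
    # Project each drop onto its task axis: x = 0 at the target centre,
    # x < 0 for undershoots (near side), x > 0 for overshoots (far side).
    x = np.einsum('ij,ij->i', drops - targets, unit)
    w_e = 4.133 * x.std(ddof=1)                # Eq. 4
    a_e = np.mean(amp + x)                     # effective amplitude
    id_e = np.log2(a_e / w_e + 1)              # Eq. 3
    mt = np.mean(mt_ms) / 1000.0               # mean movement time, seconds
    return id_e / mt                           # Eq. 2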

Throughput is a potentially valuable measure of human performance because it embeds both the speed and accuracy of participant responses. Comparisons between studies are therefore possible, with the proviso that the studies use the same method of calculating throughput. Figure 2 shows an expanded formula for throughput, illustrating the presence of speed and accuracy in the calculation.

Fig. 2. Expanded formula for throughput, featuring speed (\(1/MT\)) and accuracy (\(SD_x\)).
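The expanded formula follows directly from substituting Eqs. 3 and 4 into Eq. 2:

$$\begin{aligned} TP = \frac{\log _2\big (\frac{A_e}{4.133 \times SD_x}+1\big )}{MT}. \end{aligned}$$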

Previous studies applying Fitts’ law with children are scarce. One recent paper by Gottwald et al. studied the movement accuracy of 14-month-old children [19]. A study by Tsai compared the performance of children (10–14 yrs), adults, and the elderly when using a set of gestures on smartphones and concluded that the Fitts’ law model held for all users [24]. The study also suggested that any device with a display smaller than five inches was too small to be usable.

3 Method

Preference data are difficult to gather reliably with children, due to external factors. For example, children who are familiar with stylus accessories such as the Apple Pencil or Microsoft Surface Pen may indicate a preference for the stylus simply because it is seen as novel and “cool”. As such, the focus in this paper is on children’s performance with finger and stylus input, and user preference data were not formally gathered. Our focus is on drag-and-drop tasks.

Fitts’ law was used as the experimental procedure, with throughput, described above, as the main performance metric. Measures of speed and accuracy are also important as they often provide additional insight on the observed behaviors. Analyses are also provided on the dragging paths, the distribution of drop coordinates, and Fitts’ law regression models.

3.1 Participants

Twenty-eight children aged 8 and 9 (16 male, 12 female) participated on a voluntary basis during a school visit to the university lab. Appropriate parental consent was obtained through the school in the weeks prior to the visit. This age range was selected because the children are young users whose writing and pen-holding skills are still developing. They are within Piaget’s concrete-operational stage of development, in which more logical and organized communication and collaboration are observed and more sophisticated collaborative learning exercises are possible [19].

3.2 Apparatus

Five Asus Google Nexus 7 tablets running Android v5.1.1 were used along with five low-cost micro-fiber-tip stylus pens. The pens measured 120 mm by 9 mm with an 8 mm tip. The tablets have a 7-inch multi-touch color display with a resolution of 1920 \(\times \) 1200 pixels. The device dimensions are 120 \(\times \) 10 \(\times \) 198 mm. Figure 3 illustrates the hardware used in the study.

Fig. 3. Asus Google Nexus 7 tablet and stylus.

The experiment software was an Android implementation of the ISO 2D Fitts’ law software developed by MacKenzie [12], known as FittsTaskTwo (now GoFitts). The software was modified to support drag-and-drop interaction and to add a game-like appeal for children. To make the task “child friendly” and to keep the children engaged, the start target was an apple (i.e., food) and the end target was an animal (to be fed). The modified software is called FittsFarm.

The graphics changed from trial to trial, providing variation to maintain the children’s interest. There was also a voice-over telling the children how to proceed. Apart from these changes, the adapted software has the same features as the original ISO-conforming software, with circular targets presented on a flat 2D plane. Figure 4 illustrates a typical user trial with FittsFarm.

Fig. 4. FittsFarm user trial. The participant feeds the lion by dragging and dropping the apple. The blue arrow does not appear during use (Color figure online).

3.3 Procedure

On entering the experiment room, the children were first given a short briefing about the study. Their task was to perform an “animal feeding” activity on a tablet computer. A short training phase was conducted where the children completed two sequences of 15 “feeding” trials of a random amplitude and target width (Fig. 4). The training phase was repeated at the start of each input method (finger, stylus).

For each sequence, the children were asked to complete all trials once started, proceeding at a comfortable speed without making many mistakes. If they missed a drag-and-drop target, an audio alert sounded. At the end of each sequence, performance data appeared on the device display.

The study was divided into two blocks, one for finger input and one for stylus input. Each block had six sequences of 15 trials. Each sequence used a unique combination of movement amplitude and target width. Targets appeared in a circle to provide a full range of drag-and-drop start and end points. See Fig. 1.

A trial was a single drag-and-drop interaction from a source target to a destination target. The start target for the first trial in a sequence was chosen at random; for each subsequent trial, the start point was the previous end point. Trials proceeded clockwise around the circle until all 15 were complete.

Participants were organized in groups of four, with each child seated separately and a facilitator assigned to oversee the study.

3.4 Design

The study was a \(2 \times 3 \times 2\) within-subjects design with the following independent variables and levels:

  • Input method (finger, stylus)

  • Amplitude (120, 240, 480)

  • Width (50, 100)

The primary independent variable was input method. Amplitude and width were included to ensure the tasks covered a range of difficulty. This results in six sequences for each input method, with IDs ranging from \(\log _2(\frac{120}{100}+1)=1.14\) bits to \(\log _2(\frac{480}{50}+1)=3.41\) bits.
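The ID for each of the six A \(\times \) W combinations can be computed directly from Eq. 1, as in the following minimal Python sketch (the condition values are those listed above):

import itertools, math

amplitudes = (120, 240, 480)   # nominal units
widths = (50, 100)

# Index of difficulty (Eq. 1) for each of the six A x W sequences
for a, w in itertools.product(amplitudes, widths):
    print(f"A={a:3d}  W={w:3d}  ID={math.log2(a / w + 1):.2f} bits")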

The target width and amplitude values are “nominal” values, not pixels. The FittsFarm software scales the values according to the resolution of the host device’s display such that the widest condition spans the available space with a 10-pixel margin.
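The exact scaling rule is internal to FittsFarm. A plausible sketch, assuming the nominal layout extent is the largest amplitude plus the largest width (only the 10-pixel margin is taken from the text; everything else here is an assumption), is:

def scale_conditions(amplitudes, widths, display_px, margin_px=10):
    """Map nominal A and W values to pixels so that the widest condition
    (largest layout circle plus target width) just fits the display
    minus the margin. Illustrative assumption, not the actual
    FittsFarm implementation."""
    span = max(amplitudes) + max(widths)
    scale = (display_px - 2 * margin_px) / span
    return ([round(a * scale) for a in amplitudes],
            [round(w * scale) for w in widths])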

The participants were divided into two groups. One group was tested with the finger first; the other group was tested with the stylus first.

Aside from training, the total number of trials was 5040 (= 28 participants \(\times \) 2 input methods \(\times \) 3 amplitudes \(\times \) 2 widths \(\times \) 15 trials).

4 Results and Discussion

The results are discussed below, organized by dependent variable. Statistical analyses were performed using the GoStats application.

4.1 Throughput

The grand mean for throughput was 2.45 bps. By input method, the means were 2.34 bps (finger) and 2.55 bps (stylus), representing a 9% performance advantage for stylus input. See Fig. 5. The effect of input method on throughput was statistically significant (\(F_{1,26} = 11.11, p < .005\)).

Fig. 5. Throughput (bps) by input method. Error bars show \(\pm 1\) SE.

The finger and stylus throughput values are low compared to desktop pointing and selecting with a computer mouse or stylus, where throughputs are typically in the range of 4–5 bps [23, Table 4]. Of course, here, we are testing children and the task is drag-and-drop, not point-select. There are no throughput values in the literature for drag-and-drop tasks computed using Eq. 2.

For dragging actions, there is likely a motor-control overhead in ensuring that an acquired target remains selected while moving it from its start point to its destination. This is supported by findings elsewhere [8, 14, 16].

This has implications for designers of educational applications and suggests considering other types of interaction, such as select-and-tap (where an object is first selected by touch and then moved to a destination with a second touch), over a drag-and-drop approach.

A group effect was also observed and was statistically significant (\(F_{1,26} = 4.93, p = 0.035\)), with TP = 2.64 bps for the finger-first group and TP = 2.30 bps for the stylus-first group. There was a clear advantage to beginning with finger input rather than stylus input. The most likely explanation is that participants who used the stylus first had to learn both the experiment task and a novel input method (i.e., stylus input). The training phase was intended to account for this, but because the children were young and less practiced in holding a pen or stylus, a stronger learning effect remained for the stylus. We do not anticipate this would occur with older children or adults.

4.2 Movement Time and Error Rate

Since throughput is a composite measure combining speed and accuracy, the individual results for movement time and error rate are less important. They are summarized briefly below.

The effect of input method on movement time was not statistically significant (\(F_{1,26} = 0.078\), ns), with finger and stylus input having means of 975 ms and 968 ms, respectively. This was expected, as modern capacitive touch screens respond quickly to both finger and stylus input.

The grand mean for error rate was 10.9% with means by input method of 12.6% (finger) and 9.3% (stylus). See Fig. 6. The effect of input method on error rate was statistically significant (\(F_{1,26} = 4.33, p < .05\)). This is in line with findings in other studies that did not involve complex interaction techniques such as gesture input [1, 4, 16, 26]. The implication for educators is that children carrying out low-level drag-and-drop type activities could have lower error rates when interacting with a tablet if they are provided with a low-cost stylus.

Fig. 6. Error rate (%) by input method. Error bars show \(\pm 1\) SE.

For the large target, error rates were about 8% for both finger and stylus input. For the small target, error rates were roughly double this for the stylus (15.2%) and more than double for the finger (20.6%).

4.3 Dragging Paths

Drag-and-drop, whether with the finger or stylus, is an example of “direct input”. This is in contrast to “indirect input” using, for example, a mouse or touchpad, where input involves maneuvering a tracking symbol, such as a cursor. It is no surprise, then, that the dragging paths were smooth overall, with reasonably direct movement of the dragged apple between targets. A typical example for the stylus is seen in Fig. 7a. In contrast, an atypical example for the finger is seen in Fig. 7b. In this case, there appears to be an element of play, as the participant’s drag path proceeds around the layout circle for some of the trials.

Fig. 7. Dragging paths for the stylus (top) and finger (bottom). See text for discussion.

4.4 Distribution of Drop Coordinates

An assumption when including the adjustment for accuracy, or W\(_e\), in the calculation of throughput is that the selection coordinates are normally distributed, as noted earlier. The FittsFarm software logs such coordinates for all trials in a sequence. The coordinate is the location where the acquired object – the apple – was dropped. This occurs by lifting either the finger or stylus, depending on the input method. For each sequence of trials we tested the normality assumption using the Lilliefors test available in GoStats. Of the 720 total sequences in the experiment, the null hypothesis of normality was retained for 648, or 90%, of the sequences. Thus, the assumption of normality generally holds. The results were split evenly between the finger and stylus conditions.
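As a point of reference, the same per-sequence test can be reproduced outside GoStats. A minimal Python sketch using the Lilliefors implementation in statsmodels, assuming the drop coordinates have already been projected onto the task axis as described earlier:

import numpy as np
from statsmodels.stats.diagnostic import lilliefors

def normality_retained(x, alpha=0.05):
    """Lilliefors test on the projected drop coordinates of one sequence.
    Returns True when the null hypothesis of normality is retained."""
    stat, p = lilliefors(np.asarray(x, float), dist='norm')
    return p > alpha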

4.5 Fitts’ Law Models

To test for conformance to Fitts’ law, we built linear regression models for each input method. As expected, both input methods follow Fitts’ law with \(R^2\) values \(> .9\) for the ID models and \(> .8\) for the ID\(_e\) models. Figure 8 provides examples for the ID\(_e\) models.

Fig. 8. Regression models for finger input and stylus input.
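A Fitts’ law model of the form \(MT = a + b \cdot ID_e\) can be fitted with ordinary least squares. A minimal Python sketch follows; the values are placeholders, not the study’s data:

import numpy as np
from scipy.stats import linregress

# Per-sequence means: effective index of difficulty (bits) and movement time (s).
id_e = np.array([1.2, 1.8, 2.1, 2.6, 3.0, 3.4])       # placeholder values
mt   = np.array([0.62, 0.75, 0.83, 0.95, 1.05, 1.16])  # placeholder values

fit = linregress(id_e, mt)
print(f"MT = {fit.intercept:.3f} + {fit.slope:.3f} * ID_e   (R^2 = {fit.rvalue**2:.3f})")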

5 Conclusions

The reduced error rate and improved throughput indicate that the stylus is both more accurate and more efficient than the finger for drag-and-drop tasks on tablet computers. As researchers who often visit schools to facilitate STEM activities, the authors have observed a greater tendency for children to collaborate when working next to each other on tablets. This is supported in the literature [5, 15]. It is easier for children to perform drag-and-drop interaction using a finger or stylus on a friend’s tablet than it is to take control of peripheral devices such as a mouse or trackpad. This supports collaboration when children work together or help friends with their work.

One issue is that direct finger input can mark the screen and potentially discourage children from using other people’s devices [17]. Stylus input acts as an intermediary in this context, potentially supporting more accurate, scuff-free, and efficient collaboration between children. Further work will examine the potential of stylus input in educational collaboration, but our findings indicate that stylus input is a superior choice for drag-and-drop tasks with children. Given the wide availability and low cost of styli, educators should consider providing them with tablets for school-related activities.