
Status and events: static and dynamic properties of interactive systems

Alan Dix

At time of publication: HCI Group and Dept. of Computer Science,
University of York

This paper appeared as:
A. J. Dix (1991). Status and events: static and dynamic properties of interactive systems. Proceedings of the Eurographics Seminar: Formal Methods in Computer Graphics, Ed. D. A. Duce. Marina di Carrara, Italy,
http://www.hcibook.com/alan/papers/euro91/euro91.html

See also Alan's pages on status-event analysis



Abstract

This paper analyses user-oriented properties on the basis of a static dynamic distinction. Modern graphic interfaces often emphasise an evaluative status-oriented user model, whereas their underlying implementations are event oriented. Formal models and notations will be examined as to their ability to deal with status and event based properties.



Contents

1. Overview
2. Static and dynamic invariants
2.1 A first model
2.2 Two classes of commands
2.3 Static invariants
2.4 Dynamic invariants
2.5 Summary
3. Status and events
3.1 A look back
3.2 An alarm clock
3.3 Properties of status and events
3.4 Messages and shared data
3.5 Mail-tool
4. Status inputs - mice
4.1 A simple model for status inputs
4.2 Trajectories
4.2 An example - Pull down vs. fall down menus
5. Dynamism
6. Implications
References



1. Overview

In the following, we will examine several ways to model and discuss the dynamism in a user interface. We begin with a simple state based model (section 2). This is used to look at two types of constraint on the system: static invariants and dynamic invariants. Their use in specification is demonstrated by an example of a simple graphical editor. Section 3 begins by highlighting shortcomings in the model, and this leads us on to the discussion of status and events, which is the dominant theme of the remainder of the paper. Status describes persistent phenomena which have a continually available value, whereas event describes ephemeral phenomena which occur at a specific time: events happen, status is. This distinction is used to discuss real world phenomena, human--human and human--computer interaction. In particular, it enables us to generate models which deal more adequately with mouse based input. The two pairs of concepts, static and dynamic invariants and events and status, are brought together in section 5, where we look at how well they capture the feeling of dynamism within the interface. Finally, we classify different specification and implementation notations as to their ability to express these properties.



2. Static and dynamic invariants

In this section, we will look at a simple model of an interactive system. The model is focused on the output of the system and ignores input interpretation. The two are in fact intimately connected, and typically the Map component introduced below is central to the interpretation of input as well as to the generation of output (see chapter on dynamic pointers in Dix 1991).

Over the years the York HCI group have used a range of models to describe interactive systems, focussed on different aspects of interaction. The one below is fairly typical. Some of the models deliberately do not discuss the system state, because a description solely in terms of system inputs and outputs has the best claim to be a model of the interface. However, sometimes, when we are interested in discussing structure within the system, a state representation as used below is more convenient.

2.1 A first model

This is a state based model, so we start off with a set of allowable system states. We shall call this set Sys to distinguish it from status which we will discuss later. The user has a set C of commands available at any time, and these cause the system state to change. The function describing the change we shall call doit.

    doit:   C * Sys   ->   Sys
In common with most of our models, we have a display function which gives the current graphic display associated with a particular state.

Often we can distinguish within the state components corresponding to the application itself and the additional state required to map this onto the display device. These two components we shall call Appl and Map:

    Sys   ==   Appl * Map
For example, imagine we have a simple graphics editor. An abstract view of the application would be a set of shapes, with (perhaps) a currently selected shape. (In fact, one would probably also have hidden shape identifiers, but that merely complicates the exposition.)
    Appl   ==  { shapes: power Shape,  selection:  Shape }
These shapes would include information as to what sort of shape they were, position and dimensions.
    Shape   =      LINE    start_pos: Coord   end_pos: Coord
                |  CIRCLE  centre: Coord   radius: Real
                   ...
The coordinate system used in the application system is unbounded, shapes can be of any size and at any position.

To portray these shapes on a bounded raster screen, we need to have a window into this unbounded coordinate system. This window's coordinates and scale would need to be in the Map component:

    Map   ==  { offset: Coord   scale: Real }
The display function can then use this to fit the desired window into the available space, clipping and scaling the image as appropriate. It might also put 'handles' at critical points of the selected shape. The display function also would need to deal with the complicated issues of pixel mappings for lines and circles. Happily, for most application programmers, these issues have already been dealt with by graphics libraries. Similarly when specifying such a graphics system, one is unlikely to worry about these lowest level details.
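The components introduced so far can be written down as plain data types. The following is only a minimal sketch under stated assumptions: the field names follow the formulae above, but reading scale as "application units per screen unit", the helper to_screen, and the toy display (which returns screen-space key points and does no clipping, handles or rasterising) are illustrative choices, not part of the model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple, Union

Coord = Tuple[float, float]

@dataclass(frozen=True)
class Line:
    start_pos: Coord
    end_pos: Coord

@dataclass(frozen=True)
class Circle:
    centre: Coord
    radius: float

Shape = Union[Line, Circle]

@dataclass(frozen=True)
class Appl:
    shapes: FrozenSet[Shape]
    selection: Optional[Shape]  # assumption: "nothing selected" is allowed

@dataclass(frozen=True)
class Map:
    offset: Coord
    scale: float  # assumption: application units per screen unit

@dataclass(frozen=True)
class Sys:
    appl: Appl
    map: Map

def to_screen(p: Coord, m: Map) -> Coord:
    """Hypothetical window transform: shift by the offset, then divide by the scale."""
    return ((p[0] - m.offset[0]) / m.scale, (p[1] - m.offset[1]) / m.scale)

def display(sys: Sys) -> List[tuple]:
    """Toy display: one tuple of screen-space key points per shape."""
    out = []
    for s in sorted(sys.appl.shapes, key=repr):  # sorted only to make the result deterministic
        if isinstance(s, Line):
            out.append(("LINE", to_screen(s.start_pos, sys.map), to_screen(s.end_pos, sys.map)))
        else:
            out.append(("CIRCLE", to_screen(s.centre, sys.map), s.radius / sys.map.scale))
    return out
```

Keeping Appl free of any screen coordinates is the point of the split: everything device dependent is confined to Map and display.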

2.2 Two classes of commands

Having divided the state Sys into two components, we can see that commands can be divided according to their effect upon these components. We can think of the two classes of commands as presentation directed commands and application directed commands.

The presentation directed commands affect only the Map component. That is, commands c such that:

    forall sys, sys' in Sys     sys' = doit(c,sys)    =>    appl' = appl
Note that I have taken the notational liberty of letting appl and appl' represent the application components of sys and sys' respectively. (A convention borrowed from high level specification notations like Z and Cobol)

An example of such a command for the graphics editor described above would be zoom_in defined as follows:

    let  sys'  =  doit(zoom_in,sys)
    then
        appl'    =   appl
        offset'  =   offset
        scale'   =   scale / 1.10
This moves the viewpoint 10% closer to the graphics plane. We could similarly define zoom_out and also commands for changing the current offset.
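As a sketch of how such a command might look in code, the following treats the application state as opaque and checks the defining property of a presentation directed command over a finite sample of states. The names ZOOM_STEP and is_presentation_directed are hypothetical, and zoom_out is assumed here to be the inverse scaling.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Tuple

@dataclass(frozen=True)
class Map:
    offset: Tuple[float, float]
    scale: float

@dataclass(frozen=True)
class Sys:
    appl: object  # the application state is opaque to presentation commands
    map: Map

ZOOM_STEP = 1.10  # the 10% step used in the definition of zoom_in

def zoom_in(sys: Sys) -> Sys:
    """appl' = appl, offset' = offset, scale' = scale / 1.10."""
    return replace(sys, map=replace(sys.map, scale=sys.map.scale / ZOOM_STEP))

def zoom_out(sys: Sys) -> Sys:
    """Assumed inverse of zoom_in: scale' = scale * 1.10."""
    return replace(sys, map=replace(sys.map, scale=sys.map.scale * ZOOM_STEP))

def is_presentation_directed(cmd: Callable[[Sys], Sys], states: Iterable[Sys]) -> bool:
    """Check appl' = appl on the given sample of states (a test, not a proof)."""
    return all(cmd(s).appl == s.appl for s in states)
```

A check over sample states can only refute the property; the universally quantified formula remains the specification.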

These commands are mainly navigational. They change one's view of the application objects, but do not do any 'real' work. That is not to say they are unimportant, quite the reverse, but simply that they are not working at the level of the abstract application. Depending on how we divide up an application, more or less would live in the Map component. For instance, we might have chosen to put the selected shape into the map component as it is not part of the final graphics object, merely a tool for manipulating it. Also we may view the system on more than two levels, the classical number for such separations being three. (Computer scientists not knowing about pi).

We turn now to the application directed commands. These correspond to some action at the abstract application level. Each such command affects principally the Appl component. In reality the interpretation of commands is affected by the Map component; however, as we noted above, the intention at this stage is to ignore these parsing issues, so we will imagine the commands that affect the application state to be a subset C' of the top-level commands. There will be a state transition function action that acts at this level:

    action:  C' * Appl   ->   Appl
The doit function associated with these commands will perform the required action on the underlying application. It would be nice to say that for such commands the Map component was unchanged. This would make a very crisp distinction between the commands at the surface presentation level and those at the abstract application level. Unfortunately this is rarely possible, because most systems have a set of design constraints. For instance, we may want the currently selected shape always to be visible on screen. The application action may have been to change the selected shape, and thus the map would need to be changed to one that included the newly selected shape.

The general form of such a constraint is:

    invariant:  Sys * Sys  ->  Bool
or 'exploded' so that we can see the system components
    invariant:  Appl * Map * Appl * Map  ->  Bool
The map component of an application directed command must therefore be changed so as to be consonant with these invariants:
    forall c in C', sys, sys' in Sys
        let     sys'  =  doit(c,sys)
        then
            appl'  =  action(c,appl)
            map' is chosen to satisfy invariant(appl,map,appl',map')

However, we can divide such constraints into two broad classes:

2.3 Static invariants

The general form of an invariant includes both components of the old and new state. However many of the constraints apply only to one state at a time. For instance, the example constraint above, that the selected object be visible on the display concerns only the current state. Such invariants can be put in the simple form:

    invariantstatic:  Appl * Map   ->   Bool
These constraints are usually concerned with consistency properties of the interface. They often ensure that the visual presentation is a true and faithful representation of the actual state of the system.

Note that this form of invariant must be satisfied by all states of the system, both those generated by application directed commands and those generated by presentation directed commands.

    forall c in C,  sys,  sys' in Sys
                invariantstatic(sys)  and  sys' = doit (c,sys)
                        =>  invariantstatic(sys')
Note two things from this formula. First that it ranges over all commands, not just application directed ones from C', as we demanded above. Also that we only require the doit function to preserve the static invariant: if it is given a bad state to start with we should not necessarily demand it produces a good one in reply.

Note also that application directed commands typically preserve the invariant by altering the mapping. Presentation directed commands would not normally affect the application state. They would normally maintain the constraint by simply ignoring (or treating as errors) commands which would otherwise violate the constraint. That is, for each such command we would have a normal state transition function for the Map component, normal(c,map); the full doit function would then be defined for a state sys as follows:

    let   map' = normal(c,map)
    and   sys' = (appl,map')
    then
        doit(c,sys) =  sys'    if   invariantstatic(appl,map')
                    =  sys     otherwise
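The guarded definition above translates almost directly into a higher-order function. This is a sketch on a deliberately tiny one-dimensional system: the application state is just the position of the selected shape, the map is an (offset, width) window, and the static invariant is "the selection is visible". The wrapper name guarded and the command scroll_right are hypothetical.

```python
from typing import Callable, Tuple

Appl = float               # toy application state: position of the selected shape
Map = Tuple[float, float]  # toy map: (offset, width) of a 1-D window
Sys = Tuple[Appl, Map]

def invariant_static(appl: Appl, map_: Map) -> bool:
    """The selected shape must lie inside the window."""
    offset, width = map_
    return offset <= appl <= offset + width

def guarded(normal: Callable[[Map], Map]) -> Callable[[Sys], Sys]:
    """Build doit for a presentation command from its unguarded map transition."""
    def doit(sys: Sys) -> Sys:
        appl, map_ = sys
        new_map = normal(map_)
        # Accept the new map only if the static invariant still holds;
        # otherwise ignore the command and return the old state.
        return (appl, new_map) if invariant_static(appl, new_map) else sys
    return doit

scroll_right = guarded(lambda m: (m[0] + 5.0, m[1]))
```

Here scroll_right((7.0, (0.0, 10.0))) moves the window, whereas scroll_right((3.0, (0.0, 10.0))) leaves the state untouched because the selection would fall off screen.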

2.4 Dynamic invariants

In contrast, dynamic invariants necessarily involve successive states. They specify some form of continuity between states. An example of a dynamic invariant is display inertia: that is some sort of condition minimising change between successive displays. The simplest such invariant for the graphics editor would be

    map  =  map'
However, as we saw above, such a constraint cannot always hold. This is a constant problem with dynamic invariants. We usually want a condition of the form:
    if possible   map  =  map'
The "if possible" means if the static invariants allow, as it is almost always the case that static invariants take precedence over dynamic invariants.

Assume that we have specified an amended application component appl' and are trying to find a suitable new map component. We can then formalise the "if possible" condition as:

    exists map' in Map   st   invariantstatic(appl',map')
                              and  invariantdynamic(appl,map,appl',map')

If there is no possible map component satisfying both the static and dynamic invariant, then we drop the dynamic invariant completely. However, in such cases we usually have another weaker invariant up our sleeve:

    if possible   invariant1
    otherwise     invariant2
and so on, with weaker and weaker invariants. For instance, in the graphics editor we may want to preserve the zoom factor even if we have to move the position of the window:
    if possible   map    =  map'
    otherwise     scale  =  scale'
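One way to read the "if possible ... otherwise ..." cascade operationally is as a search over candidate maps: discard those that break the static invariant, then try the dynamic invariants from strongest to weakest. The sketch below does this for a one-dimensional window. It makes several assumptions not in the text: the candidate set is finite and supplied by the caller, SCREEN is a made-up screen width, and the dynamic invariants are simplified to compare only the old and new maps.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Map = Tuple[float, float]  # (offset, scale) of a 1-D window
SCREEN = 100.0             # assumed screen width in screen units

def invariant_static(appl: float, m: Map) -> bool:
    """The selected position must be visible in the window described by m."""
    offset, scale = m
    return offset <= appl <= offset + SCREEN * scale

Dynamic = Callable[[Map, Map], bool]

def same_map(old: Map, new: Map) -> bool:    # map = map'
    return old == new

def same_scale(old: Map, new: Map) -> bool:  # scale = scale'
    return old[1] == new[1]

def choose_map(appl_new: float, old: Map, candidates: Sequence[Map],
               dynamics: List[Dynamic]) -> Optional[Map]:
    """Pick map' for the new application state appl_new."""
    ok = [m for m in candidates if invariant_static(appl_new, m)]
    for dyn in dynamics:          # strongest dynamic invariant first
        for m in ok:
            if dyn(old, m):
                return m
    return ok[0] if ok else None  # all dynamic invariants dropped
```

Static invariants take precedence because they filter the candidates before any dynamic invariant is consulted; the order of the dynamics list encodes the "otherwise" chain.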

We do not expect presentation directed commands to preserve dynamic invariants, as their purpose is precisely to alter the map component. However, the specification of these commands may well make use of some of the same formulae as the weaker dynamic invariants.

2.5 Summary

We have used the simple model to differentiate two components of the system state, the application and the presentation, and two classes of constraint, static and dynamic invariants.

In general we found that the application has precedence over the presentation, in that commands directed at the application may need to change the map component in order to maintain constraints, but not vice versa. Also we found that static invariants take precedence over dynamic invariants, in that the dynamic invariants may have to be relaxed in order to find possible consistent states. Perhaps the reader can think of situations where one or other of these precedences is broken.

The first of these general rules is a reflection of the fact that no matter how pretty the interface it is the application which is the purpose of the interaction. The second stresses the importance of maintaining consistency. To characterise the two, we could say that static invariants are about the safety or correctness of the interface, whereas dynamic invariants are much more connected to feel and aesthetics. However, although correctness is important and determines the usefulness of the system, the feel and dynamism of the interface determines its usability.



3. Status and events

3.1 A look back

The model above was chosen to reflect some of the static and dynamic properties of the user interface. However, the reader may have noticed that the commands defined for the example 'graphic' interface did not include mouse based commands such as dragging the shapes about. This was for the simple reason that although defining such commands is possible in such a model, it is by no means easy, and certainly is not natural. Why is this?

Let's look at the model at the coarsest level. The user's inputs are a sequence of commands c_1, c_2 ... the system is then left in a series of states s_0, s_1 ... and the user sees a sequence of displays d_0, d_1 ... These are related by the model:

    s_{i+1}  =  doit(c_i, s_i)
    d_i      =  display(s_i)
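Read as a program, the pair of equations above is a fold over the command sequence followed by a map over the resulting states. A minimal sketch, with a hypothetical helper run and with doit, display and the initial state supplied by the caller:

```python
from itertools import accumulate
from typing import Callable, Iterable, List, Tuple, TypeVar

S = TypeVar("S")  # system states
C = TypeVar("C")  # commands
D = TypeVar("D")  # displays

def run(doit: Callable[[C, S], S], display: Callable[[S], D],
        s0: S, commands: Iterable[C]) -> Tuple[List[S], List[D]]:
    """Return every state reached (starting with s0) and the display of each."""
    # accumulate passes (state so far, next command); doit expects (command, state).
    states = list(accumulate(commands, lambda s, c: doit(c, s), initial=s0))
    return states, [display(s) for s in states]
```

For instance, with a counter as the system, run(lambda c, s: s + c, str, 0, [1, 2, 3]) yields the states [0, 1, 3, 6] and the displays ["0", "1", "3", "6"]. Note that the commands are consumed one at a time and then gone, while every state has a display that could be consulted.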

If we looked at the user we could imagine the user looking at the display planning what to do next, and then issuing the relevant command. One could even write down a series of similar equations for the user:

    brain_{i+1}  =  think(d_i, brain_i)
    c_i          =  plan(brain_i)
It is precisely the goal of many cognitive models and in particular programmable user models to specify these user functions. Personally I would take any attempts to specify the functions with a pinch of salt, however, that is not really pertinent to this paper!

The purpose of writing the above is to notice an apparent symmetry between the behaviour of the system and the user. The one takes commands and produces displays, the other takes displays and produces commands. I say apparent because there is in fact a profound difference between the nature of the commands and the display. Imagine we were timing the interaction. The commands would have a precise time associated with them. A moment before and a moment after there is no record, except in so far as they have affected the system's state. The display however (and the state for that matter) persists over a period.

These two classes of phenomena, the ephemeral and the persistent I will call events and status. Events happen at a precise moment, status just is. I will classify a phenomenon as a status if it is possible at any time to consult it and obtain a value. A status does not have to be static however, it may change from time to time (like the display in the above model) or continuously (e.g. the fuel indicator in a car). Events on the other hand may carry some sort of value (e.g. the particular command above), but that value is only present for the short duration of the event. However events not only happen themselves, they typically make other things happen. They are active. Stati on the other hand are relatively passive, determining action, but rarely provoking it.

Now looking back on the model above, we can classify it as an event-in/status-out model, or simply event-status. Is this intrinsic to interaction with computers, or merely a feature of the model? We shall answer this later (section 4), but first we need to look at status and events in some detail, examining their properties and interactions. In fact, we shall see that this distinction between status and events is far from clear cut, and that it is precisely the interconnections between the two that make them interesting concepts.

3.2 An alarm clock

I'm typing this paper one Saturday night, I look up at the clock, half past nine - that time already, I'll take a break when the News comes on the TV. A little later I notice that it's ten o'clock, time for the News. After watching the news, I do a bit more work, I look again at the clock, it's quarter past eleven, perhaps I'd better go to bed. I set the alarm clock for 7:15 in the morning and go to sleep. Brrrr..rrrr - the alarm wakes me, I blink my eyes for a few moments, and the alarm rings on. Eventually, I switch off the alarm and get up.

Let's look at the alarm clock. We shall ignore its inputs (winding and setting the time) and concentrate on its output. There are two distinct channels: the clock face and the alarm bell. At first sight they are clear examples of status and event outputs. The clock face continually shows the current time. At any moment I may glance at it to find what time it is. On the other hand the alarm bell is an event. At a particular time it rings and wakes me. Before and after that moment it is silent. However, if we look a little more closely things are more complicated.

When I first looked at the time I was using the clock in its status role. However, when I noticed that it was time for the News its role was subtly different. The status of the clock face had reached a critical value and I interpreted this as an event. In fact this is not very different from the alarm waking me in the morning. The major difference is that for the latter I delegate the task of noticing the critical time to the alarm clock. In order to notice when the News started I had to constantly monitor the clock (and therefore be awake).

The third case was different again. I noticed that it was getting late and went to bed. Going to bed is clearly an event, but was this triggered by the time? Presumably, there would have been a whole range of times which I would have considered "late": twenty past eleven, ten past eleven, midnight! The event of me going to bed was the combination of the event of me looking at the clock with the current status at the moment when I looked.

Let's turn our attention to the alarm bell. It rang at a specific moment, and hence is an event. However, it took me a while to respond. During that time the alarm rang. Hence, we can think of it as a two valued status. Not ringing or ringing. If we look at it with respect to the task of going to bed and getting up in the morning, the alarm bell is an event. However, if we look at the process of waking up and getting out of bed, the alarm is better seen as a status, the important events being when it begins ringing and (most important) when I finally turn it off. By similar analysis, although we normally interpret the clock face itself as a status, at a fine enough time scale we would see the hands move in little jerks.

Now imagine that I had set the alarm to the correct time, but forgot to turn it on. Now in the morning at 7:15, the hands of the clock sit at quarter past seven, somewhere a few levers click into place as the alarm time is reached, but because the bell is turned off, no ringing is heard and I go on sleeping. A similar situation would arise if I had simply slept through the ringing of the alarm. In both these scenarios there was an event in the alarm clock of the time 7:15 being reached, but no event for me.

Of course, there is a lot more we could say about status and events, in relation to the alarm clock: for instance - Oh! is that the time, I'd better be getting to bed ...

3.3 Properties of status and events

It's another day now. We can look back at the alarm clock example from the previous section and draw out some lessons that apply generally to status and events. In particular, we will see a strange intertwining of the two.

The primary job of a clock is to tell the time. That is, it is a status indicator. The time is always shown, but it is usually interpreted in connection with an event. For instance, when I look at the clock, that is an event. It is the time shown at the instant of that event which is important to me. The status at the instant of an event may determine the future course of events (and stati) as, for instance, when I noticed it was quarter past eleven and I went to bed.

The alarm clock has the additional job of ringing when a certain time is reached. That is the bell is an event which signals a certain critical change in status. Similarly, when I noticed the time was ten o'clock I interpreted this change in status as an event. In general, one finds that changes in status often give rise to events. Notice that these status change events do not conflict with the interpretation of the time as a status, but are derived from the nature of status.
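The idea that an event can be derived from a status reaching a critical value can be sketched as edge detection over a sampled status. This is an illustration only: the function name status_change_events is hypothetical, the status is assumed to be observed as discrete (time, value) samples, and times are minutes since midnight.

```python
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")

def status_change_events(samples: Iterable[Tuple[int, T]],
                         critical: Callable[[T], bool]) -> Iterator[int]:
    """Yield each sample time at which `critical` switches from False to True."""
    was_critical = False
    for time, value in samples:
        is_critical = critical(value)
        if is_critical and not was_critical:
            yield time  # the status has just crossed into the critical region
        was_critical = is_critical

# The clock face as a status sampled once a minute, from 7:10 to 7:19.
clock_face = [(t, t) for t in range(7 * 60 + 10, 7 * 60 + 20)]
alarm = list(status_change_events(clock_face, lambda now: now >= 7 * 60 + 15))
```

Here alarm holds the single time 435 (7:15): the status is available at every sample, but only the crossing is an event. The same function applied to the two valued status of the bell (ringing or not) picks out the moments at which ringing starts.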

The first interaction, the interpretation of status at the instant of an external event, can be seen as a synchronous effect of the status on the future behaviour of the system. Although there are inevitable gaps between them, there is a direct causal link between me looking up at the clock, and then me going to bed. On the other hand, status change events allow an asynchronous effect of the status, in that it causes an event, in essence from nowhere. Obviously, if we looked at the movements of the cogs and springs (or the oscillation of the quartz) there were events that gave rise to the alarm ringing, however, at the granularity of interest, there was no causal event.

This takes us to the important area of granularity. Depending on our perspective we may view the same phenomenon as status or event. A major determinant of this subjective distinction is the time-scale or temporal granularity at which we view the phenomenon. This time-scale is itself dependent on the task which the observer wishes to achieve. For instance, when the task was 'getting me up in the morning' the associated time-scale was hours, and the alarm bell was interpreted as an event. However, in the morning, when the task was actually getting out of bed, the time-scale was seconds (or minutes) and the alarm bell took on the nature of a status. Similarly, for most tasks the clock face is interpreted as a status, however, if we were using the jerks of the second hand as a metronome we would view the clock as a series of 'tick' events.

Now differences in task and granularity may make a difference as to whether we regard a phenomenon as status or event. However, even if we agree on an event interpretation, different observers may experience what is causally the same event at different times, or perhaps not experience an event at all. That is, we must distinguish an objective event from the perceived event of a particular observer. Consider the alarm clock again. At 7:15 the mechanism detected that it was time to ring the bell, that is, an event occurred for the alarm clock. When the alarm rang I woke up and the objective event of the alarm ringing became a perceived event for me, possibly delayed somewhat as I was roused. If however, I had slept through the alarm, or forgotten to set it, the event for the alarm clock would still have occurred but there would have been no corresponding event for me. Similarly, imagine the clock has stopped in the middle of the night. In the morning 7:15 would still have come, objectively the event of 7:15 - my waking time - would have arisen, however, this time there would have been no corresponding perceived event for the alarm clock! One could summarise the task of the alarm clock as ensuring that specified objective status change events in time become perceived events for the user. The same task occurs in computer systems, especially in the domain of computer mediated communication.

3.4 Messages and shared data

Now the distinction between status and events has been introduced in order to study human--computer interactions. It was interesting to find that a very similar distinction arose quite independently as part of the study of computer supported human--human communications. Although this is somewhat a side issue for this paper, it furnishes some further interesting examples of the interplay between status and events. It is of course not surprising that these different forms of multi-party interaction have somewhat similar features.

Some while ago, on first examining computer mediated communication with an eye to examining properties within a formal framework, several distinctions became obvious; of these perhaps the most elusive turned out to be the distinction between communication via messages and shared data. Most systems fall into one camp or other. On the messaging side we have the traditional email systems, now perhaps with added features for document exchange, auditory and visual messages, conversational protocols etc. On the shared data side, we have bulletin boards, shared hypertexts and databases, screen sharing, multi-person editors and electronic white-boards (Yoder et al. 1989, Donahue and Widom 1986, Stefik et al. 1987).

Now many organisations spend a lot of effort taking information supplied as messages and turning it into shared data; e.g. filing memos and letters, taking minutes of meetings. So as a medium of information transfer messages seem somewhat superfluous. However, messages are clearly important, so what precisely do messages achieve over and above information transfer? Various determining features were considered which are important, but which didn't really capture the essence of a message. It took me a considerable time to decide, what is obvious now in the context of this paper, that what chiefly distinguishes messages from shared data is that messages give rise to events.

There are of course differences in emphasis when studying human--human and human--computer interaction, but many of the same issues arise: e.g. task granularity and the importance of messages becoming salient to the recipient (that is message events becoming perceived events). Of particular interest is the way that users of electronic communication media use a system of one kind (shared data or messaging) when their task demands one of the other. For instance, people coauthoring a paper may email copies back and forth. They may use sophisticated control mechanisms to cope with the problems of multiple edits to the same areas of the document, akin to those in a shared database. In essence they use the messaging medium to implement shared data. On the other hand, users of a shared database have been known to allocate specific areas as effectively mailboxes where they leave messages for one another.

Of course, these complex social protocols are themselves a sign that something is wrong with the underlying mechanism. If they are insufficient to mask the mismatch between task and medium some sort of breakdown will occur. Almost certainly that was the initial experience of those who developed these protocols. There are similar single-user situations. Consider again clocks. My wristwatch is not supplied with an alarm feature (in fact I don't have a wrist watch, but imagine). I have a meeting which starts at two o'clock. I try to keep an eye on the time, but miss the appointed time and arrive late. There is a breakdown because my task demands an event but my watch only supplies status. After repeatedly missing meetings, I may buy myself a wristwatch with an alarm. Alternatively, I may use a coping mechanism and get into the habit of frequently consulting my wristwatch. If this becomes unconscious enough I may not even notice the disparity.

The fact that people are so good at coping with bad interfaces is of course no excuse for supplying one in the first place!

3.5 Mail-tool

Another interesting tale of the way status and events interact comes from the single-user interface to the mail-tool on my workstation. The mail-tool itself is portrayed as an icon in the shape of an (American) mailbox. When someone on the local network sends me mail the effect is as follows. There is an event - the person is sending me mail - this event makes the posting program append the message to a special file, my mailbox file. That is, the event has triggered a change in the status of the file-system. The mail-tool periodically examines my mailbox file, but to avoid causing too much of a computational overhead this is only done every few minutes. However, at some point the mail-tool notices the mailbox file has changed. At this point the status change has become a perceived event for the mail-tool. The mail-tool responds to this event by changing its icon: a little flag is raised and a letter is drawn poking out of the icon's mailbox. Thus the perceived event for the mail-tool has become a change of status on my workstation's screen. I never notice this event as it happens as I have the mail-tool icon positioned at the top of the screen, outside my focus of attention. Periodically I look up at the icon, either deliberately, or just as I move my eyes over the screen. At this point I notice that the mail is there, and the status change in the screen has become a perceived event for me.

This story of layers of events and status is repeated all over computer systems and indeed social systems. Even deep in the machine we see things like interrupts (events) being responded to by setting flags (status) which are then interpreted by a running program when it is appropriate (perceived event). This paper is focusing on events and status as they are perceived by the user, but these lower levels of computation are interesting as they show very similar behaviour to that we see at the human interface. Further, the abstractions used in the implementation and specification often 'bleed' through to the interface.
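The chain from objective event, through stored status, to a delayed perceived event can be illustrated with a small simulation of a poller. It is a sketch under simplifying assumptions: time is measured in integer ticks, each arrival extends a stored status (the mailbox), and the poller looks at that status only every poll_every ticks. The function name perceived_times is hypothetical.

```python
from typing import List, Sequence, Tuple

def perceived_times(arrivals: Sequence[int], poll_every: int,
                    horizon: int) -> List[Tuple[int, int]]:
    """Return (arrival time, time noticed) pairs for a periodic poller."""
    pending = sorted(arrivals)
    seen = 0  # how many items the poller had already noticed at its last look
    noticed = []
    for now in range(0, horizon + 1, poll_every):
        # The status at this instant: everything delivered so far.
        delivered = [a for a in pending if a <= now]
        # Anything beyond what was seen last time is a perceived event now.
        for arrival in delivered[seen:]:
            noticed.append((arrival, now))
        seen = len(delivered)
    return noticed
```

With arrivals at ticks 1, 4 and 5 and a poll every 3 ticks, perceived_times([1, 4, 5], 3, 9) gives [(1, 3), (4, 6), (5, 6)]: every objective event is eventually perceived, but late, and two distinct arrivals are perceived at the same instant. Stacking a second poller on the output of the first would model my own periodic glances at the icon.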



4. Status inputs - mice

We now move back, from alarm clocks and communication, solidly to individual computer interfaces. We began our discussion of status and events by noticing the disparity between input and output in the formal model. It was an event in - status out model. This implies that the user is expected to be acting in a status-event mode, seeing status on the screen and then enacting events at the keyboard. All four types of interaction are possible: event-status, status-event, event-event and status-status, as of course are mixed interfaces.

The dominance of the computer screen on terminals and workstations makes it natural to think of the output as being mainly status. However, auditory interfaces (telephone interfaces or interfaces for the visually impaired) are event-event. Even most terminals can bleep at you, flash their screens or put up preemptive message boxes, all of which form an event output.

On the input side, keyboards are (with the exception of the shift keys) clearly an event input, but mouse position is more of a status. The formal model presented earlier betrays clearly its roots in keystroke and screen interfaces. This distinction between devices can sometimes be seen at the program level. For instance, GKS distinguishes event from sampling devices. Inputs from event devices are presented to the programmer as events to deal with, whereas sampling devices are read by non-blocking polling to obtain their current value. Polling always betrays a status model of the device: by polling one assumes that there is always a value to be read, and further that previous, unsampled values are unimportant. Many windowing systems today have a purely event based model of input: not only are keystrokes and mouse-buttons treated as events, but so also is the mouse position itself. Sometimes position is associated with other events, such as "click at (x,y)", which is consonant with a status interpretation. However, even mouse movement gets decomposed into events "move(delta_x, delta_y)". At a very low level, we will often find that the mouse is detected as a sequence of events from micro-switches or optical devices, but that the keyboard controller actually polls the keyboard.
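The difference between the two views of a device can be sketched as follows. This is a minimal illustration in Python, not GKS's actual programming interface: the classes EventDevice and SampledDevice and the function position_from_moves are hypothetical names introduced here.

```python
# Illustrative sketch only: these classes are hypothetical, not the GKS API.
from collections import deque


class EventDevice:
    """Event input: every occurrence is queued and must be dealt with."""

    def __init__(self):
        self.queue = deque()

    def post(self, event):
        self.queue.append(event)

    def next_event(self):
        return self.queue.popleft() if self.queue else None


class SampledDevice:
    """Status input: there is always a current value; old values are lost."""

    def __init__(self, value):
        self._value = value

    def set(self, value):
        self._value = value

    def sample(self):
        return self._value


def position_from_moves(start, moves):
    """Recover a status (position) from a stream of move(dx, dy) events."""
    x, y = start
    for dx, dy in moves:
        x, y = x + dx, y + dy
    return (x, y)


keys = EventDevice()
for ch in "abc":
    keys.post(("key", ch))

mouse = SampledDevice((0, 0))
for pos in [(1, 0), (2, 1), (5, 3)]:
    mouse.set(pos)   # intermediate values are never seen by a late sampler

typed = [keys.next_event() for _ in range(3)]
final = position_from_moves((0, 0), [(1, 0), (1, 1), (3, 2)])
```

The event queue preserves every keystroke in order, whereas sampling the mouse yields only its latest value. Summing the move events reconstructs the same position, which is why an event based implementation can still present a status to the user.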

When we specify a system however, the way the devices are viewed by the implementation is not the issue - we want to know how the device feels to the user. At this level, the mouse is, I would argue, perceived as a continuously varying parameter under direct control of the user. That is, it is a user controlled status. At this point, we might want to distinguish between the mouse in the user's hand, and the mouse on the screen. However, if I have to make this distinction then the user is already in a breakdown situation; so I will assume the user has made the appropriate mapping and feels in direct control of the screen mouse position.

Modern graphical interfaces are often dominated by mouse-based interaction. Of course, events occur: clicks on buttons, picking up icons, dropping them, menu selections, but the principal mode of interaction is status-status. Command and keystroke interfaces were predominantly event-status based. The corresponding cognitive model was that of goal-plan-evaluate. Status-status interfaces still have this planning cycle at a low level, but the dominant model is now one of goal seeking.

If status input is so important in graphical interfaces, we should include it in our formal and informal models.

4.1 A simple model for status inputs

We will look at a simple model of the way status and event inputs affect the system's internal state. By this I mean the whole internal state of the system; application and interface. This model has some implicit restrictions which we will discuss in succeeding sections, but will serve to make some immediate distinctions. These restrictions in fact turn out to be as important as what the model does describe.

So as to distinguish this state of the system from the status inputs we will refer to the set of possible system states as Sys, and the set of status inputs as Pos. This is because most of the time we will be thinking about positional devices such as mice. However, much of the discussion will be equally valid for discrete status devices such as shift keys. The set of event inputs we refer to as Ev, and it includes both keyboard events and mouse buttons.

When there are no event inputs, the display at any moment is a function of current system state (from Sys) and the current status inputs (from Pos). For instance, when we are simply moving the mouse around, the display has a mouse cursor which depends on the current position of the mouse. Similarly when dragging an icon, the position of the icon depends on the current mouse position. This function which yields the current display (say from a set D) we shall call (not surprisingly) display.

    display:    Sys * Pos   ->   D

When an event occurs, the system state will change. In general, the new state depends not just on which event occurred, but also on the current status inputs. When we click a mouse button, for example, the item selected is the one under the current mouse cursor. This state transition function we shall call doit.

    doit:    Sys * Ev * Pos   ->   Sys
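To make the two signatures concrete, here is a minimal sketch in Python. The toy system (a set of named icons with at most one selected) and the event names "click" and "deselect" are assumptions made for illustration; only the shapes of display and doit come from the model above.

```python
# Illustrative sketch only: the toy system and its event names are assumptions.
from typing import NamedTuple, Optional, Tuple

Pos = Tuple[int, int]          # status input: the mouse position
Ev = str                       # event input, e.g. "click" or "deselect"


class Sys(NamedTuple):
    """System state: icons with their positions, plus the current selection."""
    icons: Tuple[Tuple[str, Pos], ...]
    selected: Optional[str]


def icon_at(sys: Sys, p: Pos) -> Optional[str]:
    for name, where in sys.icons:
        if where == p:
            return name
    return None


def display(sys: Sys, p: Pos) -> dict:
    """display: Sys * Pos -> D. No state change, just the current picture."""
    return {"icons": dict(sys.icons), "selected": sys.selected, "cursor": p}


def doit(sys: Sys, e: Ev, p: Pos) -> Sys:
    """doit: Sys * Ev * Pos -> Sys. The status is read when the event occurs."""
    if e == "click":
        return sys._replace(selected=icon_at(sys, p))
    if e == "deselect":
        return sys._replace(selected=None)
    return sys  # unknown events leave the state unchanged


s0 = Sys(icons=(("doc", (1, 1)), ("bin", (4, 4))), selected=None)
s1 = doit(s0, "click", (4, 4))
```

Moving the mouse changes only the result of display; the state s0 is left untouched until an event such as "click" is passed to doit together with the position at that instant.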

Now in general the transition is a function of both the event and the current status. Such was the case when I looked at the clock and decided to go to bed, or if we click a mouse button when the mouse is over a menu item. However, for many events the status is immaterial. We can call such events status independent, or in the common case where the status is a mouse position or other pointing device we can say position independent.

    forall p, p' in Pos  :    doit(sys,e,p)  =  doit(sys,e,p')
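The condition above can be checked mechanically over finite samples of states and positions. The following Python sketch assumes a tiny text editor whose state is a pair (text, insertion point) and whose events are "click" and "key:<char>"; these details are hypothetical. A check over samples can refute position independence but cannot prove it for all states.

```python
# Illustrative sketch only: the tiny editor state and event names are assumptions.
from itertools import product


def doit(sys, e, p):
    """sys is (text, insertion_point); p is a mouse position along the text."""
    text, point = sys
    if e == "click":                       # position dependent
        return (text, min(p, len(text)))
    if e.startswith("key:"):               # position independent
        ch = e[4:]
        return (text[:point] + ch + text[point:], point + 1)
    return sys


def position_independent(e, states, positions):
    """Check doit(sys, e, p) == doit(sys, e, p') for all sampled sys, p, p'."""
    return all(
        doit(sys, e, p) == doit(sys, e, q)
        for sys in states
        for p, q in product(positions, repeat=2)
    )


states = [("", 0), ("abc", 1), ("abc", 3)]
positions = range(5)
```

With these samples, position_independent("key:x", states, positions) holds while position_independent("click", states, positions) does not, matching the informal distinction between typing and selecting a text entry point.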

Typically, most keyboard commands are position independent. I might click a mouse button to select a text entry point (a position dependent event), but thereafter my typing is all aimed at that text cursor, which is a component of the system state and can largely ignore the mouse cursor. In fact, this is all to the good, for while I type my hand is off the mouse and I am no longer in control of it. If the mouse flex is stretched under tension it may move spontaneously when I let it go, and I would be worried if my typing appeared at odd places as the mouse sprung back across the screen. At a rather gross level however, some windowing systems use the position of the mouse cursor to determine which window keyboard commands are directed to. In such circumstances, the stretched flex may indeed make my typing suddenly jump from one window to another. The large size of windows reduces the effect of this problem, and so even on these systems, to a degree of approximation, the keyboard commands are position independent. (We can in fact make this approximation explicit as a weaker principle of region dependence, see (Dix 91).)

It would be nice to capture this behaviour into a principle of design: when an event is position dependent the user should be actively in control of the mouse. Such a principle would be heavily dependent on the task and user, and would also be somewhat self-fulfilling. If a word-processor always entered text at the mouse cursor position, users would soon get into the habit of typing with one hand whilst firmly holding the mouse with the other!

We can be fairly certain that the user has a firm and conscious grip on the mouse when the event is generated by a mouse button. Selecting the text entry point in the word processor is just such a case. That is, we want a principle of physical proximity whereby if an event is position dependent, the transducer that causes the event must be physically close to the status transducer, and further be cognitively associated with it by the user.

Such a principle may be regarded as too strong and perhaps the physical proximity should be downgraded in deference to the cognitive proximity. Particularly awkward cases arise if we consider, for instance, whether mouse button actions should be affected by the keyboard shift key. As a personal preference, I find it very odd that single button mice are advocated for reasons of simplicity, but then functionality is added by using additional keyboard chords.

4.2 Trajectories

So, to the limitations of the above model. Notice that the system state transitions depend only on the value of the status input at the instant the events occur. The intermediate values of the status input affect the ephemeral display, but have no lasting effect. This behaviour I call trajectory independence and is the only behaviour possible under the model.

For a mouse or other pointer device, its trajectory is the path it has taken across the screen. We can think of the difference as if the mouse were running over a dusty floor. When it holds its tail up there is no trace of its path, but when it drags its tail in the dust we can see where it has been. The dust trail is its trajectory. In general, for any status device, whether continuous or discrete, its trajectory is the history of values it has had. Trajectory independent events are therefore those which do not depend on the history of the status device, but only the current value.

Most actions performed by mice and similar pointers are trajectory independent: selection, dragging, rubber banding, sweeping out new window regions. In each case there is constant feedback (the display function) but only the initial and final positions are important.
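Trajectory independence can be illustrated by folding a whole interaction trace through doit. In this Python sketch the trace format (a list of ("move", p) and ("event", e) items) and the drag example are assumptions made for illustration; the point is only that moves update the current position while the state changes solely at events.

```python
# Illustrative sketch only: the trace format and the drag example are assumptions.

def doit(sys, e, p):
    """sys is (icon_position, held). Press picks the icon up, release drops it."""
    icon, held = sys
    if e == "press" and p == icon:
        return (icon, True)
    if e == "release" and held:
        return (p, False)
    return sys


def run(sys, start, trace):
    """Fold a trace of ("move", p) and ("event", e) items into a final state."""
    pos = start
    for kind, value in trace:
        if kind == "move":
            pos = value                  # affects only the ephemeral display
        else:
            sys = doit(sys, value, pos)  # status is read at the event instant
    return sys


start_state = ((0, 0), False)
direct = [("event", "press"), ("move", (3, 3)), ("event", "release")]
wandering = [("event", "press"), ("move", (9, 9)), ("move", (1, 7)),
             ("move", (3, 3)), ("event", "release")]
```

Both traces leave the icon at (3, 3): the detour through (9, 9), which might be the waste-basket, has no lasting effect because no event occurred there.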

Trajectory independence is a form of undo principle. Consider the interaction as a dialogue of events, interspersed with sub-dialogues of pure mouse movement. Trajectory independence ensures that errors during this sub-dialogue can be corrected within it simply by repositioning the mouse. So with the dragging, if I accidentally move over the waste-basket, so long as no event occurs, I correct my mistake by moving off again. The ability to undo easily is regarded as an important feature of direct manipulation systems (Shneiderman 1982). Not only should users' actions be undoable, but as a general rule there should be some proportionality between the complexity of an action and the complexity required to undo that action. Trajectory independence assures this, as mouse only errors can be corrected with further mouse only corrections.

In fact, the notion of errors and mistakes is a little insidious in this context. Most positioning devices are part of a tight hand-screen-eye control loop. The "errors" are a necessary and integral part of such a control loop. Retaining trajectory independence wherever possible is not just a formal statement about the dialogue but is fundamental to the use of positioning devices as system extensions of the user's motor system.

Not all systems are trajectory independent all the time. The most obvious counter example is freehand drawing. Notice also that it is not only trajectory dependent, but it does not obey the sub-dialogue undo property. However, the essential nature of such drawing is its trajectory dependence, and it would be ridiculous to expect it to be otherwise.

Another interesting thing about drawing is that typically a line is only drawn whilst a mouse button is depressed. This is very important. Firstly, because it mirrors the physical act of drawing; the line only appears when we press. Secondly, because the period during which the precise mouse position is important is marked by positive user pressure on the device. Thus the user is physically aware of the action and its connection to the mouse.

In his three-state model, Buxton characterises positional input devices by a combination of user pressure and tracking behaviour. For a typical mouse, he calls the state with no mouse button pressed state 1; typically this is characterised by the system just tracking the position of the mouse. When a button is pressed, the device goes into state 2 and returns to state 1 when the button is released. State 2 is associated with actions such as dragging. As well as these state 1-2 devices, he also discusses an additional state 0 associated with devices such as a light pen. Here the released state (state 0) has no tracking. However, for simplicity, we'll just think about state 1-2 devices.

In terms of this three-state model, line drawing is only trajectory dependent in state 2. We could form this into a putative principle for trajectories. Wherever reasonable we should maintain trajectory independence. When trajectory dependence is required by the nature of the task, it should only be during state 2. There may well be exceptions to this principle, but it can serve as a touchstone with which to start an analysis.
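The putative principle can be illustrated with a simplified state 1-2 device used for freehand drawing. The following Python sketch is an assumption-laden toy, not Buxton's own formulation: the function draw and the trace format are hypothetical. Movement in state 1 leaves no trace, while every position visited in state 2 becomes part of the stroke.

```python
# Illustrative sketch only: a simplified state 1-2 device; names are assumptions.

def draw(trace, start=(0, 0)):
    """Return the strokes produced by a trace of moves and button events.

    State 1 (button up): tracking only, moves leave no lasting effect.
    State 2 (button down): every position visited is recorded in the stroke.
    """
    state, pos = 1, start
    strokes, current = [], None
    for kind, value in trace:
        if kind == "move":
            pos = value
            if state == 2:
                current.append(pos)       # trajectory dependent
        elif value == "press" and state == 1:
            state, current = 2, [pos]
        elif value == "release" and state == 2:
            state = 1
            strokes.append(current)
            current = None
    return strokes


a = [("move", (5, 5)), ("move", (1, 1)), ("event", "press"),
     ("move", (2, 2)), ("event", "release")]
b = [("move", (1, 1)), ("event", "press"),
     ("move", (2, 2)), ("event", "release")]
c = [("move", (1, 1)), ("event", "press"), ("move", (9, 0)),
     ("move", (2, 2)), ("event", "release")]
```

Traces a and b differ only in state 1 and give the same drawing; trace c differs in state 2 and gives a different stroke, so the trajectory dependence is confined to the period of physical pressure.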

4.3 An example - Pull down vs. fall down menus

We have framed a putative principle for trajectory dependence. Let's see whether this is helpful by examining two rather similar systems: the menu bars from the MAC (Apple Computers) and GEM (Digital Research).

We will take an imaginary word-processor with four menu bar options (see fig 1). In many ways both the MAC and GEM menu bars are similar. The user starts with the mouse over the FILE button (position A). The menu appears and the mouse is moved down to position B. Rather than selecting the item the user instead moves to position C. Now if this was by moving via the FONT button (position D), and then down, the user would be over an item in the FONT menu (say about to select "times obscure"). If, on the other hand, the movement was horizontal straight to C, the FONT menu would not have appeared.


figure 1. imaginary word-processor


Notice first that this cannot be described by the simple status model. The display when the mouse is at position C depends on how it got there, even though there were no events since it moved from B. So the display is trajectory dependent. Further, if the FONT menu is displayed, the user selecting at position C will change the font to times obscure, but not otherwise. So the event of menu selection is also trajectory dependent.

So far the two systems appear virtually identical. However, even though the interfaces look similar they feel different. This difference was deliberately elided in the above description.

We will go back to position A. In the MAC system nothing happens unless the mouse button is depressed. Only then would the FILE menu appear. In the case of GEM the menu appears simply by moving the mouse to position A. With the MAC, the mouse button is then held down during the whole interaction until point C, whereas in GEM the mouse button is not depressed. The act of selecting a menu item in MAC is indicated by the release of the mouse over the appropriate item, whereas GEM requires a mouse click.

In terms of the three-state model the entire interaction took place in state 2 with the MAC interface and in state 1 with GEM (except the click event at the end). If we recall our putative trajectory principle, it said that where trajectory dependence was deemed necessary, it should always be in a state of physical tension. That is, for a mouse, GEM breaks the principle. Further, you can do nothing else while the menus are there, and the only way to get rid of them is to click on a blank bit of screen. That is, not only is it trajectory dependent, but actions in the movement sub-dialogue require events to undo them, violating a weaker principle.
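The comparison can be made concrete with a small simulation. This Python sketch is a deliberately simplified model built only from the description above, not from the actual behaviour of either system: the labels A to D follow figure 1, "E" stands for a blank bit of screen, and the function run and the style names "pull" and "fall" are hypothetical.

```python
# Illustrative sketch only: a simplified model of the two menu styles.

BAR = {"A": "FILE", "D": "FONT"}        # menu bar buttons
ITEMS = {"B": "FILE", "C": "FONT"}      # one item position in each menu


def run(style, trace):
    """style is "pull" (MAC-like) or "fall" (GEM-like).

    Returns (open_menu, chosen) after a trace of moves and button events.
    """
    open_menu, chosen, pos = None, [], None
    tracking = (style == "fall")   # fall down menus always react to movement
    for kind, value in trace:
        if kind == "move":
            pos = value
            if tracking and pos in BAR:
                open_menu = BAR[pos]
        elif value == "press":
            if style == "pull" and pos in BAR:
                tracking, open_menu = True, BAR[pos]
        elif value == "release":
            # pull: release ends the interaction; fall: release completes a click
            if open_menu is not None and ITEMS.get(pos) == open_menu:
                chosen.append(pos)
            open_menu = None
            if style == "pull":
                tracking = False
    return open_menu, chosen


def moves(*places):
    return [("move", p) for p in places]


press, release = ("event", "press"), ("event", "release")
pull_via_d = moves("A") + [press] + moves("B", "D", "C") + [release]
pull_straight = moves("A") + [press] + moves("B", "C") + [release]
fall_via_d = moves("A", "B", "D", "C") + [press, release]
fall_straight = moves("A", "B", "C") + [press, release]
stray = moves("A", "E")                 # the mouse wanders over the bar
```

In both styles the route via D selects the item at C and the straight route does not, so both are trajectory dependent. The difference shows in the stray trace: with "pull" nothing is left open, whereas with "fall" the FILE menu stays open until an event dismisses it.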

Does it matter? It was after all a putative principle, based on rather home-spun psychology. I have observed both children and novices using the GEM interface (and, let's admit it, myself!). Sooner or later the mouse strays over the menu bar, and a menu drops down; the only way to get rid of the menu is to move to some vacant bit of screen and click there. With a bit of experience this becomes automatic, if a little irksome. For younger users and novices it is utterly confusing, and especially older novices would be loath to click the mouse for fear of damage.

The MAC menus which require physical pressure to get them to drop can be reasonably called "pull" down, and have a similar feel to a roller blind. The GEM menus are more like a fun fair, you touch somewhere and something jumps out at you. I would describe these as "fall" down menus. They are perhaps more reminiscent of video games, where the ability to easily undo actions is not expected.

Now one assumes that there are good reasons for not having the two interfaces too similar, but the situation could easily have been improved. Having noted that the dialogue is trajectory dependent, and (perhaps) wishing to avoid the "pull" down method, we can investigate alternatives. The mode change between normal mouse movement and top menu selection, could have been signaled by an explicit mouse click over the menu bar. This would of course have required an extra click, but the really expert users would likely opt for keyboard accelerators anyway. A reviewer of an earlier, partial version of this paper suggested that the problem could have been avoided without events by only dropping the menu if the mouse stayed in the top bar for some interval thus signaling intention. Alternatively, the menus could have disappeared when the mouse moved off them, which would have preserved the undo property for mouse movement, at the cost of making the users more careful as they moved down the menu options.
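The reviewer's suggestion of dropping the menu only after the mouse has stayed in the bar for some interval can be sketched as below. The function menu_opens, the region names and the half-second default are all assumptions made for illustration; the text does not say what interval would be suitable.

```python
# Illustrative sketch only: the dwell interval and region names are assumptions.

def menu_opens(samples, dwell=0.5, bar="bar"):
    """samples is a list of (time_in_seconds, region) pairs in time order.

    Returns True if the mouse stays in the bar region for at least `dwell`
    seconds without leaving it.
    """
    entered = None
    for t, region in samples:
        if region == bar:
            if entered is None:
                entered = t           # the mouse has just arrived in the bar
            if t - entered >= dwell:
                return True
        else:
            entered = None            # leaving the bar resets the timer
    return False


passing = [(0.0, "page"), (0.1, "bar"), (0.2, "bar"), (0.3, "page")]
lingering = [(0.0, "bar"), (0.3, "bar"), (0.6, "bar")]
```

A mouse that merely strays across the bar, as in passing, leaves no menu behind, so the stray movement is undone by further movement alone.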

Direct manipulation

We have already noted how graphical interfaces have important status-status sub-dialogues. At a low-level the mouse-device/mouse-cursor distinction breaks down and users manipulate the mouse on the screen as if they owned it directly. The screen mouse position effectively becomes the user's input. Direct manipulation systems push this feeling of ownership deeper into the interface, and thus into larger units of interaction. The aim is that users do not so much feel that they are moving an icon by dragging it with the mouse, but that they move the icon directly. Similarly it is not that they specify the start and end points of a line with a mouse, they simply draw it. There are two points I want to draw from this; both emphasise the importance of status inputs.

The first thing to notice is that it is the status-status interactions which most easily become assimilated and treated as extensions of the user's direct sphere of influence. This is quite natural: the way we interact with the physical world is largely, and most satisfyingly, via status-status interactions. Contrast: tickling a cricket with a blade of grass, prod and pop! off it jumps; stroking a cat, the fur ruffles, the muscles move. Much of modern life seems a movement from the latter to the former. The move towards automation has often substituted status interactions with event ones. The pinnacle of this is the relentless clatter of keys. Perhaps the mouse on the desk gives an, albeit poor, throwback to the feel of real materials under one's fingers. Direct manipulation systems move away from the event oriented clerical model, towards status interactions and a feeling of handcraft.

Notice the conflict between the analysis here and that of a short while ago. Earlier I argued that trajectory dependence should be avoided because of the intense hand-eye coordination required. Whereas now, one senses that it is precisely this requirement which makes direct manipulation systems satisfying. The two positions are not so far apart. Most physical interactions do not require that the history of movement be precise; they are typically feedback situations where slight deviations at one stage can be compensated for later. This is precisely the situation for position dependent but trajectory independent systems. Situations whereby the complete movement is closely defined are notoriously stressful, for instance, the game where one takes a loop of wire over another bent wire without the two touching. The requirement of physical pressure emphasises to the user that this control is required.

The second point I want to draw out is the increased abstraction of the direct manipulation interface. We have noted how many status-status interactions are internalised to the point whereby the user feels in direct control of the 'output' status. At this point we can regard this as a new status input at a greater level of abstraction. Users feel in direct control of the objects of the interface, this may involve status-status interactions, but also events as buttons are pressed and released when objects are moved or lines drawn. Periodically they are drawn out of this direct control to exercise more distant actions. At the very least this might involve printing the final version of a document or drawing, but also may include actions like reformatting a document, searching a database or recalculating a spreadsheet.

Notice that this is merely a larger scale version of the status input model. The user has control over two types of input, status and events. In the original model we were thinking of status as being perhaps a mouse position, now the status is perhaps the whole text of a document. Before, the events were keystrokes and button clicks, now they include printing or global replacement. Even the display is stretched from the immediate screen before the users to the whole abstract state of the objects over which they feel control. However, in both cases the users' image is a constant function of their status input, and this status is interpreted when events occur.



5. Dynamism

Now having seen two major models and thrusts of analysis, how well can we understand dynamism within the user interface?

Events obviously capture points of change in the dialogue and are thus major determinants of the feel and pace of interaction. The events mark points of discontinuity. The dynamic invariants which bracket the events maintain a level of continuity however. Dynamic invariants talk about change, but they talk about controlling change.

Note that in the original model it was only the action rules for each command which took any notice of what kind of event occurred. The dynamic invariants only looked at the state immediately before, and that just after an event. They talk about what an event does, but not how it happens.

Often it is precisely the manner of change which speaks most to us. Consider a word-processor on a bit-map workstation. The user moves the document scroll bar, and the text is redrawn at a new position on the screen. There are few visual cues to the direction of movement. Contrast this with an older text based screen. The movement was often achieved by actually scrolling the screen in the desired direction; sometimes this was simply because it was most efficient, but occasionally also as a deliberate policy (Bornat and Thimbleby 1986).

Now here's a conundrum: the modern interface has substituted a continuously changing status update for a discontinuous redraw event. In so doing it has lost some of the dynamism of the scroll. Where does this leave the balance of dynamism? We obviously have to take care that the manner of change is not ignored when we determine event updates. The original model tried to separate the update of the application component from the presentation component. This is obviously desirable from an architectural point of view, but obviously some interaction is required. In my study of dynamic pointers (Dix 1989, 1991), I use a pull function which captures some features of the change which occurs to the object. This enables the Map component or its equivalent to take account of some features of the manner of change whilst still maintaining a degree of separation.

Status seems very clear cut; the word does suggest it is static. However, the clock face reminds us that this is not the case. Status phenomena are about continuity, but not stasis. In fact, it is in the status-status interactions that we see some of the most dynamic interactions of graphical systems. It is interesting to note that it is the equivalent of the static invariant, relating different stati, which determines this dynamism. Because static invariants tell us how one status must be at any moment in relation to another, they also tell us that they will change together.



6. Implications

Status and events both seem necessary to adequately describe the user interface. How do specification notations and implementation techniques express these modalities?

Most formalisms for interactions regard the user's input to the system as a sequence of atomic actions, or events. This includes both psychologically inspired formalisms such as GOMS, TAG, and CLG (Card et al. 1983, Payne 1984, Moran 1981) and our own models of interaction at York (Harrison et al. 1989, Dix and Runciman 1985, Harrison and Dix 1990, Abowd 1990). Most of the psychological notations only concern themselves with the user's planning and actions: their users have brains and fingers, but no eyes. These can be seen as nothing-event models of the user, or alternatively as event-nothing models of the machine. Of the computer science notations, most are event-status models, as was the model that began this paper, also Sufrin and He's interactive processes (Sufrin and He 1989), and other specifications in algebraic formalisms (Ehrig and Mahr 1985) and in Z (Took 1986, Sufrin 1982). In addition there has been some use of derivatives of CSP, including Alexander's SPI and recent work on agent models by Abowd (Alexander 1987, Abowd 1990). These inherit the event-event nature of the process algebras on which they are based.

On the implementation side most window managers offer a largely event-event or event-status model for the programmer. On the other hand databases and file systems are almost purely status oriented, which explains the difficulty involved in transmitting the mail event in the mail-tool example earlier.

It is worrying that there is rarely a combination of expressive techniques. In both specification and implementation we usually do not need to have both status and events available. As we have seen, it is usually possible to express one in terms of the other. Formalists especially would see such possibilities as a deficit in the notation and opt for parsimony. However, both in specification and implementation the lack of appropriate expressive mechanisms can be disastrous. Although it may be possible to express all interface characteristics within an event only or status only framework, it is not always easy to do so. The notation has a grain which makes it likely that only certain types of system will be produced. This is especially important for the formal specification notations where ease of expression is crucial to obtaining a true reflection of the requirements. Specification notations must have adequate mechanisms for easily talking about both status and events at both input and output. The only formal work of which I am aware which addresses these issues is by Abowd, who has investigated ways of extending his agent notation to include elements of status.

We saw earlier how important events were for message communication; it is to be hoped that databases for CSCW will include events within their data model. There are precedents for this: in access oriented programming events are raised when certain variables are assigned to or accessed, and spreadsheets have a similar tenor in their change propagation mechanisms, especially when within integrated packages.
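The flavour of access oriented programming can be suggested with a value that raises an event whenever it is assigned to. This Python sketch uses the language's built-in property mechanism; the class Watched and its on_change method are hypothetical names and are not drawn from any particular access oriented system.

```python
# Illustrative sketch only: Watched and on_change are hypothetical names.

class Watched:
    """A status value that raises an event whenever its value changes."""

    def __init__(self, value):
        self._value = value
        self._listeners = []

    def on_change(self, listener):
        self._listeners.append(listener)

    @property
    def value(self):
        return self._value            # status: always available to read

    @value.setter
    def value(self, new):
        old, self._value = self._value, new
        if new != old:
            for listener in self._listeners:
                listener(old, new)    # event: raised at the moment of change


log = []
cell = Watched(1)
cell.on_change(lambda old, new: log.append((old, new)))
cell.value = 2
cell.value = 2      # no change, so no event
cell.value = 5
```

Here the same object offers both modalities: readers who want the status simply read cell.value, while those who need to know when it changes register a listener rather than polling.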

We could summarise: status is about being, events about doing. Both can involve change and dynamism, but there is a tendency in modern life to think that only doing effects change. In fact the change due to status can be just as dynamic. Any interface should include elements of both kinds of dynamism, and notations for interface design should likewise allow such expression.



References


maintained by Alan Dix 16/12/97