"A major challenge of the post-WIMP interface is to find and characterize appropriate mappings from high degree-of-freedom input devices to high degree-of-freedom tasks."

Stu Card



Chapter 1

Introduction


This thesis originated with a user interface I designed for three-dimensional neurosurgical visualization (fig. 1.1). The interface design was well received both by neurosurgeons and by the community of human-computer interface (HCI) designers. By itself, the interface contributes a point design which demonstrates techniques that allow neurosurgeons to effectively view and cross-section volumetric data. A primary goal for this dissertation is to move beyond point design and to introduce some careful scientific measurement of behavioral principles which were suggested by the original system implementation. Based on the synergy of (1) an interface design which has undergone extensive informal usability testing and (2) formal experimental evaluation, I make some general points about interface design and human behavior, so that some of the lessons learned in the neurosurgery application can be applied to future user interface designs.

[Figure: picture of user holding props]

Figure 1.1 The props-based interface for neurosurgical visualization.

1.1 Problem motivation

Figure 1.2 illustrates plateaus in the quality and facility of user interaction over the history of computing systems. When computers were first introduced, very few people had the knowledge and ability to operate them. Computers were extremely expensive, so the most important concern was to optimize use of computer time. Users submitted "batch jobs" in advance which would keep the computer constantly busy; it did not matter if both the turn-around time and the resulting user productivity were very poor. The introduction of interactive teletype and command-line interfaces was an improvement, but most people still did not have the expertise to perform tasks using a computer.

Figure 1.2 Historical plateaus in quality and facility of user interaction.

By the late 1970's, with computer costs continuing to fall, the cost of time that users spent operating a computer became comparable to the cost of the computer itself. This led to a new philosophy for interface design, first embodied by the Xerox Star personal computer [9]: the Star's designers realized that it was more important to make effective use of the user's time than to optimize processor cycles. In the process of attempting to make users more productive, the Star became the first computer to include a user interface based on a bitmapped display and a mouse. The Star inspired the designers of the Apple Lisa [9], and later the Apple Macintosh computer, and led to many of the current conventions for graphical user interfaces. For the first time, these innovations made it possible for a large segment of the population, without extensive knowledge of how computers work, to use personal computers as tools.

The current paradigm for graphical user interfaces (GUI's) has been dubbed the "WIMP" (Windows, Icons, Menus, and Pointer) interface. Many WIMP graphical interaction techniques were originally designed by Xerox and Apple for low-powered processors with small black-and-white displays. Yet as computing technology becomes ubiquitous and the capabilities of processors, displays, and input devices continue to grow, the limitations of the WIMP interface paradigm become increasingly apparent. To get past this "WIMP plateau," devising new interface metaphors will not be enough. We need to broaden the input capabilities of computers and improve the sensitivity of our interface designs to the rich set of human capabilities and skills.

There are many devices, displays, and interaction techniques which are candidates for the post-WIMP interface. Candidates identified by Nielsen [123] include virtual realities, sound and speech, pen and gesture recognition, animation and multimedia, limited artificial intelligence (in the form of so-called "interface agents"), and highly portable computers. Weiser [181] has proposed the ubiquitous computing paradigm, which suggests that networked computers will increasingly become integrated with ordinary implements and that computers will be embedded everywhere in the user's environment.

Demonstrations of point designs will not be sufficient to take advantage of advanced interaction techniques. Departing from tradition is expensive and risky, and while a demonstration might be compelling, it cannot tell an interface designer how to generalize a proposed interface technique to a new situation. A demonstration also cannot answer questions about when a new interface technique should be used, or when it may (or may not) have measurable advantages for the user. One must perform careful scientific evaluation of interaction techniques to understand how to use a proposed technique as well as why a proposed technique may result in improved performance for the user. Without such knowledge, an interaction technique essentially must be re-invented every time a designer attempts to use it.

Perhaps the best-known archetype for how one should approach formal user interface evaluations is provided by Card's experimental comparison of the mouse and other input devices [39]. Card formed mathematical models of each device and tested these models against observed performance data. Card found that the mouse was effectively modelled by Fitts's Law, which predicts movement time based on the amplitude of a movement and width of the target area. Furthermore, Card showed that the same model, with roughly the same parameters, modelled both movement with the mouse and movement with the hand alone. This suggests that designing an input device which allows one to point at targets more quickly than one can point using the mouse would be difficult to achieve.
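
To make the form of such a prediction concrete, the following minimal sketch uses Fitts's original formulation, MT = a + b log2(2A/W), where A is the amplitude of the movement and W is the width of the target. The function name and the constants a and b are illustrative assumptions; they are not the parameters Card fitted, and Card's study may have used a variant of this formula.

```python
import math


def fitts_movement_time(amplitude, width, a, b):
    """Predicted movement time under Fitts's original formulation.

    MT = a + b * log2(2 * amplitude / width)

    amplitude and width must share the same unit; a is in seconds and
    b is in seconds per bit.
    """
    if amplitude <= 0 or width <= 0:
        raise ValueError("amplitude and width must be positive")
    index_of_difficulty = math.log2(2.0 * amplitude / width)  # in bits
    return a + b * index_of_difficulty


# Hypothetical constants, chosen only for illustration.
A_INTERCEPT = 0.1  # seconds
B_SLOPE = 0.1      # seconds per bit

# A short movement to a wide target versus a long movement to a narrow one.
near_large = fitts_movement_time(amplitude=8.0, width=4.0, a=A_INTERCEPT, b=B_SLOPE)
far_small = fitts_movement_time(amplitude=16.0, width=1.0, a=A_INTERCEPT, b=B_SLOPE)
```

In this formulation only the ratio of amplitude to width matters, so scaling both by the same factor leaves the predicted time unchanged.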

Summarizing the philosophy of his approach, Card has written:

[User technology]... must include a technical understanding of the user himself and of the nature of human-computer interaction. This latter part, the scientific base of user technology, is necessary in order to understand why interaction techniques are (or are not) successful, to help us invent new techniques, and to pave the way for machines that aid humans in performing significant intellectual tasks. [38]

My dissertation seeks to follow Card's general approach: by formulating hypotheses and subjecting hypotheses to experimental tests, I demonstrate some fundamental mechanisms of human behavior and general design principles which suggest new design possibilities for the post-WIMP interface.

1.2 Virtual manipulation

Three-dimensional (3D) interfaces and interaction techniques form one candidate for the post-WIMP interface, especially in application areas such as computer-aided design (CAD), architectural design, scientific visualization, and medicine. A central problem for three-dimensional interaction is that of virtual manipulation, which concerns the general problem of grasping and manipulating computer-generated 3D virtual objects. Virtual manipulation includes tasks such as specifying a viewpoint, planning a navigation path, cross-sectioning an object, or selecting an object for further operations.

Virtual manipulation poses a difficult dilemma: one wants virtual objects to violate reality so that one can do things that are not possible to do in the real world, yet one also wants virtual objects to adhere to reality so that the human operator can understand what to do and how to do it. The interface design challenge is to find ways that real and virtual objects and behaviors can be mixed to produce something better than either alone can achieve [156]; and part of this challenge is to discover interaction techniques that do not necessarily behave like the real world, yet nonetheless seem natural. This leads to a key point: to design interaction techniques which meet these criteria, short of taking wild guesses in the dark, the interface designer needs to understand the human.

1.3 Passive haptic issues

This thesis mainly deals with what a psychologist would consider proprioception (the reception of stimuli produced within the body) and kinesthesis (the sense of bodily movements and tensions). I focus not on the feel of objects, but rather on how users move their hands to explore and manipulate virtual objects, and how users are sensitive to changes in the relative distances between the hands and the body. But, as psychologist J. J. Gibson has pointed out [63], even the terms proprioception and kinesthesis are quite imprecise. The terms include diverse sensations such as sense of movement of the body, joint angles, upright posture and body equilibrium, forces exerted by or on the body, orientation relative to gravity, as well as sense of linear and angular accelerations [63]. Nonetheless, the hypotheses raised by this thesis do not depend upon whether or not these diverse sensations comprise separate sensory systems or one super-ordinate sense.

Haptic comes from the Greek word haptesthai, which means to touch. In human-computer interaction, and for the purposes of this thesis, the term haptic encompasses a broad set of related sensations including proprioception and kinesthesis. By describing all of these sensations as "haptic" my intent is not to suggest that these are all facets of a unitary sense, but rather to suggest that together these sensations constitute the feel of an interface and present a set of issues which graphical user interfaces and prior efforts in virtual manipulation have often neglected [31][32]. Haptic issues for virtual manipulation involve issues of manual input and how they influence the feel of the human-computer dialog, including issues such as the muscle groups used, muscle tension, the shape and mechanical behavior of input devices, coordination of the hands and other limbs, and human sensitivity to the articulation of the body and the position of its members relative to one another.

Haptic issues include the related topic of active haptic feedback using force-returning armatures or exoskeletons. These devices allow one to feel the shape and texture of a computer-generated object. The Phantom [147], for example, is an armature with a stylus at its distal end. The user can feel a virtual object through the stylus. When the user holds and moves the stylus, the Phantom knows the position of the tip of the stylus and can generate forces which resist the user's motion when a virtual surface is encountered. Although there are some promising applications for active haptic feedback [21], the technology is still relatively primitive and expensive.
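
As an illustration of how a resisting force might be computed from the stylus tip position, the sketch below uses a simple spring (penalty) model against a flat virtual surface. This model is an assumption chosen for clarity, not a description of the Phantom's actual control software; the function name, the stiffness value, and the planar surface are all hypothetical.

```python
def surface_resisting_force(tip_position, plane_point, plane_normal, stiffness):
    """Force pushing the stylus tip back out of a flat virtual surface.

    plane_normal is assumed to be a unit vector pointing out of the surface.
    The force is zero while the tip is on or above the surface, and grows
    linearly with penetration depth once the tip is below it.
    """
    offset = [tip_position[i] - plane_point[i] for i in range(3)]
    # Signed distance of the tip from the surface along the normal.
    distance = sum(offset[i] * plane_normal[i] for i in range(3))
    if distance >= 0.0:
        return [0.0, 0.0, 0.0]
    penetration = -distance
    return [stiffness * penetration * plane_normal[i] for i in range(3)]


# Hypothetical values: a horizontal surface through the origin, a stiffness
# of 500 N/m, and a tip that is 2 mm below the surface.
force = surface_resisting_force(
    tip_position=[0.0, 0.0, -0.002],
    plane_point=[0.0, 0.0, 0.0],
    plane_normal=[0.0, 0.0, 1.0],
    stiffness=500.0,
)
```

In a real device such a computation would have to be repeated at a high update rate for the surface to feel solid; the sketch shows only a single evaluation.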

I primarily focus on the information which the user's own manual movements provide for feedback during manipulation, what one might call passive haptic feedback because the computer can't directly control or alter these sensations (see footnote 1). But the design of the user interface can and should take passive haptic issues into account to provide a human-computer dialogue which is natural and which takes advantage of these innate human capabilities and sensations. In essence, my approach focuses not on a particular technology, but rather on the capabilities of the human participant.

1.4 Humans have two hands

The cooperative action of the two hands and the proprioceptive information which they provide to the user comprise aspects of passive haptic feedback which have been particularly neglected in virtual manipulation. Humans not only have two hands, but they also have highly developed manual skill with both hands. Then why doesn't contemporary computer interface design reflect this? When the nonpreferred hand is used at all, it is usually banished to the occasional keyboard button press. In the traditional "WIMP" interface, even with the preferred hand, all of the human's manual skills are boiled down to moving a single speck on the screen with a mouse. We can do better than this.

The above motivation takes an admittedly informal approach, but Guiard's research on the cooperative action of the two hands working together [67], which Guiard terms bimanual action, provides a scientific basis for understanding how humans use two hands. For the present discussion, I will primarily rely on intuitive examples to illustrate Guiard's work; chapter 2, "Related Work," treats Guiard's theoretical contributions in greater depth.

A fundamental observation provided by Guiard is that, in the set of human manipulative tasks, purely unimanual acts are by far a minority, while bimanual acts are commonplace. Manipulative motions such as dealing cards, playing a stringed musical instrument, threading a needle, sweeping, shovelling, striking a match, using scissors, unscrewing a jar, and swinging a golf club all involve both hands. Even writing on a piece of paper with a pen, which has sometimes been mistakenly classified as a unimanual behavior [67], is demonstrably two-handed: Guiard has shown that the handwriting speed of adults is reduced by about 20% when the nonpreferred hand cannot help to manipulate the page [67].

Some of these tasks can be performed unimanually if necessary due to injury or disability, but the normal human behavior is to use both hands. Threading a needle is an interesting example: logic tells us that it should be easier to thread a needle if it is held still in a clamp, and one just has to hold the thread in the preferred hand to guide it through the eye of the needle. Holding the needle in the "unsteady" and "weak" nonpreferred hand introduces a second moving thing and should make the task more difficult. But using both hands instead makes the task easier: when faced with a stationary needle, the first instinct is to grab it with the nonpreferred hand. The nonpreferred hand acts as a dynamic and mobile clamp which can skillfully coordinate its action with the requirements of the preferred hand.

Virtual manipulation is a particularly promising application area for two-handed interaction. There is not yet an established "standard interface" such as the WIMP interface's mouse-and-keyboard paradigm which dominates the marketplace. Furthermore, people naturally use both hands to indicate spatial relationships and to talk about manipulations in space [76], and as Guiard has argued, the vast majority of real-world manipulative tasks involve both hands [67]. Finally, virtual manipulation presents tasks with many degrees-of-freedom; using both hands can potentially allow users to control these many degrees-of-freedom in a way that seems natural and takes advantage of existing motor skills.

1.5 Thesis statement

Behavioral principles (as proposed by Guiard) suggest that humans combine the action of the hands through an asymmetric hierarchical division of labor. I assert that a system must respect the subtleties of the application of these principles; the mapping of hand motion to virtual object motion is not obvious.

1.6 Contributions and overview

1.6.1 Interdisciplinary approach

Human-computer interaction is inherently interdisciplinary; some would argue that computer science itself is inherently interdisciplinary, with (as Fred Brooks has put it) the computer scientist taking on the role of a toolsmith [20] who collaborates with others to develop useful tools. In my research, the collaboration of neurosurgeons, computer scientists, and psychologists has been essential. The input of neurosurgery domain experts validates the computer science research: rather than addressing a "toy problem," the work has meaning and application for something real. Working with neurosurgeons has also forced me to address problems I otherwise might have ignored, such as making the physical input devices look professional and "polished," or carefully optimizing the program code to allow real-time, interactive visualization of volumetric cross-sectional information. The collaboration with psychologists has forced me to think carefully about the underlying behavioral issues, and has allowed me to pursue careful scientific evaluations of these issues. And of course, without a computer science systems-building approach, there would be no tool for neurosurgeons to use, and no design issues to validate with experiments. In short, an interdisciplinary approach has enabled a decisive contribution to the virtual manipulation and two-handed interaction research fields. Without any one part of this collaboration, the work would not be as convincing. Taken as a whole, my thesis contributes a case study for interdisciplinary research methodology in human-computer interaction.

1.6.2 Revisiting haptic issues

The sense of touch is crucial to direct skilled manipulative action. With all the emphasis on graphical user interfaces, it is easy to forget that physical contact with the input device(s) is the basis of manipulative action. While much research has focused on the visual aspects of 3D computer graphics for virtual reality, many of the haptic issues of virtual manipulation have gone unrecognized. The phrase Look and Feel is frequently used to capture the style of both the visual appearance (the output) and the physical interaction (the input) afforded by an interface, but as Buxton has argued, to reflect the relative investment of effort on the output side versus the input side, the phrase should perhaps instead be written [31]:

LOOK and feel.

At the broadest level, in the process of revisiting haptic issues, I have demonstrated that facile virtual manipulation requires studying the feel of the interface as well as the look of the interface.

1.6.3 Application to neurosurgical visualization

I have worked with the Department of Neurological Surgery to develop a 3D volume visualization interface for neurosurgical planning and visualization (fig. 1.1). The 3D user interface is based on the two-handed physical manipulation of hand-held tools, or props, in free space. These user interface props facilitate transfer of the user's skills for manipulating tools with two hands to the operation of a user interface for visualizing 3D medical images, without need for training.

From the user's perspective, the interface is analogous to holding a miniature head in one hand which can be "sliced open" or "pointed to" using a cross-sectioning plane or a stylus tool, respectively, held in the other hand. Cross-sectioning a 3D volume, for example, simply requires the surgeon to hold a plastic plate in the preferred hand up to the miniature head, held in the nonpreferred hand, to demonstrate the desired cross-section.
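
One way to picture this mapping is to express the plate's plane in the coordinate frame of the head prop, so that only the relative placement of the two props determines the cross-section. The sketch below is an assumption about how such a mapping could be computed, not the implementation used in the system; all function names, poses, and units are hypothetical.

```python
import math


def transpose(m):
    """Transpose a 3x3 matrix; for a rotation matrix this is its inverse."""
    return [[m[r][c] for r in range(3)] for c in range(3)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def rotation_z(angle):
    """Rotation matrix for a turn of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def plane_in_head_frame(head_rotation, head_position, plate_point, plate_normal):
    """Express the plate's plane (a point and a normal, both in tracker
    coordinates) relative to the head prop's position and orientation."""
    inverse_rotation = transpose(head_rotation)
    offset = [plate_point[i] - head_position[i] for i in range(3)]
    return mat_vec(inverse_rotation, offset), mat_vec(inverse_rotation, plate_normal)


# Case A: head prop at the origin, plate held 5 cm away along x.
point_a, normal_a = plane_in_head_frame(
    rotation_z(0.0), [0.0, 0.0, 0.0], [0.05, 0.0, 0.0], [1.0, 0.0, 0.0]
)

# Case B: both props are turned and carried elsewhere together.
turn = rotation_z(math.pi / 2)
shift = [1.0, 2.0, 3.0]
moved_point = [p + s for p, s in zip(mat_vec(turn, [0.05, 0.0, 0.0]), shift)]
moved_normal = mat_vec(turn, [1.0, 0.0, 0.0])
point_b, normal_b = plane_in_head_frame(turn, shift, moved_point, moved_normal)
```

Under these assumptions the two cases yield the same plane in head coordinates, which reflects the idea that the surgeon can move both hands freely without changing the cross-section as long as the plate keeps its placement relative to the head.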

1.6.4 Two-handed virtual manipulation techniques

The interface demonstrates interaction techniques which use the nonpreferred hand as a dynamic frame of reference. Unlike a clamp (whether physical or virtual), the nonpreferred hand provides mobile stabilization. An important design principle is for the interface to preserve the mobile, dynamic role of the nonpreferred hand as a reference. The nonpreferred hand adjusts to and cooperates with the action of the preferred hand, allowing users to restrict the necessary hand motion to a small working volume. Informal usability evaluations with hundreds of test users confirm that two-handed virtual manipulation can be effective and easy to learn when designed appropriately.

1.6.5 Basic knowledge about two hands

I contribute formal experimental data which confirms Guiard's suggestion that the hands have specialized, asymmetric roles. In particular, the thesis includes the first experimental data which suggests that the preferred hand operates relative to the frame-of-reference of the nonpreferred hand. I also show that the advantages of hand specialization are most significant for "high degree of difficulty" tasks.

1.6.6 Two hands are not just faster than one hand

Using two hands provides more than just a time savings over one-handed manipulation. Two hands together provide the user with information which one hand alone cannot. Furthermore, a simultaneous two-handed task is not the same thing as the serial combination of the sub-tasks controlled by each hand. The simultaneous task allows for hierarchical specialization of the hands. It also provides the potential to impact performance at the cognitive level: it can change how users think about a task. Since the user can potentially integrate subtasks controlled by each hand without an explicit cost to switch between subtasks, this encourages exploration of the task solution space.

1.7 Organization

This thesis is organized as follows:

Chapter two: Related Work describes related work in virtual manipulation, multimodal input, and theory and experiments for two-handed interaction.

Chapter three: System Description discusses interaction techniques, implementation issues, and notes on user acceptance for a three-dimensional interface which I designed for neurosurgical visualization.

Chapter four: Design Issues in Spatial Input presents a synthesis of design issues drawn from the literature on virtual manipulation, tempered by my interface design experience with the three-dimensional interface for neurosurgical visualization.

Chapter five: Research Methodology discusses the issues raised by evaluation of user interfaces and working with domain experts. This chapter also discusses the rationale and the process for the formal experimentation approach used in my thesis work.

Chapter six: Usability Analysis of 3D Rotation Techniques presents a formal analysis of techniques for three-dimensional rotation of virtual objects.

Chapter seven: Issues in Bimanual Coordination describes experiments which analyze manipulation of physical objects with both hands.

Chapter eight: The Bimanual Frame-of-Reference presents an experiment which demonstrates how two hands together provide the user with information about the environment which one hand alone cannot, and also demonstrates that physical articulation of a task can influence how users think about that task.

Chapter nine: Conclusions summarizes the main results of my thesis work and proposes some potential areas for future research.



1 An ecological psychologist (one who studies perception in the natural environment, emphasizing the moving, actively engaged observer) would think of these as "active" feedback sensations, because the user seeks out and explores the stimuli rather than passively receiving them [75]. From the standpoint of feedback which is under direct control of the computer, however, the information is static or passive, even if the user is active in exploring the passive stimulus.


Copyright © 1996, Ken Hinckley. All rights reserved.