
"Experience keeps a dear school, but fools will learn in no other."

Benjamin Franklin, Poor Richard's Almanac, 1757



Chapter 2

Related Work


2.1 Introduction

This chapter provides an illustrated tour of systems and techniques for manipulation of virtual objects and navigation in virtual environments. Although the overall emphasis is on two-handed, three-dimensional approaches, I have attempted to provide a sampling of 2D and 3D approaches, as well as one-handed and two-handed techniques, to provide some overall context for my own work. Since experimental work is an emphasis of this dissertation, this review also includes experiments analyzing issues in two-handed interaction.

2.2 Two-dimensional approaches for 3D manipulation

A number of interaction techniques have been proposed which allow 2D cursor motion to be mapped to three-dimensional translation or rotation. These techniques usually consist of a combination of mappings of mouse motion to virtual object motion and a mechanism for selection of modes to alter which degrees-of-freedom the 2D device is currently controlling.

2.2.1 3D Widgets

3D Widgets are graphical elements directly associated with objects in the scene. For example, an object might be augmented with handles to allow rotation of the object (fig. 2.1, left), or with shadows to allow constrained translation of the object (fig. 2.1, right). The 3D Widgets disguise separate manipulation modes as spatial modes; that is, moving the mouse and dragging a new handle invokes a new constraint mode. Of course, the widgets must be added to the scene as extra components, which may clutter the scene or obscure the object(s) being manipulated.

Figure 2.1 Examples of "3D Widgets" (taken from [43] and [77])1.

Other, more abstract, varieties of 3D widgets include the Arcball [153], Chen's Virtual Sphere [40], and the Evans & Tanner [51] rotation controllers. The Arcball and Virtual Sphere both use the metaphor of rotating an imaginary sphere; the Evans & Tanner technique, while not explicitly using this metaphor, was a predecessor of these techniques which exhibits performance similar to the Virtual Sphere [40]. These techniques will be discussed in further detail in the context of Chapter 6, "Usability Analysis of 3D Rotation Techniques." As an additional example, Nielson and Olsen's triad mouse technique allows specification of 3D points by mapping mouse motion onto the projected principal axes of an object (fig. 2.2). The resulting points can be mapped to translation, rotation, and scaling in three dimensions. This technique works well at three-quarter perspective views (as seen in figure 2.2), but does not work as well when the projections of two axes fall close to one another.

Figure 2.2 Cursor movement zones for the triad mouse technique [125].
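
The axis-selection idea behind the triad mouse can be illustrated with a minimal sketch. This is not Nielson and Olsen's published algorithm; it assumes an orthographic projection that simply drops z, and the function names (`project`, `pick_axis`) are hypothetical. The sketch picks the object axis whose screen projection is best aligned with the mouse motion, which also suggests why the technique degrades when two projected axes nearly coincide: their alignment scores become nearly equal.

```python
import math

def project(v):
    """Orthographic projection of a 3D vector onto the screen (assumption: drop z)."""
    return (v[0], v[1])

def pick_axis(mouse_delta, axes):
    """Return (axis_index, signed_amount) for a 2D mouse motion.

    The chosen axis is the one whose screen projection is best aligned
    with the mouse motion; signed_amount is the motion along that
    projected axis, in screen units.
    """
    mlen = math.hypot(mouse_delta[0], mouse_delta[1])
    if mlen == 0.0:
        return None, 0.0
    best_index, best_score, best_amount = None, -1.0, 0.0
    for i, axis in enumerate(axes):
        px, py = project(axis)
        plen = math.hypot(px, py)
        if plen < 1e-9:
            continue  # axis points straight at the viewer; no usable projection
        dot = (mouse_delta[0] * px + mouse_delta[1] * py) / plen
        score = abs(dot) / mlen  # |cos| of the angle between motion and axis
        if score > best_score:
            best_index, best_score, best_amount = i, score, dot
    return best_index, best_amount

# Head-on view of the principal axes: z projects to a point and is skipped.
axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
index, amount = pick_axis((3.0, 0.5), axes)
```

In this head-on view a mostly horizontal drag selects the x axis, while the z axis cannot be selected at all, which is one reason a three-quarter view suits the technique better.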

2.2.2 SKETCH

The SKETCH system [186] provides a gesture-based 2D interface for generating approximate 3D CAD models. SKETCH's gesture-based interface consists of a small vocabulary of axis-aligned strokes, mouse clicks, and modifier keys which can be composed following some simple rules. SKETCH also makes clever use of heuristics, such as making a stroke down from an existing object or sketching a shadow below an object (fig. 2.3), to infer the 3D location of objects from a 2D sketch. In this sense, SKETCH makes use of the dynamic process of sketching itself to infer information, rather than relying only on static 2D images, as is usually the case in vision research.

Figure 2.3 SKETCH uses heuristics to infer 3D placement [186].

2.3 Using the hand itself for input

2.3.1 Gloves and Gesture for virtual manipulation

Sturman [165] and Zimmerman [189] describe instrumented gloves which can detect the flex angles of the hand and finger joints (fig. 2.4). These gloves are typically used in conjunction with a six degree-of-freedom magnetic sensor attached to the wrist, providing a combined input stream of hand position and finger flexion. The development of techniques to reliably detect gestures, such as making a fist, an open palm, or pointing, from multiple streams of input data, is still an active area of research (for example, see Wexelblat [183]). Other research, such as Baudel's Charade system [10], has focused on techniques for designing gesture command languages which are easy for users to learn.

Figure 2.4 Glove used by Zimmerman [189].

Recently introduced gloves, called chord gloves or pinch gloves [52], detect contact between the fingertips, such as when pinching the thumb and forefinger together. Detecting contact is much more reliable than detecting gestures through finger flexion or hand movement. The gloves cost significantly less without the flexion sensors, making it more practical to use two gloves in one application, as demonstrated by the Polyshop system [1] (fig. 2.5), as well as Multigen's SmartScene application [121]. But most importantly, touching one's fingers together is much easier for users to learn than a gesture language, so fingertip contact is an effective means to provide frequently used commands.

Figure 2.5 Polyshop employs two-handed interaction with chord gloves [1].

2.3.1.1 The Virtual Wind Tunnel

Bryson has implemented a virtual wind tunnel which allows scientists to visualize properties of airflow and pressure in computational fluid dynamics volumetric data sets. These enormous data sets yield many computational challenges [25]. Interaction with the data is achieved with a glove for pointing and issuing gestural commands plus a boom display for viewing the environment (fig. 2.6). The boom display is used instead of a head-mounted display to avoid problems with encumbrance. The boom also makes it easy for a pair of scientists to share views of the data.

Figure 2.6 Using a boom display and glove with the Virtual Wind Tunnel [25].

2.3.2 Electric field sensing

Zimmerman [190] describes a technique for detecting hand or body position without any encumbering input device at all. The technique works by sensing the capacitance associated with the human body. The body acts as a ground for a low-frequency energy field, displacing a current which is functionally related to position within the field. The technique can also be used in the opposite manner, using the body as a short-range electric field transmitter; this has applications in mobile computing, for example. Since the technique relies only on electrical components, it has the potential to become a ubiquitous, cheap technology available as a PC plug-in board.

2.3.3 VIDEODESK and VIDEOPLACE

Krueger's VIDEODESK system [104] uses video cameras and image processing to track 2D hand position and to detect image features such as hand, finger, and thumb orientation. This approach leads to a rich vocabulary of simple, self-revealing gestures such as pointing, dragging, or pinching which do not require any explicit input device [103]. For example, the index finger and thumb of both hands can be used to simultaneously manipulate four control points along a spline curve (fig. 2.7, left) or the finger tips can be used to sweep out a rectangle (fig. 2.7, right).

Figure 2.7 Example VIDEODESK applications using two hands [104].

Krueger has also explored two-dimensional camera-based tracking of the entire body in the VIDEOPLACE environment [104][105], and he has linked this environment with the VIDEODESK. For example, the VIDEODESK participant can pick up the image of the VIDEOPLACE participant's body, and the VIDEOPLACE participant can then perform gymnastics on the other person's fingers [104]. Although these examples are somewhat whimsical, they demonstrate that the medium has many design possibilities.

2.3.4 Hands and voice: multimodal input

2.3.4.1 Hauptmann's behavioral study

Hauptmann [76] reports a "Wizard-of-Oz" study of voice and gesture techniques for 3D manipulations such as translation, rotation, and scaling of a single object. In a Wizard-of-Oz study, instead of implementing an entire system, a "man behind the curtain" plays the role of the computer. Such studies are often used to classify the types of responses that users will naturally generate, so that designers will have a better sense of the issues which an actual system implementation must handle. Hauptmann uses the Wizard-of-Oz strategy to explore the possibilities for voice and gesture, together and in isolation. Hauptmann found that the vocabulary of speech and gestural commands was surprisingly compact. Hauptmann's test subjects strongly preferred to use simultaneous voice and gesture. Many subjects also spontaneously used both hands, particularly when indicating scaling operations.

2.3.4.2 Put-That-There

Bolt's "Put-That-There" system [16] provides a compelling demonstration of multimodal voice input plus gesture. Put-That-There allowed users to manipulate objects on a large-screen graphics display showing (for example) a map. The user could point at objects on the map, and speak commands to modify them.

The key insight of the Put-That-There system is that either modality alone is quite limited. Voice recognizers (even today) are nowhere near 100% reliable, especially when the syntax of possible commands is complex. So describing everything verbally is both tedious and error-prone. Gesture (pointing) alone allows selection and dragging, but not much else. But together, the modalities provide a rich medium. Instead of saying "Put the orange square to the right of the blue triangle," the user can speak commands such as "Put that... to the right of that." The recognizer needs to know only one object name: "that." The pronoun is disambiguated using the pointing gesture. This simplifies the syntax of voice commands, and also frees the user from having to remember the names of objects.
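
The pronoun disambiguation described above can be sketched in a few lines. This is only an illustration, not Bolt's implementation: the timestamped word list, the pointing samples, and the function name `resolve_deictics` are all hypothetical, and the sketch assumes each "that" refers to whatever object was pointed at closest in time to the spoken word.

```python
def resolve_deictics(words, pointing, deictic="that"):
    """Replace each deictic word with the object pointed at when it was spoken.

    words:    list of (timestamp, word) pairs from a hypothetical recognizer
    pointing: list of (timestamp, object_name) samples from a hypothetical
              pointing tracker
    """
    resolved = []
    for t, word in words:
        if word == deictic and pointing:
            # Choose the pointing sample closest in time to the spoken word.
            _, target = min(pointing, key=lambda sample: abs(sample[0] - t))
            resolved.append(target)
        else:
            resolved.append(word)
    return resolved

words = [(0.0, "put"), (0.4, "that"), (1.0, "right-of"), (1.5, "that")]
pointing = [(0.5, "orange square"), (1.6, "blue triangle")]
command = resolve_deictics(words, pointing)
```

The recognizer's vocabulary stays small because object names never have to be spoken; they enter the command only through the pointing channel.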

2.3.4.3 Two hands and voice

Bolt and Herranz [17] report an implementation of some of the techniques explored by Hauptmann's study [76], among others, using voice input, two-handed gesture, and eye gaze information. The eye gaze information is used to select objects. Voice commands help to disambiguate the eye gaze and gesture information. Two-handed techniques include dual-hand rotation about principal axes of an object as well as two-handed relative placement of objects, where one hand acts as a reference, and the other hand indicates the relationship of a second object to the first hand.

Weimer and Ganapathy [180] discuss voice and gesture input for a 3D free-form surface modeling system. Weimer and Ganapathy initially based the interface purely on gestural information, but added voice for three reasons: (1) people tend to use gestures to augment speech; (2) spoken vocabulary has a more standard interpretation than gestures; and (3) hand gesturing and speech complement one another. Voice is used for navigating through commands, while hand gestures provide shape information. Weimer and Ganapathy report that there was "a dramatic improvement in the interface after speech recognition was added" [180].

2.4 One-handed spatial interaction techniques

Hand-held input devices afford different interaction techniques than systems based on glove and gesture-based approaches. This section focuses on techniques in free space that utilize three-dimensional hand-held input devices.

2.4.1 Schmandt's stereoscopic workspace

Schmandt [142] describes a stereoscopic workspace which allows the user to see his or her hand in the same apparent volume as computer-generated 3D graphics, providing direct correspondence between the 3D input and the 3D output volumes (fig. 2.8). A hand-held wand is seen through a half-silvered mirror, upon which the computer graphics are projected. A white spot on the wand itself can be seen through the half-silvered mirror and acts as a real-world cursor. The semi-transparent mirror used in the workspace, however, cannot provide correct occlusion cues. Occlusion is widely regarded as the most important depth cue [19], so despite the corresponding input and output volumes, users sometimes found the workspace difficult to use.

Figure 2.8 Schmandt's stereoscopic workspace [142].

2.4.2 Ware's investigations of the "bat"

Ware has investigated interaction techniques for a six degree-of-freedom magnetic tracker, which he refers to as the "bat" [176][179]. Ware reports that it is difficult to achieve precise 3D positioning using a 1:1 control ratio when the arm or hand is unsupported. He finds that rotations of the bat produce inadvertent translations; interaction techniques which require the user to precisely control both position and orientation at the same time are difficult to master. But when the requirement for precision is relaxed, test users can make effective use of all six degrees of freedom: simultaneous positioning and orientation yields faster task completion times than separate positioning and orientation. Ware uses the bat as a relative device. A button on the bat acts as a clutch allowing or disallowing movement, enabling users to employ a "ratcheting" technique to perform large translations or rotations.
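
The clutch and "ratcheting" behavior can be illustrated with a minimal sketch. It is not Ware's implementation: the class name `ClutchedTranslator` is hypothetical, and for brevity the sketch handles translation only, whereas the bat also reports orientation. Tracker motion is accumulated into the object's position only while the clutch button is held, so the user can release, return the hand to a comfortable position, and re-engage to continue a large movement.

```python
class ClutchedTranslator:
    """Accumulates tracker motion into an object position only while clutched."""

    def __init__(self):
        self.object_pos = [0.0, 0.0, 0.0]
        self._last = None  # tracker position at the previous engaged sample

    def update(self, tracker_pos, clutch_down):
        """Feed one tracker sample; return the current object position."""
        if clutch_down:
            if self._last is not None:
                # Relative device: apply only the change since the last sample.
                for i in range(3):
                    self.object_pos[i] += tracker_pos[i] - self._last[i]
            self._last = list(tracker_pos)
        else:
            # Released: the hand may move freely without moving the object.
            self._last = None
        return tuple(self.object_pos)

bat = ClutchedTranslator()
bat.update((0.0, 0.0, 0.0), True)           # engage clutch
bat.update((1.0, 0.0, 0.0), True)           # drag: object moves to x = 1
bat.update((0.0, 0.0, 0.0), False)          # release and return the hand
bat.update((0.0, 0.0, 0.0), True)           # re-engage
final = bat.update((1.0, 0.0, 0.0), True)   # drag again: object moves to x = 2
```

Two short strokes of the hand thus produce a translation twice as long as the hand's own range of motion.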

2.4.3 High Resolution Virtual Reality

Deering [46] describes a system which uses a hand-held 3D input device in conjunction with stereoscopic projection (using a standard desktop monitor and shuttered glasses). The effect is similar to that originally explored by Schmandt (fig. 2.8), as the stereoscopic 3D volume and the 3D input volume correspond. In one demonstration of the system, the user holds his or her hand up to a (stereoscopically projected) miniature virtual lathe. Deering particularly emphasizes achieving highly accurate registration between the 3D input device and the projected graphics, resulting in a fairly convincing illusion of "touching" the high-resolution 3D graphics with the input device.

2.4.4 JDCAD

Liang's JDCAD [111] is a solid modeling system which is operated using a magnetic tracker similar to Ware's "bat" [176]. By switching modes, the bat can be used to interactively rotate and move the model under construction, to select objects for subsequent operations or to orient and align individual pieces of the model.

Figure 2.9 JDCAD configuration and cone selection technique [111].

The system set-up is similar to Deering's system [46], with head tracking and a single hand-held input device used in front of a standard monitor (fig. 2.9, left). JDCAD, however, uses an object selection mechanism based on ray or cone casting. The bat is used to shoot a ray or cone into the scene, allowing the user to hold the input device in a comfortable position and rotate it to change the cone direction. The objects intersected by the cone then become candidates for selection (fig. 2.9, right).
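
As a rough sketch of cone-based selection, the following tests which object centers fall inside a cone cast from the input device. It makes several assumptions not taken from Liang's description: objects are reduced to their center points, the cone is given by a half-angle in degrees, the direction vector is nonzero, and candidates are ordered by angular distance from the cone axis. The function name `cone_candidates` is hypothetical.

```python
import math

def cone_candidates(apex, direction, half_angle_deg, objects):
    """Return names of objects whose centers lie inside the selection cone,
    ordered from nearest to farthest from the cone axis (by angle).

    objects: dict mapping name -> (x, y, z) center point
    """
    dlen = math.sqrt(sum(c * c for c in direction))
    d = [c / dlen for c in direction]
    hits = []
    for name, center in objects.items():
        v = [center[i] - apex[i] for i in range(3)]
        vlen = math.sqrt(sum(c * c for c in v))
        if vlen == 0.0:
            continue  # object sits exactly at the apex; angle is undefined
        cos_a = sum(v[i] * d[i] for i in range(3)) / vlen
        # Clamp to guard against rounding slightly outside [-1, 1].
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
        if angle <= half_angle_deg:
            hits.append((angle, name))
    return [name for _, name in sorted(hits)]

scene = {"cube": (0.0, 0.0, -5.0), "sphere": (1.0, 0.0, -5.0), "far": (5.0, 0.0, -1.0)}
candidates = cone_candidates((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), 15.0, scene)
```

Because the test depends only on the angle between the cone axis and each object, rotating the device is enough to change the candidate set; the hand does not have to travel to the object.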

Another innovation introduced by Liang is the ring menu. This is a menu selection technique that is designed specifically for a hand-held 3D input device, and thus eliminates the need to switch between the bat and the mouse for menu selection. The items available for selection appear in a semi-transparent belt; a gap in this belt always faces the user and indicates the selected item. Figure 2.10 shows a user selecting a cylinder geometric primitive using the ring menu. Rotation of the bat about the axis of the user's wrist causes a new item to rotate into the gap and become selected. This technique is useful, but it does not scale well to a large number of items, or to selection of menu items which cannot be easily represented by a 3D graphics icon.

Figure 2.10 The ring menu technique for 3D menu selection [111].
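
One simple way to think about the ring menu is as a mapping from wrist roll angle to an item index. The sketch below assumes the items are spaced evenly around the ring and that a roll of zero degrees centers the first item in the gap; neither detail is taken from Liang's description, and the function name `ring_menu_item` is hypothetical.

```python
def ring_menu_item(roll_deg, items):
    """Return the item that sits in the gap for a given wrist roll (degrees)."""
    slot = 360.0 / len(items)  # angular width allotted to each item
    # Shift by half a slot so each item is centered on its nominal angle.
    index = int(((roll_deg % 360.0) + slot / 2.0) // slot) % len(items)
    return items[index]

primitives = ["cube", "sphere", "cylinder", "cone"]
selected = ring_menu_item(100.0, primitives)
```

With four items each slot spans 90 degrees of wrist roll; with forty items each slot would span only 9 degrees, which gives one concrete reading of why the technique does not scale well to a large number of items.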

Liang has evaluated JDCAD by comparing it in usability tests with commercially available 3D modelling packages [112]. Liang has not, however, performed formal experimental analysis of the factors which contributed to JDCAD's design.

2.4.5 Butterworth's 3DM (Three-Dimensional Modeler)

Butterworth [26] describes a 3D CAD (Computer Aided Design) system for use in head-mounted displays. One hand is used for all interactions, including selecting commands from a floating menu, selecting objects, scaling and rotating objects, or grabbing vertices to distort the surface of an object. Modes for gravity, plane snapping, and grid snapping aid precise placement. The interface has support for multiple navigation modes, including growing and shrinking the user to allow work at multiple levels of detail; walking a short distance within tracker range; and grabbing the world to drag and rotate it. Butterworth reports that "since the user can become disoriented by all of these methods of movement, there is a command that immediately returns the user to the initial viewpoint in the middle of the modeling space" [26]. This is evidence of an underlying usability problem; the command to return the user to the initial viewpoint helps, but the reported user disorientation indicates that the underlying problems of navigation and working at multiple scales still need additional research.

2.5 Two-handed spatial interaction techniques

2.5.1 3-Draw

Sachs's 3-Draw system is a computer-aided design tool which facilitates the sketching of 3D curves [141]. In 3-Draw, the user holds a stylus in one hand and a tablet (the "palette") in the other. These tools serve to draw and view a 3D virtual object which is seen on a desktop monitor. The palette is used to view the object, while motion of the stylus relative to the palette is used to draw and edit the curves making up the object (fig. 2.11). Sachs notes that "users require far less concentration to manipulate objects relative to each other than if one object were fixed absolutely in space while a single input sensor controlled the other".

Figure 2.11 The 3-Draw computer-aided design tool [141].

2.5.2 Interactive Worlds-in-Miniature

Stoakley's Worlds-in-Miniature (WIM) interface metaphor [164] provides the virtual reality user with a hand-held miniature representation of an immersive life-size world (fig. 2.12, left). For example, the user can design a furniture layout for the room in which he or she is standing. Users interact with the WIM using both hands. The user's non-preferred hand holds the WIM on a clipboard while the preferred hand holds a "button ball," a ball-shaped 3D input device instrumented with some buttons (fig. 2.12, right). Moving a miniature object on the WIM with the ball moves the corresponding life-size representation of that object.

Figure 2.12 The Worlds-In-Miniature (WIM) metaphor [164].
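
The correspondence between a miniature object and its life-size counterpart can be sketched as a change of coordinates. The sketch below is a simplification under stated assumptions, not Stoakley's implementation: the miniature is treated as a uniformly scaled copy of the world anchored at `wim_origin`, and the rotation of the hand-held clipboard is ignored (a fuller version would use a complete rigid transform). The function names and parameters are hypothetical.

```python
def wim_to_world(p_wim, wim_origin, scale):
    """Convert a point on the miniature to life-size world coordinates."""
    return tuple((p_wim[i] - wim_origin[i]) / scale for i in range(3))

def world_to_wim(p_world, wim_origin, scale):
    """Convert a life-size world point to its position on the miniature."""
    return tuple(wim_origin[i] + p_world[i] * scale for i in range(3))

origin = (0.5, 1.0, 0.25)   # where the miniature's world origin sits in the hand
scale = 0.125               # miniature is one eighth of life size
chair_world = (2.0, 0.0, 4.0)
chair_wim = world_to_wim(chair_world, origin, scale)

# Dragging the miniature chair 0.125 units along x moves the real chair 1 unit.
dragged = (chair_wim[0] + 0.125, chair_wim[1], chair_wim[2])
new_world = wim_to_world(dragged, origin, scale)
```

Small hand motions on the miniature therefore produce proportionally larger motions in the life-size world, which is what lets the user reach objects that are far away or occluded.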

The WIM effectively integrates metaphors for viewing at 1:1 scale, manipulating the point-of-view, manipulation of objects which are out of physical reach or occluded from view, and navigation [133]. I will discuss some further issues raised by two-handed interaction with the WIM in the context of Chapter 4, "Design Issues in Spatial Input," as well as Chapter 8, "The Bimanual Frame-of-Reference."

2.5.3 The Virtual Workbench

Poston and Serra [138] have implemented the Virtual Workbench (fig. 2.13), a mirrored display similar to the system implemented by Schmandt (fig. 2.8), but the mirror which Poston and Serra use is opaque. This means that all images are completely computer generated, allowing correct occlusion cues to be maintained. The Virtual Workbench employs a physical tool handle which can have different virtual effectors attached to it, depending on the current mode. In some modes it is possible to use both hands to assist manipulation (fig. 2.13). Poston and Serra [148] report that physical layout of the workspace is important: the user's head has to stay in front of the mirror, so users with differing body sizes may need to make adjustments to their working environment to avoid uncomfortable postures.

The Virtual Workbench has been developed with medical applications in mind. It includes support for tasks of interest in neurosurgical visualization, including cross-sectioning a volume. Poston and Serra have not, however, performed any formal experimental evaluation of the design issues raised by the Virtual Workbench.

Figure 2.13 The Virtual Workbench [138].

2.5.4 The Reactive Workbench and ImmersaDesk

Another recent trend in interactive 3D graphics is to use stereoscopic projection on a tilted large-screen projection display. Examples of such displays include the Reactive Workbench [52] and the ImmersaDesk [49]. The user typically wears stereo glasses, causing stereoscopically projected objects to appear to occupy the space in front of or behind the display. The large display fills a wide visual angle and gives a better sense of presence than an ordinary desktop display. These displays are useful when multiple persons wish to collaborate on a single problem; much like a drafting board invites others to look over one's shoulder at ongoing work, the large display surface conveys a subtle social message which tells others that the information is meant for sharing. Multiple participants, however, cannot simultaneously view accurate stereoscopic data, because the geometry of the stereoscopic projection can only be correct for one viewer.

Figure 2.14 The ImmersaDesk [49].

2.5.5 Other notable systems

Several other systems demonstrate two-handed techniques for virtual object manipulation. Shaw's THRED system for polygonal surface design [149] employs both hands. The nonpreferred hand indicates constraint axes, performs menu selections, and orients the scene, while the preferred hand performs all detailed manipulation. Shaw has performed user tests on THRED [150], but he has not reported any formal behavioral studies. Leblanc [107] describes a sculpting application which uses a Spaceball (a force-sensing six degree-of-freedom input device that rests on the desk surface) in the nonpreferred hand to orient a sculpture, while the preferred hand uses a mouse to select and deform the vertices making up the sculpture. Mine [119] has recently added two-handed manipulation to an immersive architectural design application. Mine uses the separation of the hands to indicate the magnitude of object displacements, and to select options from a "mode bar" extending between the hands.

2.6 Theory and experiments for two hands

2.6.1 Guiard's Kinematic Chain Model

Guiard's analysis of human skilled bimanual action [67] provides an insightful theoretical framework for classifying and understanding the roles of the hands. The vast majority of human manual acts involve two hands acting in complementary roles. Guiard classifies these as the bimanual asymmetric class of manual actions.

Guiard has proposed the Kinematic Chain as a general model of skilled asymmetric bimanual action, where a kinematic chain is a serial linkage of abstract motors. For example, the shoulder, elbow, wrist, and fingers form a kinematic chain representing the arm. For each link (e.g. the forearm), there is a proximal element (the elbow) and a distal element (the wrist). The distal wrist must organize its movement relative to the output of the proximal elbow, since the two are physically attached.

The Kinematic Chain model hypothesizes that the preferred and nonpreferred hands make up a functional kinematic chain: for right-handers, the distal right hand moves relative to the output of the proximal left hand. Based on this theory and observations of people performing manual tasks, Guiard proposes three high-order principles governing the asymmetry of human bimanual gestures, which can be summarized as follows:

(1) Motion of the preferred hand typically finds its spatial references in the results of motion of the nonpreferred hand. The preferred hand articulates its motion relative to the reference frame defined by the nonpreferred hand. For example, when writing, the nonpreferred hand controls the position and orientation of the page, while the preferred hand performs the actual writing by moving the pen relative to the nonpreferred hand (fig. 2.15). This means that the hands do not work independently and in parallel, as has often been assumed by the interface design community, but rather that the hands specify a hierarchy of reference frames, with the preferred hand moving relative to the nonpreferred hand.

(2) The preferred and nonpreferred hands are involved in asymmetric temporal-spatial scales of motion. The movements of the preferred hand are more frequent and more precise than those of the nonpreferred hand. During handwriting, for example, the movements of the nonpreferred hand adjusting the page are infrequent and coarse in comparison to the high-frequency, detailed work done by the preferred hand.

(3) The contribution of the nonpreferred hand starts earlier than that of the preferred. The nonpreferred hand precedes the preferred hand: the nonpreferred hand first positions the paper, then the preferred hand begins to write. This is obvious for handwriting, but also applies to tasks such as swinging a golf club [67].

Figure 2.15 Guiard's handwriting experiment [67].

A handwriting experiment by Guiard illustrates these principles in action (fig. 2.15). The left half of the image shows an entire sheet of paper as filled out by the subject on dictation. The right half of the image shows the impression left on a blotter which was on a desk underneath the sheet of paper. The experiment demonstrates that movement of the dominant hand occurred not with respect to the sheet of paper itself, but rather with respect to the postures defined by the non-dominant hand moving and holding the sheet of paper over time.
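
The notion of the preferred hand moving within a reference frame set by the nonpreferred hand can be made concrete with a small 2D sketch. This is my own illustration rather than anything from Guiard's analysis: the page is modeled as an origin and a rotation angle on the desk, both controlled by the nonpreferred hand, and the hypothetical function `to_page_frame` expresses the pen tip's desk position in the page's own coordinates.

```python
import math

def to_page_frame(pen_desk, page_origin, page_angle_deg):
    """Express a pen-tip position (desk coordinates) in the page's own frame."""
    a = math.radians(page_angle_deg)
    dx = pen_desk[0] - page_origin[0]
    dy = pen_desk[1] - page_origin[1]
    # Rotate by -angle to undo the page's orientation on the desk.
    return (dx * math.cos(a) + dy * math.sin(a),
            -dx * math.sin(a) + dy * math.cos(a))

pen = (10.0, 7.0)  # the pen stays at the same spot on the desk

# The nonpreferred hand slides the page upward between lines of text.
line_1 = to_page_frame(pen, page_origin=(10.0, 5.0), page_angle_deg=0.0)
line_2 = to_page_frame(pen, page_origin=(10.0, 6.0), page_angle_deg=0.0)
```

The pen's desk position is identical in both calls, yet it lands on a different line of the page each time. This mirrors the blotter impression in figure 2.15, where writing that fills the whole sheet occupies only a small region of the desk.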

In a related experiment, Athenes [7], working with Guiard, had subjects repeatedly write a memorized one-line phrase at several heights on individual sheets of paper, with the nonpreferred hand excluded (no contact permitted with the sheet of paper) during half the trials. Athenes's study included 48 subjects, including a group of 16 right-handers and two groups of left-handers,2 each again with 16 subjects. Athenes's results show that when the nonpreferred hand was excluded, subjects wrote between 15% and 27% slower, depending on the height of the line on the page, with an overall deficit of approximately 20%. This result clearly shows that handwriting is a two-handed behavior.

Although for consistency I have used handwriting as an example throughout this section, note that Guiard's original analysis is rich with many examples of bimanual acts, such as dealing cards, playing a guitar, swinging a golf club, hammering a nail, or unscrewing a jar cap. Guiard carefully describes how the principles which he proposes apply to a wide range of such examples.

Guiard also provides an overall taxonomy of bimanual actions. The three principles outlined above apply to the bimanual asymmetric class of manipulative actions only. The above principles do not apply to bimanual symmetric motions, such as weight lifting, climbing a rope, rowing, jogging, or swimming. In particular, during locomotion, symmetry seems to be a virtue, since asymmetric motions would result in postural instability (loss of balance). Symmetric motion is also useful to specify scale or extent, as demonstrated by Krueger's two-handed technique for sweeping out a rectangle (fig. 2.7, right) [104]. As a final note, Guiard's principles also do not apply to the communicative gestures which accompany speech (for example, see the work of Kimura [101][102]), or those that are used for sign language. Such gestures are most likely a completely separate phenomenon from the manipulative gestures studied by Guiard.

Looking beyond the hands, one might also apply the Kinematic Chain model to reason about multiple effector systems such as the hands and voice (playing a piano and singing [68]), the hands and feet (operating a car's clutch and stick shift), or the multiple fingers of the hand (grasping a pen). For example, for the task of playing a flute, Guiard argues that the preferred hand organizes its action relative to the nonpreferred hand as usual, but that the mouth (breathing and tongue motion) performs the highest frequency actions relative to the hands. Similarly, one's voice can also be thought of as a link in this hierarchy. Although multimodal two-handed input plus voice input will not be a subject of this dissertation, Guiard's model could offer a new perspective for studying how and when the hands and voice can work well together, and how and when they cannot. In existing studies that I am aware of, the hands and voice have generally been thought of as independent communication channels rather than as mutually dependent or hierarchically organized effectors [17][76][180]. Yet the two are not independent: a pianist can easily sing the right-handed part of a piece of music while continuing to perform the left-handed part, but singing the left-handed part without error is difficult or impossible to achieve, even for simple compositions [68].

2.6.2 Formal experiments

There are few formal experiments which analyze two-handed interaction, and I am not aware of any previous formal experimental work which has studied two hands working in three dimensions. There is an extensive literature of formal analyses of bimanual tasks in the psychology and motor behavior fields; I will briefly review this literature in the context of Chapter 7, "Issues in Bimanual Coordination."

2.6.2.1 Buxton and Myers experiments

A classic study by Buxton and Myers [27] (fig. 2.16) demonstrated that two-handed input can yield significant performance gains for two compound tasks that were studied: a select / position task and a navigate / select task. Their results demonstrated that two-handed techniques were not only easily learned by novices, but also that the two-handed techniques improved the performance of both novice and expert users.

Figure 2.16 Configuration for Buxton and Myers experiments [27].

In the first experiment reported by Buxton and Myers, subjects positioned a graphical object with one hand and scaled it with the other. The task allowed subjects to adopt either a strictly serial strategy (i.e., position first, then scale) or a parallel strategy (position and scale at the same time). Buxton and Myers found that novices adopted the parallel task strategy without prompting and that task performance time was strongly correlated to the degree of parallelism employed.

In a second experiment, subjects scrolled through a document and selected a piece of text. Buxton and Myers found that both experts and novices exhibited improved performance using two hands to perform this task versus using one hand, and they also found that novices using two hands performed at the same level as experts using just one hand.

2.6.2.2 Kabbash experiments

Kabbash, MacKenzie, and Buxton [94] compared pointing and dragging tasks using the preferred hand versus the non-preferred hand. For small targets and small distances, the preferred hand exhibited superior performance, but for larger targets and larger distances, there was no significant difference in performance. Contrary to the traditional view that humans have one "good" hand and one "bad" hand, the authors concluded that "the hands are complementary, each having its own strength and weakness" [94].

A second experiment by Kabbash, Buxton, and Sellen [95] studied compound drawing and color selection in a "connect the dots" task (fig. 2.17). The experiment evaluated the two-handed ToolGlass technique [15]. ToolGlass consists of a semi-transparent menu which can be superimposed on a target using a trackball in the nonpreferred hand. The preferred hand can then move the mouse cursor to the target and click through the menu to apply an operation to the target. Note that this integrates the task of selecting a command (or mode) from the menu and the task of applying that command to objects being edited. In Kabbash's experiment, the ToolGlass was used to select one of four possible colors for each dot.

Figure 2.17 Second experimental task used by Kabbash [95].

Kabbash proposes that two-handed input techniques which mimic everyday tasks conforming to Guiard's bimanual asymmetric class of gestures can produce superior overall performance. Kabbash's results suggest that everyday two-handed skills can readily transfer to the operation of a computer, even in a short interval of time, and can result in superior performance to traditional one-handed techniques. This result holds true despite the benefit of years of practice subjects had with the one-handed techniques versus only a few minutes of practice with the two-handed techniques.

Kabbash also demonstrates that, if designed incorrectly, two-handed input techniques can yield worse performance than one-handed techniques [95]. In particular, the authors argue that techniques which require each hand to execute an independent sub-task can result in increased cognitive load, and hypothesize that consistency with Guiard's principles is a good initial measure of the "naturalness" of a proposed two-handed interaction.

2.6.2.3 Leganchuk's area selection experiment

Leganchuk [108] has adapted Krueger's camera-based techniques (fig. 2.7) [104] to tablet-based input using a pair of wireless devices (fig. 2.18). Leganchuk's experiment studied an "area sweeping" task in which subjects selected an area encompassing a target. This is similar to sweeping out a rectangle to select a set of targets in a graphics editing application. Using both hands allowed subjects to complete the task significantly faster than using just one hand.

For the one-handed technique, Leganchuk also timed how long it took users to switch between the two separate control points used to sweep out the rectangle. Even when this switching time was subtracted from the one-handed times, the two-handed technique was still faster, especially for sweeping out larger, more difficult areas. Thus, Leganchuk argues that the difference in times cannot be attributed to increased time-motion efficiency alone, and interprets this as evidence that the bimanual technique "reduces cognitive load."

Figure 2.18 Experimental task and apparatus used by Leganchuk [108].

2.7 Summary

One thing that stands out in a review of the user interface literature is how few systems have supported two-handed interaction. There are several reasons for this, including cost, lack of support from windowing systems and user interface toolkits, and the inexperience of designers in developing two-handed interfaces. I am aware of only a handful of formal experiments which have studied the nature of two-handed interaction with computers, most of which have been performed in the context of Buxton's Input Research Group at the University of Toronto. To my knowledge there has not yet been any experimental work which has studied two hands working in three dimensions with multiple degree-of-freedom input devices3.

My work is different and new along several dimensions. First, I have applied virtual manipulation to develop a novel interface for volume visualization; the application to neurosurgical visualization is new. Second, much of the previous work has been driven by the technology rather than the real-world needs of an actual user. By working closely with domain experts, I have been able to focus on a real problem and I have been able to contribute not only a description of the design, but also the results of usability tests with real users. Many other efforts have been aimed at general applications without a specific focus or a specific user base.

Finally, most of the prior research has taken either a purely systems-building approach, which boils down to an account of "here is something we built," or it has taken a purely experimental approach, performing psychology experiments which may or may not be relevant to the problems faced by designers. By combining these approaches, my work offers an integrated discussion which is both timely and relevant to design.



1 Please note that every figure in this chapter has been reproduced from another source. Hereafter, a reference number in each figure caption will indicate the source from which the figure has been reproduced or adapted.

2 Athenes used two groups of left-handers because there are at least two distinct left-handed handwriting postures; these postures will be discussed further in section 7.2.1 of Chapter 7, "Issues in Bimanual Coordination."

3 Chris Shaw has performed usability studies on the THRED two-handed polygonal surface design system [150], but he has not performed any experimental analyses of the underlying behavioral issues.



Copyright © 1996, Ken Hinckley. All rights reserved.