
published: April, 2002

© Archives & Museum Informatics, 2002.
Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License

MW2002: Papers

Towards Tangible Virtualities: Tangialities

Slavko Milekic, M.D., PhD, The University of the Arts, Philadelphia, USA

Abstract

The rapid proliferation of interaction devices that use more natural channels (voice, touch, gesture) for interfacing with the digital medium illustrates the trend towards, and the need for, more 'humane' interaction mechanisms. The current historical paradox, however, is that modern technological advances are dramatically ahead of our understanding of their possible uses and meaning on a conceptual level.

In this paper I present an overview of some of the currently available interaction technologies and the conceptual barriers that limit their use, and make the case for interaction mechanisms that render abstract (virtual) information more tangible.

Introduction

The meanings associated with the adjective "tangible" in an on-line version of the Merriam-Webster dictionary (see references for URL) include:

1 a : capable of being perceived especially by the sense of touch : PALPABLE b : substantially real : MATERIAL

2 : capable of being precisely identified or realized by the mind <her grief was tangible>

The listed synonym for "tangible" is PERCEPTIBLE, which, in turn, has the following synonyms: SENSIBLE, PALPABLE, TANGIBLE, APPRECIABLE, PONDERABLE. The evolution of the term, starting with sense percepts related to the sense of touch and ending with precise mental identification and realization of abstract concepts (like 'grief' in the definition above), corresponds, more or less, to my view on this topic. In this paper, I would like to make the case that associating virtual and abstract information with multimodal sensory experiences creates a new layer of knowledge and action spaces that is more natural and efficient for humans. These in-between domains, where interactions with virtual data produce tangible sensations, I have dubbed tangialities (see Figure 1).

Figure 1. Representation of different action/knowledge domains

Please note that, as I define it, the term tangiality includes all sensory modalities and is not restricted to those related to the sense of touch (haptic, cutaneous, tactile).

Our body may be considered the first interface between ourselves and the real world. Interactions were guided by our goals (intentions), carried out through actions, and repeated or corrected based on our perception of the consequences of those actions (observations).

Body actions were soon enhanced through the use of tools (artifacts). Norman (1991) introduced the concept of a cognitive artifact: a tool that enhances cognitive operations. Although the enhancement of body actions is sometimes achieved by sheer magnification (using a lever or an inclined plane), the enhancement of cognitive operations is most often the consequence of changing the nature of the task. An example of a cognitive artifact that enhances our ability to memorize and recall events is a personal calendar. Instead of trying to rehearse and memorize all of the events for weeks to come, we only have to remember to write them down in the calendar and to consult it every day. In the context of tangialities, cognitive enhancements are also the consequence of changing the nature of the task. Most often the change is a shift from relying on formal, abstract operations as a means of gaining knowledge to direct manipulation of data (properties) with instantaneously observable results. For example, to answer the question illustrated in Figure 2 ("are the dimensions of the smaller cubes exactly one-half of the larger one?"), we may use conventional knowledge of algebra and solve the problem formally. The direct manipulation approach would be simply to juxtapose two smaller cubes next to the larger one, and the answer becomes self-evident. Note that the formal solution can be made harder by choosing different dimensions (for example, the height of the big cube could be 8.372914), but this does not affect the direct manipulation solution.

Figure 2. Getting the result using abstract operation or direct manipulation
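
To make the contrast concrete, here is a minimal sketch (in TypeScript) of what the formal route amounts to: a numeric comparison of the edge lengths. The halved value and the tolerance are illustrative assumptions; the direct manipulation route requires no computation at all.

```typescript
// Sketch of the formal (abstract) solution: compare twice the small cube's
// edge length with the large cube's edge length. The value 8.372914 comes
// from the text; the matching small edge and the tolerance are assumptions.
function isExactlyHalf(smallEdge: number, largeEdge: number): boolean {
  const tolerance = 1e-9;                         // allow for rounding error
  return Math.abs(2 * smallEdge - largeEdge) < tolerance;
}

// isExactlyHalf(4.186457, 8.372914) -> true
// The direct manipulation alternative: stack two small cubes next to the
// large one and simply look - no arithmetic required.
```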

I would like to add another word of clarification. For the purposes of this paper, I am not going to address a very fruitful area of research often referred to as "tangible interfaces" (see, for example, Ullmer & Ishii, 2000, and Patten, Ishii, Hines & Pangaro, 2001). The cornerstone of this approach is the use of real objects with desired physical (manipulable) properties as data representations/containers. These objects embody computation regardless of whether or not they are connected to a computer. Although the areas of development of tangible interfaces and tangialities overlap, and will probably merge in the future, in this paper I will focus on procedures that endow data representations with tangible properties and thus make manipulations carried out on data available to our senses.

Direct Manipulation

The first human-computer interfaces were abstract, efficient and accessible only to expert users. They involved learning the vocabulary and syntax of a command language, which was then used to initiate operations on digitally stored data; often one needed to issue a separate command to see the results of the previous one. There was no continuity of interaction - once a command was issued, there was no way of interfering with the process (short of aborting it). There was also no sensory feedback that would provide relevant information about the operation on an experiential level.

One of the first examples of a tangiality domain was the introduction of the Graphical User Interface (GUI) and the concept of direct manipulation. "Direct manipulation", a somewhat misleading term, was introduced by Ben Shneiderman in 1983 to describe what we take today to be an integral part of human-computer interaction - the use of a mouse (cursor) for pointing at and manipulating graphically represented objects. The crucial characteristics of direct manipulation are: a) continuous visibility of the manipulated object; b) all the actions carried out on the objects are rapid, incremental and reversible; and c) the consequences of actions are immediately visible (Shneiderman, 1983, 1998).
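
As a minimal sketch of these three characteristics, the TypeScript fragment below (using standard DOM pointer events) keeps a dragged element continuously visible, applies small reversible position changes, and renders the consequence of each movement immediately. The element id "card" and its absolute positioning are assumptions made for illustration.

```typescript
// Minimal direct-manipulation sketch: an absolutely positioned element with
// id "card" (an assumption) follows the pointer, so every small movement is
// continuously visible, immediately rendered, and trivially reversible.
const card = document.getElementById("card") as HTMLElement;

let dragging = false;
let offsetX = 0;
let offsetY = 0;

card.addEventListener("pointerdown", (e: PointerEvent) => {
  dragging = true;
  offsetX = e.clientX - card.offsetLeft;
  offsetY = e.clientY - card.offsetTop;
  card.setPointerCapture(e.pointerId);            // keep receiving move events
});

card.addEventListener("pointermove", (e: PointerEvent) => {
  if (!dragging) return;
  // Incremental update: the consequence of the action is shown on every
  // pointer movement, not after a separately issued command.
  card.style.left = `${e.clientX - offsetX}px`;
  card.style.top = `${e.clientY - offsetY}px`;
});

card.addEventListener("pointerup", () => {
  dragging = false;                               // releasing ends the action
});
```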

What makes direct manipulation a tangiality domain is that it provides continuous sensory feedback (visual, plus kinesthetic feedback from the hand on the mouse) while one acts on abstract parameters (like location coordinates, adjacency, or parallelism) of digitally represented data. Even though the output in direct manipulation depends on a single sense (vision), it truly revolutionized human-computer interaction.

"Direct manipulation". Notice that in spite of the term the movements of the mouse are (indirectly) mapped to the screen coordinates.
Figure 3. "Direct manipulation". Notice that in spite of the term the movements of the mouse are (indirectly) mapped to the screen coordinates.

Suddenly, anyone who could see and make hand movements could use the computer. However, it is the very success of the direct manipulation paradigm that is now one of the obstacles to creating even more efficient interfaces.

Problems with the Traditional Interface

As Malcolm McCullough aptly put it in his book "Abstracting Craft" (McCullough, 1996), one of the problems with the traditional "point-and-click" interface was the increased separation of the hand and the eye. In performing operations on digital data, the eye was given the major role of identifying, focusing, monitoring and interpreting, while the hand was reduced to performing simple repetitive gestures. The fact that a rising number of repetitive motion injuries is associated with individuals working with computers indicates that this analysis is not just a handy metaphor. Shifting control to the eye did not bring any benefits either. Besides its physiological role in finding and interpreting visual cues, in the traditional GUI the eye is forced to play the role of the blind person's white cane for the hand. All of the cursor guidance and positioning, often demanding single-pixel precision (in graphics programs), is done under the guidance of the eye. This leads to overstraining of this sensory channel, to the point that computer operators ‘forget’ to blink, ultimately developing chronic dry conjunctivitis.

In our other daily activities we rarely depend on single-sense feedback. Take, for example, the simple act of putting a pencil on the table. Although the eyes are involved, their role is more general – seeing whether the surface of the table is within our reach, and whether it is clear of other objects. The actual act of putting the pencil on the surface is guided more by cutaneous and proprioceptive cues, and augmented by discreet but definite auditory cues. This multimodal and complementary feedback is the reason our daily actions do not cause overstraining of any particular sense.

Introducing Multimodal Interaction

Although the term multimodal interaction encompasses both input and output, I will focus more on ways of making the output in human-computer interaction more tangible by using different sensory channels. This should not be taken as an indication that multimodal input (for example, using both speech and gestures) is of lesser importance. It just introduces another level of complexity that goes beyond the topic of this paper.

The value of multimodal output (feedback) has been recognized in HCI design and is currently most widespread in the coupling of actions to sounds. Sound production is a standard part of modern computer systems, and it is astonishing that the value of consistent sound feedback was not recognized earlier and integrated into interface design guidelines. By coupling actions to sounds I do not mean the often exotic "sound schemes" featuring drum-rolls for window closings and alarm-clock sounds for warning messages. An example of consistent auditory feedback is the discreet "click" associated with the opening of a new window in Microsoft Internet Explorer. The sound is so discreet that many users don't consciously perceive it, yet they immediately notice its absence when they switch to another browser.
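
A small, hypothetical sketch (TypeScript) of this kind of coupling: one short, quiet sound is attached to one action and played consistently every time that action occurs. The audio file name and the openPanel function are assumptions for illustration, not the browser behaviour described above.

```typescript
// Hypothetical sketch: couple a single, consistent auditory cue to one user
// action (opening a panel), rather than an elaborate "sound scheme".
// The file name and openPanel() are illustrative assumptions.
const openCue = new Audio("subtle-click.wav");  // short, discreet cue
openCue.volume = 0.2;                           // quiet enough to stay unobtrusive

function openPanel(panel: HTMLElement): void {
  panel.hidden = false;                         // the visible consequence
  void openCue.play();                          // the same cue, every time
}
```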

Another indicator of this trend is the introduction of a mouse (the Logitech iFeel™ mouse) that allows the user to "feel" different objects (for example, folders) and actions (dragging, scrolling) in traditional GUI interactions. Haptic feedback is provided by a vibro-tactile unit in the mouse and can be finely tuned to fit individual preferences. Although the additional information initially seems trivial and meager (a series of vibration patterns), it quickly becomes evident that it significantly increases the comfort of interactions (Figure 4). With complementary information about cursor location or action, the user does not have to rely as much on an already overstrained sense of vision. The difference in experience can be compared to the difference between typing on a standard keyboard, where every key-press is accompanied by a tactile, kinesthetic and auditory cue, and typing on keyboards where the keys are merely outlined on a touch-sensitive surface and provide no specific feedback. Sales figures seem to confirm the benefits of haptic technologies to the user – in the first year after they were introduced, Logitech sold a quarter of a million iFeel™ mice (quoted from the Immersion TouchSense™ Web site).

Figure 4. Movement of the iFeel mouse (cursor) over the table cells is augmented by haptic output produced by a vibro-tactile unit in the mouse. Although the stimulus is fairly discreet, it significantly reduces the load on the visual sense during table input or selections using pull-down menus.

The manifest increase in the quality of the interaction experience is making devices that use haptic or force feedback a standard in the area of computer games. A quote from Biggs and Srinivasan illustrates the use of haptic interfaces in PC game playing:

Active haptic interfaces can improve a user’s sense of presence: Haptic interfaces with 2 or fewer actuated degrees of freedom are now mass-produced for playing PC videogames, making them relatively cheap (about US$100 at the time of this writing), reliable, and easy to program. Although the complexity of the cues they can display is limited, they are surprisingly effective communicators. For example, if the joystick is vibrated when a player crosses a bridge (to simulate driving over planks) it can provide a landmark for navigation, and signal the vehicle’s speed (vibration frequency) and weight (vibration amplitude).
(Biggs and Srinivasan, 2001)
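
A rough sketch of the mapping described in the quote: vibration frequency conveys the vehicle's speed and vibration amplitude conveys its weight. The numeric ranges below are illustrative assumptions, and the result is a plain data structure rather than a call to any particular device driver.

```typescript
// Hypothetical mapping of vehicle state to joystick vibration, after Biggs &
// Srinivasan: frequency encodes speed, amplitude encodes weight. The ranges
// are illustrative assumptions, not values from any real device.
interface VibrationCue {
  frequencyHz: number;  // how fast the joystick pulses (encodes speed)
  amplitude: number;    // 0..1, how strongly it pulses (encodes weight)
}

function bridgeCue(speedKmh: number, weightKg: number): VibrationCue {
  const speed = Math.min(Math.max(speedKmh, 0), 200);      // clamp to 0..200 km/h
  const weight = Math.min(Math.max(weightKg, 500), 5000);  // clamp to 0.5..5 t
  return {
    frequencyHz: 5 + (speed / 200) * 45,                    // 5..50 Hz with speed
    amplitude: 0.2 + ((weight - 500) / 4500) * 0.8,         // 0.2..1.0 with weight
  };
}

// Crossing the bridge at 90 km/h in a 2-tonne vehicle:
// bridgeCue(90, 2000) -> { frequencyHz: ~25.25, amplitude: ~0.47 }
```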

An important area where multimodal sensory feedback plays a crucial role is affective computing. The pioneer of research in affective computing, Rosalind Picard (Picard, 1997), suggests a number of possible applications where the affective state of a human user becomes accessible to a computer, or to another remote user. One of these applications is TouchPhone, developed by Jocelyn Sheirer (described in Picard, 2000), in which the pressure that a participant in a phone conversation applies to the headset is transmitted to the other party's computer screen as a color range - blue corresponding to slight pressure and red corresponding to the maximum pressure value. Inspired by TouchPhone, I dared to imagine a more sophisticated model in which haptic information is transmitted back and forth by holding the other party's simulated "hand" while carrying on the conversation (Figure 5).
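
A rough sketch of the blue-to-red mapping described for TouchPhone: measured pressure, normalized against a maximum value, is interpolated between blue (slight pressure) and red (maximum pressure). The normalization and the linear RGB interpolation are my assumptions for illustration, not Sheirer's actual implementation.

```typescript
// Hypothetical pressure-to-color mapping in the spirit of TouchPhone:
// slight pressure -> blue, maximum pressure -> red. The linear RGB
// interpolation is an illustrative assumption, not the original design.
function pressureToColor(pressure: number, maxPressure: number): string {
  const t = Math.min(Math.max(pressure / maxPressure, 0), 1); // normalize to 0..1
  const red = Math.round(255 * t);
  const blue = Math.round(255 * (1 - t));
  return `rgb(${red}, 0, ${blue})`;   // shown on the other party's screen
}

// pressureToColor(0.25, 1.0) -> "rgb(64, 0, 191)"   (mostly blue)
// pressureToColor(1.0, 1.0)  -> "rgb(255, 0, 0)"    (pure red)
```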

Figure 5. By "holding hands" one can carry out both verbal and non-verbal parts of a conversation. It remains to be seen whether the actual emotional "value" to the users is high enough for this invention to be economically viable. (photo and hand simulation by S. Milekic, modeled by M. Lengauer)

Besides making abstract data manipulations tangible and accessible to humans, and providing a channel for affective communication, complementary sensory feedback is becoming a method of choice for illustrating complex physical interactions, especially for providing feedback while operating complex machinery or vehicles. The paradox here is that the human operator is most often not directly exposed to the relevant physical changes; these are made available by translating numerical data into a sensory experience readily interpretable by a human. The value of adding simulated sensory information to a real-world task is beautifully illustrated by an example provided by cognitive scientist David Kirsh:

This odd situation which digital technology creates is nicely portrayed by the way modern airplanes rely on simulations of the feel of flying to improve the control of pilots. Apparently, jets fly faster if their center of mass is moved closer to the plane’s nose, thereby changing the relative position of the center of mass with respect to the center of lift. The trouble is that in moving the center of mass forward there is an increased chance that the plane will tip into a nose dive. To keep the plane flying on this knife edge the speed and sensitivity of adjustments is so great that pilots can no longer use mechanical means to control their planes. To assure fast enough response such jets now rely on digital networks to relay a simulated feel to the pilots. When a pilot pulls up on his steering wheel the computers inside the plane simulate the resistance of the ailerons delivering to the pilot the haptic information he or she needs to know what they are doing. Small computer adjustments augment and speed up these pilot reactions. To the pilot this force feedback is an integral part of the way he or she flies the plane. But, of course, there is no true resistance in the steering wheel. Pulling harder on the wheel is just a way of sending the number 7 to the wing actuators instead of the number 5. Computation is so irremediably built into planes that pilots could be in simulators. (Kirsh, 2001)

There is yet another area, academically not well researched, that has readily embraced (no pun intended) the prospect of multisensory interaction: the vast domain of on-line sex. As MSNBC reports, there already exist a number of (multisensory) products belonging to the new "cyberdildonics" area, as well as full "cyber-sex suits" that allow a wide variety of tactile sensations to be experienced on strategic body parts (Brunker, MSNBC online).

Barriers

While it may seem that all arguments are in favor of building tangialities, it is worth investigating the barriers and problems associated with this approach. As I see them, there are some closely related general problems that are listed here separately only for the sake of clarity. These are:

  • the success of the "drag-and-drop" and "point-and-click" interface;
  • the failure to realize that changing to, or adding, another interaction device calls for a redesign of the GUI;
  • the need to un-learn established (imposed) conventions.

Although it may seem paradoxical that an early solution would block the introduction of more advanced ones, historically this is a common occurrence. The QWERTY keyboard arrangement became the standard for typewriters, and continued to be used even for computer keyboards, where any conceivable keyboard layout is equally accessible. In the same way, the very success of the mouse/cursor point-and-click interface made it a de facto standard, closely associated with the very concept of what it means to interact with a computer. This standardized notion of what an interface should look like has adversely affected the introduction of any new interaction device. For example, using a touchscreen as an interaction device while preserving the traditional GUI creates huge problems for the user because of the discrepancy in the scale of objects necessary for comfortable interaction. An icon with dimensions of 8 x 8 pixels (for example, a window-closing box) is perfectly acceptable for the single-pixel active tip of the cursor, but is definitely inappropriate for finger interaction. In environments adequately scaled for finger interaction, touchscreens have been shown to be superior pointing devices (Sears & Shneiderman, 1991). The same holds true for other interaction devices: the introduction of continuous speech recognition calls for adequate feedback that verbal information has been successfully transmitted; the use of haptic mice and joysticks introduces texture as a GUI design element; and so on.
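
A small illustrative sketch of the scale problem: the minimum comfortable target size depends on the pointing device, so a layout designed around a pixel-precise cursor cannot simply be reused on a touchscreen. The 8-pixel figure comes from the text above; the finger-sized value is an assumption (roughly a fingertip-sized area at typical screen densities), not a published guideline.

```typescript
// Illustrative sketch: minimum comfortable target sizes per pointing device.
// The 8 px value is the window-closing box from the text; the touchscreen
// value is an assumed fingertip-sized target, not a published guideline.
type PointingDevice = "mouse" | "touchscreen";

const MIN_TARGET_PX: Record<PointingDevice, number> = {
  mouse: 8,          // acceptable for a single-pixel active cursor tip
  touchscreen: 40,   // assumed: roughly fingertip-sized, about 5x larger
};

function scaleTargetForDevice(sizePx: number, device: PointingDevice): number {
  // Preserving the traditional GUI unchanged is the problem; at minimum,
  // every interactive target has to grow to the device's comfortable size.
  return Math.max(sizePx, MIN_TARGET_PX[device]);
}

// scaleTargetForDevice(8, "mouse")       -> 8   (close box stays as it is)
// scaleTargetForDevice(8, "touchscreen") -> 40  (close box must grow ~5x)
```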

Another general problem that has to be taken into account when introducing new ways of interacting is the need to un-learn adopted conventions. This can be a very slow process, and a transitional stage should be part of the design of any new convention (its absence is often the reason new conventions fail).

More specific problems with the introduction of multimodal feedback stem from our lack of knowledge of the complexities of multimodal interaction. Simply adding another channel to human-computer interaction is not beneficial by default. A logical and commonsensical analysis tells us that additional information presented through another channel (like any other information) may fall into the following categories:

  • conflicting
  • competing
  • redundant
  • complementary

In the above list, only the last category has a beneficial effect on interaction. In the following paragraphs I will provide examples of different types of multimodal information.

Conflicting

Conflicting information is often the consequence of hardware/bandwidth limitations, as is the case with the lack of audio/video synchronization in streamed Web videos. Sometimes it is a product of poor design; for example, when an animated character's mouth movements do not match the actual utterances. Conflicting information is sometimes purposefully designed into an application. An example is Web site designs that try to keep the user "glued" to the site in order to artificially boost ratings.

Competing

An example of competing multimodal information is a voice overlay that is not synchronized with the printed text one is trying to read. Animated graphics (GIFs) on Web sites are another example of information competing for the same sensory modality (visual) and claiming a share of cognitive resources. An example from everyday life is the effect that carrying on a phone conversation has on driving ability.

Redundant

Redundant multimodal information is information conveyed through another sensory channel that does not increase the total amount of information about the interaction, but also does not adversely affect it.

Complementary

Complementary multimodal information is information conveyed through another sensory channel that does increase the total amount of information received and has a beneficial effect on interaction. This effect can be manifested as an increase in the efficacy of interaction or a decrease in the number of errors. This is the only instance in which the bandwidth of the human-computer information channel is increased by the engagement of another channel.

Case for Building Tangialities

In conclusion, one can make the case for building tangialities for the following reasons:

  • widening the bandwidth of the human/computer communication channel;
  • adding an affective dimension to interaction;
  • allowing the grasping and manipulation of complex concepts without the need for explicit formalization;
  • reducing cognitive load through the use of intuitive bodily (biological) knowledge;
  • reducing the strain on a single sense (vision) - single-sense fatigue;
  • the possibility of adding another dimension to metadata – "how does it feel" (being able to feel the texture of paintings and other, otherwise "untouchable", objects).

Another significant use of tangible descriptions of data, and of the results of data manipulations, is in making the digital domain more accessible to populations with special needs. There are already some promising results in this area (Yu, Ramloll and Brewster, 2001; Gouzman, Karasin and Braunstein, 2000).

Currently we lack a satisfactory theory of multimodal interaction, and most often arrive at usable results through trial and error. It is evident that such a theory will have to come from interdisciplinary efforts bridging disciplines as diverse as neurophysiology, tele-robotics and computer science.

References

Biggs, J., Srinivasan, M.A. (2001) Haptic Interfaces, in Stanney, K.M. (ed.) Handbook of Virtual Environment Technology, Lawrence Erlbaum Associates, Inc.

Brewster, S.A. (2001) The Impact of Haptic 'Touching' Technology on Cultural Applications, in Proceedings of EVA 2001 (Glasgow, UK), Vasari UK, s28, 1-14, also available on-line: http://www.dcs.gla.ac.uk/~stephen/papers/EVA2001.pdf

Brunker, M. (2001) Sex toys blaze tactile trail on Net, MSNBC News online article: http://www.msnbc.com/news/318124.asp?cp1=1

Gouzman, R., Karasin, I., Braunstein, A. (2000) The Virtual Touch System by VirTouch Ltd: Opening New Computer Windows Graphically for the Blind, Proceedings of "Technology and Persons with Disabilities" Conference, Los Angeles March 20-25, 2000, on-line: http://www.csun.edu/cod/conf2000/proceedings/0177Gouzman.html

Immersion TouchSense™ technology, consulted on line 24/3/02: http://www.immersion.com/products/ce/generalmice.shtml

Kirsh, D. (2001) Changing the Rules: Architecture in the New Millennium, Convergence

Logitech, http://www.logitech.com

McCullough, M. (1996) Abstracting Craft: The Practiced Digital Hand, The MIT Press, Cambridge, MA

Merriam-Webster Dictionary consulted on line 2/28/02: http://www.m-w.com/home.htm

Norman, D. (1991) Cognitive Artifacts, in Carroll, M.J.(ed.) Designing Interaction: Psychology at the Human-Computer Interface, Cambridge University Press

Oakley, I., Brewster S.A. and Gray, P.D. (2000). Communicating with feeling. In Proceedings of the First Workshop on Haptic Human-Computer Interaction, 17-21

Patten, J., Ishii, H., Hines, J., Pangaro, G. (2001) Sensetable: a wireless object tracking platform for tangible user interfaces, in Proceedings of the SIGCHI conference on Human factors in computing systems 2001, Seattle, Washington 253-260, ACM Press, NY

Picard, R.W. (1997) Affective Computing, The MIT Press, Cambridge, MA

Picard, R.W. (2000) Toward computers that recognize and respond to user emotion, IBM Systems Journal, Vol. 39, 705-717

Roy, D. (2000) Learning from multimodal observations. Proc. IEEE Int. Conf. Multimedia and Expo (ICME), New York, NY, consulted on line 3/4/02: http://dkroy.www.media.mit.edu/people/dkroy/papers/pdf/ieee_multimedia2000.pdf

Sears, A., Shneiderman, B., (1991) High precision touchscreens: design strategies and comparisons with a mouse, International Journal of Man-Machine Studies 34, 593-613

Shneiderman, B. (1983), Direct Manipulation: A step beyond programming languages, IEEE Computer, 16, 8, 57-69

Shneiderman, B. (1998) Designing the User Interface: Strategies for Effective Human-Computer Interaction, Addison-Wesley

Ullmer, B., Ishii, H. (2000) Emerging Frameworks for Tangible User Interfaces, IBM Systems Journal, Vol. 39, No. 3, 915-931

Yu, W., Ramloll, R., Brewster S.A. (2001) Haptic graphs for blind computer users, in Brewster, S.A. and Murray-Smith, R. (Eds.) Haptic Human-Computer Interaction, Springer LNCS, Vol 2058, 41-51.