The National Academies Press

Currently Skimming:

4 Design and Evaluation
Pages 121-153

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.

From page 121... ... 4 DESIGN AND EVALUATION: Designing any sort of computer-mediated device for ordinary people for effective and pleasant everyday use has proven to be surprisingly difficult. The evidence for this observation comes from the myriad problems cited above in this report and at the workshop organized by the steering committee, from systematic empirical studies cited in this chapter, and from anecdotes involving frequent complaints from ordinary people when they are required to use the currently most common public-oriented application (telephone-based voice response menu systems) as well as from more sophisticated users of the World Wide Web concerning the complexities and frustrations that have led as many to abandon the online life as to join it. Read the entire page →
From page 122... ... Another example comes from the digital libraries context and relates to the Cypress on-line database of some 13,000 color images and associated metadata from the Film Library of the California Department of Water Resources (Van House, 1996). Iterative usability testing led to improvements for two groups of users, a group from inside the film library and a more diverse and less expert group of outsiders. Read the entire page →
From page 123... ... The situation can become analogous to providing the cockpit control panel of an airliner for use by its passengers to turn on their reading lights. The consequences in computing range from the proliferation of features in software products to observations that most amateur spreadsheets contain serious errors, and that employee hand-holding costs as much as hardware for business personal computer users, to additional but seldom-used features on standard computer keyboards. The concept of multimodal interfaces that would accommodate alternative approaches to input and/or output, discussed in Chapters 2 and 3, will introduce considerable complexity into the technology development process, without adding any new functional features. Read the entire page →
From page 124... ... The irony is that in some cases (e.g., early cellular phones, personal computer software), a significant amount of complexity appears to derive from software and sometimes hardware added with the intention of "enhancing" usability. Emphasis on Users with Unusual Abilities: Computer-mediated tools emphasize individual differences in ability more than do traditional technologies. Read the entire page →
From page 125... ... One guess comes from studies of the efficiency gains expected for computer applications to common business tasks. From the sparse available data, Landauer (1995) Read the entire page →
From page 126... ... The challenge is to design so as to exploit the potential power for ease of learning and use as well as for increased functionality. Discontent with proliferating features contributed to mid-1990s experiments with so-called network computers, with fewer features than conventional personal computers, as well as to periodic articles in the business press about the persistently high costs of owning and using personal computers. Several members of the steering committee and reviewers of a draft of this report wondered whether the low efficiency gains and large individual differences found in studies in the 1980s may have been overcome by technological advances in the 1990s. Read the entire page →
From page 127... ... For example, this approach will require significant innovation in system instrumentation and user sampling techniques because, as outlined later, the untutored opinions of programmers and other power users are usually of little value for detecting the functionality and usability problems that are important for ordinary people (Nielsen, 1993). Tracking such efforts in broad-based user involvement and assessing their effectiveness might provide a productive starting place for research on large-scale participatory design and evaluation methods. Read the entire page →
From page 128... ... Not only are software specialists typically more experienced with the technology, but they are also, in general, quite different from the average user in the characteristics and abilities currently needed to deal effectively with computers: youth, mechanical and mathematical interests, good spatial memory, verbal fluency, and logical ability. They also tend to be less socially and pragmatically oriented in personality (Tognazzini, 1992). Read the entire page →
From page 129... ... In short, evidence discussed at the workshop indicates that an organized design and development process that ensures that the needs and abilities of potential average citizen users will be well taken account of has not yet become standard practice in software to nearly the extent that it has in the manufacture of most other mass-market products. Workshop discussions among technical experts and social scientists knowledgeable about specific population segments attested to the diversity of needs, reactions, and other qualities within the population as well as the uneven appreciation for that diversity. Read the entire page →
From page 130... ... [Figure: plot comparing SQL and a New Query Language; horizontal axis: Logical Reasoning Ability (Percentile).] Read the entire page →
From page 131... ... Several usability specialists at the workshop reported routinely advising designers to provide as many functions, features, and options as will be useful, feasible, and in demand by experts, but to "hide" them from users who want only basic functions by, for example, retaining the simplicity of short menus that emphasize only the best general functions and offer the option of selecting an "advanced functions" button for access to special features. The second approach is, of course, to increase the sophistication of users through education, training, and access to good guides and manuals (e.g., "training wheels" and "minimal manual" techniques, and scaffolded and staged advancement) Read the entire page →
From page 132... ... Sometimes the evaluation was done by systematic experiments in laboratory settings, sometimes by careful examination of the interface and dialogue by two or more experienced usability experts, sometimes by detailed examination of usage logs, sometimes by analysis of videotapes of users working, sometimes by informal observation of users or by asking users to talk about what they were doing. In one telling example, as recounted at the workshop by John Thomas, researchers at NYNEX used a simulation model to estimate and measure the work efficiency of a new graphical user interface intended to improve the efficiency of a computer system for use by thousands of employees. Read the entire page →
From page 133... ... Comprehensive usefulness and usability evaluation of entire three-dimensional user interfaces, for example, remains to be undertaken (Herndon et al., 1994). TOO LITTLE USE OF KNOWN METHODS: While more empirical evaluation of usefulness and usability is being done, especially by large producers on major products, the quantity and intensity of such research is still very slight relative, for example, to the mechanical testing of auto engines, wind tunnel testing of airframes, cycle speed and reliability testing of computer chips, or code correctness and performance testing of computer programs. Read the entire page →
From page 134... ... Professional societies would have a role or at least a position on possible mechanisms. Another suggested approach was to get appropriate evaluation activities better specified in development process standards and their certifications. Read the entire page →
From page 135... ... Other workshop participants, including especially social psychologists and human factors specialists, were of the opinion that, while such assessments are more difficult and less sure than, for example, comparative evaluation of existing and widely used technologies, methods already exist that can give good early hints. Some methods are associated with scholarly research; some are associated with market research (much of which draws on social science techniques, such as conjoint analysis, using analysis of statistical variance to explore trade-offs, and constant sum point assignment to assess priorities) Read the entire page →
From page 136... ... Applied in 1870, a task analysis approach might have discovered that people spent a good deal of time writing letters in order to pass small bits of news or to obtain short answers: reports of sickness, requests for prices, dinner invitations, social maintenance greetings. An analyst might have gone on to count the number of occasions in which circumstances arose that would be well served by different means of communication if they were available and, if clever, would have noted that people often took long periods out of demanding days to walk miles to merely chat with friends, that occasionally runners were sent in all possible haste to fetch midwives or relatives. Read the entire page →
From page 137... ... or asking them, where "them" is both ordinary potential users and usability experts, what they think of what the system does and has to offer (Lund and Tschirgi, 1991). At the workshop, Bruce Tognazzini showed the "Starfire" vision video produced by Sun Microsystems. Read the entire page →
From page 138... ... This view, perhaps influenced by the "input-process-output" paradigm, has strongly influenced the design and evaluation of user interfaces. Not surprisingly, that orientation yielded a substantial body of information about the significance of individual differences in ability and prior experience for ease of use and judged and measured usefulness of alternative system designs as well as guides for improved research and evaluation (see above) Read the entire page →
From page 139... ... , and intelligent agents interact in a built environment (or both), it is likely that new or significantly improved design and evaluation methods will be needed to make such interchange accessible to everyday citizens. Read the entire page →
From page 140... ... Instead, the inference to be drawn is that taking into account the social nature of most citizens' everyday activities and the range of actors and contexts involved makes advance use of task analysis, performance analysis, focus group discussion, and rapid prototyping both more important and more difficult. Read the entire page →
From page 141... ... They also illustrate how broader usability involves not only the user interface per se but also the social context and the overall service and what these imply for interfaces. Although the more recent growth in sales and apparent use of groupware products such as Lotus Notes suggests that progress has been made, the evolving NII also raises the prospect of far larger numbers of people interacting than has been experienced to date. Read the entire page →
From page 142... ... In the first place, whatever methods are chosen, they will need to involve trial users playing the interdependent roles relevant to the application in sufficient number and over sufficient time to exercise, assess, and redesign the varied functions that the application is supposed to support. This consideration by itself suggests that iterative test and redesign of groupware may take longer and cost more than the same methods applied to independent applications if sufficient ingenuity is not brought to bear on the evaluation methods, for example, by embedding usefulness and usability analysis in the instrumentation of experimental designs offered over the World Wide Web. Read the entire page →
From page 143... ... Several workshop participants noted that once one moves beyond a focus on personal computers as the access device and considers all manner of devices (telephones, television remote controls, and so on, as well as embedded systems), the problems and opportunities add up to a very large set. Inherent Unpredictability of Use: For reasons both practical and theoretical, predicting the performance of social applications in real-world use on the basis of prior research is inherently difficult. Read the entire page →
From page 144... ... Specific tasks tend to be tightly delimited and jobs of the performers in typical studies depend on their use of the system; in the NII, in contrast, there is a huge variety of tasks, a huge variety of users, and the users have more choice in what they do and how. Walter Feurzeig, of BBN, argued that it is nevertheless difficult to consider user interfaces independent of specific activities. Read the entire page →
From page 145... ... Involve Representative Users in Substantive Design and Evaluation Activity Early and Often: Participatory design is difficult to arrange, as noted above, and so more likely to be slighted. The goal is to understand how interfaces to connected communities may prove more than skin deep, how they may affect how we locate and remain aware of one another and find shared information, as well as how we understand, enact, and track our roles in group activities, recover from errors, merge our work with others, and so on. Read the entire page →
From page 146... ... One obvious approach is to conduct field trials with smaller than universal, but still representative, population samples; this procedure is as yet seldom followed. Often, as workshop participants noted, experts (both system designers and such specialists as speech or occupational therapists) may play the role of representative users; sometimes a think-aloud approach is used in which users comment on their experiences as they use a system. Read the entire page →
From page 147... ... This is the principle behind new efforts to conduct product beta-tests via the Web, as noted earlier. Given the desirability of involving greater numbers of representative users in application design and evaluation as well as field trials and implementation, and given the capability of networked systems to enable both the provision of usable prototypes and the collection of user feedback, it would be desirable to explore options for leaving applications intentionally underdesigned, to be adaptively developed as they are implemented in contexts of use (see Box 4.2). Read the entire page →
From page 148... ... Although a few comparative studies have been made of some of the different methods in use (user testing, heuristic evaluation, cognitive walkthroughs, scenario analysis, ordinary and video ethnography), these studies have not reached any unequivocal conclusions; indeed, there is active controversy about their relative advantages. This is an area in Read the entire page →
From page 149... ... It is often mystifying to usability professionals that testing is resisted as strongly as it is and that calls for doing principle-based design are so frequent in this arena, when practitioners and managers concerned with other complex dynamic systems (even electronic circuits and software) can easily see the need and strongly support empirical methods. Read the entire page →
From page 150... ... During its development, it was subjected to an exemplary application of formative evaluation and involving nearly daily user testing and redesign. Moreover, the graphical user interface (GUI) Read the entire page →
From page 151... ... As prior research has found, user test results showed a gain of approximately 50 percent in user task performance efficiency as a direct result of usability engineering activities. Several lessons can be taken from this and recent, similar reports. Read the entire page →
From page 152... ... Also of interest, about one-sixth of the papers at CHI96 were directed toward network interface applications, and another sixth were about research on general interface components that might be used in the future, the kind of science research toward principled design many workshop participants thought should be better encouraged. As mentioned above, it could be hypothesized that greatly increased beta testing made possible by World Wide Web dissemination of software has reduced the need for explicit evaluation. Read the entire page →
From page 153... ... She reports estimates that 27 percent ($3,510) of the $13,000 annual cost of a networked personal computer goes for providing technical support to the user, and writes, "There's a Parkinson's Law in effect here: computer software grows to fill the expanded hardware. Read the entire page →
