Copyright © 2009. National Academy of Sciences. All rights reserved.
PART I.
Issues of Theory and Methodology
Human Performance Research: An Overview
Monica J. Harris and Robert Rosenthal
Harvard University
Table of Contents
Interpersonal Expectancy Effects
  Definition
  Evidence for Expectancy Effects
  Methodological Implications of Expectancy Effects
Mediation of Interpersonal Expectancy Effects
  Basic Issues
  The Four-Factor Theory
  Meta-Analysis of Expectancy Mediation
Human Performance Technologies and Expectancy Effects
  Research on Accelerated Learning
  Neurolinguistic Programming
  Imagery and Mental Practice
  Biofeedback
  Parapsychology
Situational Taxonomy of Human Performance Technologies
Suggestions for Future Research
  Expectancy Control Designs
  Controls for Expectancy Effects
  Expectancies and the Enhancement of Human Performance
Conclusion
References
Interpersonal Expectancy Effects and Human Performance Research
Monica J. Harris and Robert Rosenthal
Humans have long tried to surmount their traditional limitations and to
increase their performance. Long ago such efforts were aided by social
institutions of religion, proto-medicine, and magic. More recently, such
efforts have been aided by social institutions of science and its associated
technologies. Systematic programs have been developed with such aims as
improving communication, accelerating learning, and increasing conscious
control over physiological processes. Because the promise of enhancing human
performance is so appealing, considerable resources, both in terms of time and
money, are being invested in these programs. The time has come to step back
and evaluate human performance technologies so that resources may be directed
more appropriately. The purpose of this paper is to aid in such an evaluation.
We will focus specifically on the possible influence of interpersonal
expectancy effects on several human performance technologies. The paper
advances in three steps: First, we describe the methodological, theoretical,
and empirical issues relevant to the study of expectancy effects, including
how expectancy effects are mediated. Second, we describe each of several types
of human performance research and speculate on the extent to which expectancy
effects may be responsible for the experimental results. Finally, we discuss
more generally how the literature on expectancy effects can be applied to the
development and evaluation of human performance technologies.
Interpersonal Expectancy Effects
Definition
An interpersonal expectancy effect occurs when a person (A), acting in
accordance with a set of expectations, treats another person (B) in such a
manner as to elicit behavior that tends to confirm the original expectations
(Rosenthal, 1966, 1976). For example, a teacher who believes that certain
pupils are especially bright may act more warmly toward them, teach them more
material, and spend more time with them. Over time, such a process could
result in greater gains in achievement for those students than would have
occurred otherwise.
The concept of an expectancy effect was first introduced by Merton (1948)
in his discussion of the self-fulfilling prophecy, which he defined as "a
false definition of the situation evoking a new behavior which makes the
originally false conception come true" (p. 195). The first systematic
application of the concepts of expectancy effects in the field of psychology
came in the 1960s with a program of research on experimenter expectancy
effects (e.g., Rosenthal, 1963). This research demonstrated that the
experimenter's hypothesis may act as an unintended determinant of experimental
results. In other words, experimenters may obtain the results they predicted
not because the relationship exists as predicted in the real world but because
the experimenters expected the subjects to behave as they did.
Evidence for Interpersonal Expectancy Effects
Although originally fraught with controversy, the existence of
interpersonal expectancy effects is no longer in serious doubt. In 1978,
Rosenthal and Rubin reported the results of a meta-analysis of 345 studies of
expectancy effects. A meta-analysis is the quantitative combination of the
results of a group of studies on a given topic. This meta-analysis showed that
the probability that there is no relationship between experimenters'
expectations and their subjects' subsequent behavior is less than .0000001.
The practical importance of expectancy effects was also substantial; the mean
effect size of expectancy effects across the 345 studies was equivalent to a
correlation coefficient of .33.
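To make the idea of "quantitative combination" concrete, the sketch below combines a set of study results in one common way: Stouffer's method for the significance levels and a Fisher z-transformed mean for the correlations. The text does not say which procedures the 1978 meta-analysis used, so the choice of methods, the function names, and the three example studies (made-up numbers) are all assumptions for illustration only.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def stouffer_combined_p(one_tailed_ps):
    """Combine one-tailed p values with Stouffer's method.

    Each p is converted to a standard normal deviate; the deviates are
    summed and divided by sqrt(k). Returns (combined Z, combined p).
    """
    zs = [_STD_NORMAL.inv_cdf(1.0 - p) for p in one_tailed_ps]
    z = sum(zs) / math.sqrt(len(zs))
    return z, 1.0 - _STD_NORMAL.cdf(z)


def mean_correlation(rs):
    """Average correlations via Fisher's z transformation (unweighted)."""
    mean_z = sum(math.atanh(r) for r in rs) / len(rs)
    return math.tanh(mean_z)


# Hypothetical studies, NOT data from the meta-analysis described in the text.
ps = [0.04, 0.20, 0.01]
rs = [0.30, 0.15, 0.45]

z, p = stouffer_combined_p(ps)
print(f"combined Z = {z:.2f}, combined one-tailed p = {p:.4f}")
print(f"mean r = {mean_correlation(rs):.2f}")
```

The point of the sketch is only that several modest results pointing in the same direction can yield a combined probability far smaller than any single study's, which is how a pool of hundreds of studies can reach a value as small as the one reported above.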
This meta-analysis also investigated the importance of expectancy effects
within a wide variety of research domains. There were eight categories of
expectancy studies: reaction time experiments, inkblot tests, animal learning,
laboratory interviews, psychophysical judgments, learning and ability, person
perception, and everyday situations or field studies. Although effect sizes
varied across categories, the importance of expectancy effects within each
category was firmly established. These results suggest that expectancy effects
may occur in many different areas of behavioral research and emphasize the
importance of taking into account the possibility of expectancy effects when
designing and conducting studies.
Although initially focused on the psychological experiment as the domain
of interest, research on expectancy effects turned quickly to other domains
where expectancy effects might be operating, domains such as teacher-student,
employer-employee, and therapist-client interactions. Over the years, research
interest has also turned from merely documenting the existence of expectancy
effects to delineating the processes underlying expectancy effects.
Methodological Implications of Expectancy Effects
Experimenter expectancy effects are a source of rival hypotheses in
accounting for experimental results. In other words, a given result could be
caused not by the independent variable under investigation but rather by the
experimenter's expectation that such a result would be obtained. As rival
hypotheses, expectancy effects can be considered a threat to the internal
validity of a study; they are a source of systematic bias rather than random
error. Consequently, expectancy effects present a serious danger to the
interpretation of results: Increasing random noise merely makes it more
difficult to obtain significant results, but increasing systematic bias can
result in completely erroneous conclusions.
Experimenter expectancy effects are a potential source of problems for
any research area, but they may be especially influential in more recent
research areas lacking well-established findings. This is because the first
studies on a given treatment or technique are typically carried out by
creators or proponents of the technique who tend to hold very positive
expectations for the efficacy of the technique. It is not until later that the
technique may be investigated by more impartial or skeptical researchers, who
may be less prone to expectancy effects operating to favor the technique. Many
of the human performance technologies of interest in the present paper are
relatively recent innovations, and thus may be especially susceptible to
expectancy effects.
In principle, expectancy effects could be investigated by introducing
expectations as a manipulation in addition to the independent variable of
theoretical interest. This method, which will be described in detail later,
allows the direct comparison of the magnitudes of the effects due to the
phenomenon and effects due to expectancies. Another approach, perhaps even
richer theoretically, is to examine directly the processes underlying
expectancy effects as they occur in various areas. In some areas, such as the
area of teacher expectancy effects, a considerable amount of research has been
conducted in this manner, and there is now a good general understanding of
what variables are important in mediating teacher expectancies. However, in
other areas, such as the human technologies of interest here, this background
research is lacking. The best that can be done in such cases is: (a) to
analyze the situations of interest, (b) to determine whether mediating
mechanisms shown to be important in traditional research areas are likely to
be present in the new areas, and (c) to estimate the extent to which
expectancy effects could be influential in the new area. The present paper
undertakes such an analysis.
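The first approach mentioned above, manipulating expectations alongside the independent variable, amounts to a 2 x 2 layout in which the two effects can be compared directly. The following is a minimal sketch under stated assumptions: the cell labels, the helper `main_effect`, and the cell means are all invented for illustration and are not taken from any study discussed here.

```python
# Cell means for a 2 x 2 design, keyed by (treatment condition, induced expectancy).
# These numbers are invented purely for illustration.
cell_means = {
    ("treatment", "expect_benefit"): 14.0,
    ("treatment", "expect_nothing"): 11.0,
    ("control", "expect_benefit"): 12.0,
    ("control", "expect_nothing"): 9.0,
}


def main_effect(cells, factor_index, high, low):
    """Mean of the cells at level `high` minus mean of the cells at level `low`."""
    def level_mean(level):
        values = [m for key, m in cells.items() if key[factor_index] == level]
        return sum(values) / len(values)

    return level_mean(high) - level_mean(low)


# Effect of the phenomenon itself, averaged over expectancy conditions.
treatment_effect = main_effect(cell_means, 0, "treatment", "control")
# Effect of expectancy, averaged over treatment conditions.
expectancy_effect = main_effect(cell_means, 1, "expect_benefit", "expect_nothing")

print(f"treatment effect:  {treatment_effect:.1f}")
print(f"expectancy effect: {expectancy_effect:.1f}")
```

With these made-up means the expectancy effect is larger than the treatment effect, which is the kind of comparison such a design is meant to make visible.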
Mediation of Interpersonal Expectancy Effects
Basic Issues
A primary question of interest with respect to expectancy effects is the
question of mediation: How are one person's expectations communicated to
another person so as to create a self-fulfilling prophecy? This question in
turn can be broken down into two components. The first component is the
differential behaviors that are displayed by the expecter as a result of
holding differential expectancies (the expecter-behavior link). For example,
in what ways do teachers treat their high expectancy students differently? The
second component is the differential behaviors that are associated with actual
change in expectee behavior and self-concept (the behavior-outcome link). For
example, what teacher behaviors result in better academic performance by the
students? Both these aspects are critical in understanding expectancy
mediation, for even if we could show an enormous effect of expectancy on
expecter behavior (e.g., teachers smile more at high expectancy students),
that behavior would not be important in expectancy mediation unless it
actually impacted on the expectee to create better outcomes (e.g., being
smiled at leads to better grades).
The Four-Factor "Theory"
Rosenthal (1973a, 1973b) proposed a four-factor "theory" of the mediation
of teacher expectancy effects. In this view, four broad groupings of teacher
behaviors are hypothesized to be involved in teacher expectancy effects. The
first factor is climate, referring to the warmer socioemotional climate that
teachers may create for their high expectancy students. This factor includes
warmth communicated in both verbal and nonverbal channels. The second factor,
feedback, refers to teachers' tendency to give more differentiated feedback to
high expectancy students. The third factor, input, refers to the tendency to
teach more material and more difficult material to high expectancy students.
The fourth factor is output, or the tendency for teachers to spend more time
with high expectancy students and provide them with greater opportunities for
responding. Although the four factor theory was originally proposed to account
for the mediation of teacher expectancy effects, it seems reasonable to think
that these factors may also operate in other domains where expectancy effects
may be operating.
Meta-analysis of Expectancy Mediation
The question of how expectancy effects are mediated is ultimately an
empirical one. Luckily, many studies address the mediation of expectancy
effects, and we have conducted a meta-analysis of this literature (Harris &
Rosenthal, 1985). Essentially, we read all the studies we could find that
examined expectancy mediation (resulting in an initial pool of 180 studies)
and classified them according to the mediating variables that were
investigated. This resulted in 31 mediating behaviors each of which was
examined in at least four studies. We then computed an overall significance
level and effect size for each of the 31 categories, separately for the
expectancy-behavior effects and the behavior-outcome effects.
The results of this meta-analysis pointed to the practical importance of
16 behaviors in mediation: negative climate, physical distance, input,
positive climate, off-task behavior, duration of interactions, frequency of
interactions, asking questions, encouragement, eye contact, smiles, praise,
accepting students' ideas, corrective feedback, nods, and wait-time for
responses. Table 1 summarizes the results of the meta-analysis for these 16
behaviors, presenting the effect sizes for the expectancy-behavior links and
the behavior-outcome links separately. An intuitive way of understanding these
effect sizes is given by the Binomial Effect Size Display (BESD; Rosenthal &
Rubin, 1982). The BESD expresses correlations in terms of percent increase in
"success" rates due to a given "treatment," with the treatment group success
rate computed as .50+(r/2) and the control group success rate computed as
.50-(r/2). So, for example, the correlation of .21 for Positive Climate can be
interpreted using the BESD as meaning that the percentage of teachers
exhibiting above average amounts of Positive Climate will increase from 39.5%
[.50-(.21/2)] for low expectancy students to 60.5% [.50+(.21/2)] for high
expectancy students. The other effect sizes can be similarly interpreted.
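The BESD arithmetic described above can be reproduced in a few lines. This is a minimal sketch; the function name `besd` is ours and is not taken from Rosenthal and Rubin (1982).

```python
def besd(r):
    """Return (control_rate, treatment_rate) for a correlation r.

    Binomial Effect Size Display: the control group "success" rate is
    .50 - r/2 and the treatment group "success" rate is .50 + r/2.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return 0.5 - r / 2, 0.5 + r / 2


# Positive Climate, r = .21, as in the worked example in the text.
low, high = besd(0.21)
print(f"low expectancy: {low:.1%}, high expectancy: {high:.1%}")
# prints "low expectancy: 39.5%, high expectancy: 60.5%"
```

The same function can be applied to any other correlation in Table 1; the difference between the two rates always equals r itself.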
Note that in Table 1 the effect sizes for behavior-outcome relations tend
to be larger than the effect sizes for expectancy-behavior relations. One
possible reason for this is that expectancies are manifested in myriad ways,
meaning that the relationship between expectations and any particular behavior
is not likely to be very strong. However, we can more accurately predict a
person's response to a particular behavior once we know that a particular
behavior has occurred. In other words, if we can condition on the behaviors
emitted, we are in a better position to make more accurate predictions.
We also presented a summary analysis evaluating the four factor theory.
The ten behavior categories with the most studies (and therefore providing the
most stable estimates) were reclassified into the four factors of climate,
feedback, input, and output. We then computed an overall significance level
[...]
belief in the new transcendent physics remains.
Some recent examples of the "vividness" criteria in media reports are the press coverage given
to metal-bending children (e.g. Defty, Washington Post, March 2, 1980) and the tremendous attention
given the Columbus, Ohio, poltergeist (Safran, Reader's Digest, December 1984; San Francisco
Chronicle, March 7, 1984, from Associated Press). Both stories developed from extremely unreliable
personal experience (Randi, 1983; Kurtz, 1984b) and demonstrate the way that personal reports fit the
requirements of the media better than caution or rigor. Experimental analysis is rarely as dramatic or
newsworthy as personal reports, especially since rigorous analysis emphasizes a cautious conservative
approach. Follow-up stories on the "debunking" of these phenomena rarely receive comparable
attention to the first excited reports.
The public television program Nova is regarded as one of the best popular treatments of
scientific affairs in any communications medium. Yet its program on ESP has been vilified by skeptics
of paranormal phenomena (Kurtz, 1984b). It tried to show both sides of the issue-- it included dramatic
"recreations" of the most famous ESP experiments and interviews with critics of ESP who proposed
alternative explanations of these experiments. The recreated stories were more exciting and vividly
memorable than the interviews. The enthusiasm and hopefulness of the believers was more gripping
than the skeptics' "accentuation of the negative". What were the producers of Nova to do about the fact
that what made a good story also was memorable and persuasive-- even though these elements were
irrelevant to what was true? In this case, they went for the good story.
Perceptual biases and mediated information
People with strong pre-existing beliefs are rarely affected by any presentation of evidence.
Instead, they manage to find some confirmation in all presentations. The "biased assimilation" of
evidence relevant to our beliefs is a phenomenon that seems obviously true of others, but sometimes
difficult to believe in ourselves. Consider a classic social psychological study of students' perceptions
of the annual Princeton-Dartmouth football game (Hastorf and Cantril, 1954). Students from the
opposing schools watched a movie of the rough 1951 football game and were asked to carefully record
all infractions. The two groups ended up with different scorecards based on the same game. Of course,
this is not remarkable at all. We see this in sports enthusiasts and political partisans every day. But
what is worth noting is that the students used objective trial by trial recording techniques and they still
saw different games if they were on different sides.
This is a clue to the reason that people cannot understand why others continue to disagree with
them, even after they have been shown the "truth". We construct our perceived world on the basis of
expectations and theories, and then we fail to take this constructed nature of the world into account.
When we talk about the same "facts" we may not be arguing on the basis of the same construed
evidence. This is especially important when we are faced with interpreting mixed evidence. In almost all
real-world cases, evidence does not come neatly packaged as "pro" or "con", and we have to interpret
how each piece of evidence supports each side.
In a more recent extension of this idea, social psychologists at Stanford University presented
proponents and opponents of capital punishment with some studies that purported to show that
deterrence worked, and some studies apparently showing that capital punishment had no deterrence effect
(Lord, Ross & Lepper, 1979). They reasoned that common sense must dictate that mixed evidence
should lead to a decrease in certainty in the beliefs of both partisan groups. But if partisans accept
supportive evidence at face value, critically scrutinize contradictory evidence, and construe ambiguous
evidence according to their theories, both sides might actually strengthen their beliefs on the basis of
the mixed evidence.
"The answer was clear in our subjects' assessment of the pertinent deterrence studies.
Both groups believed that the methodology that had yielded evidence supportive of
their view had been clearly superior, both in its relevance and freedom from artifact, to
the methodology that had yielded non-supportive evidence. In fact, however, the
subjects were evaluating exactly the same designs and procedures, with only the purported
results varied....To put the matter more bluntly, the two opposing groups had each
construed the "box-score" vis a vis empirical evidence as 'one good study supporting my
view, and one lousy study supporting the opposite view'-- a state of affairs that
seemingly justified the maintenance and even the strengthening of their particular viewpoint"
(Ross, 1986, p. 14).
This result leads to a sense of pessimism for those of us who think that "truth" comes from the
objective scientific collection of data, and from a solid replicable base of research. Giving the same
mixed evidence to two opposing groups may drive the partisans farther apart. How is intellectual and
emotional rapprochement possible?
One possible source of optimism comes from related work by Ross and his colleagues (Ross,
Lepper & Hubbard, 1975) in which the experimenters gave subjects false information about their
ability on some task. After subjects built up a theory to explain this ability, the experimenters discredited
the original information, but the subjects retained a weaker form of the theory they had built up. The
only form of debriefing that effectively abolished the (inappropriate) theory involved telling the
subjects about the perseverance phenomenon itself. This debriefing about the actual psychological process
involved finally allowed the subjects to remove the effect of the false information. Biased assimilation
may be weakened in a similar way: when we understand that our most "objective" evaluations of
evidence involve such bias, we may be more able to understand that our opponents truly are reasonable
people.
Another reaction to processed evidence is the perception of hostile media bias. Why should
politicians from both ends of the spectrum believe that the media is particularly hostile to their side?
At first glance, this widespread phenomenon seems to contradict assimilative biases-- often, we don't
react to stories in the press by selectively choosing supportive evidence; instead we perceive that the
news story is deliberately slanted in favor of evidence against our side. Ross and colleagues speculated
that the same biasing construal processes are at work. A partisan has a rigid construction of the truth
that lines up with his or her beliefs, and when "evenhanded" evaluations are presented, they seem to
stress the questionable evidence for the opposition.
Support for these speculations came from studies on the news coverage of both the 1980 and
1984 presidential election and the 1982 "Beirut Massacre" (Vallone, 1986; Vallone, Ross & Lepper,
1985). These issues were chosen because there were actively involved partisans on both sides
available. The opposing parties watched clips of television news coverage. Not only did they disagree about
the validity of the facts presented, and about the likely beliefs of the producers of the program, but
they acted as if they saw different news clips. "Viewers of the same 30-minute videotapes reported
that the other side had enjoyed a greater proportion of favorable facts and references, and a smaller
proportion of negative ones, than their own side" (Ross, 1986, p. 18). However, objective viewers
tended to rate the broadcasts as relatively unbiased.
These "objective" viewers were defined by the experimenters as those without personal
involvement or strong opinions about the issues. But the partisans themselves-- if they are involved in
college football, the capital punishment debate, party politics or the Arab-Israeli conflict-- claim to be
evaluating the evidence on its own merits. And in a sense they are: They evaluate the quality of the
evidence as they have constructed it in their mind. It is the illusion of "direct perception" that is the
fatal barrier to understanding why others disagree with us. To the extent that we "fill in" ambiguities
in the information given, we can find interpretations that make the evidence fit our model. Because
scientific practice demands public definition of concepts, measures and phenomena, personal
constructions are minimized and meaningful debate can take place. But when we rely on casual observation,
personal experience and entertaining narratives as sources of evidence, we have too much room to
create our own persuasive construal of the evidence.
Problems in Evaluating Evidence IV: The Effect of Formal Research
Formal research structure and quantitative analysis may not be the only, or best, route to
"understanding" problems. Often, an in-depth qualitative familiarity with a subject area is necessary to
truly grasp the nature of a problem. But in all public policy programs, a private understanding must be
followed by a public demonstration of the efficacy of the program.
Only quantitative analysis leads to such a demonstration, and only quantitative evidence will
force partisans to take the other side seriously. The effect of the acceptance of this argument can be
seen in different ways in two domains: parapsychological research, and medicine. The effect of the
rejection of this argument can be seen in the development of the human potential movement.
Modern parapsychology is almost entirely an experimental science, as any cursory look
through its influential journals will demonstrate. Articles published in the Journal of Parapsychology or
the Journal of the Society for Psychical Research explicitly discuss the statistical assumptions and
controlled research design used in their studies. Most active parapsychological researchers believe that the
path to scientific acceptance lies through the adoption of rigorous experimental method.
Robert Jahn, formerly dean of engineering and applied sciences at Princeton University and an
active experimenter in this field, argues that "further careful study of this formidable field seems
justified, but only within the context of very well conceived and technically impeccable experiments of
large data-base capability, with disciplined attention to the pertinent aesthetic factors, and with more
constructive involvement of the critical community" (Jahn, 1982, quoted in Hyman, 1985, p. 4). This
attitude has not caused the traditional scientific institutions to embrace parapsychology, so what have
parapsychologists gained from it?
Parapsychologists have now amassed a large literature of experiments, and this compendium of
studies and results can now be assessed using the language of science. Discussions of the status of
parapsychological theories can be argued on the evidence: quantified, explicit evidence. As it stands,
the evidence for psychic phenomena is not convincing to most traditional scientists (Hyman, 1981).
But critical discussions of the evidence can take place on the basis of specifiable problems, and not
only on the basis of beliefs and attitudes (e.g. the exchange between Hyman and Honorton on the
quality of the design and analysis of the psi ganzfeld experiments, starting with Hyman, 1977; and
Honorton, 1979).
In direct contrast to this progression is the attitude of the human potential movement towards
evaluation and measurement. Kurt Back (1972) titled his personal history of the human potential
movement "Beyond Words" but it could have been just as accurately called "Beyond Measurement".
He begins his book and his history with an examination of the roots of the movement in the post-war
enthusiasm for applied psychology. Academic psychologists and sociologists were anxious to measure
the increase in efficiency that would result from group educational activities. They examined group
productivity, the solidarity and cohesion of the groups themselves, as well as the well-being of the
group members.
Few measurable changes were found, and this led the research-oriented scientists to either lose
interest in these group phenomena or to lose interest in quantitative measurement. Many of those
involved in the group experiments-- even some of the scientists who began with clearly experimental
outlooks-- were caught up in the phenomenology, the experience of the group processes.
Back describes many influential workers in this movement who started out with keen beliefs
that controlled experiments with group processes would reveal significant observable effects. When
these were not forthcoming, the believers made two claims: the effects of group processes were too
subtle, diffuse and holistic to be measured by reductionist science, and the only evidence that really
mattered was subjective experience-- the individual case was the only level of interest, and this level
could never be captured by external "objective" measurements.
"Believing the language of the movement, one might look for research, proof, and the
acceptability of disproof. In fact, the followers of the movement are quite immune to
rational argument or persuasion. The experience they are seeking exists, and the
believers are happy in their closed system which shows them that they alone have true
insights and emotional beliefs....Seen in this light, the history of sensitivity training is a
struggle to get beyond science" (Back, p. 204).
The dangers in trying to get beyond science in an important policy area are best described by
an example from surgical medicine. This example is often used in introductory statistics classes
because it demonstrates that good research really matters in the world. It shows how opinions based
on personal experience or even uncontrolled research can cause the adoption or continuation of
dangerous policies.
One treatment for severe bleeding caused by cirrhosis of the liver is to send the blood through
a portacaval shunt. This operation is time-consuming and risky. Many studies (at least 50), of varying
sophistication, have been undertaken to determine if the benefits outweigh the risks. (These studies are
reviewed in Grace, Muench, and Chalmers, 1966; the statistical meaning is discussed in Freedman,
Pisani & Purves, 1978).
The message of the studies is clear: the poorer studies exaggerate the benefits of the surgery.
Seventy-five percent of the studies without control groups (24 out of 32) were very enthusiastic about
the benefits of the shunt. In the studies which had control groups which were not randomly assigned,
67% (10 out of 15) were very enthusiastic about the benefits. But none of the studies with random
assignment to control and experimental groups had results that led to a high degree of enthusiasm.
Three of these studies showed the shunt to have no value whatsoever.
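The percentages in this passage follow directly from the counts given. The short sketch below recomputes them; the dictionary layout and variable names are ours, and only the two designs for which the passage states both a count and a total (24 out of 32, and 10 out of 15) are included. The number of randomized studies is not stated here, so it is left out rather than guessed.

```python
# (very enthusiastic studies, total studies) by design, as stated in the passage.
studies = {
    "no controls": (24, 32),
    "controls, not randomized": (10, 15),
}

# Share of studies in each design that were very enthusiastic about the shunt.
shares = {design: enthusiastic / total
          for design, (enthusiastic, total) in studies.items()}

for design, share in shares.items():
    enthusiastic, total = studies[design]
    print(f"{design}: {enthusiastic}/{total} = {share:.0%}")
```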
In the experiments without controls, the physicians were accidentally biasing the outcome by
including only the most healthy patients in the study. In the experiments with nonrandomized controls,
the physicians were accidentally biasing the outcome by assigning the poorest patients to the control
group that did not receive the shunt. Only when the confound of patient health was removed by
randomization was it clear that the risky operation was of little or no value.
Good research does matter. Even physicians, highly selected for intelligence and highly
trained in intuitive assessment, were misled by their daily experience. Because the formal studies were
publicly available, and because the quality of the studies could be evaluated on the basis of their
experimental method, the overall conclusions were decisive. Until the human potential movement agrees on
the importance of quantitative evaluation, it will remain split into factions based on ideologies
maintained by personal experience.
Formal research methods are not the only or necessarily best way to learn about the true state
of nature. But good research is the only way to ensure that real phenomena will drive out illusions.
The story of the "discovery" of N-rays in France in 1903 reveals how even physics, the hardest of the
hard sciences, could be led astray by subjective evaluation (Broad & Wade, 1982). This "new"
form of X-rays made sparks brighten when viewed by the naked eye. The best physical scientists in
France accepted this breakthrough because they wanted to believe in it. It took considerable logical
and experimental effort to convince the scientific establishment that the actual phenomenon was
self-deception. Good research can disconfirm theories; subjective judgment rarely does.
In his critique of the use of poor research practices, Pitfalls in Human Research, Barber (1976)
points out that many flaws of natural inference can creep into scientific research. "The validity and
generalizability of experiments can be significantly improved by making more explicit the pitfalls that
are integral to their planning...and by keeping the pitfalls in full view of researchers who conduct
experimental studies" (pp. 90-91). While scientists and scientific methods are not immune to the flaws
of subjective judgment, good research is designed to minimize the impact of these problems.
The proper use of science in public policy involves replacing a "person-oriented" approach
with a "method-oriented" approach (Hammond, 1978). When critics or supporters focus on the person
who is setting policy criteria, the debate involves the bias and motivations of the people involved. But
attempts to precisely define the variables of interest and to gather data that relate to these variables
focus the adversarial debate on the quality of the methods used. This "is scientifically defensible not
because it is flawless (it isn't), but because it is readily subject to scientific criticism" (Hammond,
1978, p. 135).
Intuitive Judgment and the evaluation of evidence: A summary
Personal experience seems a compelling source of evidence because it involves the most basic
processing of information: perception, attention, and memory storage and retrieval. Yet while we have
great confidence in the accuracy of our subjective impressions, we do not have conscious access to the
actual processes involved. Psychological experimentation has revealed that we have too much
confidence in our own accuracy and objectivity. Humans are designed for quick thinking rather than
accurate thinking. Quick, confident assessment of evidence is adaptive when hesitation, uncertainty and
self-doubt have high costs. But natural shortcut methods are subject to systematic errors and our
introspective feelings of accuracy are misleading.
These errors of intuitive judgment lead people to search out confirming evidence, to interpret
mixed evidence in ways that confirm their expectations, and to see meaning in chance phenomena.
This same biased processing of information makes it very difficult to change our beliefs and to
understand the point of view of those with opposing beliefs. These errors and biases are now
well-documented by psychologists and decision theorists, and the improvement of human judgment is of
central concern in current research. The long-term response to this knowledge requires broad
educational programs in basic statistical inference and formal decision-making, such as those proposed and
examined by various authors in Kahneman et al. (1982). Already, business schools include
"de-biasing" procedures in their programs of formal decision-making. But with the complex technological
nature of our society, most researchers believe that some instruction on how to be a better consumer of
information should start in public schools.
The immediate response should be a renewed commitment to formal structures in deciding
important policy, and a new realization that personal experience cannot be decisive in forming such
policy. As Gilbert, Light and Mosteller (1978) point out in their review of the efficacy of social
innovations, only true experimental trials can yield knowledge that is reliable and cumulative. While
formal research is slow and expensive, and scientific knowledge increases by tiny increments, the final
result is impressively useful. Perhaps most important, explicit public evidence is our best hope for
moving toward a consensus on appropriate public policy.