5 Recommendations for Policy and Research
The preceding chapters have synthesized our key findings and con-
clusions from the basic research about the way that incentives oper-
ate and from the applied research about the results of implementing
test-based incentive policies in education. In this chapter, the committee
recommends ways to improve current test-based incentive policies and
highlights important directions for further research. We discuss the use
of test-based incentives, the design of test-based incentive programs, and
the research that is needed about those programs.
THE USE OF TEST-BASED INCENTIVES
As discussed in Chapter 4, there have been a number of careful efforts
to use test-based incentives to improve education. They have included
broadly implemented government policies—notably, state high school exit
exams and the school-level requirements of NCLB and its predecessors—as
well as experimental programs. A number of these programs have been
carefully studied, using research designs that allow some level of causal
conclusions about their effects. We conclude (see Chapter 4) that the avail-
able evidence does not give strong support for the use of test-based incen-
tives to improve education and provides only minimal guidance about
which incentive designs may be effective. However, basic research related
to the design of incentives and the practical experience from implement-
ing the first generation of incentive programs suggest more sophisticated
approaches to designing incentive programs that are promising and should
92 INCENTIVES AND TEST-BASED ACCOUNTABILITY IN EDUCATION
be investigated. As a result, we recommend that policy makers continue
to support the development of new approaches to test-based incentives
but with a realistic understanding of the limited knowledge about how to
design such programs so that they will be effective.
Recommendation 1: Despite using them for several decades,
policy makers and educators do not yet know how to use test-
based incentives to consistently generate positive effects on
achievement and to improve education. Policy makers should
support the development and evaluation of promising new
models that use test-based incentives in more sophisticated
ways as one aspect of a richer accountability and improvement
process. However, the modest success of incentive programs
to date means that all use of test-based incentives should be
carefully studied to help determine which forms of incen-
tives are successful in education and which are not. Continued
experimentation with test-based incentives should not displace
investment in the development of other aspects of the educa-
tion system that are important complements to the incentives
themselves and likely to be necessary for incentives to be effec-
tive in improving education.
It is only by continuing to conduct careful research about test-based
incentive programs that it will be possible to understand how they can
be more effectively designed. The small or nonexistent benefits that have
been demonstrated to date suggest that incentives need to be carefully
designed and combined with other elements of the educational system to
be effective. Much additional work will be required to learn whether and
how test-based incentives can be used to produce consistent improve-
ments in education. The available evidence does not justify a single-
minded focus on test-based incentives as a primary tool of education
policy without a complementary focus on other aspects of the system.
THE DESIGN OF NEW PROGRAMS
The general lack of guidance coming from existing studies of test-
based incentive programs in education suggests that future policy experi-
mentation with test-based incentives should be guided by the key con-
trasts that emerge from basic research about how incentives operate.
Recommendation 2: Policy makers and researchers should
design and evaluate new test-based incentive programs in ways
that provide information about alternative approaches to incen-
tives and accountability. This should include exploration of the
effects of key features suggested by basic research, such as who
is targeted for incentives; what performance measures are used;
what consequences are attached to the performance measures
and how frequently they are used; what additional support
and options are provided to schools, teachers, and students in
their efforts to improve; and how incentives are framed and
communicated. Choices among the options for some or all of
these features are likely to be critical in determining which—if
any—incentive programs are successful.
In general, the design of test-based incentives should begin with a
clear description and delineation of the most valued educational goals
that the incentive program is meant to promote, as well as recognition of
the tradeoffs among these goals. Those goals should shape the features
of the incentive program, even though experience shows that the effects
of a program may not always occur in the ways intended.
The performance measures used in an incentive system are likely to
be critical. The tests and indicators used for performance measures should
be designed to reflect the most valued educational goals, and their rela-
tive weights in the incentive system should reflect the tradeoffs across
educational goals that designers of the system are prepared to accept.
Although any test will necessarily be incomplete, it should be designed to
emphasize the most important learning goals in the subject domain and
to measure students’ attainment of the goals through the use of various
test item formats.
A test that asks very similar questions from year to year and uses a
limited set of item formats will become predictable and encourage nar-
row teaching to the test. The test scores are likely to become distorted
as a result, even if they were initially an excellent measure. To reduce
the inclination for teachers to inappropriately teach to high-stakes tests,
the tests themselves should be designed to sample the subject domain
broadly and include continually changing content and item formats. And
test items should be reused only rarely and unpredictably.
Performance targets should be challenging while also being attain-
able. Data should be used to determine attainable targets. Psychological
research shows that unrealistically high goals undermine motivation. The
ideal goals provide optimal challenge—ones that encourage people to
stretch themselves and are attainable with effort.
The indicators used to summarize test results should match the goals
of the test-based incentives policy, both in terms of the level of student
achievement expected and the students or subgroups that are the focus of
attention. Because any system of tests and indicators is necessarily incom-
plete, the system should be designed to emphasize the most important
goals, and progress toward those goals should be measured in varied and
diverse ways. Policy makers should recognize that goals that are not mea-
sured are likely to be deemphasized during instruction. Test-based incen-
tive systems should be dynamic, responding to current goals as well as
to indications of whether incentives are aligned to these goals in practice.
Given that tests are necessarily incomplete measures of valued educa-
tional goals, designers of incentive systems should recognize the potential
problems inherent in having strong consequences based on test scores
alone and should experiment with the use of systems of multiple mea-
sures that reflect desired outcomes. One way of incorporating multiple
measures would be to use the results of large-scale tests as triggers for
more focused evaluation of struggling schools and teachers, rather than
as final evaluations on their own.
It is possible that the weak effects of the test-based incentive programs
we reviewed may be due in part to the use of performance measures based
primarily on tests that encourage narrow test preparation rather than
broader instruction that can produce more general learning gains that
are not tied to a particular test. We note, however, that the one program
we reviewed that used multiple measures—the Teacher Advancement
Program, which uses classroom observations in addition to test scores in
evaluating teachers—produced a near-zero average effect with a number
of negative effects in the upper grades. Again, this result underlines how
much is still unknown about using test-based incentives effectively.
The nature of the support provided in conjunction with a test-based
incentives system is also likely to prove important to success. If the capac-
ity to bring about change is limited, successful implementation will require
that the incentives system include provisions to promote the development
of that capacity. In any system of incentives—whether focused on schools,
teachers, or students—the people who are most in need of improvement
and therefore usually the focus of the incentives are often specifically
those who lack the capacity to bring about change on their own. The
research to date does not suggest what kinds of support could be paired
with test-based incentives to increase program effectiveness.
It is beyond the committee’s charge to suggest how to build capacity
in school systems, but there is a growing literature on resources that are
most useful in helping schools improve. Some of that work is brought
together in two reports from the National Research Council, Engag-
ing Schools: Fostering High School Students’ Motivation to Learn (National
Research Council and Institute of Medicine, 2004) and America’s Lab
Report: Investigations in High School Science (2006a). A recent report by the
Center on Reinventing Public Education (Hill et al., 2008) suggests new
approaches to finance, governance, and accountability that would foster
the kinds of competitive experimentation that could produce empirically
grounded understandings of what works under what circumstances and
for different groups.
RESEARCH ON TEST-BASED INCENTIVES
Substantial research needs to be conducted in order to understand
the effects of test-based incentives well enough for policies to be designed
that will consistently result in meaningful educational improvement. The
committee recognizes that it is difficult and time-consuming to conduct
definitive—or even credible—studies of the effects of test-based incen-
tives in educational settings. However, there is a strong initial body of
work that can serve as a foundation. Chapter 4 provides examples of the
kind of research that will be needed to identify successful ways of design-
ing test-based incentive policies.
Recommendation 3: Research about the effects of incentive pro-
grams should fully document the structure of each program and
should evaluate a broad range of outcomes. To avoid having
their results determined by the score inflation that occurs in the
high-stakes tests attached to the incentives, researchers should
use low-stakes tests that do not mimic the high-stakes tests to
evaluate how test-based incentives affect achievement. Other
outcomes, such as later performance in education or work and
dispositions related to education, are also important to study. To
help explain why test-based incentives sometimes produce neg-
ative effects on achievement, researchers should collect data on
changes in educational practice by the people who are affected
by the incentives.
The committee offers priorities for rigorous research, presented
as questions, in four areas: behavioral responses to incentives, valid-
ity of test score gains, incentive system outcomes, and incentive system
improvements.
Behavioral Responses to Incentives
• What types of incentives do different types of performance mea-
sures and indicators create for educators and students?
• What is the range of effects—not just the average—of differ-
ent types of incentives on teachers’ and students’ behavior and
motivation?
• How does the complexity of an incentives system affect the ability
of educators, parents, and students to understand the intended
signals and respond to them?
Validity of Test Score Gains
• What is the relationship between the responses of teachers and
others in the school system to test-based incentives and the valid-
ity of the gains in test scores? What measures of responses to
accountability should be used to understand these relationships?
• What is the relationship between test-based incentives and exter-
nal criteria, such as employment and wages? Are there relative
wage and employment increases among the people for whom test
scores rose?
• What characteristics of students, schools, and test-based incen-
tives predict score inflation?
• What are some practical auditing methods, that is, cost-effective
ways to monitor test score gains overall and at the school level?
Incentives System Outcomes
• What are the effects of test-based incentives on school and class-
room practices? What changes occur in school policies, curricu-
lum, instruction, and nonacademic activities, and are they consis-
tent with community goals and priorities?
• What are the verifiable effects on student learning that can be
attributed to the expectation of being accountable or to the sub-
sequent use of data?
• How do test-based incentives affect the labor market for teachers,
including recruitment, hiring, retention, placement, and mobility?
• How do stakeholders—students, parents, educators, policy mak-
ers, elected officials—affect the design and effects of test-based
incentives?
Incentives System Improvements
• How can subjective measures of teaching practices be used to
improve test-based incentives?
• How can large-scale tests be used as triggers to identify schools
that need more focused, in-depth evaluation?
• What role should value-added analyses play in developing indi-
cators for test-based incentives? What are the points of leverage
in the education system for improvement? What are the policy
and administrative levers for effecting change?
CLOSING REFLECTIONS
The charge to the committee pointed out the contradiction between
many economists’ optimism and most psychologists’ pessimism about
the potential for test-based incentives to alter academic performance. Our
review of the literature and our deliberations did not resolve the contra-
diction. Our review of the evidence uncovered reasons to expect positive
results from incentive programs and reasons to be skeptical of apparent
gains. Our recommendations, accordingly, call for policy makers to sup-
port experimentation with rigorous evaluation and to allow midcourse
correction of policies when evaluation suggests such correction is needed.
Our call for more research may seem like a hackneyed response, but
we believe it is essential with regard to incentives. In calling for more
evaluation, we draw attention to the fact that the frequent question, “Do
incentives work?” is too broad and vague to be answerable. Most reforms
using test-based incentives attempt to change student performance in
many grades and many subjects. When ambitions are so broad, it is
not surprising that the results are varied and unclear. Broad and major
reforms do not succeed or fail all at once and altogether. Outcomes usually
mix small successes and failures that add up to either modest improve-
ments or disappointments. Our call for more focused evaluations is a call
to examine the expected successes and failures. We call on researchers,
policy makers, and educators to examine the evidence in detail and not
to reduce it to a simple thumbs-up or thumbs-down verdict. The school
reform effort will move forward to the extent that everyone, from policy
makers to parents, learns from a thorough and balanced analysis of each
success and each failure.