instruction. Students who are retained tend to have lower academic achievement than those who are promoted, and drop out of school at higher rates (National Research Council, 1999a).

Placing high-stakes accountability on students also poses special problems. For one thing, tests that are used to make decisions about schools may be ill-suited for decisions about individual students. In addition, states and districts face substantial legal hurdles in using tests to apply consequences to students. Specifically, they need to demonstrate that the tests neither discriminate against any group of students nor deny any student due process. To demonstrate the latter, states need to prove that students have received adequate notice of high-stakes testing requirements and that they have been taught the knowledge and skills the test measures (Debra P. v. Turlington, 1981).

For these and other reasons, the Committee on Appropriate Test Use of the National Research Council (National Research Council, 1999a:279) recommended that “high-stakes decisions [about individual students] such as tracking, promotion, and graduation should not automatically be made on the basis of a single test but should be buttressed by other relevant information about the student's knowledge and skills, such as grades, teacher recommendations, and extenuating circumstances.”

Accountability for What?

Determining what students or schools should be held accountable for is no less challenging than determining whom to hold accountable. The Title I statute and the new accountability ideas it reflects hold that the answer is “student performance.” But in practice, this answer leads to a number of interpretations, and the way schools respond to those interpretations affects whether accountability realizes its goals of increasing learning for all students.

As noted above, one of the major purposes of accountability based on performance is to encourage schools to focus their efforts on improving performance above all else. Everyone held accountable has an incentive to ensure that performance increases—or at least to stave off declines.

In the past, though, efforts to raise stakes on tests have not always had the desired effect. In some cases, schools employed inappropriate practices to raise test scores, such as focusing instruction on the format or general content of tests, rather than the concepts and skills the tests were expected to measure. These practices may have boosted scores, at least temporarily, but they did not in fact raise achievement (Koretz et al., 1991). Occasionally, schools resorted to practices that were unethical or illegal, including cheating.

The phenomenon of raising test scores without raising achievement occurs only under certain circumstances, although these circumstances happen to be relatively common. The first is when schools use tests that are not particularly sensitive to instruction. Tests that measure general knowledge and skills, rather

