method appears to assume a linear rate: each school will grow at a 10 percent rate every two years. There is little evidence that this assumption is valid, or indeed about what rate might be expected. Kentucky's own experience shows that, after initial gains, improvement appears to have reached something of a plateau. Without evidence about the rate of progress that schools are capable of demonstrating, particularly schools with high proportions of low-income students, a gap-closing model might set up unrealistic expectations and could provoke a backlash among schools that fail to meet such expectations.

Another design issue in the development of measures of progress is related to the frequency of assessment. Kentucky elected not to test students in every grade level and instead relies on cross-sectional measures. That is, in determining progress, the state compares this year's 4th graders with last year's. This may be misleading, particularly in small schools, since the population of students in a school may differ significantly from one year to the next. Kentucky attempted to deal with this problem by gauging schools over a two-year period, on the reasoning that year-to-year fluctuations in student populations would even out over two years.
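
The smoothing effect of pooling two years can be illustrated with a small sketch. The scores below are invented for illustration and are not Kentucky data; the function names are hypothetical, and the simple mean of two cohort averages is an assumption rather than the state's actual procedure.

```python
from statistics import mean

# Hypothetical mean scores for successive 4th-grade cohorts at one small school.
# Each entry is a different group of students (a cross-sectional series).
cohort_means = [42.0, 48.0, 41.0, 49.0, 44.0, 50.0]


def changes(scores):
    """Difference between each value and the one that follows it."""
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


def biennial_means(scores):
    """Average consecutive, non-overlapping pairs of years."""
    return [mean(scores[i:i + 2]) for i in range(0, len(scores) - 1, 2)]


single_year = changes(cohort_means)
two_year = changes(biennial_means(cohort_means))

print("year-to-year changes:", single_year)
print("biennium-to-biennium changes:", two_year)
```

In this invented series, single-year comparisons swing between gains and losses of several points, while the two-year comparisons move by no more than two points. Real data may not behave so neatly.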

An alternative is to use longitudinal measures, which show the performance of one group of students over time. This approach is expensive, since it requires annual testing of each student and tracking of students who move from school to school (Carlson, 1996). It also tends to rely on traditional forms of testing, for reasons of cost and the scaling of results. Performance measures tend to be more expensive than traditional multiple-choice tests, and the cost of testing each student annually with performance measures would add up. In addition, performance measures often rate student performance according to qualitative characteristics, which are difficult to place on a linear scale, yet a linear scale might be needed to show growth from year to year (Baker and Linn, 1997).

A final design issue relates to the use of multiple measures. The Kentucky model uses an index that combines scores from all subject-area assessments, plus other data (such as dropout rates and attendance rates), into a single number. This method has the advantage of incorporating information from a range of indicators, so that judgments about progress do not rest on a single test. Schools can compensate for weak performance in one area by showing strong progress in another. Yet this system is highly complex, and few people understand how the index is compiled (Elmore et al., 1996). Nor does the index report the more detailed data that constitute it, which could give educators clues about what to do to improve the next time.
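
The compensatory property of a single index can be seen in a minimal sketch. The component names, weights, and scores here are hypothetical and do not reproduce Kentucky's actual formula; the sketch assumes a simple weighted average of components on a common 0-100 scale.

```python
# Hypothetical weights; they are illustrative only and sum to 1.0.
WEIGHTS = {
    "reading": 0.25,
    "mathematics": 0.25,
    "science": 0.25,
    "attendance": 0.125,
    "retention": 0.125,  # assumed here to be 100 minus the dropout rate
}


def composite_index(components, weights=WEIGHTS):
    """Weighted average of component scores, each on a 0-100 scale."""
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * components[name] for name in weights)


# Two invented schools with opposite strengths in reading and mathematics.
school_a = {"reading": 60, "mathematics": 40, "science": 50,
            "attendance": 92, "retention": 96}
school_b = {"reading": 40, "mathematics": 60, "science": 50,
            "attendance": 96, "retention": 92}

index_a = composite_index(school_a)
index_b = composite_index(school_b)
print(index_a, index_b)
```

Under these assumed weights, both schools receive the same index even though their subject profiles differ. That shows how strength in one area offsets weakness in another, and how the single number hides the detail educators would need.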

Moreover, the index approach may exclude other data that may be useful in determining school progress toward standards. As noted in Chapter 5, data about classroom practices and the conditions of instruction are critical pieces of information in an educational improvement system. For one thing, they provide a context for the performance data, by showing whether any performance gains are accompanied by improvements in practice and support for instruction. In addition, the information about the conditions of instruction also can serve
