or accountability systems. Finally, we considered whether evaluations included indications of attrition through the course of the study and explanations of how losses might influence the data reports.

After considering the selection of indicators of student learning, we explored the question of how those indicators present that information across diverse populations of students. Disaggregation of the data by subgroups—including gender, race, ethnicity, economic indicators, academic performance level, English-language learners, and students with special needs—can ensure that measures of effectiveness include considerations of equity and fairness. In considering issues of equity, we examined whether evaluations included comparisons between groups—by subgroup on gain scores—to determine the distribution of effects. We also determined whether evaluations reported on comparisons in gains or losses among the subpopulations of any particular treatment, to provide evidence on the magnitude of the achievement gap among student groups. Accordingly, we asked whether evaluations examined distributions of scores rather than simply attending to the percentage passing or “mean performance.” Did they consider the performance of students at all levels of achievement?
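As a rough illustration of what disaggregation by subgroup on gain scores might look like in practice, the following Python sketch summarizes gains for each subgroup by quartiles as well as by the mean. The record layout (`group`, `pre`, `post`), the function name `gain_summary`, and the sample data are all hypothetical and are not drawn from any evaluation reviewed here.

```python
from collections import defaultdict
from statistics import mean, quantiles


def gain_summary(records, group_key):
    """Summarize gain scores (post minus pre) separately for each subgroup."""
    gains = defaultdict(list)
    for record in records:
        gains[record[group_key]].append(record["post"] - record["pre"])

    summary = {}
    for group, values in gains.items():
        entry = {"n": len(values), "mean": mean(values)}
        # Quartiles describe the spread of gains, not just the average.
        # statistics.quantiles needs at least two data points.
        if len(values) >= 2:
            q1, median, q3 = quantiles(values, n=4, method="inclusive")
            entry.update({"q1": q1, "median": median, "q3": q3})
        summary[group] = entry
    return summary


# Hypothetical pre/post scores for two illustrative subgroups.
records = [
    {"group": "A", "pre": 40, "post": 50},
    {"group": "A", "pre": 55, "post": 60},
    {"group": "A", "pre": 70, "post": 90},
    {"group": "B", "pre": 45, "post": 47},
    {"group": "B", "pre": 60, "post": 66},
    {"group": "B", "pre": 80, "post": 90},
]

summary = gain_summary(records, "group")
# Difference in mean gain between the two subgroups.
gap_in_mean_gain = summary["A"]["mean"] - summary["B"]["mean"]
```

Reporting quartiles alongside the mean is one simple way to see whether gains are concentrated among students at a particular level of achievement. A real analysis would also need to account for sample sizes and measurement error.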

Conducting evaluations of curricula in schools or districts with high levels of student mobility presents another challenge. We explored whether a pre- or postevaluation design could be used to ensure actual measurement of student achievement in this kind of environment. If longitudinal studies were conducted, were the original treatment populations maintained over time? This is a particularly important concern in schools with high mobility, choice, or dropout problems.
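One simple check on whether an original treatment population was maintained is to compare student rosters at baseline and at follow-up. The sketch below is a minimal illustration under assumed inputs: the function name `cohort_retention` and the student identifiers are hypothetical.

```python
def cohort_retention(baseline_ids, followup_ids):
    """Compare a baseline roster with a follow-up roster of student IDs."""
    baseline = set(baseline_ids)
    followup = set(followup_ids)
    retained = baseline & followup
    return {
        "baseline_n": len(baseline),
        "retained_n": len(retained),
        # Share of the original cohort still present at follow-up.
        "retention_rate": len(retained) / len(baseline) if baseline else 0.0,
        # Students who left the study population after baseline.
        "lost_n": len(baseline - followup),
        # Students who appear at follow-up but were not in the original cohort.
        "new_entrants_n": len(followup - baseline),
    }


# Hypothetical rosters: two students leave and one new student arrives.
result = cohort_retention(["s1", "s2", "s3", "s4"], ["s2", "s3", "s5"])
```

A low retention rate or a large number of new entrants would suggest that follow-up scores describe a different population from the one originally treated.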

Finally, in drawing conclusions about effectiveness, we considered whether the evaluations of curricula used multiple types of measures of student performance other than test scores (e.g., grades, course-taking patterns, attitudes and perceptions, performance in subsequent courses and at postsecondary institutions, and involvement in extracurricular activities relevant to mathematics). An effective curriculum should make it feasible and attractive to pursue future study in a field and to recover from prior failure, or to enter or advance into new opportunities.

We also recognized the potential for “corruptibility of indicators” (Rossi et al., 1999). This refers to the tendency for those whose performance is monitored to improve the indicator in any way possible. When some student outcome measures are placed in an accountability system, especially one where the students’ retention or denial of a diploma is at stake, the pressure to teach directly to what is likely to be assessed on the tests is high. If teachers and administrators are subject to loss of employment, the pressures increase even more (NRC, 1999b; Orfield and Kornhaber, 2001).
