Provide adequate information to judge the comparability of samples.
In addition, a study must have included at least one of the following additional design elements:
A report of implementation fidelity or professional development activity;
Results disaggregated by content strand or by student subgroup performance; or
Multiple outcome measures or precise theoretical analysis of a measured construct, such as number sense, proof, or proportional reasoning.
The application of these criteria led to the elimination of 32 comparative studies.
Case studies focus on documenting how program theories and components of a particular curriculum play out in a particular real-life situation. These studies usually describe in detail the large number of factors that influence implementation of that curriculum in classrooms or schools. Of the 45 case studies, 13 were eliminated, leaving 32 that met our standards of methodological rigor.
Synthesis studies summarize several evaluation studies across a particular curriculum, discuss the results, and draw conclusions based on the data and discussion. All of the 16 synthesis studies were retained for further examination by the committee.
The committee then had a total of 147 studies that met our minimal criteria for consideration of effectiveness, barely more than 20 percent of the total number of submissions with which we began our work. Seventy-five percent of these studies were related to the curricula supported by the National Science Foundation. The remaining studies concerned commercially supported curricular materials.
On the basis of the committee’s analysis of these 147 studies, we concluded that the corpus of evaluation studies as a whole across the 19 programs studied does not permit one to determine the effectiveness of individual programs with a high degree of certainty, due to the restricted number of studies for any particular curriculum, limitations in the array of methods used, and the uneven quality of the studies.
This inconclusive finding should not be interpreted to mean that these curricula are not effective; rather, it means that problems with the data and study designs prevent confident judgments, in either direction, about their effectiveness.