The SEPUP assessment system provides one such example, but teachers can employ other forms of assessment that capture progress as well as achievement at a specific point in time. Keyed to standards and goals, such systems can be rich in meaning for teachers and students and still convey information to different levels of the system in a relatively straightforward, plausible, and readily understood manner. Teachers can use the standards or goals to help guide their own classroom assessments and observations and also to help them support work or learning in a particular area where sufficient achievement has not been reached.

Devising a criterion-based scale to record progress and make summative judgments poses difficulties of its own. Determining the level of specificity at which to subdivide a domain, so that the separate elements together represent the whole, is a crucial and demanding task (Wiliam, 1996). This becomes an issue whether one is considering performance assessments or ongoing assessment data, and the levels need to be articulated before students engage in activities (Quellmalz, 1991; Gipps, 1994).

Specific guidelines for the construction and selection of test items are not offered in this document. Test design and selection are certainly important aspects of a teacher's assessment responsibility and can be informed by the guidelines and discussions presented in this document (see also Chapter 3). Item-writing recommendations and other test specifications are topics of a substantial body of existing literature (for practitioner-relevant discussions, see Airasian, 1991; Cangelosi, 1990; Cunningham, 1997; Doran, Chan, and Tamir, 1998; Gallagher, 1998; Gronlund, 1998; Stiggins, 2001). Appropriate design, selection, interpretation and use of tests and assessment data were emphasized in the joint effort of the American Federation of Teachers (AFT), the National Council on Measurement in Education (NCME), and the National Education Association (NEA) to specify pedagogical skills necessary for effective assessment (AFT, NCME, & NEA, 1990).

VALIDITY AND RELIABILITY IN SUMMATIVE ASSESSMENTS

Regardless of what form a summative assessment takes or when it occurs, teachers need to keep in mind validity and reliability, two important technical elements of both classroom-level assessments and external or large-scale assessments (AERA, APA, & NCME, 1999). These concepts are also discussed in Chapter 3.

Validity and reliability are judged using different criteria, although the two are related. Validity has different


