4
Factors That Affect the Accuracy of NAEP’s Estimates of Achievement

The purpose of the National Assessment of Educational Progress (NAEP) is to provide reports to the nation on the academic achievement of all students in grades 4, 8, and 12. NAEP accomplishes this through sampling, a process similar to those used in political polling, marketing surveys, and other contexts, in which only a scientifically selected portion of the target population, the group about whom data are needed, is actually assessed. This process is complex, and ensuring that it is conducted correctly is critical to the integrity of NAEP’s reported results.

There are a number of factors that make the sampling process a challenge for the NAEP officials who are responsible for it, and that make interpreting the results difficult for users of the data who want to understand the academic achievement of students in the United States. For one, the sampling process is affected by decisions made at the local level about which of the sampled students who have a disability or are English language learners should participate in NAEP. The process is also dependent on the consistency with which a variety of procedures that are part of the administration of NAEP assessments are applied in local settings around the country. This chapter provides a description of the way NAEP sampling works and discussion of several factors that complicate it. We explore the variability in state policies for identifying students with disabilities and English language learners and the variability in state policies regarding allowable accommodations on state assessments, and we consider ways in which local decision making affects the integrity of NAEP samples and its results.




Keeping Score for All: The Effects of Inclusion and Accommodation Policies on Large-Scale Educational Assessments

NAEP SAMPLING PROCEDURES

Because NAEP is designed to provide estimates of the performance of large groups of students in more than five separate subject areas and at three different stages of schooling, it would not be practical to test all of the students about whom data are sought in all subjects. Not only would each student be subjected to a prohibitively large amount of testing time in order to cover all of the targeted subject matter, but schools would also be unacceptably disrupted by such a burden.

The solution is to assess only a fraction of the nation’s students, evaluating each participating student on only a portion of the targeted subject matter. In order to be sure all of the material in each subject area is covered, developers design the assessment in blocks, each representing only a portion of the material specified in the NAEP framework for that subject. These blocks are administered according to a matrix sampling procedure, through which each student takes only two or three blocks in a variety of combinations. Statistical procedures are then used to link these results and project the performance to the broader population of the nation’s students (U.S. Department of Education, National Center for Education Statistics, and Office of Educational Research and Improvement, 2001).

NAEP’s estimates of proficiency are based on scientific samples of the population of interest, such as fourth grade students nationwide. In other words, the percentage of students in the total group of fourth graders who fall into each of the categories about which data are sought (such as girls, boys, members of various ethnic groups, and residents of urban, rural, or suburban areas) is calculated. A sample, a much smaller number of children, can then be identified whose proportions approximate those of the target population. Data are collected about other kinds of characteristics as well, including such information as parents’ education levels, the type of school in which students are enrolled (public/private, large/small), and whether students have disabilities or are English language learners. In this way, NAEP reports can provide answers to a wide variety of questions about the percentages of students in each of a variety of groups, the relative performance of different groups, and the relationships among achievement and a wide variety of academic and background characteristics.

The sampling for NAEP is based on data received from schools about their students’ characteristics as well as other factors. The selection of students in each school identified for NAEP participation is crucial to the representativeness of the overall sampling and the resulting estimates of performance. Local administrators are given lists of students who are to participate and instructions as to what adjustments to this list are permitted in response to absences and other factors that may affect participation. However, in the case of both students with disabilities and English language learners, which students ultimately remain in the sample depends in part on decisions made at the local level. These decisions are discussed in greater detail below.
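As a rough illustration of the two ideas described in this section, the following Python sketch first allocates a sample across groups in proportion to their share of the population and then gives each sampled student a small combination of assessment blocks. The group names, counts, block labels, and function names are all hypothetical, and the simple rotation through block pairs is a deliberate simplification; the actual NAEP sampling and block-assignment procedures are considerably more elaborate than this.

```python
from itertools import combinations, cycle


def proportional_allocation(population_counts, sample_size):
    """Split sample_size across groups in proportion to their population counts."""
    total = sum(population_counts.values())
    # Exact (possibly fractional) quota for each group.
    quotas = {g: sample_size * n / total for g, n in population_counts.items()}
    # Start from the rounded-down quotas.
    alloc = {g: int(q) for g, q in quotas.items()}
    # Give the leftover places to the groups with the largest fractional remainders.
    leftover = sample_size - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda g: quotas[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc


def assign_blocks(student_ids, blocks, blocks_per_student=2):
    """Give each student a few blocks, rotating through every combination in turn."""
    combos = cycle(combinations(blocks, blocks_per_student))
    return {sid: next(combos) for sid in student_ids}


if __name__ == "__main__":
    # Hypothetical population counts, not real NAEP figures.
    population = {"urban": 500, "suburban": 300, "rural": 200}
    print(proportional_allocation(population, 100))
    print(assign_blocks(range(6), ["A", "B", "C"]))
```

No single student sees every block, but across the sample every block is administered, which is what allows results to be linked and projected to the whole population.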

COMPARABILITY OF NAEP SAMPLES ACROSS STATES

As was mentioned in Chapter 1, decision making about the identification of students with disabilities and English language learners, their inclusion in large-scale assessments, and the testing accommodations they need is guided by federal legislation (although far more detailed guidance is provided regarding students with disabilities than English language learners). It is up to states, however, to develop policies for complying with legislative requirements, and consequently the policies and the way they are interpreted vary from state to state, in some cases considerably.

The variation in state policies has particular implications for NAEP. Decisions made at the state and local level affect NAEP’s results and the ways in which they can be interpreted. For each administration NAEP officials identify a sample of students to participate in the assessment, and they provide guidelines for administering it. However, school-level officials influence the process in several ways. First, as they are developing the sample, NAEP officials make no attempt to identify students with disabilities or English language learners themselves; rather, the percentages of those students who end up in the sample reflect decisions that have already been made at the school level, and those decisions are guided by state policies, which vary. Second, NAEP officials leave it to school-level staff, who are knowledgeable about students’ educational functioning levels, to determine whether selected students who have a disability or are English language learners can meaningfully participate. In general, this process is guided by the policy set forth in the NAEP 2003 Assessment Administrators’ Manual (U.S. Department of Education, National Center for Education Statistics, and Office of Educational Research and Improvement, 2003, pp. 4-19). Finally, NAEP officials provide lists of allowable accommodations for each NAEP assessment, but here as well it is school-level staff who decide which accommodations are appropriate for their students and which of those allowed in NAEP they are in a position to offer. Thus, differences in policies and procedures both among and within states can affect who participates in NAEP and the way in which students participate.

According to the most recent legislation, the purpose of NAEP is “to provide, in a timely manner, a fair and accurate measurement of student academic achievement and reporting of trends in such achievement in reading, mathematics, and other subject matter as specified in this section” (Section 303 of HR 3801). The legislation further indicates that the commissioner for education statistics shall:

(a) use a random sampling process which is consistent with relevant, widely accepted professional assessment standards and that produces data that are representative on a national and regional basis;
(b) conduct a national assessment and collect and report assessment data, including achievement data trends, in a valid and reliable manner on student academic achievement in public and private elementary schools and secondary schools at least once every two years, in grades 4 and 8 in reading and mathematics;
(c) conduct a national assessment and collect and report assessment data, including achievement data trends, in a valid and reliable manner on student academic achievement [in] public and private schools in reading and mathematics in grade 12 in regularly scheduled intervals, but at least as often as such assessments were conducted prior to the date of enactment of the No Child Left Behind Act of 2001.

Both intrastate and interstate variability in the policies and procedures that determine which students participate and which accommodations they receive have implications for the interpretation of NAEP results. First, local decision making will affect the composition of a state sample, and thus the characteristics of the sample may vary across states in unintended and perhaps unrecognized ways. Likewise, local decisions about which accommodations a student requires will affect the conditions under which scores are obtained. This means that a state’s results are subject to these locally made decisions, which may be based on criteria that vary from school to school in a state. Moreover, national NAEP results, in which scores are aggregated across states, are also subject to these locally made decisions. Finally, a key objective for NAEP is to characterize the achievement of the school-age population in the United States, yet the extent to which NAEP results are representative of the entire population depends on the locally made decisions that affect the samples.

Identifying and Classifying Students with Disabilities and English Language Learners

Determining which students should be classified as disabled in some way or as an English language learner is thus critical to ensuring that these groups of students are adequately represented, but making these classifications is far more complicated than many people recognize. In both cases, the specific situations that may call for such a classification vary widely, and there is no universally used or accepted method to use in making these judgments, particularly for English language learners. In general, decisions about whether and how specific students should be tested in NAEP are derived from previous decisions about those students’ educational needs and placement, so it is important to understand how these decisions are made.

Identifying Students with Disabilities [1]

The process of identifying and classifying students with disabilities and determining their eligibility for special education typically involves three steps: referral, which generally begins with the teacher; evaluation; and placement. Once an individual is identified as having a disability, a determination is made as to whether he or she qualifies for special education and related services. Under the Individuals with Disabilities Education Act (IDEA), eligibility for special education services is based on two criteria: first, the individual must meet the criteria for at least one of the 13 disabilities recognized in the IDEA (or the counterpart categories in state law) and, second, the individual must require special education or related services in order to receive an appropriate education. If both the disability diagnosis and special education need are confirmed, then the student has the right to an individualized education program (IEP). The IEP will also specify accommodations required for instructional purposes and for testing.

[1] Text in this section has been adapted from the reports of the National Research Council’s Committee on Goals 2000 and the Inclusion of Students with Disabilities (National Research Council, 1997a) and the Committee on Minority Representation in Special Education (National Research Council, 2002a).

Although the IDEA is explicit about the procedures for identifying students as having a disability, significant variability exists in the way procedures are implemented. For some kinds of disabilities (such as physical or sensory ones), the criteria are clear. However, for others, such as learning disabilities, mild mental retardation, and serious emotional disturbance, the criteria are much less clear and the implementation practices are more variable. States and districts do not have to adopt the disability categories in the federal laws and regulations (Hehir, 1996), and classification practices vary significantly from place to place; variation exists, for example, in the names given to categories, key dimensions on which the diagnosis is made, and criteria for determining eligibility (National Research Council, 1997a). This variability led the Committee on Goals 2000 and the Inclusion of Students with Disabilities to note that “it is entirely possible for students with identical characteristics to be diagnosed as having a disability in one state but not in another, or to have the categorical designation change with a move across state lines” (National Research Council, 1997a, p. 75).

Another source of variability is the referral process. Many of the referrals are made by classroom teachers. However, local norms are applied in making the judgment that achievement is acceptable or unacceptable. That is, whether a teacher perceives a student’s level of achievement as acceptable or unacceptable varies as a function of the typical or average level of achievement in that student’s classroom. It is the classroom teacher who compares the student with others and decides whether referral is appropriate (National Research Council, 2002a, p. 227).

Special education referral rates can also be affected by policies and practices in a school system. The availability of other special programs, such as remedial reading and Title I services, can affect the number of students referred for special education (National Research Council, 1997a, p. 71). Educators also face competing incentives in serving students who may have disabilities. For example, financial pressures on school districts and a lack of adequate federal and state support may make local officials reluctant to refer students for special education services even when they seem to meet relevant eligibility criteria (National Research Council, 1997a, p. 55). At the same time,
staff in some schools may view their special education program as a kind of organizational safety valve that allows teachers to remove disruptive students from their classrooms, or that provides an alternative for vocal parents wanting additional assistance for their children (National Research Council, 1997a, p. 54). Consequently, schools may refer students for special education services when other remedies are more appropriate. Although none of these reasons is an adequate or even legitimate basis for deciding whether students are eligible for services, they represent the realities of local implementation. Educators’ efforts to balance their responsibilities to serve all students, interpret applicable legal requirements for individual children, work within existing fiscal and organizational constraints, and respond to parental concerns may yield discrepancies, with the result that similar students might receive services in one school and be ineligible for them in another (National Research Council, 1997a, p. 55).

The requirement that the IEP be tailored to individual students’ needs has also led to variability in the implementation of the IDEA. Evaluation, placement, and programming decisions for students with disabilities are intended to be idiosyncratic and to focus on the specific needs of the individual. The IEP process is designed this way so that the tendency for institutions to standardize their procedures will be countered by pressure from parents and special education staff to provide each student with the education and services he or she needs (National Research Council, 1997a). Because the IEP is the paramount determinant of matters affecting the education of students with disabilities, including participation in assessment and accommodations, this is a critical source of variability in the context of NAEP.

Identifying English Language Learners

For English language learners, there are also difficulties in identification and classification, although for somewhat different reasons. There is no legislation akin to the IDEA to provide guidance to states on identifying English language learners, and there is no universally used definition of English language learners. Hence the category includes a broad range of students whose level of fluency in English, literacy in their native language, previous academic experiences, and socioeconomic status all vary significantly. Below we present results from several analyses of state policies with regard to identification of English language learners.

Research conducted by Rivera et al. (2000) revealed that states vary considerably in the way they define English proficiency. For example, Rivera reported that 15 states base their definitions on the fairly detailed definition from the Improving America’s Schools Act of 1994, that is, a limited English proficient individual is one who (Rivera et al., 2000, p. 4):

(a) was not born in the United States or whose native language is a language other than English and comes from an environment where a language other than English is dominant; or
(b) is a Native American, or Alaska Native, or a native resident of the outlying areas and comes from an environment where a language other than English has had a significant impact on such individual’s level of English proficiency; or
(c) is migratory and whose native language is other than English and comes from an environment where a language other than English is dominant; and
(d) who has sufficient difficulty speaking, reading, writing, or understanding the English language, and whose difficulties may deny such an individual the opportunity to learn successfully in classrooms where the language of instruction is English or to participate fully in society.

According to Rivera et al. (2000), other states use much less detailed definitions, such as “students who do not understand, speak, read or write English” (in Pennsylvania) or “students assessed as having English skills below their age appropriate grade level” (in Missouri). In addition, some states base the identification on information gathered from enrollment records, home language surveys, interviews, observations, and teacher referrals, while others identify students as English language learners from their performance on tests designed to measure “English proficiency” (National Research Council, 2000b). More recently, the U.S. Department of Education’s Office of English-language Acquisition, Language Enhancement, and Academic Achievement for Limited English Proficient Students (OELA) conducted a survey that provided some data on the variety of criteria states use for identifying students as English language learners (Kindler, 2002).
Among the state education agencies responding to the survey, about 80 percent use home language surveys, teacher observation, teacher interviews, and parent information to identify students as English language learners; 60 percent use student records, student grades, informal assessments, and referrals. Most also use some type of language proficiency test. The most widely used tests are the Language Assessment Scales, the IDEA Language Proficiency Tests, and the Woodcock-Munoz Language Survey. A number of states also used results from achievement tests to identify students with limited English proficiency. The results from this survey are presented in Table 4-1.

TABLE 4-1 Methods States Use for Identifying English Language Learners

Methods for Identifying English Language Learners    Number of States (a)

Type of Data
  Home language                                       50
  Parent information                                  48
  Teacher observation                                 46
  Student records                                     45
  Teacher interview                                   45
  Referral                                            44
  Student grades                                      43
  Other                                               32

Tests
  Language proficiency tests:                         51
    Language assessment scales                        46
    IDEA language proficiency tests                   38
    Woodcock-Munoz Language Survey                    28
    Language assessment battery                       13
    Basic Inventory of Natural Languages               6
    Maculaitis assessment                              6
    Secondary Level English Proficiency                6
    Woodcock Language Proficiency Battery              6
  Achievement Tests:                                  41
    State Achievement Test                            16
    Stanford                                          15
    ITBS                                              14
    CTBS                                              11
    Gates-MacGinitie                                  11
    Terra Nova                                        11
  Criterion Referenced Tests (CRT):                   21
    State CRT                                          1
    NWEA Assessment                                    4
    District CRT/Benchmark                             3
    Qualitative Reading Inventory                      3
    Other CRT                                          5
  Other Test                                          19

(a) Includes states, the District of Columbia, and outlying areas (n = 54).
SOURCE: Kindler (2002, p. 9).

As of 2001, most states allowed English language learners to be exempted from statewide assessments for a certain period of time (Golden and Sacks, 2001). According to Rivera, 11 states allowed a 2-year delay before including such students in testing, 21 states allowed 3 years, 2 states allowed more than 3, and 1 state had no time limit (Golden and Sacks, 2001). Jurisdictions also differ in the amount of time they allow English language learners to receive educational supports. Some offer services for as little as one year; others for multiple years. [2] This variation can have significant implications not only for students’ academic careers, but also for the data collected about them. Jurisdictions typically do not track English language learners’ progress once they stop receiving educational supports, although they may be far from fluent. Moreover, students who are no longer identified as needing educational supports would not ordinarily receive testing accommodations either.

[2] Although these limits are common, researchers have found that it typically takes three to five years for English language learners to develop true oral proficiency. Academic proficiency (the capacity to use spoken and written English with sufficient complexity that one’s academic performance is not impaired at all) takes longer, four to seven years on average (Hakuta et al., 1999).

The No Child Left Behind Act of 2001 provides a definition of English language learners that all states are to use, at least in the context of the assessments the act requires them to undertake, but this definition, too, is open to interpretation. According to the legislation, the term “limited English proficient,” when used with respect to an individual means an individual:

(a) who is aged 3 through 21;
(b) who is enrolled or preparing to enroll in an elementary or secondary school;
(c) (i) who was not born in the United States or whose native language is a language other than English;
    (ii) (I) who is a Native American or Alaska native, or a native resident of the outlying areas; and
         (II) who comes from an environment where a language other than English has had a significant impact on the individual’s level of English-language proficiency; or
    (iii) who is migratory, whose native language is other than English, and who comes from an environment where a language other than English is dominant; and
(d) whose difficulties in speaking, reading, writing, or understanding the English language may be sufficient to deny the individual:
    (i) the ability to meet the State’s proficient level of achievement on State assessments described in section 1111(b)(3);
    (ii) the ability to successfully achieve in classrooms where the language of instruction is English; or
    (iii) the opportunity to participate fully in society.

Data are not yet available on how states are applying this new definition. The extent to which state policies will continue to vary remains to be seen.

Policies on Accommodation

NAEP results are also affected by the ways in which students with disabilities and English language learners are accommodated when they participate in NAEP. As was noted earlier, NAEP officials have been investigating ways of including more students in these two groups in testing, and thus the pros and cons of providing available accommodations.
These decisions for NAEP are influenced by decisions made at the local level in several ways. However, like the identification procedures discussed above, policies in this area vary significantly from state to state. In their efforts to comply with federal legislation and include these students in accountability programs, states and districts have been devising their policies without the benefit of either nationally recognized guidelines or a clear research base for overcoming many specific difficulties in assessing students with disabilities and English language learners. Not only do existing policies and procedures vary from state to state, but also they change frequently in many places as states adjust to changes in their student populations, in their testing programs, in the political climate surrounding testing, and in the evidence emerging from both research and practice. It is also important to note in this context that states’ policies regarding accommodations properly depend in part on the constructs measured by specific assessments, which vary from test to test and from state to state.

Until recently, states could exclude students from their state and local testing. Now, under the requirements of the No Child Left Behind Act of 2001, states must strive to include all students with disabilities and English language learners in their accountability systems. This means that they must find a means to evaluate these students’ skills in reading and math, either by including them in the standard state assessment or by providing an alternate assessment. States’ inclusion and accommodation policies for the two groups of students are described below.

Accommodation Policies for Students with Disabilities

As was mentioned earlier, now that states must include nearly all students in their assessments, the importance of accommodations has grown. All states define both allowable and nonallowable practices, the latter being those that are believed to alter the construct being assessed. In general the testing accommodations for students with disabilities are based on the services and classroom accommodations that have been identified in the IEP, and the IEP is considered the authoritative guide to testing accommodations for each student who has one. Table 4-2 presents recent data on the types of accommodations that states currently allow.

Accommodation Policies for English Language Learners

In many states, the policies for including and accommodating English language learners have been derived from those established for students with disabilities (Golden and Sacks, 2001), and these have not always been clearly suited to the needs of both kinds of students. The No Child Left Behind Act has meant that far fewer English language learners can be excluded from assessments, and that accommodations and alternate assessments will be used for many students who might formerly have been excluded. The variation in policies for accommodating these students around the country is similar to that evident for students with disabilities. Table 4-3 provides recent data on the types of accommodations states currently use.

Differences Between NAEP Policies and State Policies

Since NAEP, unlike the states, is not required by law to include all students with disabilities and English language learners in its assessments, NAEP officials are free to continue to adhere to the policies they have devised for both
TABLE 4-2 Accommodations for Students with Disabilities Allowed for State Assessments and for NAEP

Type of Accommodation                       Number of States That     Allowed in NAEP
                                            Allow the Accommodation

Presentation:
  Oral reading of questions                 47                        Yes (except for reading)
  Large print                               48                        Yes
  Braille                                   47                        No (a)
  Read aloud                                46                        Yes (except for reading)
  Signing of directions                     48                        No (a)
  Oral reading of directions                48                        Not specified
  Audio taped directions or questions       29                        No
  Repeating of directions                   47                        Yes
  Explanation of directions                 38                        Yes
  Interpretation of directions              28                        Not specified
  Short segment testing booklets            14                        Not specified

Equipment:
  Use of magnifying glass                   47                        Not specified
  Amplification                             No info                   Not specified
  Light/acoustics                           No info                   Yes
  Calculator                                No info                   Only on calculator use
  Templates to reduce visual field          38                        Not specified

Response Format:
  Use of scribe                             48                        Yes
  Write in test booklet                     44                        Yes
  Use template for recording answers        29                        Not specified
  Point to response, answer orally          41                        Yes
  Use sign language                         42                        No (a)
  Use typewriter/computer/word processor    41                        Yes
  Use of Braille writer                     42                        Yes
  Answers recorded on audio tape            32                        No

Scheduling/Timing:
  Extended time                             46                        Yes
  More breaks                               46                        Yes
  Extending sessions over multiple days     37                        No
  Altered time of day that test is given    41                        Not specified

Setting:
  Individual administration                 47                        Yes
  Small group                               47                        Yes
  Separate room                             47                        Yes
  Alone in study carrel                     43                        Yes
  At home with supervision                  27                        Not specified
  In special education class                46                        Not specified

Other:
  Out of level testing                      15                        No
  Use of word lists or dictionaries         25                        No
  Spell checker                             16                        No

(a) Not provided by NAEP, but school, district, or state may provide after fulfilling NAEP security requirements.
SOURCES: Annual Survey of State Student Assessment Programs 2000-2001 (Council of Chief State School Officers, 2002); Available: http://nces.ed.gov/nationsreportcard/about/inclusion.asp#accom_table.

- The student’s IEP requires that the student be tested with an accommodation that NAEP does not permit, and the student cannot demonstrate his or her knowledge of reading or mathematics without that accommodation.

A student who is identified as limited English proficient (LEP) and who is a native speaker of a language other than English should be included in NAEP unless (http://nces.ed.gov/nationsreportcard/about/criteria.asp):

- The student has received reading or mathematics instruction primarily in English for less than 3 school years including the current year, and
- The student cannot demonstrate his or her knowledge of reading or mathematics in English even with an accommodation permitted by NAEP.

The phrase “less than 3 school years including the current year” means 0, 1, or 2 school years. Therefore, in applying the criteria:

- Include without any accommodation all LEP students who have received reading or mathematics instruction primarily in English for 3 years or more and those who are in their third year;
- Include without any accommodation all other LEP students who can demonstrate their knowledge of reading or mathematics without an accommodation;
- Include and provide accommodations permitted by NAEP to other LEP students who can demonstrate their knowledge of reading or mathematics only with those accommodations; and
- Exclude LEP students only if they cannot demonstrate their knowledge of reading or mathematics even with an accommodation permitted by NAEP.

The decision regarding whether any of the students identified as SD or LEP cannot be included in the assessment should be made in consultation with knowledgeable school staff. When there is doubt, the student should be included.
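The four-step rule for LEP students quoted above can be read as a small decision procedure. The following Python sketch is illustrative only: the function name, the `Decision` labels, and the three inputs are hypothetical, and the two yes/no inputs stand in for judgments that the guidelines leave to knowledgeable school staff. It is not NAEP's own procedure or software.

```python
from enum import Enum


class Decision(Enum):
    """Hypothetical labels for the three outcomes named in the criteria."""
    INCLUDE = "include without accommodation"
    INCLUDE_WITH_ACCOMMODATION = "include with a NAEP-permitted accommodation"
    EXCLUDE = "exclude"


def lep_inclusion_decision(years_english_instruction,
                           can_demonstrate_unaided,
                           can_demonstrate_with_permitted_accommodation):
    """Apply the quoted criteria in order.

    years_english_instruction counts school years of instruction primarily
    in English, including the current year, so 0, 1, or 2 means
    "less than 3 school years".
    """
    # Students in their third year or beyond are included without accommodation.
    if years_english_instruction >= 3:
        return Decision.INCLUDE
    # Other students who can show what they know unaided are also included.
    if can_demonstrate_unaided:
        return Decision.INCLUDE
    # Students who need a permitted accommodation are included with it.
    if can_demonstrate_with_permitted_accommodation:
        return Decision.INCLUDE_WITH_ACCOMMODATION
    # Exclusion is the last resort.
    return Decision.EXCLUDE
```

The ordering of the checks mirrors the ordering of the criteria, so exclusion is reached only when every earlier condition fails. The sketch does not model the instruction that a student should be included when there is doubt; a caller would need to resolve doubtful judgments in favor of inclusion before passing them in.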
As for accommodations, NAEP allows some that are typically allowed on state and district assessments, but there are many used by states and districts that NAEP does not allow. For example, reading aloud of passages or questions on the reading assessment is explicitly prohibited, and alternative language versions and bilingual glossaries are not permitted on the reading assessments. Braille forms are allowed but used only if schools can provide the necessary resources to create the forms. Allowable and nonallowable accommodations for NAEP are listed in column 2 of Table 4-2 and Table 4-3. Decisions about which of the allowed accommodations will be provided to individual students selected for a NAEP assessment are made by school authorities. In general, school authorities rely on the guidance provided in the student’s IEP regarding required accommodations for students with disabilities. As has been noted, there is currently no legislation parallel to IDEA to guide decision making about accommodations for English language learners. When a student in either group requires an accommodation that is not on the approved list for NAEP, the student is generally excluded from the assessment. While a detailed investigation of the implementation of the policies regarding inclusion and accommodation in NAEP at the school level was beyond the

scope of the committee’s charge, it is worth noting here that a considerable amount of responsibility for the implementation of the sampling procedure rests with school-level coordinators. Since it is clear that uniformity in this process is very important to the integrity of the sampling procedure and the accuracy of the assessment results, we raise the caution that precise instructions as to the handling of ambiguous circumstances are needed to ensure that coordinators make decisions that are consistent both with NAEP guidelines and with the decisions being made in other schools in the sample. The committee has become aware of anecdotal reports from state officials that coordinators may not, in all cases, be completely familiar with the IEP process, with state and district accommodations policies, or with federal law regarding inclusion and accommodation; these reports also indicate that there may be instances in which the coordinators have not adhered to the NAEP guidelines. It is not clear that the oversight of this aspect of the process is adequate, or that the implementation is as uniform as it needs to be. We hope that this issue will be investigated further by the sponsors of NAEP.

REPRESENTATIVENESS OF NAEP SAMPLES

There are several complications that affect the NAEP sampling procedures for students with disabilities and English language learners. First, as noted earlier, NAEP’s purpose is defined in legislation (see Section 303 of HR 3801), and the assessment is generally understood to provide results that reflect the academic proficiency of the nation’s entire population of fourth, eighth, and twelfth graders. However, the target population is not precisely described in the legislation. The legislation does not provide details about the characteristics of the target population that the assessed samples must match. 
Indeed, the only specific points made in the legislation are that the target population should be national and should include both public and private schools. This ambiguity creates difficulties. Although the national results presented in NAEP’s reports are designed to be representative of fourth, eighth, and twelfth grade students in the nation (U.S. Department of Education National Center for Education Statistics and Institute of Education Sciences, 2003, p. 135), students with disabilities and English language learners may be excluded from NAEP sampling at two stages in the process. First, students with disabilities may be excluded because schools exclusively devoted to special education students are not included in the sampling. Second, students with disabilities and English language learners may be excluded because the test is not administered to students who, in the judgment of school personnel, cannot meaningfully participate.[3]

[3] That is, students with disabilities who would require an accommodation that is not allowed on NAEP or an alternate assessment, as well as English language learners who do not meet NAEP’s rules for inclusion.

With regard to exclusion at the first stage, the key point is that students may be systematically excluded from the population that is sampled. Special education schools serve a wide range of students, including both students with lower levels of cognitive functioning and students with higher levels of cognitive functioning whose placement in such schools is the result of physical (e.g., visual, hearing, or motor skill impairments) or behavioral problems. Thus, there is a potential bias in the resulting estimates of performance as a consequence of this exclusion.

An additional complication arises as a result of local decision making. That is, there is another way in which the sample of students actually tested may have characteristics different from those of the target population, and it is difficult to estimate the extent of this divergence. As we have seen, the decisions made by school personnel in identifying students as having disabilities or being English language learners vary both within states and across states, but there is no way to measure this variance or its effect on the sample. Nevertheless, it is very likely that students are excluded from NAEP according to criteria that are not uniform. If this is so, statistical “noise” is introduced into inferences that are based on comparisons of performance across states.

Results from the 2002 administration of the NAEP reading assessment indicated that of all students nationwide selected for the sample, 6 percent were excluded from participation at the fourth grade level for some reason. Exclusion was somewhat less frequent for older students: 5 percent at the eighth grade level and 4 percent at the twelfth grade level (U.S. Department of Education, National Center for Education Statistics, and Institute of Education Sciences, 2003, pp. 151-152). 
Although certain students selected for inclusion are not ultimately assessed in NAEP, those who do not participate are still accounted for. That is, students selected for a NAEP sample are placed into three categories: regular participation, participation with accommodations, and excluded. If the argument can be made that the excluded category reflects students who could not meaningfully participate in the assessment, including those who receive their education in special education schools, then NAEP results can be understood to reflect the academic achievement of all students who can be assessed “in a valid and reliable manner,” using tools currently available. However, if the excluded category includes students who might have been able to participate meaningfully but who were excluded because of incorrect or inconsistent applications of the guidelines, or because a needed, appropriate accommodation was not permitted or available, then inferences about the generalizability of NAEP results to the full population of the nation’s students are compromised.

An Attempt to Compare the Composition of a NAEP Sample with National Demographics

The committee was concerned about the extent to which the samples of students included in NAEP are representative of the numbers of students with

disabilities and English language learners nationwide. We explored this issue by attempting to compile data with which to compare the characteristics of NAEP’s samples to the characteristics of the nation’s population of students with disabilities and English language learners. Table 4-4 presents the results of this attempt. For this table, data on the total enrollment in public schools (see row 7) were obtained from the NCES Common Core of Data survey, Table 38 (http://nces.ed.gov/programs/digest/d02/tables/PDF/table38.pdf); this is the number of students enrolled in the 50 states, the District of Columbia, and outlying areas in the specified grade for the 2000-2001 school year. The number of students with disabilities (column 1, row 1) was obtained from the 24th Annual Report to

TABLE 4-4 Comparisons of the Percentages of Students with Disabilities and English Language Learners in the United States [a] for the 2000-2001 School Year with Those in the NAEP Samples for 2002 Reading and 2003 Mathematics

Column group (1) is Students with Disabilities; column group (2) is English Language Learners.

Row | (1) Fourth Graders/9-Year-Olds | (1) Eighth Graders/13-Year-Olds | (2) Fourth Graders | (2) Eighth Graders
National Data
(1) Number in United States | 522,370 [b] | 501,008 [b] | 169,421 [c] | 108,994 [c]
(2) Percentage of Total U.S. Enrollment | 14.1% [d] | 14.2% [e] | 4.6% | 3.1%
Percentages of NAEP Sample [f]
(3) Identified for 2002 Reading | 12% | 12% | 8% | 6%
(4) Assessed in 2002 Reading | 7% | 8% | 6% | 4%
(5) Identified for 2003 Math | 13% | 13% | 10% | 6%
(6) Assessed in 2003 Math | 10% | 10% | 8% | 5%
(7) Total enrollment in United States: fourth grade = 3,707,931; eighth grade = 3,432,370

[a] Based on data for the 50 states, the District of Columbia, and outlying areas.
[b] Counts of students with disabilities in the United States are by age.
[c] Counts of English language learners in the United States are by grade.
[d] Number of students with disabilities age 9 divided by total number of enrolled fourth graders.
[e] Number of students with disabilities age 13 divided by total number of enrolled eighth graders.
[f] All percentages in NAEP are by grade level.

SOURCES: Kindler (2002); NAEP 2003 Mathematics Report available http://nces.ed.gov/nationsreportcard/mathematics/results2003/acc-permitted-natl-yes.asp; NCES Common Core of Data Survey available http://nces.ed.gov/programs/digest/d02/tables/PDF/table38.pdf; U.S. Department of Education (2002), Table AA8; U.S. Department of Education National Center for Education Statistics and Institute of Education Sciences (2003).

Congress, Table AA8, and is the number of children served under the IDEA during the 2000-2001 school year in the United States and outlying areas; these data are reported by age, not grade, so the counts for 9-year-olds and 13-year-olds were used (U.S. Department of Education, 2002). The percentage of students with disabilities in the nation (column 1, row 2) was calculated by dividing the counts in row 1 by the appropriate totals in row 7. The counts of English language learners were obtained from Kindler (2002) and are the number of students (column 2, row 1) enrolled in the specified grade in the United States and outlying areas during the 2000-2001 school year. The percentage of English language learners in the nation (column 2, row 2) was calculated by dividing the counts in row 1 by the appropriate totals in row 7. The weighted percentages[4] of the NAEP sample, which appear in rows 3 through 6, were obtained from NAEP’s 2002 Reading Report (see Table 3-3) and NAEP’s 2003 Mathematics Report (http://nces.ed.gov/nationsreportcard/mathematics/results2003/acc-permitted-natl-yes.asp). Rows 3 and 5 show the percentage of the NAEP sample identified (by school officials) as students with disabilities or English language learners; rows 4 and 6 show the percentage of the NAEP sample of students assessed who had disabilities or were English language learners.

At first glance, these data suggest that the NAEP sampling underrepresents the numbers of students with disabilities and slightly overrepresents the numbers of English language learners. However, interpretation of these data is complicated by the fact that they are not directly comparable. For example, the counts for students with disabilities are for age group, not grade level. 
The national demographics on students with disabilities and English language learners include counts of students for outlying areas (American Samoa, Guam, Northern Marianas, Puerto Rico, Virgin Islands), not all of which are always included in the NAEP sample, and the way in which the data were reported would not allow disentangling these numbers for all of the columns on this table.[5] Furthermore, the differences in the estimated proportions of students with disabilities and English language learners sampled in NAEP and existing in the United States could be attributable to differences in the way students are counted, the way the data are reported, or both; neither source for the estimated proportions should be considered infallible. In addition, the most current national data available at the time this report was being prepared were for a different school

[4] Percentages are weighted according to sampling weights determined as part of NAEP’s sampling procedures. These “weighted percentages” are the percentages that appear in NAEP reports.

[5] While grade level counts for the entire enrollment and for students with disabilities were available for various combinations of the 50 states, the District of Columbia, Department of Defense schools, outlying areas, and Bureau of Land Management schools, grade-level counts for English language learners were available only for the 50 states, the District of Columbia, and outlying areas.

year (2000-2001) than the one in which the NAEP assessment occurred (2001-2002). Nevertheless, we include this information for two reasons. First, we strongly believe that attempts should be made to evaluate the extent to which NAEP samples are representative of the students with disabilities and English language learners nationwide. Second, we want to call attention to the deficiencies in the existing data sources and the consequent difficulties in making such comparisons.

We further note that while national data may be useful in evaluating the representativeness of national NAEP results, state-level demographics would be needed to evaluate the representativeness of the state NAEP results. Tables 3-4 and 3-5 presented data by state on the percentages of students with disabilities and English language learners, respectively, who participated in NAEP’s 1998 and 2002 reading assessments. We attempted to evaluate the representativeness of the state samples with respect to the two groups of students but were unable to obtain all of the necessary data. We were able to obtain data on the percentages of students with disabilities who are age 9 and age 13 for each state, but again these data were not available by grade level. We were not able to obtain grade-level or age-level data by states for English language learners.

The data we were able to obtain are presented in Table 4-5. In the table, state-level enrollment counts for fourth and eighth grades (columns 2 and 6) were obtained from Table 38 of NCES’s Common Core of Data surveys (http://nces.ed.gov/programs/digest/d02/tables.dt038.asp). Counts of students with disabilities by state (columns 3 and 7) were obtained from the 24th Annual Report to Congress, Table AA8 (U.S. Department of Education, 2002). Percentages (columns 4 and 8) were calculated by dividing column 3 by column 2 and column 7 by column 6. 
The percentages in column 5 were taken from Table 3-4 and are the percentages of students with disabilities identified by school officials and assessed for the fourth grade 2002 NAEP Reading Assessment. Likewise, the percentages in column 9 were obtained from NAEP’s report of the percent of students with disabilities identified by school officials and assessed for the eighth grade 2002 NAEP Reading Assessment (http://nces.ed.gov/nationsreportcard/reading/results2002/acc-sd-g8.asp).

Thus the committee found that it was not possible to compare the proportions of students with disabilities and English language learners in NAEP samples with their incidence in the population at large. Local, state, and federal agencies do not produce the kinds of comparable data that would make these comparisons possible at the national level. Moreover, although it would also be important to compare state NAEP results with the proportions of students with disabilities and English language learners in the respective state populations, those comparisons are also, by and large, not possible. As was discussed earlier, states in many cases collect the desired data but do not present them in a way that makes it possible to compare them with state NAEP results or to compile them across states.
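The percentage calculations described for Tables 4-4 and 4-5 are simple divisions, and a short Python sketch may make them concrete. It uses the fourth-grade national figures from Table 4-4 and three fourth-grade state rows from Table 4-5; the function and variable names are hypothetical, and the "gap" column is an illustrative addition rather than a statistic the committee reports. Because the counts of students with disabilities are by age while enrollment and NAEP percentages are by grade, the results are rough comparisons only, as the text cautions.

```python
def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Table 4-4, fourth grade: row 1 counts divided by the row 7 enrollment total.
GRADE4_ENROLLMENT = 3_707_931
sd_pct = percent(522_370, GRADE4_ENROLLMENT)   # students with disabilities, age 9
ell_pct = percent(169_421, GRADE4_ENROLLMENT)  # English language learners, grade 4

# Table 4-5, fourth grade: (column 2 enrollment, column 3 count,
# percent assessed in NAEP from column 5).
state_rows = {
    "Alabama": (59_749, 7_976, 11),
    "California": (489_043, 55_266, 4),
    "Rhode Island": (12_490, 2_581, 15),
}


def state_gaps(rows):
    """Map each state to (column 4 percent, percent minus NAEP assessed percent)."""
    result = {}
    for state, (enrollment, sd_count, assessed_pct) in rows.items():
        pop_pct = percent(sd_count, enrollment)  # column 3 divided by column 2
        result[state] = (pop_pct, round(pop_pct - assessed_pct, 1))
    return result


if __name__ == "__main__":
    print(sd_pct, ell_pct)
    for state, (pop_pct, gap) in state_gaps(state_rows).items():
        print(f"{state}: {pop_pct}% in population, gap of {gap} points")
```

Run as a script, this reproduces the 14.1 and 4.6 percent entries in row 2 of Table 4-4 and the column 4 values for the three states (13.3, 11.3, and 20.7).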

TABLE 4-5 Comparisons of the Percentages of Students with Disabilities in the United States for the 2000-2001 School Year with Those in the Sample for NAEP’s 2002 Reading Assessment in Fourth and Eighth Grade

Columns (2) through (5) are for Fourth Graders/9-Year-Olds; columns (6) through (9) are for Eighth Graders/13-Year-Olds.

(1) State | (2) Total Enrollment [a] | (3) Students with Disabilities, N [b] | (4) Students with Disabilities, Percent [c] | (5) Percent (Identified) and Assessed in NAEP [a] | (6) Total Enrollment [a] | (7) Students with Disabilities, N [b] | (8) Students with Disabilities, Percent [d] | (9) Percentage (Identified) and Assessed in NAEP [a]
Alabama | 59,749 | 7,976 | 13.3 | (13) 11 | 56,951 | 8,129 | 14.3 | (13) 11
Alaska | 10,646 | 1,553 | 14.6 | Not available | 10,377 | 1,398 | 13.5 | (15) 14
Arizona | 72,295 | 8,173 | 11.3 | (11) 7 | 65,526 | 8,132 | 12.4 | (11) 9
Arkansas | 35,724 | 4,178 | 11.7 | (12) 7 | 34,873 | 4,566 | 13.1 | (15) 13
California | 489,043 | 55,266 | 11.3 | (7) 4 | 441,877 | 51,888 | 11.7 | (11) 9
Colorado | 57,056 | 6,511 | 11.4 | Not available | 55,386 | 6,188 | 11.2 | (12) 10
Connecticut | 44,682 | 5,624 | 12.6 | (13) 9 | 42,597 | 6,182 | 14.5 | (14) 11
Delaware | 8,848 | 1,404 | 15.9 | (15) 8 | 9,075 | 1,291 | 14.2 | (16) 8
District of Columbia | 5,830 | 934 | 16 | (14) 7 | 3,371 | 869 | 25.8 | (16) 11
Florida | 194,320 | 30,783 | 15.8 | (17) 13 | 185,663 | 28,953 | 15.6 | (14) 12
Georgia | 116,678 | 14,948 | 12.8 | (10) 7 | 109,124 | 13,406 | 12.3 | (11) 10
Hawaii | 15,291 | 1,882 | 12.3 | (12) 8 | 13,424 | 1,914 | 14.3 | (16) 13
Idaho | 18,964 | 2,486 | 13.1 | (13) 9 | 19,045 | 2,138 | 11.2 | (10) 10
Illinois | 160,495 | 25,335 | 15.8 | (13) 9 | 149,045 | 22,630 | 15.2 | (15) 12
Indiana | 79,738 | 13,829 | 17.3 | (12) 8 | 73,888 | 11,256 | 15.2 | (14) 11
Iowa | 36,448 | 5,708 | 15.7 | (15) 8 | 36,458 | 6,034 | 16.6 | (16) 14
Kansas | 35,165 | 4,910 | 14 | (14) 10 | 36,085 | 4,456 | 12.3 | (13) 11
Kentucky | 50,899 | 7,017 | 13.8 | (11) 4 | 48,938 | 6,526 | 13.3 | (13) 9
Louisiana | 63,884 | 7,174 | 11.2 | (19) 8 | 61,997 | 7,568 | 12.2 | (16) 11
Maine | 16,121 | 2,810 | 17.4 | (16) 10 | 17,035 | 2,831 | 16.6 | (16) 12
Maryland | 69,279 | 9,167 | 13.2 | (12) 6 | 64,647 | 9,331 | 14.4 | (14) 10
Massachusetts | 78,287 | 12,213 | 15.6 | (16) 12 | 74,527 | 13,376 | 17.9 | (16) 14
Michigan | 130,886 | 18,530 | 14.2 | (11) 4 | 123,080 | 17,483 | 14.2 | (13) 8
Minnesota | 63,334 | 8,758 | 13.8 | (13) 10 | 66,254 | 8,526 | 12.9 | (13) 11
Mississippi | 40,177 | 4,479 | 11.1 | (7) 3 | 36,588 | 4,226 | 11.6 | (9) 4
Missouri | 71,208 | 11,540 | 16.2 | (15) 7 | 68,717 | 11,098 | 16.2 | (15) 12
Montana | 11,682 | 1,653 | 14.1 | (13) 8 | 12,517 | 1,524 | 12.2 | (12) 10
Nebraska | 21,357 | 3,793 | 17.8 | (18) 4 | 21,864 | 3,339 | 15.3 | (14) 11
Nevada | 28,616 | 3,349 | 11.7 | (12) 7 | 25,327 | 2,995 | 11.8 | (12) 10
New Hampshire | 16,852 | 2,306 | 13.7 | Not available | 17,209 | 2,600 | 15.1 | (19) 15
New Jersey | 100,622 | 18,696 | 18.6 | Not available | 92,094 | 16,958 | 18.4 | (19) 14
New Mexico | 25,493 | 3,913 | 15.3 | (15) 9 | 24,870 | 4,394 | 17.7 | (20) 18
New York | 217,881 | 33,689 | 15.5 | (14) 8 | 203,482 | 33,273 | 16.4 | (16) 12
North Carolina | 105,105 | 15,421 | 14.7 | (17) 6 | 99,295 | 13,483 | 13.6 | (16) 12
North Dakota | 7,982 | 1,127 | 14.1 | (16) 11 | 8,651 | 1,023 | 11.8 | (14) 13
Ohio | 143,373 | 19,400 | 13.5 | (13) 8 | 139,740 | 18,727 | 13.4 | (13) 8
Oklahoma | 47,064 | 7,147 | 15.2 | (17) 13 | 46,276 | 6,785 | 14.7 | (16) 14
Oregon | 43,436 | 6,774 | 15.6 | (16) 10 | 42,364 | 6,128 | 14.5 | (14) 12
Pennsylvania | 142,366 | 19,434 | 13.7 | (13) 9 | 143,638 | 19,247 | 13.4 | (14) 13
Rhode Island | 12,490 | 2,581 | 20.7 | (19) 15 | 11,750 | 2,469 | 21 | (20) 17
South Carolina | 54,468 | 8,865 | 16.3 | (16) 11 | 53,259 | 7,853 | 14.7 | (15) 8
South Dakota | 9,583 | 1,489 | 15.5 | (11) 8 | 10,303 | 1,055 | 10.2 | (11) 9
Tennessee | 73,373 | 10,046 | 13.7 | (11) 8 | 66,429 | 9,675 | 14.6 | (14) 12
Texas | 313,731 | 38,906 | 12.4 | (14) 6 | 304,419 | 41,946 | 13.8 | (15) 9
Utah | 35,910 | 4,652 | 13 | (12) 7 | 34,579 | 3,831 | 11.1 | (11) 9
Vermont | 7,736 | 1,035 | 13.4 | (13) 9 | 8,005 | 1,174 | 14.7 | (17) 15
Virginia | 92,073 | 13,860 | 15.1 | (14) 6 | 87,455 | 12,933 | 14.8 | (15) 9
Washington | 78,505 | 10,340 | 13.2 | (13) 9 | 77,160 | 8,970 | 11.6 | (13) 11
West Virginia | 21,995 | 4,038 | 18.4 | (15) 5 | 21,902 | 3,753 | 17.1 | (16) 13
Wisconsin | 64,455 | 9,080 | 14.1 | (13) 8 | 67,950 | 9,441 | 13.9 | (15) 13
Wyoming | 6,736 | 1,005 | 14.9 | (14) 2 | 7,284 | 945 | 13 | (15) 14

[a] Counts and percentages are by grade level.
[b] Counts and percentages are by age.
[c] Number of students with disabilities age 9 divided by total number of enrolled fourth graders.
[d] Number of students with disabilities age 13 divided by total number of enrolled eighth graders.

SOURCES: NCES Common Core of Data Survey available http://nces.ed.gov/programs/digest/d02/tables/PDF/table38.pdf; U.S. Department of Education (2002), Table AA8; U.S. Department of Education National Center for Education Statistics and Institute of Education Sciences (2003).

FINDINGS AND RECOMMENDATIONS

Our review of policies and procedures for identifying students with disabilities and English language learners and for including and accommodating these students in NAEP and other large-scale assessments has revealed a large amount of variability both among and within states and between the states and NAEP. We recognize that the task of standardizing inclusion and accommodation policies is not a small one, but in our judgment improvements can be made. Greater uniformity in these procedures is important for several reasons. First, in the context of NAEP, the integrity of NAEP samples, and consequently the accuracy of its data, depend on the consistency with which students are identified as students with disabilities or as English language learners, as well as on the consistency with which they are included in and accommodated for NAEP testing around the country. The integrity of the samples is of paramount importance for data regarding students with disabilities and English language learners, but it also affects the validity of NAEP data about the population as a whole and other subgroups as well.

At the same time there exists the possibility that greater attention to the data regarding students with disabilities and English language learners provided by NAEP, and the factors that complicate interpretation of these data, may raise difficult questions for NAEP. To date, litigation concerning accommodations has primarily been related to so-called high-stakes tests, whose results are used in decisions about promotion, graduation, and placement for individual students. There have been no legal challenges to NAEP because its results are not used to make high-stakes decisions. 
However, as NAEP continues to be viewed as a tool for evaluating how the nation’s students are progressing in the context of the goals for the No Child Left Behind Act, its inclusion and accommodation policies may need to be sharpened and aligned with those of state assessment systems.

In the committee’s view, it is important to know the extent to which the percentages in the NAEP reports correspond to the percentages of students with disabilities and English language learners reported in other sources. The committee believes that many states are undertaking additional efforts at collecting such data, partly in response to the requirements of such legislation as the No Child Left Behind Act of 2001. We encourage all parties (NAEP as well as state and federal agencies) to collect and compile such data so that the desired comparisons can be made. Specifically, the committee takes note of the following circumstances:

FINDING 4-1: Decision making regarding the inclusion or exclusion of students and the use of accommodations for NAEP is controlled at the school level. There is variability in the way these decisions are made, both across schools within a state and across states.

FINDING 4-2: The target population for NAEP assessments is not clearly defined. It is not clear to whom the results are intended to generalize.

FINDING 4-3: The extent to which the demographic estimates in NAEP reports compare with the actual proportions of students with disabilities and English language learners is not known; in part, this is the result of deficiencies in the national and state-level demographic data that are available.

Our review of these circumstances leads us to make the following recommendations:

RECOMMENDATION 4-1: NAEP officials should: review the criteria for inclusion and accommodation of students with disabilities and English language learners in NAEP in light of federal guidelines; clarify, elaborate, and revise their criteria as needed; and standardize the implementation of these criteria at the school level.

RECOMMENDATION 4-2: NAEP officials should work with state assessment directors to review the policies regarding inclusion and accommodation in NAEP assessments and work toward greater consistency between NAEP and state assessment procedures.

RECOMMENDATION 4-3: NAEP should more clearly define the characteristics of the population of students to whom the results are intended to generalize. This definition should serve as a guide for decision making and the formulation of regulations regarding inclusion, exclusion, and reporting.

RECOMMENDATION 4-4: NAEP officials should evaluate the extent to which their estimates of the percentages of students with disabilities and English language learners in a state are comparable to similar data collected and reported by states to the extent feasible given the data that are available. Differences should be investigated to determine the causes. 
In addition to those four recommendations to NAEP officials, we also recommend that:

RECOMMENDATION 4-5: Efforts should be made to improve the availability of data about students with disabilities and English language learners. State-level data are needed that report the total number of English language learners and students with disabilities by grade level in the state. This information should be compiled in a way that allows comparisons to be made across states and should be made readily accessible.