Suggested Citation:"4 Factors That Affect the Accuracy of NAEP's Estimates of Achievement." National Research Council. 2004. Keeping Score for All: The Effects of Inclusion and Accommodation Policies on Large-Scale Educational Assessments. Washington, DC: The National Academies Press. doi: 10.17226/11029.

4
Factors That Affect the Accuracy of NAEP’s Estimates of Achievement

The purpose of the National Assessment of Educational Progress (NAEP) is to provide reports to the nation on the academic achievement of all students in grades 4, 8, and 12. NAEP accomplishes this through sampling, a process similar to those used in political polling, marketing surveys, and other contexts, in which only a scientifically selected portion of the target population, the group about whom data are needed, is actually assessed. This process is complex, and ensuring that it is conducted correctly is critical to the integrity of NAEP’s reported results.

There are a number of factors that make the sampling process a challenge for the NAEP officials who are responsible for it, and that make interpreting the results difficult for users of the data who want to understand the academic achievement of students in the United States. For one, the sampling process is affected by decisions made at the local level about which of the sampled students who have a disability or are English language learners should participate in NAEP. The process is also dependent on the consistency with which a variety of procedures that are part of the administration of NAEP assessments are applied in local settings around the country. This chapter provides a description of the way NAEP sampling works and discussion of several factors that complicate it. We explore the variability in state policies for identifying students with disabilities and English language learners and the variability in state policies regarding allowable accommodations on state assessments, and we consider ways in which local decision making affects the integrity of NAEP samples and its results.

NAEP SAMPLING PROCEDURES

Because NAEP is designed to provide estimates of the performance of large groups of students in more than five separate subject areas and at three different stages of schooling, it would not be practical to test all of the students about whom data are sought in all subjects. Not only would each student be subjected to a prohibitively large amount of testing time in order to cover all of the targeted subject matter, but schools would also be unacceptably disrupted by such a burden. The solution is to assess only a fraction of the nation’s students, evaluating each participating student on only a portion of the targeted subject matter. In order to be sure all of the material in each subject area is covered, developers design the assessment in blocks, each representing only a portion of the material specified in the NAEP framework for that subject. These blocks are administered according to a matrix sampling procedure, through which each student takes only two or three blocks in a variety of combinations. Statistical procedures are then used to link these results and project the performance to the broader population of the nation’s students (U.S. Department of Education, National Center for Education Statistics, and Office of Educational Research and Improvement, 2001).
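The matrix sampling idea described above can be illustrated with a small sketch. It is a simplification, not NAEP's actual booklet design: the block labels, student identifiers, and the `assign_blocks` helper are all hypothetical, and the rotation through every pair of blocks is only one simple way to spread blocks evenly across students.

```python
from collections import Counter
from itertools import combinations, cycle


def assign_blocks(student_ids, blocks, blocks_per_student=2):
    """Give each student a small combination of blocks, rotating through
    every possible combination so that all blocks are covered."""
    booklets = cycle(combinations(blocks, blocks_per_student))
    return {student: next(booklets) for student in student_ids}


# Hypothetical data: four blocks covering one subject, twelve sampled students.
blocks = ["B1", "B2", "B3", "B4"]
students = [f"s{i}" for i in range(12)]

assignment = assign_blocks(students, blocks)

# How many students took each block (each block is taken by 6 of the 12).
block_counts = Counter(b for combo in assignment.values() for b in combo)
print(block_counts)
```

No student sees more than two blocks, yet every block is administered to half of the sample, which is what allows results to be linked across blocks afterward.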

NAEP’s estimates of proficiency are based on scientific samples of the population of interest, such as fourth grade students nationwide. In other words, the percentage of students in the total group of fourth graders who fall into each of the categories about which data are sought—such as girls, boys, members of various ethnic groups, and residents of urban, rural, or suburban areas—is calculated. A sample—a much smaller number of children—can then be identified whose proportions approximate those of the target population. Data are collected about other kinds of characteristics as well, including such information as parents’ education levels, the type of school in which students are enrolled (public/private, large/small), and whether students have disabilities or are English language learners. In this way, NAEP reports can provide answers to a wide variety of questions about the percentages of students in each of a variety of groups, the relative performance of different groups, and the relationships among achievement and a wide variety of academic and background characteristics.
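A minimal sketch of the proportional reasoning in this paragraph follows. It assumes a single made-up grouping (type of community) and invented population counts, and the `proportional_allocation` function is hypothetical; actual NAEP sampling involves multiple stages and weighting that this sketch does not attempt to reproduce.

```python
def proportional_allocation(population_counts, sample_size):
    """Split sample_size across groups in proportion to their population share.

    Uses a largest-remainder rule so the allocations sum exactly to sample_size.
    """
    total = sum(population_counts.values())
    raw = {g: sample_size * n / total for g, n in population_counts.items()}
    alloc = {g: int(share) for g, share in raw.items()}
    leftover = sample_size - sum(alloc.values())
    # Hand out any remaining seats to the groups with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda g: raw[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc


# Invented population counts, for illustration only.
population = {"urban": 30000, "suburban": 50000, "rural": 20000}
allocation = proportional_allocation(population, 1000)
print(allocation)  # {'urban': 300, 'suburban': 500, 'rural': 200}
```

The sample of 1,000 mirrors the 30/50/20 percent split of the population, so group-level estimates are drawn from a sample whose proportions approximate those of the target population.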

The sampling for NAEP is based on data received from schools about their students’ characteristics as well as other factors. The selection of students in each school identified for NAEP participation is crucial to the representativeness of the overall sampling and the resulting estimates of performance. Local administrators are given lists of students who are to participate and instructions as to what adjustments to this list are permitted in response to absences and other factors that may affect participation. However, in the case of both students with disabilities and English language learners, which students ultimately remain in the sample depends in part on decisions made at the local level. These decisions are discussed in greater detail below.

COMPARABILITY OF NAEP SAMPLES ACROSS STATES

As was mentioned in Chapter 1, decision making about the identification of students with disabilities and English language learners, their inclusion in large-scale assessments, and the testing accommodations they need is guided by federal legislation (although far more detailed guidance is provided regarding students with disabilities than English language learners). It is up to states, however, to develop policies for complying with legislative requirements, and consequently the policies and the way they are interpreted vary from state to state, in some cases considerably. The variation in state policies has particular implications for NAEP. Decisions made at the state and local level affect NAEP’s results and the ways in which they can be interpreted.

For each administration NAEP officials identify a sample of students to participate in the assessment, and they provide guidelines for administering it. However, school-level officials influence the process in several ways. First, as they are developing the sample, NAEP officials make no attempt to identify students with disabilities or English language learners themselves; rather, the percentages of those students who end up in the sample reflect decisions that have already been made at the school level; these decisions are guided by state policies, which vary. Second, NAEP officials leave it to school-level staff, who are knowledgeable about students’ educational functioning levels, to determine whether selected students who have a disability or are English language learners can meaningfully participate. In general, this process is guided by the policy set forth in the NAEP 2003 Assessment Administrators’ Manual (U.S. Department of Education, National Center for Education Statistics, and Office of Educational Research and Improvement, 2003, pp. 4-19). Finally, NAEP officials provide lists of allowable accommodations for each of its assessments, but here as well it is school-level staff who decide which accommodations are appropriate for their students and which of those allowed in NAEP they are in a position to offer. Thus, differences in policies and procedures both among and within states can affect who participates in NAEP and the way in which students participate.

According to the most recent legislation, the purpose of NAEP is “to provide, in a timely manner, a fair and accurate measurement of student academic achievement and reporting of trends in such achievement in reading, mathematics, and other subject matter as specified in this section” (Section 303 of HR 3801). The legislation further indicates that the commissioner for education statistics shall:

(a) use a random sampling process which is consistent with relevant, widely accepted professional assessment standards and that produces data that are representative on a national and regional basis;

(b) conduct a national assessment and collect and report assessment data, including achievement data trends, in a valid and reliable manner on student academic achievement in public and private elementary schools and secondary schools at least once every two years, in grades 4 and 8 in reading and mathematics;

(c) conduct a national assessment and collect and report assessment data, including achievement data trends, in a valid and reliable manner on student academic achievement [in] public and private schools in reading and mathematics in grade 12 in regularly scheduled intervals, but at least as often as such assessments were conducted prior to the date of enactment of the No Child Left Behind Act of 2001.

Both intrastate and interstate variability in the policies and procedures that determine which students participate and which accommodations they receive have implications for the interpretation of NAEP results. First, local decision making will affect the composition of a state sample, and thus the characteristics of the sample may vary across states in unintended and perhaps unrecognized ways. Likewise, local decisions about which accommodations a student requires will affect the conditions under which scores are obtained. This means that a state’s results are subject to these locally made decisions, which may be based on criteria that vary from school to school in a state. Moreover, national NAEP results, in which scores are aggregated across states, are also subject to these locally made decisions. Finally, a key objective for NAEP is to characterize the achievement of the school-age population in the United States, yet the extent to which NAEP results are representative of the entire population depends on the locally made decisions that affect the samples.

Identifying and Classifying Students with Disabilities and English Language Learners

Determining which students should be classified as having a disability or as English language learners is thus critical to ensuring that these groups of students are adequately represented, but making these classifications is far more complicated than many people recognize. In both cases, the specific situations that may call for such a classification vary widely, and there is no universally used or accepted method for making these judgments, particularly for English language learners. In general, decisions about whether and how specific students should be tested in NAEP are derived from previous decisions about those students’ educational needs and placement, so it is important to understand how these decisions are made.

Identifying Students with Disabilities¹

The process of identifying and classifying students with disabilities and determining their eligibility for special education typically involves three steps: referral, which generally begins with the teacher; evaluation; and placement. Once an individual is identified as having a disability, a determination is made as to whether he or she qualifies for special education and related services. Under the Individuals with Disabilities Education Act (IDEA), eligibility for special education services is based on two criteria: first, the individual must meet the criteria for at least one of the 13 disabilities recognized in the IDEA (or the counterpart categories in state law) and, second, the individual must require special education or related services in order to receive an appropriate education. If both the disability diagnosis and special education need are confirmed, then the student has the right to an individualized education program (IEP). The IEP will also specify accommodations required for instructional purposes and for testing.

¹	Text in this section has been adapted from the reports of the National Research Council’s Committee on Goals 2000 and the Inclusion of Students with Disabilities (National Research Council, 1997a) and the Committee on Minority Representation in Special Education (National Research Council, 2002a).

Although the IDEA is explicit about the procedures for identifying students as having a disability, significant variability exists in the way procedures are implemented. For some kinds of disabilities (such as physical or sensory ones), the criteria are clear. However, for others, such as learning disabilities, mild mental retardation, and serious emotional disturbance, the criteria are much less clear and the implementation practices are more variable.

States and districts do not have to adopt the disability categories in the federal laws and regulations (Hehir, 1996), and classification practices vary significantly from place to place; variation exists, for example, in the names given to categories, key dimensions on which the diagnosis is made, and criteria for determining eligibility (National Research Council, 1997a). This variability led the Committee on Goals 2000 and the Inclusion of Students with Disabilities to note that “it is entirely possible for students with identical characteristics to be diagnosed as having a disability in one state but not in another, or to have the categorical designation change with a move across states lines” (National Research Council, 1997a, p. 75).

Another source of variability is the referral process. Many of the referrals are made by classroom teachers. However, local norms are applied in making the judgment that achievement is acceptable or unacceptable. That is, whether a teacher perceives a student’s level of achievement as acceptable or unacceptable varies as a function of the typical or average level of achievement in that student’s classroom. It is the classroom teacher who compares the student with others and decides whether referral is appropriate (National Research Council, 2002a, p. 227). Special education referral rates can also be affected by policies and practices in a school system. The availability of other special programs, such as remedial reading and Title I services, can affect the number of students referred for special education (National Research Council, 1997a, p. 71).

Educators also face competing incentives in serving students who may have disabilities. For example, financial pressures on school districts and a lack of adequate federal and state support may make local officials reluctant to refer students for special education services even when they seem to meet relevant eligibility criteria (National Research Council, 1997a, p. 55). At the same time, staff in some schools may view their special education program as a kind of organizational safety valve that allows teachers to remove disruptive students from their classrooms, or that provides an alternative for vocal parents wanting additional assistance for their children (National Research Council, 1997a, p. 54). Consequently, schools may refer students for special education services when other remedies are more appropriate. Although none of these reasons is an adequate or even legitimate basis for deciding whether students are eligible for services, they represent the realities of local implementation. Educators’ efforts to balance their responsibilities to serve all students, interpret applicable legal requirements for individual children, work within existing fiscal and organizational constraints, and respond to parental concerns may yield discrepancies, with the result that similar students might receive services in one school and be ineligible for them in another (National Research Council, 1997a, p. 55).

The requirement that the IEP be tailored to individual students’ needs has also led to variability in the implementation of the IDEA. Evaluation, placement, and programming decisions for students with disabilities are intended to be idiosyncratic and to focus on the specific needs of the individual. The IEP process is designed this way so that the tendency for institutions to standardize their procedures will be countered by pressure from parents and special education staff to provide each student with the education and services he or she needs (National Research Council, 1997a). Because the IEP is the paramount determinant of matters affecting the education of students with disabilities, including participation in assessment and accommodations, this is a critical source of variability in the context of NAEP.

Identifying English Language Learners

For English language learners, there are also difficulties in identification and classification, although for somewhat different reasons. There is no legislation akin to the IDEA to provide guidance to states on identifying English language learners, and there is no universally used definition of English language learners. Hence the category includes a broad range of students whose level of fluency in English, literacy in their native language, previous academic experiences, and socioeconomic status all vary significantly. Below we present results from several analyses of state policies with regard to identification of English language learners.

Research conducted by Rivera et al. (2000) revealed that states vary considerably in the way they define English proficiency. For example, they reported that 15 states base their definitions on the fairly detailed definition from the Improving America’s Schools Act of 1994, under which a limited English proficient individual is one who (Rivera et al., 2000, p. 4):

(a) was not born in the United States or whose native language is a language other than English and comes from an environment where a language other than English is dominant; or

(b) is a Native American, or Alaska Native, or a native resident of the outlying areas and comes from an environment where a language other than English has had a significant impact on such individual’s level of English proficiency; or

(c) is migratory and whose native language is other than English and comes from an environment where a language other than English is dominant; and

(d) who has sufficient difficulty speaking, reading, writing, or understanding the English language, and whose difficulties may deny such an individual the opportunity to learn successfully in classrooms where the language of instruction is English or to participate fully in society.

According to Rivera et al. (2000), other states use much less detailed definitions, such as “students who do not understand, speak, read or write English” (in Pennsylvania) or “students assessed as having English skills below their age appropriate grade level” (in Missouri). In addition, some states base the identification on information gathered from enrollment records, home language surveys, interviews, observations, and teacher referrals, while others identify students as English language learners from their performance on tests designed to measure “English proficiency” (National Research Council, 2000b).

More recently, the U.S. Department of Education’s Office of English-language Acquisition, Language Enhancement, and Academic Achievement for Limited English Proficient Students (OELA) conducted a survey that provided some data on the variety of criteria states use for identifying students as English language learners (Kindler, 2002). Among the state education agencies responding to the survey, about 80 percent use home language surveys, teacher observation, teacher interviews, and parent information to identify students as English language learners; 60 percent use student records, student grades, informal assessments, and referrals. Most also use some type of language proficiency test. The most widely used tests are the Language Assessment Scales, the IDEA Language Proficiency Tests, and the Woodcock-Munoz Language Survey. A number of states also used results from achievement tests to identify students with limited English proficiency. The results from this survey are presented in Table 4-1.

As of 2001, most states allowed English language learners to be exempted from statewide assessments for a certain period of time (Golden and Sacks, 2001). According to Rivera, 11 states allowed a 2-year delay before including such students in testing, 21 states allowed 3 years, 2 states allowed more than 3, and 1 state had no time limit (Golden and Sacks, 2001).

Jurisdictions also differ in the amount of time they allow English language learners to receive educational supports. Some offer services for as little as one year; others for multiple years.² This variation can have significant implications not only for students’ academic careers, but also for the data collected about them. Jurisdictions typically do not track English language learners’ progress once they stop receiving educational supports, although the students may still be far from fluent. Moreover, students who are no longer identified as needing educational supports would not ordinarily receive testing accommodations either.

²	Although these limits are common, researchers have found that it typically takes three to five years for English language learners to develop true oral proficiency. Academic proficiency (the capacity to use spoken and written English with sufficient complexity that one’s academic performance is not impaired at all) takes longer, four to seven years on average (Hakuta et al., 1999).

TABLE 4-1 Methods States Use for Identifying English Language Learners

Methods for Identifying English Language Learners		Number of States^a
Type of Data	Home language	50
	Parent information	48
	Teacher observation	46
	Student records	45
	Teacher interview	45
	Referral	44
	Student grades	43
	Other	32
Tests	Language proficiency tests:	51
	Language assessment scales	46
	IDEA language proficiency tests	38
	Woodcock-Munoz Language Survey	28
	Language assessment battery	13
	Basic Inventory of Natural Languages	6
	Maculaitis assessment	6
	Secondary Level English Proficiency	6
	Woodcock Language Proficiency Battery	6
	Achievement Tests:	41
	State Achievement Test	16
	Stanford	15
	ITBS	14
	CTBS	11
	Gates-MacGinitie	11
	Terra Nova	11
	Criterion Referenced Tests (CRT):	21
	State CRT	1
	NWEA Assessment	4
	District CRT/Benchmark	3
	Qualitative Reading Inventory	3
	Other CRT	5
	Other Test	19
^aIncludes states, the District of Columbia, and outlying areas (n = 54).
SOURCE: Kindler (2002, p. 9).

The No Child Left Behind Act of 2001 provides a definition of English language learners that all states are to use, at least in the context of the assessments the act requires them to undertake, but this definition, too, is open to interpretation.

According to the legislation, the term “limited English proficient,” when used with respect to an individual means an individual—

(a) who is aged 3 through 21;

(b) who is enrolled or preparing to enroll in an elementary or secondary school;

(c) (i) who was not born in the United States or whose native language is a language other than English;

(ii) (I) who is a Native American or Alaska native, or a native resident of the outlying areas; and

(II) who comes from an environment where a language other than English has had a significant impact on the individual’s level of English-language proficiency; or

(iii) who is migratory, whose native language is other than English, and who comes from an environment where a language other than English is dominant; and

(d) whose difficulties in speaking, reading, writing, or understanding the English-language may be sufficient to deny the individual—

(i) the ability to meet the State’s proficient level of achievement on State assessments described in section 1111(b)(3);

(ii) the ability to successfully achieve in classrooms where the language of instruction is English; or

(iii) the opportunity to participate fully in society.

Data are not yet available on how states are applying this new definition. The extent to which state policies will continue to vary remains to be seen.

Policies on Accommodation

NAEP results are also affected by the ways in which students with disabilities and English language learners are accommodated when they participate in NAEP. As was noted earlier, NAEP officials have been investigating ways of including more students in these two groups in testing, and thus weighing the pros and cons of providing available accommodations. These decisions for NAEP are influenced in several ways by decisions made at the local level, and, like the identification procedures discussed above, policies in this area vary significantly from state to state.

In their efforts to comply with federal legislation and include these students in accountability programs, states and districts have been devising their policies without the benefit of either nationally recognized guidelines or a clear research base for overcoming many specific difficulties in assessing students with disabilities and English language learners. Not only do existing policies and procedures vary from state to state, but also they change frequently in many places as states adjust to changes in their student populations, in their testing programs, in the political climate surrounding testing, and in the evidence emerging from both research and practice. It is also important to note in this context that states’ policies regarding accommodations properly depend in part on the constructs measured by specific assessments, which vary from test to test and from state to state.

Until recently, states could exclude students from their state and local testing. Now, under the requirements of the No Child Left Behind Act of 2001, states must strive to include all students with disabilities and English language learners in their accountability systems. This means that they must find a means to evaluate these students’ skills in reading and math, either by including them in the standard state assessment or by providing an alternate assessment. States’ inclusion and accommodation policies for the two groups of students are described below.

Accommodation Policies for Students with Disabilities

As was mentioned earlier, now that states must include nearly all students in their assessments, the importance of accommodations has grown. All states define both allowable and nonallowable practices, the latter being those that are believed to alter the construct being assessed. In general the testing accommodations for students with disabilities are based on the services and classroom accommodations that have been identified in the IEP, and the IEP is considered the authoritative guide to testing accommodations for each student who has one. Table 4-2 presents recent data on the types of accommodations that states currently allow.

Accommodation Policies for English Language Learners

In many states, the policies for including and accommodating English language learners have been derived from those established for students with disabilities (Golden and Sacks, 2001), and these have not always been clearly suited to the needs of both kinds of students. The No Child Left Behind Act has meant that far fewer English language learners can be excluded from assessments, and that accommodations and alternate assessments will be used for many students who might formerly have been excluded. The variation in policies for accommodating these students around the country is similar to that evident for students with disabilities. Table 4-3 provides recent data on the types of accommodations states currently use.

Differences Between NAEP Policies and State Policies

Since NAEP, unlike the states, is not required by law to include all students with disabilities and English language learners in their assessments, NAEP officials are free to continue to adhere to the policies they have devised for both

TABLE 4-2 Accommodations for Students with Disabilities Allowed for State Assessments and for NAEP

Type of Accommodation	Number of States That Allow the Accommodation	Allowed in NAEP
Presentation:
Oral reading of questions	47	Yes (except for reading)
Large print	48	Yes
Braille	47	No^a
Read aloud	46	Yes (except for reading)
Signing of directions	48	No^a
Oral reading of directions	48	Not specified
Audio taped directions or questions	29	No
Repeating of directions	47	Yes
Explanation of directions	38	Yes
Interpretation of directions	28	Not specified
Short segment testing booklets	14	Not specified
Equipment:
Use of magnifying glass	47	Not specified
Amplification	No info	Not specified
Light/acoustics	No info	Yes
Calculator	No info	Only on calculator use
Templates to reduce visual field	38	Not specified
Response Format:
Use of scribe	48	Yes
Write in test booklet	44	Yes
Use template for recording answers	29	Not specified
Point to response, answer orally	41	Yes
Use sign language	42	No^a
Use typewriter/computer/word processor	41	Yes
Use of Braille writer	42	Yes
Answers recorded on audio tape	32	No
Scheduling/Timing:
Extended time	46	Yes
More breaks	46	Yes
Extending sessions over multiple days	37	No
Altered time of day that test is given	41	Not specified
Setting:
Individual administration	47	Yes
Small group	47	Yes
Separate room	47	Yes
Alone in study carrel	43	Yes
At home with supervision	27	Not specified
In special education class	46	Not specified
Other:
Out of level testing	15	No
Use of word lists or dictionaries	25	No
Spell checker	16	No
^aNot provided by NAEP, but school, district, or state may provide after fulfilling NAEP security requirements. SOURCES: Annual Survey of State Student Assessment Programs 2000-2001 (Council of Chief State School Officers, 2002); Available: http://nces.ed.gov/nationsreportcard/about/inclusion.asp#accom_table.

TABLE 4-3 Accommodations for English Language Learners Allowed for State Assessments and for NAEP

Accommodation	Number of States That Allow the Accommodation	Allowed in NAEP
Presentation:
Oral reading in English	34	Yes (except for reading)
Person familiar to student administers test	36	Yes
Translation of directions	28	No
Translation of test into native language	13	No
Bilingual version of test (English and native language)	5	Yes, Spanish version of math
Oral reading in native language	18	No
Explanation of directions	32	Yes
Response Format:
Respond in native language	10	No
Respond in native language and English	9	No
Scheduling/Timing:
Extended time (same day)	40	Yes
More breaks	34	Yes
Extending sessions over multiple days	26	No
Setting:
Small group	41	Yes
Separate room	40	Yes
Alone in study carrel	37	Yes
Other:
Out of level testing	2	No
Use of word lists or dictionaries	24	Bilingual dictionary (except for reading)
Use of technology	13	Not specified
SOURCES: Annual Survey of State Student Assessment Programs 2000-2001 (Council of Chief State School Officers, 2002); Available: http://nces.ed.gov/nationsreportcard/about/inclusion.asp#accom_table.

inclusion and accommodation. As has been noted, however, the sponsors of NAEP have been exploring modifications to these policies with the goal of increasing the participation of students in both groups. NAEP’s current inclusion policy follows (http://nces.ed.gov/nationsreportcard/about/criteria.asp):

A student identified as having a disability, that is, a student with an IEP or equivalent classification, should be included in NAEP unless:

The IEP team or equivalent group has determined that the student cannot participate in assessments such as NAEP, or
The student’s cognitive functioning is so severely impaired that he or she cannot participate, or
The student’s IEP requires that the student be tested with an accommodation that NAEP does not permit, and the student cannot demonstrate his or her knowledge of reading or mathematics without that accommodation.
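
The three exclusion conditions above can be read as a simple decision rule. The following is a minimal sketch of that reading, not NAEP's actual procedure; the class and field names are hypothetical labels for the conditions quoted in the policy.

```python
from dataclasses import dataclass


@dataclass
class StudentWithDisability:
    # Hypothetical field names; each mirrors one clause of the quoted policy.
    iep_team_rules_out_participation: bool
    cognitive_impairment_prevents_participation: bool
    iep_requires_nonpermitted_accommodation: bool
    can_demonstrate_knowledge_without_it: bool


def include_sd_student(s: StudentWithDisability) -> bool:
    """Return True unless one of the three stated exclusion conditions applies."""
    if s.iep_team_rules_out_participation:
        return False
    if s.cognitive_impairment_prevents_participation:
        return False
    # The third condition has two parts, and both must hold for exclusion.
    if (s.iep_requires_nonpermitted_accommodation
            and not s.can_demonstrate_knowledge_without_it):
        return False
    return True
```

Under this reading, a student whose IEP calls for an accommodation NAEP does not permit is still included if the student can demonstrate his or her knowledge without it.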

A student who is identified as limited English proficient (LEP) and who is a native speaker of a language other than English should be included in NAEP unless (http://nces.ed.gov/nationsreportcard/about/criteria.asp):

The student has received reading or mathematics instruction primarily in English for less than 3 school years including the current year, and
The student cannot demonstrate his or her knowledge of reading or mathematics in English even with an accommodation permitted by NAEP.

The phrase “less than 3 school years including the current year” means 0, 1, or 2 school years. Therefore, in applying the criteria:

Include without any accommodation all LEP students who have received reading or mathematics instruction primarily in English for 3 years or more and those who are in their third year;
Include without any accommodation all other LEP students who can demonstrate their knowledge of reading or mathematics without an accommodation;
Include and provide accommodations permitted by NAEP to other LEP students who can demonstrate their knowledge of reading or mathematics only with those accommodations; and
Exclude LEP students only if they cannot demonstrate their knowledge of reading or mathematics even with an accommodation permitted by NAEP.
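
The four application rules above can be sketched as a short function. This is an illustrative reading of the quoted criteria, not an official NAEP tool; the function name, parameter names, and return strings are hypothetical. Years of instruction are counted including the current year, as the policy specifies.

```python
def lep_decision(years_english_instruction: int,
                 can_demonstrate_without_accommodation: bool,
                 can_demonstrate_with_permitted_accommodation: bool) -> str:
    """Apply the four LEP rules in the order they are listed in the policy."""
    # Rule 1: three or more years (including the current year) means inclusion
    # without any accommodation.
    if years_english_instruction >= 3:
        return "include without accommodation"
    # Rule 2: other students who do not need an accommodation.
    if can_demonstrate_without_accommodation:
        return "include without accommodation"
    # Rule 3: students who need an accommodation that NAEP permits.
    if can_demonstrate_with_permitted_accommodation:
        return "include with accommodation"
    # Rule 4: exclusion is the last resort.
    return "exclude"
```

The sketch does not capture the policy's closing instruction that doubtful cases be resolved in favor of inclusion, which depends on the judgment of knowledgeable school staff.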

The decision regarding whether any of the students identified as SD or LEP cannot be included in the assessment should be made in consultation with knowledgeable school staff. When there is doubt, the student should be included.

As for accommodations, NAEP allows some that are typically allowed on state and district assessments, but many accommodations used by states and districts are not allowed in NAEP. For example, reading aloud of passages or questions on the reading assessment is explicitly prohibited, and alternative language versions and bilingual glossaries are not permitted on the reading assessments. Braille forms are allowed but are used only if schools can provide the necessary resources to create the forms. Allowable and nonallowable accommodations for NAEP are listed in the final column of Tables 4-2 and 4-3.

Decisions about which of the allowed accommodations will be provided to individual students selected for a NAEP assessment are made by school authorities. In general, school authorities rely on the guidance provided in the student’s IEP regarding required accommodations for students with disabilities. As has been noted, there is currently no legislation parallel to IDEA to guide decision making about accommodations for English language learners. When a student in either group requires an accommodation that is not on the approved list for NAEP, the student is generally excluded from the assessment.

While a detailed investigation of the implementation of the policies regarding inclusion and accommodation in NAEP at the school level was beyond the scope of the committee’s charge, it is worth noting here that a considerable amount of responsibility for the implementation of the sampling procedure rests with school-level coordinators. Since it is clear that uniformity in this process is very important to the integrity of the sampling procedure and the accuracy of the assessment results, we raise the caution that precise instructions as to the handling of ambiguous circumstances are needed to ensure that coordinators make decisions that are consistent both with NAEP guidelines and with the decisions being made in other schools in the sample.

The committee has become aware of anecdotal reports from state officials that coordinators may not, in all cases, be completely familiar with the IEP process, with state and district accommodations policies, or with federal law regarding inclusion and accommodation; these reports also indicate that there may be instances in which the coordinators have not adhered to the NAEP guidelines. It is not clear that the oversight of this aspect of the process is adequate, or that the implementation is as uniform as it needs to be. We hope that this issue will be investigated further by the sponsors of NAEP.

REPRESENTATIVENESS OF NAEP SAMPLES

There are several complications that affect the NAEP sampling procedures for students with disabilities and English language learners. First, as noted earlier, NAEP’s purpose is defined in legislation (see Section 303 of HR 3801), and the assessment is generally understood to provide results that reflect the academic proficiency of the nation’s entire population of fourth, eighth, and twelfth graders. However, the target population is not precisely described in the legislation. The legislation does not provide details about the characteristics of the target population that the assessed samples must match. Indeed, the only specific points made in the legislation are that the target population should be national and should include both public and private schools.

This ambiguity creates difficulties. Although the national results presented in NAEP’s reports are designed to be representative of fourth, eighth, and twelfth grade students in the nation (U.S. Department of Education National Center for Education Statistics and Institute of Education Sciences, 2003, p. 135), students with disabilities and English language learners may be excluded from NAEP sampling at two stages in the process. First, students with disabilities may be excluded because schools exclusively devoted to special education students are not included in the sampling. Second, students with disabilities and English language learners may be excluded because the test is not administered to students who, in the judgment of school personnel, cannot meaningfully participate.³

³	That is, students with disabilities who would require an accommodation that is not allowed on NAEP or an alternate assessment, as well as English language learners who do not meet NAEP’s rules for inclusion.

With regard to exclusion at the first stage, the key point is that students may be systematically excluded from the population that is sampled. Special education schools serve a wide range of students, including both students with lower levels of cognitive functioning and students with higher levels of cognitive functioning whose placement in such schools is the result of physical (e.g., visual, hearing, or motor skill impairments) or behavioral problems. Thus, there is a potential bias in the resulting estimates of performance as a consequence of this exclusion.

An additional complication arises as a result of local decision making. That is, there is another way in which the sample of students actually tested may have characteristics different from those of the target population, and it is difficult to estimate the extent of this divergence. As we have seen, the decisions made by school personnel in identifying students as having disabilities or being English language learners vary both within states and across states, but there is no way to measure this variance or its effect on the sample. Nevertheless, it is very likely that students are excluded from NAEP according to criteria that are not uniform. If this is so, statistical “noise” is introduced into inferences that are based on comparisons of performance across states.
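
To illustrate why non-uniform exclusion matters for cross-state comparisons, here is a minimal sketch using entirely hypothetical scores. It assumes, purely for simplicity, that the excluded students are the lowest scorers; the chapter makes no such claim, and real exclusion patterns are more varied. The function name and the exclusion rates are illustrative assumptions.

```python
def mean_after_exclusion(scores, exclude_fraction):
    """Mean score after dropping the lowest-scoring fraction of students."""
    ordered = sorted(scores)
    n_excluded = round(len(ordered) * exclude_fraction)
    kept = ordered[n_excluded:]
    return sum(kept) / len(kept)


# 100 made-up scores, identical for both hypothetical states.
hypothetical_scores = list(range(150, 250))

state_a_mean = mean_after_exclusion(hypothetical_scores, 0.02)  # excludes 2 students
state_b_mean = mean_after_exclusion(hypothetical_scores, 0.08)  # excludes 8 students
print(state_a_mean, state_b_mean)  # 200.5 203.5
```

In this toy case the two states have the same underlying students, yet their reported means differ by three points solely because they apply different exclusion rates.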

Results from the 2002 administration of the NAEP reading assessment indicated that of all students nationwide selected for the sample, 6 percent were excluded from participation at the fourth grade level for some reason. Exclusion was somewhat less frequent for older students: 5 percent at the eighth grade level and 4 percent at the twelfth grade level (U.S. Department of Education, National Center for Education Statistics, and Institute of Education Sciences, 2003, pp. 151-152). Although certain students selected for inclusion are not ultimately assessed in NAEP, those who do not participate are still accounted for. That is, students selected for a NAEP sample are placed into three categories: regular participation, participation with accommodations, and excluded. If the argument can be made that the excluded category reflects students who could not meaningfully participate in the assessment, including those who receive their education in special education schools, then NAEP results can be understood to reflect the academic achievement of all students who can be assessed “in a valid and reliable manner,” using tools currently available. However, if the excluded category includes students who might have been able to participate meaningfully but who were excluded because of incorrect or inconsistent applications of the guidelines, or because a needed, appropriate accommodation was not permitted or available, then inferences about the generalizability of NAEP results to the full population of the nation’s students are compromised.

An Attempt to Compare the Composition of a NAEP Sample with National Demographics

The committee was concerned about the extent to which the samples of students included in NAEP are representative of the numbers of students with disabilities and English language learners nationwide. We explored this issue by attempting to compile data with which to compare the characteristics of NAEP’s samples to the characteristics of the nation’s population of students with disabilities and English language learners. Table 4-4 presents the results of this attempt.

For this table, data on the total enrollment in public schools (see row 7) were obtained from the NCES Common Core of Data survey, Table 38 (http://nces.ed.gov/programs/digest/d02/tables/PDF/table38.pdf); this is the number of students enrolled in the 50 states, the District of Columbia, and outlying areas in the specified grade for the 2000-2001 school year. The number of students with disabilities (column 1, row 1) was obtained from the 24th Annual Report to

TABLE 4-4 Comparisons of the Percentages of Students with Disabilities and English Language Learners in the United States^a for the 2000-2001 School Year with Those in the NAEP Samples for 2002 Reading and 2003 Mathematics

	(1) Students with Disabilities		(2) English Language Learners
	Fourth Graders/9-Year-Olds	Eighth Graders/13-Year-Olds	Fourth Graders	Eighth Graders
National Data
(1) Number in United States	522,370^b	501,008^b	169,421^c	108,994^c
(2) Percentage of Total U.S. Enrollment	14.1%^d	14.2%^e	4.6%	3.1%
Percentages of NAEP Sample^f
(3) Identified for 2002 Reading	12%	12%	8%	6%
(4) Assessed in 2002 Reading	7%	8%	6%	4%
(5) Identified for 2003 Math	13%	13%	10%	6%
(6) Assessed in 2003 Math	10%	10%	8%	5%
(7) Total enrollment in United States: fourth grade = 3,707,931; eighth grade = 3,432,370
^aBased on data for the 50 states, the District of Columbia, and outlying areas. ^bCounts of students with disabilities in the United States are by age. ^cCounts of English language learners in the United States are by grade. ^dNumber of students with disabilities age 9 divided by total number of enrolled fourth graders. ^eNumber of students with disabilities age 13 divided by total number of enrolled eighth graders. ^fAll percentages in NAEP are by grade level. SOURCES: Kindler (2002); NAEP 2003 Mathematics Report available http://nces.ed.gov/nationsreportcard/mathematics/results2003/acc-permitted-natl-yes.asp; NCES Common Core of Data Survey available http://nces.ed.gov/programs/digest/d02/tables/PDF/table38.pdf; U.S. Department of Education (2002), Table AA8; U.S. Department of Education National Center for Education Statistics and Institute of Education Sciences (2003).

Congress, Table AA8, and is the number of children served under the IDEA during the 2000-2001 school year in the United States and outlying areas; these data are reported by age, not grade, so the counts for 9-year-olds and 13-year-olds were used (U.S. Department of Education, 2002). The percentage of students with disabilities in the nation (column 1, row 2) was calculated by dividing the counts in row 1 by the appropriate totals in row 7. The counts of English language learners were obtained from Kindler (2002) and are the number of students (column 2, row 1) enrolled in the specified grade in the United States and outlying areas during the 2000-2001 school year. The percentage of English language learners in the nation (column 2, row 2) was calculated by dividing the counts in row 1 by the appropriate totals in row 7. The weighted percentages⁴ of the NAEP sample, which appear in rows 3 through 6, were obtained from NAEP’s 2002 Reading Report (see Table 3-3) and NAEP’s 2003 Mathematics Report (http://nces.ed.gov/nationsreportcard/mathematics/results2003/acc-permitted-natl-yes.asp). Rows 3 and 5 show the percentage of the NAEP sample identified (by school officials) as students with disabilities or English language learners; rows 4 and 6 show the percentage of the NAEP sample of students assessed who had disabilities or were English language learners.
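
The row 2 calculation described here (row 1 divided by the matching total in row 7 of Table 4-4) can be sketched as follows. The counts are copied from the table as printed; the variable and function names are illustrative only.

```python
# Counts copied from rows 1 and 7 of Table 4-4 (2000-2001 school year).
enrollment = {"grade 4": 3_707_931, "grade 8": 3_432_370}
students_with_disabilities = {"grade 4": 522_370, "grade 8": 501_008}  # by age (9 and 13)
english_language_learners = {"grade 4": 169_421, "grade 8": 108_994}   # by grade


def percent_of_enrollment(count, total):
    """Percentage of total enrollment, rounded to one decimal place."""
    return round(100 * count / total, 1)


sd_pct = {g: percent_of_enrollment(students_with_disabilities[g], enrollment[g])
          for g in enrollment}
ell_pct = {g: percent_of_enrollment(english_language_learners[g], enrollment[g])
           for g in enrollment}
print(sd_pct, ell_pct)
```

With the counts as printed, the fourth-grade results round to the table's 14.1% and 4.6%. The eighth-grade results come out near 14.6% and 3.2%, slightly above the printed 14.2% and 3.1%, which may reflect a different enrollment denominator or rounding in the source.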

At first glance, these data suggest that the NAEP sampling underrepresents the numbers of students with disabilities and slightly overrepresents the numbers of English language learners. However, interpretation of these data is complicated by the fact that they are not directly comparable. For example, the counts for students with disabilities are by age group, not grade level. The national demographics on students with disabilities and English language learners include counts of students for outlying areas (American Samoa, Guam, Northern Marianas, Puerto Rico, Virgin Islands), not all of which are always included in the NAEP sample, and the way in which the data were reported would not allow disentangling these numbers for all of the columns in this table.⁵

Furthermore, the differences in the estimated proportions of students with disabilities and English language learners sampled in NAEP and existing in the United States could be attributable to differences in the way students are counted, the way the data are reported, or both; neither source for the estimated proportions should be considered infallible. In addition, the most current national data available at the time this report was being prepared were for a different school

⁴	Percentages are weighted according to sampling weights determined as part of NAEP’s sampling procedures. These “weighted percentages” are the percentages that appear in NAEP reports.
⁵	While grade-level counts for the entire enrollment and for students with disabilities were available for various combinations of the 50 states, the District of Columbia, Department of Defense schools, outlying areas, and Bureau of Indian Affairs schools, grade-level counts for English language learners were available only for the 50 states, the District of Columbia, and outlying areas.

year (2000-2001) than the one in which the NAEP assessment occurred (2001-2002). Nevertheless, we include this information for two reasons. First, we strongly believe that attempts should be made to evaluate the extent to which NAEP samples are representative of the students with disabilities and English language learners nationwide. Second, we wish to call attention to the deficiencies in the existing data sources and the consequent difficulties in making such comparisons.

We further note that while national data may be useful in evaluating the representativeness of national NAEP results, state-level demographics would be needed to evaluate the representativeness of the state NAEP results. Tables 3-4 and 3-5 presented data by state on the percentages of students with disabilities and English language learners, respectively, who participated in NAEP’s 1998 and 2002 reading assessments. We attempted to evaluate the representativeness of the state samples with respect to the two groups of students but were unable to obtain all of the necessary data. We were able to obtain data on the percentages of students with disabilities who are age 9 and age 13 for each state, but again these data were not available by grade level. We were not able to obtain grade-level or age-level data by states for English language learners. The data we were able to obtain are presented in Table 4-5.

In the table, state-level enrollment counts for fourth and eighth grades (columns 2 and 6) were obtained from Table 38 of NCES’s Common Core of Data surveys (http://nces.ed.gov/programs/digest/d02/tables.dt038.asp). Counts of students with disabilities by state (columns 3 and 7) were obtained from the 24th Annual Report to Congress, Table AA8 (U.S. Department of Education, 2002). Percentages (columns 4 and 8) were calculated by dividing column 3 by column 2 and column 7 by column 6. The percentages in column 5 were taken from Table 3-4 and are the percentages of students with disabilities identified by school officials and assessed for the fourth grade 2002 NAEP Reading Assessment. Likewise, the percentages in column 9 were obtained from NAEP’s report of the percent of students with disabilities identified by school officials and assessed for the eighth grade 2002 NAEP Reading Assessment (http://nces.ed.gov/nationsreportcard/reading/results2002/acc-sd-g8.asp).
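
As a check on how columns 4 and 8 of Table 4-5 are derived, the following sketch recomputes them for three states using counts copied from the table. The names are illustrative. Note that, as the table footnotes state, the numerators are counts by age while the denominators are enrollments by grade, so the results are approximations.

```python
# (state, grade-4 enrollment, age-9 count, grade-8 enrollment, age-13 count),
# copied from columns 2, 3, 6, and 7 of Table 4-5.
rows = [
    ("Alabama", 59_749, 7_976, 56_951, 8_129),
    ("California", 489_043, 55_266, 441_877, 51_888),
    ("District of Columbia", 5_830, 934, 3_371, 869),
]


def percent(count, enrollment):
    """Students with disabilities as a percentage of enrollment, one decimal."""
    return round(100 * count / enrollment, 1)


# Maps each state to (column 4, column 8).
state_percentages = {
    state: (percent(sd4, e4), percent(sd8, e8))
    for state, e4, sd4, e8, sd8 in rows
}
print(state_percentages)
```

For these three states the recomputed values agree with the percentages printed in the table.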

Thus the committee found that it was not possible to compare the proportions of students with disabilities and English language learners in NAEP samples with their incidence in the population at large. Local, state, and federal agencies do not produce the kinds of comparable data that would make these comparisons possible at the national level. Moreover, although it would also be important to compare state NAEP results with the proportions of students with disabilities and English language learners in the respective state populations, those comparisons are also, by and large, not possible. As was discussed earlier, states in many cases collect the desired data but do not present them in a way that makes it possible to compare them with state NAEP results or to compile them across states.

TABLE 4-5 Comparisons of the Percentages of Students with Disabilities in the United States for the 2000-2001 School Year with Those in the Sample for NAEP’s 2002 Reading Assessment in Fourth and Eighth Grade

	Fourth Graders/9-Year-Olds				Eighth Graders/13-Year-Olds
(1) State	(2) Total Enrollment^a	Students with Disabilities^b		(5) Percentage (Identified) and Assessed in NAEP^a	(6) Total Enrollment^a	Students with Disabilities^b		(9) Percentage (Identified) and Assessed in NAEP^a
		(3) N	(4) Percent^c			(7) N	(8) Percent^d
Alabama	59,749	7,976	13.3	(13) 11	56,951	8,129	14.3	(13) 11
Alaska	10,646	1,553	14.6	Not available	10,377	1,398	13.5	(15) 14
Arizona	72,295	8,173	11.3	(11) 7	65,526	8,132	12.4	(11) 9
Arkansas	35,724	4,178	11.7	(12) 7	34,873	4,566	13.1	(15) 13
California	489,043	55,266	11.3	(7) 4	441,877	51,888	11.7	(11) 9
Colorado	57,056	6,511	11.4	Not available	55,386	6,188	11.2	(12) 10
Connecticut	44,682	5,624	12.6	(13) 9	42,597	6,182	14.5	(14) 11
Delaware	8,848	1,404	15.9	(15) 8	9,075	1,291	14.2	(16) 8
District of Columbia	5,830	934	16	(14) 7	3,371	869	25.8	(16) 11
Florida	194,320	30,783	15.8	(17) 13	185,663	28,953	15.6	(14) 12
Georgia	116,678	14,948	12.8	(10) 7	109,124	13,406	12.3	(11) 10
Hawaii	15,291	1,882	12.3	(12) 8	13,424	1,914	14.3	(16) 13
Idaho	18,964	2,486	13.1	(13) 9	19,045	2,138	11.2	(10) 10
Illinois	160,495	25,335	15.8	(13) 9	149,045	22,630	15.2	(15) 12
Indiana	79,738	13,829	17.3	(12) 8	73,888	11,256	15.2	(14) 11
Iowa	36,448	5,708	15.7	(15) 8	36,458	6,034	16.6	(16) 14
Kansas	35,165	4,910	14	(14) 10	36,085	4,456	12.3	(13) 11
Kentucky	50,899	7,017	13.8	(11) 4	48,938	6,526	13.3	(13) 9
Louisiana	63,884	7,174	11.2	(19) 8	61,997	7,568	12.2	(16) 11
Maine	16,121	2,810	17.4	(16) 10	17,035	2,831	16.6	(16) 12
Maryland	69,279	9,167	13.2	(12) 6	64,647	9,331	14.4	(14) 10
Massachusetts	78,287	12,213	15.6	(16) 12	74,527	13,376	17.9	(16) 14
Michigan	130,886	18,530	14.2	(11) 4	123,080	17,483	14.2	(13) 8
Minnesota	63,334	8,758	13.8	(13) 10	66,254	8,526	12.9	(13) 11
Mississippi	40,177	4,479	11.1	(7) 3	36,588	4,226	11.6	(9) 4
Missouri	71,208	11,540	16.2	(15) 7	68,717	11,098	16.2	(15) 12
Montana	11,682	1,653	14.1	(13) 8	12,517	1,524	12.2	(12) 10
Nebraska	21,357	3,793	17.8	(18) 4	21,864	3,339	15.3	(14) 11
Nevada	28,616	3,349	11.7	(12) 7	25,327	2,995	11.8	(12) 10
New Hampshire	16,852	2,306	13.7	Not available	17,209	2,600	15.1	(19) 15
New Jersey	100,622	18,696	18.6	Not available	92,094	16,958	18.4	(19) 14
New Mexico	25,493	3,913	15.3	(15) 9	24,870	4,394	17.7	(20) 18
New York	217,881	33,689	15.5	(14) 8	203,482	33,273	16.4	(16) 12
North Carolina	105,105	15,421	14.7	(17) 6	99,295	13,483	13.6	(16) 12
North Dakota	7,982	1,127	14.1	(16) 11	8,651	1,023	11.8	(14) 13
Ohio	143,373	19,400	13.5	(13) 8	139,740	18,727	13.4	(13) 8
Oklahoma	47,064	7,147	15.2	(17) 13	46,276	6,785	14.7	(16) 14
Oregon	43,436	6,774	15.6	(16) 10	42,364	6,128	14.5	(14) 12
Pennsylvania	142,366	19,434	13.7	(13) 9	143,638	19,247	13.4	(14) 13
Rhode Island	12,490	2,581	20.7	(19) 15	11,750	2,469	21	(20) 17
South Carolina	54,468	8,865	16.3	(16) 11	53,259	7,853	14.7	(15) 8
South Dakota	9,583	1,489	15.5	(11) 8	10,303	1,055	10.2	(11) 9
Tennessee	73,373	10,046	13.7	(11) 8	66,429	9,675	14.6	(14) 12
Texas	313,731	38,906	12.4	(14) 6	304,419	41,946	13.8	(15) 9
Utah	35,910	4,652	13	(12) 7	34,579	3,831	11.1	(11) 9
Vermont	7,736	1,035	13.4	(13) 9	8,005	1,174	14.7	(17) 15
Virginia	92,073	13,860	15.1	(14) 6	87,455	12,933	14.8	(15) 9
Washington	78,505	10,340	13.2	(13) 9	77,160	8,970	11.6	(13) 11
West Virginia	21,995	4,038	18.4	(15) 5	21,902	3,753	17.1	(16) 13
Wisconsin	64,455	9,080	14.1	(13) 8	67,950	9,441	13.9	(15) 13
Wyoming	6,736	1,005	14.9	(14) 2	7,284	945	13	(15) 14
^aCounts and percentages are by grade level. ^bCounts and percentages are by age. ^cNumber of students with disabilities age 9 divided by total number of enrolled fourth graders. ^dNumber of students with disabilities age 13 divided by total number of enrolled eighth graders. SOURCES: NCES Common Core of Data Survey available http://nces.ed.gov/programs/digest/d02/tables/PDF/table38.pdf; U.S. Department of Education (2002), Table AA8; U.S. Department of Education National Center for Education Statistics and Institute of Education Sciences (2003).

FINDINGS AND RECOMMENDATIONS

Our review of policies and procedures for identifying students with disabilities and English language learners and for including and accommodating these students in NAEP and other large-scale assessments has revealed a large amount of variability both among and within states and between the states and NAEP. We recognize that the task of standardizing inclusion and accommodation policies is not a small one, but in our judgment improvements can be made.

Greater uniformity in these procedures is important for several reasons. First, in the context of NAEP, the integrity of NAEP samples, and consequently the accuracy of its data, depends on the consistency with which students are identified as students with disabilities or as English language learners, as well as on the consistency with which they are included in and accommodated for NAEP testing around the country. The integrity of the samples is of paramount importance for data regarding students with disabilities and English language learners, but it also affects the validity of NAEP data about the population as a whole and other subgroups as well.

At the same time, there exists the possibility that greater attention to the data regarding students with disabilities and English language learners provided by NAEP, and to the factors that complicate interpretation of these data, may raise difficult questions for NAEP. To date, litigation concerning accommodations has primarily been related to so-called high-stakes tests, whose results are used in decisions about promotion, graduation, and placement for individual students. There have been no legal challenges to NAEP because its results are not used to make high-stakes decisions. However, as NAEP continues to be viewed as a tool for evaluating how the nation’s students are progressing in the context of the goals of the No Child Left Behind Act, its inclusion and accommodation policies may need to be sharpened and aligned with those of state assessment systems.

In the committee’s view, it is important to know the extent to which the percentages in the NAEP reports correspond to the percentages of students with disabilities and English language learners reported in other sources. The committee believes that many states are undertaking additional efforts at collecting such data, partly in response to the requirements of such legislation as the No Child Left Behind Act of 2001. We encourage all parties (NAEP as well as state and federal agencies) to collect and compile such data so that the desired comparisons can be made.

Specifically, the committee takes note of the following circumstances:

FINDING 4-1: Decision making regarding the inclusion or exclusion of students and the use of accommodations for NAEP is controlled at the school level. There is variability in the way these decisions are made, both across schools within a state and across states.

FINDING 4-2: The target population for NAEP assessments is not clearly defined. It is not clear to whom the results are intended to generalize.

FINDING 4-3: The extent to which the demographic estimates in NAEP reports compare with the actual proportions of students with disabilities and English language learners is not known; in part, this is the result of deficiencies in the national and state-level demographic data that are available.

Our review of these circumstances leads us to make the following recommendations:

RECOMMENDATION 4-1: NAEP officials should:

review the criteria for inclusion and accommodation of students with disabilities and English language learners in NAEP in light of federal guidelines;
clarify, elaborate, and revise their criteria as needed; and
standardize the implementation of these criteria at the school level.

RECOMMENDATION 4-2: NAEP officials should work with state assessment directors to review the policies regarding inclusion and accommodation in NAEP assessments and work toward greater consistency between NAEP and state assessment procedures.

RECOMMENDATION 4-3: NAEP should more clearly define the characteristics of the population of students to whom the results are intended to generalize. This definition should serve as a guide for decision making and the formulation of regulations regarding inclusion, exclusion, and reporting.

RECOMMENDATION 4-4: NAEP officials should evaluate, to the extent feasible given the data that are available, how closely their estimates of the percentages of students with disabilities and English language learners in a state match similar data collected and reported by states. Differences should be investigated to determine the causes.

In addition to those four recommendations to NAEP officials, we also recommend that:

RECOMMENDATION 4-5: Efforts should be made to improve the availability of data about students with disabilities and English language learners. State-level data are needed that report the total number of English language learners and students with disabilities by grade level in the state. This information should be compiled in a way that allows comparisons to be made across states and should be made readily accessible.