Read "Improving Schooling for Language-Minority Children: A Research Agenda" at NAP.edu

Suggested Citation:"5 STUDENT ASSESSMENT." Institute of Medicine and National Research Council. 1997. Improving Schooling for Language-Minority Children: A Research Agenda. Washington, DC: The National Academies Press. doi: 10.17226/5286.

5 Student Assessment

This chapter addresses the issue of assessing the language proficiency and subject matter knowledge and skills of English-language learners.1

State of Knowledge

Assessment plays a central role in the education of English-language learners and bilingual children. Teachers generally use assessments to monitor language development in students' first or second language and track the quality of their day-to-day subject matter learning. In addition, assessments are used to place students in special programs and to provide information used for accountability and policy analysis purposes. The research issues related to these roles have much in common.

Several uses of assessment at the classroom and school levels are unique to English-language learners and bilingual children, while others also apply to students generally. Uses unique to English-language learners and bilingual children include the following:

•	Identification of children whose English proficiency is limited

1The standards for assessing reading and writing developed by the International Reading Association and the National Council of Teachers of English, as well as those currently in development by Teachers of English to Speakers of Other Languages for assessing English proficiency, are consistent with and supportive of the model of assessment emerging from the review in this chapter.

•	Determination of eligibility for placement in specific language programs (e.g., bilingual education or English as a second language [ESL])
•	Monitoring of progress in and readiness to exit from special language service programs

Uses of assessment that extend beyond English-language learners include the following:

•	Placement in categorically funded education programs, such as special education, gifted and talented, and Title I programs
•	Placement in remedial or advanced academic course work
•	Monitoring of achievement in compliance with school district and/or state-level assessment programs
•	Certification for high school graduation and determination of academic mastery at graduation

In addition, the federal government sponsors a variety of assessments, such as the National Assessment of Educational Progress, to measure the performance and progress of U.S. students. Additional discussion of the National Assessment of Educational Progress and other large-scale assessments in relation to English-language learners is included in Chapter 9.

The remainder of this section begins by looking at issues of validity and reliability associated with student assessment. The next two subsections review uses of assessment that are unique to English-language learners and issues involved in assessing language proficiency. This is followed by two subsections that examine uses of assessment that extend beyond English-language learners and those associated with the assessment of subject matter knowledge. One additional set of assessment issues is then explored: those associated with assessing special populations, including very young second-language learners and English-language learners with disabilities. The chapter ends with a discussion of standards-based reform and its implications for the design and conduct of student assessments.

Validity and Reliability Issues

It is essential that those using any assessment impacting children's education strive to meet standards of validity and reliability (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1985). Validity concerns whether the inferences drawn from assessment outcomes are appropriate to the purposes of the assessment. It encompasses use of an assessment to measure current achievement and ability relative to specific performance criteria, as well as the potential for future achievement, and to investigate the underlying competencies that theory indicates should be tapped by an assessment. Reliability concerns the accuracy of assessment outcomes in light of the variations in those outcomes that are due to factors irrelevant to what the assessment was intended to measure. Such factors might include characteristics of the individual, the fact that the assessment represents only a sample of a larger universe of assessment items, and inconsistency of the scoring of performance on an assessment (such as a constructed response test) from scorer to scorer and across an individual's scoring of the same assessment. The issue of reliability is made more complex because these factors may interact in ways that are not readily measured for their impact on performance (Cronbach et al., 1995). The validity and reliability of assessments can be investigated using a wide range of psychometric and statistical procedures, as well as experimental and qualitative studies of assessment performance.
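As a concrete illustration of scorer-to-scorer inconsistency, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic. The chapter does not name a particular statistic, so the choice of kappa is an assumption made here for illustration, and the function name and the rubric scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Chance-corrected agreement between two scorers rating the same responses."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")
    n = len(scores_a)
    # Observed agreement: share of responses given the same score by both scorers.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Expected agreement if each scorer assigned scores independently,
    # at that scorer's own observed rate for each score category.
    freq_a = Counter(scores_a)
    freq_b = Counter(scores_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical rubric scores (0-3) from two scorers on ten constructed responses.
scorer_1 = [3, 2, 2, 1, 0, 3, 2, 1, 1, 0]
scorer_2 = [3, 2, 1, 1, 0, 3, 2, 2, 1, 0]
kappa = cohens_kappa(scorer_1, scorer_2)
```

Here the two scorers agree on 8 of 10 responses, but kappa is lower than 0.8 because some of that agreement would be expected by chance. A statistic of this kind addresses only one of the sources of unreliability listed above.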

Garcia and Pearson (1994:343-349) examine assessment and diversity across a wide range of subject matters and test types. They highlight potential problems for English-language learners that result from the "mainstream bias" of formal testing, including a norming bias (small numbers of particular minorities included in probability samples, increasing the likelihood that minority group samples are unrepresentative), content bias (test content and procedures reflecting the dominant culture's standards of language function and shared knowledge and behavior), and linguistic and cultural biases (factors that adversely affect the formal test performance of students from diverse linguistic and cultural backgrounds, including timed testing, difficulty with English vocabulary, and the near impossibility of determining what bilingual students know in their two languages).

The ensuing discussion of assessment as applied to English-language learners and bilingual children inherently involves questions about the validity and reliability of assessments and their appropriateness for these children. It is also important to note that assessment practices have social and educational consequences that should be considered in an ongoing program of validity research (Messick, 1988).

Assessment Purposes Unique to English-Language Learners

There are many purposes for assessments of language proficiency, including placing students in special services, monitoring their progress, predicting educational outcomes, and exiting students from special language services. According to four recent surveys, states and local districts use a variety of methods to determine which language-minority students have limited English proficiency, to place these students in special language-related programs, and to monitor the progress of the students in such programs (August and Lara, 1996; Cheung et al., 1994; Fleischman and Hopstock, 1993; Rivera, 1995). These methods include home language surveys, registration and enrollment information, observations, interviews, referrals, grades, and classroom performance and testing (Cheung et al., 1994). However, administration of language proficiency tests in English is the most common method (Fleischman and Hopstock, 1993). Fleischman and Hopstock found that 83 percent of school districts with English-language learners used English-language proficiency testing, either alone or in combination with other techniques, to determine which language-minority students were of limited English proficiency. Similarly, such tests were used by 64 percent of school districts for assigning English-language learners to specific instructional services in schools and by 74 percent of school districts for reclassifying students once they have developed English proficiency.

Achievement tests in English are also frequently used by school districts and schools to help identify English-language learners, assign them to school programs, and reclassify them when English proficient (Fleischman and Hopstock, 1993). Specifically, 52 percent of school districts and schools across the country use such tests to help identify English-language learners, 40 percent use them to help assign students to specific instructional programs within a school, and over 70 percent use them for reclassification purposes (as reported in Zehler et al., 1994).

There is a great deal of variability across school districts in the way assessments are used for the above purposes. This is because many states, while providing guidance to the districts on assessment procedures for students with limited English proficiency, allow them considerable flexibility in choosing assessment methods, assessment instruments (usually from a menu of state-approved instruments), and cutoff scores for these instruments (August and Lara, 1996).2
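The effect of district-level flexibility in cutoff scores can be sketched with a small example. The district names, cutoff values, and test score below are hypothetical and are not drawn from the surveys cited; the sketch only shows how one score on the same instrument can lead to different classifications.

```python
# Hypothetical cutoffs chosen by three districts for the same
# state-approved English proficiency instrument (scores 0-100).
DISTRICT_CUTOFFS = {"District A": 60, "District B": 70, "District C": 75}


def classify(score, cutoff):
    """Label a student as limited English proficient if the score is below the cutoff."""
    return "limited English proficient" if score < cutoff else "English proficient"


def classify_across_districts(score, cutoffs):
    """Apply each district's cutoff to the same score."""
    return {district: classify(score, cutoff) for district, cutoff in cutoffs.items()}


# One hypothetical student score, classified under each district's rule.
result = classify_across_districts(68, DISTRICT_CUTOFFS)
```

Under these assumed cutoffs, the same student would be considered English proficient in District A but of limited English proficiency in Districts B and C.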

Issues in Assessing Language Proficiency

Regardless of the modality of testing, many existing English-language proficiency instruments emphasize measurement of a limited range of grammatical and structural skills. Test items are frequently designed to assess a specific discrete language skill, though some tests and test items involve assessment of a number of discrete skills simultaneously. In part, emphasis on assessment of grammatical and structural control of a language is a legacy from first-language acquisition studies. First-language acquisition research was dominated, especially in the 1970s, by arguments between empiricists and nativists who used morphology and syntax as the primary battleground for framing our scientific understanding of language acquisition (Bialystok and Hakuta, 1994).

2Of the 25 states that have assessment requirements for determining which language-minority students are of limited English proficiency, 22 specify English proficiency tests. Of these 22 states, 8 also specify achievement tests, and 3 specify English proficiency tests and below-average performance based on grades or classwork. When assessment is used for program placement, similar procedures are used. In the other states, it is up to individual districts to set these policies. In some states, native-language proficiency assessments are required (Arizona, Hawaii, Utah, California, Texas, New Jersey) or recommended. The only information regarding methods for reclassifying students from language assistance programs (Cheung and Soloman, 1991) indicates that language tests are the most frequently used method (required in 36 percent of states, recommended in 30 percent), followed by content area tests (required in 34 percent of states, recommended in 11 percent). Other methods recommended for determining program exit include observations and interviews. About one-third of states reported having no state requirement regarding exit criteria.

During the 1970s and 1980s, new models of bilingual language competence emerged from the fields of linguistic pragmatics, interactional sociolinguistics, and cognitive studies of discourse processing. These perspectives, which were better attuned to the language demands faced by language-minority students in everyday settings (Rivera, 1984), examined how children acquire competence in using language to accomplish purposeful functions arising in social interaction (e.g., Wong Fillmore, 1982) and how language practices are tied to ongoing participation in classroom activities, referred to as authentic assessment (e.g., Gutierrez, 1995). As a consequence of these new models of language competence, Valdez Pierce and O'Malley (1992) recommend assessment procedures for monitoring the language development of language-minority students in the upper elementary and middle grades that reflect tasks typical of the classroom or real-life settings. As examples, they cite oral interviews, story retellings, simulations/situations, directed dialogues, incomplete story/topic prompts, picture cues, teacher observation checklists, and student self-evaluations. They also describe a portfolio assessment framework for monitoring the development of English-language learners. Authentic assessments are both more difficult to administer and less objectively scored than traditional assessments, but they do reflect the important view that language proficiency is multifaceted and varies according to the task demands and content area domain (see Chapter 2). Widespread implementation of practical assessments based on this viewpoint has been slow to emerge and is an important area for further research.

One promising approach has been developed by Royer and Carlo (1991). They report on the utility of a sentence verification technique test (which basically involves reading or listening to a passage and then marking sentences as to whether they correctly reflect the information in the passage). The authors suggest the passages can be developed locally, based on curricular material familiar to the student. This form of assessment is relatively easy to develop in any language, and the reliability and validity data appear strong.
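A minimal sketch of how responses on such a test might be tallied follows. It assumes only the yes/no marking described above and a simple proportion-correct score. The function name, answer key, and responses are hypothetical, and the sketch is not presented as the scoring procedure used by Royer and Carlo.

```python
def svt_proportion_correct(answer_key, responses):
    """Share of test sentences the student judged correctly.

    Both arguments are lists of booleans, one per test sentence:
    True means "this sentence reflects the information in the passage".
    """
    if len(answer_key) != len(responses) or not answer_key:
        raise ValueError("need two equal-length, non-empty lists")
    correct = sum(key == given for key, given in zip(answer_key, responses))
    return correct / len(answer_key)


# Hypothetical key and student responses for eight test sentences
# written about a locally developed passage.
key = [True, False, True, True, False, False, True, False]
student = [True, False, False, True, False, True, True, False]
score = svt_proportion_correct(key, student)
```

In this made-up case the student judges six of the eight sentences correctly. Because the passages and sentences are written locally, the same tally could be applied to a passage in any language.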

However, in pursuing new assessments of language proficiency for English-language learners and bilingual children, we should not ignore existing language assessment methods that focus on discrete language skills, even though there are differing beliefs about which components are most critical. For example, evidence exists throughout the cognitive and psycholinguistic research literature that routinization of basic language recognition and production skills is associated with greater fluency in language use at the level of spoken and written discourse (McLaughlin, 1984). Thus the assessment of these skills is a legitimate endeavor, though it is important to recognize that such assessments may have good predictive ability because they are tapping an ability correlated with a variety of language proficiencies, not because they constitute language proficiency.

In summary, the major purpose of English-language proficiency testing has been to determine placement in special language programs, monitor students' progress while in these programs, and decide when students should be exited from these programs. Most measures used not only have been characterized by the measurement of decontextualized skills, but also have set fairly low standards for language proficiency. Ultimately, English-language learners should be held to high standards for both English language and literacy, and should transition from special language measures to full participation in regularly administered assessments of English-language arts.

Assessment Purposes That Extend Beyond English-Language Learners

The assessment policies discussed in this section are related to determining eligibility for federal assistance and monitoring student progress at the state and district levels.

Title I is by far the largest federal program serving English-language learners. Yet past practice in using tests to assess eligibility for such programs raises a number of issues. For example, in documenting district policies, Strang and Carlson (1991) found that many English-language learners were not being served through Title I because districts required students to be English proficient before they could be served. However, those English-language learners who met the English proficiency requirements also scored above the cut-off on English achievement tests used for Title I selection.

New Title I assessment policy is currently being discussed because of changes in the law (see Kober and Feuer, 1996). Those changes provide for the participation of all students, including English-language learners, in assessments to determine whether they are meeting performance standards and for reasonable adaptations of these assessments to accomplish this end. According to the law, English-language learners are to be included in assessments to the extent practicable, in the language and form most likely to yield accurate and reliable information on what they know and can do, including their mastery of skills in target subject matter areas, not just English. The law now further requires that each state plan identify the languages other than English that are present in the participating student population and indicate the languages for which yearly student assessments are not available and are needed. States are required to make every effort to develop such assessments and may request assistance from the Secretary of the Department of Education if linguistically accessible assessment measures are needed (see August et al., 1995).

Assessment is particularly important for purposes of selecting eligible students for services in Title I targeted assistance programs, whereby Title I services are made available to a subset of the students "on the basis of multiple, educationally related, objective criteria established by the local educational agency and supplemented by the school" (Section 1115). The current policy guidance provided by the U.S. Department of Education does not elaborate on how this might be accomplished for English-language learners, and leaves it up to local districts to select those eligible students "most in need of special services." In the absence of adaptations to assessments, including assessments conducted in the native language, as well as methods for determining how English-language learners compare with other students on educational needs, a large proportion of English-language learners may not be served through Title I.

Surveys of state-wide assessment systems (August and Lara, 1996; Rivera, 1995) show that states use a variety of measures to assess student performance, including performance-based assessments and standardized achievement tests, and that states are in various stages of incorporating English-language learners into these assessments. August and Lara (1996) found that only 5 states require English-language learners to take state-wide assessments required of other students;3 36 states exempt English-language learners from such assessments, although 22 of those states require these students to take the assessments after a given period of time (usually 1-3 years). Some states base their assessment decision on the proficiency level of their English-language learners; of these, a few leave it up to local districts to determine which students have enough English proficiency to participate in the state-wide assessments. Finally, some states use multiple criteria to excuse students from state-wide assessments, including number of years in English-speaking classrooms, language proficiency scores, school achievement, and teacher judgment.

States use a variety of approaches to assess students that have been exempted from the state-wide assessments. Hafner (1995) reports that 55 percent of states allow modifications in the administration of at least one of their assessments to incorporate English-language learners. The most common modifications are extra time (20 states), small-group administration (18 states), flexible scheduling (16 states), simplification of directions (14 states), use of dictionaries (13 states), and reading of questions aloud in English (12 states). Other accommodations include assessments in languages other than English, availability of both English and non-English versions of the same assessment items, division of assessments into shorter parts, and administration of the assessment by a person familiar with the children's primary language and culture (Rivera, 1995).

Some states also provide guidance to scorers on evaluating the work of English-language learners. Hafner (1995) reports that 10 percent of states give special training on evaluating the work of English-language learners, and 10 percent give directions in their manuals. Some training entails the development of scoring rubrics and procedures for constructed response items that are sensitive to the language and cultural characteristics of English-language learners. The Council of Chief State School Officers recently developed a Scorer's Training Manual (Wong Fillmore and Lara, 1996) to be used by states and local education agencies to aid in the scoring of English-language learners' answers to open-ended mathematics questions. In collaboration with the National Center for Education Statistics and the Educational Testing Service, this manual will be piloted using the work of English-language learners who participated in the 1996 National Assessment of Educational Progress math assessment to see how well it prepares scorers to assess the work of those students accurately.

3In 3 of these states, however, English-language learners may be exempted under certain conditions.

Clearly, classroom teachers also assess students to determine how well they are grasping coursework and to inform instructional practice (see Chapter 7). Innovations at the classroom level include an assessment process that is multiple referenced and incorporates information about the students in a variety of contexts obtained from a variety of sources through a variety of procedures (Genesee and Hamayan, 1994). Navarette et al. (1990) describe innovative assessment procedures that include unstructured techniques (e.g., writing samples, homework, logs, games, debates, story telling) and structured techniques (e.g., criterion-referenced tests, cloze tests, structured interviews), as well as a combination of the two (portfolios). In addition, students are assessed in their native language to better determine their academic achievement and ensure appropriate coursework (Genesee and Hamayan, 1994). Information on student background characteristics such as literacy in the home, parents' educational backgrounds, and previous educational experiences is collected and provides essential information that helps put the assessment results in context.

Issues in Assessing Subject Matter Knowledge

A central issue in assessing subject matter knowledge is determining what knowledge is intended for assessment. This issue is discussed in detail in the later section on standards-based reform. In the discussion in this section, we assume that the developers of an assessment have decided what to assess and examine the difficulties involved in incorporating English-language learners and bilingual children into assessments intended for their English-proficient peers.

As noted in the Standards for Educational and Psychological Testing, every assessment is an assessment of language (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1985). This is even more so given the advent of performance assessments requiring extensive comprehension and production of language.4

4For example, the performance description for mathematical communication, one of seven mathematical performance areas for elementary school children, requires the student to "use appropriate mathematical terms, vocabulary, and language based on prior conceptual work; show ideas in a variety of ways including words, numbers, symbols, pictures, charts, graphs, tables, diagrams, and models; explain clearly and logically solutions to problems, and support solutions with evidence, in both oral and written form; consider purpose and audience when communicating; and comprehend mathematics from reading assignments and from other sources" (New Standards, 1995). Quite clearly, this assessment of mathematical skills is also an assessment of language proficiency.

The English-language proficiency levels of students affect their performance on subject area assessments given in English. For example, Garcia (1991) found that the English reading test performance of Spanish-speaking Hispanic students was adversely affected by their unfamiliarity with vocabulary terms used in the test questions and answer choices. In fact, interview data demonstrate that the presence of unknown vocabulary in the questions and answer choices was the major linguistic factor that adversely affected the Hispanic children's reading performance.5 Alderman (1981) found that the relationship between test scores on Prueba de Aptitud Academica (a Spanish version of the SAT developed for use with native Spanish speakers) and English SAT scores increased with higher English proficiency test scores for native Spanish-speaking high school students. This study indicated that aptitude can be seriously underestimated if the test taker is not proficient in the language in which the test is being given.

Given that the English proficiency level of students affects their performance on assessments administered in English and that recent assessments require high levels of English proficiency, research is needed to develop assessments and assessment procedures appropriate for English-language learners. One strategy under active investigation is the use of native-language assessments. Approximately 75 percent of English-language learners come from Spanish-language backgrounds. For some of these students, it is realistic to develop native-language assessments. However, in doing so, it is desirable to keep in mind the difficulties involved in developing native-language assessments that are equivalent to the English versions. Such difficulties include problems of regional and dialect differences, nonequivalence of vocabulary difficulty between two languages, problems of incomplete language development and lack of literacy development in students' primary languages, and the extreme difficulty of defining a "bilingual" equating sample (each new definition of a bilingual sample will demand a new statistical equating). Minimally, back-translation should be done to determine equivalent meaning, and ideally, psychometric validation should be undertaken as well.6

The challenge of using native-language assessments or bilingual versions is illustrated by the results of research on developing and administering mathematics test items only in Spanish or in side-by-side Spanish-English format as part of the National Assessment of Educational Progress field test of mathematics items (Anderson et al., 1996). Spanish-language items were translations of English-version items. This research found substantial psychometric discrepancies in students' performance on the same test items across both languages, leading to the conclusion that the Spanish and English versions of many test items were not measuring the same underlying mathematical knowledge. This result may be attributable to a lack of equivalence between original and translated versions of test items and needs further investigation.

5Garcia (1991) also found that the Hispanic students' English reading test performance was adversely affected by their limited prior knowledge of certain test topics, their poor performance on the implicit questions (which required use of background knowledge), and their tendency to interpret the test literally when determining their answers. These findings have implications for the schooling of English-language learners (see Chapters 3 and 7).

6Hambleton and Kanjee (1994) recommend validating the translated version with empirical evidence using item response theory.

Another strategy to make assessments both comprehensible and conceptually appropriate for English-language learners might entail decreasing the English-language load through actual modification of the items or instructions. This would not be a straightforward task, however. While some experts recommend reducing nonessential details and simplifying grammatical structures (Short, 1991), others claim that simplifying the surface linguistic features will not necessarily make the text easier to understand (Saville-Troike, 1991). When Abedi et al. (1995) reduced the linguistic complexity of National Assessment of Educational Progress mathematics test items in English, they reported only a modest and statistically unreliable effect in favor of the modified items for students at lower levels of English proficiency.

Other strategies for incorporating English-language learners into assessments include those mentioned earlier, such as extra time, small-group administration, flexible scheduling, reading of directions aloud, use of dictionaries, and administration of the assessment by a person familiar with the children's primary language and culture (Rivera, 1995). Additional possibilities include making test instructions more explicit and allowing English-language learners to display their knowledge using alternative forms of representation (e.g., showing math operations on numbers and knowledge of graphing in problem solving). Almost no research has been conducted to determine the effectiveness of these techniques, however.

Another issue in assessment of subject matter knowledge for English-language learners is the errors that result from inaccurate and inconsistent scoring of open-ended or performance-based measures. There is evidence that scorers may pay attention to linguistic features of performance unrelated to the content of the assessment. Thus, scorers may inaccurately assign low scores for performance in which English expression (either oral or written) is weak. This obviously confounds the accuracy of the score enormously.7 Absent training, different scorers probably will rate the same student work very differently.

7Interestingly, Lindholm (1994) found highly significant and positive correlations between standardized scores of Spanish reading achievement and teacher-rated reading rubric scores, as well as between the standardized reading scores and students' ratings of their reading competence, for native English-speaking and native Spanish-speaking students enrolled in a bilingual immersion program.

Issues in Assessing Special Populations

Very Young Second-Language Learners

Assessing young children's development in meaningful ways is already surrounded by a great deal of controversy and concern among the preschool education community. As Meisels (1994:210-211) states:

…measurement in preschool is marked by recurrent practical problems of formulation and administration. … Many measurement techniques used with older children are inappropriate for use with children below school age, or even below grade 3. For example, the following methods are extremely unlikely to yield valid information about normative trends in development: paper and pencil questionnaires, lengthy interviews, abstract questions, fatiguing assessment protocols, extremely novel situations or demands, objectively-scored, multiple choice tests, isolated sources of data. None of these methods are consistent with principles of developmentally appropriate assessment.

If none of these practices are appropriate for young children in general, their inappropriateness for children from different linguistic and/or cultural backgrounds can certainly be taken as a given.

For these reasons, McLaughlin et al. (1995:7-8) have called for a special set of guidelines to be used in assessing bilingual preschool children. These guidelines include the following:

•	Developmental and cultural appropriateness
•	Awareness of the child's linguistic background
•	An approach that allows children to demonstrate what they can do
•	Involvement of parents and family members, teachers, and staff, as well as the child

Using these guidelines, McLaughlin et al. recommend what they call "instructionally embedded assessment," in which teachers make a plan about what, when, and how to assess a child; collect information from a variety of sources, including observations, prompted responses, classroom products, and conversations with family members; develop a portfolio; write narrative summaries; meet with family and staff; and finally, use the information to inform curriculum development. This is a recursive process that begins again once it has been completed for any individual child. An assessment system of this sort is, of course, extremely time-consuming and necessitates reform in several areas, including use of time, professional staff development, accountability, and relationships with parents. It may, however, be the only meaningful way teachers can assess young second-language learners.

Children with Disabilities

The field still lacks instruments appropriate for assessing English-language learners with disabilities. This problem is exacerbated by the lack of assessment personnel with expertise in evaluating linguistically and culturally diverse learners. Among the most commonly recommended approaches to nondiscriminatory assessment are the use of nonverbal measures (e.g., the Performance Scale of the Wechsler Intelligence Scales for Children [WISC], the Leiter International Performance Scale); translation of instruments into the student's native language; culture-free, culture-fair tests; culture-specific tests; and pluralistic assessments (Shinn and Tindal, 1988). Shinn and Tindal caution, however, that many of these alternative assessment instruments have inadequate psychometric properties and may not provide a comprehensive picture of students' skills and abilities. For example, even so-called nonverbal tests have a verbal component (e.g., instructions for item completion are usually required). Translation of instruments into the student's native language is difficult to do well; moreover, some English-language learners may not be literate in their native language. Furthermore, if norms for native-language test versions are not available, assessment personnel may interpret results using English norms, a practice that may give inaccurate results. And because learning occurs in environmental contexts, it is not possible to develop culture-free or culture-fair tests. The practical strategy may be to train assessment personnel rather than await the development of norm-referenced instruments appropriate for English-language learners.

The literature does identify several promising practices in assessment of English-language learners with disabilities that may be useful as well for inclusion of all English-language learners in local and state assessments. Tharp and Gallimore (cited in Durán, 1989) recommend a process of assisted performance in which the teacher first assesses the student's learning performance and then aids the learner in attaining new competencies. Durán also recommends the use of dynamic assessment (e.g., Feuerstein's [1979] Learning Potential Assessment Device), which involves a test-train-test cycle during which a student's response to a criterion problem is evaluated, and feedback is given to help improve performance. Lewis (1991) recommends the use of the Kaufman Assessment Battery for Children because it separates the mental processing scores from the achievement scores and because it includes a training component to ensure that the student understands the task. He suggests that this approach accommodates different cognitive processing styles, an advantage in assessing diverse cultural groups. He further suggests that Feuerstein's dynamic assessment approach and the Kaufman Assessment Battery for Children are more advantageous than instruments like the Wechsler Intelligence Scales for Children-Revised (WISC-R) because they deemphasize factual information and learned content and focus instead on problem-solving tasks.

Because of the myriad of factors that must be considered in distinguishing linguistic and cultural differences from disabilities, ecological models of assessment are recommended so that learning problems are examined in light of contextual variables affecting the teaching-learning process, including the interaction of teachers, students, curriculum, instructional variables, and so forth. Assessors must consider the student's native- and English-language skills, select appropriate measures for assessing skills across languages, and interpret outcomes in light of factors such as the student's age and cultural and experiential background (Cloud, 1991).

Standards-Based Reform

The standards-based reform movement has major implications for English-language learners, especially in the area of assessment. This section first provides background information about standards-based reform and then examines the implications for English-language learners.

Background

The standards-based reform movement commonly refers to three types of standards: content, performance, and opportunity-to-learn standards (McLaughlin and Shepard, 1995). Content standards address what students should know, what schools should teach, and what instruction should be about. Such standards in a broad range of subject matters, such as language arts, mathematics, science, social studies, history, geography, foreign language, art, and physical education, are being developed through the efforts of a number of educational stakeholder institutions at the national, state, and local community levels and subject matter professional groups. Well-publicized efforts of this sort include the mathematics content standards developed by the National Council of Teachers of Mathematics (1989) and the controversial language arts standards developed by the International Reading Association and the National Council of Teachers of English (see the discussion of literacy in Chapter 3). More recently, these efforts have also included the development of model standards by the Teachers of English to Speakers of Other Languages (1996) to guide instruction and assessment of English skills and knowledge for non-English-background children and adults.

Content standards are critical to the assessment and evaluation process. In essence they operationalize an implicit theory of what education can be about for children. While content standards begin to specify broad curriculum goals, performance standards are intended to specify concrete examples and explicit definitions of what students must know and be able to do to demonstrate that they are proficient in the skills and knowledge framed by content standards (McLaughlin and Shepard, 1995). The emphasis of performance standards is on evidence that provides information about the degree of mastery or proficiency shown by students in a content area. Performance standards address the question: "How good is good enough?" (New Standards, 1995).

Opportunity-to-learn standards define the level and availability of programs, staff, and other resources sufficient to enable all students to meet challenging content and performance standards (McLaughlin and Shepard, 1995).

The passage of Goals 2000 and the reauthorization of the Elementary and Secondary Education Act (now called the Improving America's Schools Act) in 1994 set in legislation many of the tenets of standards-based reform (see Annex 5-1). Goals 2000 provided resources to states and local districts for developing standards and assessments and implementing state and local improvement initiatives. The Improving America's Schools Act adopted the standards-based framework most conspicuously in its Title I compensatory education programs, which serve a large number of English-language learners (Moss and Puma, 1995). The state plan, the local education agency plan, a demonstration of yearly progress, and school improvement plans are all framed around standards and their assessment. Title VII was also framed around standards, with greater emphasis on school- and district-wide programs to include English-language learners in systemic improvement efforts focused on attaining performance standards.

Efforts to develop performance standards and assessments tied to those standards have been spearheaded by groups such as New Standards (1995) and the Council of Chief State School Officers (1992), as well as by numerous states and consortia of school districts. Assessments based on performance standards can employ a variety of assessment techniques, including familiar standardized multiple-choice test items; brief constructed response exercises, such as fill-in items; extended constructed response problem-solving exercises, such as essays and written or oral explanations; projects; and aggregations of student work in the form of portfolios or other collections. Regardless of the technique employed, however, all such assessments aspire to "faithfully reflect important learning goals.… Individual assessment tasks should elicit the kinds of demonstrations and applications of knowledge ultimately expected of students; and the complete assessment should represent the extent and range of knowledge expected" (McLaughlin and Shepard, 1995:52).

Individual states or other education agencies responsible for a standards-based system develop specifications that serve as guidelines for the creation of performance assessments.8 The actual development and technical study of assessments are then undertaken by the state or other appropriate education agencies, or alternatively by test publishers sanctioned to perform such work. Collectively, all of these groups bear professional and scientific responsibility for investigating the validity and reliability of assessment systems for all children, including English-language learners and bilingual students (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1985).

8The term "performance assessment" is sometimes used in an ambiguous manner. On some occasions, it is used to refer to any form of assessment pertinent to performance standards. For example, a multiple-choice question requiring recall of a key fact or concept might be consistent with a performance standard tied to a content standard. On other occasions, the term is used to denote assessment exercises requiring constructed responses or complex open-ended performances.

Implications for English-Language Learners

Both Goals 2000 and the Improving America's Schools Act state explicitly that all students, including English-language learners, are expected to attain high standards. For example, program accountability provisions in both Title I and Title VII are framed around the need to demonstrate that students in these programs are meeting state and local performance standards for all students (see Annex 5-1). The demonstration of results has been a particularly complex issue for English-language learners because of the unavailability of assessments suited to their needs, as discussed previously.

Issues of validity and reliability in assessing the subject matter knowledge and skills of English-language learners were discussed earlier in this chapter. Another assessment issue related to standards-based reform is how to define adequate yearly progress for English-language learners. The new Title I law, for example, requires that adequate yearly progress be defined in a manner that "…is sufficient to achieve the goal of all children served under [this part in] meeting the State's proficient and advanced levels of performance, particularly economically disadvantaged and LEP students." Yearly progress as defined by the law pertains to the progress of districts and schools, measured by the aggregation of individual student scores on assessments aligned with performance standards. According to the law, the same high performance standards that are established for all students are the ultimate goal for English-language learners as well.

On average, however, English-language learners (especially those with limited prior schooling) may take more time to meet these standards. Therefore, additional benchmarks might be developed for assessing the progress of these students toward meeting the standards. Moreover, because English-language learners are acquiring English-language skills and knowledge already possessed by students who arrive in school speaking English, additional content and performance standards in English-language arts may be appropriate. Recently, Teachers of English to Speakers of Other Languages has developed model content standards to guide the instruction and assessment of English skills and knowledge for such students.

Another issue related to adequate yearly progress has to do with districts' obligation to determine whether schools served by Title I funds are progressing sufficiently toward enabling all children to meet the state's student performance standards. According to the law, adequate progress is defined as that which results in continuous and substantial yearly improvement of each district and school, sufficient to achieve the goal of having all children, particularly the economically disadvantaged and English-language learners, meet the state's proficient and advanced levels of performance. To determine whether English-language learners are meeting these standards, assessment results will have to be disaggregated by English proficiency status. Some states, such as Florida, Hawaii, Louisiana, Maine, Ohio, and Washington, report disaggregating data by English proficiency status (August and Lara, 1996). However, research is needed to determine how best to accomplish this in statistically sound ways, especially in light of alternative assessment procedures used with English-language learners.
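As an illustration of what disaggregation by English proficiency status involves at its simplest, the following sketch tallies the percentage of students at or above a proficiency cut score within each status group. It is only a minimal example under stated assumptions: the group labels, the scale scores, the cut score of 240, and the function name are all hypothetical and are not drawn from any state's reporting system.

```python
from collections import defaultdict


def percent_proficient_by_status(records, cut_score):
    """Return {status: (percent at or above cut_score, number of students)}."""
    totals = defaultdict(int)
    meeting = defaultdict(int)
    for status, score in records:
        totals[status] += 1
        if score >= cut_score:
            meeting[status] += 1
    return {s: (100.0 * meeting[s] / totals[s], totals[s]) for s in totals}


# Hypothetical (status, scale score) pairs for one school.
records = [
    ("ELL", 210), ("ELL", 250), ("ELL", 190), ("ELL", 230),
    ("non-ELL", 260), ("non-ELL", 240), ("non-ELL", 230), ("non-ELL", 255),
]
summary = percent_proficient_by_status(records, cut_score=240)
for status, (percent, n) in sorted(summary.items()):
    print(f"{status}: {percent:.1f}% proficient (n={n})")
```

The group size is returned alongside each percentage because a percentage based on a handful of students is unstable from year to year, which is one reason the question of statistical soundness arises for schools that enroll few English-language learners.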

Because of the difficulties in assessing English-language learners, it may be important to assess their access to necessary resources and conditions, such as adequate and appropriate instruction. Defining and assessing these conditions is a very difficult task, however. Although there has been substantial work in defining some conditions, such as content coverage and time for mainstream students (Carroll, 1958; Leinhardt, 1978), the research base for defining the most important and effective resources and conditions for English-language learners is very weak (see Chapter 7). Even so, many English-language learners find themselves in poor schools and do not have access to the basics of education necessary for success in school. A good start would be to define and assess these essential resources (e.g., textbooks, course offerings, accessibility of information) while continuing research into other aspects of school life, such as effective school-wide and classroom attributes.

In terms of improving opportunities to learn for English-language learners, another strategy would be to encourage the development and evaluation of methods to help school staff monitor progress in improving schooling through systematic attempts to compare their school's performance against certain quality indicators.9 This notion is further elaborated in Chapter 7.

Research Needs

A relatively small number of student assessment research needs stand out as candidates for highest priority given our existing knowledge base. To address these needs, there must be coordination with the research findings on the linguistic and cognitive development of children (see Chapters 2 and 3).

Issues in Assessing Language Proficiency

5-1. Research is needed on how assessments of children's language proficiency in their primary language and English can be improved so they are consistent with research findings on first- and second-language acquisition and literacy development.

9California, for example, has a Program Quality Review System that relies on peer review. Additional benchmarks could include school-wide and classroom factors that are known to improve the performance of English-language learners.

Existing English-language proficiency instruments emphasize measurement of a limited range of frequently discrete language skills, such as grammar and syntax. Assessments of language proficiency need to be broadened to reflect findings from research in such fields as linguistic pragmatics, interactional sociolinguistics, and cognitive studies of discourse processing that better reflect the language demands placed on language-minority children in everyday contexts (although existing methods for assessing discrete skills should not be ignored).

New research on language proficiency, building on research on social factors in school learning highlighted in the previous chapter, should attend to issues such as sensitivity to bilingualism as a social phenomenon and should take into account the potential impact of bilingualism on language proficiency assessment (Valdes and Figueroa, 1994). We need to know more about how community language use affects the development of proficiency in two languages (see Chapter 2). Verhoeven (1996), for example, has found that immigrant children may acquire a less-developed knowledge of grammar in their first language as a result of their limited exposure to use of that language in their new communities. Acquiring language in a bilingual community may lead to variations in both the first and second languages that incorporate grammatical, lexical, and idiomatic features from the other language.

5-2. Research is needed on how to use assessments to determine levels of proficiency in different aspects of English required for English-language learners to participate in English-only instruction. What are the measurement issues associated with the determination of these aspects? How do these proficiency requisites vary by subject and grade?

Although many states and local districts have established performance standards for exit from special language assistance programs, these standards have not been validated by tracking student performance in mainstream classrooms. Proficiency requisites may vary by subject since some content areas are more dependent on language than others (for example, reading versus math). They will vary by age since language becomes less contextualized in the upper grades.

Issues in Assessing Subject Matter Knowledge

5-3. Research on the assessment of subject matter knowledge needs to address the following questions. First, how do students' levels of English proficiency affect their performance on subject area assessments given in English? Second, how does verbal facility with a first language affect performance on assessments in the native language? Third, how does the language used for instruction affect performance on assessments in the native language?

Research to date (Alderman, 1981) has found that a student's aptitude in a subject can be significantly underestimated if the test is administered in a language in which the student has limited proficiency. Further research is needed to explore this relationship.

With regard to the question of how facility with a first language affects performance on assessments in that language, it is not appropriate to approach this issue as a question of "proficiency" in the native language; such an approach makes neither theoretical nor empirical sense since native speakers acquire proficiency in a first language through their early socialization and additional capacity for proficiency through biological maturation. Instead, the key issues in assessment surround children's familiarity with the kind of language used on an assessment in the first language. For example, we need to investigate how well children understand assessment instructions in their first language—a peculiar usage of language that depends on previous experience with tests of the sort being administered. Research is also needed on how the language used for instruction affects assessment performance in a primary language. For example, do students with communicative competence in their native language but schooling in English perform better when assessed in English than when assessed in their native language? What effect do native language proficiency, years of schooling in English, and difficulty of subject matter assessed have on their performance?

5-4. Research is needed to develop assessments and assessment procedures that incorporate more English-language learners. Further, research is needed toward developing guidelines for determining when English-language learners are ready to take the same assessments as their English-proficient peers and when versions of the assessment other than the "standard" English version should be administered.

This research should include attention to the language and performance demands of assessments and assessment instructions that are separate from the content and domain under assessment. It should also include investigation of the effect of any modifications on the validity and reliability of the assessment. We need to understand better the interaction between the performance of English-language learners and the nature of the assessment. That is, do certain assessment formats (e.g., multiple choice versus constructed response) make it more difficult or easier for such students to express subject matter knowledge?

Criteria are needed as well for determining which English-language learners should take which form of an assessment—an unmodified English version, a native-language version, a modified English version, an English assessment with support, or some other alternative assessment mode (see August and McArthur, 1996).

5-5. Research is needed to address inaccurate and inconsistent scoring of open-ended or performance-based measures of the work of English-language learners. How can errors resulting from such inaccurate and inconsistent scoring be reduced?

We need to understand the mechanisms by which the filter of English can influence scorers' accuracy and consistency and ways in which the scoring of English-language learner assessments can be improved.

Standards-Based Reform

5-6. Research needs to address whether it is possible to establish common, standard benchmarks for subject matter knowledge and English proficiency for English-language learners within a valid theoretical framework; what these benchmarks might be; and how the benchmarks for English proficiency might be related to performance standards for English-language arts.

5-7. Research is needed to determine whether, in the context of school and district outcomes, English-language learners are making progress toward meeting proficient and advanced levels of performance. How can the outcomes of nonstandard administrations/alternative assessments be incorporated into district- and state-wide accountability systems and reporting requirements?

5-8. Research is needed into how opportunities to learn can be evaluated.

Standards-based reform has contributed to a redefinition of the role of assessment that has implications for English-language learners. These policies call for the inclusion of these students in assessments, for assessments that are systematically linked to standards systems at the district and state levels, and for evaluations of programs at the school site and district levels to ensure that students are meeting the standards. Although current policy does not require assessments of school and classroom conditions and resources that make it possible for students to meet new standards, educators concerned with helping language-minority children are interested in assessing these opportunities to learn.

Annex:
Legislative Context For Standards And Assessment

Legislation passed by Congress in recent years contains several consistent themes regarding student assessment, program evaluation (see Chapter 6), and standards with respect to English-language learners. These themes provide important opportunities for directed research to guide the policy process. Legislative language expressing these themes can be found in Goals 2000 (P.L. 103-227), Title I (Helping Disadvantaged Children Meet High Standards) and Title VII (Bilingual Education Programs) of the Improving America's Schools Act of 1994 (P.L. 103-382), and the Reauthorization of the Office of Educational Research and Improvement (Title IX of P.L. 103-227). The themes might be encapsulated as follows:

•	Standards and assessments are to fully include English-language learners.
•	Innovative ways of assessing student performance are encouraged, including modifications to existing instruments for English-language learners.
•	Programs are to be evaluated with respect to whether they meet "challenging" performance standards, rather than on a normative or comparative basis.
•	Evaluations are to be useful for program improvement as well as program accountability.

The following subsections summarize key provisions of the major legislation.

Department of Education Organization Act of 1994

According to Section 216(b)(3) of this act:

The Secretary shall ensure that limited-English-proficient and language-minority students are included in ways that are valid, reliable, and fair under all standards and assessment development conducted or funded by the Department.

Goals 2000

Goals 2000 provides resources to states and communities to develop and implement systemic education reforms aimed at helping all students meet challenging academic and occupational standards. The law defines "all students" as meaning "students or children from a broad range of backgrounds and circumstances, including among others, students or children with limited English proficiency."

The law authorizes grants to states and local education agencies (LEAs) to help defray the cost of developing, field testing, and evaluating assessment systems that are aligned with state content standards. It sets aside a portion of funds for developing assessments in languages other than English.

Goals 2000 further authorizes federal grants to state education agencies (SEAs) for the purpose of developing a state plan to improve the quality of education for all students. Development of the state plan is to include establishment of teaching and learning standards and assessments aligned with these standards, as well as strategies for program improvement and accountability.

Title I

The law requires states to develop or adopt a set of high-quality yearly assessments, including assessments in at least reading or language arts and math, to be used as the primary means of determining the yearly performance of each LEA and school served under Title I in enabling all children to meet the state's student performance standards. (If states are using transitional assessments, they must devise a procedure for identifying LEAs and schools for improvement, and this procedure must rely on accurate information about the academic progress of each LEA and school.)

The law states that the same assessments must be used to measure the performance of all children. It specifies that assessments must be aligned with challenging content and student performance standards; provide coherent information about student attainment of such standards; be used for purposes for which such assessments are valid and reliable; measure the proficiency of students in the academic subjects in which a state has adopted challenging content and student performance standards; be administered at some time during grades 3 through 6, 6 through 9, and 10 through 12; and involve multiple up-to-date measures of student performance. The assessments are to provide for the participation of all students; reasonable adaptations and accommodations for students with diverse learning needs; and the inclusion of English-language learners, who are to be assessed, to the extent practicable, in the language and form most likely to yield accurate and reliable information on what they know and can do, so that their mastery of skills in subjects other than English can be determined.

Furthermore, the law states that adequate yearly progress must be defined in a manner that is consistent with guidelines established by the Secretary of Education and that results in continuous and substantial yearly improvement of each LEA and school. Such improvement must be sufficient to achieve the goal of enabling all children served under this part of the legislation, particularly economically disadvantaged students and English-language learners, to meet the state's proficient and advanced levels of performance. Moreover, progress must be linked primarily to performance on the assessments carried out under this section of the legislation, while also being established in part through the use of other measures.

Title VII

The law clearly indicates the purposes of evaluations for programs funded under Subpart 1 (Bilingual Education Capacity and Demonstration Grants): "(1) for program improvement, (2) to further define the program's goals and objectives, and (3) to determine program effectiveness."

Evaluations are to address student achievement using state student performance standards (if any), including data comparing English-language learners and other students on school retention, academic achievement, and gains in English (and where applicable, the non-English language) proficiency. The evaluations are also required to incorporate "program implementation indicators that provide information for informing and improving program management and effectiveness," including information on the curriculum and professional development. In addition, evaluations must describe the relationship of activities funded under Title VII to the overall school program and activities conducted through other sources.

Evaluations have consequences for comprehensive school grants and system-wide improvement grants. These programs are to be terminated if students "are not making adequate progress toward achieving challenging State content standards and challenging State student performance standards," and "in the case of a program to promote dual language facility, such program is not promoting such facility" (Sections 7114(b)(2) and 7115(b)(2)).

Subpart 2 of Title VII authorizes funds for data collection, dissemination, research, and program evaluation through grants, contracts, and cooperative agreements. Current or recent recipients of program grants may conduct longitudinal research to monitor the students. Funds are also made available for activities to promote the adoption and implementation of "programs that demonstrate promise of assisting children and youth of limited English proficiency to meet challenging State standards."

References

Abedi, J., C. Lord, and J. Plummer
1995 Language Background Report. Graduate School of Education, National Center for Research on Evaluation, Standards, and Student Testing. Los Angeles: University of California at Los Angeles.

Alderman, D.
1981 Language proficiency as a moderator variable in testing academic aptitude. Journal of Educational Psychology 74:580-587.

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education
1985 Standards for Educational and Psychological Testing. Washington, DC: American Psychological Association.

Anderson, N.E., F.F. Jenkins, and K.E. Miller
1996 NAEP Inclusion Criteria and Testing Accommodations. Findings from the NAEP 1995 Field Test in Mathematics. Washington, DC: Educational Testing Service.

August, Diane, and Julia Lara
1996 Systemic Reform and Limited English Proficient Students. Washington, DC: Council of Chief State School Officers.

August, Diane, Kenji Hakuta, Fernando Olguin, and Delia Pompa
1995 LEP Students and Title I: A Guidebook for Educators. Washington, DC: National Clearinghouse for Bilingual Education.

August, Diane, and Edith McArthur
1996 Proceedings of the Conference on Inclusion Guidelines and Accommodations for Limited English Proficient Students in the National Assessment of Educational Progress (December 5-6, 1994). National Center for Education Statistics, Office of Educational Research and Improvement, U.S. Department of Education, Washington, DC.

Bialystok, E., and K. Hakuta
1994 In Other Words. New York: Basic Books.

Carroll, John B.
1958 Communication theory, linguistics, and psycholinguistics. Review of Educational Research 28(2):79-88.

Cheung, Oona M., and Lisa W. Soloman
1991 Summary of State Practices Concerning the Assessment of and the Data Collection about Limited English Proficient (LEP) Students. Washington, DC: Council of Chief State School Officers.

Cheung, Oona M., Barbara S. Clements, and Y. Carol Mieu
1994 The Feasibility of Collecting Comparable National Statistics about Students with Limited English Proficiency. Washington, DC: Council of Chief State School Officers.

Cloud, N.
1991 Educational assessment. Pp. 219-245 in E.V. Hamayan and J.S. Damico, eds., Limiting Bias in the Assessment of Bilingual Students. Austin, TX: Pro-Ed.

Council of Chief State School Officers
1992 Recommendations for Improving the Assessment and Monitoring of Students with Limited English Proficiency. Washington, DC: Council of Chief State School Officers.

Cronbach, L., R. Linn, R. Brennan, and E. Haertel
1995 Generalizability Analysis for Educational Assessments. Los Angeles: Center for Research on Evaluation, Standards and Student Testing and Center for the Study of Evaluation, University of California.

Durán, Richard P.
1989 Assessment and instruction of at-risk Hispanic students. Exceptional Children 56(2):154-158.

Feuerstein, R.
1979 The Dynamic Assessment of Retarded Persons. Baltimore, MD: University Park Press.

Fleischman, H.L., and P.J. Hopstock
1993 Descriptive Study of Services to Limited English Proficient Students, Volume 1. Summary of Findings and Conclusions. Prepared for Office of the Under Secretary, U.S. Department of Education by Development Associates, Inc., Arlington, VA.

Garcia, G.E.
1991 Factors influencing the English reading test performance of Spanish-speaking Hispanic children. Reading Research Quarterly 26(4):371-392.

Garcia, G.E., and P.D. Pearson
1994 Assessment and diversity. Review of Research in Education 20:337-391.

Genesee, F., and E.V. Hamayan
1994 Classroom-based assessment. In F. Genesee, ed., Educating Second Language Children: The Whole Child, the Whole Curriculum, the Whole Community. New York: Cambridge University Press.

Gutierrez, K.
1995 Unpackaging academic discourse. Discourse Processes 19(1):21-37.

Hafner, A.
1995 Assessment Practices: Developing and Modifying Statewide Assessments for LEP Students. Paper presented at the annual conference on Large Scale Assessment sponsored by the Council of Chief State School Officers, June 1995. School of Education, California State University, Los Angeles.

Hambleton, R.K., and A. Kanjee
1994 Enhancing the validity of cross-cultural studies: Improvements in instrument translation methods. In T. Husen and T.N. Postlethwaite, eds., International Encyclopedia of Education (2nd edition). Oxford, UK: Pergamon Press.

Kober, Nancy L., and Michael J. Feuer
1996 Title I Testing and Assessment. Challenging Standards for Disadvantaged Children. Summary of a Workshop. Board on Testing and Assessment, National Research Council. Washington, DC: National Academy Press.

Leinhardt, G.
1978 Educational opportunity: Opportunity to learn. Pp. 15-24, Chapter III in Perspectives in the Instructional Dimensions Study: A Supplemental Report from the National Institute of Education. Washington, DC: National Institute of Education.

Lewis, J.
1991 Innovative approaches in assessment. In R.J. Samuda, S.L. Kong, J. Cummins, J. Pascual-Leone, and J. Lewis, eds., Assessment and Placement of Minority Students. Toronto, Canada: C.J. Hogrefe.

Lindholm, K.
1994 Standardized Achievement Tests vs. Alternative Assessment: Are Results Complementary or Contradictory? Paper presented at the American Educational Research Association, New Orleans, April. School of Education, San Jose State University.

McLaughlin, B.
1984 Second-Language Acquisition in Childhood, 2d ed. Hillsdale, NJ: Erlbaum.

McLaughlin, B., A. Blanchard, and Y. Osanai
1995 Assessing Language Development in Bilingual Preschool Children. NCBE Program Information Guide Series, No. 22. Washington, DC: National Clearinghouse for Bilingual Education.

McLaughlin, M.W., and L.A. Shepard
1995 Improving Education Through Standards-Based Reform. Stanford, CA: The National Academy of Education.

Meisels, S.
1994 Designing meaningful measurements for early childhood. Pp. 202-222 in B. Mallory and R. New, eds., Diversity and Developmentally Appropriate Practices: Challenges for Early Childhood Education. New York: Teachers College Press.

Messick, Cheryl K.
1988 Ins and outs of the acquisition of spatial terms. Topics in Language Disorders 8(2):14-25.

Moss, M., and M. Puma
1995 Prospects: The Congressionally Mandated Study of Educational Growth and Opportunity. First Year Report on Language Minority and Limited English Proficient Students. Prepared for Office of the Under Secretary, U.S. Department of Education by Abt Associates, Inc., Cambridge, MA.

National Council of Teachers of Mathematics
1989 Curriculum and Evaluation Standards for School Mathematics. Reston, VA: National Council of Teachers of Mathematics.

Navarette, C., J. Wilde, C. Nelson, R. Martinez, and G. Hargett
1990 Informal Assessment in Educational Evaluation: Implications for Bilingual Education Programs. Program Information Guide No. 13. Washington, DC: National Clearinghouse for Bilingual Education.

New Standards
1995 Performance Standards. English Language Arts, Mathematics, Science, and Applied Learning. Volumes 1, 2, and 3. Consultation Drafts. Washington, DC: National Center on Education and the Economy.

Rivera, Charlene
1984 Communicative Competence Approaches to Language Proficiency Assessment: Research and Application. Multilingual Matters 9. Rosslyn, VA: InterAmerican Research Associates.
1995 How Can We Ensure Equity in Statewide Assessment Programs? Findings from a national survey of assessment directors on statewide assessment policies for LEP students, presented at the annual meeting of the National Conference on Large Scale Assessment, June 18, 1995, Phoenix, AZ. The Evaluation Assistance Center East. Washington, DC: George Washington University Institute for Equity and Excellence in Education.

Royer, J., and M. Carlo
1991 Assessing the language acquisition progress of limited English proficient students: Problems and a new alternative. Applied Measurement in Education 4:85-113.

Saville-Troike, Muriel
1991 Teaching and Testing for Academic Achievement: The Role of Language Development. Focus, Occasional Papers in Bilingual Education, No. 4. Washington, DC: National Clearinghouse for Bilingual Education.

Shinn, M.R., and G.A. Tindal
1988 Using student performance data in academics: A pragmatic and defensible approach to non-discriminatory assessment. Pp. 383-407 in R.G. Jones, ed., Psychoeducational Assessment of Minority Group Children: A Casebook. Berkeley, CA: Cobb and Henry.

Short, D.
1991 How to Integrate Language and Content Instruction: A Training Manual. Washington, DC: Center for Applied Linguistics.

Strang, E. William, and Elaine Carlson
1991 Providing Chapter 1 Services to Limited English-Proficient Students. Final Report. Rockville, MD: Westat.

Teachers of English to Speakers of Other Languages (TESOL)
1996 ESL Standards for Pre-K-12 Students. Washington, DC: Center for Applied Linguistics.

Valdes, Guadalupe, and Richard A. Figueroa
1994 Bilingualism and Testing: A Special Case of Bias. Norwood, NJ: Ablex.

Valdez Pierce, L., and J.M. O'Malley
1992 Performance and Portfolio Assessment for Language Minority Students. NCBE Program Information Guide Series. Washington, DC: National Clearinghouse for Bilingual Education.

Verhoeven, L.
1996 Early bilingualism, cognition, and assessment. Pp. 276-291 in M. Milanovic and N. Saville, eds., Performance Testing, Cognition and Assessment. Cambridge, England: Cambridge University Press.

Wong Fillmore, L.
1982 Language minority students and school participation: What kind of English is needed? Journal of Education 164:143-156.

Wong Fillmore, L., and Julia Lara
1996 Summary of the Proposal Setting the Pace for English Learning: Focus on Assessment Tools and Staff Development. Washington, DC: Council of Chief State School Officers.

Zehler, Annette M., Paul J. Hopstock, Howard L. Fleischman, and Cheryl Greniuk
1994 An Examination of Assessment of Limited English Proficient Students. Special Issues Analysis Center, Task Order Report, March 28, 1994. Arlington, VA: Development Associates, Inc.

PROGRAM EVALUATION: SUMMARY OF THE STATE OF KNOWLEDGE
The following key points can be drawn from the literature on program evaluation:
•	The major national-level program evaluations suffer from design limitations; lack of documentation of study objectives, conceptual details, and procedures followed; poorly articulated goals; lack of fit between goals and research design; and excessive use of elaborate statistical designs to overcome shortcomings in research designs.
•	In general, more has been learned from reviews of smaller-scale evaluations, although these, too, have suffered from methodological limitations.
•	It is difficult to synthesize the program evaluations of bilingual education because of the extreme politicization of the process. Most consumers of research are not researchers who want to know the truth, but advocates who are convinced of the absolute correctness of their positions.
•	The beneficial effects of native-language instruction are clearly evident in programs that are labeled "bilingual education," but they also appear in some programs that are labeled "immersion." There appear to be benefits of programs that are labeled "structured immersion," although a quantitative analysis of such programs is not yet available.
•	There is little value in conducting evaluations to determine which type of program is best. The key issue is not finding a program that works for all children and all localities, but rather finding a set of program components that works for the children in the community of interest, given that community's goals, demographics, and resources.
•	Five general lessons have been learned from the past 25 years of program evaluation:
	—	Higher-quality program evaluations are needed.
	—	Local evaluations need to be made more informative.
	—	Theory-based interventions need to be created and evaluated.
	—	We need to think in terms of program components, not politically motivated labels.
	—	A developmental model needs to be created for use in predicting the effects of program components on children in different environments.