Minority Students in Special and Gifted Education

8
Alternative Approaches to Assessment

While the vision in the Individuals with Disabilities Education Act (IDEA) and associated state guidelines is of a program that looks carefully at the individual needs of a student who is referred for special education, both the state guidelines implemented at the school level and traditional special education assessment rely heavily on standardized batteries of tests. Those same standardized test scores are frequently the primary determinant of eligibility for gifted and talented programs. In this chapter, we review the major challenges to these standardized testing practices, including challenges to the very notion of context-free measures of intellectual ability, as well as challenges to the usefulness and efficiency of standardized scores in providing information that is relevant to intervention. We then discuss alternative approaches to assessment that are tied more closely to intervention and present our recommendations for policy change.

CONTEXT, CULTURE, AND ASSESSMENT

Approaches to assessing intellectual ability used widely in special and gifted education placement (see Chapter 7) are rooted in a conception of intelligence as a general factor (often labeled g), which underlies all adaptive behavior (Sternberg, 1999; Jensen, 1998). The very notion of decontextualized intelligence is challenged by two lines of work that highlight the
role of culture and context in the development and assessment of intellectual abilities. One line of work, termed here cross-cultural psychological research, has focused on the influence of factors related to culture and context on testing and on cognition more generally. The other line of work is from a more traditional psychological or psychometric orientation and is focused somewhat more directly on issues of test bias and cultural bias in standardized assessment batteries, including IQ and intellectual ability measures.

Cross-Cultural Psychological Research on Cognitive and Intellectual Ability

Rogoff and Chavajay (1995) have traced the development of cross-cultural psychological research over the past three decades. Initially much attention was directed at the exploration in other cultural settings of the robustness of cognitive tasks developed in the United States and in Europe. Emanating from a Piagetian perspective, a great deal of this work investigated the claims of universality of the stages of intellectual and cognitive development (Dasen, 1977a, b; Dasen and Heron, 1981). A clear finding is that people in many cultures did not reach what is called the formal operational stage without having had extensive experience in school (Ashton, 1975; Goodnow, 1962; Super, 1979). Characteristics assumed intrinsic to child development were found to be context dependent.

In the attempt to understand this variation, many investigators began to examine the power of situational contexts of testing and the issue of subjects’ familiarity with test materials and concepts (Irwin and McLaughlin, 1970; Price-Williams et al., 1969; Ceci, 1996; Gardner, 1983; Lave, 1988; Nuñes et al., 1993). Cross-cultural settings were particularly productive for this purpose (Posner and Barody, 1979; Dasen, 1975; Carraher et al., 1985; Ceci and Roazzi, 1994; Nuñes, 1994).
Several studies documented clear differences across cultures in people’s ability to sort objects into taxonomic categories (Cole et al., 1971; Hall, 1972; Scribner, 1974; Sharp and Cole, 1972; Sharp et al., 1979). Those whose experiences were not rooted in Western schooling tended to sort objects into functional categories rather than into more abstract conceptual taxonomies. In tasks thought to tap into logical thinking, often employing logical syllogisms, non-Western subjects often refused to accept the premise of the task, preferring to confine reasoning and deduction to immediate practical experience rather than hypothetical situations (Cole et al., 1971; Fobih, 1979; Scribner, 1975, 1977; Sharp et al., 1979). When the task was modified to focus on immediate and familiar everyday experience, non-Western subjects were able to make judgments, draw conclusions, and exhibit other features of logical thinking and memory that appeared absent in hypothetical problem solving (Cole et al., 1971; Cole and Scribner, 1977; Dube, 1982; Kagan et al., 1979; Kearins, 1981; Lancy, 1983; Mandler et al., 1980; Neisser, 1982; Price-Williams et al., 1967; Rogoff and Waddell, 1982; Ross and Millsom, 1970; Scribner, 1974, 1975, 1977).

This body of work led many investigators to challenge the assumption that cognitive tasks or batteries developed in a specific cultural setting were context-free measures of cognitive abilities (Cole et al., 1976; Ceci, 1996; Gardner, 1983; Lave, 1988; Nuñes et al., 1993). Research focused on analogues of standardized cognitive tasks that were embedded in people’s everyday lives, such as weaving patterns, the calculating of change in the store, and personal narration (Cole et al., 1976; Greenfield, 1974; Greenfield and Childs, 1977; Lave, 1977; Serpell, 1977). In many of these studies, “native” subjects were shown to perform better than Western subjects when the materials and tasks reflected some correspondence to the more familiar, everyday versions of the tasks. During this same period, increasing attention was directed to the social context surrounding standardized testing situations and the study of testing as a unique context in itself with its own discourse and interactional rules for what constitutes appropriate behavioral expectations (Goodnow, 1976; Miller-Jones, 1989; Rogoff and Mistry, 1985).

In more recent research challenging a universal g factor, Sternberg and Grigorenko (1997b) tested Kenyan children using several different instruments: one measured tacit knowledge of appropriate use of natural herbal medicines, including their source, their use, and dosage. Two other instruments designed to measure reasoning ability (Raven’s Coloured Progressive Matrices Test) and formal knowledge-based abilities (Mill Hill Vocabulary Scale) were administered as well.
The findings showed no correlation between the “practical intelligence” measured by the herbal medicine test and the test scores for reasoning ability, as well as a negative correlation with the formal knowledge-based test. Ethnographic work with the families suggested to the authors that they saw either formal schooling or practical knowledge as relevant to a child’s future and so emphasized only one. The implication drawn by the authors is that variation in performance on intelligence tests may capture what is valued in the home environment rather than what is intrinsic to the child’s intellectual ability (Sternberg, 1999).

International research results have been supported in research done more locally. Housewives in Berkeley, California who successfully did mathematics when comparison shopping were unable to do the same mathematics when placed in a classroom and given isomorphic problems presented abstractly (Lave, 1988; Sternberg, 1999). A similar result was found with weight watchers’ strategies for solving mathematical measurement problems related to dieting (de la Rocha, 1986). Men who successfully handicapped horse races could not apply the same skill to securities in the stock market (Ceci and Liker, 1986; Ceci, 1996).

In short, the available cross-cultural literature suggests that variations from the cultural norms embedded in tests and testing situations may significantly influence the judgments about intellectual ability and performance resulting from their use. Researchers have documented how these sociocultural contexts in the homes of different ethnic, racial, and linguistic groups in the United States can vary significantly from those of mainstream homes (Goldenberg et al., 1992; Heath, 1983, 1989). In light of differences in the fit between home and school culture for many minority children and the difference in the school experiences provided (see Chapter 5), these results bear directly on IQ testing of minority children.

Psychometric Views of Culture and Context: Research on Test Bias

In contrast to the cross-cultural and sociocultural research just described, work from a psychometric framework has centered on the issue of test bias. As early as the mid-1970s, questions were raised about the effects of cultural differences on standardized tests and their interpretation (Mercer, 1973a). Some researchers have considered the long-standing patterns of disproportionate representation of certain racial, ethnic, and English language learner groups in special education as de facto evidence of test bias (Bermudez and Rakow, 1990; Hilliard, 1992; Patton, 1992). The general argument has been that the content, structure, format, or language of standardized tests tends to be biased in favor of individuals from mainstream or middle- and upper-class backgrounds.
Miller (1997) argues that all measures of intelligence are culturally grounded because performance depends on individual interpretations of the meaning of situations and their background presuppositions, rather than on pure g.

A contrasting approach to test bias is based on a more statistical or psychometric view. That is, a test is considered biased if quantitative indicators of validity differ for different groups (Jensen, 1980). A common procedure has been to conduct item analysis of specific tests to examine construct validity. A specific test would be considered to be biased if there is a significant “item by group interaction,” suggesting that a specific item deviates significantly from the overall profile for any group. Several researchers have concluded that there is no evidence for test bias using such procedures (Jensen, 1974; Sandoval, 1979), a view that was embraced by the 1982 National Research Council (NRC) committee (1982). Other investigators have noted, however, that cultural factors may serve to depress the scores of a particular group in a more generalized or comprehensive fashion so that individual items would not stand out, even though cultural effects may still be present (Figueroa, 1983). This would be the case if familiarity with testing itself were at issue.

Another psychometric indicator of bias that has been used is predictive validity. Normally this involves correlating measures of intellectual functioning with academic achievement, such as grades. Generally moderate to high correlation coefficients are obtained in these analyses. Critics such as Hilliard (1992) point out that the same biases that operate on standardized tests also are likely to operate in institutions such as schools. Moreover, Reschly et al. (1988) have suggested that these analyses when applied to students referred to special education are not predictive in a true sense, since the standardized measure is normally administered only after low achievement has been demonstrated.

There is also a long tradition of investigation into the social and contextual factors embedded in standardized testing situations, in particular those conducted one-on-one with an unfamiliar examiner. Perhaps because it is easier to demonstrate these effects empirically, it has been argued that effects such as examiner familiarity differentially affect Hispanic and black children (Fuchs and Fuchs, 1989). However, efforts to determine whether white examiners impede the test performance of black children have found no evidence that they do (Sattler and Gwynne, 1982; Moore and Retish, 1974).

A recent, more comprehensive treatment of the issues raised here is presented by Valencia and Suzuki (in press). In addition, discussion of issues specific to English language learners is found in Valdés and Figueroa (1994) and elsewhere in this volume. It is important to note, however, that many have begun to question the utility of the debate, at least with respect to designing meaningful interventions for students.
That is, even if the ideal standardized test could be created that minimized the incorrect categorization or labeling of individual students, the question still remains: What does such an approach have to offer in terms of designing appropriate interventions that will maximize achievement and academic outcomes (Reschly and Tilly, 1999)? For this reason, many have begun to turn attention to more academically meaningful assessment approaches, such as performance-based assessment, curriculum-based measures, and other approaches more closely tied to instruction and classroom practice.

Problems with IQ-Based Disability Determination

Objections to IQ testing and strong reactions to the interpretation of IQ test differences as reflecting hereditary differences among groups continue to complicate discussions of the meaning, appropriate uses, and possible biases in tests of general intellectual functioning. In addition to the limitations of IQ tests from the perspectives of cultural psychology, it is questionable whether the costs of IQ tests are worth the benefits in special education eligibility determination. The costs of the testing alone are several hundred dollars in the form of the time of related services professionals such as psychologists and do not include either an estimate of the costs in the time of the students or an analysis of the usefulness of what might be done in place of IQ tests (MacMillan et al., 1998a).

Treatment validity. Perhaps the most convincing of the arguments against IQ tests is that the results are largely unrelated to the design, implementation, and evaluation of interventions designed to overcome learning and behavioral problems in school settings. For example, IQ is not a good predictor either of the kind of reading problem that a student exhibits or of the student’s response to treatments designed to overcome that reading problem (Fletcher et al., 1994). The same general interventions appear to work with basic skills problems regardless of whether the student is classified with mild mental retardation (MMR), learning disability (LD), or emotional disturbance (ED) (Gresham and Witt, 1997; Reschly, 1997). The differentiation between LD and MMR that is done primarily with IQ test results does not lead to unique treatments or to more effective treatments. Moreover, MacMillan and colleagues (1998a) note that significant numbers of students now classified as LD are in the borderline range of ability of about 70-85 or, in some cases, functioning in the MR range defined by an IQ of approximately 75 or below.

Misuse and racism. Further objections to the use of IQ-based disability determination come from the literature documenting the misuse of IQ tests to justify racist interpretations of individual differences among groups. No contemporary test author or publisher endorses the notion that IQ tests are direct measures of innate ability.
Yet misconceptions that the tests reflect genetically determined, innate ability that is fixed throughout the life span remain prominent with the public, many educators, and some social scientists. These myths about the meaning of such results markedly complicate rational discussion of the proper role that IQ test results might play in disability determination in school settings. Mercer (1979b) provided a useful discussion of the very narrow conditions under which differences among individuals on IQ tests might properly be interpreted as indicating differences in genetic bases for intellectual performance. The necessary conditions never occur with groups that differ by economic resources, cultural practices, and educational achievement. Moreover, test authors and test publishers all acknowledge that IQ tests are measures of what individuals have learned; that is, it is useful to think of them as tests of general achievement, reflecting broad culturally rooted ways of thinking and problem solving. The tests are only indirect measures of success with the school curriculum and imperfect predictors of school achievement.

LD classification criteria. The most frequent use of IQ tests today is in determining whether a “severe discrepancy between achievement and intellectual ability” exists as per the federal criteria for LD (34 CFR 300.541) and state LD classification criteria. Several problems exist with this procedure. First and most fundamental, there is no “bright line” in performance that can be used to determine the appropriate size of the discrepancy; the size required is arbitrary. Some states use more stringent criteria (e.g., 23 standard score points), others more lenient ones (15 points). Second, serious technical problems exist with the methodologies for discrepancy determination used in most states that do not account for the phenomenon of regression to the mean, a special problem with extreme scores (Mercer et al., 1996; Reynolds, 1985). Failure to account for regression effects penalizes lower-scoring students by decreasing the likelihood of being diagnosed as LD rather than MMR.

A third problem with the discrepancy method is that its intended objectivity may not be realized if multidisciplinary teams are willing to administer a large number of achievement tests until the requisite discrepancy is attained, without careful consideration of which test is most valid for a particular child and achievement problem. This activity is often predicated on the altruistic-sounding motive of making sure that students with achievement problems get services designed to ameliorate their difficulties; however, it seriously undermines the purpose of having an eligibility criterion. A fourth and more fundamental problem with the intellectual ability/achievement discrepancy is that the discrepancy is inherently unreliable in a single measurement occasion and notoriously unstable in repeated measurement occasions (Shinn et al., 1999).
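The regression and reliability problems just described can be made concrete with a little arithmetic. The sketch below is illustrative only. It assumes that IQ and achievement are both reported as standard scores with a mean of 100 and equal variances; the correlation of 0.60 and the reliabilities of 0.95 and 0.90 are hypothetical values chosen for the example, not figures from the studies cited; and the function names are invented for illustration. It applies the standard classical test theory formulas for regression-predicted achievement and for the reliability of a difference between two scores.

```python
MEAN = 100.0  # assumed standard-score mean for both tests


def simple_discrepancy(iq: float, achievement: float) -> float:
    """Simple-difference method: IQ minus achievement, ignoring regression."""
    return iq - achievement


def regression_discrepancy(iq: float, achievement: float, r_xy: float) -> float:
    """Discrepancy from the achievement score predicted by IQ.

    With both tests on the same scale, predicted achievement regresses
    toward the mean: predicted = MEAN + r_xy * (iq - MEAN).
    """
    predicted = MEAN + r_xy * (iq - MEAN)
    return predicted - achievement


def difference_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Reliability of a difference score for two tests with equal variances."""
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


if __name__ == "__main__":
    R_XY = 0.60  # hypothetical IQ-achievement correlation
    # Two hypothetical students with the same 15-point simple difference.
    for iq, ach in [(80.0, 65.0), (120.0, 105.0)]:
        print(iq, ach,
              simple_discrepancy(iq, ach),
              round(regression_discrepancy(iq, ach, R_XY), 1))
    # Hypothetical reliabilities for the IQ and achievement tests.
    print(round(difference_reliability(0.95, 0.90, R_XY), 2))
```

Under these assumed values, both students show the same 15-point simple difference, yet the regression-based discrepancy is about 23 points for the student with an IQ of 80 and about 7 points for the student with an IQ of 120. The simple-difference method therefore understates how far the lower-scoring student falls below expectation. The same assumed values give the difference score a reliability of roughly 0.81, lower than that of either test alone.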
Moreover, the vast majority of students evaluated for LD and special education placement have discrepancies that just meet or just fail to meet the discrepancy criterion. The instability of the discrepancy means that if they were assessed again, the discrepancy status for many would change. It is important to remember that these problems occur with students with low achievement, some of whom are found eligible for LD and others of whom, with equally low achievement, especially those with IQs in the 70s and 80s, often are found ineligible. Is this a valid distinction?

Validity of LD Discrepancies

The case against using the “severe discrepancy between achievement and intellectual ability” criterion is further strengthened by a series of studies funded by the National Institute of Child Health and Human Development (NICHD) (Lyon, 1996), which reached a number of conclusions about the use and validity of IQ in defining LD in the area of reading:

Results do not support the validity of discrepancy versus low achievement definitions. Although differences between children with impaired reading and children without impaired reading were large, differences between those children with impaired reading who met IQ-based discrepancy definitions and those who met low reading achievement definitions were small or not significant (Fletcher et al., 1994:6).

The present study suggests that the concept of discrepancy operationalized using IQ scores does not produce a unique subgroup of children with reading disabilities when a chronological age design is used; rather, it simply provides an arbitrary subdivision of the reading-IQ distribution that is fraught with statistical and other interpretative problems (Fletcher et al., 1994:20).

Poor readers who make up 70 to 80 percent of the current LD population seem to have the same needs and the same cognitive processing profiles, and they respond to the same treatments regardless of their IQ status (it should be noted that children with IQs less than 80 generally were excluded from the NICHD studies). Therefore, arbitrarily dividing poor readers into subgroups with higher IQs (those who meet the current LD criteria) and those with IQs similar to their reading achievement levels is invalid. With regard to reading-related characteristics, these subgroups are much more similar than different, calling into serious question the current LD diagnostic practices.

These practices have an even more serious side effect: the wait-to-fail phenomenon. Learning to read in the early grades is crucial. The evidence suggests that a student’s status as a poor or good reader at the end of 3rd grade is highly stable through adolescence (Coyne et al., 2001; Juel, 1988).
To be effective, intervention needs to occur early with poor readers; otherwise, there are grave barriers to changing from learning to read (in the kindergarten to 3rd grade period) to reading to learn (in 4th grade and beyond). Special education services for students with reading and math achievement problems are typically delayed until 2nd, 3rd, or 4th grade by the intellectual ability/achievement discrepancy criterion for LD. As noted by Fletcher and colleagues, “For treatment, the use of the discrepancy models forces identification to an older age when interventions are demonstrably less effective” (Fletcher et al., 1998:201). This effect of the IQ-achievement discrepancy method greatly diminishes the potential positive effects of LD services because they are initiated after two or more years of failure (Fletcher, 1998), not when it first is apparent that a student is having significant problems in acquiring reading or math skills. The wait-to-fail effects are markedly damaging to students and equally negative regarding the potential positive effects of special education. Significant changes in how LD is diagnosed, along with universal early interventions for children with reading problems, are crucial to improving the current system and to improving the achievement of minority children and youth.

Problems with Abandoning IQ-Based Disability Determination

Before leaving the topic of IQ-based disability determination, the long tradition associated with the use of IQ in determining disabilities and the current practices involving IQ across a variety of contexts must be acknowledged. If IQ-based conceptions and classification criteria for LD and MMR were abandoned, significant retraining of existing special education and related services personnel would be required. Even more daunting is the change required in the thinking of professionals and the public about disabilities: a change from assumptions of fixed abilities and internal child traits to new assumptions about the malleability of skills and the powerful effects of instruction and positive environments. Belief changes of this magnitude do not occur immediately or easily, but they are supported by research understanding and are likely to be beneficial to children.

Abandoning IQ-based disability determination will complicate articulation of eligibility and service delivery across different settings and agencies. The largest problem is likely to occur with MR, a disability category recognized in the laws pertaining to a number of agencies, including law enforcement and social security. For example, a person with an IQ below 60 is presumptively eligible for Social Security Income Maintenance benefits, and persons with IQs in the 60 to 70 range are eligible pending an evaluation of intellectual functioning and confirmation of deficits in adaptive behavior (as well as meeting income requirements).
Examination of school records is often part of the process of identifying deficits in adaptive behavior. School practices over the past 25 years involving increasing reluctance to identify MMR and the apparent practice of diagnosing some students as LD who meet criteria for MMR compromise the usefulness of school records and potentially undermine an individual’s access to services and protections that should be accorded to persons with MR. Today, IQ data typically are available for persons classified as LD, and those data assist with determination of adult eligibility for services. In the future, such data may not be available.

A counterargument, however, is that schools should not identify disabilities to meet the needs of other agencies. The goal of the schools is to assist children and youth in developing the academic skills, problem-solving capabilities, social understanding, and moral values that promote successful adult lives. The use of IQ tests and IQ-based disability determination does not promote the achievement of those critical goals; therefore, IQ should be abandoned, even if that action complicates the work of other agencies. It seems entirely reasonable to expect the other agencies to collect data relevant to their eligibility and to learn how to use the kinds of school data described in the last section.

Use of the diagnostic construct of MR without IQ is problematic. Intellectual functioning is critical to all contemporary conceptions of MR and has been a part of the construct since it first was differentiated from mental illness by John Locke in the 17th century (Kanner, 1964; Doll, 1962, 1967). No one has developed alternative criteria for this diagnostic construct that do not use intellectual functioning either implicitly or explicitly. Before classifying someone as MR, given all of the classification schemes that currently exist (American Psychiatric Association, 1994; Luckasson et al., 1992; World Health Organization, 1992), use of a comprehensive and reliable test of general intellectual functioning is mandatory. Some children may be incorrectly classified as MR if IQ is eliminated from the MR conceptual definition and classification criteria (Lambert, 1981; Reschly, 1981, 1988d); IQ test results can protect children from the more subjective judgments of adults.

It also is important to recognize what will not occur with an elimination of IQ-based disability determination and the use of IQ tests in the full and individual evaluation of students suspected of having disabilities. First, current patterns of over- and underrepresentation in special education and for gifted and talented services are likely to continue unless substantial improvements in levels of minority students’ achievement are realized. As noted in the 1982 NRC report, IQ tests are not mechanically applied to all students in the general population.
If they were, “the resulting minority overrepresentation would be almost 8 to 1” (NRC, 1982:42). At the time of that report, the actual overrepresentation in MR was slightly over 3 to 1. Further evidence of continued overrepresentation even though IQ testing was eliminated is available from California, where federal Judge Robert Peckham issued a ban on IQ testing in 1986 that was in effect until 1992, when it was modified by the same judge. The ban had no effect on disproportionate special education representation.

The IQ issue in the context of special education was never the principal issue to the Larry P. v. Riles court, which in 1979 and 1986 ordered first a limitation of the use of IQ tests with black students and then a complete ban on such use. The judge clarified his views of the meaning of the case in 1992 with the following comments: “First, the case was,...clearly limited to the use of IQ tests in the assessment and placement of African-American students in dead end programs such as MMR” (Larry P., 1992, also cited as Crawford et al. v. Honig [Crawford et al.], 1992:15). Furthermore, “Despite the Defendants’ attempts to characterize the court’s 1979 order as a referendum on the discriminatory nature of IQ testing, this court’s review of the decision reveals that the decision was largely concerned with the harm to African-American children resulting from improper placement in dead-end educational programs” (Crawford et al., 1992:23).

The real Larry P. issue, according to the judge who adjudicated the case over a 20-year period from 1972 to 1994, was the effectiveness of special education programs for black students. Without data confirming effectiveness, Judge Peckham regarded overrepresentation as highly suspicious. The 1992 order required the California Department of Education to inform the court regarding which of the 1990s special education programs in California were “substantially equivalent” to the dead-end programs of concern to the court in the 1979 opinion. Instead of responding to that order, the department appealed the decision to the 9th Circuit. The appeal was rejected, leaving Judge Peckham’s 1992 order to stand. No further action in the Larry P. case has occurred since 1994, although the 1992 order to the California Department of Education is still in effect.

Perhaps the most important lesson from Larry P. is that the outcomes of special education matter a great deal in judging fairness to the minority students who are overrepresented in programs. Demonstrably effective outcomes would probably have changed the original ban on IQ tests and would greatly diminish if not eliminate contemporary concerns about disproportionate representation. This leads to a useful reframing of the IQ issue, providing as well the foundation of the next section on alternatives to the current system of special education. Are IQ tests useful in promoting positive outcomes for children and youth with severe achievement and social behavior problems?
In the committee’s view, the balance of the evidence does not provide continued support for the use of IQ tests in special education decision making. The major advantages of eliminating IQ-based disability determination and use of IQ in the full and individual evaluations have to do with focusing the efforts of parents, students, teachers, and related services personnel on promoting greater competence in academic skills and social behaviors. The use of IQ tests detracts from efforts to analyze environments carefully and develop effective interventions. The time and cost of IQ testing during the full and individual evaluation and reevaluations could be put to better use if they were devoted to more thorough analyses of reading, math, written language, or other achievement deficits, as well as analyses and development of interventions for classroom behaviors that interfere with effective instruction and achievement of positive learning outcomes. Abandoning IQ testing does not automatically produce more appropriate assessment. Accomplishment of the latter will require significant changes in state and local

Accountability

It is clear that the original framers of the Education for All Handicapped Children Act (1975) were concerned with making sure that special education programs were effective. The various procedural requirements, such as due process, IEP development, full and individual evaluation, annual review, and triennial reevaluation, were all designed to ensure that the services would be effective. The framers established “process or procedural” protections to ensure accountability. Although a great deal has been accomplished with the procedural requirements, accountability for results was not achieved. IDEA (1997, 1999) placed more emphasis on accountability and moved special education for students with disabilities into the mainstream of educational reform. The system now demands accountability without adjustments in the classification practices and assessment requirements to make accountability feasible.

Research on the effectiveness of special education overwhelmingly supports changes away from IQ-based disability determination to functional assessment and problem-solving interventions. One aspect of problem solving is particularly important: formative evaluation. Formative evaluation methods involve establishing goals, gathering baseline data that reflect current performance, delivering instruction or behavioral interventions, monitoring progress frequently (e.g., daily or twice per week), and changing interventions depending on the ongoing results. If goals are met, typically the goal is raised to ensure that the student always has a challenging but achievable goal to guide and motivate efforts. If goals are not met, instructional and behavior change interventions are analyzed further and changed to foster better outcomes, and efforts to improve instruction are implemented (Fuchs and Fuchs, 1986; Kavale and Forness, 1999).
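
The formative evaluation cycle described above can be illustrated with a small sketch. It is not drawn from the cited studies: the function name `review_progress`, the three-score review window, and the 10 percent goal increase are all hypothetical choices made only to show the shape of the decision rule (raise the goal when it is met, revise the intervention when it is not).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Decision:
    action: str      # "continue", "raise_goal", or "revise_intervention"
    goal: float      # the goal to use for the next monitoring period


def review_progress(scores, goal, recent=3, goal_step=1.10):
    """Apply a simple formative-evaluation rule to progress-monitoring data.

    scores    -- progress-monitoring scores, oldest first (the first score
                 can be read as the baseline)
    goal      -- current performance goal on the same scale as the scores
    recent    -- hypothetical: number of latest scores averaged at each review
    goal_step -- hypothetical: multiplier applied to the goal once it is met
    """
    if len(scores) < recent:
        # Too little data to judge the intervention; keep collecting.
        return Decision("continue", goal)

    recent_average = mean(scores[-recent:])
    if recent_average >= goal:
        # Goal met: raise it so the student keeps a challenging target.
        return Decision("raise_goal", round(goal * goal_step, 1))

    # Goal not met: the intervention should be analyzed and changed.
    return Decision("revise_intervention", goal)


if __name__ == "__main__":
    # Invented example: words read correctly per minute, goal of 50.
    print(review_progress([40, 44, 52, 55, 58], goal=50))
    print(review_progress([40, 41, 42, 41, 43], goal=50))
```

In practice the review window and the size of each goal increase would be set by the team responsible for the student; the sketch only fixes them so that the rule can be run.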
Interventions guided by this kind of problem solving are more effective than typical special education interventions by 0.75 to 1.0 standard deviations (SD).

Gifted and Talented Identification

It is far more difficult to make a case for early identification and intervention for gifted and talented students, because no research base currently provides guidance in this regard. There has been an absence of public support for gifted programs for the very young, resulting in few opportunities to conduct research on program features that promote achievement at the highest end of the distribution. This is perhaps not surprising given the well-known problems of reliability of traditional instruments for assessing intellectual function in young children. “Readiness tests” used as screening instruments for intellectual competence and traditional tests of intelligence and aptitude have been soundly criticized for their inappropriateness for young children generally, and for minority children in particular (Meisels, 1987; Anastasi, 1988; Gandara, 2000). Thus, while many of the predictors of academic failure are well established even for the very young, there is currently no consensus regarding predictors of giftedness.

For elementary and secondary students, limited programs of identification and services for gifted and talented students have been carried out under the auspices of the Jacob K. Javits Gifted and Talented Students Education Program. But the collection of data in the framework of any systematic research paradigm has been limited. Yet the importance of early identification and opportunity to learn is likely to be as critical to the success of students at the upper end of the achievement distribution as it is for those at the lower end. And the problem of disentangling the child’s abilities from the previous opportunities to learn strikes a clear parallel. Nevertheless, the existing research base provides too weak a foundation for proposing an alternative assessment approach similar to that proposed for special education.

CONCLUSIONS AND RECOMMENDATIONS

Assessment in special education is guided by complex legal requirements that are responsible in part for the gap between current practices and the state of the art. Direct measures of skills in natural settings, along with the application of problem-solving methodologies, have the promise of significantly improving the outcomes for students in special education and for those considered for but not placed in special education. Traditional disability conceptions and classification criteria interfere with the implementation of systematic problem solving, functional assessment, formative evaluation, and accountability for outcomes. The system changes discussed here and in the recommendations were anticipated in the 1982 National Research Council report.
Over the last two decades, significant system changes have become more feasible due to advances in assessment and intervention knowledge. It now is time to implement these changes more widely as a means to protect all children from inappropriate classification and placement, as well as from ineffective special education programs. The proposed change would focus attention away from efforts to uncover unobservable child traits, the identification of which gives little insight into instructional response, and toward the problems encountered in the classroom and appropriate responses. The role of instruction and classroom management in student performance is explicitly acknowledged, and effort is devoted first to ensuring the opportunity to succeed in general education.

Federal Level Changes

Recommendation SE.1: The committee recommends that federal guidelines for special education eligibility be changed in order to encourage better integrated general and special education services. We propose that eligibility ensue when a student exhibits large differences from typical levels of performance in one or more domain(s) and when there is evidence of insufficient response to high-quality interventions in the relevant domain(s) of functioning in school settings. These domains include achievement (e.g., reading, writing, mathematics), social behavior, and emotional regulation. As is currently the case, eligibility determination would also require a judgement by a multidisciplinary team, including parents, that special education is needed.

We provide more detail regarding our intended meaning below:

Eligibility

The proposed approach would not negate the eligibility of any student who arrives at school with a disability determination, or who has a severe disability; such students would continue to be served as they are currently. Our concern here is only with the categories of disability that are defined in the school context in response to student achievement and behavior problems. While eligibility for special education would by law continue to depend on establishment of a disability, in the committee’s view noncategorical conceptions and classification criteria that focus on matching a student’s specific needs to an intervention strategy would obviate the need for the traditional high-incidence disability labels such as LD and ED. If traditional disability definitions are used, they would need to be revised to focus on behaviors directly related to classroom and school learning and behavior (e.g., reading failure, math failure, persistent inattention and disorganization).
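
The three conditions in Recommendation SE.1 (a large difference from typical performance, insufficient response to a high-quality intervention, and a team judgment of need) can be read as a conjunction. The sketch below is only an illustration of that logic. The committee does not specify numerical thresholds, so the 1.5 SD cutoff, the comparison of observed with expected weekly gains, and every name in the code are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff for a "large difference"; not specified by the committee.
LARGE_GAP_SD = 1.5


@dataclass
class DomainEvidence:
    domain: str                     # e.g., "reading", "social behavior"
    gap_sd: float                   # distance below typical performance, in SD units
    intervention_high_quality: bool # was the intervention delivered as designed?
    gain_per_week: float            # observed progress during the intervention
    expected_gain_per_week: float   # hypothetical benchmark for adequate progress


def qualifying_domains(evidence):
    """Return domains showing both a large gap and insufficient response."""
    result = []
    for item in evidence:
        large_gap = item.gap_sd >= LARGE_GAP_SD
        # Response can only be judged if the intervention was high quality.
        insufficient_response = (
            item.intervention_high_quality
            and item.gain_per_week < item.expected_gain_per_week
        )
        if large_gap and insufficient_response:
            result.append(item.domain)
    return result


def meets_proposed_criteria(evidence, team_judges_special_ed_needed):
    """Data alone never decide: the multidisciplinary team must also agree."""
    return team_judges_special_ed_needed and bool(qualifying_domains(evidence))


if __name__ == "__main__":
    reading = DomainEvidence("reading", 2.0, True, 0.5, 1.5)
    print(qualifying_domains([reading]))
    print(meets_proposed_criteria([reading], team_judges_special_ed_needed=True))
```

Note that a domain in which the intervention was not delivered with fidelity does not qualify in this sketch, which mirrors the requirement that the response be to a high-quality intervention.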
Assessment

By high-quality interventions we mean evidence-based treatments that are implemented properly over a sufficient period to allow for significant gains, with frequent progress monitoring and intervention revisions based on data. Research-based features of intervention quality are known and must be implemented rigorously, including: an explicit definition of the target behavior in observable, behavioral language; collection of data on current performance; establishment of goals that define an acceptable level of performance; development and implementation of an instructional or behavioral intervention that is generally effective according to research results; assessment and monitoring of the implementation of the intervention to ensure that it is being delivered as designed; frequent data collection to monitor the effects of the intervention; revisions of the intervention depending on progress toward goals; and evaluation of intervention outcomes through comparison of postintervention competencies with baseline data. Several sources detail these procedures (Flugum and Reschly, 1994; Reschly et al., 1999; Shinn, 1998; Upah and Tilly, 2002).

Assessment for special education eligibility would be focused on the information gathered that documents educationally relevant differences from typical levels of performance and is relevant to the design, monitoring, and evaluation of treatments. Competencies would be assessed in natural classroom settings, preferably on multiple occasions. While an IQ test may provide supplemental information, no IQ test would be required, and results of an IQ test would not be a primary criterion on which eligibility rests. Because of the irreducible importance of context in the recognition and nurturance of achievement, the committee regards the effort to assess students’ decontextualized potential or ability as inappropriate and scientifically invalid.

Reporting and Monitoring

Current federal requirements regarding reporting by states of the overall numbers of students served as disabled and the program placements used to provide an appropriate education would not change with these recommendations. Moreover, the reporting of the nine low-incidence disabilities would continue to be done by category.
Reporting of the numbers of students currently diagnosed with high-incidence disabilities would become noncategorical, with the loss of very little useful information due to the enormous variations in the operational definition of the high-incidence categories used currently. The reporting by states concerning students now classified in high-incidence categories could be made more meaningful if the reporting also included the nature of the learning or behavioral problem as reflected in the top 2-4 IEP goals for each student, that is, the number of students with IEP goals in basic reading, reading comprehension, math calculation, self-help skills, social skills, math reasoning, etc. The latter information would provide more accurate information on the actual needs of students with disabilities than the current information indicating unreliable categorical diagnoses.

Consistent with IDEA 1997 and 1999, federal compliance monitoring should move in the direction of examining the quality of special education interventions and the outcomes for students with disabilities. Current compliance monitoring focuses on important, but limited, characteristics of the delivery of special education programs, particularly implementation of the due process procedural safeguards and the mandated components of the IEP. Compliance monitoring by the Federal Office of Special Education Programs and the state departments of education must assume an outcomes focus in addition to the traditional process considerations.

State-Level Changes

State regulatory changes would be required for implementation of a reformed special education program that uses functional assessment measures to promote positive outcomes for students with disabilities. Some states have already instituted changes that move in this direction. In Iowa, noncategorical special education for students with high-incidence disabilities has been implemented since the early 1990s. Several other states have approved “rule replacement” programs that allow school districts to implement special education systems that do not require categorical designation of students with high-incidence disabilities (e.g., Illinois, Kansas, South Carolina). These state rules require a systematic problem-solving process that is centered around quality indicators associated with successful interventions (see previous section). The rules are explicit about each of these quality indicators, and compliance monitoring is focused on their implementation. Several features of rules in the majority of states can be omitted in a noncategorical system, including the requirements regarding IQ testing.
The changes in federal regulations and state rules toward greater emphasis on producing positive outcomes and away from an eligibility determination process that is largely unrelated to interventions are consistent with the greater emphasis in IDEA (1997, 1999) on positive outcomes for students with disabilities. Positive outcomes are enhanced by the implementation of high-quality interventions; no such claim can be made for conducting the assessments required to assign students with significant learning and behavior problems to the high-incidence categories of LD, ED, and MMR.

Early Screening

Universal screening of young children for prerequisites to and the early development of academic and behavioral skills is increasingly recognized as crucial to achieving better outcomes in schools and preventing achievement and behavior problems. While this is true for all children, a disproportionate number of disadvantaged children are on a developmental trajectory that is flatter than that of their more advantaged counterparts. Evidence suggests that effective and reliable screening of young children by age 4 to 6 can identify those most at risk for later achievement and behavior problems, including those most likely to be referred to special education programs.

In two arenas, reading and behavior, the knowledge base exists to screen and intervene in general education both systematically and early. Less attention has been devoted to early identification and intervention for mathematics problems. However, the NICHD has launched a research program in this area. Other early screening mechanisms in mathematics have been developed, but their psychometric properties have not yet been widely tested (Ginsberg and Baroody, 2002; Griffin and Case, 1997).

While early reading is only one of the areas in which students struggle, it is an important one because failure in early reading makes learning in the many subject areas that require reading more difficult. Moreover, there is a great deal of comorbidity between reading problems and other difficulties (attentional, behavioral) that results in special education referral. As indicated above, early screening and intervention would help to identify children who may be missed in a wait-to-fail model. It may obviate the need for placement in special education for some children, and it would provide the evidence of response or lack of response to high-quality instruction that we proposed be written into federal regulations.
Recommendation SE.2: The committee recommends that states adopt a universal screening and multitiered intervention strategy in general education to enable early identification and intervention with children at risk for reading problems.

The committee’s model for prereferral reading intervention is as follows:

All children should be screened early (late kindergarten or early 1st grade) and then monitored through 2nd grade on indicators that predict later reading failure.

Those students identified through screening as at risk for reading problems should be provided with supplemental small-group reading instruction by the classroom teacher for about 20-30 minutes per day, and progress should be closely monitored.

For those students who continue to display reading difficulties and for whom supplemental small-group instruction is not associated with improved outcomes, more intensive instruction should be provided by other support personnel, such as the special education teacher and/or reading support teacher in school.

For students who continue to have difficulty, referral to special education and the development of an IEP would follow. The data regarding student response to intervention would be used for eligibility determination.

State guidelines should direct that the screening process be undertaken early, and the instructional response follow in a very timely fashion. The requirement for general education interventions should not be used to delay attention to a student in need of specialized services.

The committee’s recommendation to adopt a universal screening and multitiered intervention strategy is meant to acknowledge that there is some distance to travel between the knowledge base that has been accumulated and the capacity to use that knowledge on a widespread basis. There are early examples in Texas and Virginia of taking screening to scale. But making the tools available to teachers, preparing teachers both to assess students and to respond productively to the assessment results, and supporting teachers to work with the instructional demands of intervening differently for subgroups of students at different skill levels require the careful development of capacity and infrastructure.

At the same time that the committee acknowledges the investment required to adopt this recommendation, we call attention to the potential return on the investment and the consequences of not making such an investment. When early screening and intervention are not undertaken, more students suffer failure. The demands on the school to invest in a support structure for those students are simply postponed to a later age, when the response to intervention is less promising and when intervening effectively is made even more difficult by a weaker knowledge base and limited teacher skill.
The consequences of school failure for the student and for society go well beyond the cost to the school, of course.

Behavior Management

Current understanding of early reading problems is the outcome of a sustained research and development effort that has not been undertaken on a similar scale with respect to other learning and behavior problems. In the committee’s view, however, there is enough evidence regarding universal behavior management interventions, behavior screening, and techniques to work with children at risk for behavior problems to better prevent later serious behavior problems. Research results suggest that these interventions can work. However, a large-scale pilot project would provide a firmer foundation of knowledge regarding scaling up the practices involved.

Recommendation SE.3: The committee recommends that states launch large-scale pilot programs in conjunction with universities or research centers to test the plausibility and productivity of universal behavior management interventions, early behavior screening, and techniques to work with children at risk for behavior problems.

We propose a model for experimentation similar to that proposed for reading:

Assessment of the classroom and of noninstructional school settings (hallways, playgrounds) should be made yearly. Behavioral adjustment of all children in grades K-3 should be screened yearly to provide teachers with information regarding individual children. The assessments should be reviewed yearly by a school-level committee (composed of administrative and teaching staff, specialists, and parents) to ensure that school-wide interventions are implemented when indicated in a timely fashion and to ensure that individual children are given special services quickly when needed.

Because characteristics of the classroom and school can increase risk for serious emotional problems, the first step in the determination of an emotional or behavioral disability is the assessment of the classroom and school-wide context. Key contextual factors should be assessed and ruled out as explanations before intervention at the individual child level is considered.

If it is determined that contextual factors are not significantly involved in the child’s problem, then individualized measures should be taken to help the child adjust in the standard classroom/school setting. Only those interventions with empirical evidence supporting their effectiveness should be considered. For example, common features of emotional and behavioral problems are off-task and disruptive behaviors. Well-documented interventions with demonstrated effectiveness at reducing these behaviors should be employed before the child is considered disabled.
Because the most serious and developmentally predictive emotional and behavioral problems in children tend to be manifested across settings, and because family issues and solutions tend to overlap with those at school, every effort should be made to include parents and guardians as partners in the educational effort. To the extent that this is done, early and accurate identification of serious problems should be facilitated, and parents can be enlisted to collaborate with teachers both in standard education and in solving emerging academic, emotional, and behavioral problems.

For children who do not respond to standard interventions, the intensity of the interventions should be increased through the use of behavioral consultants, through more intensive collaborations with parents, or through adjunct interventions to address various skill or emotional deficits (e.g., anger control, social skills instruction). Such individualized programs should be carefully articulated through the use of IEPs, coupled with systematic assessments of the child’s behavioral response to the interventions.

Teacher Quality

To support the proposed changes, school psychologists and special education teachers would need preparation that is different in some respects from that now required.

Recommendation TQ.3: A credential as a school psychologist or special education teacher should require instruction in classroom observation/assessment and in teacher support to work with a struggling student or with a gifted student. These skills should be considered as critical to their professional role as the administration and interpretation of tests are now considered. Instruction should prepare the professional to provide regular behavioral assessment and support for teachers who need assistance to understand and work effectively with a broad range of student behavior and achievement. Recognizing and working with implicit and explicit racial stereotypes should be incorporated.

The proposed reform of special education that would focus on response to intervention in general education would require substantial changes in the current relationship between general and special education. It would put in place a universal prevention element that does not now exist on a widespread basis with the purpose of: (a) providing assistance to children who may now be missed and (b) obviating the need for the special education referrals that can be remedied by early high-quality intervention in the general education context.
In the final analysis, the committee cannot predict the effect of this approach on the number of special education students nor on racial/ethnic disproportion, but the result, in our judgment, would be that children identified for special education services would be those truly in need of ongoing support. And if the effect of the classroom context and opportunity to learn is successfully disentangled from the student’s need for additional supports, in our view that disproportion in identification would not be as problematic as it is currently.

Federal Support of State Reform Efforts

Recommendation SE.4: While the United States has a strong tradition of state control of education, the committee recommends that the federal government support widespread adoption of early screening and intervention in the states. In particular:

Technical assistance and information dissemination should be coordinated at the federal level. This might be done through the Department of Education, the NICHD, a cooperative effort of the two, or through some other designated agent. Accumulation and dissemination of information and research findings have “public good” properties and economies of scale that make a federal effort more efficient than many state efforts.

The federal government can encourage the use of Title I funds to implement early screening and intervention in both reading and behavior for schools currently receiving those funds. Funds provided in the Reading Excellence Act might also support this effort under the existing mandate.

Gifted and Talented Eligibility

The research base justifying alternative approaches for the screening, identification, and placement of gifted children is neither as extensive nor as informative as that for special education. While limited programs of identification and services for gifted students have been carried out under the auspices of the Jacob K. Javits Gifted and Talented Students Education Program, the collection of data in the framework of any systematic research paradigm has been limited. Yet early opportunity to learn is likely to be as important for the success of students at the upper end of the achievement distribution as it is for those at the lower end. And the problem of disentangling the children’s abilities from their previous opportunities to learn strikes a clear parallel.
Nevertheless, the existing research base restricts our understanding and therefore our recommendations: rather than proposing a specific approach to screening or identification for gifted and talented students, we propose research that may allow for better informed decision making in the future.

Recommendation GT.1: The committee recommends a research program oriented toward the development of a broader knowledge base on early identification and intervention with children who exhibit advanced performance in the verbal or quantitative realm, or who exhibit other advanced abilities.

This research program should be designed to determine whether there are reliable and valid indicators of current exceptional performance in language, mathematical, or other domains, or indicators of later exceptional performance. To the extent that the assessments described above provide information relevant to the identification of gifted students, they should be used for that purpose.

In addition to research to support the development of identification instruments, research on classroom practice designed to encourage the early and continued development of gifted behaviors in underrepresented populations should be undertaken so that screening can be followed by effective intervention. That research should be designed to identify:

Opportunities that can be provided during the kindergarten year to engage children in high-interest learning activities that allow development of complex, advanced reasoning, accelerated learning pace, and advanced content and skill learning capabilities.

Interventions in later school years with children who demonstrate advanced learning capabilities and their impact on the performance of these children over time.

The effect of curricular differentiation through various options, such as resource room instruction, independent study, and acceleration, and the interaction of treatments with individual student profiles. Group size, instructional method, and complexity of the curriculum should all be variables under study.

An enriched curriculum designed for gifted students may well improve educational outcomes for all children. As mentioned in Chapter 5, when class size was reduced in 15 schools in Austin, Texas, the two that showed improved student achievement were schools that made other changes as well, including making the curriculum for gifted students in reading and mathematics available to all students (NRC, 1999a).
This does not imply, however, that the pace of instruction or the level of student independence is necessarily the same for all students. We recommend that research be conducted using control groups to determine the impact of interventions designed for children identified as gifted on children who have not been so identified.