A Collaborative Agenda for Improving International Comparative Studies in Education
HOW CAN INTERNATIONAL COMPARATIVE STUDIES BE IMPROVED?
It is important to conserve and accumulate whatever knowledge and
experience is needed for the betterment of future comparative international
studies. In particular, since advancing the science and technology of
cross-national comparative education research is a legitimate end in itself,
both theoretical and empirical research aimed directly at methodological
improvement should be encouraged.
To improve the way such knowledge is gathered, interpreted, and used, a
number of topics that require attention are discussed in the sections that
follow: research design, data analysis, and dissemination; methods for
assessing current and desired outcomes of education; comparability across
nations and ways of interpreting differences within varying contexts; use of
ethnographic and historical studies to strengthen investigations using
statistical analysis and to provide in general a deeper, richer sense of what
education is, can, and should be; quality control and monitoring; and
accumulation, synthesis, dissemination, and use of cross-national knowledge
about education.
Current approaches to the design and analysis of international studies are
inadequate to show how much various educational inputs and processes contribute
to students' learning and later occupational endeavors and to estimate the
effectiveness of these inputs and processes in different environments. In such
analyses, it remains important to find better ways to take into account factors
not under the control of education decision makers, such as students' family
background and neighborhood setting. More generally, there is a need to be
thoughtful and explicit about the underlying theories for comparative education
research.
It is also important, in both the design and dissemination phases of a
study, to consider researchers who may wish to carry out secondary analyses of
the data. Secondary analyses of large education studies frequently make major
contributions to the literature and to policy concerns. For example, using
data from the public-access data base developed at the University of Illinois
at Urbana-Champaign for the Second International Mathematics Study (SIMS),
Westbury (1992) compared mathematics achievement in the United States and
Japan. He examined curriculum and achievement in grade 7/8 algebra and grade
12 elementary functions and analysis (calculus). He found that the curricula
in the United States are not as well matched to the SIMS tests as are the
curricula of Japan, a situation that results in lower achievement in the United
States. In grade 8 algebra classes, however, the U.S. curriculum is
comparable to the "curriculum" of the test and the Japanese curriculum, and
U.S. achievement is similar to that of Japan. His finding that the difference
between Japanese and U.S. achievement is a consequence of different curricula
is very different from earlier analyses of SIMS, which concluded that U.S.
students performed poorly in every grade and in every aspect of mathematics
tested when compared with students in the other 19 countries in the study
(Crosswhite et al., 1986). He concludes that analysis focused on the
undifferentiated variable of "country" as a unit of analysis, which has been
used in studies like those of the IEA or the IAEP, needs to be rethought
to provide more emphasis "on analytic variables defining the properties of
school systems that are common across countries but that might be distributed
in different ways in different places" (p. 23).
Therefore, in designing studies, consideration should be given to
fostering ways to increase the opportunities for the research community to do
secondary analyses by (1) using well-established methods of analysis or (2)
clarifying methodological innovations to enhance their understanding and use.
It is also crucial for comparative data to be made available to the education
research community in a timely and effective manner, since most researchers may
not be affiliated with the organization collecting the comparative data and so
will not have early access to the data. It should be possible to establish a
data bank in which all published data from international education studies are
entered and to make the data bank widely available to anyone with a computer
and a modem.
It is equally important to make available to a wide audience discussions
in nontechnical terms of the primary research questions addressed by
cross-national studies. These presentations should spell out the advantages
and disadvantages of alternative designs to provide answers to the research and
policy questions posed. The topics covered in such a public discussion should
also include the validity of the measures used in the designs and the
possibilities for including outcomes other than student achievement. In
clarifying issues that have arisen in previous studies, it is important to
distinguish between the objectives and value of cross-sectional studies and
longitudinal studies and to clarify the advantages and disadvantages of
examining cumulative achievement levels as opposed to examining change in
achievement levels within a given time period (such as a school year).
Publishing methodologically oriented reports that have a common theme, namely
the design and analysis of cross-national educational studies, or sponsoring
training institutes on this subject, would contribute both to a better
understanding of the value of such studies to the countries concerned and to
the design of future studies.
The research questions investigated in a country are often shaped by the
methods used. Just as the global community provides a natural laboratory for
comparing different education goals, systems, and methods, so it provides a
variety of traditions and approaches to educational measurement. In the United
States, since researchers have historically made much use of standardized
multiple-choice tests of cognitive outcomes, research has emphasized ways to
improve the kinds of learning that can be measured in that way. Some European
countries have made greater use of essay examinations or performance
assessments and have established research traditions that have placed less
emphasis on reliability and objectivity as defined in the United States. In
still other parts of the world, education scholars and policy makers have
placed more emphasis on less tangible schooling outcomes, including personal
and social values, character, and the ability to communicate or
cooperate.
Cross-national studies are under pressure to draw on this plurality of
methods and traditions, and at the same time they have been challenged to
discover ways to quantify those kinds of learning for which good measures have
not yet been found. Specific areas in need of investigation include: (1)
reliable and valid performance-based measurement of cognitive learning outcomes
across subject areas and grade levels; (2) measurement of personal values and
other affective goals of schooling; (3) measurement of the abilities to
communicate and cooperate as well as other schooling outcomes that are
manifested only in a social context; (4) measurement of context variables at
the level of class, school, neighborhood, and society, rather than the
individual student; (5) measurement of schooling processes, including in
particular a student's opportunity to learn not only content knowledge but also
strategies for learning.
In planning a cross-national study, consideration should be given to
whether it will serve its intended purpose. In particular, evaluation is
required of its intended and likely unintended social consequences. Although
it may not be possible to identify all the social consequences of an
assessment, Messick (1988) has suggested contrasting the potential social
consequences of a proposed assessment with those of alternative procedures,
including not testing at all. This type of appraisal contributes to a
consequential basis of test validity and should preclude some of the problems
in the use of international data.
In making cross-cultural comparisons, in addition to value and cultural
issues, both logical and measurement problems arise in defining and
implementing fair and meaningful practices. Two subcategories are important.
The first consists of primarily technical questions, for which different
languages and practices complicate the problem of producing valid comparative
information but do not, in principle, complicate the definition of sensible
comparison groups or of the variable to be measured. As an example of a
primarily technical
problem, consider the comparison of per-pupil expenditures. Different
education funding policies, accounting conventions, and currencies may
complicate the development of comparable cross-national statistics. Economists
have spent a large amount of time on this problem and have addressed it in
great detail. Following the lead of quantitative economists, it should be
possible to reach consensus on what should or should not be included as an
education expenditure. In resolving such questions, useful models from other
fields may include such international classification codes as the Standard
International Trade Classification and the International Standard Industrial
Classification of All Economic Activities. Effort should be directed to
further development of such codes with a common language to describe and
discuss educational organizations, processes, and outcomes so that
cross-national measures and analyses of these elements can be improved.
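As a rough illustration of the expenditure example above, the following sketch
shows one way per-pupil figures reported in different currencies might be put
on a common footing. It is not drawn from any study cited here. The record
layout, the country labels, the figures, and the idea of a single conversion
factor per country (for example, an exchange rate or a purchasing-power factor)
are all assumptions made for illustration, and the sketch sets aside the harder
question of which items count as education expenditure.

```python
from dataclasses import dataclass


@dataclass
class Spending:
    """Hypothetical national record; every field and value is illustrative."""
    country: str
    total_expenditure: float  # in local currency units
    enrolled_pupils: int
    conversion_factor: float  # local currency units per one common unit (assumed)


def per_pupil_common_units(record: Spending) -> float:
    """Return per-pupil expenditure expressed in the common unit."""
    if record.enrolled_pupils <= 0 or record.conversion_factor <= 0:
        raise ValueError("pupil count and conversion factor must be positive")
    local_per_pupil = record.total_expenditure / record.enrolled_pupils
    return local_per_pupil / record.conversion_factor


# Invented figures for two unnamed countries.
records = [
    Spending("Country A", 5_000_000.0, 1_000, 1.0),
    Spending("Country B", 600_000_000.0, 1_200, 100.0),
]

comparable = {r.country: per_pupil_common_units(r) for r in records}
for country, value in comparable.items():
    print(f"{country}: {value:.2f} common units per pupil")
```

In this invented case the two raw totals differ by a factor of more than one
hundred, yet the converted per-pupil figures coincide. The result depends
entirely on the conversion factor chosen and on what each country counts as
education expenditure, which is where consensus is needed.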
At a second level, there are deeper substantive questions, for which
different languages and practices may lead to incommensurabilities, making it
difficult even to conceptualize variables representing corresponding
attributes of students, practices, and institutions in different parts of the
world. A problem in this category is the comparison of writing proficiency
across languages. Problems of translation per se are difficult enough, but,
beyond translation, languages have different conventions for organizing prose
and presenting ideas.
Comparisons among countries can be difficult to analyze because of the
large number of differences in the countries being compared. The large number
of interactions between variables makes it difficult to identify which
variables influence outcomes and which are covarying with those that influence
the outcomes. Both the problem of covariation and the problem of interaction
cause difficulties in making interpretations in comparative studies. These
difficulties in drawing conclusions from comparisons do not mean that
comparisons should be ignored; such investigations afford many advantages, as
discussed earlier in this report.
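The covariation problem described above can be made concrete with a small
numerical sketch. The data are invented: five hypothetical countries, a
variable called instruction hours, and a second variable called textbook
coverage that is constructed to move in lockstep with the first. Neither the
variable names nor the values come from any study discussed in this report.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented country-level data (illustrative only).
hours = [3.0, 4.0, 5.0, 6.0, 7.0]           # weekly instruction hours
coverage = [2.0 * h + 1.0 for h in hours]   # covaries exactly with hours
score = [64.0, 71.0, 74.0, 82.0, 84.0]      # mean achievement score

r_hours = pearson(hours, score)
r_coverage = pearson(coverage, score)

print(f"hours vs. score:    {r_hours:.3f}")
print(f"coverage vs. score: {r_coverage:.3f}")
```

Because the two explanatory variables covary exactly, each shows the same
correlation with the outcome, and data of this kind cannot indicate which of
them, if either, influences achievement. Real cross-national data rarely covary
this neatly, but the same difficulty arises in weaker form whenever many
country characteristics move together.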
In cross-national research, such issues of what constitutes an intelligent
comparison could be more readily addressed if some studies were focused on
specific topics within a few participating countries instead of omnibus studies
welcoming as many countries as wish to join. For example, limiting the
countries in a study to a few developed countries in Europe and North America
(or to groups of developing countries, African countries, Asian countries, or
South American countries) would reduce the number of differences between
countries being compared, making it easier to identify which variables
influence outcomes. Conceptualizing variables representing attributes of
students, practices, and institutions would also be easier for a limited study
than for a worldwide study. Finally, if the study were focused on a few
topics, for example, teaching practices and student achievement in secondary
school chemistry, the topics could be explored in greater depth than in a study
with countries having very disparate education systems.
In addition, a limited study would have fewer problems with languages,
culture, and value systems; would have fewer communication problems in study
administration; would require less training; would be less costly; and could be
completed on a more timely basis. Such a study could also be designed to meet
the policy needs of the participating countries.
Many significant questions in comparative education are best addressed by
small, focused studies, which may draw on a broad range of techniques and
provide a deeper, richer sense of what education is, can, and should be. Thus,
in addition to large-scale surveys, there is a need for a wide range of other
cross-national research, such as ethnographic studies, case studies,
small-scale focused quantitative and qualitative studies, and historical
studies that would allow us to understand what it means to be educated in
diverse settings around the world. Such studies go beyond the exploratory and
the descriptive. They have become essential parts of the explanatory
repertoire. These studies provide ways of analyzing and explaining a variety
of processes, conditions, and contexts. They help to uncover patterns of
interaction and to interpret complex situations both in the classroom and in
the larger community. There is a great need for small, in-depth studies of
local situations that would permit cross-cultural comparisons capable of
identifying the myriad of causal variables that are not recognized in
large-scale surveys. In fact, much survey data would remain difficult to
interpret and explain without the deep understanding of society that other
kinds of studies provide. Given that research in cross-national contexts
benefits from increased documentation of related contextual information, it
would be useful to combine large-scale surveys and qualitative methods.
Ethnographic and other qualitative studies are especially important in
clarifying the perspectives of many diverse groups of learners and their
teachers. Such groups include gifted and talented students, individuals with
disabilities, racial and ethnic minorities, women, and religious groups that
reject the state's secular systems of education.
But there is also a role for the large-scale surveys. About the only way
to obtain a simple numerical comparison of a large number of countries on a
common set of measures is with a large-scale survey. Large-scale studies also
permit curriculum analysis on a scale that could not possibly be done
adequately by a few independent researchers, because such studies require a
high degree of international organization and structure. For example, the
preliminary curriculum analyses of data collected by IEA in the Third
International Mathematics and Science Study (TIMSS) (Schmidt, 1993) reveal a
large and interesting variation in mathematics and science textbooks across a
large number of countries--variation that can be helpful in better
understanding the role of the teacher in various countries. Textbooks can have
a major influence on teachers' curricular decisions (Schmidt et al., 1987).
Teachers are more likely to teach the topics included in the textbook than
others not included. Large-scale studies such as TIMSS or the OECD indicators
project will also provide trend data, and independent researchers are unlikely
to have the commitment, longevity, or resources to produce such data in
adequate fashion.
Due to the lack of continuity in funding and personnel as well as the lack
of an organizational structure capable of maintaining rigorous quality control,
the overall quality of some large-scale cross-national studies of educational
achievement has suffered. In many countries, the quality of the sampling, the
translation of instruments into the national language(s), and the collection
and management of data have been impeccable, but in other countries errors have
been made in one or more of these processes. The results have been damaging in
two respects: on one hand, suspect conclusions have been accepted, and on the
other hand, critics of large-scale studies have overgeneralized and exaggerated
these faults and asserted that nothing in the whole enterprise is valid or to
be believed. For example, whereas in some countries, exclusions from target
populations may have been substantial enough to mislead the user about the
effects of a given country's schooling policies and practices, such exclusions
can be clearly identified in reports and, in many cases, shown not to be so
extensive as to affect conclusions drawn about the country as a whole.
Hence increased attention to controlling all sources of errors is
important. This includes such matters as defining comparative populations,
constructing sampling frames and selecting samples, developing instruments and
maintaining cross-national equivalency of instrumentation, administering
instruments and recording data, entering and editing data, coding and scoring,
and weighting and analysis. Documentation on the steps taken to control errors
is essential at all stages: at the planning stage, during data collection as a
guide to study personnel in all locations, and as a necessary part of reporting
what was done.
Dissemination and utilization of international education data remain weak
links in the information-reform connection within education. Public discussion
of cross-national education research is frequently superficial or incorrect or
both. The use of international data in the United States needs close scrutiny.
In the past, test results were often extracted from a study and reported
independently of other variables that provide essential context for the
results. Sometimes the limitations, such as sampling problems, were glossed
over as if they did not matter, and at other times they were exaggerated as if
they were sufficient to rebut everything that could be learned from such
studies. In the future, researchers and organizations responsible for
cross-national studies need to begin early in project cycles to plan for public
and professional discussion of the results and to improve the presentation of
findings to all concerned. Press conferences should be planned that respond to
the needs of journalists and at the same time present a more in-depth
background on what has been done. These meetings should clarify the importance
of considering variables such as demographics, expenditures, and teacher
training in conjunction with achievement scores in presenting the results of
the increasingly sophisticated research on education that is under way or
already available.
In addition to a better understanding of how and what to communicate,
consideration is needed of the potential contributions of new communication
technologies and data bases to the planning of high-quality cross-national
studies and to the dissemination and utilization of results. The rapidity and
simplicity of communication by facsimile and by electronic networks (e.g.,
BITNET and Internet) have revolutionized international research in the last
five years. As cross-national studies produce items and instruments, it will
be important to consider how all the information generated by such studies
might be made more accessible through electronic communication networks.
The presentation and interpretation of current results could be improved
if there were an easily accessible repository of cross-national research.
Students of education--from
undergraduate prospective teachers to senior researchers in highly specialized
subfields--ought
to have readily at hand a literature (good research published in reasonable
places) and associated data bases that would give them access not just to a
description of the structure of other education systems, but to an
understanding of how such systems work, with particular emphasis on how context
influences school organization, teaching practices, and learning
outcomes.