10
Putting Surveys, Studies, and Datasets
Together: Linking NCES Surveys to One
Another and to Datasets from Other Sources
George Terhanian and Robert Boruch
"Relations stop nowhere....The exquisite problem is eternally but to draw
the circle within which they happily appear to do so."
. . .
Henry James, Roderick Hudson, 1876
This paper examines ideas about combining different datasets so as to inform
science and society. It was prepared at the invitation of the National Research
Council's (NRC) Board on Testing and Assessment so as to inform the board's
deliberations about policy on education surveys in the United States. The surveys
of paramount interest are those sponsored by the National Center for Education
Statistics (NCES).
The research reviewed here and the implications that are educed from it are
directed first to the NRC. They are dedicated in the second place to the interests
of the NCES. The third target is the social sciences community more generally.
Examples given here are drawn from a variety of sciences inasmuch as data
linkage issues transcend academic disciplines. They are taken from different
institutional jurisdictions because the issues cross geopolitical boundaries.
Two studies are used to provoke discussion and to frame some issues:
Hilton's (1992) Using National Data-bases in Educational Research and Hedges and
Nowell's (1995) paper on national surveys of the mathematics and science abili-
ties of boys and girls. We also depend heavily on other materials generated by
NCES, the NRC, and others. This includes work, for example, on teacher supply,
demand, and quality (National Research Council, 1992) and on integrating federal
statistics on children (National Research Council, 1995). The minutes of the
NCES Advisory Council on Education Statistics reflect periodic interest in the
way NCES surveys can be linked to one another or to data generated by other
federal agencies (Griffith, 1992) and we exploit these also.
In what follows we begin with the two illustrations that help frame discus-
sion. The pedigree of linkage is considered briefly, and the ubiquity of linkages
in contemporary surveys is then discussed. Inasmuch as words
such as linkage, merging, and so on are used differently in the research literature,
the next section covers ways to clarify the language. Distinctions are further
drawn between statistical policy for making surveys connectable in contrast to de
facto policy in which post facto connections are difficult. Evaluating the prod-
ucts of any variety of linkages is important, and this topic is covered also, based
on suggestions about mapping and registering linkage studies. In the next to last
section of the paper we suggest exploring some new kinds of linkage. The paper
concludes with a summary of the implications of this work.
TWO INTRODUCTORY ILLUSTRATIONS
The origin of Hilton's (1992) book was in a project undertaken by the Edu-
cational Testing Service (ETS) to understand whether different sources of statis-
tical information, each based perhaps on a national sample, could be combined to
produce a "comprehensive unified database" of science indicators for the United
States. Sponsored by the National Science Foundation, the project's general goal
was to improve the way we capitalize on data that bear on educating scientists,
mathematicians, and engineers. The book's implications, inadvertent and other-
wise, are important for designing NCES surveys, among others.
Twenty-four education databases were reviewed by the project, including
the Survey of Doctoral Recipients, national teacher examinations, and at least
four massive longitudinal databases. Only 8 of the 24 were deemed worthy of
deeper examination. That is, the eight could be "linked" in some sense with
others, given the resources available. They included the National Longitudinal
Study of the Class of 1972 (NLS:72) and the National Education Longitudinal
Study of 1988 (NELS:88), the Equality of Opportunity Surveys (1960s), cross-sectional
systems such as the Scholastic Assessment Test (SAT), and the NCES
National Assessment of Educational Progress (NAEP).
As Hilton made plain in the preface to his book, the project was "not fea-
sible." Put more bluntly, the ETS effort to combine datasets was a flop despite
competent and thoughtful efforts. The databases chosen for examination could
not be used for the purpose considered (i.e., to produce a comprehensive science
database). It was, nonetheless, a project noble in aspiration and diligent in its
execution.
The questions posed in the Hilton project about the available databases,
which are relevant to linking any datasets, seem important for designing new
NCES surveys. Put in modified terms, the questions are as follows:
· What variables are common to various databases?
· What ways of measuring each variable, ways of sampling, and administration
are common, making comparison (or linkage) among datasets easy?
· What differences in ways of measuring, administering, and sampling
make comparison (or linkages) dubious or difficult?
· What can be done to fix different datasets so they are "comparable" (or
linkable) in some way and therefore make it sensible to put them together?
The Hilton book contained no detailed catalog of why the databases failed to
meet one or more of the criteria implied by these questions.
Hedges and Nowell (1995) attacked a different but related topic: understanding
gender differences in mental abilities of various kinds based on disparate
surveys. These authors chose to depend only on studies based on samples of
roughly the same target populations and that purportedly measured the same
abilities (e.g., reading). That is, they selected only studies that approached the
first three questions above in similar ways. Their final selections included NCES-sponsored
work, notably NELS:88, NLS:72, High School and Beyond (HS&B),
NAEP (trend data only), Project Talent, and the National Longitudinal Youth
Survey sponsored by the U.S. Department of Labor, among others. These are
summarized in Table 10-1. We rely periodically on its contents in what follows.
There was sufficient commonality in what was measured on whom in the
target populations in the Hedges-Nowell (1995) ambit to produce an informative
analysis. It is a fine illustration of combining different datasets so as to learn
whether males and females really differ on mental abilities and how they might
differ. For instance, the authors' dependence on well-defined national probability
samples avoided the inferential problems encountered in earlier studies, notably
depending on self-selected samples (as in SAT/ACT testing), idiosyncratic
samples (e.g., in test norming), and distributional assumptions (to get at characteristics
of extreme scores). A main product of Hedges and Nowell's work is
learning that males are more variable than females in their tested intellectual
achievement. This finding helps to elevate substantially the scientific conversa-
tion about the purported differences in the mean levels of math and science
abilities of boys and girls. It helps to show how more variability among boys may
produce specious claims about their ability relative to girls.
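The point about variability can be made concrete with a small calculation. The sketch below assumes, purely for illustration, two normally distributed score populations with identical means and a variance ratio of 1.15; these figures are hypothetical and are not taken from Hedges and Nowell (1995). It shows how the more variable group can be overrepresented above a high cutoff even though the means do not differ.

```python
from statistics import NormalDist


def tail_ratio(variance_ratio: float, cutoff_sd: float) -> float:
    """Ratio of the share of group A to the share of group B above a cutoff.

    Both groups are assumed normal with mean 0. Group B has standard
    deviation 1; group A has variance `variance_ratio` (an assumed value).
    `cutoff_sd` is the cutoff expressed in group-B standard deviations.
    """
    group_a = NormalDist(mu=0.0, sigma=variance_ratio ** 0.5)
    group_b = NormalDist(mu=0.0, sigma=1.0)
    share_a = 1.0 - group_a.cdf(cutoff_sd)
    share_b = 1.0 - group_b.cdf(cutoff_sd)
    return share_a / share_b


if __name__ == "__main__":
    # The ratio grows as the cutoff moves further into the upper tail.
    for cutoff in (0.0, 1.0, 2.0, 3.0):
        print(f"cutoff {cutoff:.0f} SD: ratio {tail_ratio(1.15, cutoff):.2f}")
```

Under these assumptions the ratio equals 1 at the common mean and rises steadily with the cutoff, which is why comparisons confined to extreme scorers can suggest a difference in level where there is only a difference in spread.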
[Table 10-1. Surveys selected by Hedges and Nowell (1995). Table contents are not recoverable from the machine-read text.]
THE PEDIGREE OF EFFORTS TO PUT DIFFERENT
DATASETS TOGETHER
The idea underlying any linkage effort undertaken by NCES or by others is
that combining data from different sources can help us learn something new.
More to the point, the combination permits us to learn something that cannot be
learned from individual sources. The idea has fine origins. Alexander Graham
Bell, for instance, exploited the notion in his study of genetic transmission of
deafness. In the late 1880s he depended on completed Census Bureau interview
forms found strewn in a government building basement and linked these to genealogical
records from other sources (Bruce, 1973).
One can also trace the theme to John Graunt's effort in the seventeenth
century to learn how to use records in the Crown's interest. Graunt exhorted the
King to understand his empire through a lens consisting of compilations of records
in statistical form: the counts of soldiers at arms, for instance, from one source
and the numbers of births, deaths, and so on from other sources. Scheuren
(1995), similarly thoughtful and exhortative, has reviewed and refreshed our
thinking about how to augment administrative records and understand them better
through surveys.
The pedigree of linkage studies is also reflected in contemporary efforts to
evaluate social programs. In studies of manpower training and employment, for
example, it has become common to link the employment records on specified
individuals to their program records and to link these data in turn to research
records on individuals (Rosen, 1974). In agriculture, health, and taxation, there
have been fine studies of why and how one ought to couple data from different
sources in a variety of ways (Kilss and Alvey, 1985).
From papers by Scheuren (1995) and others we learn about contemporary
history of record linkage algorithms (developed by Tepping and Fellegi-Sunter,
among others), the construction of matching rules and the information exploited
in matches, the idea of linkage documentation, and various approaches to adjust-
ing for mismatches. We can learn about the role of privacy issues and statistical
analysis implications from a related body of work (e.g., Cox and Boruch, 1988).
We learn about appraising the benefits and costs of linkage of administrative
records or the difficulty of doing so on account of sloppy practice, from aggres-
sive investigatory agencies such as the U.S. General Accounting Office (1986a,
1986b).
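The matching rules mentioned above can be illustrated with a simplified scoring scheme in the spirit of the Fellegi-Sunter approach. The sketch below is not drawn from any of the cited papers: the field names, the m and u probabilities, and the two decision thresholds are all assumed values chosen only to show the mechanics of agreement weights and a three-way decision.

```python
import math

# Hypothetical per-field probabilities (assumed for illustration only):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "zip_code": (0.85, 0.10),
}


def match_weight(rec_a: dict, rec_b: dict) -> float:
    """Sum of log2 agreement or disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)  # agreement adds evidence for a match
        else:
            total += math.log2((1.0 - m) / (1.0 - u))  # disagreement subtracts
    return total


def classify(weight: float, lower: float = 0.0, upper: float = 8.0) -> str:
    """Three-way decision; both thresholds are assumed values."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link (clerical review)"
```

Pairs whose total weight falls between the two thresholds are the ones that call for documentation and manual review, which is where much of the cost of adjusting for mismatches arises.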
The title of Hilton's book, Using National Data-bases in Educational
Research, may suggest to some readers that they can learn something about
whether, why, and how massive studies are combined and used. In fact, recent
work on how to enhance the usefulness of statistical data is pertinent. Some of it
has been economically oriented: for instance, Spencer's work (1980) on cost-benefit
analysis to allocate resources to various data collection efforts and the
follow-up papers by Moses, Spencer, and others. Scholarly papers on why and
how social research data, including educational and health research data, are used
are also relevant. Kruskal's volume (1982) is a gem on this account.
The analyses in Hilton's book were not burdened by the history of linkage.
That is, the authors failed to put the ETS linkage studies into the larger context of
such studies or the still larger context of design and exploitation of databases and
surveys. We learn about attempts to link the Armed Services Vocational Aptitude
Battery to tests given in the longitudinal HS&B survey and to SATs, but we are
not told about how this would enhance science indicators or inform decisions or,
more importantly, improve the design of surveys. Similarly, the Hedges and
Nowell (1995) paper does not consider implications of the work for the design of
better surveys that can be linked in any respect, despite the fact that the authors
are sensitive to the implications of their work on other accounts.
THE UBIQUITY OF PUTTING DIFFERENT DATASETS TOGETHER
AND FUNCTIONAL CATEGORIES
Some varieties of linkage are common, even pedestrian. So frequently do
they occur that they are taken for granted. Other varieties of linkage are not
encountered often. They may be undertaken for reasons that seem obviously
important or, to the lay public, obscure or trivial. This section provides illustra-
tion of linkages, pedestrian and otherwise. The examples are put into categories
that have meaning for scientists and an informed public: national probability
sample surveys, longitudinal studies, studies of the quality of data, intersurvey
consistency, and hierarchical studies.
National Probability Sample Surveys
Virtually all national probability sample surveys in this country and else-
where are an exercise in combining information from different systems. Tele-
phone surveys often draw on a population listing of telephone numbers. A
population census may draw on an address list for dwellings. The NCES Schools
and Staffing Survey, for instance, depends on lists of schools identified as admin-
istrative units or locations. List information is used to construct the sample.
Listed information is often combined in the same microrecord with the informa-
tion provided by the respondent.
Longitudinal Studies: Tracking Change
Any longitudinal survey involves linkage at a basic level. Microrecords
obtained on individuals or institutions at one point in time are linked to those
obtained subsequently, as in NELS:88, NLS:72, and HS&B. The organization
responsible for each wave of the survey may vary, of course, as when NCES used
different contractors. Target populations, variables, and their measurement may
also differ somewhat between waves.
Studies of the Quality of the Data
Any postenumeration survey of a national census and most post facto studies
of the quality of a large survey employ linkage. Microrecords in the main initial
survey, for instance, are compared to those generated in a more intensive, smaller,
and presumably more accurate study of a subsample of the original target popu-
lation. Efforts to estimate reliability of achievement tests focus on stability of
individual scores over time; individual scores must be linked across time. Finally,
many if not most studies of the validity of respondent reports in surveys rely on
two or more sources of information on the trait or characteristics of interest.
Enrollment records in colleges may be compared to self-reported enrollment
information in a sample of students receiving subsidized loans.
In the federal statistics arena, most studies of response quality or measure-
ment error require linkage and are described regularly in the professional litera-
ture. Scholarly reports usually appear, for instance, in the annual Proceedings of
the Section on Survey Methods of the American Statistical Association and in
reports issued by the federal agency that sponsored the work. It is disconcerting
to see little representation of municipal statisticians in these Proceedings and
reports. It is not clear why their contribution is sparse, and the matter deserves a
bit of researchers' attention.
Intersurvey Consistency
The NCES has conducted a Private School Survey (PSS) independent of a
special supplement to the Schools and Staffing Survey (SASS). SASS has
depended on the PSS for a sampling frame of schools, using a basic form of
linkage. More generally, both the supplemented SASS and PSS have provided
estimates of the numbers of schools, teachers, and students in the private sector.
Each survey is normally run at different times and measures some of the same
variables. On at least one occasion each was run in the same year (Holt et al.,
1994).
The results of each survey may or may not agree, differences in time frame
being one possible reason for discordance. The occasion of a PSS and a SASS
supplement in the same year permitted NCES to investigate the consistency
between them. At times then NCES depends on applying algorithms to SASS
that reweight subgroups' totals of schools, teachers, and students in various categories
so as to produce overall group totals that are consistent with PSS group
totals. A "group" here might be a type of private school (e.g., Catholic).
"Linkages" here are of two kinds. First, the PSS is used as the sampling
frame for SASS. Second, the memberships of schools in subgroups are supposed
to be identical in PSS and SASS, and a linkage between the two is required for
estimating new sampling weights.
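The reweighting described above can be sketched as a simple ratio adjustment within groups. The text does not specify the algorithm NCES applies, so the following is only an assumed, minimal version: each sampled unit's weight is scaled so that the weighted total for its group equals a control total taken from the frame survey. The group labels, weights, and control totals are hypothetical.

```python
from collections import defaultdict


def reweight_to_control_totals(sample, control_totals):
    """Scale sample weights so weighted group totals match control totals.

    sample: list of dicts with keys "group" and "weight".
    control_totals: dict mapping group -> target total (e.g., from a frame survey).
    Returns a new list of dicts with an added "adj_weight" key.
    """
    # Weighted total currently estimated by the sample for each group.
    weighted = defaultdict(float)
    for rec in sample:
        weighted[rec["group"]] += rec["weight"]

    adjusted = []
    for rec in sample:
        factor = control_totals[rec["group"]] / weighted[rec["group"]]
        adjusted.append({**rec, "adj_weight": rec["weight"] * factor})
    return adjusted


if __name__ == "__main__":
    schools = [
        {"group": "Catholic", "weight": 40.0},
        {"group": "Catholic", "weight": 60.0},
        {"group": "Other", "weight": 50.0},
        {"group": "Other", "weight": 50.0},
    ]
    for rec in reweight_to_control_totals(schools, {"Catholic": 120.0, "Other": 90.0}):
        print(rec)
```

The adjustment presupposes that each school is assigned to the same group in both files, which is exactly the second kind of linkage the paragraph describes.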
Consider next the problem of assuring that a school's locale is properly
identified as a large city or as midsized, as urban or suburban, and so on. Each
year NCES attempts to record every school and its locale through the annual
Common Core of Data (CCD) survey. Census Bureau data are used in the CCD
to identify locales, using seven well-defined locational categories used by the
bureau. Every two to five years SASS is run, targeting a sample of schools. In
this effort SASS also elicits information on locales using a simplified question
involving eight categories or responses. A challenge lies in reconciling the two
sources of information about school locale (Johnson, 1993; Bushery et al., 1992).
Reconciliation of the SASS and CCD files then involves linkage. Such studies
reveal, for instance, that roughly 70 percent of SASS reports on locales are
correct, that 87 percent of Census classifications are accurate, and that the most
common discordance lies in the suburban categories. More important, note that
both data sources are imperfect in different ways. This makes linkage-based
reconciliation studies essential to assuring the quality and interpretability of the
survey results.
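A reconciliation of this kind typically begins by cross-tabulating the two classifications for the linked schools. The sketch below is a minimal illustration with hypothetical locale labels and records. Note that agreement between two sources is not the same as the accuracy of either one, which would require an independent reference standard.

```python
from collections import Counter


def agreement_table(linked_pairs):
    """Cross-tabulate two classifications of the same linked units.

    linked_pairs: iterable of (code_from_source_a, code_from_source_b),
    one pair per unit that was successfully linked across the two files.
    Returns (agreement_rate, Counter of discordant (a, b) pairs).
    """
    pairs = list(linked_pairs)
    if not pairs:
        raise ValueError("no linked records to compare")
    agree = sum(1 for a, b in pairs if a == b)
    discordant = Counter((a, b) for a, b in pairs if a != b)
    return agree / len(pairs), discordant


if __name__ == "__main__":
    pairs = [
        ("large city", "large city"),
        ("suburban", "urban fringe"),
        ("rural", "rural"),
        ("suburban", "urban fringe"),
    ]
    rate, discordant = agreement_table(pairs)
    print(rate, discordant.most_common(1))
```

Ranking the discordant pairs shows where two category schemes diverge most often, which is how a finding such as the concentration of discordance in the suburban categories would surface.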
Reconciliation studies that illuminate the discrepancies that might be found
between two or more independent surveys are important. It would be discomfiting
to find a 10 percent difference in the number of teachers in the United
States based on one NCES survey, for example, in contrast to another NCES
survey undertaken independently and within a year or two of the first survey.
The differences between results of two independent surveys may be a matter of
sampling error. Or they may be substantial and attributable to differences in
questionnaire wording, definitions, and sampling frame. Being able to link
records so as to understand the discrepancies is essential. Linkages may be at the
entity level, such as a school, school district, or state. Or they may be at the
individual level, as when teachers respond to a questionnaire about their career in
teaching.
Consider the following examples based on Kasprzyk et al. (1994) and Jenkins
and Wetzel (1994). Discrepancies between independent surveys of institutions,
such as "schools," occur for a variety of reasons. For instance, some commercial
firms define schools in terms of their physical locations. The CCD defines schools
in terms of administrative units, two or more of which may be lodged in the same
location. These differences are relevant to sampling frames and to results of
surveys, of course. Careful analyses are done to assure that discrepancies and
their implications are understood.
Furthermore, estimates of the number of teachers in each state may be based
on SASS or on state-generated counts for CCD. The estimates may and do differ
at times for some states. For instance, overestimates of 15 percent in nine states
appeared in the 1990 to 1991 SASS for a variety of reasons. One such reason was
the questionnaire wording used in each survey. A respondent in the CCD would
report on a unit involving grades kindergarten through 6; the SASS respondent
might report on kindergarten through 6 and on grades 7 and 8. Postprocessing
edits helped reduce discrepancies.
Hierarchical Studies
Once said, it is obvious that any survey of schools, teachers in schools, and
students assigned to particular teachers must involve a basic linkage of microrecords
to be useful as a hierarchical study. That is, one must be able to link each
child to his or her teacher and each teacher to the school that the teacher serves.
Research on the problem of doing such work in the context of SASS has been
conducted since at least the early 1990s (King and Kaufman, 1994). Partly
because such work often involves ex-ante design, rather than ex-post facto record
linkage, difficulties in linkage appear to be ordinary. Rather, estimation issues
appear to be difficult. Of course, many more levels of linkage are possible. The
Third International Mathematics and Science Study (TIMSS) is an obvious
example. It involves no temporal linkage of the kind that longitudinal studies
require. It does involve sampling test items in each child, sampling classrooms in
schools, sampling schools in each nation, and a nonprobability sample of nations.
Thousands of instances of linkage of diverse kinds are entailed in such a study.
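The basic student-to-teacher-to-school linkage can be sketched as two successive lookups. The record layouts and identifiers below are hypothetical and do not reflect the actual file structure of SASS or TIMSS. The sketch also reports students who cannot be linked, since unlinked records bear on the estimation issues noted above.

```python
def link_hierarchy(students, teachers, schools):
    """Attach school-level information to each student through the teacher.

    students: dicts with "student_id" and "teacher_id".
    teachers: dicts with "teacher_id" and "school_id".
    schools:  dicts with "school_id" and "sector".
    Returns (linked_records, ids_of_students_that_could_not_be_linked).
    """
    teacher_by_id = {t["teacher_id"]: t for t in teachers}
    school_by_id = {s["school_id"]: s for s in schools}

    linked, unlinked = [], []
    for st in students:
        teacher = teacher_by_id.get(st["teacher_id"])
        school = school_by_id.get(teacher["school_id"]) if teacher else None
        if teacher is None or school is None:
            unlinked.append(st["student_id"])
            continue
        linked.append(
            {**st, "school_id": school["school_id"], "sector": school["sector"]}
        )
    return linked, unlinked
```

When identifiers are assigned at the design stage, as in an ex-ante design, both lookups succeed for nearly every record. The unlinked list grows when the files were never designed to be joined.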
WHAT DOES "LINKAGE" MEAN?
Vernacular in the sciences is not as uniform as one might expect. Recall, for
instance, debates over what constitutes a gene or genome in the Human Genome
Project. Discussions about integrating or linking data in the social sciences also
are affected by dialect differences. We discuss illustrations below and then
dimensionalize the idea of linkage. The focus is on units whose records are to be
linked, the populations from which units are sampled, and the variables that are
measured on these units, and other matters. All of what follows depends on
learning from others about what linkage has meant in the context of work spon-
sored by NCES and others.
Vernacular and Definitions in Education Statistics
The Hilton (1992) book's vernacular is sufficiently different from technical
parlance in related areas to confuse some readers. For instance, there are repeated
references to "linking" and "merging" of different databases, but these terms are
undefined. Further, the book's use of these words is at times not the same as is
customary in contemporary statistical work. For instance, linkage is defined, in
effect and occasionally, as combining microrecords based on a common identi-
fier for the same person or entity. At times the book's use of the word link is to
imply an intention to "put together." At other times the word link means to
stratify the units in each database in the same way (e.g., high ability, Hispanic,
and so on) in order to look at how frequencies in these strata change over time on
a dimension such as persistence in studying science. The word merge is also used
to describe putting different records together, records that may or may not have a
common source.
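Two of the senses of "link" distinguished above lead to quite different operations on the data. The sketch below contrasts them. The function names, record keys, and example fields are hypothetical and are not taken from Hilton (1992).

```python
from collections import Counter


def link_by_identifier(file_a, file_b, key="id"):
    """Sense 1: combine microrecords that share a common identifier."""
    b_by_key = {rec[key]: rec for rec in file_b}
    return [{**rec, **b_by_key[rec[key]]} for rec in file_a if rec[key] in b_by_key]


def stratum_shares(records, stratum_key, outcome_key):
    """Sense 2: within one file, the share of each stratum showing an outcome.

    Running this separately on each database and comparing the shares
    requires common stratum definitions but no common identifier.
    """
    totals, hits = Counter(), Counter()
    for rec in records:
        totals[rec[stratum_key]] += 1
        hits[rec[stratum_key]] += 1 if rec[outcome_key] else 0
    return {stratum: hits[stratum] / totals[stratum] for stratum in totals}
```

The first operation produces a single file of combined records on the same persons or entities. The second produces only parallel summary figures, so the two files never need to contain the same individuals.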
The phrase "pooling data" was used by Hilton (1992) and has been used by
others in the sense of doing a side-by-side comparison of statistical results from
each of several different datasets. This phrase is not used in a way that some
readers would expect. For some analysts "pooling data" means combining the
data from two or more samples of the same population into one that can be
analyzed as a complete sample. For others it means combining the results from
samples of different populations. Finally, consider another more recent example.
Bohrnstedt (1997) uses the words link, integrate, and connect in a thoughtful
essay entitled "Connecting NAEP Outcomes to a Broader Context of Educational
Information." His use of these terms, at first glance, is instructive. The consci-
entious reader might observe, for example, that Bohrnstedt makes a careful dis-
tinction between link and integrate. He refers, for example, to the "integration"
of CCD information with NAEP data, and he discusses the possible "linkage" of
NELS:88 and NAEP data. The reader who also possesses some knowledge of
what these datasets contain might then conclude that two datasets can be linked,
at least in the context of education, if both involve the assessment of achievement
or performance. This reader would be mistaken, though. As Bohrnstedt con-
cludes, he uses the term link when referring to CCD/NAEP integration: that is, he
substitutes link for integrate. The word connect does not reappear in the paper's
prose. What are the implications of this example? Especially in creative efforts
such as Bohrnstedt's, the precise meanings of such words as link, integrate, and
connect ought to be made plain.
Vernacular in Other Sciences
Work on genes and genomes engenders problems of differences in labeling
the object of their attention in context. For instance, a gene for one species may
be called something different from the same gene in another species. Given the
remarkable growth in genetic research, including the number and size of genome
sequence databases, this is not a trivial matter (Williams, 1997). Similarly,
scientists have begun to build a World Wide Web-oriented database on gene
mutations as a part of the Human Genome Project effort. A feature of the design
problem is to agree on what to call mutation. "The nomenclature is nearly agreed
on . . . (with) the systematic name . . . based on the nucleic acid change and . . .
the common name based on the amino acid change" (Cotton et al., 1998:9). The
Internet will be used to further explicate and debate.
The vernacular problem is not confined to the life sciences. It extends to
mathematics. "Computation," for instance, was heralded in a recent Science
piece on bridging databases. In fact, basic statistical analyses, rather than compu-
tations, were the main topic: understanding how to estimate relationships when
there are many errors attributable to sampling and measurement (Nadis, 1996).
The lead on an interesting letter to Science was entitled "Bioinformatics: Mathematical
Challenges" (Grace, 1997). Yet the letter concerns what is now regarded
as a conventional statistical analysis approach to understanding the structure
underlying data (i.e., analysis of variance), developed by two scholars who
admired and exploited mathematics, R.A. Fisher and O. Kempthorne.
Science has also carried excellent articles with headings such as "Digital
Libraries" (e.g., Nadis, 1996), "Letters" (Cotton et al., 1998), and "Bioinformatics"
(Williams, 1997). They all deal with the names of things. But such papers are
not easily found in any Web or library-based search based on a single keyword.
One of us had to review the articles published over a five-year period to get the
connection.
Implication: Understanding and Standardizing Nomenclature
One of the implications of this vernacular problem for NCES is that discus-
sion, analysis, and agreement on terminology are in order. Because there has
been little standardization in educational statistics produced at the state level, in
recent years NCES has played a leadership role in getting state education agen-
cies to agree to common definitions in statistical reporting. Witness the rough
consensus on using two or three definitions of "dropout," for example. Witness
also the NCES surveys of how public schools ask about students' race and
ethnicity and the stupefying variety in measurement that then impedes better
thinking. NCES can play a related role here and refresh the roles taken at times
by the Internal Revenue Service's Statistics of Income division, the Census
Bureau's methods division, and others. That is, NCES can help make plain what
we mean by "combining" datasets or surveys; "connecting" them; "linking"
microrecords, datasets, or surveys; "pooling" datasets or surveys; "integrating"
surveys or statistical systems; "unified databases"; and "merging" files. In other
words, putting things together. Absent explicit definitions of what these words
mean, reaching mutual understandings in the statistical and political communities
will be difficult or impossible. Most importantly, designing surveys so that they
can be linked, compared, merged, and so on will be impossible. NCES can be a
leading agency in this effort.
Dimensions of Linkage
One way of arranging the way we think about linkage is to depend on the
elements used in designing conventional statistical surveys. Consider then the
ideas of units of sampling, populations, and variables in this context and exten-
sions of the ideas.
Units: Individuals, Entities, or Both
Records on an individual may be linked, as when a child's school transcript
is linked to the child's responses to a survey questionnaire, as in High School and
Beyond. Or responses on one wave of the HS&B may be linked to responses on
subsequent waves, as in any longitudinal study. Similarly, a child or parent's
[. . .]
counts are a stereotypical device for characterizing value, but other approaches
can be exploited. For instance, the Proceedings of the American Statistical
Association is not viewed by some as a scholarly journal. Nonetheless, the work
products published therein are fundamental to our understanding of what goes on
at NCES and other statistical agencies. NCES's planned journal, and other peer-
reviewed journals, may publish works that appeared earlier in the Proceedings.
But it would be as foolish to rely on the latter alone as it would be to ignore the
Proceedings.
SOME LINKAGE OPTIONS IN EDUCATION STATISTICS
There is no formal, well-articulated "linkage policy" at NCES or any other
statistical or research agency in the United States. We are aware of no such
policy in Sweden, Israel, France, the United Kingdom, Japan, or Germany.
Absent formal policy, identifying viable and interesting examples of what is
desirable is a dubious objective. In what follows we suppress our ambivalence
and discuss what might be desirable. Each suggestion for the future ought to be
considered in light of our earlier suggestions in this paper about evaluation and
vernacular.
Linking NCES Surveys
Several of the NCES datasets mentioned earlier, including NAEP, SASS,
NELS:88, and CCD, contribute in distinct ways to the research and policy-making
communities' understanding of a variety of important educational issues. NAEP
for example, generates national and subnational estimates of achievement in core
subject areas on a regular basis. SASS, on a somewhat less regular basis, pro-
duces a wealth of information concerning teacher supply, demand, quality, and,
more generally, conditions in schools. NELS:88 allows researchers to test myriad
hypotheses bearing on how, and how well, students learn over time. And the
CCD provides general information on the nation's universe of school districts
and schools, respectively, on an annual basis.
Are These Datasets "Puzzle Pieces" that Fit Together Neatly?
Despite their unique contributions, these NCES surveys are not pieces of an
education puzzle that fit together neatly. On the contrary, certain pieces seem
broken, several duplicate pieces exist, some pieces are inexplicably missing, and
a few new pieces are produced so slowly that they appear to be altogether lost.
Examples are given in what follows.
Broken Pieces: Example 1
Terhanian (1997) analyzed 1994 NAEP data in the interest of developing a
deeper understanding of the relationship between school expenditures and student
reading proficiency. To obtain school expenditure information for his analysis,
Terhanian linked CCD district information (which he then converted to per-pupil
values) with NAEP district, school, teacher, and student information. The task of
linking CCD and NAEP data was by no means straightforward or seamless,
however, because the NAEP dataset did not include the CCD unique identifica-
tion code for participating school districts or schools. Yet, as Terhanian discov-
ered inadvertently, the NAEP dataset did include the two "broken" pieces (i.e.,
separate variables) of the unique district code. By simply concatenating the two,
Terhanian was able to create the one variable that was necessary to augment the
NAEP data with CCD data.
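The concatenation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure Terhanian actually ran: the field names `id_part_a`, `id_part_b`, and `per_pupil_expenditure` are hypothetical, the records are invented, and the real NAEP and CCD variable names would have to be taken from the datasets themselves.

```python
def build_district_id(part_a, part_b):
    """Join the two pieces of the district code into one key.

    The pieces are handled as strings so that leading zeros survive.
    """
    return str(part_a).strip() + str(part_b).strip()


def augment_with_ccd(naep_rows, ccd_by_district):
    """Return copies of the NAEP rows with CCD district fields attached.

    Rows whose rebuilt key has no CCD entry are kept and flagged, so the
    analyst can see how many links failed.
    """
    augmented = []
    for row in naep_rows:
        key = build_district_id(row["id_part_a"], row["id_part_b"])
        ccd_fields = ccd_by_district.get(key)
        merged = dict(row)
        merged["district_id"] = key
        merged["ccd_matched"] = ccd_fields is not None
        if ccd_fields is not None:
            merged.update(ccd_fields)
        augmented.append(merged)
    return augmented


# Invented toy records (hypothetical field names and values).
naep_rows = [
    {"id_part_a": "42", "id_part_b": "01234", "reading_score": 212},
    {"id_part_a": "36", "id_part_b": "09999", "reading_score": 205},
]
ccd_by_district = {"4201234": {"per_pupil_expenditure": 5600.0}}

linked = augment_with_ccd(naep_rows, ccd_by_district)
```

Keeping unmatched rows with a `ccd_matched` flag, rather than dropping them, is a design choice made here so that the share of failed links stays visible to the analyst.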
A Peculiar Irony
NCES does not provide researchers with instructions on how to "fix" the
"broken" pieces in the NAEP user's manual. Nor do NCES representatives
actively publicize the presence of these pieces. It is perhaps for these reasons that
scholars who focus on NAEP's improvement often recommend linkage with the
CCD. They simply do not realize that the two datasets are already linkable, albeit
with difficulty.
Duplicate Pieces: Example 2
Several NCES datasets, including NELS:88, SASS, and NAEP, include ques-
tions about school quality, teacher experience, and other common areas that
concern policy makers and researchers. In some cases the exact same questions,
or very similar ones, appear on different surveys. In other cases, however,
questions about the same topic are phrased so differently across surveys that it is
impossible to compare responses. Understanding NCES's rationale here is not as
complicated as it seems. No one at NCES is charged with the responsibility of
coordinating the various surveys, many of which run during the same year, at the
microlevel. That is, no one really knows which questions are on which surveys,
much less how they got there. We believe there is a better way.
Missing Pieces: Example 3
Linkage efforts are less successful than planned at times because puzzle
pieces are missing. In the 1992 NAEP eighth-grade national math assessment,
for instance, only about 60 percent of 8,300 math teachers could be linked cor-
rectly to their students. Data were completely missing for 35 percent of the total
sample of teachers and partly missing for another 5 percent. Attempts by re-
searchers to shed light on the relationship between teacher characteristics and
student achievement, then, could only flop. NCES and its contractors seem to
have corrected the within-school linkage problem: the teacher/student match
rate improved appreciably for the 1994 and 1996 NAEP assessments. The ability
of NCES and its contractors to learn from such failures certainly bodes well for
the future.
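Taken at face value, shares of about 60, 35, and 5 percent of 8,300 teachers correspond to roughly 4,980, 2,905, and 415 teachers. A match-rate check of this kind could be sketched as follows. The field names and records are hypothetical, and the three-way classification (complete, partial, missing) is an assumption made for illustration rather than the rule NCES or its contractors applied.

```python
from collections import Counter


def classify_record(record, required_fields):
    """Label one teacher record by how much linkage data it carries."""
    present = sum(
        1 for field in required_fields if record.get(field) not in (None, "")
    )
    if present == len(required_fields):
        return "complete"
    if present == 0:
        return "missing"
    return "partial"


def linkage_summary(records, required_fields):
    """Return the percentage of records in each completeness category."""
    if not records:
        raise ValueError("no records to summarize")
    counts = Counter(classify_record(r, required_fields) for r in records)
    return {
        status: 100.0 * counts[status] / len(records)
        for status in ("complete", "partial", "missing")
    }


# Hypothetical field names and invented records.
REQUIRED = ("teacher_id", "class_id")
teachers = [
    {"teacher_id": "T1", "class_id": "C1"},
    {"teacher_id": "T2", "class_id": "C2"},
    {"teacher_id": "T3", "class_id": ""},
    {"teacher_id": None, "class_id": None},
]
summary = linkage_summary(teachers, REQUIRED)
```

Running a summary of this sort before analysis would show at a glance whether the teacher/student link is strong enough to support the intended comparisons.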
Lost Pieces: Example 4
NCES datasets are not always produced expeditiously. Instead, some datasets,
notably the CCD, are produced so slowly that they appear to be altogether lost.
This not only diminishes the usefulness of the CCD to researchers and others but
also adds to their frustration. Consider, as an example, the case of the JASON
Foundation for Education. In 1997 the foundation developed a promising method
to deliver science instruction via the Internet to middle school students. At the
same time, it developed a simple registration process for potential participants
that exploited the interactive nature of the Internet and relied on information from
the 1992 to 1993 CCD.
In order to register for the pilot program, participants had to first identify
their school district from a menu of districts and then their school from a menu of
all schools in their district. After they did so, additional information about the
district and the school populated several data fields on the registration page.
JASON then asked potential participants to complete the registration form by
confirming or editing the CCD information that populated the data fields. From
start to finish, the entire process should have taken less than five minutes.
The registration process turned out to be flawed, however, because a nontrivial
percentage of CCD information was either obsolete or missing (i.e., it
seemed "lost"). For this reason about 10 percent of the first several hundred
JASON registrants could not find their school districts or schools listed among
those on the registration Web site menu. Others who were able to find their
school districts or schools often felt obligated to correct dated information (e.g.,
number of students in the school). The registration process turned out to be a
burden for respondents despite the good intentions of the folks at JASON.
What does this example of a "lost piece" imply for NCES? If researchers
and others are to rely on the CCD, NCES must ensure that data are collected and
compiled more expeditiously. Comparing the pace of the current collection and
compilation process to that of the movement of a glacier, regardless of the cause
(e.g., state officials possess no obvious incentive to provide NCES with informa-
tion in a timely manner), seems fair.
What Combination of NCES Data Is Available and at
What Linkage Level?
For any randomly chosen public school in the United States, the CCD is
likely to be the only NCES information source available to researchers and policy
makers. Absent a change in how NCES designs its surveys, there is little reason
to expect some nontrivial combination of CCD, SASS, NAEP, and NELS:88 data
to be collected during the same year for a meaningfully representative sample of
schools. This is despite the fact that some combination of these data would, in
our opinion, better serve the research and policy-making communities.
Table 10-2 displays crudely the current linkages among and between the
NCES datasets mentioned here. It also describes the level at which these datasets
are currently linkable. What are the current implications of these potential
linkages for research? It is possible to link some combination of CCD (e.g.,
core per-pupil expenditures of the Amarillo Independent School District), SASS,
NAEP, and NELS:88 information at the district level in a given year. See
Terhanian (1997), Wenglinsky (1997), and Taylor (1997) for recent examples of
analyses that have exploited some combination of these linkage opportunities. It
is also possible, in some cases, to link CCD, SASS, and NELS:88 at the school
level in a given year. About 23 percent of the schools in which the sample of
NELS:88 students were enrolled in both 1990 and 1992, for instance, also partici-
pated in the 1990 to 1991 wave of SASS. CCD information, then, is also avail-
able for these schools during these years.
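The linkage levels in Table 10-2 and the idea of an overlap rate between two samples can be expressed compactly. The sketch below is illustrative only: the level-to-dataset mapping is transcribed from Table 10-2, while the school identifiers and the helper names `linkable_at` and `overlap_share` are invented for the example.

```python
# Linkage levels as displayed in Table 10-2.
LINKAGE_LEVELS = {
    "district": {"SASS", "NELS:88", "CCD", "NAEP"},
    "school": {"SASS", "NELS:88", "CCD"},
}


def linkable_at(level, datasets):
    """True if every named dataset can be linked at the given level."""
    return set(datasets) <= LINKAGE_LEVELS[level]


def overlap_share(primary_ids, secondary_ids):
    """Percentage of the primary sample's units also in the secondary one."""
    primary = set(primary_ids)
    if not primary:
        raise ValueError("primary sample is empty")
    return 100.0 * len(primary & set(secondary_ids)) / len(primary)


# Invented school identifiers, for illustration only.
nels_schools = ["S01", "S02", "S03", "S04"]
sass_schools = ["S03", "S07", "S09"]
share = overlap_share(nels_schools, sass_schools)
```

The overlap is computed relative to the primary sample, which mirrors the way the 23 percent figure above is stated: as a share of the NELS:88 schools, not of the SASS schools.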
The value of linkage may seem trivial to researchers who wish to carry out
analyses of student or school samples that are representative of the nation or
states. The implications for the design of future surveys, however, are perhaps
less trivial. Just as we recommended that NCES or some other thoughtful federal
agency develop a map or maps of variables across surveys, we also suggest that
they consider doing so for the actual surveys they sponsor. The object of map-
ping is to better understand how the education puzzle pieces fit together, what
pieces are missing, and what pieces are needed to better complete the puzzle.
TABLE 10-2 Linkages Between and Among NCES Datasets

Level       Data Source
District    SASS, NELS:88, CCD, NAEP
School      SASS, NELS:88, CCD

Linkage and Augmentation of NCES Data and Non-NCES Data
At times, states, other federal agencies, and government contractors produce
information that can be linked to NCES datasets, including NAEP. For instance,
the Pennsylvania Educational Policy Studies Project, which is affiliated with the
University of Pittsburgh, maintains a database that provides general descriptive
data on the universe of Pennsylvania's school districts. These data include valu-
able information that is not available through other sources such as the CCD,
notably each school district's Equalized Subsidy for Basic Education (ESBE)
revenue (which is the largest source of state aid to school districts) and the ratio
the state uses to determine ESBE revenue.
States such as Pennsylvania, then, are in a position to exploit linkage oppor-
tunities. For instance, the Pennsylvania state department of education might
compare NAEP results with results from its own state assessment. Or Pennsylva-
nia might undertake a large-scale satisfaction survey of the sample of schools
participating in NAEP or SASS in the interest of understanding the effect of
school quality, measured more broadly than it is currently measured, on school
and perhaps even student achievement. Instances of states capitalizing on NCES's
efforts are hard to find, however.
An example of a government agency capitalizing on and augmenting NCES's
work is not so hard to find. The General Accounting Office (GAO) used the
SASS sample in its recent work to investigate the quality of school facilities
across the United States. GAO did not, however, return an augmented dataset to
NCES for analysis because no arrangement had been made with NCES in advance.
To us this seems quite shortsighted on the part of NCES, GAO, or
perhaps both.
The American Institutes for Research, a government contractor, has pro-
duced a Teacher Cost Index (TCI) to which NAEP or other NCES datasets might
be linked. The TCI is a district-level index that accounts for factors that underlie
differences in the cost of living among school districts (Chambers, 1995).
Developed in part on the basis of an analysis of the 1993 to 1994 SASS, the TCI
provides researchers with an arguably important tool for adjusting expenditure
data to make expenditure effectiveness comparisons more fair. It enables re-
searchers to estimate, for instance, the annual salary that school districts across a
state would have to pay a similarly qualified teacher.
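The kind of adjustment described here can be illustrated with a short sketch. It assumes, purely for illustration, a cost index scaled so that 1.0 represents a reference district. The actual scaling and construction of the TCI are documented in Chambers (1995) and are not reproduced here, and the district names and dollar figures below are invented.

```python
def cost_adjusted(nominal_dollars, cost_index):
    """Deflate a nominal dollar amount by a district-level cost index.

    Assumes the index is scaled so that 1.0 is the reference district.
    """
    if cost_index <= 0:
        raise ValueError("cost index must be positive")
    return nominal_dollars / cost_index


def comparable_salary(reference_salary, cost_index):
    """Salary a district would need to pay to match the reference salary."""
    if cost_index <= 0:
        raise ValueError("cost index must be positive")
    return reference_salary * cost_index


# Invented districts and values (hypothetical).
districts = {
    "District A": {"per_pupil": 6000.0, "index": 1.25},
    "District B": {"per_pupil": 5000.0, "index": 1.00},
}
adjusted = {
    name: cost_adjusted(d["per_pupil"], d["index"])
    for name, d in districts.items()
}
```

In this toy case District A spends more than District B in nominal terms but less once costs are taken into account, which is the sort of reversal a cost index is meant to expose.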
Private Organizations
At a high level of analysis, private organizations often link their efforts to a
dataset generated by public agencies. Louis Harris and Associates, for instance,
periodically surveys nationally representative samples of teachers, students, and
parents. The sampling frames on which the organization relies include the CCD.
Harris's efforts do not usually engender individual privacy issues because data
are reported only in the aggregate. Moreover, the issues that concern Harris are
not necessarily those that NCES and other federal agencies are able to focus on.
Rather, Harris consciously seeks to fill information gaps and therefore
focuses on certain important issues in far greater depth than NCES. These issues
include parental involvement; safety and violence in schools, neighborhoods, and
cities; and gender equity in schools.
There is no great reason why Louis Harris and Associates or other private
organizations could not cooperate with NCES (or other statistical agencies) to
enhance understanding of the value of sample augmentation linkage of the sort
described earlier. Harris could have used the NCES Schools and Staffing Survey
or any of the recent NAEP samples, for example, to inform or improve the design
of the 1997 Metropolitan Life surveys that investigated gender equity and parental
involvement in schools from the perspectives of students, teachers, and parents.
And the organization might have provided NCES with resultant datasets as well
as suggestions for improving future surveys and/or linkage.
Organizations such as Louis Harris and Associates are sensitive to the idea
that linkages of various kinds can advance the company's mission in the public
interest. They also recognize that linkage of datasets may be useless and that
linkage engenders both naive and subtle privacy issues. More important, such
organizations can be encouraged to develop more creative and innocuous ap-
proaches to policy on putting datasets together. This effort could be made for
national samples of schools, local education agencies, sampling frames, and so
forth. The information that comes about as a result ought to become a part of the
knowledge base for NCES and other statistical agencies.
SUMMARY
Implication: Electronic Mapping
NCES, and perhaps other statistical agencies, can invent a Web-based system
for mapping the variables measured in each survey sponsored by the agency (and
other studies), the questions that address the variables, and the question response
categories, exploiting hypertext to facilitate the acquisition of deeper information
and wider searches. This would make easier the task of understanding what is
common and unique to diverse surveys in education and perhaps other areas.
Such a system is a natural extension of NCES's work on data warehousing and
electronic code books and can adopt software that meets open database connec-
tivity standards.
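The core of such a mapping system is a lookup from variables to the surveys, questions, and response categories that measure them. A minimal sketch follows. Every survey name, variable, question wording, and response category in it is a placeholder invented for illustration, and a production system would sit on a database rather than an in-memory dictionary.

```python
# Placeholder content only: no entry here describes a real survey item.
VARIABLE_MAP = {
    "teacher_experience": {
        "SURVEY_A": {
            "question": "How many years have you taught?",
            "responses": "number of years",
        },
        "SURVEY_B": {
            "question": "Years of teaching experience",
            "responses": ["0-2", "3-5", "6-10", "11 or more"],
        },
    },
    "school_safety": {
        "SURVEY_B": {
            "question": "How safe do you feel at school?",
            "responses": ["very safe", "somewhat safe", "not safe"],
        },
    },
}


def surveys_measuring(variable):
    """List the surveys that include at least one question on the variable."""
    return sorted(VARIABLE_MAP.get(variable, {}))


def common_variables(survey_a, survey_b):
    """Variables measured by both surveys, whatever the question wording."""
    return sorted(
        v for v, by_survey in VARIABLE_MAP.items()
        if survey_a in by_survey and survey_b in by_survey
    )


def unique_variables(survey):
    """Variables measured by the given survey and by no other."""
    return sorted(
        v for v, by_survey in VARIABLE_MAP.items()
        if set(by_survey) == {survey}
    )
```

Storing the question wording and response categories alongside each variable matters because, as noted earlier, two surveys can cover the same topic in ways that cannot be compared. The map makes that visible instead of hiding it behind a shared variable name.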
Implication: Nomenclature
NCES can play a leadership role in clarifying and standardizing the semantics
of linkage. This would help make plainer and more uniform words such as
merging, pooling, connecting datasets, and so forth, and would foster sensitivity to
definitions of these in statistical policy, activity, and publications. NCES has
been vigorous in related respects in the past, to judge from the agency's work
with state education agencies on, for example, determining what dropout means
and how a dropout is counted.
Implication: Dimensionalizing Linkage
NCES can explore ways to make plainer the functions of linking surveys, in
effect dimensionalizing linkage activity. This might be done, as suggested earlier,
by hinging dimensionalization on the ideas of augmenting a primary survey with
two or more secondary ones, focusing on what is augmented: samples, popula-
tions, variables, modes of measurement, replication, and so on. The rationale is
that we need to learn how to better arrange our thinking about very complex
linkage efforts.
Implication: Linkage Policy
NCES can explore at least two approaches to linkage policy. Ex-ante policy
stresses the idea that all surveys can be planned so as to be more connectable in
specific senses. Ex-post facto policy recognizes that not all linkage can be
planned and that unplanned linkage must be planned for. Further, institutional
vehicles for developing policy can be identified and explored, such as inter-
agency councils and statistical agency task forces. In the continued absence of
coherent policy, we are unlikely to make much progress in productively exploit-
ing diverse surveys or in better understanding the benefits and costs of linked
studies.
Implication: Registries, Displays, and Evaluation
Developing a registry of each study that depends on linkage and developing
new ways of displaying linkable or linked studies are possible. These are essential
to understanding the linkage landscape and, moreover, to evaluating the value of
linkages of various kinds. No such registries exist. Partly for this reason, per-
haps, few formal and comprehensive evaluations of linkage efforts have been
published.
Implication: Broken Pieces, Missing Pieces
NCES can consider approaching linkage issues productively by using a
"broken pieces, missing pieces" theme. That is, one tries to understand how a
study could have been more informative had the possibility of linkage been actualized through
better planning. This perspective is kin to the idea underlying good postmortems
in medicine and good crash investigations in the aviation and nuclear sciences,
engineering, and other disciplines. It can be exploited and formalized by statistical
agencies in the linkage context, as it already is, in effect, in individual survey efforts.
Implication: Cross-Agency and Cross-Institution Initiatives
NCES can play a leadership role in understanding whether and how certain kinds
of linkage studies that cross institutional and geopolitical jurisdiction lines have
been done, how productive they have been, and how they could be done. In principle, for example, some
surveys sponsored by the public might easily be linked in one or more dimensions
with privately sponsored surveys. In principle a survey mounted by a federal
statistical agency such as NCES can be designed so as to permit easy connection
to a study designed by a federal agency with another mission, such as program
evaluation. What is possible in principle is not always possible in practice, but
unless we explore the former, we will not improve the latter.
To return to the general topic of this essay, recall the quotation from Henry
James at the start of this paper. It says, in other words, that everything is related
to everything else. To make this manageable, NCES and the statistical and social
sciences community have to draw circles around the more connectable things. In
this respect the work reviewed in this paper and the implications educed here can
help NCES and the research community do better in the future. This requires
resources, of course, not the least among which is the political and scientific will
to make data work harder to serve the public interest.
ACKNOWLEDGMENTS
Research for this paper was sponsored by the National Center for Education
Statistics, the National Science Foundation, and the U.S. Department of Educa-
tion. We are grateful to colleagues at the Planning and Evaluation Service of the
U.S. Department of Education, the U.S. General Accounting Office, and the
Education Statistical Services Institute for conversations that helped clarify our
thinking on the topic.
REFERENCES
Blasius, J., and M. Greenacre
1998 Visualization of Categorical Data. New York: Academic Press.
Boruch, R.F., and G. Terhanian
1996 So what? The implications of new analytic methods for designing NCES surveys. Pp.
4.1-4.118 in From Data to Information: New Directions for the National Center for
Education Statistics, G. Hoachlander, J. Griffith, and J.H. Ralph, eds. Washington, D.C.:
U.S. Department of Education.
1998 Controlled experiments and survey-based studies on educational productivity: Cross-
design synthesis. Pp. 59-85 in Advances in Educational Productivity, Volume 7, A.
Reynolds and H. Walberg, eds. Greenwich, Conn.: JAI Press.
Bohrnstedt, G.W.
1997 Connecting NAEP Outcomes to a Broader Context of Educational Information. Paper
presented at the annual meeting of the American Educational Research Association,
Chicago.
Braun, M., and W. Miller
1997 Measurement of education in comparative research. Comparative Social Research
16:163-201.
Brooks-Gunn, J., B. Brown, G.J. Duncan, and K.A. Moore
1995 Child development in the context of community resources: An agenda for national data
collection. Pp. 27-97 in Integrating Federal Statistics on Children: Report of a Workshop.
Board on Children and Families and Committee on National Statistics, National
Research Council. Washington, D.C.: National Academy Press.
Bruce, R.V.
1973 Bell: Alexander Graham Bell and the Conquest of Solitude. New York: Little Brown.
Bushery, J., D. Royce, and D. Kasprzyk
1992 The Schools and Staffing Survey: How re-interview measures data quality. In 1992
Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American
Statistical Association.
Citro, C.F.
1997 Editor's postscript. Chance 10(4):31.
Chambers, J.
1995 Public School Teacher Cost Differences Across the United States. Washington, D.C.:
National Center for Education Statistics.
Cotton, R.G.H., V. McKusick, and C.R. Scriver
1998 The HUGO Mutation Database Initiative. Science 279:10-11.
Cox, L.H., and R.F. Boruch
1988 Emerging policy issues in record linkage and privacy. Journal of Official Statistics
4(1):3-16.
Evinger, S.
1997 Recognizing diversity: Recommendations to OMB on standards for data on race and
ethnicity. Chance 10(4):26-31.
Grace, J.B.
1997 Letter. Science 275:1862-1863.
Griffith, J.
1992 Presentation to the National Advisory Council on Education Statistics (March 12-13,
1992): Draft Paper on a Proposal for an Integrated Longitudinal Studies Program. Wash-
ington, D.C.: National Center for Education Statistics.
Harkness, J., and P. Mohler
1998 Towards a Manual of European Background Variable: Part I, Appendix II: Report on
Background Variables in a Comparative Perspective. Mannheim, Germany: Zentrum
für Umfragen, Methoden und Analysen.
Hedges, L.V., and A. Nowell
1995 Sex differences in mental test scores, variability, and numbers of high scoring individuals.
Science 269:41-45.
Hilton, T., ed.
1992 Using National Data-bases in Educational Research. Hillsdale, N.J.: Lawrence Erlbaum
Associates.
Hofferth, S.L.
1995 Children's transition to school. Pp. 98-123 in Integrating Federal Statistics on Children:
Report of a Workshop. Board on Children and Families and Committee on National
Statistics, National Research Council. Washington, D.C.: National Academy Press.
Holland, P.W., and D.B. Rubin, eds.
1982 Test Equating. New York: Academic Press.
Holt, A., S. Kaufman, F. Scheuren, and W. Smith
1994 Intersurvey consistency in school surveys. Pp. 105-110 in Volume II: 1994 Proceedings
of the Section on Survey Research Methods. Alexandria, Va.: American Statistical
Association.
Jenkins, C.R., and A. Wetzel
1994 The 1991-92 teacher follow-up survey reinterviewed and extensive reconciliation. Pp.
821-826 in Volume II: 1994 Proceedings of the Section on Survey Research Methods.
Alexandria, Va.: American Statistical Association.
Johnson, F.
1993 Comparisons of school locale settings: Self-reported vs. assigned. Pp. 689-691 in 1993
Proceedings of the Section of Survey Research Methods. Alexandria, Va.: American
Statistical Association.
Kasprzyk, D., K. Gruber, S. Salvucci, M. Saba, F. Zhang, and S. Fink
1994 Some data issues in school-based surveys. Pp. 815-820 in Volume II: 1994 Proceedings
of the Section on Survey Research Methods. Alexandria, Va.: American Statistical
Association.
Kilss, W., and W. Alvey, eds.
1985 Record Linkage Techniques: Proceedings of the Workshop on Exact Matching Method-
ologies. Washington, D.C.: U.S. Department of the Treasury.
King, K.E., and S. Kaufman
1994 Estimation issues related to the student component of SASS. Pp. 1111-1115 in 1994
Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American
Statistical Association.
Kruskal, W.H., ed.
1982 The Social Sciences: Their Nature and Use. Chicago: University of Chicago Press.
Ligon, G.
1998 Success Finder Mapper. Available at: www.evalusoft.com.
McCabe, B., and J. Harkness
1998 Towards a Manual of European Background Variable: Part I, Appendix II: Report on
Background Variables in a Comparative Perspective. Mannheim, Germany: Zentrum
für Umfragen, Methoden und Analysen.
Nadis, S.
1996 Computation cracks semantic barriers between data-bases. Science 272:1419.
National Research Council
1992 Teacher Supply, Demand, and Quality: Policy Issues, Models, and Data-bases, E.E. Boe
and D.M. Gilford, eds. Committee on National Statistics. Washington, D.C.: National
Academy Press.
1995 Integrating Federal Statistics on Children. Board on Children and Families and Committee
on National Statistics. Washington, D.C.: National Academy Press.
1999 Grading the Nation's Report Card: Evaluating NAEP and Transforming the Assessment
of Educational Progress, J.W. Pellegrino, L.R. Jones, and K.J. Mitchell, eds. Committee
on the Evaluation of National and State Assessments of Educational Progress, Board on
Testing and Assessment. Washington, D.C.: National Academy Press.
Pallas, A.
1995 Federal data on educational attainment and the transition to work. Pp. 122-155 in Integrating
Federal Statistics on Children: Report of a Workshop. Board on Children and
Families and Committee on National Statistics, National Research Council. Washington,
D.C.: National Academy Press.
Rosen, S., ed.
1974 Final Report of the Panel on Manpower Training Evaluation: The Use of Social Security
Earnings Data for Assessing the Impact of Manpower Training Programs. Washington,
D.C.: National Academy of Sciences.
Scheuren, F.
1995 Administrative Record Opportunities in Educational Survey Research. Report prepared
for the National Center on Educational Statistics. Washington, D.C.: George Washington
University.
Spencer, B.D.
1980 Conducting benefit cost analysis. Pp. 38-59 in R.W. Pearson and R.F. Boruch, eds.
Lecture Notes in Statistics: Survey Research Designs. New York: Springer-Verlag.
Taylor, C.
1997 The Effect of School Expenditures on the Achievement of High School Students: Evi-
dence from NELS and the CCD. Paper presented at the American Educational Research
Association annual meeting, Chicago.
Terhanian, G.
1997 School Policies and Practices, Student Proficiency, and Racial Differences in Proficiency:
Evidence from a Multilevel Analysis of the Reading Proficiency of 4th Graders from
Pennsylvania and New York. Paper presented at the Summer Data Conference of the
National Center for Education Statistics, Washington, D.C.
Homepage. Available at: http://dolphin.upenn.edu/~terhania.
Tufte, E.R.
1990 Envisioning Information. Cheshire, Conn.: Graphics Press.
U.S. General Accounting Office
1986a Computer Matching: Assessing Its Costs and Benefits. Washington, D.C.: U.S. General
Accounting Office.
1986b Computer Matching: Factors Influencing the Agency Decision Making Process. Wash-
ington, D.C.: U.S. General Accounting Office.
Vogel, G.
1997 Publishing sensitive data: Who calls the shots? Science 276:523-526.
Wenglinsky, H.A.
1997 When Money Matters: How Educational Expenditures Improve Student Performance and
When They Don't. Princeton, N.J.: Policy Information Center, Educational Testing
Service.
Williams, N.
1997 How to get databases talking to one another. Science 275:301-330.