8
Discussion and Next Steps
THE NEED FOR CHANGE
The issues and challenges facing federal data collections and the sustainability
of the current system were revisited by several participants at the end of
the workshop. Robert Groves said that the increasing costs of data collections
combined with the possibility of declining budgets are bringing the federal
statistical system to the “edge of chaos,” where a small decline in a statistical
agency’s budget could threaten the existence of entire surveys. He argued that
agencies should work together to develop contingency plans for situations in
which a survey may have to be dropped, thinking about whether the statistical
system collectively would still be able to produce some of the necessary data
after a cut of this type. Robert Kominski voiced a similar concern, saying that
federal statistical agencies tend to make decisions in a methodical and organized
way, based on information available about the past. However, changes in the
environment can happen, and sometimes these changes are quite large.
Graham Kalton went further, suggesting that the system is characterized
by a tendency to maintain the status quo and fear of the possible adverse consequences
of change. He was not sure that questioning the sustainability of the
federal statistical system was warranted, but he agreed that the current surveys
are not in line with many of the current needs described, especially increasing
demand for data at smaller geographic areas and disaggregated for smaller
subgroups to inform more focused policy-making decisions. The growth in this
area has been a trend for many years, and it is time to discuss ways of addressing
these needs.
Katharine Abraham agreed that the increased need for richer information
90 THE FUTURE OF FEDERAL HOUSEHOLD SURVEYS
is evident from the discussions at the workshop. She emphasized that a global
evaluation of the current state and the future of federal household surveys will
involve making some difficult choices and setting priorities.
Kalton argued that approaching the task incrementally is quite appropriate.
Groves said that, although he agrees, he would like to see a vision crystallize in
the near future. Parts of a vision have seemed to emerge during the workshop,
and nailing that down soon would make incremental steps toward a specific
vision possible. Andrew White also urged participants to spell out the intended
goals and line up initiatives with their expected outcomes, especially in light of
the magnitude of the projects discussed.
INTEGRATION OF SURVEY CONTENT
Abraham summarized one of the main themes of the workshop as the
importance of survey content integration. One aspect of this is the use of common
definitions for the concepts measured (to the extent that this is appropriate),
because comparability enables researchers to make better use of the
information available. Kalton said that the discussion of the development of
standardized disability measures was a good example of the benefits, especially
when the questions are set up so that additional measures can be added to
expand the definition of a concept. The main set of questions provides a valuable
benchmark for comparison across surveys.
Abraham argued that making headway in the area of integration of content
would require agencies to work together from the planning stages of a
survey and to collaborate during redesign efforts to determine crucial content.
The burden cannot be placed entirely on the Office of Management and Budget
(OMB). Cynthia Clark recalled her experience working on the United
Nations Global Strategy to Improve Agricultural and Rural Statistics, which
brought together organizations to identify the core data items that needed to
be produced.
Trivellore Raghunathan compared federal statistical agencies to academic
departments, in which researchers are focused on their particular disciplines.
His own work illustrates that bringing together interdisciplinary teams to
address these types of issues works well. This was echoed in Groves’s comments
that people have to stop talking to just themselves and begin a dialogue with
others whom they do not usually think about when they design data collections.
Hal Stern raised the question of whether, given the costs of data collections,
there is information currently collected by federal statistical agencies that
goes beyond what is mandated or widely used. As an “outsider” (an academic),
he said he can afford to raise difficult questions, but his question tied in with
Abraham’s point about addressing priorities and determining collectively which
measures are crucial.
Edward Sondik (National Center for Health Statistics) also saw value in
setting core standards and benchmarks for what represents critical data
in a field. In the area of health, there is an explosion of information, including
data collections funded by the National Institutes of Health, and many of these
data collections do not go through OMB. Private companies are also producing
more and more data. Sondik said that this is not necessarily good or bad, but
the increase in the volume of information from an increasing variety of sources
will require federal statistical agencies to step up and provide an assessment—a
“consumer’s report”—on the quality of these data. This is perhaps an important
future role for the federal system, he said.
Kominski reminded participants that the decentralized nature of the statistical
system is one of its virtues. For example, the high school dropout rate
published by the Census Bureau differs from the one published by the Department
of Education. This reflects differences in terms of what to measure and
how to measure it, and it is not necessarily a problem, but something to consider
when assessing the challenges involved in getting different agencies to
coordinate their measures. He added that it is nevertheless important to ensure
that coordination happens in a systematic way.
Making an argument similar to Sondik’s, he observed that this is particularly
true in light of increasing volumes of data produced outside the federal statistical
system that are receiving substantial attention, in part because they can be
made available much faster than federal data. An example of this is the Google
consumer price index, which is based on the tracking of online price data.
Although the value and potential of these types of data are not clear, there is
little doubt that researchers should at least be paying attention to these alternative
approaches and that the role and usefulness of “official statistics” should
be evaluated in this context as well.
Groves warned that the timeliness of data releases is a particularly big concern,
because federal statistical agencies are out of sync with competing sources
of information. For example, the quality of an alternative price index may be
really poor, but if it is available in real time, then that may be a compelling
argument for some uses. Abraham responded that a lot of the economic data
are released very quickly: for example, the unemployment rate is published on
the first Friday after the month to which the estimates apply, and that is quite
good. Groves agreed that timeliness is relatively good in terms of the economic
data released, even though he questioned why the unemployment data cannot
be published weekly. However, he emphasized that in other areas the lack
of timeliness is a significant problem—for example, in many cases the data
released are two years old. The question becomes whether defensible estimates
could be produced at a higher frequency, even if this requires more resources.
Reflecting on the topic of official statistics, Kominski argued that there are
relatively few statistics that are declared official. Some are used as if they were
official only because there are no alternatives available. However, having more
data on similar concepts typically leads to having to confront the question of
which measures are official.
SMALL-AREA ESTIMATION
Some of the discussion revolved around the need for small-area data and
modeling techniques used to produce estimates when direct estimation is not
possible. Kalton clarified that the challenges in this area are usually a combination
of a small-area and a small-domain problem. If the population of interest
itself is small, as in the case of 5- to 17-year-olds in the Small Area Income and
Poverty Estimates (SAIPE) Program, then the sample size of this population
in a small area will also be very small. In addition, the estimate itself is often
a very small proportion. These factors have consequences for modeling. He
added that it is important to not lose sight of the quality of the auxiliary data
used, because that is more important than the model. For example, there are
distortions introduced if the data are not collected the same way in all areas, as
is the case with the information about free and reduced price lunches.
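
As a rough illustration of the kind of modeling discussed above, the following minimal Python sketch combines a noisy direct survey estimate with a model-based (synthetic) prediction using a precision-based shrinkage weight, in the general spirit of area-level small-area models. It is not a description of the SAIPE procedure; the function names, the assumed model variance, and the example figures are all hypothetical.

```python
def proportion_sampling_variance(p, n):
    """Approximate sampling variance of a proportion p from a simple random sample of size n."""
    return p * (1.0 - p) / n


def shrinkage_weight(model_variance, sampling_variance):
    """Weight placed on the direct estimate; it approaches 1 as the sample grows."""
    return model_variance / (model_variance + sampling_variance)


def composite_estimate(direct, synthetic, model_variance, sampling_variance):
    """Weighted average of the direct estimate and the model-based prediction."""
    gamma = shrinkage_weight(model_variance, sampling_variance)
    return gamma * direct + (1.0 - gamma) * synthetic


if __name__ == "__main__":
    # Hypothetical small area: 20 sampled children, direct poverty rate of 0.15,
    # and a prediction of 0.18 from auxiliary data (all values assumed).
    direct, synthetic, n = 0.15, 0.18, 20
    model_variance = 0.0004  # assumed between-area model variance
    sampling_variance = proportion_sampling_variance(direct, n)
    print("weight on direct estimate:", shrinkage_weight(model_variance, sampling_variance))
    print("composite estimate:", composite_estimate(direct, synthetic, model_variance, sampling_variance))
```

With a very small sample, most of the weight falls on the prediction from auxiliary data, which is one way to see why the quality of those auxiliary data matters so much.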
Concerns were raised related to data users’ willingness to embrace model-
based estimates in the same way they embrace direct estimates. Kominski said
that the procedures involved in SAIPE seem a little bit like “voodoo economics”
to many, but focusing on educating users would go a long way toward
ensuring that these types of estimates are better received. Labeling the estimates
as experimental or research series would also be useful, according to Groves,
who said that people need some relief from the thinking that everything published
by the federal statistical system is official, because that stifles innovation.
Abraham agreed, saying that when statistical agencies have gone out on a limb
in the past and produced what amounts to experimental series, yet explained
what they were doing clearly, the user base followed.
Another concern was the lack of statisticians with the skills required to
implement advanced modeling techniques. Groves said that there is a community
of people around the country who have these skills, as long as agencies are
willing to look outside their existing staff and form alliances.
INTEGRATION OF SAMPLING FRAMES
Another possible direction for integration discussed at the workshop is
coordination among the statistical agencies in the area of sampling frames.
Clark argued that the time is right to consider the idea of a common sampling
frame, and the Census Bureau’s Master Address File (MAF) represents a starting
point to consider. Although sharing information from the MAF outside
the Census Bureau is subject to confidentiality restrictions, it is important to
consider whether some parts of it are not subject to these restrictions and could
be made available to other agencies under some kind of agreement. One source
of input to the MAF is the U.S. Postal Service’s Delivery Sequence File (DSF);
perhaps the Census Bureau could add information to it and make that product
available to others.
Kalton recalled the Canadian example of the address register that is continuously
updated, in part through their labor force survey. What if the United
States were to bring together all of its surveys to improve an overall address
frame that everyone in the statistical community could benefit from, possibly
even beyond the federal statistical agencies?
Groves said that thinking about the continuous updating of the address
frame does not have to be limited to the updating of addresses. Instead, it
could be conceptualized as a collection of observable auxiliary data about the
addresses, and various organizations could contribute information to it. Kalton
added that if some of the data come from sources other than government agencies,
the limitations could be different. For example, faster delivery times could
be possible, and the confidentiality restrictions may also be less stringent.
THE ROLE OF THE AMERICAN COMMUNITY SURVEY
The discussions of both integration of content and sampling frames circled
back to the American Community Survey (ACS) on a number of occasions.
Clark said that the most important function of the ACS is to provide estimates
for small areas and that it is in fact the only good source of direct estimates for
small geographies. Nevertheless, other promising uses mentioned at the workshop
could certainly be discussed further.
Abraham summarized the discussion about one possible use of the ACS as
a more integrated household survey, with a set of rotated modules. This could
increase efficiencies and lead to data that serve a broader array of analytic purposes.
Clark talked about the possibility of using the ACS to help other agencies
test and develop new modules. However, there are some obvious challenges
emphasized by Abraham, including the burden placed on ACS respondents,
the survey’s inability to collect information that is comparable in depth to topic-
specific surveys, and practical barriers that were brought up by the ACS team.
The possibility of using the ACS as a sampling frame for other surveys
was also discussed. Clark said that this model works well in the National Agricultural
Statistics Service; the Census of Agriculture accommodates screening
questions for other surveys. This approach has enabled them to meet emerging
needs, such as measuring bioenergy and organic production. However, she
mentioned that the ACS itself in its current form has some weaknesses when it
comes to rural populations, and it would not be a suitable screener for a study
focused on rural America.
Kalton would have especially liked further discussion about the idea of the
ACS providing sample on a rolling basis. Currently, one year’s worth of ACS
data has to be processed before the National Science Foundation can receive
sample for the National Survey of College Graduates (NSCG), for example. He
acknowledged that providing sample on a rolling basis would involve additional
data management tasks, but he thought that it was an idea worth discussing.
He would have also liked more discussion of the issue of misclassification in
the sample provided and how it affects a sampling design that involves rare
populations.
Stern brought up the point that the ACS collects a lot of data that are not
released for small areas, except after five years of aggregation. He wondered
whether some of the data available could at least be used for modeling purposes,
even if they are not released.
Kominski said that the ACS appeared to emerge as the silver bullet from
many of the discussions, and this is perhaps not surprising given that it is a massive
data system and most people have not even fully considered the power of
the five-year estimates, released on a yearly basis. Even with the overlap across
the data contained in those releases, 10 or 15 years of these estimates will have
huge potential. However, he cautioned against limiting the thinking about the
future of the federal statistical system to the ACS, especially in terms of pursuing
the idea of adding modules to the survey. He used the example of the Current
Population Survey (CPS),
which does have supplements, but the space is booked for every month for
the next three years. The CPS has been routinely used for the past 40 years by
researchers both inside and outside the government as the staging ground for
many new ideas and problems to be measured, and the process has been fairly
efficient, but it is not an elegant method and not necessarily something that
should be transferred to the ACS.
Scott Boggess (Census Bureau) reminded the workshop participants of everything
the ACS is already doing. He pointed out that the ACS does in about four
months what the 2000 census took approximately two years to do, and it does
it with fewer resources. In addition to the long-awaited five-year estimates, they
have been producing one- and three-year estimates, redesigned their weighting
approach to improve variances at the tract level, redesigned their data products,
developed a Spanish-language questionnaire, and added Puerto Rico and group
quarters to the sample. The ACS is fast and responsive, he said, but he also made
the point that it takes a long time to change an entire system.
Kalton said that the many ideas that emerged during the workshop made
him question whether another survey is needed to accomplish the goals discussed.
After all, the ACS has to fulfill its mandated roles before doing anything
else.
ADMINISTRATIVE RECORDS
Participants were encouraged by the progress reported by Rochelle
Martinez from OMB in the area of administrative records use. Clark mentioned
that, while she was at the Census Bureau, she and her colleagues started the
Statistical Administrative Records System (StARS) database, and it would be
of great value if that could be made available to other agencies. Some obvious
uses for administrative records are direct use, imputation, verification of data,
and covariates in models, but there may be others and it is important to think
broadly, she said.
Kalton added that administrative records can represent a source of longitudinal
data, sometimes with information available before and after the time
of the survey data collection. The Panel Study of Income Dynamics (PSID)
and the Health and Retirement Study (HRS), for example, use Social Security
data to chart income patterns over respondents’ lifetimes. Jean-Louis Tambay
encouraged the participants to imagine meeting five years from now and to
identify current opportunities related to administrative records that, looking
back, it would seem a real shame to have missed.
Regarding the use of administrative records abroad, Kalton commented
that Julie Trépanier’s presentation about the use of tax records in Canada was
an example of a use that reduces respondent burden and is communicated to
the respondent as such. The presentation by Jelke Bethlehem about the population
register in the Netherlands led to a lot of debate during the workshop,
and Kalton encouraged the participants to continue that dialogue, even if a
register is unlikely to be implemented in the United States in a similar form.
Stern made a similar argument, saying that it is difficult to imagine that there
would be political will in the United States for implementing something similar
to what other countries are doing with administrative records, but that does
not preclude it from discussion, because registers have the potential to offer
enormous cost savings.
BROADER INTEGRATION OF DATA COLLECTIONS
A lot of the discussion centered on the more ambitious notion of integration
advanced by Raghunathan, who used his own work to illustrate a way
of thinking about a research problem in terms of a matrix of the information
necessary to address it. Missing pieces in the matrix can be filled in with data
from a variety of sources and combined using the latest modeling techniques.
The analogy he drew to the statistical system as a whole generated a lot of
discussion.
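
To make the matrix idea more concrete, here is a minimal, purely illustrative Python sketch: one hypothetical source observes both a covariate `x` and an outcome `y`, a second source observes only `x`, and a simple regression fitted on the first source fills the missing cell in the second. The record layout, field names, and the choice of a straight-line model are assumptions for illustration and do not describe Raghunathan's actual methods.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fill_missing(records, slope, intercept):
    """Return copies of the records with missing y filled from the fitted line.

    Each filled record is flagged so that modeled values stay distinguishable
    from observed ones.
    """
    filled = []
    for rec in records:
        rec = dict(rec)
        if rec.get("y") is None:
            rec["y"] = intercept + slope * rec["x"]
            rec["y_source"] = "model"
        else:
            rec["y_source"] = "observed"
        filled.append(rec)
    return filled


if __name__ == "__main__":
    # Hypothetical source A observes both x and y; source B observes x only.
    source_a = [{"x": 1.0, "y": 3.0}, {"x": 2.0, "y": 5.0}, {"x": 3.0, "y": 7.0}]
    source_b = [{"x": 4.0, "y": None}]
    slope, intercept = fit_line([r["x"] for r in source_a], [r["y"] for r in source_a])
    for rec in fill_missing(source_a + source_b, slope, intercept):
        print(rec)
```

A single filled-in value like this understates uncertainty; a real application would likely need to carry the uncertainty of the modeled cells through to any estimates built on them.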
Abraham said that the concept of the statistical system as a giant matrix
with interlocking pieces was intriguing, because it perhaps presents a solution
to the dilemma of not being able to obtain all the data needed from one
survey, as well as to the difficulties related to combining information from
surveys that have evolved independently of each other. She emphasized that
implementing something similar would require a more global way of thinking
about the household surveys in the federal system. Roderick Little added that
what is necessary is a new way of thinking about survey design and the associated
analysis that goes beyond concentrating efforts on the specific survey one
happens to be working on.
In Abraham’s view, an overarching model, such as the matrix idea, would
provide additional incentive for a discussion about what types of estimates
are appropriate for federal statistical agencies to be generating. Sondik added
that the lack of resources and capacity to produce needed small-area estimates
should focus attention on defining core measures and indicators.
Kalton observed that Don Dillman’s discussion of mixed-mode surveys
becomes especially relevant in the context of integration among surveys.
Although research has explored the effects of mixed-mode data collection
within a survey, less is known about the consequences of combining data from
two surveys that are conducted through different modes. The discussion of the
disability measures illustrated that estimates are not necessarily the same, even
when the questions are the same, and this could in part be due to a mode effect.
Kalton made the point that surveys that use other surveys as a source of
sampling for rare populations could make better use of the information available
from the source if there was more attention paid to coordinating content
as well. In other words, if the new survey was thought of as an extension of the
existing survey, then the data could be combined and used for purposes beyond
what is possible with the individual surveys.
Thinking about the possibilities of linking surveys can extend beyond
research domains, according to Stern. He made the point that currently surveys
that rely on other surveys as a source of sample tend to do so within the same
domain. An example of this is the relationship between the Medical Expenditure
Panel Survey (MEPS) and the National Health Interview Survey (NHIS).
Other major benefits are possible in looking beyond the institutional boundaries
and to other disciplines.
According to Sondik, a report on developing key national indicators for
children (which recognized that accomplishing this goal required going beyond
established domains) is an example that could apply in a variety of areas,
including health, education, and the economic situation. This recognition could
inform more of what is done and lead to a focus on the critical information
needed to serve as benchmarks. For example, the NHIS could also pick up
basic information related to education and housing, in addition to its current
content.
Abraham said that the initiatives in the area of administrative records also
fit well with this model if one thinks beyond survey integration to envision
data integration, in which administrative records are contributing an important
piece. She encouraged the participants to be bold in moving forward.