THE FUTURE OF FEDERAL HOUSEHOLD SURVEYS

8 Discussion and Next Steps

THE NEED FOR CHANGE

The issues and challenges facing federal data collections and the sustainability of the current system were revisited by several participants at the end of the workshop. Robert Groves said that the increasing costs of data collections combined with the possibility of declining budgets are bringing the federal statistical system to the “edge of chaos,” where a small decline in a statistical agency’s budget could threaten the existence of entire surveys. He argued that agencies should work together to develop contingency plans for situations in which a survey may have to be dropped, thinking about whether the statistical system collectively would still be able to produce some of the necessary data after a cut of this type.

Robert Kominski voiced a similar concern, saying that federal statistical agencies tend to make decisions in a methodical and organized way, based on information available about the past. However, changes in the environment can happen, and sometimes these changes are quite large.

Graham Kalton went further, suggesting that the system is characterized by a tendency to maintain the status quo and fear of the possible adverse consequences of change. He was not sure that questioning the sustainability of the federal statistical system was warranted, but he agreed that the current surveys are not in line with many of the current needs described, especially the increasing demand for data for smaller geographic areas and disaggregated for smaller subgroups to inform more focused policy-making decisions. The growth in this area has been a trend for many years, and it is time to discuss ways of addressing these needs.

Katharine Abraham agreed that the increased need for richer information
is evident from the discussions at the workshop. She emphasized that a global evaluation of the current state and the future of federal household surveys will involve making some difficult choices and setting priorities.

Kalton argued that approaching the task incrementally is quite appropriate. Groves said that, although he agrees, he would like to see a vision crystallize in the near future. Parts of a vision seemed to emerge during the workshop, and nailing it down soon would make incremental steps toward a specific vision possible. Andrew White also urged participants to spell out the intended goals and line up initiatives with their expected outcomes, especially in light of the magnitude of the projects discussed.

INTEGRATION OF SURVEY CONTENT

Abraham summarized one of the main themes of the workshop as the importance of survey content integration. One aspect of this is the use of common definitions for the concepts measured, to the extent that this is appropriate, because comparability enables researchers to make better use of the information available. Kalton said that the discussion of the development of standardized disability measures was a good example of the benefits, especially when the questions are set up so that additional measures can be added to expand the definition of a concept. The main set of questions provides a valuable benchmark for comparison across surveys.

Abraham argued that making headway in the area of integration of content would require agencies working together from the planning stages of a survey and collaborating during redesign efforts to determine crucial content. The burden cannot be placed entirely on the Office of Management and Budget (OMB).

Cynthia Clark recalled her experience working on the United Nations Global Strategy to Improve Agricultural and Rural Statistics, which brought together organizations to identify the core data items that needed to be produced.

Trivellore Raghunathan compared federal statistical agencies to academic departments, in which researchers are focused on their particular disciplines. His own work illustrates that bringing together interdisciplinary teams to address these types of issues works well. This was echoed in Groves’s comments that people have to stop talking to just themselves and begin a dialogue with others whom they do not usually think about when they design data collections.

Hal Stern raised the question of whether, given the costs of data collections, there is information currently collected by federal statistical agencies that goes beyond what is mandated or widely used. As an “outsider” (an academic), he said he can afford to raise difficult questions, but his question tied in with Abraham’s point about addressing priorities and determining collectively which measures are crucial.

Edward Sondik (National Center for Health Statistics) also sees value in setting core standards and benchmarks for what represents critical data in a field. In the area of health, there is an explosion of information, including data collections funded by the National Institutes of Health, and many of these data collections do not go through OMB. Private companies are also producing more and more data. Sondik said that this is not necessarily good or bad, but the increase in the volume of information from an increasing variety of sources will require federal statistical agencies to step up and provide an assessment, a “consumer’s report,” on the quality of these data. This is perhaps an important future role for the federal system, he said.

Kominski reminded participants that the decentralized nature of the statistical system is one of its virtues. For example, the high school dropout rate published by the Census Bureau differs from the one published by the Department of Education. This reflects differences in terms of what to measure and how to measure it, and it is not necessarily a problem, but something to consider when assessing the challenges involved in getting different agencies to coordinate their measures. He added that it is nevertheless important to ensure that coordination happens in a systematic way.

Making an argument similar to Sondik’s, he observed that this is particularly true in light of increasing volumes of data produced outside the federal statistical system that are receiving substantial attention, in part because they can be made available much faster than federal data. An example of this is the Google consumer price index, which is based on the tracking of online price data. Although the value and potential of these types of data are not clear, there is little doubt that researchers should at least be paying attention to these alternative approaches and that the role and usefulness of “official statistics” should be evaluated in this context as well.

Groves warned that the timeliness of data releases is a particularly big concern, because federal statistical agencies are out of sync with competing sources of information. For example, the quality of an alternative price index may be really poor, but if it is available in real time, then that may be a compelling argument for some uses. Abraham responded that a lot of the economic data are released very quickly: for example, the unemployment rate is published on the first Friday after the month to which the estimates apply, and that is quite good. Groves agreed that timeliness is relatively good in terms of the economic data released, even though he questioned why the unemployment data cannot be published weekly. However, he emphasized that in other areas the lack of timeliness is a significant problem: in many cases, for example, the data released are two years old. The question becomes whether defensible estimates could be produced at a higher frequency, even if this requires more resources.

Reflecting on the topic of official statistics, Kominski argued that there are relatively few statistics that are declared official. Some are used as if they were official only because there are no alternatives available. However, having more
data on similar concepts typically leads to having to confront the question of which measures are official.

SMALL-AREA ESTIMATION

Some of the discussion revolved around the need for small-area data and modeling techniques used to produce estimates when direct estimation is not possible. Kalton clarified that the challenges in this area are usually a combination of a small-area and a small-domain problem. If the population of interest itself is small, as in the case of 5- to 17-year-olds in the Small Area Income and Poverty Estimates (SAIPE) Program, then the sample size of this population in a small area will also be very small. In addition, the estimate itself is often a very small proportion. These factors have consequences for modeling. He added that it is important not to lose sight of the quality of the auxiliary data used, because that is more important than the model. For example, there are distortions introduced if the data are not collected the same way in all areas, as is the case with the information about free and reduced-price lunches.

Concerns were raised related to data users’ willingness to embrace model-based estimates in the same way they embrace direct estimates. Kominski said that the procedures involved in SAIPE seem a little bit like “voodoo economics” to many, but focusing on educating users would go a long way toward ensuring that these types of estimates are better received. Labeling the estimates as experimental or research series would also be useful, according to Groves, who said that people need some relief from the thinking that everything published by the federal statistical system is official, because that stifles innovation. Abraham agreed, saying that when statistical agencies have gone out on a limb in the past and produced what amounts to experimental series, yet explained what they were doing clearly, the user base followed.

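The small-area and small-domain problem Kalton describes can be made concrete with a little arithmetic. The sketch below is only illustrative: it assumes simple random sampling with no design effect, uses made-up numbers, and combines a direct estimate with a model-based one through a simple variance-weighted (composite) average. It is not the SAIPE methodology, and the function names `direct_se` and `composite_estimate` are hypothetical.

```python
import math


def direct_se(p, n):
    """Standard error of a proportion p from n cases, assuming simple
    random sampling and no design effect (a simplifying assumption)."""
    return math.sqrt(p * (1.0 - p) / n)


def composite_estimate(direct, direct_var, model_based, model_var):
    """Variance-weighted average of a direct and a model-based estimate.

    gamma is the weight given to the direct estimate. It moves toward 1
    when the direct estimate is precise and toward 0 when it is noisy.
    """
    gamma = model_var / (model_var + direct_var)
    return gamma * direct + (1.0 - gamma) * model_based, gamma


# All numbers below are hypothetical and chosen only for illustration.
area_sample = 400                        # sampled persons in one small area
domain_sample = area_sample * 17 // 100  # persons in the domain of interest
p_direct = 0.15      # direct estimate of the proportion within the domain
p_model = 0.12       # prediction from a model that uses auxiliary data
model_var = 0.0005   # assumed variance of the model-based prediction

se = direct_se(p_direct, domain_sample)
cv = se / p_direct   # coefficient of variation of the direct estimate
estimate, gamma = composite_estimate(p_direct, se ** 2, p_model, model_var)

print(f"domain sample size = {domain_sample}")
print(f"direct SE = {se:.4f}, CV = {cv:.1%}")
print(f"weight on direct estimate = {gamma:.2f}, composite = {estimate:.4f}")
```

With a domain sample of only a few dozen cases and a small proportion to estimate, the direct estimate is noisy relative to its size, so the weight shifts toward the model-based value. This is one way to see why the quality of the auxiliary data feeding the model matters so much.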
Another concern was the lack of statisticians with the skills required to implement advanced modeling techniques. Groves said that there is a community of people around the country who have these skills, as long as agencies are willing to look outside their existing staff and form alliances.

INTEGRATION OF SAMPLING FRAMES

Another possible direction for integration discussed at the workshop is coordination among the statistical agencies in the area of sampling frames. Clark argued that the time is right to consider the idea of a common sampling frame, and the Census Bureau’s Master Address File (MAF) represents a starting point to consider. Although sharing information from the MAF outside the Census Bureau is subject to confidentiality restrictions, it is important to consider whether some parts of it are not subject to these restrictions and could be made available to other agencies under some kind of agreement. One source
of input to the MAF is the U.S. Postal Service’s Delivery Sequence File (DSF); perhaps the Census Bureau could add information to it and make that product available to others.

Kalton recalled the Canadian example of the address register that is continuously updated, in part through their labor force survey. What if the United States were to bring together all of its surveys to improve an overall address frame that everyone in the statistical community could benefit from, possibly even beyond the federal statistical agencies?

Groves said that thinking about the continuous updating of the address frame does not have to be limited to the updating of addresses. Instead, it could be conceptualized as a collection of observable auxiliary data about the addresses, and various organizations could contribute information to it. Kalton added that if some of the data come from sources other than government agencies, the limitations could be different. For example, faster delivery times could be possible, and the confidentiality restrictions may also be less stringent.

THE ROLE OF THE AMERICAN COMMUNITY SURVEY

The discussions of both integration of content and sampling frames circled back to the American Community Survey (ACS) on a number of occasions. Clark said that the most important function of the ACS is to provide estimates for small areas and that it is in fact the only good source of direct estimates for small geographies. Nevertheless, other promising uses mentioned at the workshop could certainly be discussed further.

Abraham summarized the discussion about one possible use of the ACS as a more integrated household survey, with a set of rotated modules. This could increase efficiencies and lead to data that serve a broader array of analytic purposes. Clark talked about the possibility of using the ACS to help other agencies test and develop new modules.
However, there are some obvious challenges emphasized by Abraham, including the burden placed on ACS respondents, the survey’s inability to collect information that is comparable in depth to topic-specific surveys, and practical barriers that were brought up by the ACS team.

The possibility of using the ACS as a sampling frame for other surveys was also discussed. Clark said that this model works well in the National Agricultural Statistics Service; the Census of Agriculture accommodates screening questions for other surveys. This approach has enabled them to meet emerging needs, such as measuring bioenergy and organic production. However, she mentioned that the ACS itself in its current form has some weaknesses when it comes to rural populations, and it would not be a suitable screener for a study focused on rural America.

Kalton would have especially liked further discussion about the idea of the ACS providing sample on a rolling basis. Currently, one year’s worth of ACS data has to be processed before the National Science Foundation can receive
sample for the National Survey of College Graduates (NSCG), for example. He acknowledged that providing sample on a rolling basis would involve additional data management tasks, but he thought that it was an idea worth discussing. He would have also liked more discussion of the issue of misclassification in the sample provided and how it affects a sampling design that involves rare populations.

Stern brought up the point that the ACS collects a lot of data that are not released for small areas, except after five years of aggregation. He wondered whether some of the data available could at least be used for modeling purposes, even if they are not released.

Kominski said that the ACS appeared to emerge as the silver bullet from many of the discussions, and this is perhaps not surprising given that it is a massive data system and most people have not even fully considered the power of the five-year estimates, released on a yearly basis. Even with the overlap across the data contained in those releases, 10 or 15 years of these estimates will have huge potential. However, he cautioned against limiting the thinking about the future of the federal statistical system to the ACS, especially in terms of pursuing the idea of adding modules to the survey. He used the example of the Current Population Survey (CPS), which does have supplements, but the space is booked for every month for the next three years. The CPS has been routinely used for the past 40 years by researchers both inside and outside the government as the staging ground for many new ideas and problems to be measured, and the process has been fairly efficient, but it is not an elegant method and not necessarily something that should be transferred to the ACS.

Scott Boggess (Census Bureau) reminded the workshop participants of everything the ACS is already doing.
He pointed out that the ACS does in about four months what the 2000 census took approximately two years to do, and it does it with fewer resources. In addition to the long-awaited five-year estimates, they have been producing one- and three-year estimates, redesigned their weighting approach to improve variances at the tract level, redesigned their data products, developed a Spanish-language questionnaire, and added Puerto Rico and group quarters to the sample. The ACS is fast and responsive, he said, but he also made the point that it takes a long time to change an entire system.

Kalton said that the many ideas that emerged during the workshop made him question whether another survey is needed to accomplish the goals discussed. After all, the ACS has to fulfill its mandated roles before doing anything else.

ADMINISTRATIVE RECORDS

Participants were encouraged by the progress reported by Rochelle Martinez from OMB in the area of administrative records use. Clark mentioned that, while she was at the Census Bureau, she and her colleagues started the
Statistical Administrative Records System (StARS) database, and it would be of great value if that could be made available to other agencies. Some obvious uses for administrative records are direct use, imputation, verification of data, and covariates in models, but there may be others and it is important to think broadly, she said.

Kalton added that administrative records can represent a source of longitudinal data, sometimes with information available before and after the time of the survey data collection. The Panel Study of Income Dynamics (PSID) and the Health and Retirement Study (HRS), for example, use Social Security data to chart income patterns over respondents’ lifetimes.

Jean-Louis Tambay encouraged the participants to imagine meeting five years from now and to identify current opportunities related to administrative records that, looking back from the future, it will seem a real shame to have missed.

Regarding the use of administrative records abroad, Kalton commented that Julie Trépanier’s presentation about the use of tax records in Canada was an example of a use that reduces respondent burden and is communicated to the respondent as such. The presentation by Jelke Bethlehem about the population register in the Netherlands led to a lot of debate during the workshop, and Kalton encouraged the participants to continue that dialogue, even if a register is unlikely to be implemented in the United States in a similar form. Stern made a similar argument, saying that it is difficult to imagine that there would be political will in the United States for implementing something similar to what other countries are doing with administrative records, but that does not preclude it from discussion, because registers have the potential to offer enormous cost savings.

BROADER INTEGRATION OF DATA COLLECTIONS

A lot of the discussion centered on the more ambitious notion of integration advanced by Raghunathan, who used his own work to illustrate a way of thinking about a research problem in terms of a matrix of the information necessary to address it. Missing pieces in the matrix can be filled in with data from a variety of sources and combined using the latest modeling techniques. The analogy he drew to the statistical system as a whole generated a lot of discussion.

Abraham said that the concept of the statistical system as a giant matrix with interlocking pieces was intriguing, because it perhaps presents a solution to the dilemma of not being able to obtain all the data needed from one survey, as well as to the difficulties related to combining information from surveys that have evolved independently of each other. She emphasized that implementing something similar would require a more global way of thinking about the household surveys in the federal system. Roderick Little added that what is necessary is a new way of thinking about survey design and the associated analysis that goes beyond concentrating efforts on the specific survey one happens to be working on.

In Abraham’s view, an overarching model, such as the matrix idea, would provide additional incentive for a discussion about what types of estimates are appropriate for federal statistical agencies to be generating. Sondik added that the lack of resources and capacity to produce needed small-area estimates should focus attention on defining core measures and indicators.

Kalton observed that Don Dillman’s discussion of mixed-mode surveys becomes especially relevant in the context of integration among surveys. Although research has explored the effects of mixed-mode data collection within a survey, less is known about the consequences of combining data from two surveys that are conducted through different modes. The discussion of the disability measures illustrated that estimates are not necessarily the same, even when the questions are the same, and this could in part be due to a mode effect.

Kalton made the point that surveys that use other surveys as a source of sampling for rare populations could make better use of the information available from the source if more attention were paid to coordinating content as well. In other words, if the new survey were thought of as an extension of the existing survey, then the data could be combined and used for purposes beyond what is possible with the individual surveys.

Thinking about the possibilities of linking surveys can extend beyond research domains, according to Stern. He made the point that currently surveys that rely on other surveys as a source of sample tend to do so within the same domain. An example of this is the relationship between the Medical Expenditure Panel Survey (MEPS) and the National Health Interview Survey (NHIS). Other major benefits are possible in looking beyond the institutional boundaries and to other disciplines.

According to Sondik, a report on developing key national indicators for children, which recognized that accomplishing this goal required going beyond established domains, is an example that could apply in a variety of areas, including health, education, and the economic situation. This recognition could inform more of what is done and lead to a focus on the critical information needed to serve as benchmarks. For example, the NHIS could also pick up basic information related to education and housing, in addition to its current content.

Abraham said that the initiatives in the area of administrative records also fit well with this model if one thinks beyond survey integration to envision data integration, in which administrative records are contributing an important piece. She encouraged the participants to be bold in moving forward.