2

The Federal Household Survey System at a Crossroads

To set the stage for the workshop, the first session provided background on the current state of some of the major federal household surveys in the United States and outside perspectives on how other nations handle many similar difficulties in household data collection. The first talk in this session focused on a review of the current U.S. federal household data collection system. Subsequent talks presented foreign case studies: the current United Kingdom (U.K.) model for survey integration; the case of the Netherlands, which relies less on household surveys and more on official population registers; and Canada’s use of a multipronged approach to improve efficiencies, including establishing a corporate business architecture and developing a strategy of survey integration. The international examples of survey data collection served to open up a broader discussion about data collection approaches to consider.

FEDERAL HOUSEHOLD DATA COLLECTIONS IN THE UNITED STATES

Katharine Abraham (University of Maryland) highlighted three major aspects of the federal statistical system: (1) the current survey environment is difficult, (2) data users have become more demanding of survey data, and (3) the system is searching for solutions. Specifically, she described several data collection challenges that have contributed to making the current survey environment increasingly difficult. One of these issues is the quality of survey frames. Survey practitioners and researchers agree that, generally, household survey frames provide poor coverage of several important segments of the population.








Another issue is that it has become increasingly difficult to reach respondents. It is also increasingly difficult, once people are reached, to convince them to grant an interview. Finally, increasing concerns about privacy and confidentiality have exacted a toll on survey participation.

Coverage patterns in many federal household surveys are evidence that survey frames are not always adequate to reach a representative sample of the population. As Abraham noted, coverage ratios for personal visit surveys tend to be lower for black respondents than for nonblack ones; they are lower for men than women; and they vary systematically by age. Despite coverage ratios that generally trended downward from 2000 to 2008, coverage ratios for the American Community Survey (ACS) have, by contrast, been higher and more stable than those of other Census Bureau surveys. To help combat the coverage problem, the Census Bureau, in its 2010 survey redesign process, decided to use the continually updated Master Address File (MAF), the frame the ACS uses, as the frame for its other current surveys. The use of the MAF will begin with the 2014 surveys.

Another problem creating challenges in the survey environment is the increasing difficulty of contact with survey respondents. Gated communities restrict access to respondents for in-person interviews and nonresponse follow-up. The use of voicemail and caller ID helps respondents avoid contact with an interviewer in telephone surveys: they can let calls go to voicemail or not answer calls from numbers they do not recognize on their caller ID display. The number of cell-phone-only households has risen sharply in the past 10 years and continues on an upward trend, thus making an initial contact through a telephone frame more difficult in the case of these households.

Obtaining respondent cooperation has become increasingly difficult. Abraham explained that increasing demands on respondents’ time, such as long commute times and increasing numbers of telephone solicitations, make respondents less likely to cooperate with an interview request. Furthermore, survey requests, such as from the federal government, compete with multiple other surveys and sales solicitations for the already limited time and interest of potential respondents. Finally, pervasive concerns about privacy and confidentiality among many in U.S. society hinder survey participation. It is not only the federal government and its data collection contractors that suffer from an increasingly unfriendly and costly climate for surveys; other survey research organizations are also encountering similar problems.

In addition to an increasing unwillingness to participate in surveys, there is also evidence of rising item nonresponse within surveys. As an example, Abraham cited a study by Bollinger and Hirsch (2006) showing that item nonresponse has increased on the Current Population Survey’s usual weekly earnings question. Increased item nonresponse is further evidenced by increasing imputation rates on questions of wages and salaries. By 2000-2004, imputation rates for weekly earnings were up to about 30 percent for survey respondents.
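Two of the indicators cited above, coverage ratios and imputation rates, are simple ratios. The sketch below is only illustrative. It assumes the usual reading of a coverage ratio (the weighted survey estimate of a group, before adjustment to population controls, divided by an independent population estimate for the same group) and of an imputation rate (the share of respondents whose answer to an item was missing and had to be filled in). The function names and all numbers are hypothetical and are not taken from the workshop presentation.

```python
def coverage_ratio(weighted_survey_total, population_control):
    """Weighted survey estimate of a group divided by an independent control total."""
    return weighted_survey_total / population_control


def imputation_rate(values):
    """Share of item values that are missing (None) and would need imputation."""
    missing = sum(1 for v in values if v is None)
    return missing / len(values)


# Hypothetical totals: a ratio below 1.0 means the frame under-covers the group.
coverage = {
    "men": coverage_ratio(880_000, 1_000_000),
    "women": coverage_ratio(930_000, 1_000_000),
}

# Hypothetical weekly earnings reports; None marks item nonresponse.
earnings = [850.0, None, 1200.0, 640.0, None, 990.0, 710.0, None, 1500.0, 820.0]
earnings_imputation_rate = imputation_rate(earnings)
```

Tracking both ratios by demographic group over time is one way to see the patterns described in the text, such as lower coverage for men than for women.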

Next, Abraham briefly discussed the increasing demands from increasingly more sophisticated data users. Data users tend to demand more timely and comprehensive data. Many have pushed for more detailed data, that is, data on small geographic areas and population subgroups. There has also been a call in the data user community for better integration of estimates (e.g., income, disability, poverty) from different sources.

Agencies have used multiple strategies to increase or maintain current survey response rates. Some surveys use advance notification mail materials or offer multiple modes for response. Other means used are increasing the number of contact attempts with respondents, improving interviewer training, and, in the case of the ACS, making the survey mandatory. Some surveys offer incentives for participation. Abraham noted, however, that the evidence of the effectiveness of any of these methods is limited, and their use comes with increased survey costs.

In addition to these strategies, Abraham laid out possible actions that agencies could take to meet the challenges facing federal household surveys. Although the last two years have seen an increase in funding for some statistical agencies, it is unlikely that increases will continue, particularly in the current political climate with calls for reduced government spending, which makes it even more important to look for ways of increasing efficiencies.

Frame improvement is one area in which agencies are attempting to identify opportunities for increased efficiency. As mentioned earlier, the Census Bureau will begin using the MAF for many of its personal visit surveys. In addition, the ACS will be used to provide stratifications for sample designs by providing more current information on the characteristics of geographic areas. Abraham asked if, in addition to this change, the ACS should be used directly as a sample frame itself.

Other frame improvement ideas include incorporating cell phone numbers into random digit dialing (RDD) samples. The use of the Internet for survey administration would be most cost-effective; however, there is not yet any agreed-on methodology for creating a frame for online surveys. While online surveys remain an attractive prospect for survey administrations, Abraham stated that more work is needed on how the web option can be most effectively presented and on ensuring web-reporting data quality.

Administrative records are another avenue agencies are pursuing for use as sampling frames, as survey benchmarks, as sources of auxiliary data for model-based estimates, and for direct analysis. This is a promising area for future research, Abraham said, but she added a word of caution about treating administrative records as the “gold standard” of data, because little is known of their error properties.

Better methodologies could be explored for use to reduce nonresponse and imputation rates. For example, paradata (i.e., data automatically generated by electronic data collection tools about the survey process) and better survey frames could aid in improving nonresponse adjustment. Of particular interest is the potential role of the ACS, or some other large data set, as a sampling frame. This could provide better information on both respondents and nonrespondents, information that could be used for better adjustments.

Model-based estimates are another methodology to make greater use of. These have become increasingly accepted as a viable alternative to direct estimates, particularly as direct estimates for small areas become prohibitively expensive. The ACS is important here, too, in that it may be a valuable source of auxiliary information for use in small-domain models.

Outside the technical aspects of federal household surveys, it is worth considering the organizational environment in which these surveys are conducted. Improved interagency cooperation and coordination are essential. For example, the Census Bureau could facilitate this by more transparent cost accounting for client agencies, giving agencies greater input on infrastructure decisions that affect their surveys, as well as giving them broader access to frames and survey data that are important to accomplishing agency missions. Title 13 of the U.S. Code (the law that guarantees the confidentiality of census information) is a factor that must always be considered with respect to who gets access to what data. Yet it would be extremely valuable to client agencies to have access to the sampling frames used for their surveys and to have more access to the information that is collected, particularly if an agency wished to go back to a set of respondents.

Clearly, federal statistical agencies face an increasingly difficult environment for collecting data as well as growing demands with respect to the data that are collected. A substantial amount of research is being done to meet these challenges, but strong interagency collaboration is going to be critical to efficiently implement the new ideas coming out of this research.

SURVEY HARMONIZATION IN THE UNITED KINGDOM

Cynthia Clark (National Agricultural Statistics Service) presented an overview of the U.K.’s approach to household survey harmonization in government surveys. Paul Smith from the U.K. Office for National Statistics (ONS), the author of the presentation, and one of the prime contributors to the work on the U.K. Integrated Household Survey (IHS), was not able to attend the workshop. Clark explained that the focus of the presentation is on the original design of the IHS but includes a discussion of the challenges the United Kingdom has faced related to the design over the years.

Responding to many of the same pressures that confront household surveys in the United States and as part of the U.K.’s survey modernization program, the ONS developed an Integrated Household Survey design. The basic concept was to develop a framework in which multiple household surveys could be integrated into a common design. In the United Kingdom, household surveys have developed independently, much like in the United States. Each had different objectives and different methodologies for obtaining the ideal survey sample for a given topic area. For example, the Labour Force Survey (LFS) is not a clustered design, whereas many of the ONS’s other household surveys are clustered. The integrated design increases the sample size for core variables by asking them on all the component surveys.

The design of the IHS relies on the use of modules formed from four existing continuous household surveys: the LFS (including some regional supplementary surveys), which serves as the IHS survey core and provides the majority of sample cases (200,000 households); the General Lifestyle Survey (formerly the General Household Survey); the Living Cost and Food Survey (formerly the Expenditure and Food Survey); and the Opinions Survey (formerly the Omnibus Survey). After the original modular design incorporated these four surveys, others, such as the English Household Survey, were added. The idea behind the modules was to standardize concepts and questions across the surveys. In its current form, the survey sample includes 265,000 households and uses a staged approach.

Figure 2-1 shows the modular structure of the surveys. The vertical axis on the graph represents the sample cases, and the horizontal axis the different modules and interview length. All interviews include the core survey, followed by a rotating core. The remaining modules represent different surveys presented to different respondents. Parts of the sample are visited quarterly over five quarters, parts are visited annually over four years, and parts are visited only once.

FIGURE 2-1 Illustrative diagram of a modular continuous population survey.
SOURCE: Workshop presentation by Cynthia Clark based on Office for National Statistics public sector information licensed under the U.K. Open Government Licence v1.0.

Such an undertaking, Clark noted, relies on several critical assumptions about changes. First, the flexibility of the field staff must be increased, and interviewers have to be trained to do all interview types. Surveys with an original clustered design are ideally unclustered to be joined with the core LFS, which has benefits in reduced variance of estimates.[1] Content and procedures require standardization among the surveys. Finally, increases in sample size for core variables help to improve small-area estimation.

[1] The unclustered design would sample addresses directly from the Postcode Address File (PAF) rather than selecting them from a subset of postal code sectors (Office for National Statistics, 2010).

The expected benefits include reduced sampling variance due to increased sample size, cost savings associated with the unclustering of the sample designs, and two-phase calibration, which will enable the use of the estimates from the core in calibration for components. The increased sample size of the core is expected to produce a variance reduction of up to 20 percent for the LFS (if fully unclustered). An unclustered design for the non-LFS surveys is expected to reduce variance of the module variables by 2-15 percent, although this has not yet been implemented.

One of the many challenges encountered was the implementation of the IHS in the field. Originally, an entirely new case management system was planned as part of the field office modernization for the IHS, but the office modernization project turned out to be too ambitious. Instead, field operations had to fall back on existing survey systems. Given that data users do not like to see variables dropped, another problem was that the survey core ended up being too long to be practically administered in the field.

Problems related to inconsistencies in the survey outputs also persist. The two-phase calibration has only been partly implemented so far. The calibration works, building in automatic consistency, which increases the quality and usability of outputs, but it has shown only marginal variance gains. Estimates from the IHS are currently released as “experimental,” which allows data user input and feedback to quality-check the procedures and outputs; they are not yet classified as “national.”

Although the implementation of the original design has proven to be challenging, many of the difficulties were due to the necessary systems not being in place. Stepping back has made survey harmonization both more important and more challenging. Despite the difficulties, there has been considerable progress in the design and implementation of the IHS.
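The variance gains expected from unclustering can be related to the standard design-effect approximation for cluster samples, deff = 1 + (m - 1) * rho, where m is the average number of sampled units per cluster and rho is the intraclass correlation of the variable of interest. The following is a minimal sketch under that textbook approximation; it is not the ONS calculation, and the cluster size and correlation used in the example are hypothetical values chosen only for illustration.

```python
def design_effect(cluster_size, icc):
    """Approximate design effect of a clustered sample: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc


def variance_reduction_from_unclustering(cluster_size, icc):
    """Share of sampling variance removed by an unclustered design of equal size.

    The clustered variance is deff times the unclustered variance, so the
    relative reduction is 1 - 1 / deff.
    """
    return 1.0 - 1.0 / design_effect(cluster_size, icc)


# Hypothetical inputs: 10 sampled addresses per postcode sector, rho = 0.01.
example_deff = design_effect(10, 0.01)
example_reduction = variance_reduction_from_unclustering(10, 0.01)
```

Under this approximation the gain depends heavily on rho, so variables that are geographically concentrated would benefit more from unclustering than variables that are spread evenly across areas.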

DISCUSSION

Hal Stern invited the workshop attendees to ask questions of the first two presenters. Phillip Kott (Research Triangle Institute) directed the first question to Clark: what did the author from the ONS, Paul Smith, mean by the unclustering of the current LFS in the United Kingdom, and how would this save money? Clark explained that the LFS was already unclustered, and, since it was the largest of the surveys, it made sense to move the smaller surveys to that design. Because Clark was not the author of the presentation, she referred to Paul Smith’s paper for additional information about the plans related to unclustering (Smith, 2009).

Eric Bergman (Bureau of Labor Statistics) noted that there are certain economies of scale to combining these surveys and asked whether there were any initiatives to make the IHS mandatory. Clark responded that there were no initiatives along those lines.

Lawrence Brown (University of Pennsylvania) asked how the integrated survey design affected the longitudinal character of the LFS and how this would be reflected in the other integrated surveys. Clark said that she did not have enough information about the design of the other surveys or if they had longitudinal components in them, but the LFS in its current form is conducted in 5 segments over the course of 15 months.

Stern wanted to better understand how modules moving into and out of the integrated survey would look over time and if there are forecasts regarding ultimate costs for the IHS on a large scale. Clark said she did not have an answer to those questions.

Abraham asked about the total time required to administer the survey. Given the length of many of the surveys in the United States, it would be difficult to see how this model could be applicable here, she said. Clark noted that the LFS core of the IHS is approximately 20 minutes, and some of the other modules rotate in and out.

Robert Groves (Census Bureau) made the point that there is nothing inherent in the design of the IHS to say that questionnaire length could not be constant across interviews, through appropriate matrix sampling of the modules. Furthermore, although the ONS is not doing this, administrative records could be used to guide inclusion probabilities for the matrix sampling. In other words, there would be an administrative data-driven inclusion probability for rotating modules.

Andrew White (National Center for Education Statistics) asked whether the push for integration was budgetary in nature. He also asked whether the United Kingdom has been experiencing challenges related to household survey data collections similar to those in the United States and whether the ONS expects the harmonization to address these problems. Clark responded that funding became available for infrastructure development, which represented an incentive to embark on this project. The primary reasons for doing this were not necessarily in response to the types of challenges described by Abraham in connection to the U.S. household surveys, she said.

Katherine Wallman (Office of Management and Budget) commented that it appears that the IHS was not designed with the goal of reducing response burden. When the U.S. Government Accountability Office (GAO) has prepared reports on the federal household survey network in the past, its perceptions were that the surveys are duplicative and a heavy burden on respondents. The GAO wanted to know why surveys are not combined together in a framework similar to the IHS, but it appears that the IHS has grown out of different considerations. It is also interesting to hear that some of the supplements are included only periodically.

Graham Kalton (Westat) asked how difficult it was to bring together the existing surveys and whether there was any infighting, given response burden constraints and the probability that the sponsors of each of the existing surveys had different interests and agendas. Clark said that in her experience this was not a major problem. There was a significant push for harmonization and modernization as part of the integration process, which may have facilitated their willingness to compromise. However, she added, the integration process has not completely succeeded yet, and the LFS still publishes its own estimates, rather than the IHS estimates.

Alan Zaslavsky (Harvard Medical School), asking what an acceptable “national statistic” entails, said that there are several potential problems related to generating such a statistic. One issue might be the technical and operational quality of the systems used to generate the statistic and whether they are working correctly and are doing, procedurally, what they are supposed to do. Another issue might be the acceptability of the estimation methods, as these become more complicated than simply asking 1,000 people a question and tabulating the numbers. He asked about the importance of these considerations as the new methodology is implemented in the United Kingdom and whether acceptance has been built for these new methods of estimation.

Clark responded that, in her opinion, an important part of the transition to a national statistic is ensuring that the data remain relevant when compared with past data and specifically ensuring that there are some mechanisms for benchmarking and for helping users to understand the new data series. The ONS uses quality measures similar to those used in the United States (timeliness, accessibility, comparability, accuracy, relevance, and consistency) in determining what will become a national statistic. If, in fact, new estimates are adequately bridged to previous data, then, generally, after several years, a statistic will move from an experimental one to a national one. Small-area estimation procedures were also used for the first time in official statistics, after a period of being considered experimental.

Barbara O’Hare (Census Bureau) asked how the federal statistical community can move toward greater acceptance of model-based estimates, similar to what was done with small-area estimates in the United Kingdom. Clark suggested that the U.K. model of labeling model-based estimates as experimental until they have gained acceptance (and can become national statistics) could be a model for the United States as well. The Small Area Income and Poverty Estimates (SAIPE) Program at the Census Bureau is an example of publishing model-based estimates in the United States, but when these data were first released, not all users were comfortable with using them. The estimates were released because they were better than anything else available, and they were labeled to advise data users to exercise caution when using them. However, this does not always help in gaining acceptance.

Constance Citro (Committee on National Statistics) noted that the SAIPE estimates are available and are being used, although it is not wise to spring new data on users overnight. It is imperative that the statistical community have a dialogue with data users and describe the positives of model-based estimates, such as stability over time. Once they understand what they are dealing with, they will want the data.

Returning to the topic of challenges related to nonresponse, Jelke Bethlehem (Statistics Netherlands) commented that, on the basis of his 30 years of experience working on the issue of nonresponse, he now thinks that the focus should be on the composition of the responses, rather than on trying to improve the response rates. If an organization spends enough time and money, it is possible to increase the response rate, but research shows that this sometimes makes the responses less representative of the sample. Instead, the focus should be on measures that help balance the response.

Abraham agreed that increasing response rates at all costs should not be the objective, but she expressed concern about measures taken to balance a sample. In some cases, balancing the sample along demographic variables works well, but there may be other variables of interest for which it does not work. She noted that the approach of balancing the sample sounds similar to a quota sample, and experience shows that quota samples do not perform well, at least in the case of establishment surveys. Clark added that one of the objectives of an adaptive design of this type is to enable researchers to evaluate the composition of the respondents, and that it helps to have paradata to be able to monitor the sample in real time.

Citro made the point that great design ideas alone will not solve the current problems of the federal household surveys. The success of integration depends at least as much on systems, procedures, and cost accounting as it does on design ideas. She referred to Clark’s discussion of the problems with the case management system, which were a problem with the 2010 census as well. The question, and challenge, for the statistical agencies is to work together to do better than in the past in improving the basic components of the survey “manufacturing process.”
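Groves’s remark earlier in the discussion, that matrix sampling of modules could hold questionnaire length constant while administrative records guide the inclusion probabilities, can be illustrated with a small sketch. It uses systematic sampling with unequal probabilities, which selects a fixed number of modules per respondent and gives each module its stated inclusion probability, provided the probabilities sum to a whole number. This is one possible technique, not a description of any agency’s practice. The module names, the probabilities, and the assumption that all rotating modules are of similar length are hypothetical.

```python
import random


def sample_modules(inclusion_probs, rng):
    """Pick a fixed number of modules with the given inclusion probabilities.

    inclusion_probs maps module name -> probability in [0, 1]; the values
    must sum to a whole number k, and exactly k modules are returned.
    """
    total = sum(inclusion_probs.values())
    k = round(total)
    if abs(total - k) > 1e-9:
        raise ValueError("inclusion probabilities must sum to a whole number")
    if any(not 0.0 <= p <= 1.0 for p in inclusion_probs.values()):
        raise ValueError("each inclusion probability must lie in [0, 1]")

    names = list(inclusion_probs)
    rng.shuffle(names)          # random order varies which modules co-occur
    start = rng.random()        # random start in [0, 1)
    points = [start + j for j in range(k)]

    chosen, lower = [], 0.0
    for i, name in enumerate(names):
        # Close the last interval at exactly k to guard against rounding drift.
        upper = float(k) if i == len(names) - 1 else lower + inclusion_probs[name]
        if any(lower <= pt < upper for pt in points):
            chosen.append(name)
        lower = upper
    return chosen


# Hypothetical respondent whose administrative records suggest the health
# module is highly relevant; the core is always asked (probability 1.0).
probs = {"core": 1.0, "health": 0.75, "spending": 0.25, "travel": 0.5, "opinions": 0.5}
rng = random.Random(0)
draws = [sample_modules(probs, rng) for _ in range(2000)]
health_share = sum("health" in d for d in draws) / len(draws)
```

Because every respondent receives the core plus the same number of rotating modules, interview length stays roughly constant under the equal-length assumption, while modules judged more relevant for a respondent are asked more often.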

14 THE FUTURE OF FEDERAL HOUSEHOLD SURVEYS STATISTICS WITHOUT SURVEYS? DATA COLLECTION IN THE NETHERLANDS Continuing the focus on foreign survey systems, Jelke Bethlehem (Statistics Netherlands) presented an overview of the way the Dutch statistical system col- lects national data and discussed the population register that serves as a back - bone to an integrated information system. He began by walking the audience through a brief history of the census and survey systems in the Netherlands. The Netherlands has a mandatory national register, which has been digital since 1994. It no longer fields a census in the traditional sense, instead conduct - ing a virtual census, which involves information gathered from the population register and through surveys. Demographic data are obtained from the register, and socioeconomic data are gathered via the LFS. Statistics Netherlands successfully uses the population register for three main applications: (1) as a simple and quick data source for monthly population statistics (only counts, not estimates), (2) as a sampling frame for surveys (for persons only, households must be constructed), and (3) as a source of auxiliary variables for weighting adjustments to correct for nonresponse. Responding to increasing calls for more comprehensive, higher quality data, Statistics Netherlands created the Social Statistical Database (SSD), an amalgam of the population register, the LFS, the Survey on Unemployment and Earnings, and other administrative sources. In the case of the Netherlands’ 2001 census, the SSD was used with much success to meet the European Union’s demand for greater census detail. Using the SSD, the work of putting together a census was completed early, despite getting a late start, and at a cost of €3 million, versus the €300 million a traditional census would have cost. SSD data can also be linked to both survey respondents and nonrespondents. 
Despite the reliance on the SSD, Bethlehem said, there is still pressure to reduce response burden. As a result of this pressure and budget constraints, the focus of data gathering has shifted to more secondary data collection, mostly from registers. In this context, Bethlehem mentioned the Netherlands Statistics Law of 2003, which stipulates that surveys should occur only when the data are not available elsewhere. It also gives Statistics Netherlands access to all government registers. Naturally, the population register is not error-free, and some of the data require substantial editing. One of the main reasons for the errors relates to students who tend to move and not register. The fact that Statistics Netherlands does not control the data collection is also a challenge because of a lack of understanding of quality control and definitional problems, he noted. The government can mandate changes in the registry data at any time, a circumstance that can also lead to problems. The data for the construction sector are an example of this; the sector reports its earnings via tax administra - tion. During a recent economic crisis, companies were allowed to change their

OCR for page 5
15 THE FEDERAL HOUSEHOLD SURVEY SYSTEM AT A CROSSROADS declarations from monthly to quarterly. This introduced a lack of comparability and problems with the reliability in economic data in the construction sector. To keep pace with increasing data demands and shrinking budgets and to combat current data collection problems, new ways to collect data are under study, Bethlehem reported. One strategy is to collect data directly from the administrative and financial systems of companies. Another is to use radio frequency identification tags (RFID) and global positioning systems (GPS) to collect transportation statistics. The use of online robots that collect data from specific websites allows for the leveraging of information already available on the Internet. One possible use of such a robot is for the collection of price data to produce a consumer price index. Of course, he said, there are many questions surrounding these data collection methodologies. Do they work? Are they legal? Bethlehem concluded by saying that, despite opportunities for using regis - ters and technological advances for data collection, there will still be a need for surveys in the future. It is likely that the surveys of the future will be increas - ingly Internet-based or mixed-mode, although these present new challenges, such as mode and selection effects, that are difficult to separate. There are other methodologies yet to be considered, and Statistics Netherlands is keeping an open mind about the possibilities. CANADA’S HOUSEHOLD SURVEY STRATEGY Jean-Louis Tambay (Statistics Canada) presented another perspective from outside the United States, by giving an overview of the Canadian household sur- vey system. Table 2-1 lists major Canadian surveys with monthly data collection. Currently, Statistics Canada has three major sampling vehicles for household surveys: (1) the LFS area frame design, (2) RDD, and (3) a census of popula - tion, conducted every five years. 
Many household surveys draw their samples from LFS sample clusters, are administered as supplements to the LFS questionnaire, or, to cover certain population subgroups, survey recently rotated-out LFS sample units.

Like other nations, Canada faces an increasing demand for survey data, a demand that exceeds the current capacity of the LFS to provide samples. New solutions are being proposed and tested to address the limits of the current survey platform, which involve the flexibility and timeliness of surveys (especially developing computer applications for surveys), costs, response burden (particularly for LFS respondents), falling response rates, coverage problems with RDD and telephone surveys, and the challenges of surveying difficult-to-reach populations.

In response to the demand for data, Tambay said, Statistics Canada has developed several strategies grouped under the term “New Household Survey Strategy,” including survey integration, spreading interviewer and response burden, development of a master sample, creation of a population frame, and

integration of listing activities. The process of survey integration includes using a common core of questions for all surveys, harmonizing content modules, creation of a master sample, and integrating survey and census listing activities. Spreading interviewer and response burden was achieved, in one case, by spreading the collection period for the Survey of Household Spending over a 12-month period, rather than the 3-month collection period that was used in the past. The Canadian Community Health Survey (CCHS) sample of 130,000 respondents was divided in half, and data collection was spread over two years, instead of using the whole sample every other year. Finally, Statistics Canada is considering ways to increase response options, such as offering electronic data reporting, which is currently used for business surveys and was also tested during the 2006 census.

TABLE 2-1 Major Canadian Surveys with Monthly Data Collection

Survey                                   Size                      Details
Labour Force Survey                      60,000 households         6-month rotation (10,000 new cases/month);
                                         (120,000/year)            telephone-first contact for 36% of new cases;
                                                                   use Address Register to replace/supplement
                                                                   listing activities
Canadian Community Health Survey         65,000/year               50% CAPI (LFS area frame); 50% CATI
                                                                   (telephone lists); pool 2 years’ sample for
                                                                   small health regions
Survey of Household Spending             20,000/year               LFS area frame
General Social Survey                    25,000/year               Random digit dialing
Travel Survey of Residents of Canada     110,000/year              LFS “live” supplement
Canadian Tobacco Use Monitoring Survey   50,000 households/year    Random digit dialing
                                         20,000 persons/year

NOTE: CAPI = computer-assisted personal interviewing, CATI = computer-assisted telephone interviewing, LFS = Labour Force Survey.
SOURCE: Workshop presentation by Jean-Louis Tambay.
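The Labour Force Survey figures in Table 2-1 and the burden-spreading example can be checked with simple arithmetic. The sketch below is illustrative only and is not Statistics Canada code. The function names are hypothetical. It assumes that the panel is split into equal rotation groups with one group replaced each month, that the 120,000/year figure counts households entering the sample over 12 months, and that the Survey of Household Spending had the same annual sample under both collection periods.

```python
def new_cases_per_month(panel_size: int, rotation_months: int) -> int:
    """Households entering the panel each month, assuming equal rotation groups."""
    return panel_size // rotation_months


def households_per_year(panel_size: int, rotation_months: int) -> int:
    """Households entering the sample over 12 months of rotation."""
    return new_cases_per_month(panel_size, rotation_months) * 12


def monthly_workload(annual_sample: int, collection_months: int) -> float:
    """Average cases per collection month when the sample is spread evenly."""
    return annual_sample / collection_months


if __name__ == "__main__":
    # Labour Force Survey: 60,000 households on a 6-month rotation.
    print(new_cases_per_month(60_000, 6))
    print(households_per_year(60_000, 6))
    # Survey of Household Spending: 3-month versus 12-month collection
    # (same annual sample assumed for both, which the source does not state).
    print(monthly_workload(20_000, 3), monthly_workload(20_000, 12))
```

Under these assumptions, moving from a 3-month to a 12-month collection period cuts the average monthly caseload to a quarter of its former level, which is the sense in which interviewer burden is spread.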
Of the four options considered for the design of a master sample, it was decided to create the sample by pooling first-phase surveys but to limit the surveys used to just the LFS and the CCHS. The sample was created, and a pilot survey was conducted in 2008 using an existing survey vehicle, the General Social Survey (GSS). Tambay said that this was complex to implement because it was difficult to develop the proper weights and variances.

Furthermore, there had to be a way to deal with samples that were not really independent. The results were disappointing: response rates were low and design effects were high. The master sample option was thus abandoned, and the idea of using the census as a frame was reopened. A population frame (of persons) created from census follow-up was considered in lieu of the master sample design, although this type of frame also suffered from problems, particularly privacy concerns.

Integration of listing activities involves the coordination of census and the LFS cluster listing activities via a common listing application. To aid in cluster listing operations, Statistics Canada provides its interviewers with dwelling lists from the Address Register (AR), which is similar to the U.S. Census Bureau’s Master Address File. Used since the late 1980s, the AR is derived from telephone billing files from many major telephone companies and Infodirect (similar to a white pages compilation of all Canada), plus other smaller sources, such as tax rebate records for new dwellings. The AR was used to define mailout areas for the 2006 and 2011 censuses, which account for 70-80 percent of the country. In 2004, it was also used to replace or supplement the LFS listing in many clusters.

For the 2011 census, a continuous listing was introduced to update the AR (for the 2006 census, the AR was updated through a full-scale block-canvassing operation that took place the previous fall). Leading up to the census, interviewers would verify only clusters that AR methodologists believed were in substantial need of updating, with the assumption that about a third of the clusters would be visited for continuous listing. This is what gave rise to the idea that if interviewers were in the field to do listing for the census, the activity could be combined with listing for the LFS, Tambay noted.
The LFS usually conducts its own listing activities, although for about 40 percent of the clusters, the AR is considered of good enough quality to dispense with the initial listing. In another 20 percent of the LFS clusters, AR dwelling lists are updated by interviewers, and in the remaining 40 percent AR coverage is such that it is deemed preferable to have interviewers develop new dwelling lists.

Tambay explained that the process for integrating survey and census listing activities had three components: (1) coordination of census and the LFS listing activities, (2) development of a common listing application, and (3) increased use of the AR to replace or supplement the LFS listing. The coordination component consisted of positive and negative coordination. Positive coordination means that if a cluster for the LFS has to be listed in a certain month and the AR has to list it sometime before the next census, then Statistics Canada tries to coordinate the process so that the cluster is listed for the AR before it is needed for the LFS. Negative coordination means that the listing for the AR is skipped for clusters in which the LFS is actively interviewing.

The latest innovation at Statistics Canada is a corporate business architecture, Tambay said. The goals are to be more efficient, robust, and better able

to respond to new developments. Two of the main principles are (1) decision making optimized across the organization and (2) centralization of such processes as staff services or information technology services and infrastructure.

Several proposals for social surveys have come out of the new program, including creating a household survey frame function and developing a social survey processing environment that is common to multiple surveys as well as increasing the use of electronic data reporting. The LFS is ideal for testing electronic data reporting because survey respondents have the option of providing an email address in their first month in the sample or responding for the following five months of the survey via an Internet address provided by Statistics Canada.

To address the first proposal, the household survey frame project was created. One activity for this project is to improve AR quality and content. This means it is necessary to increase the availability of phone numbers, maximize AR coverage, and increase AR content. The plan is to achieve this through several steps. First is to increase the availability of phone numbers, which mostly come from billing files and Infodirect. Phone numbers are then supplemented with information from the census or tax data. However, the 2006 census did not provide much more information than Infodirect already had. Telephone numbers from tax files are also problematic because the number could be for an accountant who prepared the return or a work number. The child tax benefit file has proved to be a more useful source of telephone numbers, and it tends to cover households with young children, Tambay observed.

Other indirect methods of obtaining more complete information are also under consideration, such as matching tax records to Infodirect phone numbers to add apartment numbers that are missing on Infodirect.
Exploring a cell phone billing file was also attempted. An application to sample from this frame has yet to be developed. A consequence of trying to add more phone numbers to the frame is that regional offices are communicating that their telephone centers are already operating at capacity with the phone numbers that are currently in certain frames.

Statistics Canada is also attempting to expand its address resources using such tools as municipal lists and tax forms. Frame coverage in the AR currently is 96-97 percent, with 85 percent of these addresses being mailable. In addition, the Canada Post Corporation Point-of-Call file, which is comparable to the U.S. Postal Service’s Delivery Sequence File, is also a very reliable source, especially in urban areas.

Another goal of this activity is to improve AR content by creating a person frame. The census short form, which has household composition information, and the tax family file, which is a file that is constructed from tax records, can be used to construct this frame. Because people tend to declare their children, coverage is about 96 percent. That will be used to update the census information.

The second activity of the Household Survey Frame Project is to develop a common frame for household surveys. This would entail establishing processes for sample management (to control respondent burden), completing integration of the AR with the LFS area frame, and developing a methodology for the use of phone numbers in the design of computer-assisted telephone interviewing (CATI) surveys.

There are several keys to a more complete integration of the AR with the LFS, Tambay noted. The first is two-way communication on new dwellings. If any growth is identified through either the LFS or the AR, it should be communicated to the other to get the best possible integrated address. The second key is an ongoing attempt to integrate into the AR noncity-style addresses, for example, postal installation addresses consisting of a type of delivery, which may be general delivery; lock box number; or municipality name, province, and postal code. Finally, every attempt is being made to identify AR needs for the 2014 LFS redesign.

Although still in the planning stages, researchers are currently attempting to develop a methodology for the use of phone numbers from the improved frame in the design of CATI surveys. The goal is to pilot this methodology on the General Social Survey in fall 2011.

For the future, Tambay said, the next thing to consider may be sample coordination (rather than coordination only for frames). Tied to the LFS redesign is the redevelopment of the generalized sampling system. Statistics Canada would also like to develop a new system for selecting dwellings. For the portion of the LFS that can utilize the AR, options for keeping this frame current include updating it by administrative sources and forgoing listing, taking simple random samples of subclusters, and sample coordination with other surveys to avoid visiting the same respondent too often.
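The idea of coordinating samples so that the same dwelling is not visited too often can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Statistics Canada's generalized sampling system: the function name `coordinated_srs`, the `rest_period` parameter, and the dictionary of last-selection months are all hypothetical, and the rule shown (skip any dwelling selected within a fixed number of months) is only one possible coordination rule.

```python
import random


def coordinated_srs(frame, last_sampled, current_month, rest_period, n, seed=None):
    """Draw a simple random sample of n dwellings, skipping any dwelling
    that was selected within the last `rest_period` months.

    frame        -- list of dwelling identifiers (hypothetical address frame)
    last_sampled -- dict mapping dwelling id -> month index of last selection
    """
    # Keep dwellings never selected, or selected long enough ago.
    eligible = [
        d for d in frame
        if d not in last_sampled or current_month - last_sampled[d] >= rest_period
    ]
    if n > len(eligible):
        raise ValueError("not enough rested dwellings to draw the sample")
    rng = random.Random(seed)
    sample = rng.sample(eligible, n)
    # Record the new selections so later surveys can avoid these dwellings.
    for d in sample:
        last_sampled[d] = current_month
    return sample


if __name__ == "__main__":
    frame = list(range(100))   # hypothetical dwelling ids
    history = {}               # shared across surveys
    first = coordinated_srs(frame, history, current_month=0, rest_period=12, n=30, seed=1)
    second = coordinated_srs(frame, history, current_month=3, rest_period=12, n=30, seed=2)
    print(set(first) & set(second))  # no dwelling appears in both samples
```

Excluding recently selected dwellings changes the selection probabilities of the remaining ones, so a production design would have to reflect this in the survey weights.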

DISCUSSION

Chester Bowie (National Opinion Research Center), session discussant, observed that one of the themes of the morning’s session that sets the context for the rest of the workshop is that surveys have become more complex and difficult over the past 10-15 years. A number of factors drive this complexity: quality and cost concerns related to sampling frames; increasing nonresponse rates; privacy and confidentiality concerns; and rising survey costs, with concurrently shrinking budgets. The statistical community is also not yet sure how to best use administrative data or model-based estimates.

Each of the countries represented at the workshop is addressing these issues differently. The United Kingdom has standardized and integrated its major household surveys. This is an intriguing idea, Bowie said, but such a system would be much more difficult to implement in the United States, where the statistical system is more decentralized. Several past attempts to standardize basic

demographic questions across surveys at the Census Bureau were unsuccessful because each survey sponsor had its reasons for wanting to ask a specific question in a particular way.

The Netherlands Social Statistical Database is interesting because it is a move away from surveys toward population registers, Bowie said. This lowers survey costs, but there are issues inherent in gathering data this way. Canada has addressed some of its challenges through the use of master sample frames and samples, integrated listing activities, and household survey sample coordination. Some of these strategies are unique.

Some have argued that the current approach to conducting household surveys in the United States is unsustainable. Bowie reiterated that this problem is the focus of the workshop and that serious thought should be given to what can be done in the future to address it.

Hermann Habermann (Committee on National Statistics) sought clarification on the use of population registers in the Netherlands. If it was a distrust of government that made people wary of censuses, how was a register received? A register can be perceived as even more pernicious than a census.

Bethlehem said that there has always been a good population register in the Netherlands. This became an issue during World War II because religion was recorded on the register and, when the Germans invaded, they were able to easily identify Jews in the country using the register. Today, there is a variety of registers, and they seem to not bother people anymore. Many, if not most, people in the United States may be in registers without even knowing it.

A follow-up to Habermann’s question concerned the political discussion on using registers instead of surveys in the Netherlands. Were privacy advocates concerned that the combination of registers would be a threat to privacy?
Bethlehem responded that the only political discussion was about reducing the administrative burden of government. No privacy issues were raised when the bill was proposed in Parliament, and the public really does not seem to be concerned about it.

Wallman asked if registering was mandatory in the Netherlands, as it is in Germany. She wondered whether there would be an adverse reaction to such a requirement in the United Kingdom or the United States.

Bethlehem again noted that most people in the Netherlands probably do not even realize that they are in the population register. The only time citizens encounter the register is when they have to renew a passport or when they move and they are required to fill out a form on the Internet. In situations like that, it can become a problem if they are not in the register. However, the fact that the register is mandatory has never surfaced as an issue.

Tambay recalled a case in which a journalist discovered that the department that administers unemployment benefits in Canada had been maintaining a data file on the labor force. The Canadian government publishes what files are used by which government departments every year, so the existence of this file was

always public information. Yet, when the journalist brought attention to this, a scandal followed that affected subsequent data collection efforts, because fewer people were willing to share information with this particular department after the incident. The department was also ordered to destroy the file, because although the existence of the file was always public, information about how the data were being used was found to be not transparent enough.

Robert Kominski (Census Bureau) suggested that a synthetic register, or one compiled from several data sources, may be a viable concept in the United States. There are already many data systems here, and these could be used to develop an effective register. An example of an existing register in the private sector is the charge card registration system, which includes point-of-purchase data and other information. The banks are authorized by the federal government to collect these data, and the federal government could say that these data are within its purview. Kominski added that perhaps this is a radical idea, but the purpose of the workshop is to think broadly. He went on to say that, in the current political climate, U.S. residents might be willing to give up their privacy and register, if they thought that such a system would prevent public services from being delivered to those who “do not deserve them.” Some people might do this to obtain greater security or, in their eyes, fair administration of state and federal goods and services. Some might be offended by these ideas, he said, but there is a very large segment of the population that would not be.

A workshop participant noted that even if only 5 percent of the population refused to get an identity card or register, that is still 5 percent of the population that would be missing, which would ordinarily be considered unacceptable.
Drawing on conversations with colleagues in other countries, Wallman thought the issues surrounding registers had less to do with whether registration was mandatory than with whether the register was tied to certain benefits. For example, eligibility for child care in the Netherlands is entirely tied to the registration of that child. Such a setup would have a huge impact here. There may be pros in addition to the cons typically associated with registers, she said.

Lawrence Brown cited the example of Israel, which has a census as well as a registration system. Although this system is far from perfect, particularly for households, the government is building a secondary system of dual-system estimation to correct the registry lists for census purposes. A question that remains, however, is how a system like this can be built into a household data system with the same effectiveness. Another question pertains to inaccuracies in the registration system. Although the register in the Netherlands enables a count of the population, there do not seem to be good address records. He asked: Would it be better to have a dual-system follow-up to correct these inaccuracies?

Bethlehem said there were about 2,000 persons in the Netherlands not in

the national register, and they are most likely illegal immigrants. About 15 percent of the register records contain errors, but these errors come from incorrect addresses. If someone is listed at an incorrect address in the register, this can become a problem for them should they wish to, for example, get a passport. Because people depend on the register to receive services, it tends to be fairly accurate. Statistics Netherlands defines survey populations to be the population in the register, so the sampling frame completely fits the population. There is also a database of information for weighting adjustments. Whether to include illegal immigrants in counts and surveys is a decision each country has to make.
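Brown's reference to dual-system estimation in the discussion above can be made concrete with the textbook two-list (Lincoln-Petersen) estimator. The sketch below is a generic illustration, not the method used in Israel or the Netherlands. The function name is hypothetical, the counts in the usage example are invented for illustration only, and the estimator assumes that the two lists are independent and that records are matched without error.

```python
def dual_system_estimate(n_register: int, n_survey: int, n_matched: int) -> float:
    """Lincoln-Petersen estimate of total population size from two lists.

    n_register -- persons found on the register
    n_survey   -- persons found by an independent coverage survey
    n_matched  -- persons found on both lists
    """
    if n_matched <= 0:
        raise ValueError("need at least one matched person")
    if n_matched > min(n_register, n_survey):
        raise ValueError("matches cannot exceed either list size")
    # The share of survey persons who are also on the register estimates
    # register coverage; dividing the register count by it scales the count up.
    return n_register * n_survey / n_matched


if __name__ == "__main__":
    # Hypothetical counts for one small area, for illustration only.
    estimate = dual_system_estimate(n_register=950, n_survey=900, n_matched=870)
    print(round(estimate))
```

The gap between the estimate and the register count is an estimate of the number of persons the register misses, which is the quantity a dual-system follow-up would be designed to measure.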