2
The Federal Household Survey System at a Crossroads
To set the stage for the workshop, the first session provided background on
the current state of some of the major federal household surveys in the United
States and outside perspectives on how other nations handle many similar
difficulties in household data collection. The first talk in this session
focused on a review of the current U.S. federal household data collection
system. Subsequent talks presented foreign case studies: the current United
Kingdom (U.K.) model for survey integration; the case of the Netherlands,
which relies less on household surveys and more on official population
registers; and Canada’s use of a multipronged approach to improve
efficiencies, including establishing a corporate business architecture and
developing a strategy of survey integration. The international examples of
survey data collection served to open up a broader discussion about data
collection approaches to consider.
FEDERAL HOUSEHOLD DATA COLLECTIONS
IN THE UNITED STATES
Katharine Abraham (University of Maryland) highlighted three major
aspects of the federal statistical system: (1) the current survey environment is
difficult, (2) data users have become more demanding of survey data, and (3)
the system is searching for solutions. Specifically, she described several data
collection challenges that have contributed to making the current survey
environment increasingly difficult. One of these issues is the quality of survey
frames. Survey practitioners and researchers agree that, generally, household
survey frames provide poor coverage of several important segments of the
population.
6 THE FUTURE OF FEDERAL HOUSEHOLD SURVEYS
Another issue is that it has become increasingly difficult to reach respondents. It
is also increasingly difficult, once people are reached, to convince them to grant
an interview. Finally, increasing concerns about privacy and confidentiality have
exacted a toll on survey participation.
Coverage patterns in many federal household surveys are evidence that
survey frames are not always adequate to reach a representative sample of the
population. As Abraham noted, coverage ratios for personal visit surveys tend
to be lower for black respondents than for nonblack ones; they are lower for
men than women; and they vary systematically by age. Coverage ratios generally
trended downward from 2000 to 2008; those for the American Community Survey
(ACS), by contrast, have been higher and more stable than those of other
Census Bureau surveys. To help combat the coverage
problem, the Census Bureau, in its 2010 survey redesign process, decided to
use the continually updated Master Address File (MAF)—the frame the ACS
uses—as the frame for its other current surveys. The use of the MAF will begin
with the 2014 surveys.
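As a rough illustration of how a coverage ratio of the kind Abraham described
can be computed, the sketch below divides a weighted survey estimate of a
group's size (taken before any adjustment to population controls) by an
independent population control total for the same group. The function name,
the group labels, and all numbers are hypothetical and are not taken from any
Census Bureau survey.

```python
def coverage_ratios(survey_estimates, population_controls):
    """Return the coverage ratio for each group in the control totals.

    survey_estimates: group -> weighted survey estimate of the group's size,
        before adjustment to population controls (hypothetical input).
    population_controls: group -> independent population control total.
    """
    ratios = {}
    for group, control in population_controls.items():
        if control <= 0:
            raise ValueError(f"control total for {group!r} must be positive")
        ratios[group] = survey_estimates[group] / control
    return ratios


# Made-up numbers, in thousands of persons; purely illustrative.
estimates = {"men 20-29": 17000.0, "women 20-29": 18400.0}
controls = {"men 20-29": 20000.0, "women 20-29": 20000.0}

ratios = coverage_ratios(estimates, controls)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

A ratio below 1 indicates that the frame and field operations reached fewer
people in the group than the independent control total implies.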
Another problem creating challenges in the survey environment is the
increasing difficulty of contact with survey respondents. Gated communities
restrict access to respondents for in-person interviews and nonresponse follow-
up. The use of voicemail and caller ID helps respondents avoid contact with
an interviewer in telephone surveys: they can let calls go to voicemail or not
answer calls from numbers they do not recognize on their caller ID display. The
number of cell-phone-only households has risen sharply in the past 10 years
and continues on an upward trend, thus making an initial contact through a
telephone frame more difficult in the case of these households.
Obtaining respondent cooperation has become increasingly difficult.
Abraham explained that increasing demands on respondents’ time, such as
long commute times and increasing numbers of telephone solicitations, make
respondents less likely to cooperate with an interview request. Furthermore,
survey requests, such as those from the federal government, compete with
multiple other surveys and sales solicitations for the already limited time and
interest of potential respondents. Finally, pervasive concerns about privacy and
confidentiality among many in U.S. society hinder survey participation. It is not only
the federal government and its data collection contractors that suffer from an
increasingly unfriendly and costly climate for surveys; other survey research
organizations are also encountering similar problems.
In addition to an increasing unwillingness to participate in surveys, there
is also evidence of rising item nonresponse within surveys. As an example,
Abraham cited a study by Bollinger and Hirsch (2006) showing that item
nonresponse has increased on the Current Population Survey’s usual weekly
earnings question. Increased item nonresponse is further evidenced by increasing
imputation rates on questions of wages and salaries. By 2000-2004, imputation
rates for weekly earnings were up to about 30 percent for survey respondents.
Next, Abraham briefly discussed the growing demands from increasingly
sophisticated data users. Data users tend to demand more timely and
comprehensive data. Many have pushed for more detailed data—that is, data
on small geographic areas and population subgroups. There has also been a call
in the data user community for better integration of estimates (e.g., income,
disability, poverty) from different sources.
Agencies have used multiple strategies to increase or maintain current
survey response rates. Some surveys use advance notification mail materials
or offer multiple modes for response. Other means used are increasing the
number of contact attempts with respondents, improving interviewer training,
and, in the case of the ACS, making the survey mandatory. Some surveys offer
incentives for participation. Abraham noted, however, that the evidence of
the effectiveness of any of these methods is limited, and their use comes with
increased survey costs.
In addition to these strategies, Abraham laid out possible actions that
agencies could take to meet the challenges facing federal household surveys.
Although the last two years have seen an increase in funding for some statistical
agencies, it is unlikely that increases will continue, particularly in the current
political climate with calls for reduced government spending—making it even
more important to look for ways of increasing efficiencies.
Frame improvement is one area in which agencies are attempting to identify
opportunities for increased efficiency. As mentioned earlier, the Census
Bureau will begin using the MAF for many of its personal visit surveys. In
addition, the ACS will be used to provide stratifications for sample designs by
providing more current information on the characteristics of geographic areas.
Abraham asked if, in addition to this change, the ACS should be used directly
as a sample frame itself.
Other frame improvement ideas include incorporating cell phone numbers
into random digit dialing (RDD) samples. The use of the Internet for survey
administration would be most cost-effective; however, there is not yet any
agreed-on methodology for creating a frame for online surveys. While online
surveys remain an attractive prospect for survey administrations, Abraham
stated that more work is needed on how the web option can be most effectively
presented and on ensuring web-reporting data quality.
Administrative records are another avenue agencies are pursuing for use
as sampling frames, as survey benchmarks, as sources of auxiliary data for
model-based estimates, and for direct analysis. This is a promising area for
future research, Abraham said, but she added a word of caution about treating
administrative records as the “gold standard” of data, because little is known
of their error properties.
Better methodologies could be explored to reduce nonresponse
and imputation rates. For example, paradata (i.e., data automatically generated
by electronic data collection tools about the survey process) and better survey
frames could aid in improving nonresponse adjustment. Of particular interest is
the potential role of the ACS, or some other large data set, as a sampling frame.
This could provide better information on both respondents and
nonrespondents, information that could be used for better adjustments.
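To make the idea of a frame-based nonresponse adjustment concrete, the sketch
below shows one common technique, a weighting-class adjustment. It assumes the
frame supplies a classification variable for every sampled unit, respondent or
not. The source does not specify which adjustment method agencies would use;
the record keys ('cell', 'weight', 'responded') and the data are hypothetical.

```python
from collections import defaultdict


def weighting_class_adjustment(sample):
    """Inflate respondent base weights within frame-defined classes.

    Each record is a dict with hypothetical keys:
      'cell'      - adjustment class known for respondents and nonrespondents
      'weight'    - base (design) weight
      'responded' - True if the unit completed the interview
    Returns a list of (record, adjusted_weight) pairs for respondents only.
    """
    total = defaultdict(float)       # base weight of all sampled units
    responding = defaultdict(float)  # base weight of respondents only
    for rec in sample:
        total[rec["cell"]] += rec["weight"]
        if rec["responded"]:
            responding[rec["cell"]] += rec["weight"]

    adjusted = []
    for rec in sample:
        if rec["responded"]:
            # Respondents carry the weight of nonrespondents in their class.
            factor = total[rec["cell"]] / responding[rec["cell"]]
            adjusted.append((rec, rec["weight"] * factor))
    return adjusted


sample = [
    {"cell": "renter", "weight": 100.0, "responded": True},
    {"cell": "renter", "weight": 100.0, "responded": False},
    {"cell": "owner", "weight": 100.0, "responded": True},
    {"cell": "owner", "weight": 100.0, "responded": True},
]
adjusted = weighting_class_adjustment(sample)
adjusted_total = sum(weight for _, weight in adjusted)
```

The adjustment preserves the total base weight of the sample while shifting
weight toward classes with lower response, which is only possible when the
class is known for nonrespondents as well.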
Model-based estimates are another methodology that could be used more widely.
These have become increasingly accepted as a viable alternative to direct
estimates, particularly as direct estimates for small areas become prohibitively
expensive. The ACS is important here, too, in that it may be a valuable source
of auxiliary information for use in small-domain models.
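One widely used form of small-domain estimation blends a noisy direct estimate
with a model prediction built from auxiliary information. The sketch below is
an area-level composite of that general kind, offered only as an illustration;
the source does not say which model would be used, and the function name and
the numbers are hypothetical.

```python
def composite_estimate(direct, synthetic, sampling_var, model_var):
    """Shrink a direct survey estimate toward a model-based prediction.

    gamma = model_var / (model_var + sampling_var) is the weight placed on
    the direct estimate, so areas with noisier direct estimates are pulled
    further toward the prediction from auxiliary data.
    """
    if sampling_var < 0 or model_var < 0 or sampling_var + model_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1.0 - gamma) * synthetic


# Hypothetical small area: direct poverty rate 0.20 with a large sampling
# variance, model prediction 0.14 from auxiliary data.
estimate = composite_estimate(
    direct=0.20, synthetic=0.14, sampling_var=0.0009, model_var=0.0003
)
print(round(estimate, 3))
```

In this made-up case the direct estimate receives a weight of only 0.25, so
the composite sits much closer to the model prediction.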
Outside the technical aspects of federal household surveys, it is worth
considering the organizational environment in which these surveys are conducted.
Improved interagency cooperation and coordination are essential. For example,
the Census Bureau could facilitate this by more transparent cost accounting
for client agencies, giving agencies greater input on infrastructure decisions
that affect their surveys, as well as giving them broader access to frames and
survey data that are important to accomplishing agency missions. Title 13 of the
U.S. Code (the law that guarantees the confidentiality of census information)
is a factor that must always be considered with respect to who gets access to
what data. Yet it would be extremely valuable to client agencies to have access
to the sampling frames used for their surveys and to have more access to the
information that is collected, particularly if an agency wished to go back to a
set of respondents.
Clearly, federal statistical agencies face an increasingly difficult
environment for collecting data as well as growing demands with respect to the data
that are collected. A substantial amount of research is being done to meet
these challenges, but strong interagency collaboration is going to be critical to
efficiently implement the new ideas coming out of this research.
SURVEY HARMONIZATION IN THE UNITED KINGDOM
Cynthia Clark (National Agricultural Statistics Service) presented an
overview of the U.K.’s approach to household survey harmonization in government
surveys. Paul Smith from the U.K. Office for National Statistics (ONS), the
author of the presentation and one of the prime contributors to the work on
the U.K. Integrated Household Survey (IHS), was not able to attend the
workshop. Clark explained that the presentation focused on the original design
of the IHS but included a discussion of the challenges the United Kingdom has
faced related to the design over the years.
Responding to many of the same pressures that confront household surveys
in the United States and as part of the U.K.’s survey modernization program,
the ONS developed an Integrated Household Survey design. The basic concept
was to develop a framework in which multiple household surveys could be
integrated into a common design. In the United Kingdom, household surveys
have developed independently, much like in the United States. Each had
different objectives and different methodologies for obtaining the ideal survey
sample for a given topic area. For example, the Labour Force Survey (LFS) is
not a clustered design, whereas many of the ONS’s other household surveys are
clustered. The integrated design increases the sample size for core variables by
asking them on all the component surveys.
The design of the IHS relies on the use of modules formed from four existing
continuous household surveys: the LFS (including some regional supplementary
surveys), which serves as the IHS survey core and provides the majority of sample
cases (200,000 households); the General Lifestyle Survey (formerly the General
Household Survey); the Living Costs and Food Survey (formerly the Expenditure
and Food Survey); and the Opinions Survey (formerly the Omnibus Survey).
After the original modular design incorporated these four surveys, others, such
as the English Housing Survey, were added. The idea behind the modules was
to standardize concepts and questions across the surveys. In its current form, the
survey sample includes 265,000 households and uses a staged approach.
Figure 2-1 shows the modular structure of the surveys. The vertical axis
on the graph represents the sample cases, and the horizontal axis the different
modules and interview length. All interviews include the core survey, followed
by a rotating core. The remaining modules represent different surveys
presented to different respondents. Parts of the sample are visited quarterly over
five quarters, parts are visited annually over four years, and parts are visited
only once.
Such an undertaking, Clark noted, relies on several critical assumptions
about changes. First, the flexibility of the field staff must be increased, and
interviewers have to be trained to do all interview types. Surveys that
originally had a clustered design would ideally be unclustered so that they
can be joined with the core LFS, which also reduces the variance of
estimates.1 Content and procedures
require standardization among the surveys. Finally, increases in sample size for
core variables help to improve small-area estimation.
The expected benefits include reduced sampling variance due to increased
sample size, cost savings associated with the unclustering of the sample designs,
and two-phase calibration, which will enable the use of the estimates from the
core in calibration for components. The increased sample size of the core is
expected to produce a variance reduction of up to 20 percent for the LFS (if
fully unclustered). An unclustered design for the non-LFS surveys is expected
to reduce variance of the module variables by 2-15 percent, although this has
not yet been implemented.
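The link between clustering and variance can be sketched with Kish's
design-effect approximation, deff = 1 + (m - 1) * rho, where m is the number
of interviews per cluster and rho is the intraclass correlation. This is a
textbook approximation for equal-size clusters, not the ONS calculation, and
the cluster size and correlation below are hypothetical.

```python
def design_effect(cluster_size, intraclass_corr):
    """Kish's approximation for equal-size clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * intraclass_corr


def variance_reduction_from_unclustering(cluster_size, intraclass_corr):
    """Share of variance removed by unclustering at the same sample size."""
    return 1.0 - 1.0 / design_effect(cluster_size, intraclass_corr)


# Hypothetical values: 11 interviews per cluster, rho = 0.02.
deff = design_effect(11, 0.02)
reduction = variance_reduction_from_unclustering(11, 0.02)
print(f"design effect {deff:.2f}, variance reduction {reduction:.1%}")
```

Under these assumed values the unclustered design would cut variance by about
one-sixth, which shows why the gain depends heavily on how similar households
within a cluster are on the variable of interest.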
One of the many challenges encountered was the implementation of the
IHS in the field. Originally, an entirely new case management system was
planned as part of the field office modernization for the IHS, but the office
modernization project turned out to be too ambitious. Instead, field operations
had to fall back on existing survey systems. Given that data users do not like
to see variables dropped, another problem was that the survey core ended up
being too long to be practically administered in the field. Problems related to
inconsistencies in the survey outputs also persist. The two-phase calibration
has only been partly implemented so far. The calibration works, building in
automatic consistency, which increases the quality and usability of outputs,
but it has shown only marginal variance gains. Estimates from the IHS are
currently released as “experimental,” which allows data user input and
feedback to quality-check the procedures and outputs; they are not yet classified
as “national.”

1 The unclustered design would sample addresses directly from the Postcode Address File (PAF)
rather than selecting them from a subset of postal code sectors (Office for National Statistics, 2010).

FIGURE 2-1 Illustrative diagram of a modular continuous population survey.
SOURCE: Workshop presentation by Cynthia Clark based on Office for National
Statistics public sector information licensed under the U.K. Open Government Licence v1.0.

Although the implementation of the original design has proven to be
challenging, many of the difficulties were due to the necessary systems not being in
place. Stepping back has made survey harmonization both more important and
more challenging. Despite the difficulties, there has been considerable progress
in the design and implementation of the IHS.
DISCUSSION
Hal Stern invited the workshop attendees to ask questions of the first two
presenters. Phillip Kott (Research Triangle Institute) directed the first
question to Clark: what did the author from the ONS, Paul Smith, mean by the
unclustering of the current LFS in the United Kingdom, and how would this
save money? Clark explained that the LFS was already unclustered, and, since
it was the largest of the surveys, it made sense to move the smaller surveys to
that design. Because Clark was not the author of the presentation, she referred
to Paul Smith’s paper for additional information about the plans related to
unclustering (Smith, 2009).
Eric Bergman (Bureau of Labor Statistics) noted that there are certain
economies of scale to combining these surveys and asked whether there were
any initiatives to make the IHS mandatory. Clark responded that there were no
initiatives along those lines.
Lawrence Brown (University of Pennsylvania) asked how the integrated
survey design affected the longitudinal character of the LFS and how this
would be reflected in the other integrated surveys. Clark said that she did not
have enough information about the design of the other surveys or whether they
had longitudinal components, but the LFS in its current form is conducted
in 5 segments over the course of 15 months. Stern wanted to better understand
how modules moving into and out of the integrated survey would look over
time and if there are forecasts regarding ultimate costs for the IHS on a large
scale. Clark said she did not have an answer to those questions.
Abraham asked about the total time required to administer the survey.
Given the length of many of the surveys in the United States, it would be
difficult to see how this model could be applicable here, she said. Clark noted that
the LFS core of the IHS is approximately 20 minutes, and some of the other
modules rotate in and out.
Robert Groves (Census Bureau) made the point that there is nothing
inherent in the design of the IHS to say that questionnaire length could not be
constant across interviews, through appropriate matrix sampling of the
modules. Furthermore, although the ONS is not doing this, administrative records
could be used to guide inclusion probabilities for the matrix sampling. In other
words, there would be an administrative data-driven inclusion probability for
rotating modules.
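A minimal sketch of the matrix sampling Groves described follows. It assumes,
hypothetically, that every rotating module takes the same time and that each
respondent falls into a stratum known from administrative records, which sets
the module inclusion probabilities. Systematic unequal-probability selection
is one way to give every respondent the same number of modules while honoring
those probabilities; the stratum names, module names, and probabilities are
invented, and only the roughly 20-minute core comes from the discussion.

```python
import random

CORE_MINUTES = 20    # approximate LFS core length cited in the discussion
MODULE_MINUTES = 10  # hypothetical: every rotating module takes 10 minutes


def select_modules(inclusion_probs, rng):
    """Pick a fixed number of modules by systematic unequal-probability sampling.

    inclusion_probs maps module name -> inclusion probability in [0, 1].
    The probabilities must sum to a whole number k. Up to floating-point
    rounding at the final boundary, exactly k distinct modules are returned
    and each module is chosen with its own probability.
    """
    total = sum(inclusion_probs.values())
    k = round(total)
    if abs(total - k) > 1e-9:
        raise ValueError("inclusion probabilities must sum to a whole number")
    if any(p < 0.0 or p > 1.0 for p in inclusion_probs.values()):
        raise ValueError("each inclusion probability must lie in [0, 1]")

    start = rng.random()                    # random start in [0, 1)
    points = [start + i for i in range(k)]  # k selection points, one unit apart
    chosen, upper, next_point = [], 0.0, 0
    for module, prob in inclusion_probs.items():
        upper += prob                       # module covers [upper - prob, upper)
        if next_point < k and points[next_point] < upper:
            chosen.append(module)
            next_point += 1
    return chosen


# Hypothetical administrative-data strata; each row sums to k = 2 modules.
PROBS_BY_STRATUM = {
    "employed": {"health": 0.5, "spending": 0.5, "travel": 0.25, "training": 0.75},
    "retired": {"health": 0.75, "spending": 0.5, "travel": 0.5, "training": 0.25},
}

rng = random.Random(2011)
modules = select_modules(PROBS_BY_STRATUM["retired"], rng)
interview_minutes = CORE_MINUTES + MODULE_MINUTES * len(modules)
```

Because every respondent receives exactly two equal-length modules in this
sketch, the interview length is the same for everyone, while the mix of
modules varies with the administrative stratum.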
Andrew White (National Center for Education Statistics) asked whether
the push for integration was budgetary in nature. He also asked whether the
United Kingdom has been experiencing challenges related to household survey
data collections similar to those in the United States and whether the ONS
expects the harmonization to address these problems. Clark responded that
funding became available for infrastructure development, which represented
an incentive to embark on this project. The primary reasons for doing this were
not necessarily in response to the types of challenges described by Abraham in
connection to the U.S. household surveys, she said.
Katherine Wallman (Office of Management and Budget) commented that it
appears that the IHS was not designed with the goal of reducing response
burden. When the U.S. Government Accountability Office (GAO) has prepared
reports on the federal household survey network in the past, its perceptions
were that the surveys are duplicative and a heavy burden on respondents. The
GAO wanted to know why surveys are not combined together in a framework
similar to the IHS, but it appears that the IHS has grown out of different
considerations. It is also interesting to hear that some of the supplements are
included only periodically.
Graham Kalton (Westat) asked how difficult it was to bring together the
existing surveys and whether there was any infighting, given response burden
constraints and the probability that the sponsors of each of the existing
surveys had different interests and agendas. Clark said that in her experience this
was not a major problem. There was a significant push for harmonization and
modernization as part of the integration process, which may have facilitated
their willingness to compromise. However, she added, the integration process
has not completely succeeded yet, and the LFS still publishes its own estimates,
rather than the IHS estimates.
Alan Zaslavsky (Harvard Medical School), asking what an acceptable
“national statistic” entails, said that there are several potential problems related
to generating such a statistic. One issue might be the technical and operational
quality of the systems used to generate the statistic and whether they are
working correctly and are doing, procedurally, what they are supposed to do.
Another issue might be the acceptability of the estimation methods, as these
become more complicated than simply asking 1,000 people a question and
tabulating the numbers. He asked about the importance of these considerations
as the new methodology is implemented in the United Kingdom and whether
acceptance has been built for these new methods of estimation.
Clark responded that, in her opinion, an important part of the transition
to a national statistic is ensuring that the data remain relevant when compared
with past data and specifically ensuring that there are some mechanisms for
benchmarking and for helping users to understand the new data series. The
ONS uses quality measures similar to those used in the United States
(timeliness, accessibility, comparability, accuracy, relevance, and consistency)
in determining what will become a national statistic. If, in fact, new estimates
are adequately bridged to previous data, then, generally, after several years, a
statistic will move from an experimental one to a national one. Small-area estimation
procedures were also used for the first time in official statistics, after a period
of being considered experimental.
Barbara O’Hare (Census Bureau) asked how the federal statistical
community can move toward greater acceptance of model-based estimates, similar
to what was done with small-area estimates in the United Kingdom. Clark
suggested that the U.K. model of labeling model-based estimates as experimental
until they have gained acceptance (and can become national statistics) could
be a model for the United States as well. The Small Area Income and Poverty
Estimates (SAIPE) Program at the Census Bureau is an example of
publishing model-based estimates in the United States, but when these data were first
released, not all users were comfortable with using them. The estimates were
released because they were better than anything else available, and they were
labeled to advise data users to exercise caution when using them. However, this
does not always help in gaining acceptance.
Constance Citro (Committee on National Statistics) noted that the SAIPE
estimates are available and are being used, although it is not wise to spring new
data on users overnight. It is imperative that the statistical community have a
dialogue with data users and describe the positives of model-based estimates,
such as stability over time. Once they understand what they are dealing with,
they will want the data.
Returning to the topic of challenges related to nonresponse, Jelke Bethlehem
(Statistics Netherlands) commented that, on the basis of his 30 years of
experience working on the issue of nonresponse, he now thinks that the focus should
be on the composition of the responses, rather than on trying to improve the
response rates. If an organization spends enough time and money, it is possible
to increase the response rate, but research shows that this sometimes makes
the responses less representative of the sample. Instead, the focus should be on
measures that help balance the response.
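One measure of response composition from the survey methodology literature is
the R-indicator, R = 1 - 2 * S, where S is the standard deviation of estimated
response propensities. The source does not name a specific measure, so the
sketch below is only one possible illustration. It uses a simplified version
that estimates propensities as group response rates and omits design weights,
and the group sizes are hypothetical.

```python
from math import sqrt


def r_indicator(groups):
    """Simplified R-indicator from grouped response data.

    groups: list of (sample_size, respondents) pairs, one per group defined
    by an auxiliary variable known for the whole sample. Propensities are
    estimated as group response rates. R = 1 means every group responds at
    the same rate; lower values mean a less balanced response.
    """
    n = sum(size for size, _ in groups)
    overall = sum(resp for _, resp in groups) / n
    variance = sum(
        size * (resp / size - overall) ** 2 for size, resp in groups
    ) / n
    return 1.0 - 2.0 * sqrt(variance)


# Hypothetical sample of two equal-size groups.
before = [(500, 300), (500, 300)]  # 60 percent response in both groups
after = [(500, 450), (500, 300)]   # extra effort raises only one group

print(r_indicator(before), r_indicator(after))
```

In this made-up case the overall response rate rises from 60 to 75 percent
while the indicator falls from 1.0 to 0.7, matching the point that a higher
response rate can leave the responding set less representative of the sample.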
Abraham agreed that increasing response rates at all costs should not be
the objective, but she expressed concern about measures taken to balance a
sample. In some cases, balancing the sample along demographic variables works
well, but there may be other variables of interest for which it does not work.
She noted that the approach of balancing the sample sounds similar to a quota
sample, and experience shows that quota samples do not perform well, at least
in the case of establishment surveys. Clark added that one of the objectives of
an adaptive design of this type is to enable researchers to evaluate the
composition of the respondents, and that it helps to have paradata to be able to monitor
the sample in real time.
Citro made the point that great design ideas alone will not solve the current
problems of the federal household surveys. The success of integration depends
at least as much on systems, procedures, and cost accounting as it does on
design ideas. She referred to Clark’s discussion of the problems with the case
management system, noting that similar problems arose with the 2010 census. The
question—and challenge—for the statistical agencies is to work together to
do better than in the past in improving the basic components of the survey
“manufacturing process.”
STATISTICS WITHOUT SURVEYS?
DATA COLLECTION IN THE NETHERLANDS
Continuing the focus on foreign survey systems, Jelke Bethlehem (Statistics
Netherlands) presented an overview of the way the Dutch statistical system
collects national data and discussed the population register that serves as a
backbone to an integrated information system. He began by walking the audience
through a brief history of the census and survey systems in the Netherlands.
The Netherlands has a mandatory national register, which has been digital
since 1994. It no longer fields a census in the traditional sense, instead
conducting a virtual census, which involves information gathered from the population
register and through surveys. Demographic data are obtained from the register,
and socioeconomic data are gathered via the LFS.
Statistics Netherlands successfully uses the population register for three
main applications: (1) as a simple and quick data source for monthly population
statistics (only counts, not estimates), (2) as a sampling frame for surveys (for
persons only; households must be constructed), and (3) as a source of auxiliary
variables for weighting adjustments to correct for nonresponse.
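The third use, register variables as auxiliary information for weighting, can
be sketched with simple post-stratification: respondents in each stratum are
weighted up to the stratum's register count. This is a generic illustration
rather than Statistics Netherlands' actual procedure, and the strata, counts,
and responses are hypothetical.

```python
def poststratified_mean(register_counts, respondents):
    """Estimate a population mean using register counts as stratum totals.

    register_counts: stratum -> number of persons in the register.
    respondents: list of (stratum, value) pairs from the survey.
    """
    sums = {stratum: 0.0 for stratum in register_counts}
    counts = {stratum: 0 for stratum in register_counts}
    for stratum, value in respondents:
        sums[stratum] += value
        counts[stratum] += 1
    if any(counts[stratum] == 0 for stratum in register_counts):
        raise ValueError("every stratum needs at least one respondent")

    population = sum(register_counts.values())
    weighted = sum(
        register_counts[stratum] * sums[stratum] / counts[stratum]
        for stratum in register_counts
    )
    return weighted / population


# Hypothetical register counts and survey responses (1 = employed).
register = {"under 40": 400, "40 and over": 600}
responses = [
    ("under 40", 1), ("under 40", 0),
    ("40 and over", 1), ("40 and over", 1),
    ("40 and over", 1), ("40 and over", 0),
]
unweighted = sum(value for _, value in responses) / len(responses)
adjusted = poststratified_mean(register, responses)
print(round(unweighted, 3), round(adjusted, 3))
```

The register totals correct for the strata responding at different rates, so
the adjusted estimate differs from the unweighted respondent mean.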
Responding to increasing calls for more comprehensive, higher quality
data, Statistics Netherlands created the Social Statistical Database (SSD), an
amalgam of the population register, the LFS, the Survey on Unemployment
and Earnings, and other administrative sources. In the case of the Netherlands’
2001 census, the SSD was used with much success to meet the European
Union’s demand for greater census detail. Using the SSD, the work of putting
together a census was completed early, despite getting a late start, and at a cost
of €3 million, versus the €300 million a traditional census would have cost.
SSD data can also be linked to both survey respondents and nonrespondents.
Despite the reliance on the SSD, Bethlehem said, there is still pressure to
reduce response burden. As a result of this pressure and budget constraints,
the focus of data gathering has shifted to more secondary data collection,
mostly from registers. In this context, Bethlehem mentioned the Netherlands
Statistics Law of 2003, which stipulates that surveys should occur only when
the data are not available elsewhere. It also gives Statistics Netherlands access
to all government registers.
Naturally, the population register is not error-free, and some of the data
require substantial editing. One of the main reasons for the errors relates to
students who tend to move and not register. The fact that Statistics Netherlands
does not control the data collection is also a challenge because of a lack of
understanding of quality control and definitional problems, he noted.
The government can mandate changes in the registry data at any time, a
circumstance that can also lead to problems. The data for the construction
sector are an example of this; the sector reports its earnings via tax
administration. During a recent economic crisis, companies were allowed to change their
declarations from monthly to quarterly. This introduced a lack of comparability
and problems with the reliability of economic data in the construction sector.
To keep pace with increasing data demands and shrinking budgets and to
combat current data collection problems, new ways to collect data are under
study, Bethlehem reported. One strategy is to collect data directly from the
administrative and financial systems of companies. Another is to use radio
frequency identification (RFID) tags and global positioning systems (GPS) to
collect transportation statistics. The use of online robots that collect data from
specific websites allows for the leveraging of information already available on
the Internet. One possible use of such a robot is for the collection of price
data to produce a consumer price index. Of course, he said, there are many
questions surrounding these data collection methodologies. Do they work? Are
they legal?
Bethlehem concluded by saying that, despite opportunities for using
registers and technological advances for data collection, there will still be a need
for surveys in the future. It is likely that the surveys of the future will be
increasingly Internet-based or mixed-mode, although these present new challenges,
such as mode and selection effects, that are difficult to separate. There are other
methodologies yet to be considered, and Statistics Netherlands is keeping an
open mind about the possibilities.
CANADA’S HOUSEHOLD SURVEY STRATEGY
Jean-Louis Tambay (Statistics Canada) presented another perspective from
outside the United States, by giving an overview of the Canadian household
survey system. Table 2-1 lists major Canadian surveys with monthly data collection.
Currently, Statistics Canada has three major sampling vehicles for household
surveys: (1) the LFS area frame design, (2) RDD, and (3) a census of
population, conducted every five years. Many household surveys draw their samples
from LFS sample clusters, are administered as supplements to the LFS
questionnaire, or, to cover certain population subgroups, survey recently rotated-out
LFS sample units. Like other nations, Canada faces an increasing demand for
survey data, a demand that exceeds the current capacity of the LFS to provide
samples. New solutions are being proposed and tested to address the limits of
the current survey platform, which involve the flexibility and timeliness of
surveys (especially developing computer applications for surveys), costs, response
burden (particularly for LFS respondents), falling response rates, coverage
problems with RDD and telephone surveys, and the challenges of surveying
difficult-to-reach populations.
TABLE 2-1 Major Canadian Surveys with Monthly Data Collection

Survey                          Size                      Details
Labour Force Survey             60,000 households         6-month rotation (10,000 new cases/month);
                                (120,000/year)            telephone-first contact for 36% of new cases;
                                                          use Address Register to replace/supplement
                                                          listing activities
Canadian Community Health       65,000/year               50% CAPI (LFS area frame); 50% CATI
Survey                                                    (telephone lists); pool 2 years’ sample for
                                                          small health regions
Survey of Household Spending    20,000/year               LFS area frame
General Social Survey           25,000/year               Random digit dialing
Travel Survey of Residents      110,000/year              LFS “live” supplement
of Canada
Canadian Tobacco Use            50,000 households/year    Random digit dialing
Monitoring Survey               20,000 persons/year

NOTE: CAPI = computer-assisted personal interviewing, CATI = computer-assisted telephone
interviewing, LFS = Labour Force Survey.
SOURCE: Workshop presentation by Jean-Louis Tambay.

In response to the demand for data, Tambay said, Statistics Canada has
developed several strategies grouped under the term “New Household Survey
Strategy,” including survey integration, spreading interviewer and response
burden, development of a master sample, creation of a population frame, and
integration of listing activities. The process of survey integration includes using
a common core of questions for all surveys, harmonizing content modules,
creating a master sample, and integrating survey and census listing activities.
Spreading interviewer and response burden was achieved, in one case, by
spreading the collection period for the Survey of Household Spending over a
12-month period, rather than the 3-month collection period that was used in
the past. The Canadian Community Health Survey (CCHS) sample of 130,000
respondents was divided in half, and data collection was spread over two years,
instead of using the whole sample every other year. Finally, Statistics Canada
is considering ways to increase response options, such as offering electronic
data reporting, which is currently used for business surveys and was also tested
during the 2006 census.
Of the four options considered for the design of a master sample, it was
decided to create the sample by pooling first-phase surveys but to limit the
surveys used to just the LFS and the CCHS. The sample was created, and
a pilot survey was conducted in 2008 using an existing survey vehicle, the
General Social Survey (GSS). Tambay said that this was complex to implement
because it was difficult to develop the proper weights and variances.
Furthermore, there had to be a way to deal with samples that were not really
independent. The results were disappointing: response rates were low and
design effects were high. The master sample option was thus abandoned, and
the idea of using the census as a frame was reopened. A population frame (of
persons) created from census follow-up was considered in lieu of the master
sample design, although this type of frame also suffered from problems,
particularly privacy concerns.
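The cost of a high design effect can be made concrete with a small sketch. It relies on two textbook quantities rather than on anything reported at the workshop: the effective sample size, n / deff, and Kish's approximation for the design effect due to unequal weights, n * sum(w^2) / (sum(w))^2. The function names, sample size, and weights below are hypothetical illustrations, not results from the 2008 pilot.

```python
from typing import Sequence


def effective_sample_size(n_respondents: int, design_effect: float) -> float:
    """Size of a simple random sample with the same precision as the complex sample."""
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    return n_respondents / design_effect


def kish_weighting_deff(weights: Sequence[float]) -> float:
    """Kish's approximate design effect from unequal weights: n * sum(w^2) / (sum w)^2."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")
    return n * sum(w * w for w in weights) / (total * total)


# Hypothetical pooled sample: half the units carry three times the weight of the rest.
weights = [1.0] * 500 + [3.0] * 500
deff = kish_weighting_deff(weights)
n_eff = effective_sample_size(len(weights), deff)
print(f"deff = {deff:.2f}, effective sample size = {n_eff:.0f}")
```

In this made-up case, 1,000 interviews carry the information of about 800 simple-random-sample interviews. Combined with low response rates, losses of this kind would be one plausible reading of why a pooled design might be judged not worth its complexity.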
Integration of listing activities involves the coordination of census and the
LFS cluster listing activities via a common listing application. To aid in cluster
listing operations, Statistics Canada provides its interviewers with dwelling lists
from the Address Register (AR), which is similar to the U.S. Census Bureau’s
Master Address File. Used since the late 1980s, the AR is derived from telephone
billing files from many major telephone companies and Infodirect (similar
to a white pages compilation of all Canada), plus other smaller sources, such
as tax rebate records for new dwellings.
The AR was used to define mailout areas for the 2006 and 2011 censuses,
which account for 70-80 percent of the country. In 2004, it was also used to
replace or supplement the LFS listing in many clusters. For the 2011 census, a
continuous listing was introduced to update the AR (for the 2006 census, the
AR was updated through a full-scale block-canvassing operation that took place
the previous fall). Leading up to the census, interviewers would verify only
clusters that AR methodologists believed were in substantial need of updating,
with the assumption that about a third of the clusters would be visited for
continuous listing. This is what gave rise to the idea that if interviewers were
in the field to do listing for the census, the activity could be combined with
listing for the LFS, Tambay noted. The LFS usually conducts its own listing
activities, although for about 40 percent of the clusters, the AR is considered of
good enough quality to dispense with the initial listing. In another 20 percent
of the LFS clusters, AR dwelling lists are updated by interviewers, and in the
remaining 40 percent AR coverage is such that it is deemed preferable to have
interviewers develop new dwelling lists.
Tambay explained that the process for integrating survey and census listing
activities had three components: (1) coordination of census and the LFS listing
activities, (2) development of a common listing application, and (3) increased
use of the AR to replace or supplement the LFS listing. The coordination
component consisted of positive and negative coordination. Positive coordination
means that if a cluster for the LFS has to be listed in a certain month and the
AR has to list it sometime before the next census, then Statistics Canada tries to
coordinate the process so that the cluster is listed for the AR before it is needed
for the LFS. Negative coordination means that the listing for the AR is skipped for
clusters in which the LFS is actively interviewing.
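The two coordination rules can be expressed as a short scheduling sketch. This is only an illustration of the logic as described in the presentation, not Statistics Canada's listing application: the `Cluster` fields, the month indexing, and the `plan_ar_listing` function are all hypothetical, and the choice to let negative coordination take precedence is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cluster:
    cluster_id: str
    lfs_listing_month: Optional[int]  # month index when the LFS needs a list; None if not scheduled
    lfs_active: bool                  # True while LFS interviewing is under way in the cluster


def plan_ar_listing(cluster: Cluster, census_month: int) -> Optional[int]:
    """Return the latest month by which AR listing should be done, or None to skip it."""
    if cluster.lfs_active:
        # Negative coordination: do not send AR listers into a cluster being interviewed.
        return None
    if cluster.lfs_listing_month is not None and cluster.lfs_listing_month < census_month:
        # Positive coordination: list for the AR before the LFS needs the cluster.
        return cluster.lfs_listing_month - 1
    # No LFS constraint: the AR only needs the cluster listed before the census.
    return census_month - 1


clusters = [
    Cluster("A", lfs_listing_month=5, lfs_active=False),
    Cluster("B", lfs_listing_month=None, lfs_active=True),
    Cluster("C", lfs_listing_month=None, lfs_active=False),
]
plan = {c.cluster_id: plan_ar_listing(c, census_month=12) for c in clusters}
print(plan)
```

Under these assumptions, cluster A is listed early enough to serve both operations, cluster B is skipped, and cluster C can be listed any time before the census.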
The latest innovation at Statistics Canada is a corporate business architecture,
Tambay said. The goals are to be more efficient, robust, and better able
to respond to new developments. Two of the main principles are (1) optimizing
decision making across the organization and (2) centralizing such processes
as staff services or information technology services and infrastructure.
Several proposals for social surveys have come out of the new program,
including creating a household survey frame function and developing a social
survey processing environment that is common to multiple surveys as well as
increasing the use of electronic data reporting. The LFS is ideal for testing
electronic data reporting because survey respondents have the option of
providing an email address in their first month in the sample and then responding
for the following five months of the survey via an Internet address provided by
Statistics Canada.
To address the first proposal, the Household Survey Frame Project was created.
One activity for this project is to improve AR quality and content, which
means increasing the availability of phone numbers, maximizing AR coverage,
and increasing AR content. The plan is to achieve this in several steps.
The first is to increase the availability of phone numbers, which mostly
come from billing files and Infodirect. Phone numbers are then supplemented
with information from the census or tax data. However, the 2006 census did
not provide much more information than Infodirect already had. Telephone
numbers from tax files are also problematic because the number could be for
an accountant who prepared the return or a work number. The child tax benefit
file has proved to be a more useful source of telephone numbers, and it tends
to cover households with young children, Tambay observed.
Other indirect methods of obtaining more complete information are also
under consideration, such as matching tax records to Infodirect phone numbers
to add apartment numbers that are missing on Infodirect. Statistics Canada has
also explored a cell phone billing file, although an application to sample from
this frame has yet to be developed. One consequence of adding phone numbers
to the frame is that regional offices report that their telephone centers are
already operating at capacity with the phone numbers that
are currently in certain frames.
Statistics Canada is also attempting to expand its address resources using
such tools as municipal lists and tax forms. Frame coverage in the AR currently
is 96-97 percent, with 85 percent of these addresses being mailable. In addition,
the Canada Post Corporation Point-of-Call file, which is comparable to
the U.S. Postal Service’s Delivery Sequence File, is also a very reliable source,
especially in urban areas.
Another goal of this activity is to improve AR content by creating a person
frame. The census short form, which has household composition information,
and the tax family file, which is a file that is constructed from tax records,
can be used to construct this frame. Because people tend to declare their
children, coverage of the tax family file is about 96 percent. This file will be
used to update the census information.
The second activity of the Household Survey Frame Project is to develop a
common frame for household surveys. This would entail establishing processes
for sample management (to control respondent burden), completing integration
of the AR with the LFS area frame, and developing a methodology for the use
of phone numbers in the design of computer-assisted telephone interviewing
(CATI) surveys.
There are several keys to a more complete integration of the AR with the
LFS, Tambay noted. The first is two-way communication on new dwellings: if
any growth is identified through either the LFS or the AR, it should be
communicated to the other to get the best possible integrated address. The second
key is an ongoing attempt to integrate into the AR noncity-style addresses, for
example, postal installation addresses consisting of a type of delivery (such as
general delivery or a lock box number), municipality name, province,
and postal code. Finally, every attempt is being made to identify AR needs for
the 2014 LFS redesign.
Researchers are currently attempting to develop a methodology for the use
of phone numbers from the improved frame in the design of CATI surveys,
although this work is still in the planning stages. The goal is to pilot this
methodology on the General Social Survey in fall 2011.
For the future, Tambay said, the next thing to consider may be sample
coordination (rather than coordination only for frames). Tied to the LFS redesign
is the redevelopment of the generalized sampling system. Statistics Canada
would also like to develop a new system for selecting dwellings. For the portion
of the LFS that can utilize the AR, options include keeping this frame current
by updating it from administrative sources and forgoing listing, taking simple
random samples of subclusters, and coordinating samples with other surveys to
avoid visiting the same respondent too often.
DISCUSSION
Chester Bowie (National Opinion Research Center), session discussant,
observed that one of the themes of the morning’s session that sets the context
for the rest of the workshop is that surveys have become more complex and
difficult over the past 10-15 years. A number of factors drive this complexity:
quality and cost concerns related to sampling frames; increasing nonresponse
rates; privacy and confidentiality concerns; and rising survey costs, with
concurrently shrinking budgets. The statistical community is also not yet sure how to
best use administrative data or model-based estimates. Each of the countries
represented at the workshop is addressing these issues differently.
The United Kingdom has standardized and integrated its major household
surveys. This is an intriguing idea, Bowie said, but such a system would be
much more difficult to implement in the United States, where the statistical
system is more decentralized. Several past attempts to standardize basic
demographic questions across surveys at the Census Bureau were unsuccessful
because each survey sponsor had its reasons for wanting to ask a specific question
in a particular way.
The Netherlands Social Statistical Database is interesting because it is a
move away from surveys toward population registers, Bowie said. This lowers
survey costs, but there are issues inherent in gathering data this way. Canada
has addressed some of its challenges through the use of master sample frames
and samples, integrated listing activities, and household survey sample
coordination. Some of these strategies are unique.
Some have argued that the current approach to conducting household
surveys in the United States is unsustainable. Bowie reiterated that this problem
is the focus of the workshop and that serious thought should be given to what
can be done in the future to address it.
Hermann Habermann (Committee on National Statistics) sought clarification
on the use of population registers in the Netherlands. If it was a distrust of
government that made people wary of censuses, how was a register received? A
register can be perceived as even more pernicious than a census. Bethlehem said
that there has always been a good population register in the Netherlands. This
became an issue during World War II because religion was recorded on the
register and, when the Germans invaded, they were able to easily identify Jews
in the country using the register. Today, there is a variety of registers, and they
do not seem to bother people anymore. Many, if not most, people in the
Netherlands may be in registers without even knowing it.
A follow-up to Habermann’s question concerned the political discussion
on using registers instead of surveys in the Netherlands. Were privacy advocates
concerned that the combination of registers would be a threat to privacy?
Bethlehem responded that the only political discussion was about reducing
the administrative burden of government. No privacy issues were raised when
the bill was proposed in Parliament, and the public really does not seem to be
concerned about it.
Wallman asked if registering was mandatory in the Netherlands, as it is in
Germany. She wondered whether there would be an adverse reaction to such
a requirement in the United Kingdom or the United States. Bethlehem again
noted that most people in the Netherlands probably do not even realize that
they are in the population register. The only time citizens encounter the register
is when they have to renew a passport or when they move and they are required
to fill out a form on the Internet. In situations like that, it can become a problem
if they are not in the register. However, the fact that the register is mandatory
has never surfaced as an issue.
Tambay recalled a case in which a journalist discovered that the department
that administers unemployment benefits in Canada had been maintaining a data
file on the labor force. The Canadian government publishes what files are used
by which government departments every year, so the existence of this file was
always public information. Yet, when the journalist brought attention to this,
a scandal followed that affected subsequent data collection efforts, because
fewer people were willing to share information with this particular department
after the incident. The department was also ordered to destroy the file because,
although the existence of the file was always public, information about how the
data were being used was found to be insufficiently transparent.
Robert Kominski (Census Bureau) suggested that a synthetic register, or
one compiled from several data sources, may be a viable concept in the United
States. There are already many data systems here, and these could be used to
develop an effective register. An example of an existing register in the private
sector is the charge card registration system, which includes point-of-purchase
data and other information. The banks are authorized by the federal government
to collect these data, and the federal government could say that these data
are within its purview. Kominski added that perhaps this is a radical idea, but
the purpose of the workshop is to think broadly.
He went on to say that, in the current political climate, U.S. residents
might be willing to give up their privacy and register, if they thought that such
a system would prevent public services from being delivered to those who “do
not deserve them.” Some people might do this to obtain greater security or,
in their eyes, fair administration of state and federal goods and services. Some
might be offended by these ideas, he said, but there is a very large segment of
the population that would not be.
A workshop participant noted that even if only 5 percent of the population
refused to get an identity card or register, that is still 5 percent of the
population that would be missing, which would ordinarily be considered
unacceptable.
Wallman did not think the issues surrounding registers were necessarily
related to whether or not the registration was mandatory, but rather, in talking
to colleagues in other countries, whether or not the register was tied to certain
benefits. For example, eligibility for child care in the Netherlands is entirely tied
to the registration of that child. Such a setup would have a huge impact here.
There may be pros in addition to the cons typically associated with registers,
she said.
Lawrence Brown cited the example of Israel, which has a census as well as
a registration system. Although this system is far from perfect, particularly for
households, the government is building a secondary system of dual-system estimation
to correct the registry lists for census purposes. A question that remains,
however, is how a system like this can be built into a household data system
with the same effectiveness. Another question pertains to inaccuracies in the
registration system. Although the register in the Netherlands enables a count of
the population, there do not seem to be good address records. He asked: Would
it be better to have a dual-system follow-up to correct these inaccuracies?
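For readers unfamiliar with dual-system estimation, a minimal sketch of the classic two-list (Lincoln-Petersen) estimator follows. It is a textbook form that assumes the register and the coverage survey capture people independently and that records can be matched without error; it is not a description of the Israeli system, and the function names and counts are hypothetical.

```python
def dual_system_estimate(n_register: int, n_survey: int, n_matched: int) -> float:
    """Lincoln-Petersen estimate of population size from two overlapping lists."""
    if n_matched <= 0:
        raise ValueError("at least one matched record is required")
    if n_matched > min(n_register, n_survey):
        raise ValueError("matches cannot exceed the size of either list")
    return n_register * n_survey / n_matched


def estimated_register_coverage(n_survey: int, n_matched: int) -> float:
    """Share of surveyed persons who were also found in the register."""
    return n_matched / n_survey


# Hypothetical area: 900 people on the register, 800 found by an independent
# coverage survey, and 720 of those matched to register records.
population_hat = dual_system_estimate(900, 800, 720)
coverage_hat = estimated_register_coverage(800, 720)
print(f"estimated population = {population_hat:.0f}, register coverage = {coverage_hat:.0%}")
```

The same matching exercise that yields the population estimate also indicates how much of the register is missing or misplaced, which is the kind of correction the question above is asking about.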
Bethlehem said there were about 2,000 persons in the Netherlands not in
the national register, and they are most likely illegal immigrants. About 15 percent
of the register records contain errors, but these errors come from incorrect
addresses. If someone is listed at an incorrect address in the register, this can
become a problem for them should they wish to, for example, get a passport.
Because people depend on the register to receive services, it tends to be fairly
accurate. Statistics Netherlands defines survey populations to be the population
in the register, so the sampling frame completely fits the population. There
is also a database of information for weighting adjustments. Whether to include
illegal immigrants in the count and in surveys is a decision each country
has to make.