3
Plans for the 2010 Census
The 2010 census has an innovative design, resulting in a census that will differ from its predecessor to a very substantial degree. Though plans for the 2010 census remain tentative, it is useful for the panel’s analysis to be able to compare the timetables for main activities in the 2000 and in the 2010 censuses. Box 3-1 contains a cross-walk of the timetables for the 2000 and 2010 censuses.
MAJOR DESIgN CHANgES
Four significant differences in design will considerably affect how the 2010 census coverage measurement (CCM) program needs to differ from the 2000 coverage measurement program: a short-form only census, an improved system for the Master Address File (MAF) and the topologically integrated geographic encoding and referencing (TIGER) database, coverage follow-up interviews, and removal of duplicate enumerations during the census.1
A Short-Form Only Census. Since 2005 the Census Bureau has been fielding the American Community Survey (ACS), a continuous version of the decennial census long form. Therefore, under current plans there will be no long form in the 2010 census. This change will facilitate several aspects of data collection in the census, including data capture, the work of follow-up enumerators, the management of foreign language forms and foreign language assistance, and data editing and imputation for nonresponse.

1 The discussion in this chapter is based on the Census Bureau’s plans for the 2010 census as of spring 2008.

BOX 3-1
Cross-Walk of Schedules

Activity                     2000 Census             2010 Census
LUCA(a)                      LUCA 98: 05/98–09/99    Ship materials
                             LUCA 99: 01/99–10/99    11/06/07–3/18/08
                                                     Updates:
                                                     9/25/07–10/08/08
                                                     [Note: Some materials sent earlier than 11/06/07]
MAF Block Canvass            01/99–05/99             04/06/09–07/10/09
Questionnaire Mailout        03/13/00–03/15/00       03/15/10–03/17/10
NRFU(b) Begins/Ends          04/00–07/00             05/01/10–07/10/10
CEFU(c) / CFU(d)             05/00–07/00             04/26/10–08/13/10
CIFU(e)                      07/00–08/00
Coverage Measurement         05/00–08/00             08/14/10–10/02/10
  Personal Interviews

(a) LUCA: Local Update of Census Addresses
(b) NRFU: Nonresponse Follow-Up
(c) CEFU: Coverage Edit Follow-Up
(d) CFU: Coverage Follow-Up
(e) CIFU: Coverage Improvement Follow-Up
SOURCES: Census 2000 Operational Plan, December 2000, U.S. Department of Commerce, Economics and Statistics Administration, U.S. Census Bureau; 2010 Census Key Operational Milestone Schedule.
The ACS has been mentioned as a possible survey vehicle for coverage measurement. We agree that there may be some potential for use of the ACS to help assess the quality of dual-systems estimation (DSE), or to help more broadly in coverage evaluation. However, some problems would need to be overcome in applying the ACS in this way. First, the address files for the ACS and the census are very closely related, so at present the ACS could not be used to estimate whole household omissions. In addition, the ACS questionnaire is not focused on coverage measurement, as is that for the CCM. Finally, the ACS has a different definition of residence than the census, which would cause some additional, albeit minor, complications.
Improved MAF/TIGER System. In outline, the MAF begins with the final address list developed in concert with the taking of the previous census. This list is updated on a fairly continuous basis to reflect additions and deletions to the U.S. Postal Service’s Delivery Sequence File. For the 2000 census, local areas were provided the opportunity in 1998 and 1999 to make additions and deletions based on local information, which was referred to as the Local Update of Census Addresses (LUCA) Program. A year prior to the census, a block canvass was carried out to determine the accuracy of the address list. In addition to these procedures, there were more than a dozen other ways in which an address could be added to the MAF. There were numerous questions about the completeness and the accuracy of the MAF listings for the 2000 census (see, e.g., National Research Council, 2004b:Finding 4.4), and efforts are now under way to improve both MAF and TIGER for 2010.
The MAF and TIGER databases have been redesigned into a single MAF/TIGER database: MAF provides a list of household addresses, and TIGER is used to associate each address on the MAF with a physical location. The MAF/TIGER Enhancement Program includes: (1) the realignment of every street and boundary in the TIGER database; (2) development of a new MAF/TIGER processing environment and the integration of the two previously separate resources into a common technical platform; (3) collection of global positioning system coordinates for structures on MAF; (4) expansion of geographic partnership programs with state, local, and tribal governments, other federal agencies, the U.S. Postal Service, and the private sector; (5) implementation of a program to use ACS enumerators to generate address updates, primarily in rural areas; and (6) the use of periodic evaluation activities to provide quality metrics to guide corrective actions (for details, see Hawley, 2004). One motivation for this initiative was the recognition by the Census Bureau that many census errors and inefficiencies in 2000 resulted from errors in the MAF and in the information on the physical location of addresses.
Coverage Follow-up Interviews. The Census Bureau is greatly expanding the percentage of housing units for which there will be a coverage follow-up interview in 2010 in comparison with the housing units in 2000 for which there was a coverage edit follow-up. The 2000 coverage edit follow-up was used to determine the correct count and characteristics for two situations: households with more than six residents (since the census form had space for information for only six persons) and households with count discrepancies (e.g., differences between the number of separate people listed on the questionnaire and the indicated total number of residents). The planned expansion in 2010 is motivated by the recognition that confusion about residence rules was a substantial source of census coverage error.
The expanded coverage follow-up interviews planned in 2010 will include four situations in addition to the two covered in 2000: (a) households with a possible duplicate enumeration identified by a computer match of the census returns to themselves; (b) households that respond positively to a coverage probe on the census questionnaire concerned with census omissions; (c) households that respond positively to a coverage probe on the census questionnaire concerned with census erroneous enumeration and duplication; and (d) households whose counts differ from those in StARS, a census-developed population register based on merged administrative records.
Of these four situations, (a) is intended to identify both households containing duplicated individuals and fully duplicated households, (b) is intended to identify potential omissions in the census, (c) is intended primarily to identify duplicated individuals, and (d) is intended to identify all types of coverage error; see Box 3-2.
BOX 3-2
Situations Potentially Generating a Coverage Follow-Up Interview

Coverage follow-up interviews could result from any of the following six situations:
1. Count discrepancies in which the indicated total number of residents does not equal the number of individuals for whom information is provided on the census questionnaire.
2. Large households, where the number of residents is larger than six, which is the maximum number of individuals with space for characteristics information on the census questionnaire.
3. Positive result from the national duplicate search, i.e., where individuals in a household unit match the data for individuals in another household.
4. Positive response to the coverage probe for census omissions, namely: “Were there any additional people staying here Census Day that you did not include in Question 1?”
5. Positive response to the coverage probe for census overcounts, namely: “Does person P sometimes live or stay somewhere else?”
6. A count discrepancy between the census count for a housing unit and the count from a roster produced from merging administrative records.
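
The six situations in Box 3-2 can be read as a set of separate checks applied to each census return. The Python sketch below is illustrative only: the record layout, the field names (stated_count, persons_listed, and so on), and the rule that a count discrepancy is checked only for households that fit on the form are assumptions made for this example, not a description of Census Bureau processing.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_PERSONS_ON_FORM = 6  # space for characteristics of six people (Box 3-2)


@dataclass
class CensusReturn:
    # Hypothetical record layout, for illustration only.
    stated_count: int                          # total residents indicated on the form
    persons_listed: int                        # people for whom information is provided
    duplicate_search_hit: bool = False         # matched in the national duplicate search
    omission_probe_yes: bool = False           # positive answer to the omission probe
    overcount_probe_yes: bool = False          # positive answer to the overcount probe
    admin_record_count: Optional[int] = None   # roster count from administrative records


def cfu_situations(r: CensusReturn) -> List[int]:
    """Return the Box 3-2 situation numbers (1-6) that apply to a return."""
    hits = []
    # Assumption: a count discrepancy is checked only when the household fits
    # on the form; larger households fall under situation 2 instead.
    if r.stated_count <= MAX_PERSONS_ON_FORM and r.stated_count != r.persons_listed:
        hits.append(1)
    if r.stated_count > MAX_PERSONS_ON_FORM:
        hits.append(2)
    if r.duplicate_search_hit:
        hits.append(3)
    if r.omission_probe_yes:
        hits.append(4)
    if r.overcount_probe_yes:
        hits.append(5)
    if r.admin_record_count is not None and r.admin_record_count != r.stated_count:
        hits.append(6)
    return hits
```

A return that triggers none of the checks yields an empty list; a return can also qualify under several situations at once, which matters when qualifying households have to be prioritized.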
Due to resource and time constraints, the Census Bureau will follow up only those households that provide a telephone number. In addition, the Census Bureau may only be able to administer the coverage follow-up interview to a “most promising” subset of the qualifying households in 2010. In other words, the Census Bureau may have to set priorities by selecting a subset of the qualifying households that are more likely to provide information that would result in coverage improvements.
Removal of Duplicate Enumerations During the Census. As noted above, the coverage follow-up interviews will be used to collect more information on suspected duplicate enumerations that are identified through use of a national computer search, with the objective of determining whether they are in fact duplicates and, if so, which of the addresses (if either) is the correct residence. If the correct residence is identified, the enumerations at the incorrect residence would be removed from the census.
This new census design has some benefits for the coverage measurement program in 2010. Focusing on the collection of short-form data will likely improve the quality of the information collected, thereby reducing the frequency of errors made in matching of the postenumeration survey (PES) to the census. Also, implementing a national search for, and field verification of, duplicate enumerations should reduce the number of duplicates in the census, which may in turn facilitate the estimation of components of coverage errors in the census and may also simplify the application of the net coverage error models used in DSE in 2010.
TREATMENT OF DUPLICATES
Census Duplications
There are many different causes of duplication in a census. As noted above, the census process may enumerate people who move shortly before, on, or shortly after Census Day at both their previous and their current residences; the census may enumerate families with second homes at both residences; the census may enumerate college students both at their college residence and at their parents’ homes; and the census may enumerate “snow birds” at both their primary residences and at their winter homes. These are all examples of confusion over where someone’s correct census residence is.
Another cause of duplication in the census is representation of an address in more than one way on the MAF or having two forms returned for the same unit, which can happen in multiple ways. To address duplication in the census (in addition to attempts to measure its frequency in the coverage measurement program), efforts have been made to adjust various census processes to reduce the frequency of duplication. An example is the primary selection algorithm, used in both the 1990 and 2000 censuses and planned for use in 2010, which removes duplicate responses in the census from the same housing unit by identifying the unique people who were enumerated across all responses keyed to that housing unit. Also, the census questionnaire has been adjusted in attempts to reduce misunderstandings of census residence rules (in particular, through the addition of the two coverage probes), and various efforts have been made to reduce duplication in the MAF. Yet preventing census duplication before it occurs is still a nontrivial task, and it was a serious problem in 2000 (for details, see National Research Council, 2004b).
As noted above, the Census Bureau in 2010 will attempt to identify and delete duplicate persons and housing units during the census. Specifically, after the primary enumeration process and nonresponse follow-up are complete, a nationwide computer search of census enumerations for matching individuals will be carried out, using name, date of birth, gender, and phone number (when available). On the basis of the results of that search, the Census Bureau will identify likely duplicates in the 2010 census. Depending on the geographic proximity of the two residences in question and the duplicate status of the other residents of a household, this process may also be used to identify suspected duplicate housing units.
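
To make the idea of a threshold-based search concrete, the following Python sketch scores a pair of person records on the four characteristics named above. It is a minimal illustration under stated assumptions: the agreement weights, the threshold value, the dictionary record layout, and the treatment of missing fields are all hypothetical, since the specific matching algorithm and threshold had not been chosen.

```python
# Hypothetical agreement weights and threshold, for illustration only.
WEIGHTS = {"name": 4.0, "dob": 4.0, "gender": 1.0, "phone": 3.0}
STRICT_THRESHOLD = 9.0


def norm(value):
    """Normalize a field for comparison; empty or missing values become None."""
    return value.strip().lower() if value else None


def match_score(a, b):
    """Add a field's weight on agreement, subtract it on disagreement.

    Assumption: a field missing on either record contributes nothing.
    """
    score = 0.0
    for field, weight in WEIGHTS.items():
        va, vb = norm(a.get(field)), norm(b.get(field))
        if va is None or vb is None:
            continue
        score += weight if va == vb else -weight
    return score


def is_potential_duplicate(a, b, threshold=STRICT_THRESHOLD):
    """Flag the pair for follow-up only if the score reaches the threshold."""
    return match_score(a, b) >= threshold
```

Under these illustrative weights, two records that agree on name and date of birth but are both missing gender and phone number score 8.0 and fall below the threshold, which shows how item nonresponse can keep a true duplicate from being flagged when the threshold is strict.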
Once this list of potential individual and whole household duplicates is generated, the plan is to collect more information by telephone through coverage follow-up interviews at both residences. The interviews will attempt to ascertain which (if either) of the two enumerations is correct and which is a duplicate. (See Box 3-2 for additional details on circumstances that can generate a coverage follow-up interview.)
Because Title 13 (of the U.S. Code) privacy protections prohibit using information from one housing unit in querying another, extensive probes will be used for handling a wide variety of complex living situations that may be associated with the potential duplication in question (e.g., part-time residents, students away at college, movers, children in joint custody, and elderly people in nursing homes). In particular, the interviewers cannot be told which people in a house are likely duplicates to help guide the interviews. To reduce costs, as noted above, the coverage follow-up interviews will only be carried out by telephone, and so will not be carried out for households that did not provide their telephone number on their census questionnaires. This approach will prevent a modest, but not insubstantial, percentage of the existing duplicates from being identified and therefore being removed from the census.
Though the specific algorithm and the accompanying threshold for designating matching individuals have not been chosen, the Census Bureau intends to set a strict threshold before the records for two individuals will be identified as a possible match and therefore trigger a coverage follow-up interview. In addition to the strict threshold for designating potential matches, one of the enumerations of a potential duplicate pair will not be deleted from the census unless the evidence collected from the coverage follow-up interview is clear that the individuals are duplicates and which of the residences is correct given the census residence rules.
It is unclear to the panel precisely how the information collected in the coverage follow-up interview will be used to discriminate between a duplicate enumeration and a nonduplicate enumeration and to determine the correct census residence. Moreover, the process will provide an asymmetric treatment of coverage errors in that the error that results in the removal of a valid enumeration will be judged as being more serious than the error that results in the retention of a census duplicate. The panel acknowledges that this asymmetry can be partly supported given the nature of decennial census counts: that is, because the political environment in which the census operates reacts differently to these two types of error. However, both errors need to be measured and the trade-off evaluated to determine if it is reasonable or needs to be reconsidered.
Joint Custody Scenario. As an example of what might happen in 2010, consider the following situation involving a child in joint custody, when each parent considers his or her own home to be the child’s primary residence. In this situation, the coverage follow-up interviews might well collect information that supports the same two residences as the census reported. The Census Bureau will strongly suspect that it is a duplicate pair but will be unable to delete either enumeration given the lack of a way to identify the correct residence.
Given the political sensitivity of the deletion of a census enumeration, a coverage follow-up interview is required for deletion of one of a duplicate pair, even if the duplicate status is essentially unambiguous given the above matching characteristics, and even if the residence rules are clear as to which residence is correct.
In the case of potential whole housing unit duplicates, field inspection will be used to determine if two housing units are duplicates. Potential duplicate housing units (or households) may result from: (1) duplicate addresses for the same housing unit, (2) delivery mix-ups in apartments, (3) movers, and (4) person duplication of all members of a housing unit. When field follow-up is used to verify duplicate housing status, there will be no associated telephone coverage follow-up interviews for the individual residents. If duplicate addresses are discovered for the same physical unit, one of the two enumerations will be deleted. In the case of a delivery mix-up, the duplicate is retained as a field imputation.
The Census Bureau’s proposed use of the coverage follow-up interviews and the field validation of potentially duplicated whole housing units raise at least two major questions:
1. Given that the coverage follow-up interviews will be by telephone only, what are the anticipated effects on duplicate resolution and other enumeration problems that are addressed by the use of the interview?
2. What are the rules for deleting whole housing units that are identified as potential duplicates?
The coverage follow-up interviews and the national search for duplicates could provide substantial benefits over previous censuses in identifying and removing many census duplicates during the field enumeration and in reducing the occurrence of other census coverage errors. However, there are many potential complications in implementation that might limit the benefits from the introduction of these processes in 2010. Since these activities are inherently national in scope, they could not be comprehensively tested in the relatively limited environment of a census test or the 2008 census dress rehearsal. In particular, such environments are unlikely to provide very good estimates of the extent to which these new activities will stress the census infrastructure (e.g., because of the number of coverage follow-up interviews that will be required). However, they could provide information as to how to set the threshold for determining when potential duplicates have characteristics that are close enough to warrant a coverage follow-up interview.
The panel has three concerns about the planned coverage follow-up interviews. First, will there be sufficient resources to support the interviews for all the situations that have been identified as potentially requiring such follow-up? Second, although the questions on the coverage follow-up interviews are more detailed than those on the census, will the similarity of the questions result in the relatively infrequent collection of information that would support changes in census enumeration status? Note that, in some sense, a respondent in the follow-up interview needs to admit that the previously provided information on the census form was incorrect. Third, given that follow-up interviews will only be done for households that provide telephone numbers, what will be the effects of not following up households that did not provide telephone numbers? In addition, as noted above, the panel is concerned that because the threshold for matching will be set relatively high, some duplicates will not appear to have characteristics that match and therefore will not trigger a coverage follow-up interview. This is particularly noteworthy because there may be demographic groups or geographic areas with a concentration of duplicates.
Careful evaluation of both the coverage follow-up interviews and the national search for duplicates is extremely important so that the functioning of these processes in 2010 is fully understood and to carefully guide any needed improvements for both of these processes prior to their use in 2020 (if, as we would anticipate, they are included in the 2020 census design). To support a careful evaluation, there is a need, at least for a sample of cases, to retain information as to precisely what happened to the cases that were selected for coverage follow-up interviews and as a result of the interviews. Therefore, the Census Bureau should save the responses, at least for a sample of enumerations, to the coverage follow-up interview questions and the final decisions made regarding the assessment of enumeration status.
In addition to retaining information on the functioning of the follow-up process, it will also be important to know the extent to which the follow-up process moved census counts closer to the truth. The CCM provides a unique resource to assist in determining the situations for which the coverage follow-up process worked well and those for which it worked poorly. So, for enumerations in the E-sample (census enumerations that are in the P-sample, the postenumeration survey clusters), it would be very useful to retain a comprehensive log of their status prior to and after the coverage follow-up interviews. With this information, the CCM can provide a formal way of measuring the probabilities of proper and improper duplicate removal and proper and improper duplicate retention, and it can therefore provide an assessment of the decision process that was used to determine the cases that were deleted as duplicates and the cases that were retained.
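
As a rough illustration of the four probabilities just named, the Python sketch below tabulates them from a retained log. It assumes, hypothetically, that each logged E-sample case can be reduced to a pair (is_true_duplicate, was_removed), with the first value taken from the CCM determination and the second from the census outcome; the function name and log format are inventions for this example.

```python
from collections import Counter


def removal_rates(log):
    """Estimate removal and retention rates from (is_true_duplicate, was_removed) pairs.

    Rates for true duplicates are conditional on the case being a duplicate;
    rates for valid enumerations are conditional on the case being valid.
    A rate is None when its denominator is zero.
    """
    counts = Counter(log)
    dup_total = counts[(True, True)] + counts[(True, False)]
    valid_total = counts[(False, True)] + counts[(False, False)]

    def rate(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "proper_removal": rate(counts[(True, True)], dup_total),
        "improper_retention": rate(counts[(True, False)], dup_total),
        "improper_removal": rate(counts[(False, True)], valid_total),
        "proper_retention": rate(counts[(False, False)], valid_total),
    }
```

Reporting improper removal and improper retention side by side is what allows the asymmetric treatment of the two errors to be examined rather than assumed.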
In addition to using the CCM for this purpose, it may also be valuable for the Census Bureau to return to the field to examine a subsample of cases selected for coverage follow-up interviews to see whether the interviews actually provided new information with any appreciable frequency and whether that new information led to correct decisions. Such a study should be designed to include a large fraction of census duplicates.
Regarding the national search for duplicates, it would be useful to learn more about those cases that were near but still below the threshold and therefore were not selected for coverage follow-up to determine whether other thresholds would have provided better results. One could sample from the cases near but below the threshold and follow them up to assess whether any were duplicates and whether a field interview likely would have determined that. Such data collection would inform a cost-benefit analysis of the trade-off of identifying more true duplicate enumerations against the cost of the additional field work (and the erroneous identification of more false duplicates).
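
The arithmetic behind such a cost-benefit comparison can be sketched as follows. This is a simplified illustration, not a proposed evaluation design: it assumes a simple random sample from the band of cases just below the threshold, a single hypothetical unit cost per follow-up, and function and parameter names invented for the example.

```python
def band_estimate(band_size, sample_size, sample_true_dups, cost_per_followup):
    """Project sample results onto all cases just below the threshold.

    band_size: number of candidate pairs in the near-threshold band
    sample_size: number of those pairs followed up in the evaluation sample
    sample_true_dups: sampled pairs confirmed as true duplicates
    cost_per_followup: assumed unit cost of one additional follow-up
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    est_true_dups = band_size * sample_true_dups / sample_size
    est_false_candidates = band_size - est_true_dups
    total_cost = band_size * cost_per_followup
    cost_per_dup = total_cost / est_true_dups if est_true_dups else float("inf")
    return {
        "est_true_duplicates": est_true_dups,
        "est_false_candidates": est_false_candidates,
        "total_cost": total_cost,
        "cost_per_true_duplicate": cost_per_dup,
    }
```

Repeating the calculation for several candidate bands would show how quickly the cost per additional true duplicate rises as the threshold is lowered.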
This discussion has been focused on the use of the coverage follow-up interviews to determine duplicate status. However, the interviews will also be generated by a coverage probe for census omissions (i.e., “Were there any additional people staying here Census Day that you did not include in Question 1?”). This probe is likely to have the same problem as that encountered in the search for duplicates, namely, that it will often result in the same information as the census. Therefore, it also will be important to evaluate the resolution of households whose selection for follow-up interviews is generated by the coverage probe for census omissions. These evaluations will help measure the degree to which the panel’s concerns (noted above) are a problem. Looking ahead, to further improve on the use of this new process, the panel also believes it is important for the Census Bureau to undertake research during the postcensal period in the following areas:
• the potential for StARS to help target the cases that are included in the set of coverage follow-up interviews;
• the potential for StARS to help resolve potential duplicate enumerations;
• the potential use of StARS to augment the CCM personal interviews for resolving duplicate status;
• how to optimally set the “bar” for inclusion in the coverage follow-up interviews;
• how best to discriminate between person and whole household duplication; and
• in general, how to evaluate census unduplication procedures.
With respect to the first three issues above, see further discussion below.
Given the late date, it may be difficult to comprehensively evaluate these
three suggestions, but it should be feasible to make some progress on
each.
CCM Duplications
Duplications will occur not only in the census but also in the CCM survey data collection. There are some differences in the treatment of a possible duplicate enumeration in the census and in the CCM. Some of these differences in approach are due to the dramatically different sizes of the two activities: coverage follow-up interviews could be used for between 10 and 30 million households; in contrast, the CCM will cover about 300,000 households. We note several consequences of these differences.
First, since one of the components of census coverage error is omissions, and since, as pointed out below, the estimation of net coverage error is needed to estimate the number of census omissions, the Census Bureau should and will use an unbiased approach to its assessment of duplicate status in the CCM, in the sense of avoiding any differential bias in assessing the number of census omissions in comparison with the number of census overcounts. Second, because the CCM does not have to make a final determination of enumeration status, as the census does, the CCM can assign probabilities of being a duplicate to cases with unresolved duplicate status or similarly to cases with unresolved correct enumeration status in the E-sample or unresolved residency status in the P-sample.
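
The second point can be illustrated with a small Python sketch contrasting two ways of counting duplicates among a set of cases. It is an assumption-laden example: the probabilities are invented, the 0.5 cutoff is arbitrary, and the function names are hypothetical. Resolved cases carry a probability of 0 or 1; unresolved cases carry a value in between.

```python
def expected_duplicates(probabilities):
    """Estimated duplicate count when unresolved cases keep their probabilities."""
    return sum(probabilities)


def forced_duplicates(probabilities, cutoff=0.5):
    """Duplicate count when every case must be declared a duplicate or not."""
    return sum(1 for p in probabilities if p >= cutoff)
```

With three unresolved cases at probability 0.4, a forced decision at the 0.5 cutoff counts none of them as duplicates, while the probability-weighted total credits them with 1.2 duplicates in expectation; avoiding that kind of systematic shortfall is the sense in which the CCM can avoid differential bias.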
The in-person follow-up interviews for initially nonmatching CCM cases should often prove useful in reducing the number of duplicates in the CCM P-sample. It would also be useful to use information from the coverage follow-up interviews to reduce duplicates in the P-sample. However, due to concerns about statistical independence between the census and the postenumeration survey data collections, the Census Bureau does not currently plan to use information from the census coverage follow-up interviews to help ascertain duplicate status in the CCM. The panel does not agree with this decision: it does not understand why census information should not be used to assist in such determinations, since the goal is the proper estimation of the frequency of P-sample matches and E-sample correct enumerations.
College Student Scenario. To help make these issues more clear, consider the following example involving a 19-year-old college student. Assume that the student is counted in the census at his or her parents’ house and also at his or her university in a different city. Also assume that the responses to the coverage probe dealing with overcounts on the census form for the parents’ home do not indicate that the student may sometimes live at an alternative address (the university). If the student’s name is relatively common and some of the other characteristics do not match (possibly due to nonresponse), a coverage follow-up interview may not take place, since the degree of agreement may not reach the high threshold for a coverage follow-up interview. In this case, the duplicate enumeration would remain in the census. However, if the student’s name is relatively uncommon, the degree of agreement may result in a follow-up interview. In that case, the parents, when interviewed, could still assert that the student lives with them, in which case the student’s duplicate enumeration would still remain in the census. However, the CCM threshold for identifying potential duplicates is almost certainly going to be lower than that for the coverage follow-up interview, which may therefore trigger an in-person follow-up interview, which might resolve the case. Assuming that the parents’ home is in the CCM survey, the parents are also likely to incorrectly respond to both the CCM interviewers and the
A second approach would be to have the coverage follow-up interviews occur either before or after the CCM interviews, but apply the CCM coverage measurement program to the census before coverage follow-up interviews. This approach is referred to as evaluating a truncated census, since the definition of the census for purposes of coverage evaluation is the census that existed prior to the follow-up interviews. Any enumerations added by carrying out coverage follow-up interviews after the CCM interviews were completed could be treated the way “late additions” were treated in 2000: that is, removed from the census for purposes of coverage measurement. A problem with this approach is that if the coverage follow-up interview adds an appreciable number of people or corrects the enumerations of an appreciable number of people, one is evaluating a truncated census that is substantially different from the actual census. Also, if these additions or corrections are considerably different in coverage error characteristics in comparison with the remainder of the population, it would add a bias to the dual-systems estimates. As defined, one could include the coverage follow-up interviews that occurred prior to the CCM interviews in the truncated census, in which case the net coverage error models could condition on whether a follow-up interview was carried out prior to the CCM interviews: this would remove any bias if the P-sample inclusion probabilities depended on the occurrence of the coverage follow-up interviews (but not on their outcome; for details, see Bell, 2005). Information on what the interviews added from outside the CCM blocks also could be used in these models. There are some operational complexities to this idea, including the need to duplicate the formation of relatively large processing files. Finally, as mentioned above, one is not evaluating the complete census: consequently, to assess components of census coverage error resulting from the application of the later changes from the coverage follow-up interviews, one would need to carry out a separate evaluation study outside the CCM blocks, which is a serious disadvantage.
A third approach is not to use coverage follow-up interviews in the
CCM blocks. This approach avoids any contamination, but then the CCM
evaluates an incomplete census, with essentially the same problems listed
for the second approach, and it is worse because no results from
coverage follow-up interviews could be used.
A fourth approach is to let the coverage follow-up and CCM interviews
occur in whatever order they do and treat contamination in net coverage
models as a constant effect times an indicator variable for which of the two
interviews comes first. The difficulty with this approach is that the effect of
whichever interview comes second is not clear, so it is not clear that
contamination can be effectively modeled through use of a constant effect.
For example, contamination might be subject to various interaction effects.
A fifth approach is to delay the CCM interviews until the coverage
follow-up interviews are complete. Such a delay solves the contamination
problem, but it introduces other problems. For example, coverage
evaluation interviews that occurred in August 1980 were less useful than
those in April because of the large number of moves that occurred during
the four-month period. Thus, this approach could have a substantial, negative
impact on the quality of the CCM data that are collected in 2010,
depending on the length of time between the census and the CCM.
After considering these approaches, the Census Bureau decided on
the last one: to delay the CCM interviews until after all coverage
follow-up interviews are completed. Several arguments were given in
support of this decision:
• The Census Bureau would not have to plan on having a
substandard census in any area, which would certainly be true of the
third approach.
• Combining the interviews, the first approach, might harm both
interviews.
• The fourth approach, letting the two interviews occur whenever
they fell, is speculative and would be difficult to assess prior to
the 2010 census.
The second and third approaches (excluding some of the interviews)
would require some assumptions about the nature of the late coverage
follow-up interviews and would also require a large, parallel census
database. (For details on the Census Bureau’s views on contamination, see
Kostanich and Whitford, 2005.)
It may be the case that further work would have demonstrated the
advantages of either a truncated census (the second approach) or of
combining the two interviews (the first approach). Also, the panel finds some
of the Bureau’s arguments in relation to the second and third approaches
(in particular, the difficulty of duplicating census processing files given
the availability of inexpensive computer memory) not fully convincing.
However, arguments for or against various alternatives are now moot,
given the Census Bureau’s decision.
The CCM interviews may not begin until late August or September
2010, which means there will be a larger number of movers
between Census Day and the CCM interviews than there were in 2000.
Data from movers in this context are known
to be of poor quality, partly because a large fraction of the data collected
is from proxy respondents. In addition, there may also be recall problems,
since people are being queried about where they lived several months
ago. This reduction in data quality will probably result in estimating
fewer matches than there actually are.
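The consequence of finding too few matches can be seen from the basic
dual-systems (Lincoln-Petersen) estimator, in which the estimated
population equals the census count times the P-sample count divided by
the number of matches. The sketch below is only illustrative: the function
name and all counts are hypothetical, and the actual CCM estimation
involves many additional steps that are not shown.

```python
def dual_systems_estimate(census_count, p_sample_count, matches):
    """Basic Lincoln-Petersen estimate of population size."""
    if matches <= 0:
        raise ValueError("at least one match is required")
    return census_count * p_sample_count / matches


# Hypothetical counts for one estimation cell (not real census data).
census_count = 900
p_sample_count = 800
true_matches = 720

baseline = dual_systems_estimate(census_count, p_sample_count, true_matches)

# Suppose poor-quality mover data cause 36 true matches to be missed.
missed_matches = 36
understated = dual_systems_estimate(
    census_count, p_sample_count, true_matches - missed_matches
)

print(f"estimate with all matches found: {baseline:.1f}")
print(f"estimate with missed matches:    {understated:.1f}")
```

Because the number of matches is in the denominator, every missed match
raises the population estimate and therefore overstates the estimated
undercount.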
An early August start for the CCM in 2010 might be possible by
expediting certain operations. To determine whether this is feasible, it will be
important to collect good data during the dress rehearsal, if relevant, on
the possibilities of expediting the initiation of the CCM interviews and to
develop a good understanding of how various delays affect the number
of movers.
Given its concern over a late start to CCM interviewing, the panel
would like to raise the possibility of initiating the 2010 CCM data
collection prior to the completion of the coverage follow-up interviews, without
any accounting for the overlap of the two data collection efforts in the
estimation of net coverage error. Decisions on whether to allow these
data collections to overlap, and if so, by how much, are difficult to make
since they involve the comparison of two biases whose magnitudes are
difficult to gauge. One bias stems from the data collected from movers,
and the second bias results from the potential contamination, that is, the
possibility that the census data collected in the CCM block clusters will be
different from the remaining census data. Both biases are potentially
sizable and, if so, could substantially reduce the utility of the estimates
from the CCM. The magnitudes of these biases involve a direct tradeoff: as
one moves the date for the initial capture of CCM data from mid-June to early
September, the contamination bias decreases to zero as the mover bias
increases substantially.
The available research does not clarify how large these two
biases are as a function of various factors, especially the date that CCM
data collection begins. The uncertainty about the magnitudes of these two
biases precludes the panel from recommending how the Census Bureau
should proceed. However, the panel’s relatively subjective assessment of
the situation is that the mover bias at its maximum (from no overlap) is
likely to be substantially greater than the contamination bias at its
maximum (starting CCM data collection in, say, late June). Therefore, the panel
suggests that the Census Bureau reconsider its decision and start the CCM
data collection no later than mid-July, thereby allowing for some modest
overlap between the coverage follow-up and the CCM data collections.
Whether or not the Census Bureau reconsiders the start date for the
CCM, it should endeavor to begin CCM interviewing as soon as possible
after the completion of the great majority of the census data collection,
which one hopes would be before late July. Consistent with this, to the
extent that it is feasible, the management of the coverage follow-up and
the CCM data collections should be organized to limit the potential for
contamination by selectively starting the CCM data collection in those
areas in which the coverage follow-up interviewing has been completed,
monitoring this on as small a geographic basis as possible. Furthermore,
there are potential advantages to the use of census designs in which there
is modest overlap between the coverage follow-up and the CCM, and
the Census Bureau should consider use of such designs in 2010.
Recommendation : The Census Bureau should organize census
and coverage follow-up data collection so that data collection for
the census coverage measurement (CCM) program is initiated as
soon as possible after the completion of the census. In particular,
the postenumeration survey in a particular area should start
as soon as possible after the completion of the great majority of
the census data collection, hopefully before late July. The Census
Bureau should also consider census designs for 2010 in which there
is some modest overlap between coverage follow-up and CCM data
collections.
ADMINISTRATIVE RECORDS
The Census Bureau has explored the potential for using administrative
records (data collected as a byproduct of administering governmental
programs) in the decennial censuses since the 1970s. Possible
uses include: (1) supporting a purely administrative records census;
(2) improving census nonresponse follow-up, either by using enumerator
follow-up only when administrative records do not contain the required
information or by completing information for households that do not
respond to initial attempts by field enumerators; (3) improving the Master
Address File with addresses found in administrative records;3 (4) assisting
in coverage measurement, for example, through use of triple-systems
estimation;4 and (5) assisting in coverage improvement, for example, by
identifying census blocks that may not have been well enumerated or
households for which the census count is likely to be in error.
One important advantage that administrative records have is that
they provide a source of information for hard-to-enumerate groups that is
operationally independent of the census processes. Underlying the use of
a postenumeration survey is the assumption that reinterviewing people,
albeit with a much more intensive interview with more highly trained
3 The Census Bureau has already used administrative records for this purpose. The MAF is
already updated using the delivery sequence file from the U.S. Postal Service, which is a type
of administrative record, and the MAF is also updated using files from local jurisdictions,
which are often based on local administrative sources.
4 Triple-systems estimation is a generalization of dual-systems estimation: In this case the
third system would be a merged list of individuals from administrative records (for details,
see Zaslavsky and Wolfgang, 1990).
interviewers, will either generate a response when there was previously
no response or will provide different information than the respondents
provided earlier and thereby correct an incorrect response. One can argue
that for a substantial fraction of the cases with coverage error, due to the
similarity of the two requests for information, neither assumption may
hold, especially for people who are actively seeking not to be counted
in the census. For many such cases, administrative records may provide
the only current real chance at enumeration.
Until recently, the available administrative records have suffered
from several limitations, including: insufficient coverage of the
population represented on administrative records; lack of current information
(particularly for addresses); lack of information on race and ethnicity;
difficulty in unduplicating administrative lists with very few errors;
computational burden; and concerns about public perceptions.5
Consequently, none of the potential applications of administrative records
has been implemented during a census. Until 2000, there was no
comprehensive field test of the benefits of the use of administrative records
for any census application, although there were assessments of the
coverage of merged administrative lists in assessing the feasibility of
an administrative records census. However, administrative records did
support at least two major coverage improvement programs: the
nonhousehold sources check in 1980 and the parolees and probationers
check in 1990.
Now, however, several of the limitations just noted have been addressed.
The quality and availability of national administrative records are
improving, computing power has increased dramatically, and the research group
on administrative records at the Census Bureau has achieved some
impressive results. The primary program and database, referred to as StARS, now
has an extract of a validated, merged, unduplicated residential address
list with 150 million entries, 80 percent of which are geocoded to census
blocks, and another extract of a validated, merged, unduplicated list of
residents with demographic characteristics. These lists are approaching
the completeness of coverage that might be achieved by a decennial census
(Obenski and Farber, 2005).
Seven national files are merged to create StARS, with the Social
Security Number Transaction File providing demographic data. As a result
of this progress, an administrative records comparison will be one of six
circumstances generating a coverage follow-up interview in 2010, which
may be the first direct application of administrative records to assist in
census enumeration.
5 An approach to the problem of current address can be found in Stuart and Zaslavsky
(2002).
However, the progress to date is not a compelling argument for
widespread use. The quality of administrative records, and of StARS, in support
of census field enumeration is still untested, and many of the deficiencies
regarding undercoverage, race and ethnicity information, and current
address are still worrisome. AREX 2000 provided the only major test to
date of the use of administrative records (primarily for use as an
alternative method for taking a census). While the population coverage (for the
more thorough of the schemes tested) was between 96 and 102 percent
relative to the 2000 census counts for the five test site counties, AREX
2000 and the census counted the same number of people at the housing
unit level only 51.1 percent of the time, and AREX 2000 came within one
person of the census count for only 79.4 percent of the households. So
although the potential of administrative records is obvious, these ideas
need further development and evaluation.
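The contrast between good aggregate coverage and weak agreement at the
housing unit level can be illustrated with a small calculation. The sketch
below uses invented counts, not AREX 2000 data, and the function name is
hypothetical; it shows only how the three kinds of figures quoted above
would be computed from paired household counts.

```python
def agreement_rates(pairs):
    """Share of housing units with exact and within-one agreement.

    `pairs` is a list of (admin_count, census_count) tuples, one per unit.
    """
    n = len(pairs)
    exact = sum(1 for admin, census in pairs if admin == census)
    within_one = sum(1 for admin, census in pairs if abs(admin - census) <= 1)
    return exact / n, within_one / n


# Hypothetical (administrative records count, census count) per housing unit.
units = [(2, 2), (3, 4), (1, 1), (4, 2), (0, 1), (4, 4), (2, 3), (5, 4)]

exact_rate, within_one_rate = agreement_rates(units)
coverage_ratio = sum(a for a, _ in units) / sum(c for _, c in units)

print(f"aggregate coverage ratio: {coverage_ratio:.3f}")
print(f"exact agreement:          {exact_rate:.3f}")
print(f"within one person:        {within_one_rate:.3f}")
```

In this invented example the two sources give identical totals, yet they
agree exactly for fewer than half of the units, because overcounts in some
units offset undercounts in others.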
In order to make sure that an important opportunity is not being
missed, but also to verify that administrative records can provide real
benefits, the Census Bureau would need to support a wide-ranging and
systematic research program on decennial census applications of
administrative records that is amply funded and staffed. Such a program would
have the specific goal of deciding which of the potential uses of
administrative records are and are not feasible for use in 2020. Such decisions
would have to be made by 2015 so that there would be sufficient time
before the census for final testing and to best integrate these various
activities into the 2020 census design. Administrative records can still be
used in a limited way in the 2010 census, in addition to the role they are
playing in generating the coverage follow-up interviews. In particular,
administrative records might be considered either for coverage
improvement or for coverage measurement in 2010. The panel believes that it no
longer makes sense to view the use of administrative records as an
interesting possibility for some unspecified census in the future. We believe it
is crucial to comprehensively assess their potential now for use in the 2020
census. We propose six potentially feasible uses of administrative records
in a census: to improve the MAF or other address lists, in late-stage
nonresponse follow-up, for item imputation, to improve targeting of coverage
follow-up interviews, for assistance on the status of nonmatches, and to
evaluate a census coverage measurement program.
Improvement or Evaluation of the Quality of the MAF or the
Address List of the Postenumeration Blocks
The quality of the MAF is
key to a successful mailout of the census questionnaires and nonresponse
follow-up, and the quality of the independent list that is created in the
CCM blocks in 2010 will be key to a successful coverage measurement
program. StARS provides a list of addresses that could be used in at least
two ways. First, the total number of StARS addresses for small areas could
be checked against the corresponding MAF or PES totals to identify areas
with large discrepancies that could be relisted. Second, more directly,
address lists could be matched to identify specific addresses that are
missed in either the MAF or the PES address listings, with discrepancies
followed up in the field for resolution. Note that although administrative
records could be used to improve the address list for either the census or
the PES, to maintain independence they should not be used for both.
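Both checks can be sketched in a few lines. The example below is a minimal
illustration under stated assumptions: the block identifiers, counts,
addresses, the 10 percent threshold, and both function names are
hypothetical, and real address matching would require address
standardization and geocoding steps that are not shown here.

```python
def flag_blocks_for_relisting(stars_counts, maf_counts, threshold=0.10):
    """Return blocks whose address counts differ by more than `threshold`.

    The relative difference is measured against the larger of the two
    counts, so a block that is missing from one list is always flagged.
    """
    flagged = []
    for block in sorted(set(stars_counts) | set(maf_counts)):
        stars = stars_counts.get(block, 0)
        maf = maf_counts.get(block, 0)
        larger = max(stars, maf)
        if larger == 0:
            continue  # no addresses on either list
        if abs(stars - maf) / larger > threshold:
            flagged.append((block, stars, maf))
    return flagged


def unmatched_addresses(stars_addresses, maf_addresses):
    """Return addresses found on only one of the two lists."""
    stars_set, maf_set = set(stars_addresses), set(maf_addresses)
    return sorted(stars_set - maf_set), sorted(maf_set - stars_set)


# Hypothetical block-level address counts.
stars_counts = {"block_A": 100, "block_B": 50, "block_C": 80}
maf_counts = {"block_A": 98, "block_B": 35, "block_D": 10}
flagged = flag_blocks_for_relisting(stars_counts, maf_counts)

# Hypothetical, already standardized address strings.
only_stars, only_maf = unmatched_addresses(
    ["1 ELM ST", "2 ELM ST", "3 ELM ST"],
    ["1 ELM ST", "3 ELM ST", "5 OAK AVE"],
)
```

The count comparison is cheap and points to whole areas that may need
relisting, while the address-level comparison produces a specific
follow-up workload for the field.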
Assistance in Late-Stage Nonresponse Follow-up
The Census Bureau
makes several attempts using field enumerators to collect information
from mail nonrespondents to the census. When these attempts fail to
collect information, attempts are made to locate a proxy respondent and,
when that fails, hot-deck imputation (filling in for the nonresponse with
the data for a randomly selected, geographically proximate household) is
used to supply whatever information is needed, including the residence’s
vacancy status and the household’s number of residents. If the quality of
StARS information is found to be at least as good as that from hot-deck
imputation or even proxy interviews, it might be effective to attempt to
match nonrespondents to StARS before either pursuing a proxy interview
or using hot-deck imputation. Especially with a short-form-only census,
StARS might be sufficiently complete and accurate for this purpose.
Further, one might profitably make fewer attempts at collecting nonresponse
data by making use of StARS information, for example, after only one or
two attempts at nonresponse follow-up, thereby substantially expediting
and reducing the costs of nonresponse follow-up.
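The ordering suggested here, trying an administrative records match before
falling back on hot-deck imputation, can be sketched as follows. This is a
simplified illustration, not the Census Bureau's procedure: the data
structures, identifiers, and function name are hypothetical, the proxy
interview step is omitted, and "geographically proximate" is reduced to
"in the same block."

```python
import random


def resolve_nonrespondent(unit_id, block, admin_counts, respondents_by_block, rng):
    """Return (household_count, source) for a nonresponding housing unit.

    An administrative records match is tried first; if there is none, the
    count is copied from a randomly selected responding household in the
    same block (a simple hot-deck donor).
    """
    if unit_id in admin_counts:
        return admin_counts[unit_id], "admin_records"
    donors = respondents_by_block.get(block, [])
    if not donors:
        return None, "unresolved"
    return rng.choice(donors), "hot_deck"


# Hypothetical inputs.
admin_counts = {"unit_17": 3}                 # matched administrative records
respondents_by_block = {"block_A": [2, 4, 1]}  # counts from responding households
rng = random.Random(2010)                      # seeded for reproducibility

matched = resolve_nonrespondent("unit_17", "block_A", admin_counts,
                                respondents_by_block, rng)
imputed = resolve_nonrespondent("unit_18", "block_A", admin_counts,
                                respondents_by_block, rng)
stranded = resolve_nonrespondent("unit_19", "block_Z", admin_counts,
                                 respondents_by_block, rng)
```

Whether the administrative records branch should take precedence depends
on the quality comparison described above, which the sketch simply takes
as given.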
Item Imputation
The Census Bureau often uses item imputation
to fill in modest amounts of item nonresponse. Item nonresponse could
affect the ability to match a P-sample individual to the E-sample, and
missing demographic and other information may result in an individual
being placed in the wrong poststratum or the use of the wrong covariate
information in a logistic regression. Item imputation based on information
from StARS may be preferable to hot-deck imputation. The use of StARS
to provide item imputation was tested as part of the 2006 census test, but
the results were not available in time for this report.
Targeting the Coverage Improvement Follow-up Interviews
The
coverage improvement interview in 2010, as currently planned, will
follow up households with any of the following six conditions:
(1) characteristics for the additional people in large households who did not fit on
the census questionnaire, (2) count discrepancies between the indicated
number of residents and the number of persons for whom information is
provided, (3) potential duplicates identified by a national match of census
enumerations to themselves, (4) persons who, given their responses to
coverage probes, may have been enumerated at other residences in
addition to the one in question (potential duplicates), (5) persons who, given
their responses to coverage probes, sometimes stayed at the housing
unit in question and who may have been omitted from the census, and
(6) people in households with different counts than in a list generated
from administrative records. The workload for this operation might well
exceed the Census Bureau’s capacity to carry out the necessary field work
given limited time and resources. It might be possible to use
administrative records to help identify situations in which field resolution is not
needed, for example, by indicating which of a set of duplicates is at the
proper residence. (Uses of StARS in similar ways were tested in the 2006
census test, but the results were not available in time for this report.)
Determination of the Status of Nonmatches
It is possible that
administrative records can be used to determine the status of a nonmatch
prior to follow-up of nonmatches in the postenumeration survey.
Nonmatches of the P-sample to the census might well be
resolved, for example, by indicating that there was a geocoding error or
a misspelled name, thereby saving the expense and time of additional
CCM field work.
Evaluation of the Census Coverage Measurement Program
The
quality of many of the steps leading to production of dual-systems
estimates might be checked using administrative records. For example,
administrative records information might be used to assess the quality
of the address list in the P-sample blocks, to assess the quality of the
matching operation, or to assess the quality of the small-area estimation of
population counts. We note, however, that any operation that makes use
of administrative records cannot also use the same administrative records
for purposes of evaluation.
The administrative records group at the Census Bureau has already
had a number of successful applications of StARS. First, an administrative
records census was conducted in five counties during the 2000 census,
and its quality was judged to be comparable to that of the census in those
counties. (This assessment is somewhat surprising given that, as pointed
out above, the agreement between StARS and the census counts was only
slightly above 50 percent.) Second, StARS was used to explain 85 percent
of the discrepancies between the Maryland Food Stamp Registry list of
recipients and estimates from the Census Supplementary Survey in 2001
(the pilot American Community Survey).
Since the panel’s suggested uses of administrative records depend
crucially on the quality of the merged and unduplicated lists of addresses
and people in StARS, prior to the implementation of StARS for any of
the above purposes in 2010 (except arguably for coverage measurement),
it would be necessary to evaluate the use of administrative records in
comparison to the current method used in the census. Alternatively, the
use could be an additional process added to the census, in which case it
would be necessary to assess the likely effects on the quality of the census
enumerations along with its likely costs. If there are no opportunities for
a careful test of feasibility and effectiveness of applications of
administrative records in 2009, additional uses of administrative records in 2010 will
not be feasible.
Thus, it is likely that additional uses of administrative records, besides
their current role in coverage follow-up interviews, will have to wait
until 2020. However, the 2010 census provides an important
opportunity for testing the above ideas. Therefore, the panel suggests that the
more promising of the above applications be developed sufficiently to
support a rigorous test in 2010, with additional refinement during the
intercensal period, with the goal of implementation in 2020 should the
subsequent evaluation support their use. (This idea is consistent with a
recommendation of the Panel on the Design of the 2010 Census Program
of Evaluations and Experiments; see National Research Council, 2007.) If
tests during 2010 are not feasible, the panel believes the highest priority
should be given to testing during 2012–2015, as a first step toward the
possible substantial use of administrative records in the 2020 census. In
particular, given the promise of administrative records in relation to the
census’ greatest challenge, reducing omissions, we strongly advocate the
testing of the use of administrative records for coverage improvement and
as part of the coverage measurement program or to assess the
effectiveness of the coverage measurement program in measuring the number of
census omissions in 2020.
If data from StARS are used successfully in the coverage follow-up
interviews in 2010 or if early tests of administrative records in the next
decade strongly indicate their applicability and value for various census
applications, the Census Bureau could consider even more ambitious uses
of administrative data in the 2020 census. Specifically, for many housing
units, the Census Bureau might use administrative data not just to replace
late-stage follow-up, but as a replacement for the entire nonresponse
follow-up interview. This use would seem to be especially valuable in
situations in which the enumerators had determined that the nonresponding
household was occupied. Under this approach, the Census Bureau would
use data from administrative records to determine the occupancy status of
some nonresponding housing units and the number and characteristics of
their residents. To do so, the Census Bureau would have to develop criteria
of adequacy of the information in the administrative records to establish
the existence and count of the household for this purpose. For example,
agreement of several records of acceptable currency and quality might be
considered sufficient to use the information as a substitute for a census
enumeration, which would reduce the burden of field follow-up.
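One way such an adequacy criterion might be expressed is sketched below.
Everything in it is an assumption made for illustration: the source names,
the record layout, the one-year currency limit, and the requirement that
three distinct sources agree are hypothetical choices, not criteria
proposed by the panel or the Census Bureau.

```python
from collections import Counter


def admin_enumeration(records, reference_year, max_age_years=1, min_sources=3):
    """Return an agreed household count, or None if the records are inadequate.

    A count is accepted only when at least `min_sources` distinct sources,
    each no older than `max_age_years`, report the same household size.
    """
    # Keep one current record per source (hypothetical currency rule).
    current = {
        r["source"]: r["household_size"]
        for r in records
        if reference_year - r["year"] <= max_age_years
    }
    if not current:
        return None
    size, n_agreeing = Counter(current.values()).most_common(1)[0]
    return size if n_agreeing >= min_sources else None


# Hypothetical records for one nonresponding housing unit.
records = [
    {"source": "tax_file", "household_size": 3, "year": 2020},
    {"source": "health_file", "household_size": 3, "year": 2019},
    {"source": "housing_file", "household_size": 3, "year": 2020},
    {"source": "old_file", "household_size": 4, "year": 2015},
]

accepted = admin_enumeration(records, reference_year=2020)
rejected = admin_enumeration(records[:2], reference_year=2020)
```

Units for which the function returns None would remain in the field
follow-up workload, so stricter criteria trade lower error risk for a
smaller reduction in field burden.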
This use of administrative records would represent a substantial
change in what constitutes a census enumeration, of at least the same
conceptual magnitude as the change from in-person to mail enumerations
as the primary census methodology. However, given that the
completeness of administrative records systems and the capabilities for matching
and processing administrative records have been growing, and given that
public cooperation with survey field operations appears to be declining
(though the mail response rate in the 2000 census was slightly better
than that in 1990, reversing a trend over the past few censuses), it seems
increasingly likely that administrative records will soon provide
enumerations of quality at least as good as field follow-up for some housing units.
Furthermore, unlike purely statistical adjustment methods, every census
enumeration would correspond to a specific person for whom there is
direct evidence of his or her residence and characteristics. The long-run
potential for such broader contributions from administrative records is a
reason to give high priority to their testing in the 2010 census.
Three possible objections might be raised in opposition to this
approach. First, this use of administrative records may be ruled to be
inconsistent with interpretations of what an enumeration means in the
Constitution. Second, public perception that the government will
otherwise obtain the information might reduce response to the census mailout
questionnaire. Third, like any use of administrative records for other than
their intended purpose, this may raise public concerns about a loss of
confidentiality. These three issues are not compelling arguments against
moving forward, but they would need to be addressed before the Census
Bureau could implement this use of administrative records in 2020.
In summary, if the Census Bureau is to position itself to be able to
make an informed decision about the value of administrative records to
fulfill a variety of possible functions in the 2020 census, it needs to make
use of the various testing opportunities in both the 2010 census and in
the early part of the 2010–2020 intercensal period to assess which of the
applications listed here are feasible and effective. Otherwise, important
benefits may be missed since one cannot implement the ideas absent a
careful evaluation. Even with a successful test, there will be a number of
implementation complexities that will have to be dealt with, and waiting
to test such ideas in 2016 or later will likely not leave enough time.
Recommendation 5: The Census Bureau should use the various
testing opportunities in both the 2010 census and in the early part
of the 2010–2020 intercensal period to assess how administrative
records can be used in the 2020 census.