Copyright © 2009. National Academy of Sciences. All rights reserved.
4
Sampling and Statistical Estimation
This chapter discusses potential uses of sampling and statistical estimation to
address the two main challenges of the 2000 census: reducing differential cover-
age and controlling operational costs. Why should the Census Bureau consider
the use of sampling and estimation? Sampling and subsequent estimation offer
two advantages over enumerating or surveying an entire population. The first,
more obvious one, is cost savings. Trying to obtain data from everyone in a large
population is usually prohibitively expensive. Drawing a sample can dramatically
slash resource requirements and often yields adequately precise estimates
for the population and major subgroups. Only when estimates are required for
fine levels of detail, as in the U.S. census, does it make sense to even consider
trying to obtain data from everyone in a large population. The second advantage
of sampling is that it enables enhancements in data quality that would be too
expensive or intrusive to apply to the entire population. A well-conducted sample
survey will usually provide more accurate information than a program that at-
tempts to collect data from an entire population but suffers from high nonresponse
or biased responses. Indeed, the Census Bureau has traditionally used a sample
survey to evaluate census coverage.
This chapter focuses on two major innovations that the Census Bureau is
considering for producing population counts in the 2000 census. The first inno-
vation is sampling for nonresponse follow-up. Instead of trying to enumerate all
housing units for which there is no response during mailout-mailback operations,
the Census Bureau would follow up only a sample of such housing units (most
likely between 10 and 33 percent). Data from housing units sampled for non-
response follow-up would allow estimation of counts and characteristics of mail-
back nonrespondents who are not sampled.
The second proposal, called integrated coverage measurement (ICM), is de-
signed to measure and correct the differential undercount. In July 1990, the
Census Bureau conducted the Post-Enumeration Survey (PES) in a sample of
165,000 housing units to allow measurement of the coverage achieved by the
main census operation. Although the survey identified a net undercount of about
1.6 percent and substantial differential undercount by geography and demographic
characteristics, the official 1990 census counts did not use the information ob-
tained as part of the PES. During the 1995 census test, the Census Bureau plans
to evaluate a new integrated coverage measurement method, CensusPlus, de-
signed to run concurrently with the main census operations and thereby to facili-
tate production of official counts by the legal deadlines.
The Census Bureau decided not to use sampling during the initial mailout-
mailback phase of the census because of concerns about the legality of that
strategy and the adverse impact that it would have on the accuracy of counts
(Isaki et al., 1993). We concur with that decision. Both nonresponse follow-up
sampling and integrated coverage measurement use sampling to try to obtain
more accurate responses than could be achieved in a census. Combined with
statistical estimation, these techniques should improve the absolute counts and
reduce differentials in census coverage across states, other large political divi-
sions, and major demographic categories. At the same time, initial attempts to
enumerate everyone should produce acceptable accuracy for smaller areas like
minor civil divisions and census tracts. The likely combination of nonresponse
follow-up sampling and integrated coverage measurement clarifies the need for
statistical estimation in the 2000 census. Consequently, as Chapter 1 explains,
the Census Bureau is planning for a "one-number census" that combines the use
of enumeration, assignment, and estimation for production of the census counts.
The next three sections of this chapter discuss sampling for nonresponse
follow-up, integrated coverage measurement, and statistical estimation, respec-
tively. Although we discuss them separately, a recurring theme of the chapter is
that decisions about each of these topics should be considered in light of the other
two. Estimation methods clearly cannot be determined without knowledge about
sampling methods. And, for example, ultimate evaluation of a design for inte-
grated coverage measurement must refer to specifications of the plans for non-
response follow-up sampling and estimation procedures.
NONRESPONSE FOLLOW-UP
Background
The 1990 census was substantially more expensive than the 1980 census,
even after accounting for inflation and population growth. The largest single part
of the expense was follow-up of housing units that had not responded during the
mailout-mailback portion of the census. Estimates of the total cost of nonresponse
COUNTING PEOPLE IN THE INFORMATION AGE
follow-up operations in the 1990 census range from $490 to $560 million, roughly
20 percent of the $2.6 billion 10-year cycle cost of the census (Bureau of the
Census, 1992b; U.S. General Accounting Office, 1992). Each 1 percent of
nonresponse to the mailed questionnaire is estimated to have added approxi-
mately $17 million to the cost of the census.
Perhaps just as important, nonresponse follow-up (NRFU) took much longer
than anticipated in some sites (in particular, New York City), pushing back the
schedule for completion of the census. In turn, NRFU operations pushed back the
beginning of coverage measurement by the Post-Enumeration Survey. The long
delay between Census Day and the beginning of coverage measurement compro-
mised the ability of the PES to operate accurately and was one of several factors
making it impossible for the Census Bureau to incorporate the PES results into
official counts released by the legal deadlines.
Even without delays in schedule, the latter stages of census operations typi-
cally suffer degradation of data quality. Ericksen et al. (1991) report that, for the
1990 census, the rate of erroneous enumeration on mailout-mailback was 3.1
percent. On nonresponse follow-up, the rate was 11.3 percent; on field follow-
up, the rate was 19.4 percent.
Much of the problem in 1990 resulted from mailback response rates that
were lower than expected. Item nonresponse also contributed to the follow-up
work because additional contacts were required to complete missing items. Ques-
tionnaire simplification, reminder postcards, replacement questionnaires, and
other innovations are expected to improve mailback rates, and the use of tele-
phone interviews may speed NRFU operations. Even so, a 100-percent NRFU
operation would certainly be very expensive. Thus, the Census Bureau has
focused substantial efforts on ways to reduce the scope of nonresponse follow-up
without undue sacrifices in the accuracy of the count or the content. The Census
Bureau has studied three major innovations for nonresponse follow-up in the
2000 census: truncating NRFU early, following up only a sample of mailback
nonrespondents, and using administrative records to replace or supplement traditional
NRFU. In addition, it has considered combinations of these strategies, e.g.,
a two-stage NRFU consisting of a truncated operation aimed at all mailback
nonrespondents, followed by continued nonresponse follow-up for only a sample
of households.
The Census Bureau's cost models estimated very large cost savings with
either a truncated NRFU or with sampling for NRFU. Estimated cost savings
from truncation compared with the 1990 10-year cycle costs (in 1992 dol-
lars) ranged from about $127 to $160 million (depending on assumptions) for
truncation on June 30 up to $740 to $894 million for truncation on April 21 (no
follow-up) (Keller and Van Horn, 1993). For NRFU sampling rates of 50 percent
down to 10 percent, the models estimated cost savings compared with the 1990
10-year cycle costs ranging from approximately $300 to $750 million, even after
increasing the sample size for ICM measurement (Bureau of the Census, 1993d).
However, those estimates do include some savings that could probably be
achieved even with 100 percent NRFU. We have not seen any estimates for cost
savings associated with the use of administrative records, presumably because no
detailed plans have been proposed for their use in NRFU.
Either of these innovations would also offer timing benefits compared with
the 1990 scenario. Either truncation or sampling for NRFU would accelerate the
completion of ICM. Because one of the potential problems with the planned ICM
method is difficulty with retrospective identification of Census Day residency,
moving up the last cases could be an important benefit. Earlier completion of
ICM would also make it easier for the Census Bureau to produce final counts in
time to meet legal deadlines. However, we note that these potential benefits
would be more important for a 1990-style PES than for the currently planned
ICM survey, which would run concurrently with the main census operations.
In contrast to these cost and operational advantages, both truncation and
sampling have negative implications for the precision of counts and other results,
especially for small areas. Counts and attributes of persons in nonsampled,
nonresponding housing units would need to be estimated, producing sampling
variability roughly proportional to the number of cases being estimated (although
the exact relationship would depend on the sample design and estimation method).
As results are aggregated to larger geographic areas, the errors diminish in size
relative to the population of the area.
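As a rough illustration of why relative error shrinks with aggregation (a simplified sketch, not a formula taken from the Census Bureau's simulations), assume that each estimated housing unit contributes an independent error with variance \sigma^2, and that a constant fraction f of an area's population N falls in units whose counts must be estimated. Both assumptions, and the symbols \sigma, f, N, n, and \hat{T}, are introduced here for illustration only.

```latex
% n = fN cases are estimated in an area of population N; \hat{T} is the estimated total.
\operatorname{Var}(\hat{T}) \approx n\,\sigma^{2} = fN\,\sigma^{2},
\qquad
\frac{\operatorname{SD}(\hat{T})}{N} \approx \frac{\sigma\sqrt{fN}}{N} = \sigma\sqrt{\frac{f}{N}} .
```

Under these assumptions the absolute error grows like the square root of N while the relative error falls like one over the square root of N, which is the pattern described above. Correlated errors within blocks, or estimation methods that borrow strength across blocks, would change the constants.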
The Census Bureau ran simulations with 1990 census data to evaluate the
adverse impact on the accuracy of various counts from exclusively using either
early truncation of NRFU or sampling for NRFU. Unfortunately, the simulation
studies did not produce estimates that allow for direct comparison of the two
methods. Even so, the Census Bureau concluded that sampling for NRFU seems
the more promising option at this point. Studies of NRFU truncation indicated
that, to achieve savings of $300 million (in 1992 dollars), truncation would have
had to occur so early in the 1990 census that the residual nonresponse rate would
have been 11 percent of all housing units. More troubling, the nonresponse cases
would have been spread very nonuniformly across district offices and demo-
graphic groups. As a result, truncation would have greatly increased the differen-
tial undercount in the census enumeration, placing further burden on integrated
coverage measurement.
Plans for the 1995 Census Test
On the basis of these conclusions, the Census Bureau decided to focus on
evaluating sampling for NRFU in the 1995 census test. Households that do not
respond to the mail questionnaire by 6 weeks after the initial mailout (14 days
after mailing of a replacement questionnaire) will be considered mailback non-
respondents, and one-third of these households will be sampled for NRFU. Cur-
rent plans call for the collection of only short-form data during NRFU. No
attempt will be made to obtain information from the other two-thirds of mailback
nonresponding households. An attempt will be made to identify vacant housing
units before selection of the nonresponse sample. Interviewers will visit units for
which a postmaster returned the prenotice to the first mailing. Confirmed vacancies
will not be included in the NRFU sample. A major purpose of testing
sampling for NRFU in the 1995 census test is to learn more about the relative
merits of sampling individual housing units (a unit sample) versus whole blocks
(a block sample); in the test, the NRFU sample will be split evenly between the
two types of samples. (Census Bureau documents refer to the former as a case
sample design, but we prefer to describe it as a unit sample design.) In a random
sample of one-half of the blocks not involved in ICM, the Census Bureau will
sample 33 percent (one-third) of nonresponding housing units. In the other non-
ICM blocks, block sampling will be used. That is, all mailback nonrespondents
will be followed up in one-third of the block-sample blocks, and no NRFU
activities will be conducted in the remainder of the block-sample blocks. Com-
plete nonresponse follow-up will be conducted in all ICM blocks.
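The difference between the two designs in the test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Census Bureau's selection procedure: the frame of 30 blocks with 9 nonresponding units each, the function names, and the use of simple random selection are all hypothetical.

```python
import random


def unit_sample(nonrespondents_by_block, rate, rng):
    """Unit sample: follow up a fraction of nonresponding units in every block."""
    sample = {}
    for block, units in nonrespondents_by_block.items():
        k = round(len(units) * rate)
        sample[block] = rng.sample(units, k)
    return sample


def block_sample(nonrespondents_by_block, rate, rng):
    """Block sample: follow up every nonrespondent, but only in a fraction of blocks."""
    blocks = sorted(nonrespondents_by_block)
    k = round(len(blocks) * rate)
    chosen = set(rng.sample(blocks, k))
    return {
        block: (list(units) if block in chosen else [])
        for block, units in nonrespondents_by_block.items()
    }


# Hypothetical frame: 30 blocks, each with 9 mailback nonrespondents.
rng = random.Random(0)
frame = {
    f"block{b:02d}": [f"block{b:02d}-unit{u}" for u in range(9)]
    for b in range(30)
}

unit_followup = unit_sample(frame, 1 / 3, rng)
block_followup = block_sample(frame, 1 / 3, rng)
```

Both designs follow up the same number of housing units in this example; they differ only in whether the workload is spread across all blocks or concentrated in one-third of them.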
Decisions for the 2000 Census
The Census Bureau faces several important decisions in connection with
sampling for NRFU in the 2000 census.
· Should sampling for nonresponse follow-up be used at all?
· Is a unit or a block sample preferable?
· What proportion of units or blocks should be sampled?
· Should the sampling probability be uniform across blocks (for a unit
sample) or across areas (for a block sample)?
· How should the Census Bureau treat mail returns received after the beginning
of NRFU?
· Should any nonresponse follow-up operations be conducted for all households
before (or concurrent with) the sampling for nonresponse follow-up?
We discuss these questions in the sections that follow.
Should Sampling for Nonresponse Follow-up be Used?
Whether to use sampling for NRFU in the 2000 census is mainly a policy
decision about whether the expected cost savings from the use of sampling out-
weigh the likely decreases in the accuracy of counts and other data, particularly
for small areas. The 1995 census test will provide valuable data to inform that
decision: more current inputs to the NRFU components of the Census Bureau's
cost model and data on the relationship between NRFU and ICM. In particular,
it will be important to identify all fixed components of the cost of NRFU sampling
in order to obtain accurate estimates of the cost savings during the 2000
census. However, the most complete information about the effects of sampling
for NRFU on the accuracy of the census would be gained from additional simu-
lations with 1990 data, especially to the extent that these effects vary across
geographic areas.
Ultimately, resolving whether to sample for nonresponse follow-up is likely
to involve answering the question: How accurate does the 2000 census need to be
for small areas? Although that question is more central to the charge of the Panel
on Census Requirements in the Year 2000 and Beyond, we offer a pair of com-
ments. First, counts and other tabulations are needed at the block level primarily
to allow flexibility for redistricting and for aggregating results to various political
jurisdictions and other territories. Thus, the success of the 2000 census should be
measured by the accuracy of these aggregate statistics rather than by the accuracy
of block-level data. Even so, we note that there will be no single answer to the
question of accuracy because sampling for NRFU would affect various levels of
aggregation in different ways.
Second, sampling variability is not the only source of error in census results.
Incomplete counts and erroneous enumerations occur during both the mailback
stage and the NRFU operation (even with 100 percent follow-up). Although
sampling for NRFU would certainly contribute most to the error in block- and
tract-level data, sampling error may be small compared with systematic error at
larger levels of aggregation. Systematic errors have contributed most to the
differential undercount in past censuses. If sampling for NRFU frees resources
for taking steps to reduce other sources of error in the final results, it may produce
a more accurate census by some measures.
Another concern associated with the use of NRFU sampling is that publicity
about it may reduce the mailback response rate. If NRFU sampling is used in the
2000 census, that fact would certainly become public knowledge, which might
dilute any positive effect that the mandatory nature of the census has on the
mailback response rate. It is also conceivable that Census Bureau staff might be
less committed to their enumeration efforts in the belief that sampling will take
care of nonresponse. Unfortunately, there is no way to learn from census tests
whether concerns about such reactions are warranted.
Whether sampling for nonresponse follow-up is used in the 2000 census will
also depend on obtaining adequate answers to the other questions posed above.
Is a Unit or Block Sample Preferable?
The choice between a unit sample and a block sample for NRFU involves
mainly a trade-off between the greater statistical efficiency of a unit sample and
the operational and cost advantages of a block sample. An additional consider-
ation is that a block sample would be easier to combine with the planned version
of ICM. The 1995 census test will provide much of the information needed to
compare the relative advantages of the two options.
Sampling for NRFU necessitates estimating the attributes of nonsampled
(and nonresponding) housing units in a block from the information obtained
(during mailback or NRFU sampling) about responding units in that block and in
blocks judged similar in terms of geography and demographic characteristics.
There is reason to expect that a unit sample would generally produce more
accurate estimates than a block sample of the same size, because there is probably
some within-block correlation in household size and other attributes of mailback
nonresponse housing units, even within carefully selected strata.
Suppose, for illustrative purposes only, that information from sampled hous-
ing units in a 100-block area (roughly 1,000 nonresponding housing units) is used
in estimating the characteristics of nonsampled mailback nonrespondents in the
same blocks. To the extent that there is within-block correlation in the 100
blocks, data on a sample of nonrespondents spread evenly among the 100 blocks
would be more valuable (by a ratio known as the design effect) than data from
the same number of housing units concentrated in a smaller number of blocks. A
unit sample would also provide the opportunity to use information from sampled
mailback nonrespondents in the same block to improve the estimates for non-
sampled housing units in that block.
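The size of this advantage can be gauged with a common textbook approximation for the design effect of a cluster sample with roughly equal takes per cluster, deff = 1 + (m - 1) * rho, where m is the number of sampled units per block and rho is the within-block correlation. The formula is an assumption brought in for illustration; the text above does not specify how the design effect would be computed, and the values of rho and the sample size below are hypothetical.

```python
def design_effect(units_per_block, rho):
    """Approximate design effect, 1 + (m - 1) * rho, for equal takes per block."""
    return 1.0 + (units_per_block - 1.0) * rho


def effective_sample_size(n, units_per_block, rho):
    """Number of independent observations that n clustered observations are worth."""
    return n / design_effect(units_per_block, rho)


# Illustrative area: 100 blocks with about 10 nonresponding units each.
# A sample of 300 units can be taken as 3 units in every block (unit sample)
# or as all 10 units in 30 blocks (block sample).
rho = 0.05  # hypothetical within-block correlation
n = 300

deff_unit = design_effect(3, rho)
deff_block = design_effect(10, rho)
n_eff_unit = effective_sample_size(n, 3, rho)
n_eff_block = effective_sample_size(n, 10, rho)
relative_value = deff_block / deff_unit  # ratio by which the unit sample is more informative
```

With no within-block correlation (rho = 0) both design effects equal 1 and the two designs are equally informative; the unit-sample advantage grows with rho.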
Certainly, heterogeneity among blocks can be expected for such characteris-
tics as race and ethnicity. However, the critical quantities to estimate may be
differences in mailback response rates among groups cross-classified by race,
ethnicity, and age; such differences may be relatively homogeneous among
blocks. Initial Census Bureau simulations with 1990 census data have found
advantages to both unit and block sampling under various circumstances (Fuller
et al., 1994), but further investigation is needed to separate the possible effects of
the estimation procedures from those of the design. Also, these simulations have
been limited to a few district offices. More comprehensive simulations with
more fully developed estimators are needed to precisely determine the size of the
unit sample advantage.
Another potential advantage of unit sampling is that it would spread impreci-
sion due to sampling and estimation among all blocks, thereby reducing the
maximum amount of block-level error. However, because block sampling would
eliminate the need for estimation in sampled blocks, the two methods would not
differ in the total number of housing units where estimation is needed. Conse-
quently, the relative accuracy of aggregate estimates based on unit sampling
would not necessarily increase beyond the amount attributable to within-block
correlation.
In contrast, block sampling might offer certain operational advantages. Enu-
merators would need to spend less time traveling between blocks. They might
also be able to use their time in each block more effectively. For example, while
visiting a complete sample of mailback nonrespondents in a block, enumerators
might frequently observe occupants entering or leaving other units on the NRFU
list. With a unit sample instead, enumerators might tend to finish and proceed to
the next block too quickly for such contacts to occur. On the basis of very
preliminary assumptions, the Census Bureau has estimated that, compared with a
unit sample of the same size, a block sample would save from $14 million (for a
10 percent sample) to $42 million (for a 50 percent sample) more than the corre-
sponding amounts saved by the unit sample. Therefore, it is not obvious in
advance whether the unit sample or the block sample is more efficient in terms of
accuracy for equal costs. Operational data from the 1995 census test should
allow the Census Bureau to estimate the relative cost advantage more accurately.
Block sampling would fit better with any likely method of ICM, because 100
percent NRFU would be required in the ICM blocks (and, perhaps, in surround-
ing blocks). Complete NRFU is needed so that the block total from the ICM
operation can be validly compared with the total from preceding census opera-
tions. In effect, ICM blocks would also be NRFU block-sample blocks. Thus,
even if unit sampling is the primary strategy for NRFU, it may need to be mixed
with some block sampling for ICM purposes.
A related consideration is whether the choice of sampling design affects
coverage in NRFU housing units. For example, with the more concentrated
effort involved in following up a block sample, enumerators might be more likely
to discover housing units that had been omitted from the frame (e.g., garage
apartments). And if they do, it will be easier to use the results, because such
housing units will automatically be part of a block sample. Enumerators may
also be able to collect better proxy information for difficult-to-complete cases
under block sampling.
The Census Bureau plans to perform statistical tests for whether the average
household size differs systematically between unit and block sampling in the
1995 census test (Bureau of the Census, 1994c). However, the size and design of
the planned test are such that it could easily miss a coverage difference of 0.05
person per housing unit (about 2 percent of people in sampled units) between the
block-sampling and unit-sampling design; a difference of this magnitude would
be important to the decision on which sampling plan to use. If coverage differs
under block sampling and unit sampling, then the viability of unit sampling for
NRFU operations would be compromised, because ICM would measure cover-
age in block-sample NRFU and there would not be an adequate corresponding
measure for unit-sample NRFU. Consequently, the Census Bureau should inves-
tigate other ways to compare the validity of the two methods, such as comparing
the numbers of added housing units.
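The concern about missing a difference of 0.05 person per housing unit can be made concrete with a power calculation. The sketch below uses a two-sided, two-sample z-test for a difference in mean household size. The standard deviation of 1.5 persons and the 5,000 housing units per design are hypothetical values chosen for illustration; they are not taken from the 1995 test plans, and the calculation ignores clustering, which would lower power further for the block sample.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(diff, sd, n_per_arm, alpha=0.05):
    """Power of a two-sided z-test to detect a true difference `diff` in means."""
    se = sd * sqrt(2.0 / n_per_arm)          # standard error of the difference
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = diff / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical inputs: sd of household size 1.5, 5,000 units per sampling design.
power = power_two_sample_z(diff=0.05, sd=1.5, n_per_arm=5000)

# Ten times the sample size, for comparison.
power_large = power_two_sample_z(diff=0.05, sd=1.5, n_per_arm=50000)
```

Under these assumed inputs the test would detect a true difference of 0.05 less than half the time, which illustrates how a test of moderate size could fail to flag a difference large enough to matter.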
What Proportion of Units or Blocks Should be Sampled?
The Census Bureau appears to be considering sampling proportions in the
range of 10 to 33 percent. Like the question of whether to sample for nonresponse
follow-up at all, the choice of sampling proportion rests mainly on a trade-off
between cost savings and accuracy of small-area data. Updated estimates of the
cost savings available from various sampling proportions will be one critical
input to the decision. The other critical input will be detailed simulation studies
of the effect of the sampling proportion on the accuracy of various estimates.
This decision should be made jointly with the decision about how large a sample
to use for ICM (assuming that element is included). Thus, simulations need to
account for the trade-offs between these two procedures.
Should Sampling Proportions be Uniform?
The Census Bureau will need to decide whether to sample all units or blocks
with equal probability. Factors that might influence the sampling probability
include the mailback response rate in the block, the size of the estimation post-
stratum of the block, and the cost of sending an enumerator to the block (or
housing unit).
How Should Late Mail Returns be Treated?
Inevitably, mail returns will continue to trickle in after selection of the NRFU
sample. Because these returns will not come from a random sample of all hous-
ing units that failed to respond prior to sampling, use of data from these returns
could bias estimates. However, ignoring the results might add unnecessary vari-
ance and be a public relations problem. Research is needed about the best use of
such data from either sampled or nonsampled housing units. (In a later section,
we discuss a similar issue and possible solution in the context of ICM.) The 1995
census test will provide information about the likely frequency of late mail re-
turns for various cutoff dates. That information may suggest moving the date for
beginning NRFU.
Operations to Supplement Sampling for Nonresponse Follow-up
In the panel's interim report, we recommended that the Census Bureau con-
sider a two-stage strategy that combines a truncated NRFU followed by sampling
first-stage nonrespondents. Although the Census Bureau chose not to directly try
out a two-stage strategy in the 1995 census test, data collected as part of the test
could provide valuable information about this option. If the response rate can be
increased substantially during a brief effort directed at 100 percent of mailback
nonrespondents, this strategy might reduce the magnitude of estimation required
while retaining large cost savings. Although the 1990 census results are discour-
aging on this score, the computer-assisted telephone interview (CATI) system
offers some hope for yielding a large number of responses in a cost-effective
manner.
We also recommended that the Census Bureau investigate the value of ad-
ministrative records as background information to make possible more accurate
estimation of people in blocks not sampled for nonresponse follow-up. The idea
(which could be applied equally well to a unit sample) is to use administrative
records information to help estimate the count and characteristics of people in
housing units about which there is no other direct information. The Census
Bureau would neither accept the administrative records data at face value (too
unreliable) nor require direct verification (too expensive). Instead, the same
administrative records data would be compiled for housing units in the NRFU
sample. The combination of administrative records that best predicted counts
and characteristics in the NRFU sample would be used to estimate those same
quantities in nonsampled households. Evaluating estimators on the ability to
predict in sampled housing units would serve as an aggregate verification process
for any administrative record. If some combination of administrative records is
fairly accurate, then using such records in the estimation could substantially
improve the accuracy of small-area estimates at relatively little increase in costs.
Because administrative records data will be collected and processed indepen-
dently from NRFU operations, the Census Bureau can evaluate the ability of
these records to improve estimates for nonsampled housing units.
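The idea of choosing the combination of records that best predicts the NRFU sample, and then applying it to nonsampled units, can be sketched as follows. Everything specific here is hypothetical: the two record sources (`source_a`, `source_b`), the three candidate rules, the toy data, and the use of mean squared error as the criterion. A real application would involve far richer models and would need to guard against overfitting to the sample.

```python
def mean_squared_error(predict, sampled_units):
    """Average squared gap between a rule's prediction and the enumerated count."""
    errors = [(predict(u["records"]) - u["enumerated"]) ** 2 for u in sampled_units]
    return sum(errors) / len(errors)


def choose_predictor(candidates, sampled_units):
    """Return the name of the candidate rule with the smallest error in the sample."""
    return min(
        candidates,
        key=lambda name: mean_squared_error(candidates[name], sampled_units),
    )


# Hypothetical candidate rules for turning records into a person count.
candidates = {
    "source_a_only": lambda r: r["source_a"],
    "source_b_only": lambda r: r["source_b"],
    "larger_of_two": lambda r: max(r["source_a"], r["source_b"]),
}

# Hypothetical NRFU sample: records data plus the count found by enumerators.
nrfu_sample = [
    {"records": {"source_a": 2, "source_b": 1}, "enumerated": 2},
    {"records": {"source_a": 1, "source_b": 3}, "enumerated": 3},
    {"records": {"source_a": 4, "source_b": 4}, "enumerated": 4},
    {"records": {"source_a": 0, "source_b": 2}, "enumerated": 2},
]

best = choose_predictor(candidates, nrfu_sample)

# Apply the chosen rule to nonsampled units, for which only records exist.
nonsampled_records = [
    {"source_a": 3, "source_b": 1},
    {"source_a": 0, "source_b": 0},
]
estimated_persons = sum(candidates[best](r) for r in nonsampled_records)
```

The sampled units act as the aggregate verification described above: no individual record is checked directly, but a rule is trusted only to the extent that it reproduces what enumerators found.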
Recommendation 4.1: Sampling for nonresponse follow-up could pro-
duce major cost savings in 2000. The Census Bureau should test
nonresponse follow-up sampling in 1995 and collect data that allow
evaluation of (1) follow-up of all nonrespondents during a truncated
period of time, combined with the use of sampling during a subsequent
period of follow-up of the remaining nonrespondents, and (2) the use of
administrative records to improve estimates for nonsampled housing
units.
INTEGRATED COVERAGE MEASUREMENT
In addition to the use of sampling and estimation for nonresponse follow-up
as described above, current census design plans call for a separate data collection
effort in a smaller sample of blocks to measure the coverage of all census opera-
tions that precede it. The preceding census operations include address list devel-
opment, mailout-mailback of census questionnaires, special enumeration
methods, and nonresponse follow-up. In a one-number census, the coverage
measurement survey and the estimation and modeling associated with it are conceived
as an integral component of census-taking, not as a separate postcensal
evaluation activity. Hence, this phase of census-taking is called integrated coverage
measurement. In this section we focus on the ICM data collection methodology;
we discuss coverage measurement methods used in the past, the data collection
procedures planned for the 1995 census test, our concerns about those methods,
and suggestions for evaluation. We turn our attention to the associated
estimation in a later section.
Previous Coverage Evaluation Programs
The Census Bureau has evaluated coverage of census enumerations since
1950 (Coale, 1955; Himes and Clogg, 1992). Two methods of coverage evaluation
have been used: demographic analysis (DA) and dual-system estimation
(DSE).
Demographic analysis combines data from previous censuses, vital statistics
on births and deaths, and other administrative records, such as Medicare data, to
obtain national population estimates by age, race or ethnicity, and sex. DA relies
on what is called the demographic accounting equation:
population = previous population + births - deaths + inmigrants - outmigrants.
DA has been useful in determining broad patterns of census coverage over
time. Because of the lack of detailed information on internal migration and other
state-level components of this accounting method, however, DA is regarded as
reliable only for national-level estimates of population by demographic group
and cannot provide estimates for subnational areas such as states. Uses and
extensions of DA are discussed further in a subsequent section.
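The demographic accounting equation translates directly into code. The sketch below applies it and then compares the result with a census count; expressing net undercount as a share of the DA estimate is a common convention assumed here, and all of the numbers are hypothetical.

```python
def demographic_estimate(previous_population, births, deaths, inmigrants, outmigrants):
    """Demographic accounting equation for the current population."""
    return previous_population + births - deaths + inmigrants - outmigrants


def net_undercount_rate(da_estimate, census_count):
    """Net undercount as a fraction of the demographic analysis estimate."""
    return (da_estimate - census_count) / da_estimate


# Hypothetical national figures for one demographic group.
da = demographic_estimate(
    previous_population=1_000_000,
    births=150_000,
    deaths=90_000,
    inmigrants=60_000,
    outmigrants=20_000,
)
rate = net_undercount_rate(da, census_count=1_083_500)
```

At the state level the same equation would also need in- and out-migration between states, which is the component the text identifies as poorly measured.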
Dual-system estimation as used in recent censuses is based on data collected
for a stratified sample of households in a coverage measurement survey. (DSE
more broadly construed has taken many forms in problems of human and animal
population estimation; see, e.g., Marks et al., 1974; Seber, 1982; Chandrasekhar
and Deming, 1949.) In short, people "caught" in the survey are matched against
the census enumeration in order to estimate the fraction of the population that
was included in the census. Similarly, a sample of people enumerated in the
census is followed up to determine whether these people should in fact have been
included or whether they were erroneously enumerated. The DSE method allows
estimation of census coverage (undercount or overcount) by combinations of
demographic group, geographic area, and other variables available on the census
form (such as owner/renter status); the degree of stratification is limited only by
the sample size.
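In its simplest capture-recapture form, the dual-system estimate for one stratum divides the number of correct census enumerations by the fraction of survey people who were matched to the census. The sketch below is that textbook form only; it assumes that inclusion in the census and in the survey are independent within the stratum, and it omits the refinements used in practice. The function name, argument names, and numbers are hypothetical.

```python
def dual_system_estimate(census_count, erroneous, survey_count, matched):
    """Textbook dual-system (capture-recapture) estimate for one stratum.

    census_count: people enumerated in the census
    erroneous:    census enumerations judged erroneous on follow-up
    survey_count: people found by the coverage measurement survey
    matched:      survey people who were matched to a census enumeration
    """
    correct_census = census_count - erroneous
    # Equivalent to correct_census / (matched / survey_count).
    return correct_census * survey_count / matched


# Hypothetical stratum.
estimate = dual_system_estimate(
    census_count=980, erroneous=30, survey_count=500, matched=475
)
coverage_rate = 475 / 500              # share of survey people found in the census
adjustment_factor = estimate / 980     # ratio of the estimate to the census count
```

In this example the survey finds 95 percent of its people in the census, so the 950 correct enumerations are scaled up to an estimated population of about 1,000.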
The coverage measurement survey may be conducted as a postenumeration
survey, following the census and temporally and operationally separated from it,
as in the 1990 census, but this is only one of several possible alternatives. In
1980, two panels of the Current Population Survey were used as the coverage
measurement survey. Pre-enumeration surveys have also been proposed.
The 1980 Post-Enumeration Program was designed purely as an evaluation
of the 1980 census enumeration. The possibility of using the 1980 coverage
[...]
each cell by the corresponding adjustment factor, and then summing the adjusted
counts.
The validity of these synthetic estimates was criticized on the grounds that
undercount is not in fact uniform across each adjustment cell. The question of
how this lack of uniformity affects the accuracy of synthetic estimates of popula-
tion (in absolute terms, and in comparison to unadjusted enumerations) has been
a subject of lively debate (Freedman and Navidi, 1986; Schirm and Preston,
1987, 1992; Freedman and Wachter, 1994; Wolter and Causey, 1991; Fay and
Thompson, 1993). This question has been approached through theoretical
investigation and through simulations based on coverage measurement data and
census data for variables other than the undercount. Further research in this area
before the 2000 census would make a useful contribution to design and validation
of the estimation methodology.
Other methods have been proposed for carrying estimates down to small
areas. One alternative, for example, is a simple regression methodology, with
proportions from different adjustment cells as covariates (Causey, 1994) or with
other variables as covariates (Ericksen and Kadane, 1985). Like the synthetic
approach, these methods should be subjected to testing through simulations be-
fore a final decision is made.
Direct and Indirect Estimates
Direct estimates are defined as estimates based entirely on data from the
domain for which the estimates are calculated. Indirect estimates make use of
data from outside the domain. Simple survey estimates are direct estimates,
whereas model-based estimates may be indirect. In particular, synthetic esti-
mates are indirect because they apply factors calculated over an estimation cell to
geographic areas that are smaller than the geographic extent of that estimation
cell. Empirical Bayes smoothing models are also indirect because the model
component of the estimates is estimated across a large number of cells.
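
To make the synthetic (indirect) calculation concrete, the sketch below
multiplies the enumerated count in each estimation cell of a small area by the
adjustment factor for that cell and sums the results. The cell labels, factors,
and counts are hypothetical.

```python
def synthetic_estimate(area_counts, adjustment_factors):
    """Sum of enumerated cell counts, each scaled by its cell's factor.

    area_counts: {cell: enumerated count within the small area}
    adjustment_factors: {cell: factor estimated over the whole estimation
                         cell, which may extend beyond the small area}
    """
    return sum(count * adjustment_factors[cell]
               for cell, count in area_counts.items())


# Hypothetical adjustment factors estimated across a multi-state region.
factors = {"owner": 1.00, "renter": 1.05}

# Hypothetical enumerated counts for one tract, classified by the same cells.
tract_counts = {"owner": 2_000, "renter": 1_000}

print(synthetic_estimate(tract_counts, factors))  # 3050.0
```

The estimate is indirect because the factors come from data outside the tract;
it is accurate only to the extent that coverage is uniform within each cell.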
In the 1990 undercount estimation program, almost all cells cut across state
lines. Consequently, estimates of population for states and subdivisions of states
were synthetic and therefore indirect. This fact was grounds for some contro-
versy over whether it was accurate and fair to estimate one state's population
using data from another state, if conditions might in fact have differed among
states. Evaluation studies directed at this question were inconclusive (Kim et al.,
1991).
The Census Bureau is now considering requiring that direct estimates be
obtained for all states. This would have major implications for design of the ICM
sample, because even the states with the smallest populations would need sub-
stantial ICM sample sizes to obtain direct estimates of acceptable accuracy. The
calculations of sample sizes could vary greatly, however, depending on the crite-
rion of accuracy that is adopted. At one extreme, a criterion of equal coefficient
of variation of direct population estimates in every state (equal standard error of
estimated ICM adjustment factors) would imply roughly equal sample sizes in
every state, despite the 100-fold ratio of populations between the most and least
populous states. Such a design might be drastically inefficient for estimation of
adjustment factors for domains other than states. At the other extreme, a criterion
of equal variance of direct population estimates for every state would imply
larger sampling rates (and therefore disproportionately larger sample sizes) in
larger states. Considerations based on minimization of expected loss (Spencer,
1980; Zaslavsky, 1993) would lead to other sample allocations with higher sam-
pling rates but smaller absolute sample sizes in small states compared with large
states.
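
The two extreme criteria can be compared with a rough calculation. The sketch
assumes that a state's direct estimate is its census count N times an estimated
adjustment factor whose standard error is s / sqrt(n), with the same per-unit
variability s in every state and no finite population correction. Under that
simplification the coefficient of variation is proportional to 1 / sqrt(n), so
equal coefficients of variation imply equal n, while the variance is
proportional to N^2 / n, so equal variances imply n proportional to N^2. The
state populations and the total sample size are hypothetical.

```python
def allocate(populations, total_sample, criterion):
    """Split a fixed total sample across states under an accuracy criterion."""
    if criterion == "equal_cv":
        # CV ~ s / sqrt(n): equal CVs need the same n in every state.
        weights = {state: 1 for state in populations}
    elif criterion == "equal_variance":
        # Variance ~ N^2 * s^2 / n: equal variances need n proportional to N^2.
        weights = {state: pop ** 2 for state, pop in populations.items()}
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    total_weight = sum(weights.values())
    return {state: total_sample * w / total_weight
            for state, w in weights.items()}


# Hypothetical populations with a 10-fold ratio (real ratios are far larger).
populations = {"small": 1_000_000, "large": 10_000_000}

for criterion in ("equal_cv", "equal_variance"):
    sizes = allocate(populations, total_sample=101_000, criterion=criterion)
    rates = {state: sizes[state] / populations[state] for state in populations}
    print(criterion, sizes, rates)
```

In this toy case the equal-variance criterion gives the large state a sampling
rate ten times that of the small state, which illustrates why neither extreme
is likely to be adopted as it stands.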
A decision to require direct state estimates must be considered in light of its
implications for accuracy of estimated population shares for other domains, such
as urban versus suburban areas within states and white, black, and Hispanic
populations in a region. We recommend that alternative ICM designs should be
prepared with and without direct state estimates and that the added costs or loss of
accuracy for other domains of interest should be made clear at a policy-making
level before it is decided whether this feature is essential. Compromises may also
deserve consideration, such as requiring direct estimation only for states and
cities larger than some minimum size.
Use of direct estimation for some or all states would not preclude calculation
of indirect estimates for subdomains within states, down to a level of detail
similar to the poststratification in the 1990 PES. For example, the total
population of Idaho might be obtained from a direct estimate, but the estimate for urban
Idaho (compared with suburban Idaho) might be based in part on adjustment
factors calculated from data for Idaho, Wyoming, and Montana, and the estimate
for Native American reservations in Idaho might use nationally estimated adjust-
ment factors for Native American reservations. This could be accomplished, for
example, by calculating synthetic estimates for all domains and then ratio
adjusting them or raking them to match direct state estimates. Selection of estimators
must be guided by an awareness that, although for some purposes state estimates
are the most important product of the census, for other purposes the distribution
of population within states, for example by race and urbanicity, is paramount.
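
A minimal sketch of the ratio-adjustment step follows: synthetic estimates for
subdomains of a state are scaled by a common factor so that they sum to the
direct state estimate. The domain names and all numbers are hypothetical.
Raking applies the same idea iteratively when several margins must be matched
at once.

```python
def ratio_adjust(synthetic_by_domain, direct_total):
    """Scale synthetic subdomain estimates so they sum to the direct total."""
    synthetic_total = sum(synthetic_by_domain.values())
    scale = direct_total / synthetic_total
    return {domain: value * scale
            for domain, value in synthetic_by_domain.items()}


# Hypothetical synthetic estimates for subdomains of one state.
synthetic = {"urban": 400_000.0, "suburban": 350_000.0, "reservation": 50_000.0}
direct_state_total = 816_000.0  # hypothetical direct estimate for the state

adjusted = ratio_adjust(synthetic, direct_state_total)
print(adjusted)
```

The adjustment preserves the relative shares implied by the synthetic
estimates while forcing agreement with the direct state total.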
Recommendation 4.5: The Census Bureau should prepare alternative
sample designs for integrated coverage measurement with varying levels
of support for direct state estimation. The provision of direct state
estimates should be evaluated in terms of the relative costs and the
consequent loss of accuracy in population estimates for other geographic
areas or subpopulations of interest.
What Form Will Final Population Counts Take?
The Census Bureau has placed a high priority on making all census products
consistent internally and with each other. Consistency requires that, when sev-
eral published cross-tabulations share a common stub or margin, or when such a
margin can be calculated by summing numbers from several tables, the margins
of various tables should agree with each other. For example, the total of the
number of white males over age 18 in a state should be the same regardless of
whether it is read off the state totals or calculated from tables by county, by tract,
or by block. Similarly, published means or proportions should agree with quan-
tities that might be calculated from tables.
A wide variety of census products are produced, and it is impossible to
foresee all tabulations that will be generated as part of regular data series or
special tabulations. Therefore, the surest way of guaranteeing this consistency is
to produce a microdata file that looks like a simple enumeration of people and
households but includes records for people estimated through NRFU sampling
and ICM as well as those directly enumerated. Then tabulations and public-use
microdata samples may be produced from this microdata file. There are two
difficulties inherent to this approach. First, simple estimation methods such as
those described above for ICM estimation produce adjustment factors that can be
used to calculate expected numbers of people. They do not, however, describe
the full detail required to create a roster of households that include the estimated
number of people in each estimation cell. Therefore, procedures must be devel-
oped that predict household structure for the "estimated" part of the population.
These procedures could take various forms, ranging from arbitrary grouping of
people to full probability modeling.
Second, the rounding inherent in imputation of complete households adds
noise to the estimates. For example, if calculations based on the adjustment
factors indicate that 0.15 person should be estimated in a block (beyond the
enumerated roster) in each of 9 cells, then the total number of people added will
have to be rounded to either 1 or 2, and the number in each cell to either 0 or 1.
The requirements of creating realistic households may require even more round-
ing, since it may not be realistic to assume that the estimated people were neces-
sarily in households of only one or two people, even if that is the average number
estimated per block. A further complication is that there would be a stochastic
element in this calculation. Both the rounding and stochastic components of error
would be most noticeable at the most detailed levels of geography. At more
aggregated levels, rounding could be controlled to area totals, and the stochastic
component of error would tend to average out. This problem is another research
area that will have to be considered in the years before 2000.
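
The rounding problem in the example above can be sketched as follows. The
procedure shown (a random choice of the block total, then a random selection of
cells) is only one hypothetical way to round while preserving expected values;
it is not a description of a Census Bureau procedure.

```python
import math
import random


def round_block(expected_by_cell, rng):
    """Randomly round fractional expected persons to whole persons.

    The block total is rounded down or up with probabilities that keep its
    expected value unchanged; that many distinct cells then receive 1 person.
    Assumes every cell expectation is below 1 and all cells are equal.
    """
    total = sum(expected_by_cell.values())
    base = math.floor(total)
    # Round up with probability equal to the fractional part of the total.
    n_people = base + (1 if rng.random() < total - base else 0)
    chosen = set(rng.sample(sorted(expected_by_cell), n_people))
    return {cell: int(cell in chosen) for cell in expected_by_cell}


rng = random.Random(2000)  # fixed seed so the sketch is reproducible
expected = {f"cell_{i}": 0.15 for i in range(1, 10)}  # 9 cells x 0.15 = 1.35

imputed = round_block(expected, rng)
print(sum(imputed.values()), imputed)
```

Each run adds either 1 or 2 people to the block, and each cell receives 0 or 1,
so the error in any single block is large relative to the 1.35 expected, even
though it averages out over many blocks.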
Weighting presents an alternative to imputation that avoids the above diffi-
culties. But weighting possesses the disadvantage that it produces noninteger
counts. Rounding must therefore be performed after the weighted tables are
created, thus complicating the task of maintaining consistency in all census
products.
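
A small hypothetical example shows how rounding weighted tables can break
consistency: rounding each county separately and then adding gives a different
state total than rounding the state total itself. The county names and counts
are invented for illustration.

```python
# Hypothetical weighted (noninteger) counts for three counties in one state.
weighted_counts = {"county_a": 10.4, "county_b": 10.4, "county_c": 10.4}

# Round each county table separately, then add the rounded figures.
sum_of_rounded = sum(round(v) for v in weighted_counts.values())

# Round the state total computed from the unrounded weighted counts.
rounded_state_total = round(sum(weighted_counts.values()))

print(sum_of_rounded, rounded_state_total)  # 30 31
```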
A minor, but possibly sensitive, issue is the treatment in estimation of counts
associated with census forms that are collected late, after the implicit reference
point of ICM. In principle, these should not be counted because the factors
derived from ICM do not include these forms in their base. Simply ignoring
these forms might be a public relations disaster, however. One appealing solu-
tion would be to substitute these forms for households that would have been
imputed for ICM without changing the total number estimated for any geographi-
cal area. Similar issues may arise with respect to late mailback returns that come
in after NRFU sampling and data collection have taken place.
Acceptable Accuracy for Estimates
The design of NRFU and ICM samples is motivated by considerations of
desired accuracy at various levels of geographic aggregation. Under all designs
within the current range of consideration, the NRFU sample will be much larger
than the ICM sample (10-33 percent versus less than 1 percent), but the portion of
the population that will be estimated and imputed on the basis of NRFU sampling
(on the order of the mail nonresponse rate, about 30 percent in 1990) is also much
larger than the portion estimated through ICM (of the order of 1-2 percent of pre-
ICM totals, on the average). Therefore, issues about the accuracy of estimates
under NRFU sampling are concerned primarily with small levels of aggregation,
such as blocks, tracts, and minor political divisions, and issues about ICM accu-
racy are concerned primarily with larger levels of aggregation, such as states,
demographic groups in broad geographic regions, and cities.
Early research (Fuller et al., 1994) suggests that coefficients of variation for
block population estimates under NRFU sampling will be high. Research may
lead to improved estimators that reduce variance for small areas, but it is unlikely
that any estimator can make major gains in precision at this level. Precise block-
level counts are rarely if ever needed, however, except as a means for building up
estimates for larger levels of aggregation. Therefore, evaluation of NRFU sam-
pling should focus on accuracy at the level of minor civil divisions, state legisla-
tive districts, and similar-sized units.
Conversely, evaluation of ICM should focus on accuracy at broader levels of
aggregation. There are important interactions between ICM sample design, esti-
mation methods, and the units for which ICM accuracy is measured. The ICM
sample must be designed to give acceptable accuracy for important units such as
those listed above. It would be desirable to attain a level of accuracy such that
error for these units is smaller than the differential coverage of the pre-ICM phase
of the census, but this may not be attainable within acceptable limits for the scale
of ICM.
The Role of Demographic Analysis
Demographic analysis refers to the estimation of population using the basic
demographic identities relating population to births, deaths, immigration, and
emigration (e.g., Robinson et al., 1993, 1994). In practice, demographic analysis
can be used to obtain population estimates at high levels of aggregation. Tradi-
tional methods for demographic analysis yield estimates of the national popula-
tion cross-classified by age, sex, and race.
In the 1990 census, demographic estimates were used as a check on the
aggregate accuracy of dual-system estimates of undercount by age × sex × race
group (Robinson et al., 1993). They are particularly suited to this purpose
because most of the components of demographic estimates are quite accurately
determined, although estimates of undocumented immigration remain
controversial (see, e.g., Bean et al., 1990). In addition, demographic estimates
of sex ratios have been used to check the internal consistency of dual-system
estimates from the 1990 PES (Bell, 1993). For this purpose, the DA estimates of
sex ratios were
regarded as more reliable than DA estimates of population totals, and the results
suggested a substantial undercount of black males, even after dual-system esti-
mation. Demographic estimates were also used to evaluate the face validity of
subnational dual-system estimates in the decision on adjustment of the 1992
postcensal estimates base.
More recent and ongoing efforts have attempted to develop improved demo-
graphic estimates at subnational geographic areas for the youngest and oldest
segments of the population. Estimates for the population age 65 and over have
been produced using state-level Medicare records, with allowance for
state-by-state variability in Medicare enrollment rates and in percentages of
eligible members of the population age 65 and over (Robinson et al., 1993).
State enrollment
and eligibility rates are affected by such variables as citizenship status and the
proportion of retirees who have held federal government jobs. The Census Bu-
reau has also worked to develop subnational estimates for the population under
age 10, using vital statistics and estimates of interstate migration (Robinson et al.,
1994). These estimates require assumptions about the state-to-state variability in
migration patterns, the completeness of birth records, and the use of valid resi-
dence definitions during hospital birth registration.
These research programs are breaking important new ground, but it would be
premature to judge the credibility of these estimates. As noted above, the esti-
mates are based on a number of assumptions that require further evaluation.
Also, a major limitation of demographic analysis is that the uncertainties in
estimates provided by the method are largely unknown (see Clogg and Himes,
1993; Robinson et al., 1993).
Demographic analysis possesses a number of potential strengths as a method
for coverage evaluation in the 2000 census: operational feasibility, timeliness
(estimates could be available early in the census year), low cost, independence of
ICM, and comparability to historical series (Robinson et al., 1993, 1994; Clogg
and Himes, 1993). But, in addition to the problem of measuring uncertainty,
there remain significant difficulties associated with the use of demographic analy-
sis that currently limit its role in the decennial census: the lack of reliable data on
international migration, particularly for emigration and undocumented
immigration; questionable measures of interstate migration, which limit the
accuracy of subnational estimates; and problems with racial classification in
birth and death records and their congruency with self-identification in the
census, which affect
estimates for all groups (especially Hispanics, Asians and Pacific Islanders, and
Native Americans).
Because of the limitations of foreseeable progress on these methodological
problems, we expect that demographic analysis will be useful primarily as an
evaluation tool for ICM in the 2000 census. We assume that, as in recent cen-
suses, national estimates of the population cross-classified by age, sex, and race
will be produced for this purpose. It may also be possible and worthwhile to
incorporate some demographic information (e.g., estimates of sex ratios) into
the estimation procedure for integrated coverage measurement. However, based
on the current state of research, we doubt that demographic analysis could be
used in the 2000 census to adjust (or benchmark) final population estimates as
part of integrated coverage measurement.
Nonetheless, the panel believes further research on demographic methods is
a cost-effective investment that could pay long-term dividends beyond the contri-
butions to census coverage and evaluation. An exciting new development is the
convergence of demographic analysis, the postcensal estimates program, and the
Program for Integrated Estimates in connection with the proposed continuous
measurement system (see Chapter 6). The common ground of these programs is
that each of them uses a variety of data sources to improve estimation of popula-
tion counts and characteristics without relying on the census itself. Demographic
analysis traditionally has depended primarily on demographic data, as described
above, in order to obtain very aggregated estimates; administrative records such
as Medicare registrations have also begun to be used in this program. The
postcensal estimates program (Long, 1993) combines census-year population
counts with a variety of indicators of change of the population at local levels,
such as school enrollments and changes in housing stock, to obtain estimates of
the population at a fairly detailed level. The Program for Integrated Estimates
(Alexander, 1994) would make use of a wide variety of sources, including data
from a new survey and a variety of administrative records with coverage for
people, households, or housing units, to produce detailed estimates of counts and
characteristics down to the block and tract levels (see Chapter 5 for a discussion
of the potential of health care records in this regard).
The three programs described above can be seen in a progression according
to time of development (from several decades ago to the present and future), cost
(from least to most expensive), operational difficulty (from easiest to most
challenging), use of localized sources (from purely aggregate analysis to integration
of microdata sources), and degree of small-area precision (from least to most
detailed). Each step in this progression has been justified by contemporary
needs, but each step also brings us closer to having the technical capabilities and
experience required to be able to obtain adequate information from administra-
tive records without having to mount a full-scale census.
Recommendation 4.6: The panel endorses the continued use of demo-
graphic analysis as an evaluation tool in the decennial census. However,
the present state of development does not support a prominent role for
demographic methods in the production of official population totals as
part of integrated coverage measurement in the 2000 census. The Cen-
sus Bureau should continue research to develop subnational demo-
graphic estimates, with particular attention to potential links between
demographic analysis and further development of the continuous mea-
surement prototype and the administrative records census option.
Other Uses of Estimation
As discussed in Chapter 3, current proposals (Kalton et al., 1994) for enu-
meration of the homeless population call for service-based enumeration. Under
these plans, persons making use of services, such as shelters and soup kitchens,
would be enumerated on several different occasions. The lists from these enu-
merations will then be matched so that the degree of overlap in the service
population from day to day can be determined. These data will be the basis for
estimation of the total homeless population that makes use of these services.
These estimation methods are related to dual-system estimation, but there are
special complications because one site might be enumerated on several different
occasions and because the same person could appear for services at more than
one site.
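
The overlap calculation can be sketched for the simplest case of two occasions.
The sketch assumes a closed population between the two days, an equal chance of
using services on each day, and perfect matching of identifiers. The site
names and person identifiers are hypothetical, and the code handles only one
of the complications noted above (the same person appearing at more than one
site on the same day).

```python
def overlap_estimate(day1_lists, day2_lists):
    """Two-occasion capture-recapture estimate from per-site service lists.

    Each argument maps a site name to the set of person identifiers
    enumerated there on that day. Taking the union across sites counts a
    person who used several sites on the same day only once.
    """
    day1 = set().union(*day1_lists.values())
    day2 = set().union(*day2_lists.values())
    matched = day1 & day2
    if not matched:
        raise ValueError("no overlap between occasions; cannot estimate")
    return len(day1) * len(day2) / len(matched)


# Hypothetical enumerations at two sites on two different days.
day1 = {"shelter": {"p1", "p2", "p3", "p4"}, "soup_kitchen": {"p3", "p5", "p6"}}
day2 = {"shelter": {"p1", "p2", "p7"}, "soup_kitchen": {"p5", "p8"}}

print(overlap_estimate(day1, day2))  # 10.0
```

Here 6 distinct people are seen on the first day and 5 on the second, with 3
seen on both, which gives an estimated service-using population of 10.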
Statistical modeling and estimation may also play a role in the use of quality
assurance (QA) data to monitor and evaluate the coverage measurement survey
(Biemer, 1994). Evaluation of this survey is difficult, expensive, and
controversial, because it requires replication and reconsideration of judgmental decisions
made during the original coverage measurement operation. For example, the
most skilled and experienced matching staff may reanalyze data and obtain addi-
tional data from the field long after the census to check the accuracy of matching
determinations made during the CensusPlus operation. Because these studies are
difficult and depend on skills that are in short supply, they will almost certainly
be small compared with the coverage measurement sample itself. If research
over the next few years can identify QA measures of the census and coverage
measurement process that are correlated with evaluation outcomes, the QA data
will provide useful auxiliary variables for evaluation of the distribution and con-
sequences of errors in the coverage measurement operation.
Prespecification and Documentation of Procedures
In the year 2000 census, increased reliance on estimation makes it essential
that the Census Bureau's choice of estimation methodologies should inspire gen-
eral confidence. It is unrealistic to hope that there will be total unanimity in
support of any full set of methodologies. Census methods have always received
criticism insofar as they have had a discernible effect on an identifiable geo-
graphic area or a particular group. For example, the local review process during
the 1990 census led to inclusion of units that were determined to have been
omitted from address lists and to recanvassing of some areas. Many procedural
decisions, however, are invisible to those outside the Census Bureau and have
consequences that are obscure. Decisions about estimation methods have been,
and in all likelihood will continue to be, especially controversial (1) because they
are open and explicit and (2) because their effects on totals for identifiable areas
can be determined, at least after the fact. We address each of these consider-
ations, taking the second first.
Past experience (for example, with adjustment of the 1990 census and with
adjustment of the 1992 base for postcensal estimates) has shown that the majority
of the public comments received on an estimation methodology are motivated by
concerns for its effects on particular political jurisdictions (Bryant, 1993).
Another common concern was that confusion would be created by having two sets of
results, those before and after adjustment. By prespecifying estimation methods
as much as possible, the Census Bureau makes it clear that decisions have been
made on good statistical principles and judgment, rather than being motivated by
any consideration of how they will affect particular areas. Adopting such a
policy avoids the 1990 experience of placing estimation choices before decision
makers who could perceive the political consequences of different procedures. In
this way, the Census Bureau obtains some protection against criticisms directed
at the particular effects of methods.
A proper balance must be struck that gains these benefits of prespecification
without committing the Census Bureau to a rigid set of procedures that permit no
leeway for handling unforeseen circumstances in the conduct of the census or
adapting estimators to unanticipated features of the data. Therefore, an appropri-
ate level of prespecification includes a positive statement of areas in which judg-
ment may be exercised as well as areas in which decisions are made before the
census. For example, the general form of the estimators to be used and the flow
of information in processing should be prespecified, but it may be recognized in
advance that judgment will be exercised in deletion or downweighting of outliers,
variable selection in regression models, or splitting of poststrata.
The openness and explicitness of estimation methodologies make it possible
to begin to build consensus in support of them, both technically and politically,
before the census even begins. To make this possible, the Census Bureau should
release a publication describing the main steps in data collection and processing
and the estimation methods that will be used in the census, including a descrip-
tion of estimators, of evaluations that will be applied to these estimators, and of
points at which the use of professional judgment is foreseen.
The process of consensus-building continues after the census through the
release of suitable technical documentation. This documentation, as well as
describing in general terms the estimators that were used, should present in
aggregate terms the calculations that produced population totals reported for
major geographic areas (states and large cities), as well as for major demographic
groupings. It must be emphasized that this documentation would be released
after the publication of census population figures used for apportionment and
redistricting, and that the intermediate totals in these calculations should not be
interpreted as competing estimates of population. The procedures should be
regarded as an integrated whole, not a menu of options from which various
parties can pick and choose to find the treatment most favorable to their local
area. The postcensal documentation should also contain a summary of evaluation
results. Summary measures of accuracy for various levels of aggregation, such as
those calculated through the total error model in the evaluation of the 1990 PES,
may be a suitable format for summarizing these evaluation results.
Recommendation 4.7: Before the census, the Census Bureau should
produce detailed documentation of statistical methodology to be used
for estimation and modeling. After the census, the Census Bureau should
document how the methodology was applied empirically and should
provide evaluation of the methodology.
Reporting of Uncertainty
Official statistics have progressed over the century from a narrow focus on
simple tabulations of population characteristics to provision of a range of census
products, including complex tabulations and sample microdata files. Analytical
uses of these data require availability of both point estimates and measures of
uncertainty. When complex statistical methods, such as complex sampling
schemes, indirect estimation, and imputation are used in creating census prod-
ucts, users will not be able to derive valid measures of uncertainty by elementary
methods, and they may not have adequate information in the published or avail-
able products to derive these measures. It therefore becomes the responsibility of
the data producers to facilitate estimation of uncertainty.
Total error models have been used by the Census Bureau to measure uncer-
tainty in the outcomes of the census and the contributions of the various sources
of error to this uncertainty (Hansen et al., 1961). More recently, a total error
model was developed for estimation of uncertainty in adjusted estimates based on
the 1990 census and PES (Mulry and Spencer, 1993). Such models take into
account both sampling errors in the estimates and potential biases stemming from
the regular census and from coverage estimation. Bias can arise, for example,
from use of several response modes or from differences among response times.
Similar models may be a useful tool for evaluating uncertainty in integrated
estimates from a complex census in the year 2000.
After uncertainties have been estimated, they should be described in a man-
ner that allows users to incorporate them into their data analyses. A variety of
methods for representing uncertainty are familiar from the world of survey
sampling. Summary measures of uncertainty (such as average coefficients of
variation or variance functions) may be published as a supplement to published
tabulations, or standard errors may be published for quantities of particular interest.
A number of imputation methodologies are available (Rubin, 1987; Clogg et al.,
1991) that enable users of public use microdata samples to estimate the effects of
sampling and nonsampling variability on their analyses.
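
As one illustration of how users might estimate uncertainty from imputed
microdata, the sketch below applies the standard combining rules for multiple
imputation, assuming that multiple imputation in the sense of Rubin (1987) is
the methodology released. The function name and the numbers are hypothetical.

```python
from statistics import mean, variance


def combine_imputations(estimates, within_variances):
    """Combine results from m completed data sets (Rubin's combining rules).

    estimates: point estimate from each completed data set
    within_variances: sampling variance of each of those estimates
    Returns the combined estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need matching lists from at least two imputations")
    q_bar = mean(estimates)          # combined point estimate
    w_bar = mean(within_variances)   # average within-imputation variance
    b = variance(estimates)          # between-imputation (sample) variance
    total_variance = w_bar + (1 + 1 / m) * b
    return q_bar, total_variance


# Hypothetical results from five imputed versions of a microdata sample.
estimate, total_var = combine_imputations(
    [10.0, 12.0, 11.0, 9.0, 13.0],
    [4.0, 4.0, 4.0, 4.0, 4.0],
)
print(estimate, total_var)
```

The between-imputation term reflects uncertainty due to the imputed values
themselves, which an analysis of a single completed data set would miss.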
Recommendation 4.8: The Census Bureau should develop methods for
measuring and modeling all sources of error in the census and for show-
ing uncertainty in published tabulations or otherwise enabling users to
estimate uncertainty.
Research Program on Estimation
Necessary research on statistical estimation divides roughly into three phases.
In the first phase, which is now under way and continues until the major design
decisions have been made for the 1995 census test, estimation research focuses
on broadening the range of possibilities for the use of sampling and other statis-
tically based techniques. In this phase, preliminary assessments can be obtained
of the expected precision for various designs.
In the second phase, roughly coinciding with the planning, execution, and
processing of the 1995 census test, the emphasis shifts to developing methods
needed for the selected designs and methodological features. Although it is not
necessary during this phase to decide on all the estimators that will be used, it is
critical that enough progress be made on NRFU sampling and ICM estimators
to avoid making decisions about design based on estimators that will later be
replaced.
In the final phase, beginning with assessment of the 1995 census test and
continuing through the decade, the selected estimation methods will have to be
consolidated, optimized, validated, and made both theoretically and operationally
robust. This last process will ensure that they can stand up to critical scrutiny and
to problems that may arise in the course of the 2000 census. In this phase, work
will also continue on selecting estimation procedures required for the production
of all census products, including measures of uncertainty, and on more complex
procedures that will be used in evaluation of the census estimates.
Recommendation 4.9: The Census Bureau should vigorously pursue
research on statistical estimation now and throughout the decade. Top-
ics should include nonresponse follow-up sampling, coverage estima-
tion, incorporation of varied information sources (including administra-
tive records), and indirect estimation for small areas.