Coverage Measurement in the 2010 Census
Executive Summary
The U.S. Census Bureau is justifiably proud of its more than 50-year history of evaluating the degree of net coverage error (undercoverage minus overcoverage) of the population in the decennial census. In addition to the information provided by the coverage measurement programs, this effort has resulted in the development of two internationally used census coverage measurement methods, dual-systems estimation and demographic analysis. Dual-systems estimation uses a coverage measurement post-enumeration survey as an independent enumeration of a population. Based on the number of matches between the census and the post-enumeration survey enumerations, an estimate of the number missed by both enumerations is generated, and therefore an estimate of the size of the population. Demographic analysis uses an accounting relation to estimate a population group’s size, adding births and immigration and subtracting deaths and emigration.
The census coverage measurement programs have historically addressed three primary objectives with varying degrees of emphasis: (1) to inform users about the quality of the census counts for various applications; (2) to help identify sources of error to improve census taking; and (3) to provide alternative (“adjusted”) counts based on information from the coverage measurement program. In planning the 1990 and 2000 censuses, the main objective was to produce alternative counts based on the measurement of net coverage error, although the alternative counts were never used for either reapportionment or redistricting. Subsequently, a 1999 Supreme Court decision precluded the use of alternative counts, when based on sampling, in reapportionment. In addition, it is difficult to provide alternative counts in time for reapportioning congressional districts. Consequently, for the 2010 census coverage measurement program, the Census Bureau has stated its intent to deemphasize the goal of providing alternative counts and is instead planning on focusing its coverage measurement program on the second goal of improving census processes.
The panel strongly supports the Census Bureau’s change in goal. However, the panel finds that the current plans for data collection, data analysis, and data products are still too oriented toward measurement of net coverage error to fully exploit this new focus. Although the Census Bureau has taken several important steps to revise data collection and analysis procedures and data products, the panel recommends further steps to enhance the value of coverage measurement for the improvement of future census processes.
Recommendation 1: The Census Bureau should more completely shift its focus in coverage measurement from that of collecting data and developing statistical models with the goal of estimating net coverage error to that of collecting data and developing statistical models that support the improvement of census processes.
To help achieve this new goal, instead of only measuring net census error, the Census Bureau also plans to measure the four components of census coverage error: (1) census omissions, (2) census duplications, (3) erroneous census enumerations, and (4) census enumerations in the wrong location. The panel supports these plans, since different types of coverage errors are caused by different interactions between census processes and housing units and their occupants.
The estimation of these four components of coverage error can be supported by the general structure of the data collection and matching that is carried out in support of dual-systems estimation, though modified and expanded to support this different purpose. The panel finds, however, that the Bureau’s plans could be more fully developed for this purpose.
Recommendation 9: The Census Bureau should further develop and refine its framework for defining the four basic types of census coverage error and measuring their frequency of occurrence. The Census Bureau should also develop plans for operationalizing the measurement of these components using data from the census and the census coverage measurement program.
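As a rough illustration of the dual-systems idea described at the start of this summary, the sketch below uses the textbook capture-recapture (Lincoln-Petersen) form. It assumes the two enumerations are independent and that matching is error-free; it is not the Census Bureau's production estimator, and the function names and counts are hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Textbook dual-systems (Lincoln-Petersen) estimate of population size.

    Assumption: the census and the post-enumeration survey are independent
    enumerations and every match is identified correctly.
    """
    if matched_count <= 0:
        raise ValueError("at least one matched person is required")
    return census_count * survey_count / matched_count


def missed_by_both(census_count, survey_count, matched_count):
    """Estimated number of people missed by both enumerations."""
    total = dual_system_estimate(census_count, survey_count, matched_count)
    counted_at_least_once = census_count + survey_count - matched_count
    return total - counted_at_least_once


def net_undercount_rate(census_count, estimated_total):
    """Net coverage error as a share of the estimated population."""
    return (estimated_total - census_count) / estimated_total


# Hypothetical counts for one small area, for illustration only.
census_count = 900    # people counted in the census
survey_count = 800    # people counted in the post-enumeration survey
matched_count = 720   # people found in both

estimated_total = dual_system_estimate(census_count, survey_count, matched_count)
print("estimated population:", estimated_total)
print("missed by both:", missed_by_both(census_count, survey_count, matched_count))
print("net undercount rate:", net_undercount_rate(census_count, estimated_total))
```

A net rate of this kind combines omissions with duplications and other erroneous enumerations, which is why the panel asks for the four components to be measured separately.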
Certain types of housing units are more likely to be missed than others, yet the Bureau’s current design for the coverage measurement post-enumeration survey does not adequately take this into account.
Recommendation 6: The Census Bureau should compare its sample design for the 2010 census coverage measurement post-enumeration survey with alternate designs that give greater sampling probability to housing units that are anticipated to be hard to enumerate. If an alternate design proves preferable for the joint goals of estimating component coverage error and net coverage error, such a design should be used in place of the current sample design.
Thorough analysis of data from the coverage measurement survey offers a unique opportunity to learn how census errors occur and how census processes might be changed to reduce them in the future. Working with outside researchers to the extent possible, the Census Bureau should study and give consideration to a richer menu of analytic methods using data collected from the coverage measurement post-enumeration survey. To date, the Census Bureau has not given sufficient attention to developing statistical models that link the frequency of the four components of coverage error to census processes, person and housing characteristics, and other predictors. These models, which can be thought of as forms of discriminant analysis, could use a wide variety of approaches, including logistic regression and various data mining methods, such as classification trees, support vector machines, and neural nets. Modeling the frequency of erroneous enumerations may benefit from an entirely different approach than modeling the frequency of census duplicate enumerations or census omissions. Consideration should also be given to the potential for using predictor variables that are specific to each type of error.
Also, the use of separate models for distinct population subgroups should be considered.
Recommendation 12: The Census Bureau should develop regression models that elucidate the various types of census coverage error, using specified dependent and predictor variables. To the extent that the database supporting these models can be made available to external researchers, it is extremely important that the Census Bureau pursue all viable avenues to involve outside researchers in the development of such models.
Recommendation 10: In developing the logistic regression models or other types of discriminant-analysis models of match status, correct enumeration status, and components of census coverage error, the Census Bureau should consider:
• Use of several approaches before focusing on a specific model; besides logistic regression, alternatives should include use of other link functions, discriminant analysis, and various data mining approaches, such as classification trees, support vector machines, and neural nets.
• Thorough examination of the subset of predictors that is best suited to each individual statistical model; the predictors for these various statistical models need not be identical; however, there may be a benefit to constraining the (logistic regression) models of match rate and correct enumeration rate to have identical variables in the estimation of net coverage error, and research should be carried out to assess whether this benefit outweighs the benefit of selecting variables that are optimal for each of these two logistic regression models.
To effectively blend information from auxiliary sources at various levels of geographic and demographic aggregation, random effects modeling and Bayes’ methods also should be examined.
This effort will require that considerable resources be allocated for the development and use of these models, comprising essentially a new Census Bureau research program.
Recommendation 2: The Census Bureau should allocate sufficient resources, including funding and staff, to assemble and support an ongoing intercensal research program on decennial census improvement. Such a group should focus on using the data from the census and the census coverage measurement programs to identify deficient census processes and to propose better alternatives. The work of this group should be used to help design the census tests early in the next decade.
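The kind of model discussed in Recommendations 12 and 10 can be illustrated with a minimal logistic regression that relates one component of coverage error to predictors. The sketch below is only an assumption-laden toy: the two binary predictors (multi-unit structure, enumerated during follow-up), the omission indicator, and all ten records are hypothetical, and the fitting routine is plain batch gradient descent rather than any method the panel or the Census Bureau specifies.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(weights, features):
    """Predicted probability of the error type; weights[0] is the intercept."""
    linear = weights[0] + sum(w * v for w, v in zip(weights[1:], features))
    return sigmoid(linear)


def fit_logistic(rows, labels, learning_rate=0.5, iterations=2000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    n_rows = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(iterations):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict_probability(weights, features) - label
            gradient[0] += error
            for j, value in enumerate(features):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / n_rows
                   for w, g in zip(weights, gradient)]
    return weights


# Hypothetical records: [in_multiunit_structure, enumerated_in_followup].
rows = [[0, 0], [0, 0], [0, 0], [0, 1], [0, 1],
        [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
# Hypothetical outcome: 1 if the person was a census omission, else 0.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 0]

weights = fit_logistic(rows, labels)
for profile in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(profile, round(predict_probability(weights, profile), 3))
```

A separate model of this form could be fit for each error component, each with its own predictor set, which is the flexibility the second bullet of Recommendation 10 weighs against constraining the match-rate and correct-enumeration-rate models to share variables.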
To support this research program, it is important that sufficient data from 2010 be retained both for the measurement of the components of census coverage error and to provide the predictors that might be useful in these models.
Recommendation 13: For a sample of households, the Census Bureau should retain data that provide a comprehensive picture of the census processes used to enumerate each household, and of the individuals residing in it, to facilitate subsequent evaluation. To allow linking assessment of census coverage error with a history of the census processes, this sample should substantially overlap with the census coverage measurement sample.
The creation and exploitation of an analytic database, in order to improve census processes, should be the primary goal of the coverage measurement program in 2010. However, the Census Bureau is focusing instead on producing summary tabulations related to the frequency of the components of coverage error by major census process. Such tabulations will have little value: the complexity of coverage error requires a more sophisticated use of the data through development of statistical models for each component of coverage error.
Recommendation 11: The primary output of the Census Bureau’s coverage measurement program in 2010 should be an analytic database that is used to support the development of statistical models to inform census process improvement. The production of summary tabulations should be of lesser priority.
In addition to the topics discussed here, the panel also offers recommendations in five areas: (1) the need to retain comprehensive information on the functioning of the coverage follow-up interview, (2) the timing of the coverage follow-up interview in relationship to the timing of the census coverage measurement data collection, (3) the testing of administrative records for various census purposes, (4) the development of improved techniques for treating missing data in coverage measurement models, and (5) research to guide improvement of demographic analysis.
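Demographic analysis, the subject of the last research area above, rests on the accounting relation described at the start of this summary. The sketch below states that relation directly; the function names, the optional base population, and all counts are hypothetical, and real demographic analysis depends on the quality of the underlying vital and migration records.

```python
def demographic_analysis_estimate(births, deaths, immigration, emigration,
                                  base_population=0):
    """Accounting-relation estimate of a population group's size.

    Adds births and immigration, subtracts deaths and emigration.
    base_population is an assumed starting count (0 for a cohort
    followed from birth).
    """
    return base_population + births + immigration - deaths - emigration


def net_coverage_error(benchmark_estimate, census_count):
    """Benchmark estimate minus census count; positive means a net undercount."""
    return benchmark_estimate - census_count


# Hypothetical counts for one birth cohort, for illustration only.
estimate = demographic_analysis_estimate(
    births=1000, deaths=200, immigration=150, emigration=50
)
print("demographic analysis estimate:", estimate)
print("net coverage error:", net_coverage_error(estimate, census_count=880))
```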