Certain types of housing units are more likely to be missed than others, yet the Bureau’s current design for the coverage measurement postenumeration survey does not adequately take this into account.

Recommendation 6: The Census Bureau should compare its sample design for the 2010 census coverage measurement postenumeration survey with alternate designs that give greater sampling probability to housing units that are anticipated to be hard to enumerate. If an alternate design proves preferable for the joint goals of estimating component coverage error and net coverage error, such a design should be used in place of the current sample design.
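The kind of design comparison called for in Recommendation 6 can be illustrated with a minimal sketch. It is not the Bureau's method. It assumes only two strata ("easy" and "hard" to enumerate), simple random sampling without replacement within each stratum, and made-up stratum sizes and omission rates. The function name `stratified_total_variance` and all numbers are hypothetical. The sketch compares the variance of an estimated omission total under proportional allocation with the variance under an allocation that oversamples the hard-to-enumerate stratum.

```python
def stratified_total_variance(strata, allocation):
    """Variance of the stratified estimator of a population total.

    Assumes simple random sampling without replacement within each stratum.
    strata:     {name: (N_h, p_h)} with N_h housing units and an assumed
                rate p_h at which a unit is omitted (0/1 outcome).
    allocation: {name: n_h} sample size drawn from each stratum.
    """
    variance = 0.0
    for name, (N, p) in strata.items():
        n = allocation[name]
        # Population variance (divisor N - 1) of a 0/1 indicator with mean p.
        s2 = p * (1.0 - p) * N / (N - 1)
        # Standard stratified-sampling term, with finite population correction.
        variance += N ** 2 * (1.0 - n / N) * s2 / n
    return variance


# Hypothetical strata: (number of housing units, assumed omission rate).
strata = {"easy": (90_000, 0.01), "hard": (10_000, 0.10)}

# Two candidate designs with the same total sample size of 1,000 units.
proportional = {"easy": 900, "hard": 100}  # sample shares match population shares
oversampled = {"easy": 750, "hard": 250}   # greater sampling probability for "hard"

var_proportional = stratified_total_variance(strata, proportional)
var_oversampled = stratified_total_variance(strata, oversampled)

print(f"proportional allocation: variance = {var_proportional:,.0f}")
print(f"oversampled allocation:  variance = {var_oversampled:,.0f}")
```

Under these assumed rates the oversampled design yields the smaller variance for the omission total. A real comparison would need to weigh several components of coverage error and net error jointly, as the recommendation states, and a design that helps one estimate may hurt another.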

Thorough analysis of data from the coverage measurement survey offers a unique opportunity to learn how census errors occur and how census processes might be changed to reduce them in the future. Working with outside researchers to the extent possible, the Census Bureau should study and consider a richer menu of analytic methods using data collected from the coverage measurement postenumeration survey.

To date, the Census Bureau has not given sufficient attention to developing statistical models that link the frequency of the four components of coverage error to census processes, person and housing characteristics, and other predictors. These models, which can be thought of as forms of discriminant analysis, could use a wide variety of approaches, including logistic regression and various data mining methods, such as classification trees, support vector machines, and neural nets. Modeling the frequency of erroneous enumerations may benefit from an entirely different approach than modeling the frequency of census duplicate enumerations or census omissions. Consideration should also be given to using predictor variables that are specific to each type of error and to using separate models for distinct population subgroups.
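As one illustration of the modeling approach described above, the following minimal sketch fits a logistic regression for a single error component (here, omission) to synthetic data. Everything specific in it is an assumption made for illustration: the predictors `renter` and `multiunit`, the "true" coefficients used to generate the data, and the helper names `sigmoid` and `fit_logistic`. It uses plain gradient ascent on the log-likelihood rather than any particular statistical package, and it is not a model the Census Bureau is known to use.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=1.0, epochs=1500):
    """Fit an intercept and one coefficient per predictor.

    Uses batch gradient ascent on the average log-likelihood.
    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            err = yi - p  # gradient contribution of this record
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic person records with two hypothetical 0/1 predictors.
rng = random.Random(0)
true_beta = [-3.0, 1.5, 1.0]  # assumed: intercept, renter, multiunit
X, y = [], []
for _ in range(1500):
    renter = 1.0 if rng.random() < 0.4 else 0.0
    multiunit = 1.0 if rng.random() < 0.3 else 0.0
    p_omit = sigmoid(true_beta[0] + true_beta[1] * renter + true_beta[2] * multiunit)
    X.append([renter, multiunit])
    y.append(1 if rng.random() < p_omit else 0)  # 1 = omitted from the census

beta_hat = fit_logistic(X, y)
print("estimated [intercept, renter, multiunit]:",
      [round(b, 2) for b in beta_hat])
```

In line with the text, a separate model of this form could be fit for each error component (omissions, duplicates, other erroneous enumerations), each with its own predictors, or separately for distinct population subgroups. Classification trees or other methods could be substituted where the logistic form fits poorly.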

Recommendation 12: The Census Bureau should develop regression models that elucidate the various types of census coverage error, using specified dependent and predictor variables. To the extent that the database supporting these models can be made available to external researchers, it is extremely important that the Census Bureau pursue all viable avenues to involve outside researchers in the development of such models.

Recommendation 10: In developing the logistic regression models or other types of discriminant-analysis models of match status,
