6
Accuracy and Coverage Evaluation: Overview

The Accuracy and Coverage Evaluation (A.C.E.) Program was designed for two purposes: to be the primary source of information about the completeness of coverage in the 2000 census for population groups,1 and to provide the basis for recommending adjustment of the census counts for estimated net undercount if the Census Bureau determined that adjusted estimates were more accurate than unadjusted counts for important uses of the data. The design and implementation of the A.C.E. were based on over 20 years of experience at the Census Bureau with dual-systems estimation of the population, using a sample of census records and a sample of records from a separate postcensus survey.

This chapter first summarizes key results from the A.C.E. for 2000, comparing them with results from the 1990 Post-Enumeration Survey (PES) and drawing implications for population coverage in the two censuses. It then gives an overview of dual-systems estimation, which uses results from the A.C.E. and the census to estimate the population. The final section describes the design and operational procedures for the A.C.E. (Appendix C provides a more detailed description) and summarizes important differences from the 1990 PES. The next chapter presents the panel’s assessment of what is known about the quality of the A.C.E. operations.

COVERAGE PATTERNS, 2000 AND 1990

Table 6-1 shows net undercount rates and the associated 90 percent confidence intervals from the 2000 A.C.E. and the 1990 PES for race/ethnicity domains, age and sex, and housing tenure.2 (Separate population estimates

1  

Demographic analysis (see Chapter 5) is another source for evaluating census population coverage; however, demographic analysis is limited to population estimates for age, sex, and black-nonblack population groups for the nation as a whole.

2  

The 90 percent confidence interval is the estimate (e.g., net undercount for Hispanics) plus or minus 1.645 times the standard error of the estimate. Standard errors were estimated by the Census Bureau (see Davis, 2001; see also Appendix C).



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

The 2000 Census: Interim Assessment

TABLE 6-1 Net Undercount for Major Groups, 2000 A.C.E. and 1990 PES (in percent)

Group | 2000 A.C.E. Net Undercount | 2000 A.C.E. 90% Confidence Interval | 1990 PES Net Undercount | 1990 PES 90% Confidence Interval
Total Population | 1.18 | (0.97, 1.39) | 1.61 | (1.28, 1.94)
Race/Ethnicity Domain
  American Indian and Alaska Native on Reservation | 4.74 | (2.77, 6.71) | 12.22 | (3.52, 20.92)
  American Indian and Alaska Native off Reservation | 3.28 | (1.09, 5.47) | N.A. | N.A.
  Hispanic Origin (any race) | 2.85 | (2.22, 3.48) | 4.99 | (3.64, 6.34)
  Black or African American (not Hispanic) | 2.17 | (1.59, 2.75) | 4.57 | (3.67, 5.47)
  Native Hawaiian, Other Pacific Islander (not Hispanic) | 4.60 | (0.04, 9.16) | N.A. | N.A.
  Asian (not Hispanic; includes Pacific Islander in 1990) | 0.96 | (–0.09, 2.01) | 2.36 | (0.07, 4.65)
  White or Some Other Race (not Hispanic) | 0.67 | (0.44, 0.90) | 0.68 | (0.32, 1.04)
Age and Sex
  Under 18 years | 1.54 | (1.23, 1.85) | 3.18 | (2.70, 3.66)
  18 to 29 years, Male | 3.77 | (3.24, 4.30) | 3.30 | (2.41, 4.19)
  18 to 29 years, Female | 2.23 | (1.75, 2.71) | 2.83 | (2.06, 3.60)
  30 to 49 years, Male | 1.86 | (1.55, 2.17) | 1.89 | (1.36, 2.42)
  30 to 49 years, Female | 0.96 | (0.68, 1.24) | 0.88 | (0.47, 1.29)
  50 years and over, Male | –0.25 | (–0.55, +0.05) | –0.59 | (–1.15, –0.03)
  50 years and over, Female | –0.79 | (–1.07, –0.51) | –1.24 | (–1.72, –0.76)
Housing Tenure
  In owner-occupied housing units | 0.44 | (0.21, 0.67) | 0.04 | (–0.31, 0.39)
  In nonowner-occupied housing units | 2.75 | (2.32, 3.18) | 4.51 | (3.80, 5.22)

NOTES: Net undercount is the difference between the estimate (A.C.E. or PES) and the census, divided by the estimate. Minus sign (–) indicates a net overcount. For 2000, total population is the household population; for 1990, it is the household population plus the noninstitutional group quarters population (see Appendix C). The 90% confidence interval is the estimate plus or minus 1.645 times the standard error of the estimate. See Table 6-2 for definitions of race/ethnicity domains in 2000. N.A.: not available (not estimated).

SOURCE: Hogan (2001a:Tables 2a, 2b).

were produced for many more groups, called post-strata, than are summarized here; see “A.C.E. Operations,” below.3) The A.C.E. estimates apply to the household population; they exclude people in institutional and noninstitutional group quarters (e.g., dormitories, prisons, nursing homes, group homes), for whom coverage is not estimated and who therefore, by default, are assumed to have zero net undercount.4 People living in remote Alaska and people enumerated in shelters are also not included in the A.C.E. The PES estimates include people in noninstitutional group quarters, but there are no separate estimates for them.

Overall, the population was undercounted by 1.2 percent in 2000 as estimated by the A.C.E., less than the net undercount rate of 1.6 percent estimated for 1990 by the PES. The 2000 and 1990 estimates of net undercount show similar patterns, in that net undercount rates are significantly higher in both censuses for most minority groups than they are for the white or some other races (not Hispanic) category. However, the 2000 net undercount rates for Hispanics and non-Hispanic blacks are significantly lower than the rates in 1990: for 2000 they are estimated as 2.9 percent and 2.2 percent, respectively, compared with estimates from 1990 of 5 percent and 4.6 percent, respectively. The net undercount rates for the white or other races category are the same in both censuses, 0.7 percent.

By age and sex, net undercount rates in 2000, as in 1990, were higher for men than women. Net undercount rates were also higher in both censuses for younger people than for those aged 50 and over, for whom there is a small estimated net overcount. The most pronounced difference in net undercount rates by age between 1990 and 2000 is for children under age 18, for whom the rate is significantly lower in 2000 (1.5%) than it was in 1990 (3.2%).

By housing tenure, people who rent continued to be undercounted at a higher rate than people who own their homes, but the net undercount rate for renters is significantly lower in 2000 compared with 1990. For owners, the net undercount rate is estimated at less than 0.5 percent in both censuses; for renters, the estimated net undercount rate is 2.8 percent in 2000, compared with 4.5 percent in 1990.

3  

In census terminology, the racial and ethnic groupings used in defining post-strata are called domains. They are carefully defined in a hierarchical manner (due to the option to report more than one race; see Table 6-2), and may differ from colloquial definitions of a particular racial/ethnic group.

4  

This assumption needs evaluation since people in group quarters are estimated to be 2.8 percent of the total population (see Chapter 4).
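The interval arithmetic described in footnote 2 and in the notes to Table 6-1 can be illustrated with a short Python sketch. It recovers a standard error from a published 90 percent interval and then runs a rough check of whether a 2000 estimate differs from its 1990 counterpart. The function names are hypothetical, and the comparison assumes that the A.C.E. and PES estimates are statistically independent and that the same 1.645 multiplier applies to the difference; the source does not state how its own significance statements were derived, so this is only an approximation.

```python
import math

Z90 = 1.645  # multiplier for a 90 percent interval, as given in footnote 2


def se_from_interval(lower, upper, z=Z90):
    """Recover the standard error from a symmetric confidence interval."""
    return (upper - lower) / (2.0 * z)


def interval(estimate, se, z=Z90):
    """Return (lower, upper) = estimate minus/plus z times the standard error."""
    return (estimate - z * se, estimate + z * se)


def differs_significantly(est_a, ci_a, est_b, ci_b, z=Z90):
    """Rough check of a difference between two estimates.

    ASSUMPTION: the two estimates are independent, so the variance of the
    difference is the sum of the two variances.
    """
    se_diff = math.hypot(se_from_interval(*ci_a), se_from_interval(*ci_b))
    return abs(est_a - est_b) > z * se_diff


# Net undercount rates (percent) and 90% intervals copied from Table 6-1.
TABLE_6_1 = {
    "Hispanic Origin": {"2000": (2.85, (2.22, 3.48)), "1990": (4.99, (3.64, 6.34))},
    "Black (not Hispanic)": {"2000": (2.17, (1.59, 2.75)), "1990": (4.57, (3.67, 5.47))},
    "White or Some Other Race": {"2000": (0.67, (0.44, 0.90)), "1990": (0.68, (0.32, 1.04))},
    "Under 18 years": {"2000": (1.54, (1.23, 1.85)), "1990": (3.18, (2.70, 3.66))},
}

if __name__ == "__main__":
    for group, years in TABLE_6_1.items():
        (est_2000, ci_2000), (est_1990, ci_1990) = years["2000"], years["1990"]
        flag = differs_significantly(est_2000, ci_2000, est_1990, ci_1990)
        print(f"{group}: SE(2000) = {se_from_interval(*ci_2000):.3f}, differs = {flag}")
```

Under these assumptions the check agrees with the text: the Hispanic, non-Hispanic black, and under-18 rates differ between the two censuses, while the rate for the white or other races category does not.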

FIGURE 6-1 Post-stratum coverage correction factors (CCF) by domain and tenure, 2000 A.C.E.

NOTES: The number in parentheses following a label indicates the number of post-stratum groups belonging to that domain/tenure classification (see Table 6-2 for definitions of individual post-strata). Coverage correction factors are the dual-systems estimate of the population divided by the census count. For each box on the plot, the black dot indicates the median of the observations. The end-points of the rectangular box are the 25th and 75th percentiles of the observations, so the box spans the locations of one-half of the observations. The “whiskers” extend from the end of the box to a length based on the interquartile range of the observations; observations beyond these whiskers are indicated by open circles and may be considered outliers in the distribution.

SOURCE: Tabulations by panel staff from U.S. Census Bureau, Pre-Collapsed Post-Stratum Summary File (U.S.), February 16, 2001.
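The coverage correction factor defined in the notes to Figure 6-1 and the net undercount rate defined in the notes to Table 6-1 are two views of the same pair of numbers. A short derivation, writing DSE for the dual-systems estimate, C for the census count, UR for the net undercount rate, and CCF for the coverage correction factor, makes the conversion explicit. The numerical case is only an illustration.

```latex
\begin{align*}
UR  &= \frac{DSE - C}{DSE} = 1 - \frac{C}{DSE} = 1 - \frac{1}{CCF},
\qquad
CCF = \frac{DSE}{C} = \frac{1}{1 - UR}. \\[4pt]
\text{For } CCF = 1.04:\quad
UR &= 1 - \frac{1}{1.04} \approx 0.0385 .
\end{align*}
```

Because the undercount rate divides by the larger quantity (DSE rather than C) when there is a net undercount, CCF minus 1 is always slightly larger than UR in that case.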

FIGURE 6-2 Coverage correction factors by race/ethnicity domain, housing tenure, and age/sex groups.

Figure 6-1 shows distributions (boxplots) of estimated census coverage correction factors from the A.C.E. for some of the strongest relationships in the estimation, namely, that for owners and renters for each of seven race/ethnicity domains.5 While there is variation in the coverage correction factors for individual post-strata within each race/ethnicity and tenure group, renters have higher median coverage correction factors than owners in every race/ethnicity domain except for American Indians and Alaska Natives on reservations. Indeed, white renters have a higher median coverage correction factor than most minority owners.

Figure 6-2 shows aggregate coverage correction factors for age/sex groups among three race/ethnicity domains (Hispanics, non-Hispanic blacks, and non-Hispanic whites and other races) for owners and renters.6 Higher coverage correction factors for men and young adults relative to children, older people, and women are more pronounced for renters than for owners in each race/ethnicity category, as are differences between the non-Hispanic white and other races domain and the other two groups.

The measured reduction in the net undercount rates and associated coverage correction factors for minorities relative to the rates for the white and other races category has been cited as a major achievement of the 2000 census. Also noted has been the reduction in the net undercount rate of children relative to that for older people, as well as the reduction in the undercount rate of renters relative to that for owners (Executive Steering Committee on A.C.E. Policy, 2001a). An important question is the reason(s) for these reductions. In this chapter, we focus on the operation of the A.C.E. itself, which is necessary to understand the estimated net undercount rates and to determine whether the estimation of those rates is accurate.

DUAL-SYSTEMS ESTIMATION

The A.C.E., like its predecessors, the 1990 PES and the 1980 Post-Enumeration Program (PEP), was designed to estimate the population of the United States and population groups by dual-systems estimation (DSE). This method is closely related to a widely used statistical methodology known as capture-recapture, which was first developed for estimating wildlife populations. The methodology requires adaptation for the census context, as described in Fienberg (2000) and Hogan (1992, 2000a, 2000b).

The basic concept is that a total population estimate (the dual-systems estimate) can be developed on the basis of being able to estimate how many people who were validly included in a second, independent survey (the P-sample) were also found in the first survey (here, the census enumerations in the A.C.E. sample blocks). Not every census enumeration is correct; some are erroneous (e.g., a duplicate), so the process also involves estimating how many of the records in a sample of census enumerations in the A.C.E. blocks (the E-sample) represent correct enumerations.7

5  

Coverage correction factors represent the population estimated from the A.C.E. using dual-systems estimation divided by the census population. For groups estimated to have a net undercount, the coverage correction factor minus 1.0 will be slightly higher than the net undercount rate measured by taking the difference between the dual-systems estimate and the census count and dividing that difference by the dual-systems estimate (e.g., a coverage correction factor of 1.04 is equivalent to a net undercount rate of 0.038 or 3.8%).

6  

Separate coverage correction factors by sex are not available for children under age 18.

7  

The E-sample does not include every census enumeration in the A.C.E. blocks, for such reasons as subsampling of large blocks (see “A.C.E. Operations,” below).

In general terms, the P-sample and E-sample are used to estimate two components of the formula for calculating the DSE for each of several hundred population groups, called post-strata. These components are the proportion of the population correctly included in the census, which is estimated by the P-sample match rate, and the proportion of the census records that were correctly included, which is estimated by the E-sample correct enumeration rate:

The match rate is the weighted estimate, M, of P-sample persons who match with E-sample or other census persons, divided by the weighted estimate, P, of all valid P-sample persons (including matches and nonmatches).

The correct enumeration rate is the weighted estimate, CE, of E-sample persons who were correctly enumerated in the census (including matches and correct nonmatches), divided by the weighted estimate, E, of all E-sample persons (including correct and erroneous enumerations).

These components are applied in a formula for each post-stratum (ps):

$$DSE_{ps} = (C - II) \times \frac{CE}{E} \times \frac{P}{M} \tag{1}$$

where:

• DSE_ps is the dual-systems estimate of the total population of post-stratum ps;
• C – II is the census count, C, minus people requiring imputation and late additions to the census count, II, who are excluded from the E-sample because they cannot be matched to the P-sample;8
• CE/E is the weighted correct enumeration rate from the E-sample; and
• P/M is the inverse of the weighted match rate from the P-sample.

For any post-stratum, the net undercount rate (UR) is computed as:

$$UR = \frac{DSE - C}{DSE} \tag{2}$$

and the coverage correction factor (CCF) is computed as

$$CCF = \frac{DSE}{C} \tag{3}$$

where C is the census count, including people requiring imputation and late additions to the count (IIs).

8  

II is a Census Bureau term that originally stood for “insufficient information for matching.” Its meaning has evolved, and it now covers late additions to the census and people whose census records were incomplete and required imputation. In 2000, there were no late enumerations as such; however, there were 2.4 million people whose records were temporarily removed from the census file and reinstated too late to be included in the A.C.E. processing.

The basic assumption underlying the calculation of the DSE can be stated as follows: Given independence of the P-sample survey from the census, the estimated proportion of P-sample people in a post-stratum who match to the census (M/P) is a good estimate of the estimated proportion of all people in the post-stratum who were correctly enumerated in the census (CE/DSE, with the number of correct census enumerations estimated as (C – II) × CE/E). Solving for DSE in the following equation,

$$\frac{M}{P} = \frac{(C - II) \times (CE/E)}{DSE} \tag{4}$$

gives equation (1) above.

Five points are worth noting about dual-systems estimation in the census context. First, if there were no IIs, that is, no census enumerations that either lacked sufficient information or were added too late to be included in the A.C.E. matching, then the coverage correction factor, CCF, would be equivalent to the correction ratio, CR. The correction ratio is the correct enumeration rate, CE/E, divided by the match rate, M/P. (The equivalence is evident by setting II equal to zero in equation (4) and solving for (DSE/C).) Hogan (2001b) demonstrates why, in principle, the level of IIs does not bias the DSE (see also Chapter 8). However, the larger the number of IIs, the more the correction ratio will exceed the coverage correction factor. Consequently, if a census had a considerable number of IIs, examination of the correction ratio from the A.C.E. process would lead one to expect higher net undercount rates than actually result when the DSE is compared with the census count inclusive of IIs. In Chapter 8, we examine the role of IIs (people requiring imputation and late additions to the census count), who were several times more numerous in 2000 (8.2 million) than in 1990 (about 2.2 million).

Second, there is no assumption that the P-sample must be more complete than the E-sample for DSE to work; it is expected that the P-sample will miss some people who were correctly enumerated in the census, and vice versa. What is important is that the information obtained in the P-sample that is needed to determine a match or valid nonmatch be of high quality and obtained independently of the census.

Third, a key assumption in the calculation of the DSE in the census context is that the procedures used to define who is in and who is not in the census are balanced. The E-sample is used to determine how many census enumerations are correctly in the census according to specified criteria (e.g., a college student living in a dormitory should be enumerated at the college and not at his or her parental home). For the DSE model to work, the same criteria must be applied to determine how many P-sample people match to correct census enumerations (whether or not they are in the E-sample). Failure to apply the same criteria will create an error of balancing (see Chapter 7). An important dimension of balancing involves geographic correctness. For each person, there is a defined area where he or she should have been enumerated (this is the block cluster in the A.C.E.). In searching for a match for a person in the P-sample, it is important to search all the census enumerations that are in the correct area and only those enumerations in the correct area.

Geographic balancing error occurs when the actual search area for P-sample matches is larger or smaller than that used in the E-sample to determine correct enumerations.

Fourth, the DSE is sample based, which means that it is important not only to estimate the DSE itself, but also to accurately estimate the variance in the DSE due to sampling error and other sources of variation.

Finally, if DSE results are to be used to adjust the census for undercount, the process would involve applying the coverage correction factors to the population counted in each geographic area for which adjusted counts are desired, separately for each post-stratum. This procedure assumes that the probabilities of being included (captured) in the A.C.E. or the census do not vary significantly by geographic area within post-strata.

A.C.E. OPERATIONS

Overview

Sampling and Address Listing

The 2000 A.C.E. began with a series of steps to obtain a sample of about 11,000 block clusters and 300,000 household addresses nationwide in which interviews would be conducted for the independent P-sample survey (see Appendix C; remote Alaska was not part of the A.C.E.). The steps included:

• drawing a large sample of block clusters and sending field staff to develop a complete address list for them, independent of the census Master Address File (MAF);
• reducing the sample for medium and large block clusters (those with 3 to 79 housing units and 80 or more housing units, respectively) in a manner that oversampled minority areas;
• reducing the sample for small block clusters;
• matching the addresses on the P-sample address list against the MAF addresses in the sampled block clusters to provide information for the last stage of sampling and facilitate other operations; and
• subsampling addresses within large block clusters to reduce the interviewing workload.

P-Sample Interviewing

Beginning in late April 2000, interviewers used laptop computers to obtain information for all addresses in the P-sample block clusters. The first wave of interviewing was conducted by telephone for households that provided a telephone number on their census questionnaire and for which there was a clear city-style address. Fully 29 percent of the P-sample household interviews were obtained by telephone. The second wave of interviewing, which began in mid-June and continued through August, was in person. Interviewers were instructed to strive for a household respondent, but proxy interviews from neighbors or landlords were accepted if attempts to contact the household directly proved futile.

The interviewers asked about three types of household residents:

• nonmovers: those who lived in the house on Census Day and still lived there;
• outmovers: those who lived in the house on Census Day but had subsequently moved away; and
• inmovers: those who were current residents but had not lived in the house on Census Day.

People who were determined to be group quarters residents were removed from the P-sample.

Initial Matching and Targeted Extended Search

Once the P-sample survey was complete and the E-sample of census enumerations was drawn for the A.C.E. sample of block clusters, the first round of matching was conducted. The E-sample excluded certain census enumerations: group quarters residents, people reinstated in the census too late for A.C.E. processing (“late additions”), and people requiring imputation (people having only one reported short-form characteristic among name, age, sex, race, ethnicity, and household relationship).9

The first stage of matching was done by computer; the matching algorithm assigned a match probability score by examining the available variables (name and demographic characteristics) according to specified rules. Probability score cutoffs identified clear matches, possible matches, and nonmatches within each block cluster. (P-sample and E-sample records lacking enough reported data for A.C.E. matching and follow-up, including a name and at least two characteristics, were flagged for imputation of match or enumeration status.) P-sample records could match to census records that were not in the E-sample, such as census records excluded from the E-sample due to large block subsampling. Clerks then reviewed the possible matches and nonmatches to identify additional matches. Their work was reviewed in turn by a small staff of technicians and a yet smaller staff of analysts.

9  

Such census records are termed “whole person imputations” or “non-data-defined.”
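The routing logic of the computer matching stage (score cutoffs, a data-sufficiency screen, and clerical review of everything that is not a clear match) can be sketched in Python as follows. This is a hypothetical illustration, not the Census Bureau's matching software: the cutoff values, the record layout, and all function names are assumptions, and the scoring algorithm itself is not described in the text and is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# ASSUMPTION: illustrative cutoffs only; the actual values are not given in the text.
CLEAR_MATCH_CUTOFF = 0.90
NONMATCH_CUTOFF = 0.30


@dataclass
class PersonRecord:
    """Hypothetical person record; None marks an unreported characteristic."""
    name: Optional[str]
    characteristics: Dict[str, Optional[object]] = field(default_factory=dict)


def has_enough_data(record: PersonRecord) -> bool:
    """Sufficient for matching: a name plus at least two reported characteristics."""
    reported = sum(1 for value in record.characteristics.values() if value is not None)
    return bool(record.name) and reported >= 2


def classify_score(score: float) -> str:
    """Map a match probability score to one of the three computer-match outcomes."""
    if score >= CLEAR_MATCH_CUTOFF:
        return "match"
    if score < NONMATCH_CUTOFF:
        return "nonmatch"
    return "possible match"


def route(record: PersonRecord, score: float) -> str:
    """Decide what happens to a record after the computer matching stage."""
    if not has_enough_data(record):
        return "flag for imputation"          # status is imputed later
    if classify_score(score) == "match":
        return "matched"
    return "clerical review"                  # possible matches and nonmatches


if __name__ == "__main__":
    person = PersonRecord("A. Example", {"age": 34, "sex": "F", "race": None})
    print(route(person, 0.95))
    print(route(person, 0.55))
    print(route(PersonRecord(None, {"age": 34, "sex": "F"}), 0.95))
```

Keeping the sufficiency screen separate from the score classification mirrors the text, where records with too little reported data bypass matching and follow-up altogether.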

In selected block clusters, the clerks performed a targeted extended search (TES): they searched the blocks adjacent to the block cluster for census enumerations that matched P-sample households not already matched to an E-sample household in the block cluster. They also searched for E-sample enumerations in the surrounding blocks that had been identified as geocoding errors, that is, their addresses were incorrectly assigned to the block cluster.

Field Follow-Up and Final Matching

An important part of the A.C.E. was an operation to recheck certain cases to clarify their status. About half of P-sample nonmatched cases and most unresolved cases were followed up in the field to obtain information that would clarify their residence status (whether they resided at the address on Census Day), as well as their match status. In addition, almost all nonmatched and unresolved E-sample cases were followed up in the field to obtain information that would clarify their enumeration status (whether they were a correct, nonmatched enumeration or a duplicate or other type of erroneous enumeration). The information provided by field follow-up was used to determine a final match and enumeration status for as many P-sample and E-sample cases as possible.

Weighting and Imputation

Prior to estimation, the sampling weights for P-sample cases were adjusted to represent households that, despite best efforts, could not be interviewed. Also, a series of imputations was performed, including:

• imputation of values for specific missing characteristics needed for post-stratification (age, sex, race, ethnicity, and housing tenure);
• imputation of enumeration status for unresolved E-sample cases;
• imputation of residence status for unresolved P-sample cases; and
• imputation of match status for unresolved P-sample cases who were reported or imputed to be Census Day residents at the P-sample address.

Post-Strata Estimation

The final step in the A.C.E. process was estimation of the DSE and its associated variance for post-strata. Post-strata were prespecified to form 448 individual strata that grouped people by age, sex, race/ethnicity, housing tenure, and, in some cases, geographic region, a mail return rate for their neighborhood calculated for the A.C.E., and size of metropolitan area. Table 6-2 shows the A.C.E. post-strata. If a post-stratum had fewer than 100 nonmovers and outmovers, it was combined with another stratum; this procedure reduced the number of post-strata from 448 to 416 for the final analysis.
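The post-stratum counts quoted in this section follow from the group structure in Table 6-2, and the arithmetic can be checked with a few lines of Python. The dictionary layout and function names are hypothetical; the group counts and the threshold of 100 come from the text and the table. The sketch does not model which strata were actually combined, only the stated rule for deciding that a stratum needed to be.

```python
# Major groups per race/ethnicity domain, from Table 6-2 (owner groups + renter groups).
# Hispanic and Non-Hispanic Black: 2 mail return levels x 2 area types per tenure.
# White or Some Other Race owners: 2 mail return levels x 4 regions x 4 area types;
# renters: 2 mail return levels x 4 area types.
MAJOR_GROUPS = {
    "American Indian or Alaska Native on Reservation": 2,
    "American Indian or Alaska Native off Reservation": 2,
    "Hispanic": 2 * 2 + 2 * 2,
    "Non-Hispanic Black": 2 * 2 + 2 * 2,
    "Native Hawaiian or Pacific Islander": 2,
    "Non-Hispanic Asian": 2,
    "Non-Hispanic White or Some Other Race": 2 * 4 * 4 + 2 * 4,
}

AGE_SEX_CATEGORIES = [
    "Under age 18",
    "Men aged 18-29", "Women aged 18-29",
    "Men aged 30-49", "Women aged 30-49",
    "Men aged 50 and older", "Women aged 50 and older",
]

MIN_NONMOVERS_AND_OUTMOVERS = 100  # threshold stated in the text
FINAL_POST_STRATA = 416            # number used in the final analysis


def prespecified_post_strata() -> int:
    """Every major group crossed with every age/sex category."""
    return sum(MAJOR_GROUPS.values()) * len(AGE_SEX_CATEGORIES)


def must_be_combined(n_nonmovers_and_outmovers: int) -> bool:
    """True when a post-stratum is too small and is merged with another stratum."""
    return n_nonmovers_and_outmovers < MIN_NONMOVERS_AND_OUTMOVERS


if __name__ == "__main__":
    print(sum(MAJOR_GROUPS.values()))                      # 64 major groups
    print(prespecified_post_strata())                      # 448 post-strata
    print(prespecified_post_strata() - FINAL_POST_STRATA)  # 32 fewer after combining
```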

TABLE 6-2 Post-Strata in the 2000 A.C.E., 64 Major Groups

Race/Ethnicity Domain, followed by Other Characteristics

1. American Indian or Alaska Native on Reservation (a)
   □ 2 groups: owner, renter
2. American Indian or Alaska Native off Reservation (b)
   □ 2 groups: owner, renter
3. Hispanic (c)
   □ 4 groups for owners:
      ❖ High and low mail return rate
      ❖ By type of metropolitan statistical area (MSA) and enumeration area
         ➤ Large and medium-size MSA mailout/mailback areas
         ➤ All other
   □ 4 groups for renters (see Hispanic owners)
4. Non-Hispanic Black (d)
   □ 4 groups for owners (see Hispanic owners)
   □ 4 groups for renters (see Hispanic owners)
5. Native Hawaiian or Pacific Islander (e)
   □ 2 groups: owner, renter
6. Non-Hispanic Asian (f)
   □ 2 groups: owner, renter
7. Non-Hispanic White or Some Other Race (g)
   □ 32 groups for owners:
      ❖ High and low mail return rate
      ❖ By region (Northeast, Midwest, South, West)
      ❖ By type of metropolitan statistical area and enumeration area:
         ➤ Large MSA, mailout/mailback areas
         ➤ Medium MSA, mailout/mailback areas
         ➤ Small MSA and non-MSA, mailout/mailback areas
         ➤ Other types of enumeration area (e.g., update/leave)
   □ 8 groups for renters:
      ❖ High and low mail return rate
      ❖ By type of metropolitan statistical area and enumeration area
         ➤ (See owner categories)

All 64 groups were classified by 7 age/sex categories (below) to form 448 post-strata; in estimation, some age/sex categories were combined (always within one of the 64 groups) to form 416 strata.

   Under age 18
   Men aged 18–29; women aged 18–29
   Men aged 30–49; women aged 30–49
   Men aged 50 years and older; women aged 50 years and older

NOTES: Large metropolitan statistical areas (MSAs) are the largest 10 MSAs in the United States; medium MSAs are other MSAs with 500,000 or more population; small MSAs are MSAs with less than 500,000 population. The description of race/ethnicity domains is simplified somewhat; see Haines (2000) for the complete set of classification rules (see also Farber, 2001a).

a   All people on a reservation with American Indian or Alaska Native as their single or one of multiple races.

b   All people in Indian Country not on a reservation with American Indian or Alaska Native as their single or one of multiple races; all non-Hispanic people not in Indian Country with American Indian or Alaska Native as their single race.

c   All Hispanic people in Indian Country not already classified in Domain 2; all Hispanic people not in Indian Country except those living in Hawaii with Native Hawaiian or Pacific Islander as their single or one of multiple races.

d   All non-Hispanic people with Black as their only race; all non-Hispanic people with Black and American Indian or Alaska Native race not in Indian Country; all non-Hispanic people with Black and another single race group, except those living in Hawaii with Black and Native Hawaiian or Pacific Islander race.

e   All non-Hispanic people with Native Hawaiian or Pacific Islander as their only race; all non-Hispanic people with Native Hawaiian or Pacific Islander and American Indian or Alaska Native race not in Indian Country; all non-Hispanic people with Native Hawaiian or Pacific Islander and Asian race; all people in Hawaii with Native Hawaiian or Pacific Islander as their single or one of multiple races.

f   All non-Hispanic people with Asian as their only race; all non-Hispanic people with Asian and American Indian or Alaska Native race not in Indian Country.

g   All non-Hispanic people with White or some other race as their only race; all non-Hispanic people with White or some other race in combination with American Indian or Alaska Native not in Indian Country; or in combination with Asian; or in combination with Native Hawaiian or Pacific Islander not in Hawaii; all non-Hispanic people with three or more races (excluding American Indian or Alaska Native) in Indian Country or outside of Indian Country (excluding Native Hawaiian or Pacific Islander in Hawaii).
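For a single post-stratum, the quantities named in equations (1) through (3), together with the correction ratio and the PES-C treatment of movers discussed in this section, can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the Census Bureau's estimation code. The function names and all input numbers are hypothetical, and `pes_c_match_rate` encodes one reading of the PES-C description (the outmover match rate applied to the weighted inmover count, with nonmovers and inmovers in the denominator).

```python
def pes_c_match_rate(m_nonmovers, m_outmovers, p_nonmovers, p_outmovers, p_inmovers):
    """Overall match rate M/P under one reading of PES-C (an assumption).

    Matched movers are estimated by applying the outmover match rate to the
    weighted number of inmovers.
    """
    outmover_match_rate = m_outmovers / p_outmovers
    matched = m_nonmovers + outmover_match_rate * p_inmovers
    total = p_nonmovers + p_inmovers
    return matched / total


def dual_systems_estimate(census_count, ii, ce_rate, match_rate):
    """Equation (1): DSE = (C - II) * (CE/E) * (P/M)."""
    return (census_count - ii) * ce_rate / match_rate


def net_undercount_rate(dse, census_count):
    """Equation (2): UR = (DSE - C) / DSE."""
    return (dse - census_count) / dse


def coverage_correction_factor(dse, census_count):
    """Equation (3): CCF = DSE / C."""
    return dse / census_count


def correction_ratio(ce_rate, match_rate):
    """CR = (CE/E) / (M/P); equals CCF only when II is zero."""
    return ce_rate / match_rate


if __name__ == "__main__":
    # HYPOTHETICAL weighted totals for one post-stratum.
    census_count, ii = 1000.0, 30.0
    ce_rate = 950.0 / 1000.0  # CE / E
    match_rate = pes_c_match_rate(
        m_nonmovers=800.0, m_outmovers=45.0,
        p_nonmovers=880.0, p_outmovers=60.0, p_inmovers=70.0,
    )
    dse = dual_systems_estimate(census_count, ii, ce_rate, match_rate)
    print(f"match rate = {match_rate:.4f}")
    print(f"DSE        = {dse:.1f}")
    print(f"UR         = {net_undercount_rate(dse, census_count):.4f}")
    print(f"CCF        = {coverage_correction_factor(dse, census_count):.4f}")
    print(f"CR         = {correction_ratio(ce_rate, match_rate):.4f}")
```

With these definitions CCF equals CR multiplied by (C - II)/C, which is why the correction ratio exceeds the coverage correction factor by more as the number of IIs grows.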

To form the DSE, weighted estimates were developed of E-sample total cases and correct enumerations; P-sample nonmover cases, inmover cases, and outmover cases; and P-sample matched nonmover cases and outmover cases. In a procedure called PES-C that was used for most post-strata (see “Major Differences from 1990 PES,” below), the match rates calculated for outmovers were applied to the estimated number of inmovers as part of developing an overall match rate for each post-stratum. Also tabulated for each post-stratum were the census count and the count of IIs (people requiring imputation and late additions). These rates and counts permitted the calculation of the DSE, the net undercount rate, and the coverage correction factor for individual post-strata.

Major Differences from 1990 PES

The 2000 A.C.E. procedures and concepts differed in a number of respects from those incorporated in the 1990 PES (see Hogan, 2000b). This section briefly summarizes the major differences.

Universe

The A.C.E. universe excluded people living in institutions, college dormitories, and other group quarters; the PES universe included most noninstitutional group quarters. The Census Bureau decided to limit the A.C.E. to the household population because of its experience in the 1990 PES.10

Sample Size and Design

The 2000 A.C.E. sample was almost twice the size of the 1990 PES sample: the 2000 P-sample comprised about 300,000 housing units, compared with 165,000 housing units in the 1990 P-sample. Because of its larger overall sample size, the A.C.E. could produce reliable direct estimates for minorities and other groups with less oversampling than was used in the PES to develop post-strata estimates by means of a smoothing model. Consequently, the A.C.E. weights varied less than the PES weights, which contributed to reducing the variance of the A.C.E. estimates.

10  

Rates of unresolved match status were much higher for group quarters residents than for household members in the PES because of much higher rates of short-term mobility for people in group quarters (e.g., college students moving between dormitories and their parental homes, shelter residents moving from one shelter to another, migrant worker dormitory residents moving from one farm to another).

Initial Housing Unit Match

The A.C.E. included a new operation to match P-sample and January 2000 MAF housing units prior to interviewing. The purpose of the match was to facilitate such operations as large block subsampling, telephone interviewing, and matching. Although the P-sample and census address lists were linked, independence was maintained because no changes were carried over from one list to the other as a consequence of the match.

P-Sample Interviewing Technology

The A.C.E. used computer-assisted telephone and personal interviewing (CATI/CAPI) to improve the accuracy of the information collected and the speed of data capture and processing. The PES used paper-and-pencil techniques throughout.

Matching Technology

The A.C.E. clerical matching operation was conducted by clerks examining computerized P-sample responses and questionnaire images for census cases in the sampled block clusters. The technology was designed to be user friendly. Because the operation was completely computerized, all matching could be done at one location, instead of at seven locations as in 1990.

Treatment of Movers

A major change from 1990 was the treatment of movers. The goal of the 1990 PES was to visit each P-sample address and find out where the current residents usually lived as of Census Day, April 1. This procedure is called PES-B; it requires collecting Census Day address information for inmovers (people resident at the P-sample address on interview day but not on Census Day) and searching nationwide to determine if they were enumerated or missed at their reported Census Day residences.11 The original design for Integrated Coverage Measurement for 2000 ruled out PES-B because of the plan to use sampling for nonresponse follow-up, which meant that movers might not match because their Census Day addresses did not fall into the nonresponse follow-up sample. This decision was carried over to the A.C.E.

11   See Marks (1978), who also described a PES-A procedure in which the goal is to visit each P-sample address to find out who lived there on Census Day.

The 2000 A.C.E. had two goals: to find out who lived at each P-sample address on Census Day and determine whether they were enumerated or missed in the census at that address, and to find out who lived at each P-sample address as of the A.C.E. interview day. This procedure is called PES-C; it results in obtaining information not only for nonmovers and inmovers, but also for outmovers (Census Day residents not resident on interview day).

The PES-C procedure involved estimating the P-sample match rate for movers by matching outmovers. At the same time, for most post-strata, the A.C.E. estimated the number of matched movers by applying the outmover match rate to inmovers. The underlying assumption is that inmovers would be more completely reported than outmovers. The advantage of PES-C is that the searching operation for the Census Day residence of inmovers is not required. The potential drawback is that the information used to match outmovers may be of lower quality because it is always supplied by other people.

Targeted Extended Search Procedures

Another important change from the 1990 PES concerned the targeted extended search (TES) procedure for searching surrounding blocks when a search of the sampled block cluster did not turn up a match for a P-sample household, and for determining whether misgeocoded E-sample cases were located nearby. In 1990, one ring, or sometimes two rings, of blocks surrounding each sample block cluster were searched for additional P-sample matches and E-sample correct enumerations. The purpose was to reduce the variance and bias of the DSEs. For efficiency reasons, it was decided for the 2000 A.C.E. to target the extended search and to conduct it on a sample basis.

Definition of Post-Strata

The 448 post-strata in 2000 (reduced to 416 for estimating DSEs) were similar to the 357 post-strata that were implemented in the reestimation of the 1990 PES.12 The 2000 post-strata included two additional race/ethnicity domains, one for American Indians and Alaska Natives not living on reservations and another for Native Hawaiian and other Pacific Islanders (who had been combined with Asians in 1990). The 2000 post-strata also categorized non-Hispanic whites and other races, non-Hispanic blacks, and Hispanics by mail return rate (two categories, high and low, calculated separately for each group by tenure). Region was dropped as a stratifier except for people in the non-Hispanic white and other race category who owned their homes.

12   See Thompson (1992); U.S. Census Bureau (1992a). The original 1990 estimation used 1,392 strata together with a composite estimation procedure to smooth the resulting DSEs (see Hogan, 1992).