The 2000 Census: Interim Assessment

4
Census Operations: Assessment

In this chapter we present our initial overall assessment of the 2000 census. We consider broadly how well the census design and operations were carried out and to what extent six major innovations for 2000 were successful:1

  1. use of multiple sources to develop the Master Address File (MAF) in general, and, specifically, the Local Update of Census Addresses (LUCA) Program;

  2. improvements in the questionnaire and mailing strategy to encourage household response;

  3. use of paid advertising and more extensive outreach to encourage response;

  4. advance hiring and higher wages for enumerators to ensure timely follow-up of nonresponding households;

  5. use of contractors and improved technology for data capture and other operations; and

  6. greater reliance on computers for processing incomplete responses.

We also consider briefly the completeness of coverage of the population achieved in the 2000 census, in total and for important population groups, and two outcomes of census operations as they relate to coverage: mail return rates and imputations of whole persons. Details of population coverage are discussed in subsequent chapters. We are not able at this time to assess the quality of the census data for characteristics of the population. There is potentially a serious problem for the quality of the long-form information because of substantially lower mail return rates for long forms than short forms.

1  We do not assess a seventh major innovation: the expanded use of the Internet for release of data products to users. There has not yet been enough time for users to assess the usefulness and ease of accessing the 2000 census data through such mechanisms as the Census Bureau’s American FactFinder interface (http://factfinder.census.gov).





OVERALL DESIGN AND EXECUTION

Our general assessment from the evidence available at this time is that the 2000 census was well executed in many respects. All statutory deadlines for data release were met, and most of the individual operations were completed on or ahead of schedule. Strategies for obtaining public cooperation in completing and mailing back census forms succeeded in keeping mail response rates at the levels attained in 1990. This outcome represents an important achievement given the large decline in mail response rates from the 1980 census to the 1990 census.

Few instances occurred in 2000 when operations had to be modified in major ways after Census Day. In contrast, the 1990 census experienced serious and unexpected problems in executing such key operations as nonresponse follow-up, and the Census Bureau had to return to Congress to obtain additional funding to complete all needed operations. One unexpected modification in summer 2000 was the special operation to minimize duplicate housing unit addresses in the Master Address File. This operation deleted 3.6 million people from the census who duplicated another enumeration and reinstated 2.4 million people who were initially believed to be duplicates. It was mounted quickly when the need for it became apparent and completed with little or no apparent adverse effect on other operations.2 Another late change in plans, made early in 2000, was to set aside data capture of long-form information in order to keep short-form processing on schedule.

Most innovations for 2000 appeared effective, but some exhibited problems in implementation that deserve attention. In particular, the MAF development process was problematic in several respects, as discussed below. Also, in comparison with 1990, the 2000 census had increased numbers of people who required imputation.
Some of the increase was likely due to two design features of 2000: the use of a shorter questionnaire with space to record characteristics for six instead of seven household members, and the use of telephone follow-up, not supplemented by field work, to contact households whose returns appeared to be incomplete (see Chapter 3). However, some of the increase in people requiring imputation is not readily explained; it may have been due to errors in the MAF, problems in follow-up operations, or other factors. More evaluation is needed of the sources and quality of census imputations, as well as of people reinstated in the census due to the special MAF unduplication operation.

On balance, though, the 2000 census appears to have been well carried out, particularly in view of the problems that hampered planning and preparations. The basic design was not finally determined until winter 1998–1999, little more than a year from Census Day. Census managers faced uncertainties about funding, which impeded staffing and resolution of specific elements of such operations as coverage improvement (see Waite et al., 2001). After full funding was obtained for the final agreed-upon design, the Bureau executed the census and the Accuracy and Coverage Evaluation (A.C.E.) operations in a controlled manner. Control was maintained even though, of necessity, specific procedures for several operations were finalized only very late. Likewise, many data processing systems were implemented almost as soon as they were completed, without benefit of advance testing. The relatively smooth operation of the census was facilitated by generous funding and the dedication and energy of Census Bureau staff.

2  Neither the reinstated people described here nor the imputations discussed in the next paragraph are included in the Accuracy and Coverage Evaluation (A.C.E.) Program. The large numbers of such people in 2000 complicate the interpretation of the A.C.E. results when those results are compared with the 1990 Post-Enumeration Survey (see Chapters 7 and 8).

MULTIPLE SOURCES FOR MAF

With the bulk of the population enumerated by mailout/mailback and update/leave/mailback techniques, the quality of the 2000 address list was essential to the completeness and accuracy of population coverage. The Census Bureau early in the 1990s made a decision to partner with other organizations and use multiple sources to develop the MAF. Contributing to the 2000 MAF were the 1990 address list augmented by updates from the U.S. Postal Service (in mailout/mailback areas), a full block canvass by Census Bureau staff, input from localities that participated in LUCA, and census field operations.

The goal of using multiple sources to build as complete a list as possible was a worthy one. Because many of the procedures were new, implementation was not always smooth. The decision to conduct a complete, instead of targeted, block canvass was made late in the decade and required additional funding to implement; an even later decision was to provide localities in city-style-address areas an opportunity to add addresses for units newly constructed in January-March 2000.
Original plans for a sequential series of steps in the LUCA Program, involving back-and-forth checking with localities, had to be combined under pressures of time, and many LUCA components experienced delays; see Table 4-1. Questionnaire labeling had to occur before the Bureau had the opportunity to check most of the addresses supplied by LUCA participants. Local review of the address list for special places (group quarters) was delayed, and errors in assigning special places to geographic areas apparently occurred. Except for the stage of appealing to the U.S. Office of Management and Budget, localities were not given additional time for their review.

TABLE 4-1 Original and Actual Timelines for the Local Update of Census Addresses (LUCA) Program

Planned Dates                  Actual Dates                  Activity

LUCA98 (a)
November 1997                  February 1998                 Census Bureau sent invitation letters to eligible localities
April-August 1998              May 1998-March 1999           Bureau sent initial materials (address list, maps) to participants
May-December 1998              May 1998-June 1999            LUCA participants conducted initial review of materials from Bureau
Not part of original plan      January-May 1999              Bureau conducted full (instead of targeted) block canvassing
May-October 1999               July-December 1999            Bureau verified participants’ addresses in the field (reconciliation) (original plan was to send results to localities to obtain feedback before sending final determination materials)
March-November 1999            October 1999-February 2000    Bureau sent detailed feedback/final determination materials to participants
Original deadline, January 14  November 1999-April 3, 2000   LUCA participants filed appeals (addresses were visited in coverage improvement follow-up)
April-August 1998              December 1999-April 2000      Special Places LUCA (Bureau did not complete Special Places list until November 1999)
Added operation                January-March 2000            Participating localities submitted new construction addresses (addresses were visited in coverage improvement follow-up)

LUCA99 (b)
July 1998-February 1999        July 1998-February 1999       Census Bureau field staff listed addresses
September-October 1998         September-October 1998        Bureau sent invitation letters to eligible localities
January-April 1999             January-August 1999           Bureau sent initial materials (block counts, maps) to participants
January-April 1999             January-October 1999          LUCA participants conducted review of initial materials from Bureau
March-May 1999                 May-October 1999              Bureau verified participants’ block counts in the field (reconciliation)
March-June 1999                September 1999-February 2000  Bureau sent detailed feedback/final determination materials to participants
Original deadline, January 14  October 1999-April 3, 2000    LUCA participants filed appeals for specific addresses
January-April 1999             December 1999-April 2000      Special Places LUCA

(a) This program was conducted in areas with mostly city-style addresses.
(b) This program was conducted in areas with mostly rural route and post office box addresses.
SOURCE: Adapted from LUCA Working Group (2001:Figure 1).

The Bureau recognized early on that the MAF was at risk of including duplicate and other erroneous addresses. The risk of omitting valid addresses was also present, but MAF procedures were expected to reduce the level of omissions from previous censuses. An increased risk of including duplicate addresses in the 2000 MAF resulted not only from the planned use of multiple sources, but also from the operational problems just reviewed. To minimize duplication, the Bureau used a combination of field checking and internal consistency checks of the MAF file (see Chapter 3).

It is not yet clear how to assess the overall success of the MAF development process. On the plus side, despite the various implementation problems noted, all elements of the MAF development process were completed, and delays in component operations did not appear to affect other census operations, such as mailout, field follow-up, and data processing.

With regard to accuracy, we believe it likely that the MAF contains more duplicate addresses than were detected in the various checking operations. In particular, some of the people reinstated in the census may in fact duplicate other enumerations, even though the evidence was not strong enough to weed them out during the special MAF unduplication operation. Similarly, duplicates and other errors in the MAF may have contributed to the increased number of people in 2000 compared with 1990 who required imputation to complete their census records. Yet the MAF may have omitted some valid addresses as well, and we do not yet know the balance between overcounting and undercounting errors. Further, whether errors in the MAF contributed more or less to population coverage errors than omissions or erroneous inclusions of people in otherwise correctly enumerated households remains to be established from analysis of the A.C.E. and other sources. Finally, there may be significant variability in the accuracy of the MAF across geographic areas due to the LUCA Program (see below) and other factors. All of these aspects of the MAF need evaluation.

PARTICIPATION IN LUCA

Preliminary data show variable patterns of participation in the Local Update of Census Addresses (LUCA) Program.
Of 39,051 counties, places, and minor civil divisions that were eligible for LUCA98 (conducted in city-style-address areas), LUCA99 (conducted in areas with large numbers of rural route and post office box addresses), or both, 25 percent participated fully in one or both programs. By full participation, we mean that they informed the Census Bureau of needed changes to the address list for their area (LUCA Working Group, 2001:Ch.2).3

3  It is not straightforward to determine participation in LUCA from the available data. In addition to full participants, 7 percent of eligible governments received Census Bureau materials and were coded as returning them to the Bureau without comment. Some of these governments may have been satisfied with the MAF for their areas, but, more likely, they did not have time or resources to conduct a full review. Also, participation by a county could mean that it reviewed the MAF for the entire county or only for selected jurisdictions in the county.

The substantial variation in LUCA participation is shown in Table 4-2. Factors that relate to participation include:

  1. geographic region: jurisdictions in the Pacific and Mountain states participated at a higher rate than jurisdictions in other parts of the country;

  2. population size: jurisdictions with larger populations participated at a higher rate than those with smaller populations;

  3. type of government: places and counties participated at higher rates than minor civil divisions; and

  4. type of program: areas eligible for LUCA98 or both LUCA98 and LUCA99 participated at a higher rate than areas eligible only for LUCA99.

In addition, a multivariate regression analysis found that, among counties and places that signed up to participate in LUCA, the 1990 census net undercount rate was a strong predictor that a jurisdiction would participate fully. Case studies also identified instances in which a vigorous coordination effort by a state or regional government facilitated participation by local jurisdictions.

The governments that participated in LUCA appeared to cover a higher proportion of the nation’s housing stock than the share of eligible governments that participated would suggest. From preliminary data, places that participated fully in LUCA98 accounted for 67 percent of the 1990 housing stock in eligible places, even though they included only 48 percent of eligible places.4 Even though coverage was higher for housing than for governments, which would be expected given the greater propensity of larger-size areas to participate, substantial portions of the MAF were not accorded local review.

There has not been a full accounting of the contribution of LUCA to the MAF. As a rough indicator of order of magnitude, fully participating places among those eligible for LUCA98 submitted 3.7 million additional addresses, of which the Census Bureau initially accepted 2.1 million before appeals; those 2.1 million addresses represented 5 percent of the housing stock of participating places. These places also submitted corrections and deletions. What is not known is what LUCA contributed uniquely, that is, the number and proportion of added addresses that were missed by other Census Bureau address updating operations and that resulted in added (nonduplicative) census enumerations. A thorough assessment of the LUCA Program is needed, including not only the effects of LUCA on the completeness of the census count in participating areas, but also the possible effects on the counts in other areas from not having had a LUCA review.

4  See LUCA Working Group (2001:Ch.2). Data are not available to permit constructing estimates for all eligible jurisdictions or for the two programs (LUCA98 and LUCA99) combined.

TABLE 4-2 Participation of Local Governments in the 2000 Local Update of Census Addresses (LUCA) Program

                        LUCA98 Only                LUCA99 Only                Both LUCA98 and LUCA99                              Total
Category                Number    Percent          Number    Percent          Number    Percent Participated in                   Number    Percent Participated
                        Eligible  Participated(a)  Eligible  Participated(b)  Eligible  LUCA98 Only  LUCA99 Only  LUCA98 and 99   Eligible  in One or Both

Total                   9,044     41.6             21,760    14.2             8,247     17.7         7.6          12.0            39,051    25.4

Geographic Division
  New England           518       41.5             1,047     6.7              66        12.1         3.0          1.5             1,631     18.1
  Middle Atlantic       2,034     43.3             2,133     16.2             662       19.2         7.0          12.7            4,829     30.7
  East North Central    3,944     35.0             3,187     10.6             3,483     15.4         5.3          5.1             10,614    24.6
  West North Central    548       40.3             9,437     13.3             1,414     17.4         10.7         19.1            11,399    18.8
  South Atlantic        697       52.7             1,681     24.1             585       18.1         11.5         21.4            2,963     36.1
  East South Central    411       35.5             1,046     11.0             427       14.8         7.3          11.7            1,884     21.5
  West South Central    241       37.8             1,914     14.1             886       13.5         9.8          14.0            3,041     22.7
  Mountain              113       68.1             966       23.7             328       35.7         6.4          22.3            1,407     36.7
  Pacific               538       72.1             349       20.1             396       33.6         9.6          21.0            1,283     55.5

Population Size (1998 est.)
  1,000 or fewer        1,743     26.0             15,100    12.1             1,436     14.2         10.5         6.3             18,279    14.8
  1,001–10,000          4,550     40.9             6,080     18.9             4,044     17.8         6.6          11.7            14,674    30.4
  10,001–50,000         2,157     50.2             563       21.1             1,827     17.6         8.1          12.6            4,547     41.8
  50,001–100,000        364       64.7             17        23.5             444       19.6         7.4          16.9            825       52.7
  100,001–1,000,000     217       58.5             0         -                469       25.2         6.4          23.2            686       56.0
  1,000,001 or more     13        69.2             0         -                27        25.9         7.4          44.4            40        75.0

Government Type
  County                122       46.7             982       17.1             1,956     10.3         9.3          10.6            3,060     26.7
  Minor civil division  3,624     29.8             9,887     7.7              3,082     13.7         4.2          5.2             16,593    15.4
  Place                 5,298     49.6             10,891    19.8             3,209     25.9         9.9          19.4            19,398    33.8

NOTES: Not all regions have minor civil divisions. The analysis excludes 340 American Indian Reservations, 12 Alaska Native areas, 78 county-level municipios of Puerto Rico, and 3 places for which 1998 population estimates could not be determined.
(a) Participation in LUCA98 is defined as the local government returning at least one action record (addition, deletion, or correction) to the Census Bureau after reviewing the Master Address File for its area.
(b) Participation in LUCA99 is defined as the government challenging at least one block count provided by the Census Bureau for its area.
SOURCE: Tabulations by panel staff from preliminary U.S. Census Bureau data (LUCA98 and LUCA99 spreadsheets, June 2000), modified by assigning county codes to minor civil division and place records and augmenting the file with 1998 population estimates and variables from the Census Bureau’s 1990 Data for Census 2000 Planning (1990 Planning Database, on CD-ROM) (see LUCA Working Group, 2001:Tables 2–2, 2–3).
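
As a reading aid for Table 4-2, the final column (percent participating in one or both programs) can be recovered as an eligibility-weighted average of the other columns. The short Python sketch below illustrates that arithmetic for the Total and Government Type rows. It is only a consistency check on the published figures, not the panel's tabulation procedure; the names TABLE_4_2 and combined_rate are hypothetical, and because the inputs are rounded percentages the recomputed values should be expected to agree with the published ones only to within rounding.

```python
# Figures transcribed from Table 4-2 (Total and Government Type rows).
# Each entry holds, in order:
#   (LUCA98-only eligible, percent participated),
#   (LUCA99-only eligible, percent participated),
#   (eligible for both, (percent in LUCA98 only, LUCA99 only, both programs)),
#   published percent participated in one or both.
# TABLE_4_2 and combined_rate are illustrative names, not from the report.
TABLE_4_2 = {
    "Total": ((9044, 41.6), (21760, 14.2), (8247, (17.7, 7.6, 12.0)), 25.4),
    "County": ((122, 46.7), (982, 17.1), (1956, (10.3, 9.3, 10.6)), 26.7),
    "Minor civil division": ((3624, 29.8), (9887, 7.7), (3082, (13.7, 4.2, 5.2)), 15.4),
    "Place": ((5298, 49.6), (10891, 19.8), (3209, (25.9, 9.9, 19.4)), 33.8),
}


def combined_rate(luca98, luca99, both):
    """Percent of all eligible governments participating in one or both programs."""
    n98, pct98 = luca98
    n99, pct99 = luca99
    n_both, pcts_both = both
    # Estimated number of participating governments in each eligibility group.
    participants = (
        n98 * pct98 / 100
        + n99 * pct99 / 100
        + n_both * sum(pcts_both) / 100
    )
    return 100 * participants / (n98 + n99 + n_both)


for category, (luca98, luca99, both, published) in TABLE_4_2.items():
    rate = combined_rate(luca98, luca99, both)
    print(f"{category}: recomputed {rate:.1f}, published {published}")
```

The same function can be applied to any other row of the table; rows with no eligible governments in a group (shown as 0 in the table) can be entered with a count of 0 and any placeholder percentage, since that group then contributes nothing to the weighted average.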

REDESIGNED QUESTIONNAIRE AND MAILING STRATEGY

The Census Bureau redesigned the census questionnaire and mailing strategy for 2000 as part of its effort to encourage the public to fill out questionnaires and return them in the mail (or over the Internet or the telephone; see Chapter 3). The Bureau budgeted for a decline in the mail response rate to 61 percent in 2000 (from the 65 percent rate achieved in 1990), but its stated goal was to keep the response rate at least as high as in 1990.

Maintaining the 1990 mail response rate was key to the Bureau’s ability to complete nonresponse follow-up on time and within budget. Estimates produced in conjunction with the 1990 census were that each 1 percentage point decline in the mail response rate would increase census costs by 0.67 percent (National Research Council, 1995:48). In addition, evidence from the 1990 census, confirmed by analysis of 2000 data (see below), indicated that mail returns, on balance, were more complete in coverage and content than returns obtained in the field.5

The changes to the 2000 questionnaire and mailings were based on extensive research carried out in the early 1990s. In one test, mail response to a user-friendly “booklet” form of the type used in 2000 was 3.4 percentage points higher than response to the type of form used in 1990; the difference in response rates for areas that were hard to enumerate in 1990 was even greater, 7.6 percentage points (Dillman et al., 1993). Adoption of optical scanning technology for data capture made it possible to create a more visually appealing questionnaire in 2000.

The results of another experiment suggested that the use of more mailings could substantially increase response.
Individually, it appeared that sending an advance letter (used for the first time in 2000) increased response by 6 percentage points, sending a reminder postcard (used both in 1990 and 2000) increased response by 8 percentage points, and sending a second questionnaire to nonrespondents (not used) increased response by 10–11 percentage points. Another test demonstrated that stressing the mandatory nature of filling out the questionnaire on the mailing envelope (implemented in 2000) was effective in encouraging response, while emphasizing the benefits of the data or their confidentiality was not particularly effective (National Research Council, 1995:120–121).

The 2000 census was successful in achieving the goal of stemming the historic decline in mail response rates. The rate achieved was about 66 percent.6 This accomplishment was of major importance for the success of the census in terms of timely, cost-effective completion of operations. It seems likely that the changes to the questionnaire and mailing package and the use of an advance letter contributed to maintaining the response rate (despite, or perhaps even because of, the publicity due to the addressing error in the letter; see Appendix A), although how large a role these elements played in this achievement is not yet known.

5  See Box 3-1 in Chapter 3 for definitions of mail response and return rates and rates for 1970–2000.

6  The Census Bureau is in the process of evaluating 2000 mail response and return rates; percentages cited in the text should be treated as approximate.

One disappointment of the initiatives to encourage response in 2000 (which also included expanded advertising and outreach; see below) was that they did not stem a steep decline in the response of households that received the long form: the long-form mail response rate was 13 percentage points below the rate for short forms (54 percent versus 67 percent). Similarly, the long-form mail return rate (based on occupied, not total, addresses) was 14 percentage points below the short-form rate (58 percent versus 72 percent). This difference was double the difference that the Bureau expected (U.S. General Accounting Office, 2000b:5), and far larger than differences between long-form and short-form return rates seen in previous censuses (see Box 3-1 in Chapter 3).7

The low return rate for long forms could well have serious effects on the quality of the long-form data. The reason is the difficulty of obtaining long-form information in follow-up. While enumerators visit all nonresponding households, evidence from the 1990 census indicates that, very often, they succeed in obtaining responses only to the short-form questions from the households in the long-form sample and not also the additional information on the long form (National Research Council, 1995:App.L).

A second disappointment of the 2000 census mailing strategy was that the plan to mail a second questionnaire to nonresponding households had to be discarded (see Appendix A). At the time of the dress rehearsal, vendors said they could not turn around the address list for a targeted second mailing on the schedule required.
In addition, experience in the dress rehearsal suggested that mailing a second questionnaire to every address would generate adverse publicity and increase the number of duplicate returns that would need to be weeded out from the census count.

PAID ADVERTISING AND PARTNERSHIPS

An important element of the Census Bureau’s strategy in 2000 to reverse the historical decline in mail response rates and to encourage nonrespondents to cooperate with follow-up enumerators was to advertise more extensively and expand local outreach efforts well beyond what was done in the 1990 census. An integral part of the advertising strategy was to pay for ads instead of securing them on a pro bono basis. Advertising and outreach efforts began in fall 1999 and continued through May 2000.

7  One of the Bureau’s questionnaire experiments in the early 1990s, using an appealing form and multiple mailings, presaged this outcome: it found an 11 percentage point difference between short-form and long-form response rates (Treat, 1993). It was expected, however, that the publicity in a census environment would narrow this difference.

The advertising campaign appeared very visible and appealing, and we believe that it very likely contributed to maintaining the response rate in 2000 at the 1990 level. However, data are not yet available with which to evaluate the extent of its contribution, either overall or for specific population groups to whom ads were targeted. Although it may not be possible to link specific ads or the overall campaign to response in any direct way, evaluation studies should be pursued to explore this question.

Similarly, partnerships with local communities for outreach seemed more numerous and vigorous than in 1990. It might be useful to conduct case studies of outreach efforts in specific communities, even if it is likely not possible to evaluate their contribution to mail response or the success of follow-up overall.

Also, it could be useful to analyze variations in the extent of partnerships in different communities. While the Census Bureau offered opportunities for outreach partnerships nationwide, some localities were more supportive and put forth more resources than others. (The Census Bureau provided materials and limited staff support but not direct funding.) Variation in the presence and effectiveness of outreach partnerships (just as variation in participation in the LUCA Program) could have helped reduce variability in population coverage to the extent that outreach was more effective in traditionally hard-to-count areas. Alternatively, such variation could have led to greater variability in population coverage across geographic areas than in previous censuses, which is of concern for uses of census data that involve population shares (e.g., allocation of federal funds; see Chapter 2).
AGGRESSIVE RECRUITMENT OF ENUMERATORS

Just as critical to the success of the census as developing the MAF and encouraging mail response was the follow-up effort to visit nonresponding households and either obtain an enumeration or determine that the address was a vacant unit or should not have been included in the MAF. Nonresponse follow-up was a major problem in the 1990 census because the mail response rate not only dropped below the rate in 1980, but also dropped several percentage points below the budgeted rate. The Bureau had to seek additional funding, scramble to hire enough enumerators, and stretch out the effort much longer than planned.

In contrast, in 2000, fears of a tight labor market that could make it difficult to hire short-term staff led the Bureau to plan aggressive recruitment of field staff from the outset. Generous funding made it possible for the Bureau to implement its plans, which included directing local offices to recruit twice as many enumerators as they expected to need at competitive wages (see Chapter 3).

The Bureau’s recruitment strategy seems to have been very successful. Most local offices had little or no difficulty meeting their staffing goals, and nonresponse follow-up was completed slightly ahead of schedule, a major achievement of the 2000 census. A midstream assessment of nonresponse follow-up concluded that it was going well in most offices (U.S. General Accounting Office, 2000b).

It is possible that the success in completing nonresponse follow-up on time, and, similarly, in fielding a more focused coverage improvement follow-up effort than in 1990, contributed to reduction in measured net undercount. In 1990, questionnaires with later check-in dates (the date of entering the Census Bureau’s processing system) were more likely to include erroneous enumerations than were returns checked in earlier. Specifically, the percentage of erroneous enumerations increased from 2.8 percent for questionnaires checked in through April 1990 (largely mail returns), to 6.6 percent, 13.8 percent, 18.8 percent, and 28.4 percent, respectively, for those checked in during May, June, July, and August or later (largely enumerator-obtained returns) (Ericksen et al., 1991:Table 2).

Although the correlation between timing of receipt and accurate coverage of household members on a questionnaire may be spurious, there are several plausible reasons to support such a relationship. For example, people who moved between Census Day and follow-up could well be double-counted, at both their Census Day residence and their new residence (e.g., snowbirds in transit from a southern winter residence to a northern summer residence or college students in transit between home and dormitory around spring or summer vacation).8 More generally, the later a household was enumerated, the less accurately the respondent might have described the household membership as of Census Day. Given the delays in nonresponse follow-up in 1990, it appears that as much as 28 percent of the workload was completed in June or later, when erroneous enumeration rates were 14 percent or higher.
We do not have information on the relationship of erroneous enumerations to the timing of enumeration in the 2000 census. However, we do know that nonresponse follow-up was completed by the end of June. Some returns were obtained through coverage improvement follow-up in July and August, but these represented a small percentage of the total. (Most coverage improvement work involved quality checks on already received returns rather than new enumerations; see Appendix A.) Hence, although we cannot be sure, it is possible that the speedier completion of nonresponse follow-up in 2000 contributed to the reduction in net undercount.

It is also possible, however, that the drive to complete nonresponse follow-up on schedule led to coverage errors that were not corrected in the second wave of coverage improvement follow-up. In support of this possibility, at the end of all follow-up operations, there were more people requiring imputation to complete their census records in 2000 (5.8 million or 2.1% of the household population) than in 1990 (1.9 million or 0.8% of the household population). Some of this increase was not unexpected because follow-up of large households and other households that were thought to be incomplete was handled by telephone, without field work, but some of the increase in people requiring imputation is not yet explained (see Chapter 8).

8   Duplicate enumerations of snowbirds and other people with multiple residences may have increased in 2000 because of the lack of instructions on the questionnaire for how such people should respond to multiple forms.

USE OF CONTRACTORS AND IMPROVED TECHNOLOGY

A major innovation for the 2000 census was the use of outside contractors and improved technology for key operations. Three outside vendors were contracted for data capture using imaging and optical mark and character recognition, supplemented by clerical keying; the Census Bureau’s National Processing Center at Jeffersonville, Indiana, was the fourth data capture center. Also, outside vendors were used to provide telephone questionnaire assistance and to carry out telephone follow-up for questionnaires that were identified as possibly incomplete in coverage (e.g., households that reported more members than the number for which they answered individual questions; see Chapter 3).

Outside contracting for data capture was essential to handle the workload, given that almost all questionnaires were checked in at one of the four processing centers. By contrast, in 1990, most questionnaires went first to local offices for check-in and editing, and no use was made of contractors for data capture or other major operations.

In testing data capture operations in early 2000, some problems were identified in the accuracy of the optical mark/character recognition, and changes were made to improve the accuracy rate and reduce the number of questionnaires that had to be keyed or rekeyed from images by clerks (U.S. General Accounting Office, 2000a). Data capture systems were redesigned to separate capture of short-form data from long-form data.
This change was made on the basis of operational tests of keying from images, which demonstrated that keying could not occur fast enough to handle short-form and long-form data at the same time and keep to the overall schedule (U.S. General Accounting Office, 2000b). Evaluations of the accuracy and efficiency of contractor data capture and other operations are under way. Little hard evidence is yet available, but it appears that the contractors performed well and that the Census Bureau was able to retain appropriate oversight and management control of contractors’ work. The Bureau reported that overall rates of accuracy of optical character recognition (99%) and keying from image (97%) exceeded performance standards. There were no apparent data processing delays that affected field or other operations, except that long-form questions were set aside to ensure that short-form processing stayed on schedule.

INCREASED USE OF COMPUTERS

The 2000 census used computers whenever possible to replace tasks that were previously performed in part by clerks or enumerators. Notably, questionnaires went directly to one of the four data processing centers for data capture instead of being processed by clerks in local census offices, as occurred for much of the workload in 1990. Editing and imputation of individual records to supply values for missing responses to specific questions or reconcile inconsistent answers were handled entirely by computer; there was no clerical editing or effort to revisit households to obtain more content information as occurred for some of the workload in 1990. Mail returns that appeared to be missing information for some household members were followed up by telephone, but in contrast to 1990, there was no field follow-up when telephone follow-up was unsuccessful except for follow-up of completely blank returns (see Appendix A).

After completion of all follow-up procedures, sophisticated computer routines were used, as in previous censuses, to complete the census records for households and people that had minimal information. These imputation routines used records from neighboring households or people who matched as closely as possible whatever information was available for the household or individual requiring imputation (see Chapter 8).

The advantages expected from greater computerization of data processing included savings in cost and time to complete the data records. Also, it was expected that computer systems for editing and imputation would be better controlled and less error-prone than clerical operations.

The 2000 census computer systems for data processing appear to have worked well. Although programming of systems was delayed because of the delays in determining the final census design, there appears to have been little adverse effect on the timing of other operations.
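The donor-based imputation described above (completing a record from a neighboring household or person that matches as closely as possible on whatever information is available) can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Person` fields, the use of a tract code as the "neighborhood," and household size as the only matching variable are assumptions chosen for illustration, not the Bureau's actual routines or specifications.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Person:
    tract: str                # hypothetical stand-in for "neighboring" area
    household_size: int       # hypothetical matching variable
    age: Optional[int]        # None means the item was not reported


def impute_age(recipient: Person, donors: Sequence[Person]) -> Person:
    """Fill a missing age from the closest-matching donor in the same tract."""
    if recipient.age is not None:
        return recipient  # nothing to impute

    # Only donors from the same area with a reported age are eligible.
    candidates = [
        d for d in donors
        if d.tract == recipient.tract and d.age is not None
    ]
    if not candidates:
        return recipient  # no usable donor; leave the item missing

    # Pick the donor whose household size is closest to the recipient's.
    best = min(
        candidates,
        key=lambda d: abs(d.household_size - recipient.household_size),
    )
    return Person(recipient.tract, recipient.household_size, best.age)


donors = [
    Person("A", 2, 30),
    Person("A", 5, 8),
    Person("B", 4, 50),
]
incomplete = Person("A", 4, None)
completed = impute_age(incomplete, donors)
print(completed)
```

A production routine would match on many more variables and would need rules for ties and for areas with no donors; the sketch only shows the basic idea of borrowing a value from the nearest similar record.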
Computer problems did delay the implementation of the coverage edit and telephone follow-up operation by a month. Data are not yet available for evaluating the quality of computer-based editing and imputation. The rates of missing data for individual short-form items, such as age and sex, are known and were low (1%–4%). Moreover, it was often possible to infer a missing value from other information on the household’s own questionnaire instead of having to use information from neighboring households. Imaging of forms helped in this regard, as names were captured along with responses to questions (see Appendix A).

The Bureau’s editing and imputation routines for missing and inconsistent data items have been increasingly refined over several decades of computer processing of censuses and household surveys, although few studies have been performed of the errors introduced by imputation. With the likely exception of race/ethnicity data, examination of published tables from the 1990 census of the distributions of individual items before and after imputation shows little effect of the imputations, particularly for short-form items for which rates of missing data are low (see U.S. Census Bureau, 1992b). Assuming that the Bureau maintained good quality control of editing and imputation specifications and implementation in 2000, the use of computer routines to provide values for specific missing short-form items would have had little adverse effect on data quality. The resulting data products will be more complete and therefore useful for a broader range of purposes.

The Bureau has also used computerized imputation routines for several censuses to supply data records for households and people with minimal information. In general, such a procedure is likely preferable to deleting the person or household, given that there is good reason to believe the person or household exists and should be included. In 2000, there were considerably more such people than in 1990, which helps explain some puzzling findings about population coverage in the two censuses (see below). For this reason, the performance of the computerized imputation routines for whole person imputations should be carefully evaluated to determine if the imputations were appropriate.

POPULATION COVERAGE

The evidence from the A.C.E. indicates that the 2000 census, compared with previous censuses, succeeded in its primary goals to reduce net undercount and to narrow the differences between net undercount rates for historically less-well-counted and better-counted groups (see Chapter 6). Not all planned analyses of the A.C.E. have yet been completed, particularly studies of balancing error and matching error, so we must reserve judgment about the accuracy of particular A.C.E. results. Nonetheless, the overall patterns of net undercount in the A.C.E.
for major groups accord with knowledge from previous censuses: while differences in net undercount rates were narrowed, the rates remained somewhat higher for such groups as minorities in comparison with non-Hispanic whites, renters in comparison with owners, men in comparison with women, and younger people in comparison with older people. Estimates of the population from demographic analysis indicate that the census either had a net overcount of the total population or had a net undercount considerably smaller than that measured by the A.C.E. The different demographic estimates result from different assumptions about net undocumented immigration (see Chapter 5). The demographic analysis results corroborate the A.C.E. findings of reduced net undercount for children. They also show a difference in net undercount rates for blacks and others—indeed, a larger difference than that measured in A.C.E. The uncertainties about estimates of immigrants and the categorization of the population by race lead us to conclude that the available demographic estimates should not be a standard for evaluating the census or the A.C.E.

Until additional evaluations under way at the Census Bureau are completed, we cannot endorse either the census or the A.C.E. estimates of the total population. Nonetheless, it seems clear that net undercount was reduced in 2000, from about 4.0 million people in 1990 (1.6% of the population) to about 3.3 million people (1.2% of the population), or possibly less.9 It is also clear that counting errors occurred in both directions: the census missed people who should have been counted and duplicated or included other people who should not have been counted. Indeed, a puzzle from the A.C.E. was that rates of erroneous enumerations and missed people were not dissimilar from the rates in the 1990 Post-Enumeration Survey (PES), which should result in similar estimates of net undercount, other things equal. Yet the A.C.E. measured lower net undercount (see Chapter 7).

One aspect of population coverage in 2000 for which no evaluation data are available is the completeness of enumeration of people in group quarters. Almost 3 percent of the population counted in 2000 resided in such group settings as college dormitories, nursing homes, prisons, military barracks, migrant worker dormitories, and others. Group quarters residents were excluded from the A.C.E. estimation because of problems in developing dual-systems estimates for them in the 1990 PES due to their high rates of short-term mobility (see Killion, 1997; see also Chapter 6).10 The Bureau planned more intensive enumeration procedures for group quarters in 2000 (e.g., advance visits to special places and additional training for field staff), expecting that enumeration would be more accurate than dual-systems estimation for this population. However, scattered evidence suggests that the enumeration of group quarters residents was not as well controlled as the enumeration of the household population.
The development of the address list for special places (e.g., dormitories) was not integrated with the MAF until late in the process, and census data users have reported that dormitory and prison populations were assigned to incorrect geographic locations in some instances, usually to a neighboring area (see Anderson and Fienberg, 2001a). Our observation of field offices suggests varying levels of cooperation from administrators of special places, which could have impeded complete enumeration. We cannot assess coverage for group quarters residents until the Census Bureau completes its evaluations, including an assessment of whether the A.C.E. properly treated group quarters enumerations in developing dual-systems estimates for the household population.

9   The estimates in the text are from the PES and the A.C.E., respectively. Demographic analysis estimated a slightly higher net undercount in 1990 (1.9% of the population), but a lower net undercount in 2000 (0.3% of the population) or, possibly, a small net overcount (Robinson, 2001a:App. Tables 1, 3).

10   The PES estimation included noninstitutionalized group quarters residents but not inmates of institutions.

COVERAGE-RELATED FACTORS

In endeavoring to understand changes in population coverage patterns between 1990 and 2000 by comparing the A.C.E. and PES estimates, we analyzed two census operational outcomes for which data were available: mail returns and people who could not be included in the A.C.E. (or PES) because they lacked sufficient reported information for matching or because their census records were available too late to be processed. Our preliminary mail return rate analysis did not shed much light on changes in coverage patterns; it is summarized below and detailed in Appendix B. Our analysis of people not included in the A.C.E. was more informative; it is briefly described below and discussed in detail in Chapter 8.

Mail Return Rates

Our interest in mail return rates stemmed from 1990 research showing that, in the census context, a mail return filled out by a household member tends to be more complete in coverage (and content) than an enumerator-obtained return. Analysis of 2000 A.C.E. data largely confirmed the 1990 findings: mail returns were somewhat less likely to omit one or more household members than were enumerator-obtained returns, and they were also less likely to include an erroneous enumeration. At the neighborhood (census tract) level, rates of household omissions and erroneous enumerations (particularly omissions) declined as the neighborhood mail return rate increased.

Because overall mail return rates were similar between 2000 and 1990, the reduction in net undercount for the total population from 1990 could not be due to mail returns. However, the distribution of mail return rates could have changed in ways that would explain the smaller differences in net undercount rates between usually hard-to-count and easier-to-count groups in 2000 compared with 1990.
For instance, targeted advertising and outreach might have increased mail return rates for renters while the rates for owners fell off slightly from 1990 levels. Regression analysis of mail return rates for 1990 and 2000 for census tracts characterized by such variables as percentage minorities or renters in 1990 (2000 variables were not available) did not support our supposition (see Appendix B). Much the same variables explained mail return rates in both 1990 and 2000,11 and the available demographic and socioeconomic variables failed to explain differences in mail return rates for census tracts between 1990 and 2000. Census tracts that experienced unusually large increases or decreases in mail return rates did show a tendency to cluster geographically. Research on particular characteristics of such clusters, including any distinctive features of outreach and other census operations, could be useful to identify factors that particularly help or hinder mail response.

11   A 1990 hard-to-count score constructed by the Census Bureau and the 1990 percentage net undercount, percentage people in multi-unit structures, and percentage people who were not high school graduates had large negative effects on mail return rates not only in 1990 but also in 2000; the 1990 percentage population over age 65 had a strong positive effect in both years.

Imputations and Late Additions

We were puzzled by the reduction in net undercount measured in the A.C.E. because the rates of omissions and erroneous enumerations in the A.C.E. were generally as high as or higher than the rates in the 1990 PES. We identified a major reason for this result: namely, the considerably larger number of people in 2000 than in 1990 who could not be included in coverage evaluation but who were part of the total census count that is compared to the dual-systems estimate from the A.C.E. (or the PES) to calculate net undercount. The people who could not be included in the A.C.E. comprised two groups: (1) people reinstated in the census from the special MAF unduplication operation and (2) people who lacked sufficient information for matching and so required imputation to complete their census records (whole person imputations).

Only a small number of people in 1990 were enumerated too late to be included in the PES, so the much larger number of reinstated people in 2000 (2.4 million) contributed to reducing the net undercount. Such people were about equally likely to be found among historically better-counted groups as among historically worse-counted groups, so they did not affect differences in net undercount rates. In contrast, people requiring imputation were not only a much larger group in 2000 (5.8 million) than in 1990 (1.9 million), but they were also disproportionately found among minorities, renters, and children, thus accounting in large part for the reduction in differential net undercount for these groups relative to non-Hispanic whites, owners, and older people.
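The arithmetic behind this explanation can be sketched as follows. The sketch assumes the conventional definition of the net undercount rate, (dual-systems estimate minus census count) divided by the dual-systems estimate, and it holds the dual-systems estimate fixed, which is a simplification. The dual-systems estimate and the count of people eligible for coverage evaluation are hypothetical round numbers chosen for illustration; only the 1.9 million, 5.8 million, and 2.4 million figures come from the text.

```python
def net_undercount_rate(dse: float, census_count: float) -> float:
    """Percent net undercount, assuming rate = (DSE - census count) / DSE * 100."""
    return 100.0 * (dse - census_count) / dse


# Hypothetical values in millions, for illustration only.
HYPOTHETICAL_DSE = 277.0
HYPOTHETICAL_EVALUATED_COUNT = 265.0  # people who could be included in coverage evaluation

# Figures quoted in the text, in millions.
IMPUTED_1990 = 1.9
IMPUTED_2000 = 5.8
REINSTATED_2000 = 2.4


def rate_with_additions(added_millions: float) -> float:
    """Rate when records outside coverage evaluation are added to the census count."""
    total_count = HYPOTHETICAL_EVALUATED_COUNT + added_millions
    return net_undercount_rate(HYPOTHETICAL_DSE, total_count)


rate_small = rate_with_additions(IMPUTED_1990)
rate_large = rate_with_additions(IMPUTED_2000 + REINSTATED_2000)

print(f"1990-scale additions: {rate_small:.2f}% net undercount")
print(f"2000-scale additions: {rate_large:.2f}% net undercount")
```

With the estimate held fixed, every record added to the census count lowers the measured net undercount, even if the rates of omissions and erroneous enumerations among the evaluated records are unchanged.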
We discuss the types of people requiring imputation, as well as the people reinstated in the census, and the possible implications for the quality of census operations from their larger numbers, in Chapter 8.

CONCLUSIONS

Overall, we conclude that the census was well executed in many respects, particularly given the difficulties of changes in the overall design and other problems encountered in the years leading up to 2000. Many innovations appeared to be effective. These included: (1) contracting for data operations and use of improved data capture technology, (2) use of a redesigned questionnaire and mailing strategy, (3) paid advertising and expanded outreach, and (4) aggressive recruitment of enumerators. Greater reliance on computers for data editing and imputation requires evaluation of the effects on data quality.

The concept behind the development of the MAF, namely, to make use of multiple sources, some for the first time, makes sense. However, there were problems in execution that may have increased duplicate and other erroneous enumerations and contributed to the larger number of people in 2000 who required imputation to complete their census records.

An achievement was to maintain the overall mail response rate at the 1990 level. A disappointment was that long-form response rates were considerably lower than short-form rates.

Another achievement was the reduction in measured net undercount from 1990 levels, overall and for historically less-well-counted groups. The larger numbers of people requiring imputation largely explained these reductions, which otherwise are not compatible with the estimated rates of omissions and erroneous enumerations in the A.C.E.

Our analyses of these topics are limited by the available data. Consequently, our conclusions are preliminary and incomplete. The Census Bureau has under way a comprehensive set of evaluations, which should provide information for a definitive assessment of population coverage and data quality and of design and operational features of the 2000 census that most affected coverage and quality (U.S. Census Bureau, 2000). An important tool for evaluation of the effects of census operations should be the Bureau’s planned Master Trace Sample, essentially a compilation of major census databases for a systematic sample of addresses to permit tracing each step of the operations (see National Research Council, 2000a).

Some evaluations in the Bureau’s planned program were moved up in priority and other evaluations were added last spring when the Bureau realized that the available assessments of the A.C.E., demographic analysis, and the census were not adequate to permit a decision to use A.C.E. estimates to adjust the census counts for legislative redistricting.
These evaluations cover components of demographic analysis, several kinds of possible error in the A.C.E., enumeration procedures for and coverage of the group quarters population, whole person imputations, and people reinstated in the census from the MAF unduplication. The results of the Bureau’s work over the last 6 months will be released when the Bureau makes a decision around mid-October on whether to adjust population estimates for fund allocation and other purposes. We will review these evaluations at that time.

NEXT STEPS

Looking to the Census Bureau’s longer range evaluation program, we urge the Bureau to devote resources to completing planned studies on as fast a schedule as practicable. Even when the results of the last 6 months’ work are released, there will not be answers to many of the questions about the census, particularly which operations and design features had the greatest effects on coverage and data quality. Further, the information from the 2000 evaluations is needed for planning the 2010 census.

Important aspects of census operations that we have identified for timely evaluation include:12

Group quarters: The completeness of coverage of the group quarters population and the effects of address list development and enumeration procedures on coverage should be assessed.

MAF and LUCA: The quality of the MAF and the part played by LUCA and other sources of addresses in identifying good addresses, as well as in adding erroneous addresses, should be examined. In particular, the sources of addresses of people deleted from and reinstated in the census from the special MAF unduplication operation should be determined. It could be useful to conduct field work to estimate the extent of duplicate enumerations remaining among the reinstated people (see Chapter 8).

Whole person imputations: Reasons for larger numbers of people requiring imputation to complete their census records should be sought, such as possible problems with the coverage edit and telephone follow-up and coverage improvement follow-up operations. Evaluations of the computerized routines used for imputation should be conducted (see Chapter 8).

In addition, early completion of the Master Trace Sample should be a priority to permit tracing through the effects of each step of the census operations for a sample of addresses. Finally, as soon as practicable, the demographic and socioeconomic information collected in the long form should be thoroughly evaluated.

Looking ahead to the 2010 census, the Census Bureau has made an early start on design and preparation.
Its current plans include a major effort to reengineer MAF and the associated TIGER system of assigning addresses to geographic areas;13 the use of a new American Community Survey (ACS) to provide long-form information on an annual basis;14 and the implementation of a simplified short-form-only census in 2010 that makes maximum use of improved technology for enumeration and data capture (see Miskura et al., 2001; Waite et al., 2001). Our sister Panel on Research on Future Census Methods is charged to review the 2000 census evaluation results and the Bureau’s evolving plans for 2010 to recommend appropriate research and testing that will lead to a successful 2010 design (see National Research Council, 2000a). That panel is currently reviewing such topics as MAF reengineering, ACS estimation issues, and administrative computer systems for operations management.

12   Some of these topics will be covered in the evaluations to be released in mid-October; however, additional studies may be required. Priorities for research on demographic analysis and the A.C.E. are addressed in Chapters 5 and 7, respectively.

13   TIGER stands for Topologically Integrated Geographic Encoding and Referencing System.

14   When fully implemented in 2003, the ACS will survey 250,000 households each month, or 3 million households per year, using a mailout questionnaire similar to the 2000 long form, a targeted second questionnaire mailing to encourage response, telephone follow-up for nonresponse, and field follow-up of one-third of remaining nonrespondents. As a separate sample-based survey with a permanent staff, the ACS is expected to provide better quality long-form-type information than it appears possible to obtain in the census (see National Research Council, 2000c:Ch.4).

We are not charged specifically to recommend changes to the census design for 2010. Based on our evaluations of 2000 operations to date, we do offer two suggestions for consideration.

First, the Bureau’s plans for the MAF for 2010 include continuation of a LUCA-type program. Implementation of LUCA for the 2000 MAF was difficult and participation was variable. The report of the LUCA Working Group suggests that participation can perhaps be most effective when it is coordinated for localities by a state or regional agency, such as a metropolitan association of governments. We suggest that the Census Bureau review its experience with LUCA partnerships in 2000 and consult with state and local governments to determine partnership strategies for 2010 that are likely to work well for both the Bureau and its LUCA partners.

Second, we endorse the recommendations of two prior Committee on National Statistics panels that serious consideration should be given to moving Census Day to an earlier date than April 1, preferably to the middle of a month (National Research Council, 1994:38–40; 1999b:43–44). Changing Census Day could well improve the accuracy of enumeration of several groups of the population. These include: people moving into a new rental apartment or home, which is more likely to occur at the beginning than the middle of a month; college students, who may be less likely to be on spring break at an earlier date and less likely to have ended their spring semester when nonresponse follow-up is in progress; and snowbirds who may be less likely to be in transit at an earlier date.
In addition, more time in which to evaluate the census, the A.C.E., and demographic analysis could make it possible to reach a decision about whether to adjust the census data for legislative redistricting without the uncertainties that affected the Bureau’s decision last March.15 Moving Census Day would require changing Title 13 of the U.S. Code, which specifies key delivery dates in terms of months after Census Day rather than a specific day (e.g., 12 months after Census Day for delivery of redistricting data). A possibility is to change Title 13 to specify the current delivery dates of December 31 of the census year for reapportionment counts and April 1 of the following year for redistricting counts, while giving the Census Bureau the authority to change Census Day should the Bureau conclude that such a change would facilitate the enumeration. Work to change Title 13 should begin soon if the Bureau is to have the option of moving Census Day in 2010.

15   Changing Census Day could have some effect on the time series of estimates from the census, depending on how the new and old dates relate to seasonal patterns of residence.