The 2000 Census: Interim Assessment

3 Census Operations: Overview

The process of conducting the 2000 decennial census involved a complex set of operations, with five primary components:

□ develop the Master Address File (MAF), a list of addresses of all housing units in the country (along with a roster of dormitories, nursing homes, and other special places where people live in group quarters);1
□ mail or hand deliver census questionnaires to each address on the MAF, asking households to fill them out and return them by mail, and enumerate households in selected areas in person;
□ carry out such processes as advertising and outreach, with the intent of boosting mail response and follow-up cooperation;
□ follow up those addresses that failed to report and implement other field-based checks and coverage improvement procedures; and
□ process the data through the steps of data capture, unduplication, editing and imputation, and tabulation.

This set of components is an outline of the census enumeration process. It does not include the generation of demographic analysis estimates of overall patterns of census coverage (see Chapter 5); the Accuracy and Coverage Evaluation (A.C.E.) Program (see Chapters 6 and 7); or other evaluations and experiments by the Census Bureau (see National Research Council, 1999b, 2000a).

Table 3-1 lists the basic components in the census, summarizing the challenges for each component, innovative procedures used in 2000 to meet these challenges, and the possible benefits and risks from the 2000 innovations relative to the procedures used in 1990. The overarching challenge was to complete as accurate a count of people and housing units in the United States as possible. Ideally, the census would correctly include all of the population and

1 Special places can also include separate residences (e.g., a warden’s home in a prison complex).
TABLE 3-1 Operational Components and Challenges for the 2000 Census

(1) Component: Develop MAF for mailback universe (99% of population in 2000); develop list of special places for enumerating people in group quarters (dormitories, nursing homes, etc.)
Challenges: Create list with fewest possible omissions and fewest possible duplicates or other erroneous addresses; assign (geocode) addresses correctly to census geographic units
2000 Solutions: Begin with 1990 list for most addresses; use multiple sources to update and correct (Postal Service list, field canvassing, local input, updating during enumeration, computer checking to reduce duplication)
Possible Benefits/Risks Compared with 1990:
□ Multiple sources could reduce omissions (benefit), but increase duplication (risk)
□ Local review could vary in extent and completeness (risk)
□ Unduplication procedures (including the unplanned summer 2000 operation) could be incomplete (risk)

(2) Component: Deliver questionnaires to each address on the list for households to fill out and mail back (enumerate rural households directly)
Challenges: Obtain highest possible mail response and return rates through delivery procedures [see also (3)]
2000 Solutions: Use enumerators instead of U.S. Postal Service to reach non-city-style addresses; make multiple mailings (e.g., advance letter); use shortened user-friendly questionnaire; provide options to respond by telephone or the Internet; allow people to pick up “Be Counted” forms if they failed to receive a mailing
Possible Benefits/Risks Compared with 1990:
□ Redesigned questionnaires and multiple mailings could boost mail response (benefit)
□ Space for only 6 people on the questionnaire (7 in 1990) could affect count of large households (risk)
□ Widespread availability of “Be Counted” forms could increase duplication (risk)

(3) Component: Before and throughout census operations, conduct outreach and advertising programs to boost mail response rate and cooperation with follow-up enumerators
Challenges: Reach and motivate general public and, especially, traditionally hard-to-count groups to fill out their census form
2000 Solutions: Run extensive Census in Schools Program; pay for advertising, including ads targeted to minority groups; hire partnership specialists; partner with local governments, businesses, and others for special outreach efforts; challenge communities to meet or exceed their 1990 mail response rate
Possible Benefits/Risks Compared with 1990:
□ Much more extensive advertising and outreach could stem the decline in mail response rates that occurred from 1970 to 1990 (benefit)
□ Partnership efforts could vary in effectiveness (particularly because Census Bureau could provide materials and limited staff help but not funding) (risk)

(4) Follow-up activities

Component: Send enumerators to follow up addresses for which no questionnaire is returned
Challenges: Obtain accurately completed questionnaire (or designation of address as vacant or nonresidential) from every address not responding in a timely manner
2000 Solutions: Recruit more enumerators than are expected to be needed so that follow-up can be completed in a timely manner; pay higher area-adjusted wages to attract and keep good people
Possible Benefits/Risks Compared with 1990:
□ More timely completion of follow-up could increase data quality and facilitate coverage evaluation (benefit)
□ Push to complete enumeration could increase duplications and other errors (risk)

Component: Send smaller, second wave of enumerators to recheck addresses designated as vacant or delete, enumerate late additions to MAF, check “Be Counted” and telephone forms, and enumerate lost or blank forms, so as to improve coverage
Challenges: Carry out coverage improvement field checks accurately and on a timely basis
2000 Solutions: Limit rechecks of vacant units to those not classified as such in two previous operations; drop several 1990 operations: local review of preliminary housing counts, field reinterview of questionnaires with minimal information, and parolee and probationer coverage program
Possible Benefits/Risks Compared with 1990:
□ Targeted coverage operations could reduce erroneous enumerations (benefit) at small risk of increased omissions
□ Reduced follow-up for incomplete questionnaires could increase use of computerized statistical imputation techniques (neutral or risk)

(5) Component: Process the questionnaires through data capture, coverage edit, unduplication, imputation for missing data, and tabulation
Challenges: Carry out all data processing operations with a high degree of accuracy in a timely manner
2000 Solutions: Hire contractors to capture the data, using optical character recognition; identify forms with insufficient information about household members for telephone follow-up; unduplicate people and addresses [see (1)]; edit answers and impute values for missing answers; release data products in various formats (e.g., Internet)
Possible Benefits/Risks Compared with 1990:
□ More centralization of computer processing and greater use of computerized editing could reduce errors and variability (benefit)
□ Unduplication efforts could be incomplete (risk)
□ Greater use of imputation could or could not effectively replace field work (neutral or risk)
miss none of the population. More practically, the goal for 2000 was to reduce the measured net undercount below previously observed levels for the total population and important population groups. In addition, the census had to be completed within statutory deadlines and within budget.

This chapter briefly describes the five major components of the 2000 census, emphasizing key differences from the 1990 census procedures. This overview provides background for the assessments in Chapters 4 through 8. Additional details on census procedures are in Appendix A, which contains full descriptions of census processes in 2000 and 1990.

DEVELOPING THE MASTER ADDRESS FILE

The procedures used by the Census Bureau to develop its 2000 computerized mailing list (the MAF) differed in several important respects from those used in past censuses.2 The major difference from 1990 was that the 2000 MAF was constructed using more sources. The expected benefit was that the MAF would be more complete. The possible risks were that the MAF would have more duplicate or erroneous addresses that were not weeded out from the final list and that the quality of the MAF would vary significantly across geographic areas. The risks were considered high because many of the new and previously untapped sources of addresses for the MAF were being used for the first time in 2000. As it turned out, the Census Bureau had to alter several parts of the MAF development process as it proceeded, in order to keep on schedule and improve the quality of the list.

Initial Development

The Census Bureau used somewhat different procedures to develop the MAF for areas believed to have predominantly city-style addresses (house number and street) than for areas believed to have predominantly rural route and post office box addresses (see Box A-1 in Appendix A).
The base for the city-style portion of the MAF was the final address file used in the 1990 census, which was augmented periodically by updates from U.S. Postal Service files. Additional city-style addresses were obtained through three auxiliary programs: the Local Update of Census Addresses (LUCA) Program, in which local and tribal governments reviewed the address lists for their areas;3 block canvass, a field operation to check the entire list, which was not part of the original plans; and a new construction LUCA Program added in response to local concerns, in which localities could identify addresses newly constructed between January and March 2000. An intensive check by the Postal Service of its own address files produced a set of updates that were made to the MAF in early 2000 prior to mailout; questionnaires were delivered by the Postal Service in March.

2 The Census Bureau refers to the version of the MAF that was used in the census as the Decennial Master Address File or DMAF. It is an extract of the full MAF, which includes business as well as residential addresses. Our use of the term MAF refers to the Bureau’s DMAF.

3 The Address List Improvement Act of 1994 (P.L. 103–430) made it possible for the Postal Service to share its list with the Census Bureau and for the Bureau to share the MAF with localities that signed a pledge to treat the list as confidential. Local review efforts in previous censuses were limited to review of housing unit counts for census blocks but not the individual addresses.

The base for the non-city-style portion of the MAF was a complete block canvass, or prelist, conducted in late 1998 and early 1999. A LUCA Program for non-city-style address areas was implemented in 1999. The MAF for these areas continued to be updated in February-March 2000, as census enumerators were asked to note new entries for the MAF as they dropped off questionnaires to households. For remote rural areas, Census Bureau enumerators developed the address list concurrently with enumerating households in person. For special places (e.g., college dormitories), the Bureau used a variety of sources to develop an address list.

Further Development

The MAF was a dynamic file during the operation of the census. Not only were addresses added from each stage of census field operations, but addresses were also deleted in an effort to minimize duplicate and erroneous entries. The Census Bureau estimates that a total of about 4 million addresses were added to the MAF: 2.3 million during questionnaire delivery and 1.7 million during follow-up. At the same time, the Bureau estimates that a total of about 10.4 million addresses were removed as duplicative of other addresses or nonexistent; about half were deleted on the basis of field checks and half on the basis of internal computer checks. One computer check was performed prior to nonresponse follow-up; another (not included in the original plans) was performed in summer 2000 (see below). The final 2000 MAF included addresses for 115.9 million occupied and vacant housing units.
Unduplication and Late Additions

An unanticipated complication arose from evaluations of the MAF between January and June 2000. These evaluations, which compared MAF housing unit counts to estimates prepared from such sources as building permits, led the Census Bureau to conclude that there were probably still a sizable number of duplicate housing unit addresses on the MAF despite prior computer checks. Field verification carried out in June 2000 in a small number of localities substantiated this conclusion. Consequently, the Bureau mounted an ad hoc operation to identify duplicate MAF addresses and associated census returns. Housing unit and person records flagged as likely duplicates were deleted from the census file and further examined. After examination, it was decided that a portion of the deleted records were likely separate housing units not already in the census, and they were restored to the census file. At the conclusion of the operation, 1.4 million housing units and 3.6 million people were permanently deleted from the census; 1 million housing units and 2.4 million people were reinstated.4

QUESTIONNAIRE DELIVERY AND MAIL RETURN

The 2000 census, like the 1980 and 1990 censuses, was conducted primarily by delivering questionnaires to households and asking them to mail back a completed form. Procedures differed somewhat, depending on such factors as type of addresses in an area and accessibility; in all, there were nine types of enumeration areas (see Box A-2 in Appendix A). The two largest types of enumeration areas covered 99 percent of the household population:

□ mailout/mailback, covering almost 82 percent of the population, in which Postal Service carriers delivered questionnaires; and
□ update/leave/mailback (usually termed update/leave), covering almost 17 percent of the population, in which Census Bureau field staff delivered questionnaires and updated the MAF at the same time.

The remaining 1 percent of the household population was enumerated in person. Separate enumeration procedures (not discussed in this report) were used for such special populations as people who frequented shelters for the homeless, residents of group quarters, and transients.5

The goal for the mailback universe for this phase of the census was to get a questionnaire to every housing unit on the MAF and motivate people to fill it out and mail it back (every mail return was one less address to follow up in the field). It was expected that mail response would continue to decline, as it had from 1970 to 1990, due to broad social and economic changes that have made the population more difficult to enumerate.
These changes include rising numbers of new immigrants, both those who are legally in the country and those who are not, who may be less willing to fill out a census form or who may not be able to complete a form because of language difficulties; increasing amounts of junk mail, which may increase the likelihood that a household will discard its census form without opening it; and larger numbers of households with multiple residences, making it unclear which form they should mail back.6 The Bureau’s challenge was to forestall a further decline in mail response and, if possible, increase it above the level achieved in 1990.

4 The reinstated people are often called “late additions.” Although not enumerated late, they were added back to the census too late to be included in the A.C.E. (see Chapter 6).

5 The 2000 census developed specific procedures only to enumerate the homeless population who use shelters, soup kitchens, and specifically identified nonsheltered outdoor locations.

6 There were no instructions on the 2000 questionnaire for how to respond to forms for more than one residence.

Approaches to boost mail response in 2000 included four major activities:

□ Redesigning the questionnaires and mailing package: The questionnaires were made more attractive and easy to fill out. They were shortened by providing space to report characteristics for six people instead of seven as in 1990. In addition, most housing items previously included on the short form were moved to the long form. The mailing package emphasized the mandatory nature of the census, and multiple mailings were made to households, including an advance letter (in mailout/mailback areas), the questionnaire, and a reminder postcard.
□ Adapting enumeration procedures to special situations: This involved having nine types of enumeration areas (see Box A-2 in Appendix A).
□ Allowing multiple modes for response: Households could mail back their questionnaire or provide responses by telephone; recipients of the short form could submit their form on the Internet. In addition, people could pick up a “Be Counted” form from a local site if they thought they had been missed. (To reduce the potential for duplication, the Bureau did not widely advertise the Internet submission or “Be Counted” programs.)
□ Expanding advertising and outreach efforts (see “Outreach,” below).

A significant achievement of the 2000 census was that it did halt the historical decline in the mail response rate. The rate (about 66%) was similar to that in 1990 (65%) and considerably higher than the Bureau had projected (61%), which reduced the burden of field follow-up. The mail return rate, a more refined measure of public cooperation than the mail response rate, was slightly lower in 2000 (about 72%) than in 1990 (74%). However, for long forms, the mail return rate in 2000 was only about 58 percent, compared with about 72 percent for short forms, a much wider difference than occurred in 1990; see Box 3-1 for details. Note that questionnaires counted as “mail” returns in the 2000 census include responses from the multiple modes. Of 76 million “mail” returns, about 66,000 were Internet returns, 605,000 were “Be Counted” forms, and 200,000 were telephone responses.

OUTREACH

The Census Bureau engaged in large-scale advertising and outreach efforts for 2000. For the first time, the census budget included funds for a paid advertising campaign ($167 million). (In previous censuses, the Advertising Council arranged for advertising firms to develop ads and air them on a pro bono, public service basis.) The advertising ran from October 1999 through May 2000 and included separate phases to alert people to the importance of the upcoming census, encourage them to fill out the forms when delivered, and motivate
BOX 3-1 Mail Response and Return Rates

Definitions and Uses

The mail response rate is defined as the number of households returning a questionnaire by mail divided by the total number of questionnaires sent out in mailback areas. Achieving a high mail response rate is important for the cost and efficiency of the census because every returned questionnaire is one less household for an enumerator to follow up in the field.

The mail return rate is defined as the number of households returning a questionnaire by mail divided by the number of occupied households that were sent questionnaires in the mailback areas. This rate is an indicator of public cooperation. Achieving a high mail return rate (at least to the level of 1990) is important because of evidence from 1990 that mail returns are more complete than enumerator-obtained returns.

In 2000, because of the alternative modes by which households could fill out their forms, the numerator of both “mail” responses and “mail” returns included responses submitted on the Internet, over the telephone, and on “Be Counted” forms. The denominator of the mail response rate included all addresses on the April 1, 2000, version of the MAF, covering both mailout/mailback and update/leave areas. The denominator of the mail return rate excluded addresses on the MAF that field follow-up determined were vacant, nonresidential, or nonexistent.

Rates, 1970–2000 Censuses

Census               1970    1980    1990    2000
Mail response rate   78%     75%     65%     66%
Mail return rate     87%     81%     74%     70–72%a

Source for 1970–1990 rates: National Research Council (1995:Table 3.1, App. A). Mail response and return rates are not strictly comparable across censuses because of differences in procedures used to compile the address list, percentage of the population included in the mailback universe (about 60% in 1970 and 95% or more in 1980–2000), and time allowed for mailback.

Differences in Mail Return Rates: Short and Long Forms

Return rates of long forms are typically below the return rates of short forms. This difference widened substantially in 2000.

Census             1970     1980     1990     2000a
Short-form rate    87.8%    81.6%    74.9%    72%
Long-form rate     85.5%    80.1%    70.4%    58%

a Overall preliminary mail return rates have been cited as 72 percent and 70 percent; if 70 percent is correct, then 72 percent and 58 percent are approximately correct for the short-form and long-form rates. Rates may change when the Census Bureau completes its evaluation of mail response.
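The two definitions in Box 3-1 differ only in their denominators, which a short calculation can make concrete. The following is a minimal Python sketch under stated assumptions: the class name, field names, and the counts in the example are hypothetical illustrations, not Census Bureau figures or procedures, and the sketch uses the same numerator for both rates, as the box's wording does.

```python
from dataclasses import dataclass


@dataclass
class MailbackCounts:
    """Hypothetical tallies for one mailback area (not actual census figures)."""
    mail_returns: int      # "mail" returns from all modes: mail, Internet, telephone, "Be Counted"
    addresses_on_maf: int  # all addresses sent a questionnaire in the mailback area
    not_occupied: int      # addresses that follow-up found vacant, nonresidential, or nonexistent


def mail_response_rate(c: MailbackCounts) -> float:
    # Denominator: every address that was sent a questionnaire.
    return c.mail_returns / c.addresses_on_maf


def mail_return_rate(c: MailbackCounts) -> float:
    # Denominator: only addresses that turned out to be occupied housing units.
    occupied = c.addresses_on_maf - c.not_occupied
    return c.mail_returns / occupied


# Illustrative numbers only, chosen to make the arithmetic easy to follow.
example = MailbackCounts(mail_returns=660, addresses_on_maf=1000, not_occupied=80)
print(f"response rate: {mail_response_rate(example):.1%}")
print(f"return rate:   {mail_return_rate(example):.1%}")
```

With these made-up counts the response rate is 66.0 percent and the return rate about 71.7 percent. The return rate is higher because vacant, nonresidential, and nonexistent addresses are removed from its denominator, which is why the box treats it as the better indicator of public cooperation.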
people who had not returned a form to cooperate with the follow-up enumerators. Ads were placed on TV (including an ad during the 2000 Super Bowl), radio, newspapers, and other media, using multiple languages. Using information from market research, the ads stressed the benefits to people and their communities from the census, such as better targeting of government funds to needy areas for schools, day care, and other services.

In addition to the ad campaign, the Census Bureau hired partnership and outreach specialists in local census offices, who worked with community and public interest groups to develop special initiatives to encourage participation in the census. The Bureau signed partnership agreements with more than 30,000 organizations, including federal agencies, state and local governments, business firms, nonprofit groups, and others. A special program was developed to put materials on the census in local schools to inform school children about the benefits of the census and motivate them to encourage their adult relatives to participate.

FIELD FOLLOW-UP

Because not all households will mail back a form, and because many addresses to which questionnaires are delivered will turn out to be vacant or nonresidential, the 2000 census, like previous censuses, included a large field follow-up operation (see Appendix A). More than 500 local census offices (LCOs) were set up across the country (reporting to 12 regional census centers). The LCOs were responsible for hiring the temporary enumerators and crew leaders to conduct follow-up operations. In update/leave areas, enumerators were hired to deliver questionnaires prior to Census Day and to return to follow up nonresponding households. LCOs also carried out operations to enumerate special populations.
Anticipating possible difficulties in hiring and also the possibility that the mail response rate would decline from 1990, LCOs were authorized to recruit aggressively in advance of Census Day, hire more enumerators than they thought would be needed, permit part-time work schedules, and pay above-minimum wages (which differed according to prevailing area wages). Most offices were successful in meeting their hiring goals before the first follow-up operations began in mid-April 2000. Follow-up operations were carried out in two separate stages. The first stage was nonresponse follow-up (NRFU), designed to obtain a questionnaire from every nonresponding unit in the mailback universe or to determine that an address was vacant or nonresidential. The NRFU operation involved visiting 45 million addresses. It began in late April 2000 and was completed in late June, a week ahead of schedule (unlike 1990, when NRFU fell considerably behind schedule). The second stage was coverage improvement follow-up
(CIFU), which occurred in June-August and included specific operations designed to check and supplement NRFU. The CIFU workload included 8.7 million addresses. Several operations included in the 1990 CIFU were dropped for 2000.

Timely completion of NRFU was expected to help population coverage, given evidence from previous censuses that returns obtained earlier in the process are more accurate than late returns (see Chapter 4). Similarly, focusing the 2000 CIFU effort on selected operations was expected to reduce erroneous enumerations in comparison with 1990. The possible downside risk was that pressure on field staff could lead to rushed and less accurate work.

DATA PROCESSING

Data processing for the 2000 census was a continuing, high-volume series of operations that began with the capture of raw responses and will end with the production of voluminous data products for the user community that will be available in 2001–2003. Important innovations for 2000 included the use of outside vendors for major data processing components; the use of optical mark and character recognition technology for data capture; and greater reliance on computer routines to supply missing information, in place of field checks. The challenge for each phase of data processing was to keep on schedule, follow procedures carefully, and minimize last-minute revisions to planned procedures that could affect quality.

Several data processing operations in 2000 differed in important ways from those in 1990:

□ Data capture: The return address on mailback questionnaires directed them to one of four data capture centers: the Bureau’s National Processing Center in Jeffersonville, Indiana, and three centers run by contractors. Every questionnaire had a bar code that was scanned to record its receipt. The questionnaires were then imaged electronically, checkbox data items were read by optical mark recognition (OMR), and write-in character-based data items were read by optical character recognition (OCR). Clerks keyed data from images if the OMR/OCR technology could not make sense of the questionnaire answers. Images of the long-form items were set aside temporarily to permit the fastest possible processing of short-form data. (In 1990, in contrast, many questionnaires were sent to local offices for check-in, clerical review, and field follow-up, if necessary, to complete the population count and characteristics of the household. Data capture was performed using a microfilm-based system first developed for the 1960 census.)
□ Coverage edit and telephone follow-up: After data capture, the questionnaires were reviewed by computer to identify returns that required a
reinterview by telephone. About 2.3 million cases were in the telephone follow-up workload, including:
    □ returns that reported a higher total count of household members than the number of members for which individual information (e.g., age, race, sex) was provided;
    □ returns that did not report a household count and provided information for exactly six people (the limit of the space provided on the questionnaire);
    □ returns that reported household counts of seven people or more; and
    □ returns of four or more people that contained nonrelatives of the household head.
The purpose of the operation was to reduce undercounting of people in large households and nonfamily households. (Telephone follow-up was also used in the 1990 census, but, unlike 2000, the 1990 operation addressed missing characteristics as well as coverage problems and included a field follow-up effort when telephone follow-up was not successful.)
□ Unduplication of households and people: Two major, computer-based unduplication operations were carried out subsequent to field follow-up. One operation was the special effort in summer 2000 to reduce duplication of housing unit addresses in the MAF (see “Unduplication and Late Additions,” above). The other operation, which was planned from the outset, used the primary selection algorithm (PSA) to unduplicate multiple returns for the same address. The purpose of the PSA was to determine which households and people to include in the census when more than one questionnaire was returned with the same census address identification number. Such duplication could occur, for example, when a respondent mailed back a census form after the cutoff date for determining the NRFU workload and the enumerator then obtained a second form from the household. In all, 9 percent of census housing units had two returns and 0.4 percent had three or more returns. In most instances, the PSA discarded duplicate household returns or extra vacant returns; less often, the PSA found additional people to assign to a basic return or identified more than one household at an address.
□ Editing and imputation: It is standard census practice to use editing techniques to reconcile inconsistent or anomalous answers for a person or household and to use imputation routines to provide values for missing responses. In 2000, all editing and imputation were computer-based; there was no clerical editing of the questionnaires as in 1990 and past censuses. In instances when it was not possible to perform an edit that used other information for the same person or household, imputation was performed with “hot deck” methods that made use of information for other, similar people and households in the immediate neighborhood (see Box A-3 in Appendix A).
One kind of imputation involved substituting the record of another person or an entire household: 5.8 million people required such whole person imputation in 2000, amounting to 2.1 percent of the household population count. (In 1990, only 1.9 million people, or 0.8 percent of the household population, were imputed in this way.) Whole person imputations in 2000 included:
    □ cases for which there was no information about the number of people living at that address or their characteristics (0.4% of the household population);
    □ cases for which household size was known but not the characteristics of the members (0.8% of the household population); and
    □ cases for which no information was provided for the individual, although other household members had reported data (0.9% of the household population).
Editing and imputation rates for missing values for individual short-form content items, such as age, race, sex, and housing tenure, were low, ranging from 1.1 percent to 4.3 percent. (These rates exclude wholly imputed people.) In many instances, it was possible to fill in an answer from other information for the person or household, so that rates of hot-deck imputation for short-form items were lower still. Information about editing and imputation rates for long-form content items is not yet available.
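The idea behind "hot deck" imputation, borrowing a value from a similar record that is nearby in the file, can be illustrated with a small sketch. The Python below is a simplified sequential hot deck written for illustration only: it assumes the records are already sorted in geographic order, and the function name, field names, imputation class, and sample records are all hypothetical. It is not the Census Bureau's procedure, which is described in Box A-3 of Appendix A and is not reproduced here.

```python
def hot_deck_impute(records, item, class_key):
    """Fill missing values of `item` from the most recent reporting record
    in the same imputation class (a simplified sequential hot deck).

    `records` is assumed to be sorted so that neighbors in the list are
    neighbors on the ground. The input list is not modified.
    """
    last_donor = {}  # imputation class -> most recently reported value
    result = []
    for rec in records:
        rec = dict(rec)  # work on a copy so the caller's data is untouched
        cls = rec[class_key]
        if rec.get(item) is None:
            if cls in last_donor:
                rec[item] = last_donor[cls]  # borrow from the nearest prior donor
                rec["imputed"] = True
            else:
                rec["imputed"] = False       # no donor seen yet; value stays missing
        else:
            last_donor[cls] = rec[item]      # this record becomes the current donor
            rec["imputed"] = False
        result.append(rec)
    return result


# Hypothetical households in neighborhood order; None marks a missing answer.
households = [
    {"id": 1, "size_class": "small", "tenure": "owner"},
    {"id": 2, "size_class": "small", "tenure": None},
    {"id": 3, "size_class": "large", "tenure": "renter"},
    {"id": 4, "size_class": "large", "tenure": None},
    {"id": 5, "size_class": "small", "tenure": "renter"},
]

imputed = hot_deck_impute(households, item="tenure", class_key="size_class")
for rec in imputed:
    print(rec["id"], rec["tenure"], "(imputed)" if rec["imputed"] else "")
```

In this toy run, household 2 borrows "owner" from household 1 and household 4 borrows "renter" from household 3, in each case the closest earlier household in the same class. Restricting donors to the same class and to nearby records is what makes the donor "similar" and "in the immediate neighborhood" in the sense used above.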