

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 158
6 Data Products and Their Use

Obviously, an investment in data collection can only earn a return to the extent that the data are used for basic and applied research, policy analysis, and improved public information. In order for the investment in a rich, complex survey such as SIPP to earn a high return, it is imperative that the responsible agency have an active data dissemination program that includes published reports, computer-readable data products, and associated explanatory materials, all produced on a timely basis and in accessible formats.

In this chapter we discuss the requirements for an effective data dissemination program for SIPP. We cover both the types of reports that should be developed and some of the conceptual and measurement issues that arise in estimating income and program statistics from the complex information in SIPP. We also consider microdata products and review the kinds of informational and instructional materials that SIPP users (whether of computer-readable files or printed reports) need in order to make the most effective use of the survey data.

PUBLICATIONS

Regular publication series from a major, continuing survey such as SIPP serve many important purposes. Such publications, containing basic descriptive statistics plus key analytic measures (e.g., spell lengths for program participation), are a valuable reference source for the general user, and their value increases as each successive report adds to a time series.

The annual P-60 series on income and poverty from the March Current Population Survey (CPS) is a notable example: each fall's publication is eagerly awaited and immediately used by a broad community of policy analysts, researchers, and executive branch and congressional staff. Such publications also serve to orient an analyst who is using or plans to use the more detailed information contained in computer data products: they introduce the analyst to the survey, help the analyst develop fruitful study plans (e.g., the numbers may suggest hypotheses or indicate that the sample size is or is not sufficient for analysis of subgroups), and provide important control totals for the analyst to determine the accuracy of his or her computer output. The last function is particularly important for a complex survey like SIPP.

Preparation of regular publications is also vitally important to the agency that sponsors the survey. It is only by having analysts who work with the data regularly develop tabulations and analytic measures that the agency can gain first-hand, in-depth knowledge of the quality and utility of the information. The agency, of course, needs input from outside users regarding data quality and utility, but it needs its own assessment as well to plan needed improvements in the survey and to provide informed guidance to users.

For most of its household surveys, the Census Bureau is the data collection agency but not the sponsor agency and so is not directly involved with the publication program. However, for SIPP, the Census Bureau is both the sponsor and the collection agency and, consequently, has publication responsibility. It is especially important that the Census Bureau have a comprehensive publication program for SIPP because of the richness and complexity of SIPP data. Users need to be made keenly aware, through regular publications that present and explain key indicators, of both the analytical power of SIPP-based measures and the problems that may result from incomplete understanding of such measures. To date, the publication program for SIPP, while including many useful reports, has not adequately served these needs.

A Checkered History

The Census Bureau's publication program for SIPP has been very uneven, including a stretch of several years in which almost nothing was published from the core information on income and program participation (see Table 6-1 for a chronological list of SIPP report titles published through 1991). Initially, the Census Bureau, which established a new Household Economic Studies series (P-70) for SIPP, fully intended to publish a regular set of cross-sectional statistics from the core. The first SIPP report, released in September 1984, provided average monthly data on income and program participation for the third quarter of 1983 (July-September) from wave 1 of the 1984 panel. From February 1985 to January 1986, the Census Bureau published five more quarterly reports, for the fourth quarter of 1983 through the fourth quarter of 1984, in the same format, but then discontinued the series (see Figure 6-1 for the contents of the quarterly reports).

TABLE 6-1 SIPP Reports Published in P-70 Series Through 1991 by U.S. Bureau of the Census (in chronological order)

Publication Date (and Number) | Report Title | Source of Data (Wave and Panel)
Sept. 1984 (P-70-1) | Economic Characteristics of Households in the United States: Third Quarter 1983 | Wave 1, 1984
Feb. 1985 (P-70-2) | Economic Characteristics of Households in the United States: Fourth Quarter 1983 | Waves 1-2, 1984
April 1985 (P-70-3) | Economic Characteristics of Households in the United States: First Quarter 1984 | Waves 2-3, 1984
May 1985 (P-70-4) | Economic Characteristics of Households in the United States: Second Quarter 1984 | Waves 3-4, 1984
Oct. 1985 (P-70-5) | Economic Characteristics of Households in the United States: Third Quarter 1984 | Waves 3-5, 1984
Jan. 1986 (P-70-6) | Economic Characteristics of Households in the United States: Fourth Quarter 1984 | Waves 4-5, 1984
July 1986 (P-70-7) | *Household Wealth and Asset Ownership: 1984 | Wave 4, 1984
Dec. 1986 (P-70-8) | *Disability, Functional Limitations and Health Insurance Coverage: 1984-85 | Wave 3, 1984 (disability); Waves 2-9, 1984 (health insurance)
May 1987 (P-70-9) | *Who's Minding the Kids? Child Care Arrangements: Winter 1984-85 | Wave 5, 1984
Aug. 1987 (P-70-10) | *Male-Female Differences in Work Experience, Occupation, and Earnings: 1984 | Wave 3, 1984
Sept. 1987 (P-70-11) | *What's It Worth? Educational Background and Economic Status: Spring 1984 | Wave 3, 1984
Sept. 1987 (P-70-12) | *Pensions: Workers Coverage and Retirement Income: 1984 | Wave 4, 1984
Oct. 1988 (P-70-13) | *Who's Helping Out? Support Networks Among American Families | Wave 5, 1984
April 1989 (P-70-14) | Characteristics of Persons Receiving Benefits from Major Assistance Programs | 1984 panel file
Aug. 1989 (P-70-15) | Transitions in Income and Poverty Status: 1984-85 | 1984 panel file
July 1989 (P-70-16) | Spells of Job Search and Layoff . . . and Their Outcomes | 1984 panel file
March 1990 (P-70-17) | Health Insurance Coverage: 1986-88 | Waves 1-8, 1985; Waves 1-7, 1986-1987, 1985 panel file
June 1990 (P-70-18) | Transitions in Income and Poverty Status: 1985-86 | 1985 panel file
June 1990 (P-70-19) | *The Need for Personal Assistance with Everyday Activities: Recipients and Caregivers | Wave 6, 1985; Wave 3, 1986
July 1990 (P-70-20) | *Who's Minding the Kids? Child Care Arrangements: Winter 1986-87 | Wave 6, 1985; Wave 3, 1986; Wave 6, 1986; Wave 3, 1987
Oct. 1990 (P-70-21) | *What's It Worth? Educational Background and Economic Status: Spring 1987 | Wave 2, 1987
Dec. 1990 (P-70-22) | *Household Wealth and Asset Ownership: 1988 | Wave 7, 1986; Wave 4, 1987
Jan. 1991 (P-70-23) | Family Disruption and Economic Hardship: The Short-Run Picture for Children | 1984 panel file
Aug. 1991 (P-70-24) | Transitions in Income and Poverty: 1987-88 | 1987 panel file
June 1991 (P-70-25) | *Pensions: Worker Coverage and Retirement Benefits: 1987 | Wave 7, 1985; Wave 4, 1986

*Denotes reports based primarily on data from topical modules.

There were several factors behind the decision to drop the quarterly reports. First, as we discuss in Chapter 5, the Census Bureau clearly underestimated the resources and capabilities required to process the volume of SIPP data that poured in from the field. Very quickly, the data processing system began to buckle under the strain, with consequent delays in both data files and tabulations for publication. For a time, each successive panel took longer and longer to process, and, indeed, the Census Bureau did not
publish any reports, using either core or topical module information, from other than the 1984 panel until 1990. Second, Census Bureau analysts documented disturbing anomalies in the quarterly data that were hard to explain. A comparison of aggregate SIPP figures for selected income types with independent sources formed a regular (and highly useful) feature of the reports. However, some income types showed erratic patterns in comparison with the independent sources: for example, average monthly unemployment insurance benefits from SIPP were more than 100 percent of the independent source for the third and fourth quarters of 1983, but dropped to 85 and 80 percent, respectively, in the first and third quarters of 1984. Third, for most income sources, Census Bureau analysts believed that the reports showed little change from quarter to quarter and hence would not be interesting to users. The analysts also determined that sample sizes, particularly after reductions due to budget cuts, were insufficient in many cases to ascertain quarter-to-quarter changes that were significant. For all these reasons, the Census Bureau dropped the quarterly series (although the tabulations continued to be produced in-house).

For all persons
  monthly household cash income (mean, median, and distribution from under $300 to $4,000 and over) by sex crossed by race and ethnicity, metropolitan residence, region, household relationship, age, labor force status, and work disability status;
  residence in household receiving cash benefits, food stamps, and other noncash benefits for persons by characteristics (as listed above); and
  mean monthly household cash income, receipt of unemployment compensation, and household receipt of cash benefits, food stamps, and other noncash benefits by age and sex crossed by labor force status.

For persons aged 16 and over
  monthly earnings (mean, median, and distribution) by sex and full-time versus other work status crossed by race and ethnicity, age, household relationship, and current occupation.

For households
  mean monthly household cash income, and household receipt of unemployment compensation, cash benefits, food stamps, and other noncash benefits by labor force status of household crossed by household type;
  monthly household cash income (mean, median, and distribution from under $300 to $4,000 and over) by race and ethnicity of householder, metropolitan residence, region, household type, age of householder, and work disability status of householder;
  mean monthly household cash income, and household receipt of unemployment compensation, cash benefits, food stamps, and other noncash benefits by characteristics (as listed immediately above);
  household receipt of food stamps, WIC, free or reduced-price school meals, public or subsidized housing, Medicaid, Medicare, AFDC or other cash assistance, SSI, social security, veterans' benefits, and unemployment compensation by household type crossed by household receipt of food stamps, etc. (same categories as in table heading also for persons); and
  monthly household cash income (mean, median, distribution) by household receipt of earnings, property income, social security, private pensions, federal government retirement, U.S. military retirement, state or local government retirement, veterans' payments, private support payments, AFDC or other cash assistance, SSI, unemployment compensation, other income, food stamps, WIC, free or reduced-price school meals, public or subsidized housing, energy assistance, Medicaid, Medicare.

FIGURE 6-1 Contents of SIPP Quarterly Reports, Series P-70, Nos. 1-6.

From February 1986 through April 1989, the only SIPP publications were cross-sectional reports based primarily on topical modules from the 1984 panel. (The modules are generally easier to analyze than the core; for one thing, they are specific to an interview wave.) The seven reports published during this period provided interesting and often path-breaking statistics on the following topics: child care, disability and health insurance coverage, household wealth and asset ownership, educational background and economic status, pensions, sex differences in work experience and occupation and earnings, and support networks.

From April 1989 through December 1991, the Census Bureau stepped up the pace and increased the scope of SIPP publications, releasing 12 reports in the P-70 series. Four of these reports (on child care, educational background and economic status, household wealth and asset ownership, and pensions) updated previous publications, based on the 1984 panel, with data from comparable topical modules in later SIPP panels. A fifth report analyzed the topical module data on caregiving from the 1985 and 1986 panels. The other seven reports used the core data contained in longitudinal files created from all waves of a SIPP panel. Reports from the 1984 panel file included characteristics of persons receiving benefits from major assistance programs, transitions in income and poverty status for 1984-1985, spells of job search and layoff, and the effects of family disruption and economic hardship on children. Reports from the 1985 panel file included transitions in health insurance coverage (this report also used the core data from several SIPP panels to provide quarterly estimates of health insurance coverage for 1986-1988) and transitions in income and poverty status for 1985-1986 (a modified version of the 1984-1985 report). Finally, a third report in the series, on transitions in income and poverty status (for 1987-1988), used data from the 1987 panel file.

1 In addition, tabulations on home ownership from the assets and liabilities topical module in wave 4 of the 1987 panel were published in the Current Housing Reports series (Fronczek and Savage, 1991); rates of migration calculated from the 1984 panel longitudinal file were published in the Special Studies series (DeAre, 1990); and tabulations on maternity leave arrangements during the years 1961-1985 from the fertility history topical module in wave 8 of the 1984 panel and wave 4 of the 1985 panel were published in the Special Studies series (O'Connell, 1990).

Descriptive Reports

Reports on Income and Programs

The Census Bureau's publication plans for SIPP (see Bureau of the Census, 1991a) include a phased-in development of regular reports from the core data on income and program participation. For the first time since the quarterly reports were discontinued, cross-sectional as well as longitudinal measures will be published on these topics. Both cross-sectional and longitudinal data from the 1987 panel will be included in a report on major assistance programs that is scheduled for release in 1992. Updated cross-sectional statistics will be published in 1993 on income, poverty status, and programs, followed by publication in 1994 of updated longitudinal statistics on transitions in income, poverty, and program participation from the 1990 panel.2 Thereafter, cross-sectional and longitudinal reports will alternate yearly.

In addition, the Census Bureau plans to prepare a major report that compares annual income and poverty data from the 1990 SIPP panel with data from the March 1991 CPS. There are also plans to incorporate some SIPP-based tabulations into the P-60 report series from the CPS. We applaud the initiatives by the Census Bureau's Housing and Household Economic Statistics Division (HHES) to develop a regular, comprehensive program of publications from the core SIPP data.

One change that we urge in the overall publication plan relates to the role of SIPP vis-a-vis the March CPS. As we recommend in Chapter 3, the long-range goal should be for SIPP to become the centerpiece of the nation's income statistics. Hence, we urge HHES to reconsider its stated intention that the March CPS remain the primary source for annual income and poverty estimates and to work instead towards a more prominent role for SIPP. Specifically, HHES should consider a publication schedule for SIPP, once the new design is phased in, of releasing cross-sectional statistics every year instead of in alternate years (with longitudinal statistics being released every 2 years). Ultimately, HHES should look to scale back the extent of detail in the P-60 series from the March CPS as users become accustomed to the SIPP data and a sufficient time series is built up from SIPP to support trend analysis. In the interim, we are very supportive of HHES's plans to assess the comparability of SIPP and March CPS estimates and to develop SIPP-based tabulations for inclusion in the P-60 series as an immediate way to make more use of SIPP and alert users to the additional detail that it provides.

2 Updated longitudinal information is not scheduled for publication earlier than 1994: the 1988 and 1989 panels were truncated to six and three waves, respectively, and hence do not provide sufficient periods of observation, and the complete longitudinal file from the 1990 panel will not be available until late 1993.

Finally, we believe it would be useful for the Census Bureau to release an historical report containing the tabulations of average monthly income by quarter that have been produced from SIPP on a regular basis, although not published since the fourth quarter of 1984. Such a report would enable analysts to become more familiar with the SIPP core information. This report, and others from the SIPP core, should include appendix material of the type that was included in the original quarterly reports on the quality of the data (e.g., information on item nonresponse rates and comparison of SIPP aggregates with independent sources).

Research Reports

In addition to regular publications that provide tabulations and other statistics from SIPP, we urge the Census Bureau to issue a research report series of special analytical studies on topics related to income and program participation. Special studies could cover both substantive and methodological subjects (such as an analysis of trends in income and poverty status for particular subgroups of the population and an investigation of new methods of estimating duration of spells of program participation) and would go well beyond the level of analysis provided in the descriptive reports. Such studies would of course draw heavily on SIPP but should also include relevant data, as appropriate, from such sources as the March CPS income supplement, other surveys, and administrative records.

The importance of having a strong analysis program in the core subjects of SIPP at the Census Bureau stems from the agency's role as the sponsor agency for both SIPP and the March CPS income supplement and the fact that there is no other center for income statistics. The Census Bureau's program should be at least as strong as the analysis program for labor force topics in the Bureau of Labor Statistics (BLS). Indeed, BLS's Monthly Labor Review offers a possible model for the income and program research report series (although the Census Bureau's series could be quarterly or semiannual).

The Census Bureau needs to have a strong analytical capability, not only to serve as a beacon and source of information for the user community, but also for its own purposes (as we note above). Such a capability should put the Bureau in a better position to understand user data requirements, to assess and improve the survey data, and periodically to evaluate and improve the basic descriptive report series. We understand that Statistics Canada gives strong support to in-house analysis programs, publishing special studies in Perspectives on Labour Force and Income. Indeed, the Canadian statistical agency made a deliberate effort in the 1980s to improve its analytical capability.

We point out that the SIPP Working Paper series includes special studies by Census Bureau staff and outside analysts of the type that we have in mind (see the section below on user information and training). This is an important and useful series to continue, but we believe that a regularly published research report series is also needed to provide a more visible outlet and a strong motivation for in-depth substantive analysis as well as methodological investigations on the part of the Census Bureau's income and program staff. (Most of the Census Bureau contributions to the SIPP Working Paper series to date focus on survey research and methodological issues rather than analysis methods or substantive research findings.) In addition, the Census Bureau should encourage the analysis staff to submit articles for publication in professional journals.

Reports on Demographic and Employment Transitions

SIPP is a rich source of information on a wide range of topics other than income and program participation. As we have noted, the Census Bureau has prepared a number of interesting and valuable publications from various SIPP topical modules. We wholeheartedly support continuation of such series; see Figure 6-2 for the schedule of topical module (and core) reports planned for 1992.

We also urge the Census Bureau analysis staff from both the Population and HHES Divisions to give attention to a somewhat neglected aspect of SIPP: its potential for analyzing the dynamics of family and household composition over time and the correlates and consequences of key demographic and employment events (such as marriage, job loss, or retirement). Some reports have been published or planned in this area (see Table 6-1 and Figure 6-2), and, in addition, the HHES publications on income and program participation include some statistics on related demographic and employment transitions. However, we believe that much more should be done.
We envision a series of publications that would focus on demographic and employment events, for example, comparing the economic situation before and after marriage or divorce or widowhood for all people experiencing each type of marital status change.3 The series would include the following:

  summary annual reports on a wide range of demographic and employment transitions, including marital status change, family composition change involving children, change in residence, labor force status change, and job change; these reports would provide counts and basic characteristics, such as the age and sex of those people experiencing a particular type of change; and

  a special report each year that would provide an in-depth analysis of one or two particular types of events and their antecedents and consequences; a rotating schedule could be established, with reports on employment topics alternating with reports on demographic topics.

Recent papers from the PSID by Burkhauser and Duncan (1988) and from the 1984 SIPP panel by David and Flory (1988) and Ruggles and Williams (1987) provide examples of the types of analysis that the detailed special reports could include.

3 The report series on income and poverty transitions will look at the related but different issue of how many people entering a program or falling into poverty also experienced a change in marital status.

Characteristics of Recipients and the Dynamics of Program Participation: 1987-88
Extended Measures of Well-Being: Selected Data from the 1984 Survey of Income and Program Participation (No. 26)
Health Insurance Coverage: 1987 to 1990 (No. 29)
Job Creation During the Late 1980s: Dynamic Aspects of Employment Growth (No. 27)
What's It Worth? Earnings Data: 1990
Who's Helping Out? Support Networks Among American Families: 1988 (No. 28)
Who's Minding the Kids? Child Care Arrangements: Fall 1988

NOTE: The P-70 series number is given for reports that have been released as of summer 1992; the other reports are scheduled for publication by December. In addition to the reports shown above, a report based on SIPP data is scheduled for release in the P-23 series by December 1992: When Households Continue, Discontinue, and Form.

FIGURE 6-2 SIPP Reports in P-70 Series Published in 1992 (in alphabetical order)

Recommendations

A strong publication program on income, program participation, and related topics is an essential component of the Census Bureau's responsibilities for SIPP. The program should include several types of descriptive and analytical report series that provide basic information and more in-depth analysis from the survey.
Recommendation 6-1: The Census Bureau should move forward with its plans for regular, comprehensive series of descriptive reports on income, programs, and related topics from the core data in SIPP. Longitudinal statistics (e.g., on the dynamics and correlates of transitions in income, poverty, and program status) should be published; cross-sectional statistics should also be issued on a frequent schedule.

The Census Bureau should also establish a research report series to include in-depth analytical and methodological studies of special topics related to income and program participation. Data sources for these studies could include, in addition to SIPP, the March CPS income supplement and other surveys and administrative records.

The Census Bureau should continue publications from the SIPP topical modules and also establish a regular series of summary and in-depth reports from SIPP on the dynamics and correlates of major demographic and employment transitions (e.g., marriage, retirement).

The program outlined above is both important and ambitious. Our major concern is that the Census Bureau may underestimate the level of resources and capabilities required to carry it out. There are many complex technical issues involved in developing appropriate and policy-relevant statistics from SIPP, particularly those based on the longitudinal data (see discussion in the next section). Development of useful series from SIPP also requires extensive analysis to understand the quality of the data and their comparability with other widely used data sets, such as the March CPS. In addition, major work will be needed to develop the capacity to estimate from SIPP such statistics as after-tax income and appropriate values for in-kind benefits. Hence, staff and resources need to be sufficient not only to produce publications, but also to support an ongoing program of research and development to identify and implement improved methods of analyzing and presenting statistics from SIPP.
In this regard, it is critically important for the Census Bureau to invest in the skills and knowledge of the SIPP analysis staff so that they are up to date with relevant policy issues and analytical methods. There are many avenues to accomplish this goal. The Census Bureau already has experience with several approaches: organizing in-house seminar programs and sessions at professional meetings for both staff and outside analysts to present findings and discuss analysis issues; commissioning experts, through joint statistical agreements or other means, to conduct research on specific analytical issues (e.g., longitudinal weights); and making use of the American Statistical Association (ASA)/Census fellowship program to bring researchers on-site to work with SIPP data and share their experiences. We urge the Bureau to sustain these efforts for SIPP and to focus them more directly on the needs of the analysis staff. In addition, the Bureau should provide

OCR for page 158
DATA PRODUCTS AND THEIR USE 191 had to slot the information from waves 3 and 4 for this rotation group into the reference months covered by waves 2 and 3 for the other three groups).36 The SIPP ACCESS system developed at the University of Wisconsin, with National Science Foundation funding, put the hierarchical files into the INGRES relational database management system and provided marry ancil- lary services to assist users. The SIPP ACCESS staff estimated that the system provided support for about two-thirds of the research on SIPP con- ducted by academic social scientists outside the Census Bureau during 1985- 1990, when the facility was in operation (David and Robbin, 1992:i). How- ever, users of SIPP ACCESS also found that various aspects of- the SIPP design made the data hard to understand and work with, and, indeed, SIPP quickly gained a reputation for complexity that detected at least some po- tential users from exploring the utility of the information for their needs (Committee on National Statistics, 1989:52~.37 One problem confronting early users of SIPP was alleviated when the Census Bureau developed fully linked longitudinal public-use files. Ini- tially, a 12-month file was developed from the 1984 panel, followed by a 32-month panel file. The longitudinal files include only information from the core questionnaire; users must merge topical module data from separate files Working with input from users (see, e.g., Smith, 1989), the Census Bureau recently redesigned the format of the wave files beginning with the 1990 panel-to include "person-month" records, that is, a record for each month for which a person has data from either a self or proxy inter- view or by means of imputation (e.g., the Type Z people not interviewed in an otherwise cooperating household). This format reduces waste space because records are omitted for months for which a person has no data. 
Also, records can readily be aggregated in a variety of ways (for example, to produce estimates for all people for a calendar month or estimates for all months of a wave, or to create new family or household variables to attribute to persons) without confusion about which records to include (see McMillen, 1990). Most users expect that the person-month format will be significantly easier to understand and use.38

The Census Bureau has also made other improvements to the microdata files in consultation with users: for example, standardizing the codes used to indicate that the respondent was not in the universe for a particular item and hence not asked the question. This improvement is important because of the highly complex skip patterns in the SIPP questionnaire and hence the need to distinguish carefully between a not-in-universe situation and a true zero or negative response (e.g., to a question about income amounts).

Another recent innovation with regard to SIPP microdata products is that Census Bureau staff have developed an on-line system (SIPP On Call - Data Extraction System) for users to specify and receive extracts (as SAS or plain ASCII files) from the public-use SIPP microdata over dial-up telecommunication lines (Bureau of the Census, 1991c).39 At present, the capabilities of the system are limited; for example, there is no facility to extract records based on the value of a continuous variable such as income or to use a recode of one or more variables in the record retrieval specifications.40

36 This feature of the design, which was intended to better align certain topical module questions with a particular time of the year for all four rotation groups, was dropped at user insistence beginning with the 1987 panel. (See Committee on National Statistics [1989:Table 2-1] for a listing of the interviews received by each rotation group in the 1984-1986 panels.)
37 The CNSTAT report noted that a good deal of the complexity of the SIPP data reflects the real world and is not something that the Census Bureau should attempt to simplify. However, the report urged that unnecessary complexities in the survey design and file structures be reduced.
38 However, no file format for such a complex survey as SIPP will serve all users equally well. The Census Bureau will need to evaluate the successes and problems that users experience with the person-month format and consider ways to alleviate problems that arise. For example, it may be possible for the Census Bureau to provide illustrative SAS code that would help users with particular kinds of applications.
39 The SIPP ACCESS system was transferred from the University of Wisconsin to the Census Bureau in mid-1990, and Bureau staff worked to make it operable at the Bureau for access to the 1984 and 1985 panels; however, the staff decided to develop SIPP On Call instead for access to the person-month files from the 1990 and subsequent panels. SIPP On Call also includes an electronic mail feature for users of the system to communicate with each other and the Census Bureau.
40 For a recent evaluation of SIPP On Call prepared for the Food and Nutrition Service (which provided funding to help develop the system), see Doyle and Cohen (1992).

Priorities for Improvements

Because of the critical importance of microdata for research, microsimulation, and other types of policy analysis with SIPP, we urge the Census Bureau to continue to seek ways to improve the timeliness, format, content, and other aspects of SIPP microdata products.

Timing

Although the Census Bureau has made commendable progress in improving delivery schedules for SIPP microdata files, we believe that further improvements in timeliness are both necessary and feasible. In order to achieve the goal of SIPP's serving as the nation's main source of income statistics, core data files must be available from SIPP on about as timely a basis as they are from the March CPS income supplement, currently about 6 months

after data collection. The SIPP rotation group structure and the length and complexity of the SIPP questionnaire have made it difficult to contemplate releasing SIPP files on the same schedule as files from the March CPS. However, we believe that the implementation of computer-assisted personal interviewing (CAPI) and database management technology for SIPP should make it possible to move toward, and achieve, that goal.

Kinds of Files

Currently, the Census Bureau releases wave and panel files from each panel of SIPP (separate wave files for core and topical module information). The panel files are in the rectangular format, and we encourage the Census Bureau to consider converting them to the person-month format of the wave files. (The space-saving features of the person-month format would be particularly valuable for the lengthy panel files.)

We also urge the Bureau to release calendar-year files that contain data from both panels that are in the field at the same time. Although the ability to combine panels was originally viewed as an important feature of SIPP, the Census Bureau has, to date, approached the processing of each SIPP panel as a completely separate operation. The delays in releasing files from the early panels meant that users had to wait for very long periods to be able to combine, say, wave 6 of the 1984 panel with wave 3 of the 1985 panel. At present, the Census Bureau's data delivery schedule for SIPP specifies approximately simultaneous release of wave files from panels that are in the field at the same time (e.g., the core files for wave 8 of the 1990 panel and wave 5 of the 1991 panel are both targeted for release in April 1993). Hence, users can readily develop wave files that combine panels. We propose that the Bureau take the next step of preparing calendar-year files from combined panels, as such files are likely to prove very useful for many research and policy analysis purposes (e.g., for use in microsimulation models of tax and transfer programs).

Content and Coding

We encourage the Census Bureau to continue working with user groups (see discussion of advisory mechanisms in Chapter 8) to identify and implement changes to the content and format of the SIPP microdata files in order to enhance their utility and ease of use. For example, users may identify recoded variables that it would be helpful for the Bureau to create rather than leaving them to the user.

We note that adding variables can add processing costs and result in records that are difficult for users to work with. However, in some instances the Census Bureau's efforts to keep the records in the SIPP data

files to a manageable length may have gone too far. Thus, the person-month format excludes some variables that were available in the rectangular format (e.g., it is no longer possible to determine coverage by more than one health insurance plan in the person-month files). Also, some program-related variables for which the Food and Nutrition Service provided funding to include on the initial 12-month longitudinal file from the 1984 panel were never adopted for the 32-month panel files. We urge the Census Bureau to consult with users about the benefits of reinstating these variables.

We also encourage the Census Bureau to work closely with users to further improve the information in the data files about missing values. As a general policy, the Bureau prefers to provide users with complete records for every respondent in a survey by supplying values for missing items through some type of imputation procedure. Complete records have the advantage of maximizing the sample size for analysis, as cases do not have to be discarded because of missing information. Also, there is the advantage that the Bureau can implement imputations in a consistent manner (individual users may vary in the sophistication and care with which they supply values for missing items). In a separate set of fields, the Bureau generally provides yes-no indicators of whether an individual item was reported or imputed. Because the imputations performed by the Census Bureau may have disadvantages for certain analyses (e.g., see the discussion in Chapter 3 of problems with the imputation of income and assets for program recipients) and because the imputation rates can be quite large for some variables, it is important for users to have as much information as possible about them.
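As a rough illustration of how yes-no imputation indicators of this kind can be used, the sketch below computes the imputation rate for each item and restricts an analysis to reported values. The record layout and the "_imputed" naming are assumptions made for the example, not the layout of the actual SIPP files.

```python
# Hypothetical records: each item has a value plus a yes/no flag that is
# True when the value was imputed rather than reported by the respondent.
RECORDS = [
    {"income": 1200, "income_imputed": False, "assets": 500, "assets_imputed": True},
    {"income": 950, "income_imputed": True, "assets": 0, "assets_imputed": False},
    {"income": 1800, "income_imputed": False, "assets": 2500, "assets_imputed": True},
    {"income": 700, "income_imputed": False, "assets": 100, "assets_imputed": False},
]


def imputation_rates(rows, items):
    """Return, for each item, the share of records whose value was imputed."""
    rates = {}
    for item in items:
        flagged = sum(1 for row in rows if row[item + "_imputed"])
        rates[item] = flagged / len(rows)
    return rates


def reported_only(rows, item):
    """Keep only the records in which the item was reported, not imputed."""
    return [row for row in rows if not row[item + "_imputed"]]


print(imputation_rates(RECORDS, ["income", "assets"]))
print(len(reported_only(RECORDS, "assets")))
```

A two-valued flag supports only this reported-versus-imputed split; it cannot separate the different reasons an item might be missing.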
We believe it would help users to assess the quality of the imputations from the perspective of particular analyses if the imputation flags contained information about the reason for an imputation, that is, whether the respondent refused to answer a question or did not know the answer.

Delivery Media

We encourage the Census Bureau to explore alternative media for delivery of SIPP microdata to users that can make easier the process of obtaining extracts for analysis. We note that the days of 9-track magnetic tape as a medium for data dissemination are numbered, as more and more researchers are turning away from cumbersome mainframe systems to work with micro- or minicomputer hardware and software that use some type of direct access disk media for file storage and input and output.

The familiar floppy diskettes are much too small to serve as a file storage medium for a survey as large as SIPP. However, high-storage capacity CD-ROM (compact disk-read only memory) technology is rapidly

gaining popularity for large data sets. Users of the National Longitudinal Surveys of Labor Market Experience (NLS) currently choose CD-ROM over tape by about four to one. The Census Bureau is now releasing CD-ROM versions of the public-use data sets from the 1990 census. CD-ROM with suitable extraction software could well be a useful access medium for SIPP (although further improvements in microcomputing hardware may be necessary before CD-ROM becomes sufficiently fast and easy to access for such a large data set as SIPP).

We also encourage the Census Bureau to further develop its on-line extraction system (SIPP On Call), which could save users the time and expense of acquiring and archiving complete SIPP files when they only require a subset of the data. To be most useful, the Bureau needs to add sophisticated retrieval capabilities to the system. In addition, if SIPP On Call is to be an effective means for users to work with the large volume of SIPP data, the Bureau needs to provide access to the system over high-speed communications lines, for example, those provided by Internet. Moving large amounts of data over regular telephone lines is tedious and costly.

Finally, we note the importance for the effective use of CD-ROM and on-line technology of having full documentation, including frequencies for each variable, integrated with the actual data (see discussion below and in Chapter 5).

Recommendation

Recommendation 6-3: The Census Bureau should continue to develop improved microdata products from SIPP to support policy analysis and social science research.
Priority improvements include:

moving toward a goal of releasing core data files within 6 months after the end of data collection;
producing calendar-year files that combine panels, in addition to wave and panel files;
determining, in consultation with users, changes and additions to the file contents that would assist their analyses; and
developing additional ways of delivering SIPP microdata products to users, such as by means of high-storage capacity compact disks (CD-ROM) and an improved on-line data extraction system.

DOCUMENTATION AND SERVICES FOR USERS

For effective use of large, complex data sets, users need not only the data, but also what has been termed the metadata, that is, information that enables the user to access, understand, and analyze the data appropriately (see David, 1991; David and Robbin, 1989, 1990). Users of computer-readable products most obviously need basic documentation that enables them to instruct a computer program how to "read" the data on the magnetic tape or other medium. In addition, users of computer products need information to help them understand the quality and meaning of the data. Users of printed publications require such information as well.

The larger and richer the data set, the more extensive must be the accompanying documentation; also, the greater the need for ancillary services, such as training sessions, working papers, and other means of reaching and educating users about the potentials and pitfalls of the data and data products. An investment in documentation and related services is amply justified in that it minimizes wasted time and resources and increases the return to users from their processing and analysis efforts. Good documentation makes a vital contribution to the development of a strong and growing community of users for a survey like SIPP.

Documentation and Related Services to Date

Microdata Documentation

From the beginning of SIPP, each microdata file has been accompanied by a codebook providing basic information on the file structure and tape location and content of each variable. Codebooks are available in printed form and as machine-readable files attached to the data files. A SIPP Users' Guide containing additional explanatory information for users about SIPP and its microdata products was initiated at the start of the survey but took several years to prepare; the first edition was released in 1987 (Bureau of the Census, 1987).
The guide included chapters on survey design, survey content, structure of the cross-sectional public-use microdata files, use of cross-sectional files for estimation and analysis, linking waves, and assessing the reliability of SIPP data. A second edition that added a chapter about SIPP cross-sectional weighting procedures and appendix material about the 1990 panel and the new person-month format was released in late 1991 (Bureau of the Census, 1991e).

The initial documentation did not include frequencies for the variables; at the behest of users, the Census Bureau contracted in 1989 to have frequencies prepared for each file and made available on diskettes, with a subset of key control counts provided in printed form. Such frequencies, which indicate the distribution of responses to each item, are invaluable tools for users in making initial decisions about variables and population subgroups to analyze, hypotheses to explore, and analytical methods to use.

To inform data file users about problems with the files or documentation, the Census Bureau has a SIPP User Notes series that is sent to all file purchasers and can also be obtained on request. Notice of the user notes is contained in a supplement to the newsletter of the Association of Public Data Users (APDU) that is mailed to a large list of people who have inquired of the Census Bureau about SIPP.41 Finally, the SIPP Quality Profile (Jabine, King, and Petroni, 1990) is a very valuable tool for informing data file users about the quality of the survey information.

Documentation for Printed Reports

Each SIPP publication includes appendix material that describes the survey, defines key terms, indicates how to make approximate calculations of sampling errors of the estimates, and briefly reviews other sources of nonsampling error (e.g., underreporting). Also, reference is generally made to the additional detailed information on sampling and nonsampling errors provided by the SIPP Quality Profile.

Other User Services

The Census Bureau regularly publishes What's Available from SIPP (e.g., Bureau of the Census, 1991g), a highly useful basic reference source that lists the publications, data files, and working papers from the survey. For a number of years, the Census Bureau supported a vigorous, proactive program to educate and inform users about SIPP and to keep users aware of others who were analyzing the data. This program included training sessions offered as part of the summer program of the Inter-university Consortium for Political and Social Research at the University of Michigan and as workshops in conjunction with many professional association meetings.
In addition, the Bureau published the SIPP Working Paper series, which, by the end of 1990, totaled 140 substantive and methodological papers by analysts both inside and outside the Census Bureau (Bureau of the Census, 1991g).42 Bureau staff also organized sessions that featured SIPP research at professional association meetings and published compilations of SIPP-related papers that were presented at meetings of the ASA. Census Bureau staff further encouraged SIPP researchers to apply for ASA/Census fellowships to use the data files on-site at the Bureau. In addition, Bureau staff regularly appeared at the monthly meetings of SIPP analysts in the Washington, D.C., area and were available to meet with groups of users in other locations on request.

41 Arrangements for the APDU SIPP supplement and for an APDU SIPP committee to consult with the Census Bureau about the SIPP data products and documentation were made in early 1989.
42 For this series, Census Bureau staff identify papers using SIPP data that have been prepared for professional meetings or initiated in draft, solicit their inclusion in the series, have them reviewed by one or two others, and edit them and prepare reproducible copy.

All of these were valuable activities that enabled users and potential users to become informed about SIPP and keep abreast of what others were learning from the data.

About 2 years ago, the loss of SIPP staff who had been most active in this program led to a suspension or reduction of many of these services. Recently, with the appointment of a SIPP liaison in the HHES Division (see Chapter 8), activities such as releasing new titles in the SIPP Working Paper series and providing workshops about SIPP at professional association meetings have started up again. However, the level of activity has not yet reached that of the earlier years.

Recommendation

We urge the Census Bureau to continue regular consultations with users about needed kinds of documentation and other informational and instructional materials. We cannot stress enough the importance of having comprehensive, accurate, and intelligible documentation and related services to interest users in the potential of SIPP data and to enable them to make the most cost-effective use of the data. We see a number of areas in which improvements to the current documentation, information, and training package for SIPP would be useful.

First, it is vital, as part of the implementation of the proposed redesign of SIPP, to make use of CAPI and database management system technology to fully integrate the microdata file documentation with the actual data. Such integration should enable immediate calculation of frequencies for variables and inclusion of the frequencies in the printed and machine-readable forms of the codebook.
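A minimal sketch of what such integration could look like follows. It assumes a machine-readable codebook that maps each variable to a field position in a fixed-width record; the variable names, positions, and records are hypothetical and do not describe the real SIPP file layout or any Census Bureau system.

```python
from collections import Counter

# Hypothetical codebook: variable name -> (start, length), 0-based positions.
CODEBOOK = {
    "sex": (0, 1),
    "recipient": (1, 1),
    "age_group": (2, 2),
}

# Hypothetical fixed-width data records, one string per respondent.
DATA = ["1103", "2003", "2112", "1003"]


def check_layout(codebook, lines):
    """Return the variables whose documented field runs past the shortest record."""
    shortest = min(len(line) for line in lines)
    return [
        name
        for name, (start, length) in codebook.items()
        if start + length > shortest
    ]


def frequencies(codebook, lines):
    """Tabulate the distribution of codes for every variable in the codebook."""
    return {
        name: Counter(line[start:start + length] for line in lines)
        for name, (start, length) in codebook.items()
    }


print(check_layout(CODEBOOK, DATA))
for name, counts in frequencies(CODEBOOK, DATA).items():
    print(name, dict(counts))
```

Because the same codebook drives both the layout check and the tabulation, the frequencies are computed from exactly the field positions that the documentation describes, and a position that falls outside the records is reported rather than silently misread.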
Integration should also reduce the likelihood of errors in the documentation, such as field positions not matching the actual positions in the file, and make it possible to improve the description of skip patterns in the questionnaire that define the universe of respondents for particular items.

With regard to the microdata documentation, we note that an adequate description has never been developed for one important aspect of the SIPP processing system that affects the quality of a significant portion of the data: the procedures used to impute values for missing items.43

43 Documentation has also never been provided for the elaborate routines that are used to edit inconsistent replies or, in some cases, to supply values for missing responses by means of an edit rather than imputation. These routines, which are highly specific to individual variables, present a daunting documentation task. A benefit from the use of CAPI technology for SIPP should be that inconsistent replies are either resolved in the field or accepted, thus minimizing the need for after-the-fact editing.

Although it may be infeasible and indeed unnecessary to provide detailed imputation

specifications for each variable, it should certainly be possible to describe the procedures used for various classes of variables and to provide illustrative information on the effects of imputations (e.g., before-and-after distributions for selected variables). In addition, documentation is needed for variables in the data files that result from a process of recoding other variables.44 Again, integration of the data and documentation in a database management system may well facilitate the development of useful descriptions of imputation procedures as well as documentation of recoded variables.

It is also important to develop means to more frequently update documents such as the SIPP Users' Guide that provide important contextual information. The material in a well-formulated, comprehensive guide can be invaluable in orienting users to the data and alerting them to processing and analytical pitfalls. The limited background information carried in the codebook or documentation of individual variables and codes is not sufficient for these needs. Only two editions of the SIPP Users' Guide have been issued to date, even though SIPP has gone through many changes since 1983, and not all of those changes are well reflected in the latest edition (Bureau of the Census, 1991e). Most notably, there is little information provided about the longitudinal panel files, although they represent a widely used, and complex, data product from SIPP. This deficiency needs to be remedied.

Similarly, we encourage the Census Bureau to evaluate and determine ways to enhance the text in SIPP reports that is intended to educate and warn readers about the data contents. Cross-references to such other documents as the SIPP Users' Guide and the SIPP Quality Profile are helpful, but many users will not seek out those references; hence, it is important to provide as much pertinent information in the report itself as possible.
We have commented on the valuable nature of the various ancillary informational and instructional materials (e.g., working papers, compilations of professional association papers) and training programs that were developed for SIPP. We urge the Census Bureau to restore and enhance these programs to serve the growing community of SIPP users. The published research report series that we recommend above will also play a valuable role in this regard.45 Preparation of a complete on-line bibliography that includes relevant Census Bureau staff memoranda that would not otherwise be known to most users is also an idea to consider.46

44 Work is in progress under a joint statistical agreement between the Census Bureau and the University of Michigan to develop documentation for the longitudinal imputations in the SIPP panel files, and the Census Bureau expects to make arrangements with another organization to obtain documentation for the cross-sectional imputations and edits. Also, work is in progress by Social and Scientific Systems, Inc., under contract to the Census Bureau, to develop documentation for recoded variables.
45 The research report series will not, at least until it is well established, substitute for the SIPP Working Paper series, which makes available the work in progress of outside analysts as well as Census Bureau staff.

Finally, we believe it is important for the Census Bureau to take steps to ensure that there are effective channels for individual users to communicate both problems and suggestions for SIPP data products and documentation and to obtain timely feedback from Bureau staff (see Chapter 8 for a discussion of more formal advisory mechanisms). Because of the decentralized system of operations at the Census Bureau for SIPP and other surveys, it has not always been clear to users which staff members to consult about problems and suggestions. Even when a responsive staff member has been reached, it has not always been clear that there is an effective, timely system of internal communications within the Census Bureau to ensure that all relevant staff members (such as those in data processing and data user services) are informed and able to take appropriate action. Nor is it always clear that there are effective means of informing the user, or other users, of the reasons for the problem and the nature of the proposed solution or of the response to a suggestion.

The recent establishment of a SIPP liaison position in HHES is helpful in this regard, as is the use of the APDU SIPP Supplement as a vehicle to reach users. We urge the Census Bureau to keep a vigilant eye on its user-staff communication channels and act promptly to keep them functioning in an open and timely manner. The upcoming redesign of SIPP, which will entail changes in data products and documentation, makes it all the more important to have good means of communication with individual users and the user community as a whole.

Recommendation 6-4: The Census Bureau should work to improve documentation and related user information services for SIPP.
Priority improvements include:

making use of CAPI and database management system technology to fully integrate documentation (including frequency counts for variables) and data;
developing documentation for recoded variables and the types of imputations that are performed for missing data in SIPP;
developing means to update key explanatory documents, such as the SIPP Users' Guide, on a more frequent basis;
restoring and expanding information and training programs, such as training sessions, working papers, and compilations of professional society presentations; and

46 The SIPP ACCESS project developed such an on-line bibliography of SIPP working papers, presentations, and memoranda, which could serve as a model.

maintaining effective channels of communication for users to feed back problems and suggestions and learn of the Bureau's response, and for users to be informed of new developments in the survey and its data products.