6
Data Products and Their Use
Obviously, an investment in data collection can only earn a return to the
extent that the data are used for basic and applied research, policy analysis,
and improved public information. In order for the investment in a rich,
complex survey such as SIPP to earn a high return, it is imperative that the
responsible agency have an active data dissemination program that includes
published reports, computer-readable data products, and associated explana-
tory materials all produced on a timely basis and in accessible formats.
In this chapter we discuss the requirements for an effective data dis-
semination program for SIPP. We cover both the types of reports that
should be developed and some of the conceptual and measurement issues
that arise in estimating income and program statistics from the complex
information in SIPP. We also consider microdata products and review the
kinds of informational and instructional materials that SIPP users (whether
of computer-readable files or printed reports) need in order to make the
most effective use of the survey data.
PUBLICATIONS
Regular publication series from a major, continuing survey such as SIPP
serve many important purposes. Such publications, containing basic descriptive
statistics plus key analytic measures (e.g., spell lengths for program
participation), are a valuable reference source for the general user,
and their value increases as each successive report adds to a time series.
The annual P-60 series on income and poverty from the March Current
Population Survey (CPS) is a notable example: each fall's publication is
eagerly awaited and immediately used by a broad community of policy
analysts, researchers, and executive branch and congressional staff. Such
publications also serve to orient an analyst who is using or plans to use the
more detailed information contained in computer data products: they intro-
duce the analyst to the survey, help the analyst develop fruitful study plans
(e.g., the numbers may suggest hypotheses or indicate that the sample size
is or is not sufficient for analysis of subgroups), and provide important
control totals for the analyst to determine the accuracy of his or her com-
puter output. The last function is particularly important for a complex
survey like SIPP.
Preparation of regular publications is also vitally important to the agency
that sponsors the survey. It is only by having analysts who work with the
data regularly develop tabulations and analytic measures that the agency
can gain first-hand, in-depth knowledge of the quality and utility of the
information. The agency, of course, needs input from outside users regard-
ing data quality and utility, but it needs its own assessment as well to plan
needed improvements in the survey and to provide informed guidance to
users.
For most of its household surveys, the Census Bureau is the data collec-
tion agency but not the sponsor agency and so is not directly involved with
the publication program. However, for SIPP, the Census Bureau is both the
sponsor and the collection agency and, consequently, has publication re-
sponsibility. It is especially important that the Census Bureau have a com-
prehensive publication program for SIPP because of the richness and com-
plexity of SIPP data. Users need to be made keenly aware, through regular
publications that present and explain key indicators, of both the analytical
power of SIPP-based measures and the problems that may result from in-
complete understanding of such measures. To date, the publication program
for SIPP, while including many useful reports, has not adequately served
these needs.
A Checkered History
The Census Bureau's publication program for SIPP has been very uneven,
including a stretch of several years in which almost nothing was published
from the core information on income and program participation (see Table
6-1 for a chronological list of SIPP report titles published through 1991).
Initially, the Census Bureau, which established a new Household Economic
Studies series (P-70) for SIPP, fully intended to publish a regular set of
cross-sectional statistics from the core. The first SIPP report, released in
September 1984, provided average monthly data on income and program
participation for the third quarter of 1983 (July-September) from wave 1 of
the 1984 panel. From February 1985 to January 1986, the Census Bureau
published five more quarterly reports (for the fourth quarter of 1983 through
the fourth quarter of 1984) in the same format, but then discontinued the
series (see Figure 6-1 for the contents of the quarterly reports).

THE SURVEY OF INCOME AND PROGRAM PARTICIPATION
TABLE 6-1 SIPP Reports Published in P-70 Series Through 1991 by
U.S. Bureau of the Census (in chronological order)
Publication
Date (and                                                         Source of Data:
Number)       Report Title                                        Wave and Panel
Sept. 1984    Economic Characteristics of Households in           Wave 1, 1984
(P-70-1)      the United States: Third Quarter 1983
Feb. 1985     Economic Characteristics of Households in           Waves 1-2, 1984
(P-70-2)      the United States: Fourth Quarter 1983
April 1985    Economic Characteristics of Households in           Waves 2-3, 1984
(P-70-3)      the United States: First Quarter 1984
May 1985      Economic Characteristics of Households in           Waves 3-4, 1984
(P-70-4)      the United States: Second Quarter 1984
Oct. 1985     Economic Characteristics of Households in           Waves 3-5, 1984
(P-70-5)      the United States: Third Quarter 1984
Jan. 1986     Economic Characteristics of Households in           Waves 4-5, 1984
(P-70-6)      the United States: Fourth Quarter 1984
July 1986     *Household Wealth and Asset Ownership: 1984         Wave 4, 1984
(P-70-7)
Dec. 1986     *Disability, Functional Limitations and             Wave 3, 1984 (disability)
(P-70-8)      Health Insurance Coverage: 1984-85                  Waves 2-9, 1984 (health insurance)
May 1987      *Who's Minding the Kids? Child Care                 Wave 5, 1984
(P-70-9)      Arrangements: Winter 1984-85
Aug. 1987     *Male-Female Differences in Work Experience,        Wave 3, 1984
(P-70-10)     Occupation, and Earnings: 1984
Sept. 1987    *What's It Worth? Educational Background and        Wave 3, 1984
(P-70-11)     Economic Status: Spring 1984
Sept. 1987    *Pensions: Workers Coverage and                     Wave 4, 1984
(P-70-12)     Retirement Income: 1984
Oct. 1988     *Who's Helping Out? Support Networks Among          Wave 5, 1984
(P-70-13)     American Families
April 1989    Characteristics of Persons Receiving Benefits       1984 panel file
(P-70-14)     from Major Assistance Programs
Aug. 1989     Transitions in Income and Poverty Status:           1984 panel file
(P-70-15)     1984-85
July 1989     Spells of Job Search and Layoff . . . and           1984 panel file
(P-70-16)     Their Outcomes
March 1990    Health Insurance Coverage: 1986-88                  Waves 1-8, 1985
(P-70-17)                                                         Waves 1-7, 1986-1987,
                                                                  1985 panel file
June 1990     Transitions in Income and Poverty Status:           1985 panel file
(P-70-18)     1985-86
June 1990     *The Need for Personal Assistance with Everyday     Wave 6, 1985
(P-70-19)     Activities: Recipients and Caregivers               Wave 3, 1986
July 1990     *Who's Minding the Kids? Child Care                 Wave 6, 1985
(P-70-20)     Arrangements: Winter 1986-87                        Wave 3, 1986
                                                                  Wave 6, 1986
                                                                  Wave 3, 1987
Oct. 1990     *What's It Worth? Educational Background            Wave 2, 1987
(P-70-21)     and Economic Status: Spring 1987
Dec. 1990     *Household Wealth and Asset Ownership: 1988         Wave 7, 1986
(P-70-22)                                                         Wave 4, 1987
Jan. 1991     Family Disruption and Economic Hardship:            1984 panel file
(P-70-23)     The Short-Run Picture for Children
Aug. 1991     Transitions in Income and Poverty: 1987-88          1987 panel file
(P-70-24)
June 1991     *Pensions: Worker Coverage and Retirement           Wave 7, 1985
(P-70-25)     Benefits: 1987                                      Wave 4, 1986
*Denotes reports based primarily on data from topical modules.
There were several factors behind the decision to drop the quarterly
reports. First, as we discuss in Chapter 5, the Census Bureau clearly under-
estimated the resources and capabilities required to process the volume of
SIPP data that poured in from the field. Very quickly, the data processing
system began to buckle under the strain, with consequent delays in both
data files and tabulations for publication. For a time, each successive panel
took longer and longer to process, and, indeed, the Census Bureau did not
publish any reports (using either core or topical module information) from
other than the 1984 panel until 1990.

For all persons
· monthly household cash income (mean, median, and distribution from under $300 to
$4,000 and over) by sex crossed by race and ethnicity, metropolitan residence, region,
household relationship, age, labor force status, and work disability status;
· residence in household receiving cash benefits, food stamps, and other noncash benefits
for persons by characteristics (as listed above); and
· mean monthly household cash income, receipt of unemployment compensation, and
household receipt of cash benefits, food stamps, and other noncash benefits by age and sex
crossed by labor force status.
For persons aged 16 and over
· monthly earnings (mean, median, and distribution) by sex and full-time versus other
work status crossed by race and ethnicity, age, household relationship, and current occupation.
For households
· mean monthly household cash income, and household receipt of unemployment
compensation, cash benefits, food stamps, and other noncash benefits by labor force status of
household crossed by household type;
· monthly household cash income (mean, median, and distribution from under $300 to
$4,000 and over) by race and ethnicity of householder, metropolitan residence, region,
household type, age of householder, and work disability status of householder;
· mean monthly household cash income, and household receipt of unemployment
compensation, cash benefits, food stamps, and other noncash benefits by characteristics (as
listed immediately above);
· household receipt of food stamps, WIC, free or reduced-price school meals, public or
subsidized housing, Medicaid, Medicare, AFDC or other cash assistance, SSI, social security,
veterans' benefits, and unemployment compensation by household type crossed by household
receipt of food stamps, etc. (same categories as in table heading also for persons); and
· monthly household cash income (mean, median, distribution) by household receipt of
earnings, property income, social security, private pensions, federal government retirement,
U.S. military retirement, state or local government retirement, veterans' payments, private
support payments, AFDC or other cash assistance, SSI, unemployment compensation, other
income, food stamps, WIC, free or reduced-price school meals, public or subsidized housing,
energy assistance, Medicaid, Medicare.
FIGURE 6-1 Contents of SIPP Quarterly Reports, Series P-70, Nos. 1-6.
Second, Census Bureau analysts documented disturbing anomalies in
the quarterly data that were hard to explain. A comparison of aggregate
SIPP figures for selected income types with independent sources formed a
regular (and highly useful) feature of the reports. However, some income
types showed erratic patterns in comparison with the independent sources:
for example, average monthly unemployment insurance benefits from SIPP
were more than 100 percent of the independent source for the third and
fourth quarters of 1983, but dropped to 85 and 80 percent, respectively, in
the first and third quarters of 1984.
Third, for most income sources, Census Bureau analysts believed that
the reports showed little change from quarter to quarter and hence would
not be interesting to users. The analysts also determined that sample sizes,
particularly after reductions due to budget cuts, were insufficient in many
cases to ascertain quarter-to-quarter changes that were significant. For all
these reasons, the Census Bureau dropped the quarterly series (although the
tabulations continued to be produced in-house).
From February 1986 through April 1989, the only SIPP publications
were cross-sectional reports based primarily on topical modules from the
1984 panel. (The modules are generally easier to analyze than the core;
for one thing, they are specific to an interview wave.) The seven reports
published during this period provided interesting and often path-breaking
statistics on the following topics: child care, disability and health insurance
coverage, household wealth and asset ownership, educational background
and economic status, pensions, sex differences in work experience and oc-
cupation and earnings, and support networks.
From April 1989 through December 1991, the Census Bureau stepped
up the pace and increased the scope of SIPP publications, releasing 12
reports in the P-70 series. Four of these reports (on child care, educational
background and economic status, household wealth and asset ownership,
and pensions) updated previous publications, based on the 1984 panel, with
data from comparable topical modules in later SIPP panels. A fifth report
analyzed the topical module data on caregiving from the 1985 and 1986
panels. The other seven reports used the core data contained in longitudinal
files created from all waves of a SIPP panel. Reports from the 1984 panel
file included characteristics of persons receiving benefits from major assis-
tance programs, transitions in income and poverty status for 1984-1985,
spells of job search and layoff, and the effects of family disruption and
economic hardship on children. Reports from the 1985 panel file included
transitions in health insurance coverage (this report also used the core data
from several SIPP panels to provide quarterly estimates of health insurance
coverage for 1986-1988) and transitions in income and poverty status for
1985-1986 (a modified version of the 1984-1985 report). Finally, a third
report in the series, on transitions in income and poverty status (for 1987-
1988), used data from the 1987 panel file.
1In addition, tabulations on home ownership from the assets and liabilities topical module in
wave 4 of the 1987 panel were published in the Current Housing Reports series (Fronczek and
Savage, 1991); rates of migration calculated from the 1984 panel longitudinal file were published
in the Special Studies series (DeAre, 1990); and tabulations on maternity leave arrangements
during the years 1961-1985 from the fertility history topical module in wave 8 of the 1984 panel
and wave 4 of the 1985 panel were published in the Special Studies series (O'Connell, 1990).
Descriptive Reports
Reports on Income and Programs
The Census Bureau's publication plans for SIPP (see Bureau of the Census,
1991a) include a phased-in development of regular reports from the core
data on income and program participation. For the first time since the
quarterly reports were discontinued, cross-sectional as well as longitudinal
measures will be published on these topics. Both cross-sectional and longitudinal
data from the 1987 panel will be included in a report on major
assistance programs that is scheduled for release in 1992. Updated cross-
sectional statistics will be published in 1993 on income, poverty status, and
programs, followed by publication in 1994 of updated longitudinal statistics
on transitions in income, poverty, and program participation from the 1990
panel.2 Thereafter, cross-sectional and longitudinal reports will alternate
yearly. In addition, the Census Bureau plans to prepare a major report that
compares annual income and poverty data from the 1990 SIPP panel with
data from the March 1991 CPS. There are also plans to incorporate some
SIPP-based tabulations into the P-60 report series from the CPS. We ap-
plaud the initiatives by the Census Bureau's Housing and Household Eco-
nomic Statistics Division (HHES) to develop a regular, comprehensive pro-
gram of publications from the core SIPP data.
One change that we urge in the overall publication plan relates to the
role of SIPP vis-a-vis the March CPS. As we recommend in Chapter 3, the
long-range goal should be for SIPP to become the centerpiece of the nation's
income statistics. Hence, we urge HHES to reconsider its stated intention
that the March CPS remain the primary source for annual income and poverty
estimates and to work instead towards a more prominent role for SIPP.
Specifically, HHES should consider a publication schedule for SIPP, once
the new design is phased in, of releasing cross-sectional statistics every
year instead of in alternate years (with longitudinal statistics being released
every 2 years). Ultimately, HHES should look to scale back the extent of
detail in the P-60 series from the March CPS as users become accustomed
to the SIPP data and a sufficient time series is built up from SIPP to support
trend analysis. In the interim, we are very supportive of HHES's plans to
assess the comparability of SIPP and March CPS estimates and to develop
SIPP-based tabulations for inclusion in the P-60 series as an immediate way
to make more use of SIPP and alert users to the additional detail that it
provides.
2Updated longitudinal information is not scheduled for publication earlier than 1994: the
1988 and 1989 panels were truncated to six and three waves, respectively, and hence do not
provide sufficient periods of observation, and the complete longitudinal file from the 1990
panel will not be available until late 1993.
Finally, we believe it would be useful for the Census Bureau to release
an historical report containing the tabulations of average monthly income
by quarter that have been produced from SIPP on a regular basis, although
not published since the fourth quarter of 1984. Such a report would enable
analysts to become more familiar with the SIPP core information. This
report, and others from the SIPP core, should include appendix material of
the type that was included in the original quarterly reports on the quality of
the data (e.g., information on item nonresponse rates and comparison of
SIPP aggregates with independent sources).
Research Reports
In addition to regular publications that provide tabulations and other statis-
tics from SIPP, we urge the Census Bureau to issue a research report series
of special analytical studies on topics related to income and program par-
ticipation. Special studies could cover both substantive and methodological
subjects (such as an analysis of trends in income and poverty status for
particular subgroups of the population and an investigation of new methods
of estimating duration of spells of program participation) and would go
well beyond the level of analysis provided in the descriptive reports. Such
studies would of course draw heavily on SIPP but should also include rel-
evant data, as appropriate, from such sources as the March CPS income
supplement, other surveys, and administrative records.
The importance of having a strong analysis program in the core subjects
of SIPP at the Census Bureau stems from the agency's role as the sponsor
agency for both SIPP and the March CPS income supplement and the fact
that there is no other center for income statistics. The Census Bureau's
program should be at least as strong as the analysis program for labor force
topics in the Bureau of Labor Statistics (BLS). Indeed, BLS's Monthly
Labor Review offers a possible model for the income and program research
report series (although the Census Bureau's series could be quarterly or
semiannual).
The Census Bureau needs to have a strong analytical capability, not
only to serve as a beacon and source of information for the user community,
but also for its own purposes (as we note above). Such a capability should
put the Bureau in a better position to understand user data requirements, to
assess and improve the survey data, and periodically to evaluate and im-
prove the basic descriptive report series. We understand that Statistics
Canada gives strong support to in-house analysis programs, publishing spe-
cial studies in Perspectives on Labour Force and Income. Indeed, the
Canadian statistical agency made a deliberate effort in the 1980s to improve
its analytical capability.
We point out that the SIPP Working Paper series includes special studies
by Census Bureau staff and outside analysts of the type that we have in
mind (see the section below on user information and training). This is an
important and useful series to continue, but we believe that a regularly
published research report series is also needed to provide a more visible
outlet and a strong motivation for in-depth substantive analysis as well as
methodological investigations on the part of the Census Bureau's income
and program staff. (Most of the Census Bureau contributions to the SIPP
Working Paper series to date focus on survey research and methodological
issues rather than analysis methods or substantive research findings.) In
addition, the Census Bureau should encourage the analysis staff to submit
articles for publication in professional journals.
Reports on Demographic and Employment Transitions
SIPP is a rich source of information on a wide range of topics other than
income and program participation. As we have noted, the Census Bureau
has prepared a number of interesting and valuable publications from various
SIPP topical modules. We wholeheartedly support continuation of such
series; see Figure 6-2 for the schedule of topical module (and core) reports
planned for 1992.
We also urge the Census Bureau analysis staff from both the Population
and HHES Divisions to give attention to a somewhat neglected aspect of
SIPP: its potential for analyzing the dynamics of family and household
composition over time and the correlates and consequences of key demographic
and employment events (such as marriage, job loss, or retirement).
Some reports have been published or planned in this area (see Table 6-1 and
Figure 6-2), and, in addition, the HHES publications on income and pro-
gram participation include some statistics on related demographic and em-
ployment transitions. However, we believe that much more should be done.
We envision a series of publications that would focus on demographic
and employment events: for example, comparing the economic situation
before and after marriage or divorce or widowhood for all people experiencing
each type of marital status change.3 The series would include the
following:
· summary annual reports on a wide range of demographic and em-
ployment transitions, including marital status change, family composition
change involving children, change in residence, labor force status change,
and job change; these reports would provide counts and basic characteristics,
such as the age and sex of those people experiencing a particular type
of change; and
· a special report each year that would provide an in-depth analysis of
one or two particular types of events and their antecedents and consequences;
a rotating schedule could be established, with reports on employment topics
alternating with reports on demographic topics.
Recent papers from the PSID by Burkhauser and Duncan (1988) and from
the 1984 SIPP panel by David and Flory (1988) and Ruggles and Williams
(1987) provide examples of the types of analysis that the detailed special
reports could include.
3The report series on income and poverty transitions will look at the related but different
issue of how many people entering a program or falling into poverty also experienced a change
in marital status.

Characteristics of Recipients and the Dynamics of Program Participation: 1987-88
Extended Measures of Well-Being: Selected Data from the 1984 Survey of Income and
Program Participation (No. 26)
Health Insurance Coverage: 1987 to 1990 (No. 29)
Job Creation During the Late 1980s: Dynamic Aspects of Employment Growth (No. 27)
What's It Worth? Earnings Data: 1990
Who's Helping Out? Support Networks Among American Families: 1988 (No. 28)
Who's Minding the Kids? Child Care Arrangements: Fall 1988
NOTE: The P-70 series number is given for reports that have been released as of summer
1992; the other reports are scheduled for publication by December. In addition to the reports
shown above, a report based on SIPP data is scheduled for release in the P-23 series by
December 1992: When Households Continue, Discontinue, and Form.
FIGURE 6-2 SIPP Reports in P-70 Series Published in 1992 (in alphabetical order)
Recommendations
A strong publication program on income, program participation, and related
topics is an essential component of the Census Bureau's responsibilities for
SIPP. The program should include several types of descriptive and analyti-
cal report series that provide basic information and more in-depth analysis
from the survey.
Recommendation 6-1: The Census Bureau should move forward
with its plans for regular, comprehensive series of descriptive
reports on income, programs, and related topics from the core
data in SIPP. Longitudinal statistics (e.g., on the dynamics and
correlates of transitions in income, poverty, and program sta-
tus) should be published; cross-sectional statistics should also
be issued on a frequent schedule.
The Census Bureau should also establish a research report
series to include in-depth analytical and methodological studies
of special topics related to income and program participation.
Data sources for these studies could include, in addition to SIPP,
the March CPS income supplement and other surveys and ad-
ministrative records.
The Census Bureau should continue publications from the SIPP
topical modules and also establish a regular series of summary
and in-depth reports from SIPP on the dynamics and correlates
of major demographic and employment transitions (e.g., mar-
riage, retirement).
The program outlined above is both important and ambitious. Our
major concern is that the Census Bureau may underestimate the level of
resources and capabilities required to carry it out. There are many complex
technical issues involved in developing appropriate and policy-relevant sta-
tistics from SIPP, particularly those based on the longitudinal data (see
discussion in the next section). Development of useful series from SIPP
also requires extensive analysis to understand the quality of the data and
their comparability with other widely-used data sets, such as the March
CPS. In addition, major work will be needed to develop the capacity to
estimate from SIPP such statistics as after-tax income and appropriate val-
ues for in-kind benefits. Hence, staff and resources need to be sufficient
not only to produce publications, but also to support an ongoing program of
research and development to identify and implement improved methods of
analyzing and presenting statistics from SIPP.
In this regard, it is critically important for the Census Bureau to invest
in the skills and knowledge of the SIPP analysis staff so that they are up to
date with relevant policy issues and analytical methods. There are many
avenues to accomplish this goal. The Census Bureau already has experi-
ence with several approaches: organizing in-house seminar programs and
sessions at professional meetings for both staff and outside analysts to present
findings and discuss analysis issues; commissioning experts, through joint
statistical agreements or other means, to conduct research on specific ana-
lytical issues (e.g., longitudinal weights); and making use of the American
Statistical Association (ASA)/Census fellowship program to bring research-
ers on-site to work with SIPP data and share their experiences. We urge the
Bureau to sustain these efforts for SIPP and to focus them more directly on
the needs of the analysis staff. In addition, the Bureau should provide
[...]
had to slot the information from waves 3 and 4 for this rotation group into
the reference months covered by waves 2 and 3 for the other three groups).36
The SIPP ACCESS system developed at the University of Wisconsin,
with National Science Foundation funding, put the hierarchical files into the
INGRES relational database management system and provided many ancillary
services to assist users. The SIPP ACCESS staff estimated that the
system provided support for about two-thirds of the research on SIPP conducted
by academic social scientists outside the Census Bureau during 1985-1990,
when the facility was in operation (David and Robbin, 1992:i). However,
users of SIPP ACCESS also found that various aspects of the SIPP
design made the data hard to understand and work with, and, indeed, SIPP
quickly gained a reputation for complexity that deterred at least some potential
users from exploring the utility of the information for their needs
(Committee on National Statistics, 1989:52).37
One problem confronting early users of SIPP was alleviated when the
Census Bureau developed fully linked longitudinal public-use files. Ini-
tially, a 12-month file was developed from the 1984 panel, followed by a
32-month panel file. The longitudinal files include only information from
the core questionnaire; users must merge topical module data from separate
files.
Working with input from users (see, e.g., Smith, 1989), the Census
Bureau recently redesigned the format of the wave files, beginning with
the 1990 panel, to include "person-month" records, that is, a record for
each month for which a person has data from either a self or proxy inter-
view or by means of imputation (e.g., the Type Z people not interviewed in
an otherwise cooperating household). This format reduces waste space
because records are omitted for months for which a person has no data.
Also, records can readily be aggregated in a variety of ways (for example,
to produce estimates for all people for a calendar month or estimates for all
months of a wave, or to create new family or household variables to attribute
to persons) without confusion about which records to include (see
McMillen, 1990). Most users expect that the person-month format will be
significantly easier to understand and use.38
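The kinds of aggregation that the person-month format permits can be illustrated with a minimal sketch in Python. The record layout, field names, weights, and dollar amounts below are hypothetical and are not taken from the actual SIPP public-use files; the sketch assumes only what is described above, namely one record per person per month for which the person has data.

```python
from collections import defaultdict

# Hypothetical person-month records: one row per person per month with data.
# Field names and values are illustrative only, not the real SIPP file layout.
records = [
    {"person": "A", "household": "H1", "month": "1990-01", "income": 1200.0, "weight": 2.0},
    {"person": "A", "household": "H1", "month": "1990-02", "income": 1300.0, "weight": 2.0},
    {"person": "B", "household": "H1", "month": "1990-01", "income": 400.0, "weight": 2.0},
    # Person B has no data for 1990-02, so no record exists for that month.
    {"person": "C", "household": "H2", "month": "1990-02", "income": 900.0, "weight": 3.0},
]


def weighted_mean_income_by_month(rows):
    """Weighted mean income over all persons with a record in each calendar month."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for r in rows:
        totals[r["month"]] += r["income"] * r["weight"]
        weights[r["month"]] += r["weight"]
    return {month: totals[month] / weights[month] for month in totals}


def attach_household_income(rows):
    """Sum income within each household-month and attribute the total to each person record."""
    household_total = defaultdict(float)
    for r in rows:
        household_total[(r["household"], r["month"])] += r["income"]
    return [
        dict(r, household_income=household_total[(r["household"], r["month"])])
        for r in rows
    ]


monthly_means = weighted_mean_income_by_month(records)
with_household = attach_household_income(records)
```

Because months without data simply have no record, neither function needs a rule for skipping empty slots: a calendar-month estimate uses whatever records carry that month, and a household variable is built by grouping on household and month and copying the result back to each person record.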
36This feature of the design, which was intended to better align certain topical module
questions with a particular time of the year for all four rotation groups, was dropped at user
insistence beginning with the 1987 panel. (See Committee on National Statistics [1989:Table
2-1] for a listing of the interviews received by each rotation group in the 1984-1986 panels.)
37The CNSTAT report noted that a good deal of the complexity of the SIPP data reflects the
real world and is not something that the Census Bureau should attempt to simplify. However,
the report urged that unnecessary complexities in the survey design and file structures be
reduced.
38However, no file format for such a complex survey as SIPP will serve all users equally
well. The Census Bureau will need to evaluate the successes and problems that users experience
with the person-month format and consider ways to alleviate problems that arise. For
example, it may be possible for the Census Bureau to provide illustrative SAS code that would
help users with particular kinds of applications.
The Census Bureau has also made other improvements to the microdata
files in consultation with users: for example, standardizing the codes used
to indicate that the respondent was not in the universe for a particular item
and hence not asked the question. This improvement is important because
of the highly complex skip patterns in the SIPP questionnaire and hence the
need to distinguish carefully between a not-in-universe situation and a true
zero or negative response (e.g., to a question about income amounts).
Another recent innovation with regard to SIPP microdata products is
that Census Bureau staff have developed an on-line system (SIPP On Call-
Data Extraction System) for users to specify and receive extracts (as SAS or
plain ASCII files) from the public-use SIPP microdata over dial-up telecommunication
lines (Bureau of the Census, 1991c).39 At present, the capabilities
of the system are limited: for example, there is no facility to extract
records based on the value of a continuous variable such as income or to
use a recode of one or more variables in the record retrieval specifications.40
Priorities for Improvements
Because of the critical importance of microdata for research, microsimulation,
and other types of policy analysis with SIPP, we urge the Census Bureau to
continue to seek ways to improve the timeliness, format, content, and other
aspects of SIPP microdata products.
Timing
Although the Census Bureau has made commendable progress in improving
delivery schedules for SIPP microdata files, we believe that further improvements
in timeliness are both necessary and feasible. In order to achieve
the goal of SIPP's serving as the nation's main source of income statistics,
core data files must be available from SIPP on about as timely a basis as
they are from the March CPS income supplement, currently about 6 months
39The SIPP ACCESS system was transferred from the University of Wisconsin to the Cen-
sus Bureau in mid-1990, and Bureau staff worked to make it operable at the Bureau for access
to the 1984 and 1985 panels; however, the staff decided to develop SIPP On Call instead for
access to the person-month files from the 1990 and subsequent panels. SIPP On Call also
includes an electronic mail feature for users of the system to communicate with each other and
the Census Bureau.
40For a recent evaluation of SIPP On Call prepared for the Food and Nutrition Service
(which provided funding to help develop the system), see Doyle and Cohen (1992).
OCR for page 193
DA TA PROD UCTS AND THEIR USE
193
after data collection. The SIPP rotation group structure and the length and
complexity of the SIPP questionnaire have made it difficult to contemplate
releasing SIPP files on the same schedule as files from the March CPS.
However, we believe that the implementation of computer-assisted personal
interviewing (CAPI) and database management technology for SIPP should
make it possible to move toward, and achieve, that goal.
Kinds of Files
Currently, the Census Bureau releases wave and panel files from each panel
of SIPP (separate wave files for core and topical module information). The
panel files are in the rectangular format, and we encourage the Census
Bureau to consider converting them to the person-month format of the wave
files. (The space-saving features of the person-month format would be
particularly valuable for the lengthy panel files.)
We also urge the Bureau to release calendar-year files that contain data
from both panels that are in the field at the same time. Although the ability
to combine panels was originally viewed as an important feature of SIPP,
the Census Bureau has, to date, approached the processing of each SIPP
panel as a completely separate operation. The delays in releasing files from
the early panels meant that users had to wait for very long periods to be
able to combine, say, wave 6 of the 1984 panel with wave 3 of the 1985
panel. At present, the Census Bureau's data delivery schedule for SIPP
specifies approximately simultaneous release of wave files from panels that
are in the field at the same time (e.g., the core files for wave 8 of the 1990
panel and wave 5 of the 1991 panel are both targeted for release in April
1993). Hence, users can readily develop wave files that combine panels.
We propose that the Bureau take the next step of preparing calendar-year
files from combined panels, as such files are likely to prove very useful for
many research and policy analysis purposes (e.g., for use in microsimulation
models of tax and transfer programs).
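As a rough illustration of what preparing a calendar-year file from combined panels involves, the following sketch selects the person-month records for one calendar year from two overlapping panels and tags each record with its panel of origin. The record layout and field names are hypothetical, and the sketch deliberately leaves aside the question of how weights should be adjusted when panels are combined.

```python
# Hypothetical person-month records from two panels in the field at the same time.
# Field names (person_id, year, month, income) are illustrative only.
PANELS = {
    "1990": [
        {"person_id": "A", "year": 1991, "month": 12, "income": 900.0},
        {"person_id": "A", "year": 1992, "month": 1, "income": 950.0},
    ],
    "1991": [
        {"person_id": "X", "year": 1992, "month": 1, "income": 400.0},
        {"person_id": "X", "year": 1992, "month": 2, "income": 420.0},
    ],
}


def calendar_year_file(panels, year):
    """Combine records for one calendar year across panels, keeping the panel label."""
    combined = []
    for panel, records in panels.items():
        for r in records:
            if r["year"] == year:
                # Person identifiers are only unique within a panel, so keep the panel label.
                combined.append(dict(r, panel=panel))
    combined.sort(key=lambda r: (r["panel"], r["person_id"], r["month"]))
    return combined
```

Calling `calendar_year_file(PANELS, 1992)` returns three records, one from the 1990 panel and two from the 1991 panel.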
Content and Coding
We encourage the Census Bureau to continue working with user groups (see
discussion of advisory mechanisms in Chapter 8) to identify and implement
changes to the content and format of the SIPP microdata files in order to
enhance their utility and ease of use. For example, users may identify
recoded variables that it would be helpful for the Bureau to create rather
than leaving them to the user.
We note that adding variables can add processing costs and result in
records that are difficult for users to work with. However, in some instances
the Census Bureau's efforts to keep the records in the SIPP data
files to a manageable length may have gone too far. Thus, the person-
month format excludes some variables that were available in the rectangular
format (e.g., it is no longer possible to determine coverage by more than
one health insurance plan in the person-month files). Also, some program-related
variables for which the Food and Nutrition Service provided funding
to include on the initial 12-month longitudinal file from the 1984 panel
were never adopted for the 32-month panel files. We urge the Census
Bureau to consult with users about the benefits of reinstating these vari-
ables.
We also encourage the Census Bureau to work closely with users to
further improve the information in the data files about missing values. As a
general policy, the Bureau prefers to provide users with complete records
for every respondent in a survey by supplying values for missing items
through some type of imputation procedure. Complete records have the
advantage of maximizing the sample size for analysis, as cases do not have
to be discarded because of missing information. Also, there is the advan-
tage that the Bureau can implement imputations in a consistent manner
(individual users may vary in the sophistication and care with which they
supply values for missing items). In a separate set of fields, the Bureau
generally provides yes-no indicators of whether an individual item was re-
ported or imputed. Because the imputations performed by the Census Bu-
reau may have disadvantages for certain analyses (e.g., see the discussion in
Chapter 3 of problems with the imputation of income and assets for pro-
gram recipients) and because the imputation rates can be quite large for
some variables, it is important for users to have as much information as
possible about them. We believe it would help users assess the
quality of the imputations from the perspective of particular analyses if the
imputation flags contained information about the reason for an imputation,
that is, whether the respondent refused to answer a question or did not know
the answer.
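The value of richer imputation flags can be illustrated with a small sketch. The flag codes below (0 for reported, 1 for imputed after a refusal, 2 for imputed after a "don't know") are hypothetical and do not correspond to any actual SIPP coding scheme; the field names are likewise illustrative.

```python
from collections import Counter

# Hypothetical flag codes; actual SIPP flags are yes-no indicators only.
REPORTED, IMPUTED_REFUSED, IMPUTED_DONT_KNOW = 0, 1, 2


def imputation_summary(records, flag_field):
    """Summarize how often an item was imputed and why (unweighted)."""
    counts = Counter(r[flag_field] for r in records)
    n = len(records)
    imputed = n - counts[REPORTED]
    return {
        "n": n,
        "imputation_rate": imputed / n if n else 0.0,
        "refused": counts[IMPUTED_REFUSED],
        "dont_know": counts[IMPUTED_DONT_KNOW],
    }


def reported_only(records, flag_field):
    """Keep only records in which the item was reported by the respondent."""
    return [r for r in records if r[flag_field] == REPORTED]
```

With flags of this kind, an analyst could compare results with and without imputed values, or treat refusals differently from "don't know" responses, rather than accepting or rejecting all imputations as a block.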
Delivery Media
We encourage the Census Bureau to explore alternative media for delivery
of SIPP microdata to users that can make easier the process of obtaining
extracts for analysis. We note that the days of 9-track magnetic tape as a
medium for data dissemination are numbered, as more and more researchers
are turning away from cumbersome mainframe systems to work with micro-
or minicomputer hardware and software that use some type of direct access
disk media for file storage and input and output.
The familiar floppy diskettes are much too small to serve as a file
storage medium for a survey as large as SIPP. However, high-storage ca-
pacity CD-ROM (compact disk-read only memory) technology is rapidly
gaining popularity for large data sets. Users of the National Longitudinal
Surveys of Labor Market Experience (NLS) currently choose CD-ROM over
tape by about four to one. The Census Bureau is now releasing CD-ROM
versions of the public-use data sets from the 1990 census. CD-ROM with
suitable extraction software could well be a useful access medium for SIPP
(although further improvements in microcomputing hardware may be neces-
sary before CD-ROM becomes sufficiently fast and easy to access for such
a large data set as SIPP).
We also encourage the Census Bureau to further develop its on-line
extraction system (SIPP On Call), which could save users the time and
expense of acquiring and archiving complete SIPP files when they only
require a subset of the data. To be most useful, the Bureau needs to add
sophisticated retrieval capabilities to the system. In addition, if SIPP On
Call is to be an effective means for users to work with the large volume of
SIPP data, the Bureau needs to provide access to the system over high-speed
communications lines, for example, those provided by Internet. Moving
large amounts of data over regular telephone lines is tedious and costly.
Finally, we note the importance for the effective use of CD-ROM and on-line
technology of having full documentation, including frequencies for
each variable, integrated with the actual data (see discussion below and in
Chapter 5).
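To make concrete the kind of retrieval capability at issue, the sketch below shows an extract specification that selects records on the value of a continuous variable and on a user-defined recode. It is purely illustrative: the function, its parameters, and the field names are assumptions for this example and do not describe the interface of SIPP On Call or of any Census Bureau system.

```python
def extract(records, keep_fields, where=None, recodes=None):
    """Return the requested fields for records that satisfy a selection rule.

    recodes maps a new variable name to a function of the original record;
    where is a function of the record (including recodes) returning True or False.
    """
    recodes = recodes or {}
    out = []
    for r in records:
        row = dict(r)
        for name, fn in recodes.items():
            row[name] = fn(r)  # recodes are computed from the original fields
        if where is None or where(row):
            out.append({f: row[f] for f in keep_fields})
    return out


# Hypothetical usage: keep records with monthly income of at least 1,000
# and attach a two-category recode of age.
SAMPLE = [
    {"person_id": "A", "age": 34, "income": 1500.0},
    {"person_id": "B", "age": 70, "income": 600.0},
    {"person_id": "C", "age": 66, "income": 1000.0},
]
RESULT = extract(
    SAMPLE,
    keep_fields=["person_id", "age_group"],
    where=lambda row: row["income"] >= 1000.0,
    recodes={"age_group": lambda r: "65+" if r["age"] >= 65 else "under 65"},
)
```

Selecting on the server side in this way means that only the needed subset of records and variables has to travel over the communications line.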
Recommendation
Recommendation 6-3: The Census Bureau should continue to
develop improved microdata products from SIPP to support policy
analysis and social science research. Priority improvements in-
clude:
· moving toward a goal of releasing core data files within 6
months after the end of data collection;
· producing calendar-year files that combine panels, in addi-
tion to wave and panel files;
· determining, in consultation with users, changes and addi-
tions to the file contents that would assist their analyses; and
· developing additional ways of delivering SIPP microdata
products to users, such as by means of high-storage capacity
compact disks (CD-ROM) and an improved on-line data extrac-
tion system.
DOCUMENTATION AND SERVICES FOR USERS
For effective use of large, complex data sets, users need not only the data,
but also what has been termed the metadata, that is, information that enables
the user to access, understand, and analyze the data appropriately (see
David, 1991; David and Robbin, 1989, 1990). Users of computer-readable
products most obviously need basic documentation that enables them to
instruct a computer program how to "read" the data on the magnetic tape or
other medium. In addition, users of computer products need information to
help them understand the quality and meaning of the data. Users of printed
publications require such information as well.
The larger and richer the data set, the more extensive must be the
accompanying documentation; also, the greater the need for ancillary services,
such as training sessions, working papers, and other means of reaching
and educating users about the potentials and pitfalls of the data and data
products. An investment in documentation and related services is amply
justified in that it minimizes wasted time and resources and increases the
return to users from their processing and analysis efforts. Good documenta-
tion makes a vital contribution to the development of a strong and growing
community of users for a survey like SIPP.
Documentation and Related Services to Date
Microdata Documentation
From the beginning of SIPP, each microdata file has been accompanied by a
codebook providing basic information on the file structure and tape location
and content of each variable. Codebooks are available in printed form and
as machine-readable files attached to the data files. A SIPP Users' Guide
containing additional explanatory information for users about SIPP and its
microdata products was initiated at the start of the survey but took several
years to prepare; the first edition was released in 1987 (Bureau of the
Census, 1987). The guide included chapters on survey design, survey content,
structure of the cross-sectional public-use microdata files, use of cross-sectional
files for estimation and analysis, linking waves, and assessing the
reliability of SIPP data. A second edition that added a chapter about SIPP
cross-sectional weighting procedures and appendix material about the 1990
panel and the new person-month format was released in late 1991 (Bureau
of the Census, 1991e).
The initial documentation did not include frequencies for the variables;
at the behest of users, the Census Bureau contracted in 1989 to have fre-
quencies prepared for each file and made available on diskettes, with a
subset of key control counts provided in printed form. Such frequencies,
which indicate the distribution of responses to each item, are invaluable
tools for users in making initial decisions about variables and population
subgroups to analyze, hypotheses to explore, and analytical methods to use.
To inform data file users about problems with the files or documentation,
the Census Bureau has a SIPP User Notes series that is sent to all file
purchasers and can also be obtained on request. Notice of the user notes is
contained in a supplement to the newsletter of the Association of Public
Data Users (APDU) that is mailed to a large list of people who have inquired
of the Census Bureau about SIPP.41 Finally, the SIPP Quality Profile
(Jabine, King, and Petroni, 1990) is a very valuable tool for informing
data file users about the quality of the survey information.
Documentation for Printed Reports
Each SIPP publication includes appendix material that describes the survey,
defines key terms, indicates how to make approximate calculations of sam-
pling errors of the estimates, and briefly reviews other sources of nonsampling
error (e.g., underreporting). Also, reference is generally made to the addi-
tional detailed information on sampling and nonsampling errors provided by
the SIPP Quality Profile.
Other User Services
The Census Bureau regularly publishes What's Available from SIPP (e.g.,
Bureau of the Census, 1991g), a highly useful basic reference source that
lists the publications, data files, and working papers from the survey. For a
number of years, the Census Bureau supported a vigorous, proactive pro-
gram to educate and inform users about SIPP and to keep users aware of
others who were analyzing the data. This program included training ses-
sions offered as part of the summer program of the Inter-university Consor-
tium for Political and Social Research at the University of Michigan and as
workshops in conjunction with many professional association meetings. In
addition, the Bureau published the SIPP Working Paper series, which, by
the end of 1990, totaled 140 substantive and methodological papers by
analysts both inside and outside the Census Bureau (Bureau of the Census,
1991g).42 Bureau staff also organized sessions that featured SIPP research
at professional association meetings and published compilations of SIPP-
related papers that were presented at meetings of the ASA. Census Bureau
staff further encouraged SIPP researchers to apply for ASA/Census fellow-
ships to use the data files on-site at the Bureau. In addition, Bureau staff
regularly appeared at the monthly meetings of SIPP analysts in the Washington,
D.C., area and were available to meet with groups of users in other
locations on request. All of these were valuable activities that enabled
users and potential users to become informed about SIPP and keep abreast
of what others were learning from the data.
41Arrangements for the APDU SIPP supplement and for an APDU SIPP committee to
consult with the Census Bureau about the SIPP data products and documentation were made in
early 1989.
42For this series, Census Bureau staff identify papers using SIPP data that have been prepared
for professional meetings or initiated in draft, solicit their inclusion in the series, have
them reviewed by one or two others, and edit them and prepare reproducible copy.
About 2 years ago, the loss of SIPP staff who had been most active in
this program led to a suspension or reduction of many of these services.
Recently, with the appointment of a SIPP liaison in the HHES Division (see
Chapter 8), activities such as releasing new titles in the SIPP Working Pa-
per series and providing workshops about SIPP at professional association
meetings have started up again. However, the level of activity has not yet
reached that of the earlier years.
Recommendation
We urge the Census Bureau to continue regular consultations with users
about needed kinds of documentation and other informational and instructional
materials. We cannot stress enough the importance of having comprehensive,
accurate, and intelligible documentation and related services to
interest users in the potential of SIPP data and to enable them to make the
most cost-effective use of the data. We see a number of areas in which
improvements to the current documentation, information, and training pack-
age for SIPP would be useful.
First, it is vital, as part of the implementation of the proposed redesign
of SIPP, to make use of CAPI and database management system technology
to fully integrate the microdata file documentation with the actual data.
Such integration should enable immediate calculation of frequencies for
variables and inclusion of the frequencies in the printed and machine-read-
able forms of the codebook. Integration should also reduce the likelihood
of errors in the documentation, such as field positions not matching the
actual positions in the file, and make it possible to improve the description
of skip patterns in the questionnaire that define the universe of respondents
for particular items.
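A minimal sketch of what such integration could provide is given below. It assumes a hypothetical fixed-width record layout described by a small machine-readable codebook; the variable names, positions, and widths are invented for illustration. The same codebook entries are used both to check the documented field positions against the record length and to compute frequencies directly from the data.

```python
from collections import Counter

# Hypothetical machine-readable codebook: 1-based start position and width.
CODEBOOK = [
    {"name": "SEX", "start": 1, "width": 1},
    {"name": "AGE", "start": 2, "width": 2},
]


def check_layout(codebook, record_length):
    """List documented fields that overlap or run past the end of the record."""
    problems = []
    previous_end = 0
    for field in sorted(codebook, key=lambda f: f["start"]):
        start = field["start"]
        end = start + field["width"] - 1
        if start <= previous_end:
            problems.append(f"{field['name']} overlaps previous field")
        if end > record_length:
            problems.append(f"{field['name']} extends past record end")
        previous_end = max(previous_end, end)
    return problems


def frequencies(lines, field):
    """Count the values of one field across fixed-width data records."""
    begin = field["start"] - 1  # convert the 1-based position to a 0-based index
    return Counter(line[begin:begin + field["width"]] for line in lines)
```

Because the frequencies are computed from the same positions that the codebook reports, a mismatch between documented and actual field positions would show up immediately as implausible value distributions.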
With regard to the microdata documentation, we note that an adequate
description has never been developed for one important aspect of the SIPP
processing system that affects the quality of a significant portion of the
data: the procedures used to impute values for missing items.43 Although
it may be infeasible and indeed unnecessary to provide detailed imputation
specifications for each variable, it should certainly be possible to describe
the procedures used for various classes of variables and to provide illustrative
information on the effects of imputations (e.g., before-and-after distributions
for selected variables). In addition, documentation is needed for variables in
the data files that result from a process of recoding other variables.44 Again,
integration of the data and documentation in a database management system
may well facilitate the development of useful descriptions of imputation
procedures as well as documentation of recoded variables.
43Documentation has also never been provided for the elaborate routines that are used to
edit inconsistent replies or, in some cases, to supply values for missing responses by means of
an edit rather than imputation. These routines, which are highly specific to individual variables,
present a daunting documentation task. A benefit from the use of CAPI technology for
SIPP should be that inconsistent replies are either resolved in the field or accepted, thus
minimizing the need for after-the-fact editing.
It is also important to develop means to more frequently update docu-
ments such as the SIPP Users' Guide that provide important contextual
information. The material in a well-formulated, comprehensive guide can
be invaluable in orienting users to the data and alerting them to processing
and analytical pitfalls. The limited background information carried in the
codebook or documentation of individual variables and codes is not suffi-
cient for these needs. Only two editions of the SIPP Users' Guide have
been issued to date, even though SIPP has gone through many changes
since 1983, and not all of those changes are well reflected in the latest
edition (Bureau of the Census, 1991e). Most notably, there is little information
provided about the longitudinal panel files, although they represent a
widely used and complex data product from SIPP. This deficiency needs
to be remedied.
Similarly, we encourage the Census Bureau to evaluate and determine
ways to enhance the text in SIPP reports that is intended to educate and
warn readers about the data contents. Cross-references to such other docu-
ments as the SIPP Users' Guide and the SIPP Quality Profile are helpful,
but many users will not seek out those references; hence, it is important to
provide as much pertinent information in the report itself as possible.
We have commented on the valuable nature of the various ancillary
informational and instructional materials (e.g., working papers, compila-
tions of professional association papers) and training programs that were
developed for SIPP. We urge the Census Bureau to restore and enhance
these programs to serve the growing community of SIPP users. The pub-
lished research report series that we recommend above will also play a
valuable role in this regard.45 Preparation of a complete on-line bibliography
that includes relevant Census Bureau staff memoranda that would not
otherwise be known to most users is also an idea to consider.46
44Work is in progress under a joint statistical agreement between the Census Bureau and the
University of Michigan to develop documentation for the longitudinal imputations in the SIPP
panel files, and the Census Bureau expects to make arrangements with another organization to
obtain documentation for the cross-sectional imputations and edits. Also, work is in progress
by Social and Scientific Systems, Inc., under contract to the Census Bureau, to develop documentation
for recoded variables.
45The research report series will not, at least until it is well established, substitute for the
SIPP Working Paper series, which makes available the work in progress of outside analysts as
well as Census Bureau staff.
Finally, we believe it is important for the Census Bureau to take steps
to ensure that there are effective channels for individual users to communi-
cate both problems and suggestions for SIPP data products and documenta-
tion and to obtain timely feedback from Bureau staff (see Chapter 8 for a
discussion of more formal advisory mechanisms). Because of the decentral-
ized system of operations at the Census Bureau for SIPP and other surveys,
it has not always been clear to users which staff members to consult about
problems and suggestions. Even when a responsive staff member has been
reached, it has not always been clear that there is an effective, timely system
of internal communications within the Census Bureau to ensure that all
relevant staff members, such as those in data processing and data user
services, are informed and able to take appropriate action. Nor is it always
clear that there are effective means of informing the user, or other users, of
the reasons for the problem and the nature of the proposed solution or of the
response to a suggestion.
The recent establishment of a SIPP liaison position in HHES is helpful
in this regard, as is the use of the APDU SIPP Supplement as a vehicle to
reach users. We urge the Census Bureau to keep a vigilant eye on its user-
staff communication channels and act promptly to keep them functioning in
an open and timely manner. The upcoming redesign of SIPP, which will
entail changes in data products and documentation, makes it all the more
important to have good means of communication with individual users and
the user community as a whole.
Recommendation 6-4: The Census Bureau should work to im-
prove documentation and related user information services for
SIPP. Priority improvements include:
· making use of CAPI and database management system technology
to fully integrate documentation (including frequency
counts for variables) and data;
· developing documentation for recoded variables and the types
of imputations that are performed for missing data in SIPP;
· developing means to update key explanatory documents, such
as the SIPP Users' Guide, on a more frequent basis;
· restoring and expanding information and training programs,
such as training sessions, working papers, and compilations of
professional society presentations; and
· maintaining effective channels of communication for users
to feed back problems and suggestions and learn of the Bureau's
response, and for users to be informed of new developments in
the survey and its data products.
46The SIPP ACCESS project developed such an on-line bibliography of SIPP working
papers, presentations, and memoranda, which could serve as a model.