4
Survey Design
In this chapter we review and compare the current SIPP design and avail-
able alternatives in light of our recommended goals for the survey. In
Chapter 5 we do the same for the SIPP data collection and processing
system. From both reviews we conclude that changes in the design and
operation of SIPP would enhance the utility of the data and increase the
cost-effectiveness of the SIPP program.
MAJOR DESIGN ELEMENTS AND ALTERNATIVES
The design of a continuing panel survey such as SIPP includes several
components, each of which affects the quality and utility of the data and the
costs of data collection, processing, and use. In this section we consider the
following major design elements:
· the number of interviews or waves in each panel;
· the length of the reference period covered by each interview;
· the length of each panel (a function of the number of interviews and
the reference period length);
· the frequency with which new panels are introduced; and
· the total initial sample size for each panel.
We also consider the advantages and disadvantages of spreading out the
workload by interviewing portions of the sample (called rotation groups)
each month rather than interviewing the entire sample at the same time for
THE SURVEY OF INCOME AND PROGRAM PARTICIPATION
each wave. In the last two sections we consider aspects of the SIPP sample
design, namely, the use of oversampling to increase the sample size for the
low-income population and the rules for following people that determine
who is included in the sample for each panel over time.
The major design components listed above cannot be assessed in isolation.
They interact in a number of ways. Given a fixed budget that puts a
ceiling on the number of interviews that can be fielded each year, a change
in one of the design elements will generally necessitate an offsetting change
elsewhere. For example, an increase in panel length must be offset by one
or more of the following changes: a reduction in the frequency with which
new panels are introduced, a reduction in the sample size per panel, or an
increase in the reference period length for each interview wave.
Current SIPP Design
SIPP is a true panel survey, in that it follows individual people, including
those who change their address, in contrast to quasi-panel surveys, such as
the Current Population Survey (CPS) and Consumer Expenditure Survey
(CEX), which return to the same address and interview the people who
currently reside there. To obtain the sample for each SIPP panel, a list of
addresses is designated for interviewers to visit in the first wave. Typically,
about 75-80 percent of the addresses represent occupied housing units whose
occupants are eligible for the survey; the rest are vacant, demolished, or
nonresidential units. Of the eligible households, 92-95 percent of the
residents usually agree to participate in the survey (Bowie, 1991). The adult
members of these households (people aged 15 and over) are deemed original
sample members. Each of them is followed until the end of the panel or
until the person leaves the universe (e.g., by dying, entering an institution,
or moving abroad) or the sample (e.g., by refusing to continue to be interviewed,
moving to an unknown address, or moving outside the area covered
by the SIPP interviewing staff).1 Children of original sample members are
followed as long as they reside with an original sample adult, and adults
and children who join the household of an original sample adult are in-
cluded in the panel as long as they remain in that household.
1 People who move to an address more than 100 miles from a SIPP primary sampling unit
(PSU) area are not followed, although interviewers are instructed to conduct telephone
interviews with them if possible. Almost 97 percent of the U.S. population lived within 100 miles
of the sample PSUs for the 1984 panel (Jabine, King, and Petroni, 1990:16). Attempts are
made to keep track of people who enter institutions so that if they leave the institution at a
later point during the life of the panel, they can be brought back into the panel.
The basic SIPP design calls for members of each panel to be interviewed
at 4-month intervals over a period of 32 months for a total of eight
interview waves. (One-half of the 1984 SIPP panel was interviewed nine
instead of eight times.) A new panel is introduced each year. To even out
the interviewing workload, the sample for each panel is divided into four
rotation groups, one of which is interviewed every month. Interviewing for
the first 1984 SIPP panel began for the first rotation group in October 1983;
interviewing for all subsequent panels has begun in February (see Commit-
tee on National Statistics [1989:Table 2-1] for an illustration of the rotation
group design). Each interview includes a set of core questions about in-
come, program participation, and employment. In most cases, information
is requested on these subjects for each of the 4 preceding months. Each
interview also includes one or more modules on specific topics that are
administered only once or twice in each panel. (See Tables 3-1, 3-2, and 3-
13 in Chapter 3 for information on the questionnaire content.)
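The rotation-group schedule described above can be illustrated with a minimal Python sketch. It assumes an idealized panel with no deviations from the basic design; the function names (add_months, interview_schedule) and the default parameters are illustrative and do not come from SIPP documentation. Only the October 1983 start for the first rotation group of the 1984 panel is taken from the text.

```python
from datetime import date


def add_months(d: date, n: int) -> date:
    """Return the first day of the month that is n months after d."""
    index = d.year * 12 + (d.month - 1) + n
    return date(index // 12, index % 12 + 1, 1)


def interview_schedule(first_interview: date, waves: int = 8,
                       rotation_groups: int = 4,
                       months_between_waves: int = 4,
                       reference_months: int = 4):
    """Yield (rotation_group, wave, interview_month, reference_months_list).

    Idealized assumption: one rotation group is interviewed each month,
    and each interview covers the reference_months preceding months.
    """
    for wave in range(1, waves + 1):
        for group in range(1, rotation_groups + 1):
            offset = (wave - 1) * months_between_waves + (group - 1)
            interview = add_months(first_interview, offset)
            reference = [add_months(interview, -k)
                         for k in range(reference_months, 0, -1)]
            yield group, wave, interview, reference


if __name__ == "__main__":
    for group, wave, interview, reference in interview_schedule(date(1983, 10, 1)):
        print(group, wave, interview.strftime("%b %Y"),
              reference[0].strftime("%b %Y"), "to",
              reference[-1].strftime("%b %Y"))
```

Under these assumptions, different rotation groups report on different calendar months in the same wave, which is one reason the staggered design complicates analysis by calendar period.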
The sample design for SIPP is a multistage clustered probability sample
of the population in the 50 states and the District of Columbia that only
excludes inmates of institutions and those members of the armed forces
living on post without their families. There is currently no oversampling of
specific population groups in SIPP, with one exception: the 1990 panel
includes about 3,800 extra households continued from the 1989 panel, se-
lected because they were headed by blacks, Hispanics, or female single
parents at the first wave of the 1989 panel.
The initial sample size for the first 1984 SIPP panel was about 21,000
eligible households, with the expectation that, by combining two panels of
that size, users would be able to obtain a total sample size of about 37,000-38,000
households.2 However, budget cuts necessitated an 18 percent reduction
in the sample size midway through the 1984 panel (beginning with
wave 5). Initial sample sizes for the 1985 through 1989 panels and the
1991 panel were only 12,500 to 14,500 eligible households (and the 1985
panel sample was further reduced beginning with wave 4). The initial
sample size for the 1990 panel was about 23,600 eligible households; how-
ever, to fund this larger size, the Census Bureau had to terminate the 1988
and 1989 panels at six and three interviews, respectively. Budget cuts also
necessitated limiting the 1986 and 1987 panels to seven rather than eight
waves.3
2 Attrition reduces the number of actual cases that can be obtained by combining early waves
of one panel with later waves of another, although new household formation by original
sample members somewhat offsets this effect.
3 Also, for other reasons, one rotation group in the 1985 and 1986 panels received one less
wave than the other three groups (i.e., seven instead of eight waves in the 1985 panel and six
instead of seven waves in the 1986 panel; see CNSTAT [1989:Table 2-1]).
The Census Bureau received sufficient funding for fiscal 1992 to enable
it to return to the original SIPP design. The 1992 panel began in February
with an estimated initial sample of 21,600 eligible households whose original
sample members will be interviewed for eight waves. It is expected that
subsequent panels will be funded at about the same level.
User Views
In considering whether to recommend any changes to the SIPP design, we
consulted with researchers and policy analysts working in a range of rel-
evant subject areas. We asked them to assess the usefulness of the data
produced by the current SIPP design and to suggest design modifications
that they thought would improve data quality and utility (see Chapter 2).
Virtually without exception, these SIPP data users indicated that the sample
size per panel, particularly for panels with sample size reductions due to
budget cutbacks, is too small to support analysis of many of the subgroups
of most interest, such as participants in assistance programs. Users view
the option of combining panels in order to increase sample size as cumber-
some; moreover, combining panels is not an option for such uses as longitu-
dinal analysis of a single panel or analysis of a variable topical module that
was asked in only one panel.
Users differ in their opinions on other major design elements, depend-
ing on their interest in longitudinal or cross-sectional applications of the
data. Users who value most the longitudinal information from SIPP support
increasing the length of each panel to provide an improved capability to
study transitions and spells of program participation and other behaviors.
Longer panels would increase the sample size of events of interest, such as
marital status or job changes or program exits and entrances, and would
provide longer periods of observation before and after these events for
analyzing their antecedents and consequences. Longer panels would also
reduce the "right-censoring" problem, that is, the problem that the duration
of some spells is not known because they are still in progress when the
panel ends. Users most often suggest extending SIPP panels to 5 years,
although some users would be satisfied with extending them to 4 years; at
least one user has suggested lengthening SIPP panels to 10 years to permit
the data to be used to study welfare dependency and persistent poverty
(Manski, 1991).
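The effect of panel length on right-censoring can be illustrated with a small Python sketch. The spell data below are hypothetical and chosen only to show the mechanics; the function name right_censored_share and the rule that a spell counts as censored when it is still in progress in the final panel month are assumptions made for this illustration.

```python
def right_censored_share(spells, panel_months):
    """Share of observed spells still in progress when the panel ends.

    spells: list of (start_month, duration_months), with months counted
    from 1 at the start of the panel. A spell covers months
    start .. start + duration - 1. Only spells that begin within the
    panel are observed at all.
    """
    observed = [(s, d) for s, d in spells if s <= panel_months]
    if not observed:
        return 0.0
    censored = sum(1 for s, d in observed if s + d - 1 >= panel_months)
    return censored / len(observed)


# Hypothetical spells of program participation (start month, duration).
SPELLS = [(1, 6), (10, 30), (20, 20), (30, 12), (40, 15)]

if __name__ == "__main__":
    for months in (32, 48, 60):
        share = right_censored_share(SPELLS, months)
        print(f"{months}-month panel: {share:.0%} of observed spells censored")
```

With these made-up spells, lengthening the panel both adds observed spells and lets more of them reach their end inside the observation window.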
In order to increase sample size and panel length, many users of the
longitudinal data say they are willing to live with longer reference periods
for each interview, thereby decreasing the number of interviews per year,
typically from three 4-month to two 6-month waves. They are also quite
willing to reduce the frequency with which new panels are introduced,
perhaps introducing a new panel every 2 or 3 years instead of every year.
Users who are more concerned about cross-sectional applications, such
as describing the characteristics of program participants at any given time
and estimating the likely effects of a program change using comparative
static microsimulation modeling techniques,4 have a different viewpoint.
These users are worried about proposals to reduce the frequency with which
new panels are introduced because they assume that estimates based on a
panel that has been in the field for longer than a year will exhibit higher
levels of error than estimates based on a "fresh" panel. They are also loath
to increase the reference period of the interviews, assuming that longer
recall periods will reduce the quality of the monthly data that are needed for
program analysis. (Users who are concerned with fine-grained longitudinal
analysis of program dynamics, that is, analysis of short spells and intrayear
changes in participation and related characteristics within the context of a
longer panel, also share this view.)
The views of Census Bureau staff have tended in the past to coincide
with those of analysts who are most interested in cross-sectional applica-
tions of SIPP. The original plans called for the Census Bureau to publish
improved annual and subannual income statistics using core SIPP data. From
this perspective, yearly refreshment of the sample appeared highly desir-
able, as did short reference periods. However, for a variety of reasons (see
further discussion below), the Bureau has yet to realize this goal. More
recently, the Census Bureau staff have tended to emphasize the longitudinal
uses of SIPP, arguing for continued use of the March CPS to provide basic
annual income and poverty statistics (see Chapter 2).
Staff at the Census Bureau have also argued strongly for design features
that they believe promote operational efficiency. Specifically, they have
supported using monthly rotation groups in order to spread out the workload
for the interviewers. Analysts, in contrast, find that the use of monthly
rotation groups complicates data processing (see discussion in later sec-
tion). Similarly, Bureau staff made the original decision to have reference
periods of 4 months, instead of 6 or 3 months, as a compromise between the
need for accurate monthly data and reduced cost of field operations.
4 Microsimulation models of such programs as Aid to Families with Dependent Children
(AFDC) and food stamps typically create an average monthly snapshot of the population,
simulating program eligibility and participation under current program regulations and then
simulating what the differences would be if program provisions were modified (e.g., if benefits
were liberalized). Historically, these models have used the March CPS as their microlevel
database, employing information from such sources as the Income Survey Development
Program (ISDP) and SIPP to allocate the annual CPS employment and income data to months.
Several models of the food stamp program have been built directly from SIPP cross-sectional
monthly data; see Citro and Hanushek (1991a, 1991b).
Selected Design Alternatives
We could not investigate every design alternative. More important, while
we felt it essential to look at designs that could improve the usefulness of
the survey for longitudinal applications, we did not want to consider alternatives
that undercut the uniqueness of SIPP: namely, that it is the only
household survey that provides monthly data for fine-grained analysis of
changes in income and program dynamics on a short- to medium-term basis.
Hence, we did not give serious attention to extending the panel length
beyond 5 or 6 years nor the reference period length beyond 6 months at
most.5 Other surveys, such as the Panel Study of Income Dynamics (PSID),
will continue to serve users interested in analysis of longer term dynamics.
Moreover, because of our conclusion that SIPP, not the March CPS, should
serve as the primary source of the nation's income statistics, we did not
believe it appropriate to consider alternatives that could seriously affect
SIPP's ability to provide reliable cross-sectional estimates. Our concern
that any design change not cause major problems for Census Bureau opera-
tions also influenced our deliberations.
Below we sketch in the basic features of five alternatives: the current
(fully funded) design and four designs intended to provide somewhat longer
periods of observation with varying panel and reference period lengths and
frequency with which new panels are introduced. For each design, we
calculate the sample size per panel under the assumption of a fixed field
budget that supports 160,000 interviews per year once a design is fully
phased in.6 The total of 160,000 interviews per year is the number entailed
by full implementation of the original SIPP design, that is, each year having
a new panel that is interviewed three times, a panel in its second year that is
interviewed three times, and a panel completing its term that is interviewed
two times, with all panels having an original sample size of 20,000 eligible
households. Note that none of the other designs has more than two panels
in the field at the same time.
5 However, in the section on sample design considerations, we discuss extending the length
of SIPP panels (for a longer period than whatever is the standard length for the full sample)
for subgroups of interest as a means of adding sample size and longitudinal information for the
subsampled groups.
6 Attrition will reduce the number of required interviews: eligible households that do not
respond in the first wave are dropped from the sample; eligible households that subsequently
fail to respond are pursued for one more interview before being dropped. Formation of new
households by original sample members will somewhat offset the effects of attrition. Also, at
the first wave, an additional 4,000-5,000 visits are required to addresses that turn out to be
vacant, demolished, or nonresidential (i.e., not eligible). Because of budget cuts, the Census
Bureau has actually fielded no more than about 100,000-120,000 interviews in most years.
Note that, for simplicity, we assume that interviews carry the same average cost under each
design, that is, that the cost of a 6-month recall interview is the same on average as the cost of
a 4-month interview. We also do not take into account any extra data collection costs that could
result for longer panels from greater dispersion of the sample due to geographic mobility.
Current Design: Start a new panel every year; run each panel for 32
months and interview in 4-month waves, for a total of eight interviews. The
sample size per panel is 20,000 originally eligible households.
Alternative Design A: Start a new panel every 2 years; run each panel
for 4 years (48 months) and interview in 6-month waves, for a total of eight
interviews (two per year). The sample size per panel is 40,000 originally
eligible households. (Two interviews times two panels times 40,000 equals
160,000 interviews per year.)
Alternative Design B: Start a new panel every 2 years; run each panel
for 4 years and interview in 4-month waves, for a total of 12 interviews (3
per year). The sample size per panel is 26,650 originally eligible households.
(Three interviews times two panels times 26,650 equals approximately
160,000 interviews per year.)
Alternative Design C: Start a new panel every 2-1/2 years; run each
panel for 5 years and interview in 6-month waves, for a total of 10 interviews
(2 per year). The sample size per panel is 40,000 originally eligible
households. (Two interviews times two panels times 40,000 equals 160,000
interviews per year.)
Alternative Design D: Start a new panel every 3 years; run each panel
for 6 years and interview in 6-month waves, for a total of 12 interviews (2
per year). The sample size per panel is 40,000 originally eligible households.
(Two interviews times two panels times 40,000 equals 160,000 interviews
per year.)
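The fixed-budget arithmetic behind these five designs can be checked with a short Python sketch. It assumes the long-run average once a design is fully phased in: the number of panels in the field at once is the panel length divided by the interval between panel starts, and each panel is interviewed 12 divided by the reference period length times per year. The class name PanelDesign and its field names are illustrative; the figures are those given in the design descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelDesign:
    name: str
    months_between_panels: int    # interval between starts of new panels
    panel_length_months: int
    reference_period_months: int  # also the spacing between interviews
    households_per_panel: int

    @property
    def interviews_per_panel(self) -> int:
        """Number of waves each panel member is interviewed for."""
        return self.panel_length_months // self.reference_period_months

    @property
    def annual_interviews(self) -> float:
        """Long-run average household interviews fielded per year.

        households * (12 / reference period) * (panel length / start interval),
        ignoring attrition and ineligible first-wave addresses.
        """
        return (self.households_per_panel * self.panel_length_months * 12
                / (self.reference_period_months * self.months_between_panels))


DESIGNS = [
    PanelDesign("Current", 12, 32, 4, 20_000),
    PanelDesign("A", 24, 48, 6, 40_000),
    PanelDesign("B", 24, 48, 4, 26_650),
    PanelDesign("C", 30, 60, 6, 40_000),
    PanelDesign("D", 36, 72, 6, 40_000),
]

if __name__ == "__main__":
    for d in DESIGNS:
        print(f"{d.name:8s} waves={d.interviews_per_panel:2d} "
              f"interviews/year={d.annual_interviews:,.0f}")
```

For the current design the formula gives 32/12 panels in the field on average, which matches the count of three, three, and two interviews per year for the three overlapping panels. Design B comes out at 159,900, slightly under the 160,000 ceiling.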
We initially considered another very different design that strives to
reconcile the widely voiced desire for larger sample size with the view that
cross-sectional uses require short reference periods and frequently refreshed
samples (Doyle, 1992). In brief, this scheme would encompass two related
kinds of surveys: (1) large, annual cross-section surveys, designed to ob-
tain highly robust information for January of each year, and (2) small 2-year
panels, introduced annually in midyear as subsets of the cross-sectional
samples and designed to provide monthly information from six 4-month
waves for limited analysis of program dynamics.
More precisely, this design would do the following: start a new panel
every year; field a large initial cross-section and interview once with a 1-
month reference period; then, 6 months later (to allow time to draw the
subsample), continue a subsample for 2 years, interviewing in 4-month waves,
for a total of six interviews (three per year). The cross-section sample size
is 55,000 eligible households and each panel subsample includes 17,500
originally eligible households. (55,000 plus three interviews times two
panels times 17,500 equals 160,000 interviews per year.) To make the
relatively small panels more useful for certain kinds of analysis, Doyle
(1992) proposes to oversample a particular target group in each panel: for
example, oversample low-income people in one panel and higher income
people the next.
We early on determined that the costs of the Doyle design were likely
to outweigh its possible benefits. As a practical concern, the Census Bureau
would have to gear up each year for a very large cross-sectional survey and
then scale down its operations to handle the much smaller panels. More-
over, the cross-sectional survey component would provide estimates only
for the month of January, while the panel survey component would provide
longitudinal data only for 2 years for small samples.7 This design also
introduces new panels on an annual basis, a feature that we argue below is
a major complication for SIPP data processing and use under the current
design.
Our discussion in the next section considers the likely effects that de-
signs A-D would have on the quality and utility of SIPP data in comparison
with the current design. Each design makes tradeoffs within a fixed field
budget. For example, design A increases the sample size and overall length
of each panel in comparison with the current design, but lengthens the
reference period and reduces the frequency with which new panels are in-
troduced. Designs C and D have 6-month reference periods like design A,
but further lengthen each panel and reduce the frequency with which new
panels are introduced. Design B retains the 4-month reference period of the
current design, but provides fewer additional sample cases than the other
designs. Our challenge was to assess the implications of these design choices
for the "bottom line": the ability of SIPP to provide high-quality, relevant
data for research and policy analysis related to income and program partici-
pation.
In considering alternative choices of panel length and number of inter-
views, we focused on the implications for errors in panel survey estimates
due to the following factors:
· attrition, or the cumulative loss from the sample over time of people
who cannot be located or no longer want to participate, which can bias
survey estimates and also reduce the sample size available for analysis;
· time-in-sample effects, or changes in respondents' behavior or reporting
of their behavior due to their continued participation in the survey;
and
· censoring of spells of program participation, poverty, and other behaviors,
that is, the failure to observe the beginning and ending dates of all
spells within the time span covered by the panel. (We also considered the
implications of panel length for analysis of transitions and spells more generally.)
7 The proposed solution to the problem of small sample sizes, namely, to oversample different
groups each year, would complicate the design and use of the survey.
In considering the choice of length of reference period, we focused on
two kinds of errors:
· respondents' faulty recall, which is usually assumed to get worse as
the period about which the respondent is queried is farther away;
· a related phenomenon known as the "seam" problem, whereby more
changes (e.g., transitions in program participation or employment or changes
in benefit amounts) are reported between months that span two interviews
(e.g., the last month covered by wave 1 and the first month covered by
wave 2) than are reported between months that lie entirely within the refer-
ence period of one interview.
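The seam problem described in the second item can be made concrete with a minimal Python sketch that separates month-to-month changes falling on a seam from those falling within a single interview's reference period. The function name count_transitions and the sample status history are hypothetical; the sketch assumes one person's monthly reports laid out in calendar order, with each consecutive block of months coming from one interview.

```python
def count_transitions(statuses, months_per_wave):
    """Count reported status changes at seams and within waves.

    statuses: one person's reported monthly status in calendar order;
    months 0..months_per_wave-1 come from wave 1, the next block from
    wave 2, and so on.
    Returns (seam_changes, within_wave_changes).
    """
    seam = within = 0
    for i in range(1, len(statuses)):
        if statuses[i] != statuses[i - 1]:
            # Month i starts a new wave when i is a multiple of the wave length,
            # so a change between months i-1 and i spans two interviews.
            if i % months_per_wave == 0:
                seam += 1
            else:
                within += 1
    return seam, within


# Hypothetical 12-month history of program participation from three
# 4-month interviews.
HISTORY = ["on"] * 4 + ["off"] * 4 + ["off", "off", "on", "on"]

if __name__ == "__main__":
    print(count_transitions(HISTORY, months_per_wave=4))
```

When such counts are pooled over respondents, they need to be compared per month pair, since with 4-month waves only about one month pair in four spans a seam.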
In considering the choice of how often to introduce new panels, we
looked at the possible reductions in error for cross-sectional estimates
(reductions both in sampling error and in bias from attrition and time-in-sample
effects) afforded by the opportunity to use newer panels. We also
looked at the negative effects of more frequent panels, one of which is a
reduction in sample size available for longitudinal analysis of single panels.
Negative effects can also stem from what we term the "complexity factor":
specifically, having multiple panels in progress at the same time can in-
crease the burden on interviewers and data processing operations, which, in
turn, can introduce errors and reduce timeliness of data products. A com-
plex design can also affect the costs to users of accessing and analyzing the
data. Finally, given the importance of sample size to users, we considered
the implications of alternative sample sizes for cross-sectional and longitu-
dinal uses of the data.
We attempted, whenever possible, to quantify the relationships of the
various design dimensions to the various sources of error.8 Such quantification
is highly desirable for making informed choices among design alternatives.
natives. For example, in considering the optimum panel length and number
of interviews, it is not enough to note that attrition bias and time-in-sample
effects are assumed to worsen as a function of the number of interviews and
also, perhaps, of the overall length of the panel, and that censoring is re-
duced with an increase in panel length. One needs to know the relative size
of these effects and their implications for important uses of the data. Unfortunately,
the literature does not always provide clear guidance, and, ultimately,
we have relied on our professional judgments in recommending
design changes to SIPP.
8 Other sources of nonsampling error appear related primarily to questionnaire design and
data collection procedures and hence are not discussed here. They include undercoverage of
population groups in the survey (see Chapter 7), nonresponse to specific questionnaire items
that is not a function of length of recall, and reporting errors that are not a function of length of
recall. Jabine, King, and Petroni (1990) provide an excellent review of the literature on
sampling and nonsampling errors in SIPP. Other useful sources are Kalton, Kasprzyk, and
McMillen (1989); Lepkowski, Kalton, and Kasprzyk (1990); and Marquis and Moore (1989,
1990a).
Attrition
All household surveys are subject to unit nonresponse, that is, the failure to
locate or obtain the cooperation of some fraction of the eligible households
(or of individual members of otherwise cooperating households). Panel
surveys are also subject to wave nonresponse, or attrition, at each successive
interview.9
9 More precisely, total sample loss at each interview, or total wave nonresponse, includes
attrition per se, that is, nonresponse by households that are never brought back into the survey,
plus nonresponse of households that miss a wave but are successfully interviewed at the next
wave. (In SIPP, households that miss two interviews in a row are dropped from the survey.)
In addition, in every SIPP interview, there are "Type Z" nonrespondents, that is, individual
members of otherwise cooperating households for whom no information is obtained, either in
person or by proxy.

TABLE 4-1 Cumulative Household Noninterview and Sample Loss Rates,
1984-1988 and 1990 SIPP Panels (in percent)

        1984 Panel              1985 Panel              1986 Panel
        Noninterview            Noninterview            Noninterview
Wave    Type A  Type D    Loss  Type A  Type D    Loss  Type A  Type D    Loss
1          4.9       -     4.9     6.7       -     6.7     7.3       -     7.3
2          8.3     1.0     9.4     8.5     2.1    10.8    10.8     1.5    13.4
3         10.2     1.9    12.3    10.2     2.7    13.2    12.6     2.3    15.2
4         12.1     2.9    15.4    12.4     3.4    16.3    13.8     3.0    17.1
5         13.4     3.5    17.4    14.0     4.1    18.8    15.2     3.7    19.3
6         14.9     4.1    19.4    14.2     4.8    19.7    15.2     4.3    20.0
7         15.6     4.9    21.0    14.4     5.2    20.5    15.3     4.8    20.7
8         15.8     5.7    22.0    14.4     5.5    20.8
9         15.8     5.7    22.3

        1987 Panel              1988 Panel              1990 Panel
        Noninterview            Noninterview            Noninterview
Wave    Type A  Type D    Loss  Type A  Type D    Loss  Type A  Type D    Loss
1          6.7       -     6.7     7.5       -     7.5     7.1       -     7.1
2         11.1     1.5    12.6    11.4     1.5    13.1    10.9     1.5    12.6
3         11.5     2.6    14.2    12.0     2.3    14.7    11.5     2.5    14.4
4         12.3     3.3    15.9    13.0     3.0    16.5    12.6     3.3    16.5
5         13.7     4.1    18.1    13.9     3.3    17.8    13.7     4.5    18.9
6         13.6     4.9    18.9    13.6     4.0    18.3    14.1     5.2    20.1
7         13.6     4.9    19.0                            14.3     5.8    21.0
8                                                         N.A.    N.A.    N.A.

NOTES: Differences in rates for the 1984 panel in comparison with subsequent panels may be
due in part to differences in the sample design. Rates are not shown for the 1989 panel
because it lasted only 3 waves.
Type A noninterviews consist of households occupied by persons eligible for interview and
for whom a questionnaire would have been filled if an interview had been obtained. Reasons
for Type A noninterviews include: no one at home in spite of repeated visits, temporarily
absent during the entire interview period, refusal, and unable to locate a sample unit. Type D
noninterviews consist of households of original sample persons who are living at an unknown
new address or at an address located more than 100 miles from a SIPP PSU and for whom a
telephone interview is not conducted.
The sample loss rate consists of cumulative noninterview rates adjusted for unobserved
growth in the noninterview units (created by splits).
a Rates for 1990 are for the nationally representative portion of the sample; they exclude the
households that were continued from the 1989 panel.
N.A., Not available.
SOURCE: Data from Jabine, King, and Petroni (1990:Table 5.1) and unpublished tables from
the Census Bureau.

Attrition reduces the number of cases available for analysis (including
the number available for longitudinal analysis over all or part of the time
span of a panel and the number available for cross-sectional analysis from
later interview waves) and thereby increases the sampling error or variance
of the estimates. More important, people who drop out may differ from
those who remain in the survey. To the extent that adjustments to the
weights for survey respondents do not compensate for these differences,
estimates from the survey may be biased.
Evidence on Attrition: To date, the wave nonresponse rates from SIPP
show a definite pattern (see Table 4-1). Total sample loss in the 1984-1988
and 1990 panels is highest at the first and second interviews: 5-8 percent
of eligible households at wave 1 and an additional 4-6 percent of eligible
households at wave 2. Thereafter, the additional loss is only 2-3 percent in
each of waves 3-5 and less than 1 percent in each subsequent wave.10 By
10 Indeed, looking closely at later panels in comparison with earlier ones, the numbers suggest
that SIPP interviewers are experiencing somewhat less success in obtaining responses
from households in waves 1 and 2 of later panels but better success in retaining cooperative
households for subsequent waves of later panels.
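The wave-by-wave pattern of additional loss can be derived from the cumulative sample loss rates in Table 4-1 with a short Python sketch. The values below are transcribed from the Loss columns of that table as read from the uncorrected text, so the assignment of the last rows to the 1987 and 1990 panels is an assumption; the names CUMULATIVE_LOSS and incremental_loss are illustrative.

```python
# Cumulative sample loss rates (percent) by wave, from the Loss columns
# of Table 4-1. Panels have different numbers of waves.
CUMULATIVE_LOSS = {
    1984: [4.9, 9.4, 12.3, 15.4, 17.4, 19.4, 21.0, 22.0, 22.3],
    1985: [6.7, 10.8, 13.2, 16.3, 18.8, 19.7, 20.5, 20.8],
    1986: [7.3, 13.4, 15.2, 17.1, 19.3, 20.0, 20.7],
    1987: [6.7, 12.6, 14.2, 15.9, 18.1, 18.9, 19.0],
    1988: [7.5, 13.1, 14.7, 16.5, 17.8, 18.3],
    1990: [7.1, 12.6, 14.4, 16.5, 18.9, 20.1, 21.0],
}


def incremental_loss(cumulative):
    """Convert cumulative loss rates into the additional loss at each wave."""
    increments = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        # Round to one decimal place to match the precision of the table.
        increments.append(round(current - previous, 1))
    return increments


if __name__ == "__main__":
    for panel, rates in CUMULATIVE_LOSS.items():
        print(panel, incremental_loss(rates))
```

The increments are simple differences of rounded percentages, so they are approximate; they are meant only to make the front-loaded shape of the loss easy to see.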
The first stage in the sampling process for SIPP (as for the March CPS
and other household surveys conducted by the Census Bureau) is to use
decennial census data to divide the entire United States into primary sam-
pling units (PSUs) of larger counties and independent cities and groups of
smaller counties. The larger PSUs are then selected with certainty for the
sample; smaller PSUs are grouped into strata and subsampled (174 PSUs
were selected for the 1984 SIPP panel and 230 for subsequent panels). The
final stages in the sampling process are to obtain addresses in each sampled
PSU and select clusters of two to four households for interviewing. The
addresses represent a combination of decennial census addresses and ad-
dresses that are obtained through field canvasses. The latter include ad-
dresses in areas of new housing construction and in areas for which the
census address list was incomplete. The 1970 and 1980 censuses formed
the basis of the sample design and selection of census addresses for the
1984 and 1985-1994 SIPP panels, respectively (see Jabine, King, and Petroni
[1990:Ch. 3] for additional information).
The Census Bureau is currently developing a new sample design for
SIPP, based on the 1990 census, that will be implemented beginning with the
1995 panel. The necessary research has been completed to identify and select
the PSUs, and work is proceeding on other aspects of implementation.
A new feature of the design will be a provision to oversample low-income
households (see Singh, 1991). This change is at the behest of SIPP
users. In 1988-1989, the Census Bureau held several meetings with data
users who were concerned about the effects of sample size reductions in
SIPP due to budget cuts. Users expressed an interest in a larger sample size
for a number of subgroups, including (in priority order) low-income people,
the elderly, blacks, Hispanics, and the disabled. Several options for oversampling
were discussed.
Given budget constraints, it became apparent that it would be extremely
difficult to implement an oversampling scheme in SIPP prior to the 1995
redesign. To help users in the meantime, the Census Bureau decided to
curtail the 1988 and 1989 panels (to six and three waves, respectively), in
order to have funds to field a larger sample for the 1990 panel, including a
supplemental sample that was continued from the 1989 panel (see section
above on the current SIPP design).
We generally support the goal of oversampling low-income groups in
SIPP, which accords with the survey's focus on people who are economically
at risk. However, we believe that the Census Bureau's scheme for the
1995 redesign (see below) is not likely to be as effective as it is projected to
be in achieving this goal. We present several alternative means of oversampling
that we believe the Census Bureau should explore.
Using the 1990 Census for Oversampling in SIPP
In planning for the 1995 sample redesign, Census Bureau staff conducted
research on methods for obtaining a larger sample for the low-income popu-
lation, defined as households with annual income below 150 percent of the
poverty threshold. The research also investigated ways to minimize the
increase in the variance of estimates for people aged 55 and older that
would be expected to result from oversampling the poor and near-poor
(given the lower poverty rate for older than for younger people).
The Census Bureau decided to adopt a methodology from Waksberg
(1973), which creates two strata within each PSU. The first stratum has a
high concentration of the group of interest and is oversampled relative to
the second stratum, which has a low concentration of the group of interest.
For the SIPP redesign, the 1990 census address list within each PSU will be
divided into strata of low-income and higher income households. For households
in the 1990 census list that answered the long-form questionnaire (about
one-sixth of the total), the determination of income above or below 150
percent of poverty will be made directly. For households that answered the
short form, proxy characteristics will be used to make the classification:
specifically, the low-income stratum will include female-headed households
with children under 18; low-rent households in central cities of metropoli-
tan statistical areas; black and Hispanic households in central cities; and
black and Hispanic households in which the head is under age 18 or over
age 64. For those blocks of the PSU for which there is no complete census
address list (the area frame portion of the sample), the classification will be
made using aggregate census information on the proportion of the population
below 150 percent of poverty in each block. The low-income and
higher income strata in each PSU will be sampled at higher and lower rates,
respectively, so that an oversample of households in the low-income strata is obtained.
The extent of oversampling will be restricted by the requirement that the
sampling error of estimates for persons aged 55 and older not increase by
more than 5 percent.
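
The two-stratum idea can be illustrated with a small calculation. The Python sketch below is not the Census Bureau's procedure: the stratum sizes, low-income concentrations, and sampling rates are hypothetical values chosen only to show how sampling the high-concentration stratum at a higher rate raises the expected number of low-income households for a fixed total sample, while design weights (inverse sampling rates) still reproduce the frame total.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    households: int          # addresses on the frame (hypothetical)
    low_income_share: float  # share truly below 150% of poverty (hypothetical)
    sampling_rate: float     # selection probability (hypothetical)

    @property
    def expected_sample(self) -> float:
        return self.households * self.sampling_rate

    @property
    def expected_low_income(self) -> float:
        return self.expected_sample * self.low_income_share

    @property
    def weight(self) -> float:
        # Design weight: inverse of the selection probability.
        return 1.0 / self.sampling_rate


def summarize(strata):
    """Return (expected sample size, expected low-income cases, weighted frame total)."""
    n = sum(s.expected_sample for s in strata)
    low = sum(s.expected_low_income for s in strata)
    total = sum(s.expected_sample * s.weight for s in strata)
    return n, low, total


# Same frame, same total sample size, two allocations.
uniform = [
    Stratum("low-income", 2000, 0.60, 0.10),
    Stratum("higher income", 8000, 0.10, 0.10),
]
oversampled = [
    Stratum("low-income", 2000, 0.60, 0.20),
    Stratum("higher income", 8000, 0.10, 0.075),
]

uniform_summary = summarize(uniform)
oversampled_summary = summarize(oversampled)
print(uniform_summary)
print(oversampled_summary)
```

In this illustration both allocations yield about 1,000 sampled households, but the oversampled design is expected to contain roughly 300 low-income households rather than 200. The price is unequal weights, which is why the text constrains the loss of precision for other groups such as persons aged 55 and older.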
When the new sample design is introduced in 1995, it is expected that
the census address portion of the sample will constitute about 70 percent of
the total and the area frame portion about 20 percent. The remaining 10
percent will represent addresses of new construction, for which no oversampling
will be performed; obviously, over the course of a 10-year period, this
category will grow as a proportion of the total. Moreover, one can confi-
dently expect that the efficiency of the design will decline from what would
have obtained in 1990 because of the mobility of the population: for example,
by 1995, 1998, or 2003, a low-income household may occupy a
sample address that was drawn from the higher income stratum and vice
versa. Also, even when the same household is present in 1995 or 1998 as in
1990, the household may have changed classification from low to higher
income or vice versa. The question is how great a deterioration in the
efficiency of the design will occur over time.
The Census Bureau conducted research with data from the 1980 census
for 27 PSUs to determine the extent of the gains that could be expected
from oversampling low-income households in the 1990 census address por-
tion of the sample, assuming that the design was implemented immediately
after the census. The results (Singh, 1991:Table 1) showed gains (i.e.,
decreases in sampling error) for many subgroups of interest to users, such
as poor blacks and Hispanics. The Census Bureau also conducted research
with data from the American Housing Survey (AHS) on the effects of time
on the efficiency of the design and estimated very little increase in sampling
error 5 to 15 years after the census date (Singh, 1991:Table 3). The Bureau
estimated somewhat higher but still relatively low increases in sampling
error due to uniform sampling in the new construction frame and the assump-
tion that stratification will be less efficient at the block level in the area frame
compared with the census address frame (Singh, 1991:Table 4).27
Although this research appears encouraging about the proposed oversampling
scheme, we remain skeptical. There were many limitations to the research,
such as the use of only 27 PSUs from just a few states in the 1980 census
analysis and the inability of the analysts to replicate fully the proposed
design with the AHS data (households were classified on the basis of proxy
characteristics rather than on the basis of their income-to-poverty ratio).
We believe that further research on the extent to which the household pov-
erty classification assigned to an address in the census predicts the poverty
classification of the household at that address 5 to 15 years later is needed
to support the Census Bureau's proposed oversampling scheme. For ex-
ample, it could be useful to conduct research on the extent to which the
household poverty classification of addresses in the 1985 SIPP panel corre-
sponds to the 1980 census classification.28
There is also no opportunity to change any aspect of the design because
the Census Bureau plans to draw 10 years' worth of sample for SIPP (and
other household surveys) at the same time. Hence, the samples for all of
27Chu et al. (1989:2.9-2.11) found that oversampling geographic areas with relatively high
percentages of low-income households was not very successful in reducing the sampling errors
for estimates of the poverty population in the National Health and Nutrition Examination
Surveys. They attributed this outcome to the fact that many poor people live in nonpoverty
areas and vice versa.
28We understand that a match of 1985 SIPP and 1980 census address lists is not likely to be
operationally feasible, and we strongly urge the Bureau to take steps to ensure that it will be
possible to perform a match of 1995 SIPP and 1990 census lists.
the panels from 1995 to 2005 will be drawn in the same way, using the
same characteristics to determine the two strata within each PSU. The only
exception is that provision has been made to jettison the oversampling and
implement a uniform sampling rate for any SIPP panel in the 1995-2005
period if that is later viewed as desirable.
Even assuming the benefits of the proposed scheme, we believe that
there are some technical ways in which it could be improved. For example,
if the object is to oversample low-income households in SIPP, then the
census address portion of the sample could be drawn exclusively from the
long-form respondents to the 1990 census, which represent a very large
fraction (1 in 6) of the total population. Selection of PSUs on the basis of
poverty-related characteristics could also be beneficial. We are pleased that
the Census Bureau decided to adopt the same oversampling rate across all
PSUs, instead of determining PSU-specific rates as in the original plan that
we reviewed. The latter procedure would have allowed the Census Bureau
to better control the size of the workload across PSUs, but it would have
resulted in variations in the weights for addresses sampled within each
stratum (low-income or higher income) across PSUs. (Such weight variations
are likely to reduce the sampling error gains.) Also, those PSUs with
the highest percentages of low-income households would have had propor-
tionately less oversampling of the low-income stratum compared with wealthier
PSUs.
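
The effect of weight variation on precision can be sketched with Kish's widely used approximation, under which unequal weights inflate the variance of an estimate by roughly 1 plus the squared coefficient of variation of the weights. The Python sketch below uses hypothetical weights; it is an illustration of the general point, not an analysis of the SIPP design.

```python
def kish_weight_deff(weights):
    """Approximate variance inflation from unequal weights: n * sum(w^2) / (sum w)^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size deflated by the weighting design effect."""
    return len(weights) / kish_weight_deff(weights)


# Hypothetical low-income stratum of 100 sampled addresses.
# Same rate in every PSU: all weights are equal.
same_rate_weights = [10.0] * 100
# PSU-specific rates: half the addresses carry weight 5, half carry weight 15.
psu_specific_weights = [5.0] * 50 + [15.0] * 50

deff_same = kish_weight_deff(same_rate_weights)
deff_varied = kish_weight_deff(psu_specific_weights)
n_eff_varied = effective_sample_size(psu_specific_weights)
print(deff_same, deff_varied, n_eff_varied)
```

With these illustrative weights, the 100 addresses sampled at PSU-specific rates carry about as much information as 80 addresses sampled at a common rate.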
More broadly, we urge the Census Bureau (and the user community) to
be clear about the target population in considering the use of oversampling
in SIPP. For the redesign, the Census Bureau is essentially defining a
cohort of low-income people on the basis of their previous year's household
income-to-poverty ratio. However, many people with low incomes at wave
1 will move into a higher income category over the life of a SIPP panel and
vice versa (see Short and Littman, 1990; Short and Shea, 1991). Instead of
a larger sample for a low-income cohort at the start of a panel, it may be
that users would prefer to have a larger sample for people who are at risk of
experiencing a spell of low income at any time during a panel or at risk of
experiencing a long spell of low income. Different oversampling criteria
would be required, depending on the definition of the target population: for
example, a combination of variables, such as family type, ethnicity, and
previous year's low-income status, may be a better predictor of long-term
economic disadvantage than the latter variable alone.
Screening as Another Method of Oversampling
An alternative method for oversampling the low-income population in SIPP
is to use a screening interview close to the time when a new panel is to be
introduced. This approach could be used to refine the proposed 1990 census-based
approach (if larger-than-needed samples were drawn from the
census list) or serve as a substitute for it.
The advantage of screening is that it provides information on which to
draw a sample that is close in time to the introduction of the survey and
thereby is likely to permit more effective oversampling since much less
mobility or change in classification will have occurred in the interim. Also,
screening offers flexibility: the criteria for sampling can be changed as
needed (e.g., some panels could oversample minorities instead of low-income
households). In addition, screening can be applied uniformly to the
entire sample, instead of using different procedures for the census long-form
respondents, census short-form respondents, area frame address, and
new construction address segments of the sample. In the context of oversampling
low-income households while not worsening the estimates for older people,
screening should make it possible to develop a more efficient approach to
this problem (e.g., also oversampling elderly higher income households).
On the negative side, screening imposes the costs of conducting an
interview for a larger number of households than will be selected for the
survey, which may necessitate a reduction in the overall sample size. It
may also add costs by lessening the ability of the Census Bureau to equalize
interviewers' workloads across PSUs (e.g., the screening might result in
sample sizes that overtax some interviewers while underemploying others,
with little time to make adjustments before the start of the survey).
However, these costs must be viewed in the context of the entire sur-
vey, which, in the case of SIPP, will amount to 12 interviews under the
proposed redesign. There are also ways to reduce costs. It may be possible
to conduct much of the screening using a centralized CATI system that
eliminates interviewer travel costs.29 Another way to reduce costs is to
treat the screening interview as wave 1 of a SIPP panel instead of as an
added interview. In a CAPI environment, the sampling criteria could be
built into the interview so that the full wave 1 interview could be adminis-
tered on the spot to those households selected for the sample.
Another problem with screening, when the purpose is to identify house-
holds on the basis of income or poverty, concerns measurement error. Screening
interviews are typically short in order to reduce costs. However, studies
have shown that respondents tend to underestimate their income in response
to brief, general questions (e.g., Chu et al., 1989; Moeller and Mathiowetz,
1990), so that a short screening questionnaire may erroneously classify higher
income households as low income, thereby reducing the gains from the
oversampling. In addition, some low-income households may be falsely
29Telephone numbers would be obtained from directories that are organized by address.
Some personal screening would also be required for addresses with no telephones or with an
unlisted number.
classified as higher income based on their responses to the screening ques-
tionnaire. Such households will thus not be oversampled. There is also the
issue, noted above, of defining the target population (for example, people
at risk of a long-term spell of low income during a SIPP panel rather than a
low-income cohort) and defining appropriate variables to use in the screening
questionnaire. Despite the problems of a screening approach, we believe
that the potential benefits in terms of a more efficient sample design
and greater flexibility merit a careful examination of its cost-effectiveness
for oversampling the low-income population in SIPP.
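
The effect of screening measurement error on the gains from oversampling can be sketched as follows. The function and all rates below are hypothetical and are meant only to show the mechanism: false positives (higher income households screened as low income) dilute the stratum that is oversampled, and false negatives (low-income households screened as higher income) are left out of it.

```python
def screening_outcome(true_low_share, sensitivity, false_positive_rate):
    """Summarize a screener that sorts households into 'low' and 'higher' income.

    true_low_share:       share of households that are truly low income
    sensitivity:          share of truly low-income households screened as low
    false_positive_rate:  share of higher income households screened as low
    Returns (share screened as low, share of that group truly low income,
             share of truly low-income households missed by the screener).
    """
    true_positives = true_low_share * sensitivity
    false_positives = (1.0 - true_low_share) * false_positive_rate
    screened_low = true_positives + false_positives
    purity = true_positives / screened_low
    missed = 1.0 - sensitivity
    return screened_low, purity, missed


# Hypothetical error rates for a short screening questionnaire.
short_screener = screening_outcome(0.20, 0.80, 0.10)
# A perfect screener, for comparison.
perfect_screener = screening_outcome(0.20, 1.00, 0.00)
print(short_screener)
print(perfect_screener)
```

With these illustrative rates, only about two-thirds of the households in the screened low-income stratum are truly low income, and one-fifth of truly low-income households are never eligible for oversampling.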
Recommendation 4-4: The Census Bureau should investigate
alternative methods of oversampling the low-income population
in SIPP, including the use of screening interviews as a possible
complement to or substitute for an approach based on using
information from the 1990 census.
Increasing Sample Size by Extending Panel Length
The Census Bureau's census-based plan and the use of screening do not
exhaust the possible approaches for oversampling low-income households
or other subgroups in SIPP. Another possibility is to extend the length of
one or more SIPP panels for subgroups of interest. This strategy both
provides additional longitudinal information for the subsampled cases and
makes it possible to treat them as an addition to the sample for the next
panel (see David, 1985a). This approach was followed in the 1990 panel,
for which the sample includes households from the 1989 panel that were
headed by blacks, Hispanics, and female single parents as of wave 1 of that
panel.
Users have often expressed interest in periodically extending the length
of SIPP panels for people who may be at economic risk because of experi-
encing a divorce or job loss or for people who benefit from programs or
have certain demographic characteristics (e.g., single parents). We are not
now recommending that such an approach be built into SIPP because we
believe that the Census Bureau confronts a very large agenda in implementing
the proposed redesign of 4-year panels introduced every 2 years together
with computer-assisted interviewing and an improved database management
system (see Chapter 5). However, we do believe that the concept
has merit and should be an option for the future. Hence, we urge the
Census Bureau to take the steps that are necessary to permit this and related
options to be considered for SIPP at some future date. (A related option,
which would add longitudinal information although not necessarily increase
the sample size for the next SIPP panel, would be to return at annual or
longer intervals to selected cases.)
One such step involves informed consent. Respondents need to be
informed at the outset that there is a possibility that they may be asked to
answer further questions after the closeout of their SIPP panel. If such
consent is not sought, then, under current views about the obligations of
statistical agencies to their respondents, it would probably not be possible
to later impose this additional burden on them.
Another step involves setting in place procedures for tracking respon-
dents after the end of a panel. Particularly if it appears desirable to revisit
the subsampled groups at less frequent intervals (e.g., yearly), the Census
Bureau will need to have good procedures developed to keep in touch with
them so as to minimize sample loss.
Recommendation 4-5: The Census Bureau should take steps to
ensure that it will be possible to extend the length of SIPP pan-
els for selected subgroups of interest or to follow them up at a
later date, should such options be desired to obtain increased
sample size and longitudinal information.
Multiple-Frame Samples
Yet another way to obtain an additional sample for subgroups of interest in
SIPP is to develop multiple-frame samples, that is, samples of households
together with cases that are drawn from one or more types of administrative
records (for example, program records, tax records, or employer records).30
Augmenting a household sample with cases from administrative records can
offer considerable benefits. First of all, such a strategy may be a very
efficient means of oversampling such subgroups as program recipients. Also,
providing that confidentiality and data access issues are resolved, additional
records data could be obtained for the administrative cases, not only con-
current with but also preceding and following the time span of the survey
interviews. Analysis of the relationships of the records data and survey
responses for the administrative cases could serve a number of useful pur-
poses. For example, in a multiple-frame sample of program recipients, the
records information could provide the basis for imputing program character-
istics to recipients in the household sample for use in improved policy
models for program analysis and simulation of program changes.
The drawback to the multiple-frame approach for increasing sample
size and information for subgroups of interest is that a number of problems
impede its ready implementation. Many of the problems are operational in
30A sample of households together with cases drawn from one source of records is termed a
dual-frame sample.
nature.31 Permission for the records must be obtained, which can be time-consuming
and difficult to achieve. In the case of programs that are state
administered (e.g., AFDC and food stamps), there are differences across
states in access rules and in the extent to which the records are appropri-
ately computerized so that access is operationally feasible. At this time,
only a small number of states have good computerized records for such
programs as AFDC; hence, it would not be possible to develop a national
multiple-frame sample for these programs. Also, the Census Bureau itself
may not be able to have access to an entire administrative file for purposes
of sample selection, in which case it would have to rely on the responsible
agency's ability to properly implement a specified sampling procedure. Fi-
nally, the addresses in the sampled records may not be current, in which
case a tracing operation, with likely problems of its own, would be necessary
(see Logan, Kasprzyk, and Cavanaugh, 1988).
A multiple-frame approach poses technical difficulties as well, includ-
ing the determination of appropriate weights for the combined sample. Tak-
ing the simple case of a dual-frame sample, some fraction of the household
sample will have a probability of selection in the sample drawn from records,
and all members of the records sample will have a probability of selection
for the household sample. Consequently, it is necessary to develop weight-
ing adjustments to compensate for dual selection probabilities, and this
requires identifying those members of the household sample who are in-
cluded in the administrative frame. One way to identify these members is
to rely on respondents' reports of their status at the time of drawing the
administrative sample. (For example, in the case of a dual-frame sample
including SSI cases drawn the August before the start of a SIPP panel, the
questionnaire would ask about receipt of SSI in the preceding August.) A
possibly more reliable approach is to match the household sample members
with the administrative frame. However, this procedure adds a step to the
data processing that could cause delays in the release of data files and
reports. (The 1979 ISDP multiple-frame sample initiative came to grief on
this very point: usable, fully weighted data files were not completed before
the program ended; see Kasprzyk [1983].)
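
The weighting issue for a dual-frame sample can be sketched as follows. This is one textbook-style approach, not the method used for the ISDP or proposed for SIPP: it assumes that the household and records samples are drawn independently, that a unit selected in both is counted once, and that each unit's records-frame status is known. The selection probabilities are hypothetical.

```python
def combined_inclusion_probability(p_household, p_records):
    """Probability of selection into at least one of two independent samples."""
    return p_household + p_records - p_household * p_records


def dual_frame_weight(p_household, on_records_frame, p_records):
    """Weight for a sampled unit: inverse of its overall inclusion probability.

    Units not on the records frame can be selected only through the
    household sample, so their records-frame probability is zero.
    """
    p_rec = p_records if on_records_frame else 0.0
    return 1.0 / combined_inclusion_probability(p_household, p_rec)


# Hypothetical rates: 1 in 1,000 households; 1 in 100 program records.
P_HOUSEHOLD = 0.001
P_RECORDS = 0.01

weight_nonrecipient = dual_frame_weight(P_HOUSEHOLD, False, P_RECORDS)
weight_recipient = dual_frame_weight(P_HOUSEHOLD, True, P_RECORDS)
print(weight_nonrecipient, weight_recipient)
```

In this illustration a program recipient should carry a weight of about 91 rather than 1,000, so a recipient in the household sample whose status is misreported or left unmatched would be overweighted by a factor of about 11. This is why correctly identifying records-frame membership matters.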
In addition, there are technical issues to resolve with regard to the type
of sample to draw from administrative records (assuming that the sample
can be properly selected). In the case of a program such as AFDC, one
needs to decide whether a cross-sectional sample or a sample of new entrants
is appropriate. A cross-sectional sample will overrepresent longer term
31Record-check studies, including forward record checks and full record checks, face many
of the same operational problems: see, for example, the discussion in Marquis and Moore
(1990b) of the difficulties in carrying out the SIPP record-check study, which obtained records
for eight federal and state programs.
recipients. If new entrants are sampled, the decision must be made as to the
time period or window during which cases are eligible for selection (a
month, year, or other period). Ideally, for program analysis, one would like
to sample people who are eligible for programs, not just participants, but
there is no record system available to do this.
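
The reason a cross-sectional sample overrepresents longer term recipients can be shown with a short calculation. The sketch below assumes a steady state in which the same mix of new entrants arrives every month, so that the number of cases on the rolls at any moment is proportional to the entrant share multiplied by the spell length. The spell lengths and entrant shares are hypothetical.

```python
def cross_section_shares(entrant_shares):
    """Convert the mix of new entrants into the mix of cases on the rolls.

    entrant_shares maps spell length in months to the share of new
    entrants with that spell length. Under steady-state entry, a spell
    of length L is present in L monthly cross-sections, so its share of
    the caseload is proportional to share * L.
    """
    stock = {length: share * length for length, share in entrant_shares.items()}
    total = sum(stock.values())
    return {length: amount / total for length, amount in stock.items()}


# Hypothetical entrant mix: half of new spells last 3 months,
# 30 percent last 12 months, and 20 percent last 48 months.
entrants = {3: 0.5, 12: 0.3, 48: 0.2}
caseload = cross_section_shares(entrants)
for length in sorted(caseload):
    print(length, round(caseload[length], 3))
```

In this illustration the 48-month spells make up one-fifth of entrants but roughly two-thirds of the cases on the rolls at a point in time, which is the pattern a cross-sectional sample would reflect.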
The usefulness of a multiple-frame or dual-frame sample for SIPP de-
pends very much on user interest in particular population groups, such as
program recipients. We suggest that a decision to adopt this means of
oversampling, particularly in light of the operational and technical difficul-
ties it would pose, should be contingent on the support and cooperation of
an interested agency. We encourage the Census Bureau staff to keep up to
date on the methodology of multiple-frame samples, so that SIPP can be
responsive to requests from agencies that want to obtain a larger sample
size and information for a particular population by adding a component to
the SIPP sample that is drawn from their records.32
FOLLOWING RULES
At present, SIPP follows original sample adults (that is, all people aged 15
and older who resided in an interviewed household in wave 1) for the life
of a panel or until they leave the universe or drop out of the survey. SIPP
also keeps track of original sample adults who enter institutions and inter-
views them if they return to a household setting.33 Adults who join the
household of an original sample adult after wave 1 are followed only so
long as they continue to reside with an original sample member. Similarly,
children, regardless of whether they were present in wave 1, are followed
only so long as they reside with an original sample adult. We believe that
the utility of SIPP for important research and policy concerns would be
enhanced by changing the following rules for two groups: all children, and
children and adults who enter institutions.
Many users expressed great interest in having more information with
which to analyze children's changing circumstances (see Chapter 2). Fewer
and fewer children have a stable family or economic situation throughout
their childhood, and more and more children are experiencing economic and
social distress. Extending the length of SIPP panels will make the survey
32The report of the ISDP special frames study (Logan, Kasprzyk, and Cavanaugh, 1988),
which was designed to test the feasibility of sampling and locating respondents from adminis-
trative records, provides pertinent information for consideration of multiple-frame samples in
SIPP; see also Kasprzyk (1983).
33Tracking the institutionalized was not originally intended for SIPP but was initiated in
May 1985 for the 1984 panel and in October 1985 for the 1985 panel (Jean and McArthur,
1987).
more useful for analyzing children's circumstances over time. However,
the current following rules preclude using the survey to obtain a complete
picture of children's family dynamics. For example, in wave 1 of the 1984
panel, 24 percent of children under 15 lived with only one parent or with
another relative (McArthur, Short, and Bianchi, 1986:Table 6). Any of
these children who subsequently moved in with an adult not part of the
original sample (e.g., the other parent or another relative) would not be
tracked under the current following rules. Similarly lost to follow-up would
be children who went to live with, say, a grandparent following a divorce,
or children born to a marriage between a sample and nonsample adult who
stayed with the nonsample parent following a marital separation.34 The
numbers of such events will increase with longer panels, but the current
following rules will preclude analysis of them. We urge the Census Bureau
to treat all children present in interviewed households at wave 1 together
with children born subsequently to original sample mothers as original sample
members who are followed throughout the life of a panel. When original
sample children move into a household of nonsample members, information
would be obtained about them and about other members of their new house-
hold.
There is also user interest in learning about both children and adults
who become institutionalized. SIPP is not the appropriate vehicle to pro-
vide information about the institutionalized population as such, but because
the survey follows people over time, it naturally provides a sample of en-
trants to all types of institutions (e.g., mental health treatment facilities,
nursing homes, prisons). Extending the current practice of following origi-
nal sample members who enter institutions to include children and, in addi-
tion, obtaining at least some limited information for them would enhance
the usefulness of SIPP for analysis of socioeconomic well-being in the
United States. For example, for fuller analysis of some government programs,
it is important to include institutionalized people who can still receive
benefits under such programs as social security and SSI.35 It would
also be useful to know about other sources of income for institutionalized
people, such as private pensions, asset income, and transfers from relatives.
34Of children present in all 8 waves of the 1984 SIPP panel who lived with both parents in
wave 1, 7 percent had experienced a change in the marital status of their parents by the end of
the panel (Bianchi and McArthur, 1991:Table A). This is likely a lower bound estimate of
children at risk of a marital separation to the extent that the weights do not adequately adjust
for the higher attrition rates of children in nonintact families (McArthur, Short, and Bianchi,
1986:Table 6).
35Coder (1988:Table 9) found that 4 percent of SSI recipients in the first month of the 1984
SIPP panel had entered an institution by the end of the panel, as was also true for 3 percent of
recipients of social security and veterans' payments and 1 percent of private pension recipients.
These data would be useful for analysis in their own right and also in
conjunction with the data for the people in the household they left.
We do not offer detailed suggestions about the kinds of data to collect
for original sample members who enter institutions during the course of a
SIPP panel, nor even about the preferred data collection mode (e.g., some
items might best be obtained from the institution and others from the family).
However, we do urge the Census Bureau to investigate the needs of users for
information about institutionalized persons that fall within SIPP's goal of providing
improved data on economic resources and assistance programs, particularly
for people and families who may be economically at risk.
Recommendation 4-6: SIPP panels should treat all children who
reside in interviewed households at the first wave and also chil-
dren born during the course of a panel to original sample moth-
ers as original sample members, who are followed if they move
into households without an original sample adult. SIPP panels
should also continue to follow and collect data for both original
sample adults and children if they move into institutions.
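
The difference between the current following rules and those recommended for children can be summarized in a short sketch. The class and function names are hypothetical, and the logic is deliberately simplified: it ignores attrition, departure from the survey universe, and institutionalization, and it treats residence with an original sample adult as a single yes-or-no flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Age at wave 1, or None if the person was not in an interviewed
    # wave 1 household (a later joiner or a child born during the panel).
    age_at_wave1: Optional[int]
    lives_with_original_sample_adult: bool
    born_to_original_sample_mother: bool = False


def followed_current(p: Person) -> bool:
    """Current rules: only wave 1 residents aged 15 and older are followed in their own right."""
    if p.age_at_wave1 is not None and p.age_at_wave1 >= 15:
        return True
    return p.lives_with_original_sample_adult


def followed_proposed(p: Person) -> bool:
    """Recommended rules: wave 1 children and children later born to
    original sample mothers are also original sample members."""
    if p.age_at_wave1 is not None or p.born_to_original_sample_mother:
        return True
    return p.lives_with_original_sample_adult


# A child aged 8 at wave 1 who moves in with a nonsample parent.
child_moved = Person(age_at_wave1=8, lives_with_original_sample_adult=False)
# A child born during the panel who stays with the nonsample parent.
newborn_moved = Person(None, False, born_to_original_sample_mother=True)
# An adult who joined after wave 1 and no longer lives with a sample adult.
joiner_left = Person(None, False)

print(followed_current(child_moved), followed_proposed(child_moved))
print(followed_current(newborn_moved), followed_proposed(newborn_moved))
print(followed_current(joiner_left), followed_proposed(joiner_left))
```

Under this simplified reading, the two children are lost to follow-up under the current rules but retained under the recommended rules, while the treatment of adults who join a household after wave 1 is unchanged.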