D
Sampling and Randomization:
Technical Questions about Evaluating CDC's
Three Major AIDS Prevention Programs
Following the release of the first edition of Evaluating AIDS Prevention
Programs, CDC program personnel met with the panel and raised a
number of questions about the report. This appendix deals with essentially
technical matters relating to the implementation of some of the report's
suggestions, in particular, questions about sampling and the random
assignment of treatment and control groups. Appendix E deals with the
evaluation of projects that are ancillary to, emerging, or related to those
discussed in Chapters 3 and 5.
The first section of this appendix treats the following technical issues
related to sampling: the number of case studies to be used in a process
evaluation of the counseling and testing program; the sample sizes
needed to evaluate the effectiveness of all three programs; suggestions
for controlling attrition; and the comparison of convenience samples and
probability samples. The second section addresses two aspects of using
randomized experiments to evaluate a project's effectiveness: successful
experiments in the AIDS prevention arena and the ethics of no-treatment
controls.
The panel's objective is to outline some of the general principles
involved with sampling and randomization as part of research design.
Because the panel's original task was one of developing overall evaluation
strategy rather than rendering detailed technical advice, we have been
reluctant to provide any kind of specificity on questions of sample size, the
use of convenience samples, and so on. The panel believes strongly that
technical advice of this sort is so context-driven that each set of evaluation
objectives warrants its own response. Such advice is best fashioned by
statistical and subject matter experts who can assess each evaluation
problem on its own terms. Thus, our foremost recommendation is that
CDC either develop the requisite in-house expertise among personnel
responsible for evaluation research or contract for expert services when
these types of questions arise. Nonetheless, we offer the following
general information in the hopes that it will prove to be useful.
SAMPLING ISSUES
Personnel from the National AIDS Information and Education Program
(NAIEP) and the Center for Prevention Services (CPS) raised questions
about the optimal number of case studies and sample sizes needed for
evaluating a project's effectiveness. Related to the issue of sample size
is the question of how to control attrition. In addition to addressing
these sampling issues, this section includes some thoughts about using
convenience samples when it is not possible to carry out probability
sampling.
Number of Case Studies
The purpose of conducting case studies of counseling and testing sites is
to identify the variables to be considered in evaluating how well services
are delivered: i.e., who is being served, do they complete the service
protocol, what are the barriers, and so on. The question is simple: How
many case studies need to be performed? The answer is complicated:
As many as it takes to identify the relevant variables and no more.
Unfortunately, it is impossible to predict how many case studies this will
entail. Moreover, no "optimal number" exists, and it is impossible to
recognize that a satisfactory number has been covered until that number
has been exceeded. In other words, when the researcher recognizes that
additional case studies are shedding no more significant light on the
understanding of service delivery, it no longer makes practical sense to
continue such field research.1
A good sampling scheme is important in making a correct decision.
The panel believes that a stratified sample of counseling and testing sites
is the best method for gathering case data on service delivery variables.
In Chapter 4, the panel suggested a 2 x 2 x 2 matrix (stratifying by
seroprevalence rates, activity, and target group) for case studies of a
sample of community-based organization (CBO) projects. This scheme
would require a minimum of 8 case studies.2 The panel believes that
the sample of case studies of counseling and testing projects should be
similarly laid out but will be larger because of the greater number of site
variations.
1 Obviously, if the goals of service delivery or the needs of evaluation research change, new case studies
will again become necessary.
A stratification scheme would probably best be planned by CDC
program personnel, who are familiar with key variables in the different
projects that the agency funds as well as the distribution of those variables.
Nonetheless, the panel suggests the following stratification variables:
· Type of facility, e.g., health department, family planning
clinic, drug treatment center, clinic for treatment of sexually
transmitted disease, and so on (already the matrix is larger
than that proposed for CBOs because of the diversity in
types of setting);
· Seroprevalence rates or number of AIDS cases (i.e., low,
middle, or high prevalence areas);
· Type of region (i.e., urban or rural).
Sites should be selected on the basis of the important service variables, not
simply because they are convenient or their staffs are cooperative. The
important dimensions should incorporate the diversity among counseling
and testing sites.
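
As a rough illustration of how stratification variables such as those listed above define a sampling matrix, the short Python sketch below enumerates the cells from which case study sites would be drawn. It is only a sketch under stated assumptions: the category labels reuse the examples given in the text, the four facility types are not an exhaustive list, and the helper name `stratification_cells` is hypothetical. The actual strata would be set by program personnel who know the distribution of these variables.

```python
from itertools import product

# Hypothetical category labels, taken from the examples named in the text.
FACILITY_TYPES = [
    "health department",
    "family planning clinic",
    "drug treatment center",
    "STD clinic",
]
PREVALENCE_LEVELS = ["low", "middle", "high"]
REGION_TYPES = ["urban", "rural"]


def stratification_cells(*dimensions):
    """Return every combination of one category from each dimension."""
    return list(product(*dimensions))


# Counseling and testing sites: 4 x 3 x 2 cells under these assumed labels.
cells = stratification_cells(FACILITY_TYPES, PREVALENCE_LEVELS, REGION_TYPES)
print(len(cells))

# The CBO scheme from Chapter 4: a 2 x 2 x 2 matrix.
cbo_cells = stratification_cells([0, 1], [0, 1], [0, 1])
print(len(cbo_cells))
```

With one case study per cell, the count of cells is an upper bound on the number of case studies implied by the matrix, since some cells may turn out to be empty.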
As case studies are conducted, program staff at CDC or the staff's
evaluation consultants will need to carefully assess information as it
is collected from site visits to determine when it is sufficient, so that
resources are not needlessly spent on gathering redundant information.
Staff will also need to keep abreast of organizational and goal-related
changes at the project level, so that the information does not become
outdated.
Estimating Sample Sizes
In the panel's experience, sample size calculations rarely overestimate
the number of units required and quite commonly underestimate them.
To determine sample size for use in an efficacy or effectiveness trial,
the investigator must first specify four factors. For the simplest possible
case,3 they are:
1. The kind of analysis that will be performed (e.g., t test,
regression, estimation of a proportion or difference in
proportions, etc.) and the statistical model that is assumed to
describe the data distribution.
2. The minimum effect the experimenter wishes to detect, if
indeed there is an effect. For instance, if an investigator's
outcome variable is a respondent's number of sexual
partners, the desired "effect size" might be a reduction by a
factor of 2. There are standard ways to express the effect
size, and they depend on the type of analysis that will be
performed. For example, for comparing two group means,
the effect size (ES) is usually defined as the difference
between the two population means ($\mu_1$, $\mu_2$), in units of
standard deviations ($\sigma$). The formula is:
$$ES = \frac{\mu_1 - \mu_2}{\sigma}$$
3. The "level of significance" ($\alpha$) at which the test will be
performed. The level of significance is the probability of
concluding there is an effect when none really exists.
Conventional values are .01 and .05.4
4. The desired "power" for the test. Power refers to the
probability that the minimum effect specified (or a larger effect)
will be detected, if it exists.
2 As mentioned in Chapter 4, some cells in the matrix may be empty, in which case the number of case
studies required would be smaller.
3 This explication presents a simplified overview that omits many complications that are not treated
here. Detailed treatments of this topic can be found in Lipsey (1990), Cohen (1988), and Kraemer and
Thiemann (1987).
These factors are sufficient to determine the needed sample sizes, but they
must be specified in light of their possible uses. Although the last three
factors may be specified by a sophisticated investigator, combining the
four factors to derive the implied sample size is best left to a statistician.
For example, a statistician might suggest careful research designs such as
blocking and matching to reduce the size of the standard deviation ($\sigma$),
which in turn can either increase the size of the effect one will detect or
reduce the size of the sample necessary for the study.
4 It should be recognized that besides merely testing the null hypothesis (that the difference between
groups is zero), one may have a particular interest in obtaining an estimate of the magnitude of the
treatment effect with a given degree of precision. The texts noted in the preceding footnote will provide
detailed guidance in this regard.
As intuition may suggest, the appropriate sample sizes increase as
smaller effect sizes or smaller $\alpha$'s are chosen and/or as the desired
power increases. Statistical texts can help investigators determine these
various factors and then use them to find the appropriate sample size,
but even these specialized texts and their reference tables require an
understanding of several statistical concepts. Moreover, tables do not
exist for comparisons more complicated than t tests and analyses of
variance. To resolve the matter, the panel advises that the investigator
specify the desired factors (effect size, $\alpha$, power level) and then consult
with a statistical expert to determine the necessary sample size.
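
To make the relationship among the factors concrete, the following Python sketch applies a common textbook normal approximation for the simplest case only: a two-sided comparison of two group means with equal group sizes and a common standard deviation. The approximation, n per group = 2(z_(1 - alpha/2) + z_power)^2 / ES^2, is not taken from this appendix, and the function name `n_per_group` is hypothetical. It is offered as an aid to intuition, not as a substitute for the statistical consultation advised above; exact calculations based on the t distribution give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate units needed in EACH group to compare two means.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / ES^2.
    Assumes a two-sided test, equal group sizes, and a common
    standard deviation (all simplifying assumptions).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Smaller effect sizes, smaller alpha, or higher power all raise n.
for es in (0.8, 0.5, 0.2):
    print(es, n_per_group(es))
print(n_per_group(0.5, alpha=0.01), n_per_group(0.5, power=0.90))
```

The loop shows the pattern described in the text: halving the effect size roughly quadruples the required sample, because the effect size enters the formula as a square.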
Controlling Attrition
In Chapter 6, the panel discussed attrition and a lack of compliance
as potentially important detractors from a project's effectiveness, and we
noted that such phenomena are useful endpoints to be studied because they
can reveal whether a project is too unattractive to retain its participants.
A project that cannot motivate its participants to comply or to stick with
the protocol cannot be practically effective.
In some cases, however, attrition occurs for reasons unrelated to
project attractiveness. The loss of data through attrition is a potentially
serious source of bias, with the level of the problem depending in part
on the amount of attrition and in part on other factors. Attrition becomes
more serious, for example, if the outcome variable is relatively
uncommon, if the treatment is causing experimental group members to drop
out, or if the lack of treatment is causing control group members to drop
out. Whether attrition is a problem depends on a given situation, so that
no "standard appropriate attrition rate" can be advised.
The panel wishes to suggest ways to help contain attrition. In Chapter
4, the panel mentioned two ways: confidentiality guarantees and some
form of compensation to respondents. We further discuss these and other
suggestions below.
Confidentiality Guarantees
Assurances of confidentiality, which should be fairly easy to guarantee
in any CDC study, typically have been found to decrease attrition and
item nonresponse (Singer, 1978; Panel on Privacy and Confidentiality
as Factors in Survey Response, 1979). Anonymity may be still more
successful in reducing nonresponse (Moore, Lessler, and Caspar, 1989),
but it obviously hinders follow-up. To meet this challenge, researchers
involved in CDC's community demonstration projects have tried a code
name system to track individuals. Under this system, media
announcements summon respondents by code groups for periodic reinterviews.
Return rates, however, have been modest, ranging from 50 to 80 percent
of project participants, perhaps because the media approach is not
sufficiently visible or persuasive. More intensive follow-up would require
names and other locator information. Tanur (1982) has proposed one
method for protecting the anonymity of respondents for whom this
information is known, i.e., making telephone follow-ups asking respondents
to anonymously call back the interviewer later.
Compensation
Under certain conditions, compensation has been shown to be an effective
method of curbing attrition. Ferber and Sudman (1974) reviewed the
effects of compensation on response rates in consumer surveys and found
a mix of outcomes. For example, some form of compensation (cash,
gifts) appeared to contribute to a higher response rate when the study
conditions were burdensome to participants (e.g., when they were asked
to participate in a longitudinal study, to keep records or diaries, or to
come in to a study site). In less burdensome settings (such as one-time
interviews with no written records), compensation was not particularly
helpful in increasing response. Moreover, compensation seemed to be
more effective among certain groups of respondents than others (e.g., it
was more effective among participants with lower incomes and education
than it was among middle class participants).
Cannell and Henson (1974) posit that survey participants must be
sufficiently motivated to provide information. When the purpose of
the study is perceived to be compatible with personal or social goals,
compensation may not be important; however, money might motivate
the respondent who feels that no other goal for participating exists.5 In
the AIDS prevention arena, few studies have been made on the impact
of compensation to complete a study's protocol, but the studies that are
available suggest that the goals or motivations of participants are indeed
important in recruitment and attrition. For example, Fox and Jones
(1989) found that participants in the Baltimore MACS study reported
that prizes were not a major reason for continuing to volunteer. In fact,
74 percent of recruits returned for follow-up after a single mailed notice
was sent (follow-up was even higher, 84 to 90 percent, after telephone
requests for return). Carballo-Dieguez and colleagues (1989) provide an
example in which compensation was counterproductive. They found that
a $10/hour payment appeared to be the major motivation for participation
in a 5-year study on HIV disease progression among intravenous drug
users and led some candidates to become what investigators characterized
as aggressive and manipulative to secure enrollment.
5 The most important source of motivation appeared to be the investigators' interactions with the
participants (Cannell and Henson, 1974).
In an interesting contrast, Davis, Faust, and Ordentlich (1984) turned
the concept of financial incentives on its head in a successful plan to
reduce dropout rates in a smoking cessation program. In a randomized
alternative treatments study, researchers obtained $20 deposits from
volunteers who received one of four different self-help packages; their deposits
were refunded only after five follow-up interviews were completed
(regardless of outcome). The follow-up rate was 95 percent. Implementing
this suggestion clearly would not be feasible in low-income community
or outreach projects; it might, however, be possible in other types of
projects, such as one that provides valuable resources like counseling
activities to middle class gay males.
Stabilization Funds
To retain respondents in an intervention study, some (CDC) personnel
suggested providing emergency "stabilization" funds to project
participants. Such funds go beyond a token compensation and, for some
participants, can mean the difference between leaving the study area or
staying to receive an intervention and participate in its evaluation. The
panel looked at this suggestion from two sides. On the one hand, we
saw that it may at times be necessary to alter the social environment
to provide the intervention. Conversely, however, if the purpose is to
evaluate the intervention, it is damaging to alter the environment because
it may contaminate the intervention with an "additional value" (here, the
emergency funds).
One method of incorporating such incentives into a research design
is to provide the additional value to the whole pool of program candidates
before assigning individuals to the experimental and control groups.6 In
this way, the samples are drawn from a homogeneous population. In fact,
this method could enhance the feasibility of randomization because an
investigator is likely to have a larger pool of willing study participants
once members of a community learn that a given evaluation will provide
them additional funds. However, the provision of the additional value
would change the evaluation from a study of effectiveness to a study of
efficacy.7
Cultivating and Tracking Respondents
Other ways that have helped to avoid attrition include: familiarizing
respondents at first contact with the importance and purpose of the study
and cultivating their participation; proxy reporting (less likely but possible
when sex and drug partners are recruited into a project); participant
screening (described in Chapter 6); and rigorous follow-up. The last is
strengthened when the researcher gathers all relevant information about
all respondents (experimental and control group members alike) at the
time of initial contact and, if multiple follow-ups are anticipated, every
time that respondents are recontacted. "All relevant information" is meant
to include locator information, characteristics of the target population, and
information about variables affecting the content of the treatment that are
related to the outcome.
6 Designs that provide incentives only to participants and not to controls confound the incentive with
the treatment. The results of such designs would be particularly difficult to interpret.
7 See Chapter 3 for a full discussion of "efficacy" versus "effectiveness."
Gathering such comprehensive information serves multiple purposes,
but it is especially important for tracking respondents. In addition to
routine information such as respondents' names and addresses, nontraditional
identifying data, such as alternative locator information, social security
number, and date of birth, should be collected. Follow-up is facilitated
when the researcher has the name and address of several persons who are
likely to know a respondent's whereabouts; alternative locator
information is especially useful in situations that involve mobile populations such
as intravenous drug users or prostitutes. Access to respondents' social
security numbers and birth dates also facilitates locating them through
archival records such as voter registries, tax rolls, motor vehicle records,
credit bureaus, marriage licenses, real estate records, death certificates,
wills, and the like. Federal agencies that remain in touch with some
respondents include the Veterans Administration, the Social Security
Administration, and the Internal Revenue Service. Successful follow-up by
mail includes the use of postal services such as forwarding and record
updates. Telephone techniques include the use of directories, local and long
distance operators, and reverse directories that provide numbers of former
neighbors. Telephone searches may include searches made directly for
the respondent, for known relatives or alternative locators, or for persons
with the same last name who may be related to the respondent.
Tracking adolescents poses some particular problems. For one thing,
state law sometimes prevents a researcher from gaining access to adoles-
cents without parental consent. In some contexts, when the adolescent
comes to the researcher, the teen is legally considered an "emancipated
minor" with power to consent. For an adolescent to initiate and persist
in these contacts, he or she must be strongly motivated to participate in
the research. In cases such as these, compensation may be an effective
motivating tool.
On the other hand, where parental consent is given, investigators
can actively follow up their adolescent participants. Pirie and colleagues
(1989) report on studies for which good background data on the
adolescents and their parents and guardians (names and social security numbers)
was helpful, as was the cooperation of school districts in tracking students
and transfers. Other public records were generally not successful sources
for tracking adolescents, with the exception of driver's license records
in a study centering on suburban adolescents. Telephone tracking was
important, but had to be modified to focus on parents or guardians;
telephone tracking was particularly important in rural areas, where listings
for a given name often led to persons who knew or were related to the
respondents.
Personnel for Tracking Respondents
Even with good information for tracking respondents, investigators must
use sheer persistence to approach the goal of 100 percent follow-up. For
maximum follow-up, the research team has to have the personnel neces-
sary to do the tracking (which implies having the necessary resources to
support such personnel). Trackers need not be research investigators, but
they do need to be trained in the search techniques discussed above.
The time necessary to follow up respondents should not be under-
estimated. In addition to tracking participants through a paper trail of
records, field work is necessary to locate many individuals. Field work
involves tracking time and interview time, both of which need to be
factored into each scheduled follow-up. In some cases, the tracking pro-
cedure may be more intensive than the intervention itself. For example,
tracking participants from a drug treatment center may be much more
labor intensive than providing the original counseling intervention.
Modeling Attrition
The second reason to get as much information about individuals in the
study at intake is that, if attrition does occur, researchers will be able
to estimate the characteristics of nonrespondents. Since nonrespondents
differ from respondents in terms of their refusal to respond or their being
untraceable, they may also differ in terms of the variable that the evaluator
is trying to measure. When they do, the validity of evaluation results will
be subject to question.
Where attrition occurs, there is a need to model the causes and
distribution of nonresponse. Ignoring nonresponse altogether implies
acceptance of a model that says that nonrespondents are distributed in
the same way as respondents.8 Alternatives to ignoring nonresponse all
call for estimating, or guessing, the ways in which nonrespondents
may differ. One way is to assume that they resemble some specified
subset of the respondents. Other ways may exploit information about the
nonrespondents (e.g., demographic features, responses at initial contact,
etc.) to estimate or impute their later, missing responses. (For further
discussion of missing data, see, e.g., Little and Rubin, 1987.) Another
approach is to locate and interview the dropouts to find out what caused
attrition and thus to model self-selection bias.9
8 This admits to some refinement depending on the analytic strategy employed. A typical default
assumption in a life-table analysis is that persons lost resemble persons followed from the time of loss
onward, not from time 0.
(As noted in Chapter 6, it should be clearly understood that attrition
and noncompliance in an experiment introduce uncertainties that directly
parallel those that arise in nonexperimental studies. Modeling their
effects, in turn, invites inferential uncertainties parallel to those that beset
modeling effects in nonrandomized studies.)
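
One simple way to act on the assumption that nonrespondents resemble a specified subset of respondents is a weighting-class adjustment: respondents are grouped by a characteristic recorded at intake, and each group's observed mean is weighted by that group's share of the full intake sample. The Python sketch below illustrates the idea; it is not a method prescribed by the panel, and the field names (`stratum`, `responded`, `outcome`) and the toy records are hypothetical and invented for illustration. The adjustment is only as good as the assumption that, within a class, those lost resemble those followed.

```python
from collections import defaultdict


def weighting_class_estimate(records):
    """Estimate the mean outcome, letting respondents stand in for
    nonrespondents who share the same intake class ("stratum")."""
    enrolled = defaultdict(int)    # everyone enrolled at intake, per class
    observed = defaultdict(list)   # follow-up outcomes observed, per class
    for r in records:
        enrolled[r["stratum"]] += 1
        if r["responded"]:
            observed[r["stratum"]].append(r["outcome"])
    total = sum(enrolled.values())
    estimate = 0.0
    for stratum, n in enrolled.items():
        outcomes = observed[stratum]
        if not outcomes:
            raise ValueError(f"no respondents in class {stratum!r}")
        # Class mean, weighted by the class's share of the intake sample.
        estimate += (n / total) * (sum(outcomes) / len(outcomes))
    return estimate


# Toy data (invented): class B loses more people and has higher outcomes.
records = (
    [{"stratum": "A", "responded": True, "outcome": 1.0}] * 8
    + [{"stratum": "A", "responded": False, "outcome": None}] * 2
    + [{"stratum": "B", "responded": True, "outcome": 3.0}] * 4
    + [{"stratum": "B", "responded": False, "outcome": None}] * 6
)
respondents = [r for r in records if r["responded"]]
naive = sum(r["outcome"] for r in respondents) / len(respondents)
adjusted = weighting_class_estimate(records)
print(naive, adjusted)
```

In the toy data the unadjusted respondent mean understates the adjusted estimate because the class with heavier attrition also has higher outcomes, which is the situation in which ignoring nonresponse is most misleading.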
Convenience and Probability Sampling
As noted in Chapter 5, the panel has recommended that CDC conduct
population surveys that include potential and actual clients of counseling
and testing services. By measuring a population's experience with and
desire for these services, such surveys could be used to evaluate barriers
to access and provide insight into perceived availability, needs for the
services, and fears about the system. In addition to community and
general population surveys, the panel recommended surveys of high-risk
or hard-to-reach groups using probability samples whenever possible. We
recognized that it may sometimes be difficult to construct the sampling
frames from which probability samples of some high-risk groups can
be drawn. The numbers and demographic profiles of gay men and
intravenous drug users, for example, are not known with any certainty,
nor are definitions of group membership always clear.
The panel observed that because of the difficulty and cost
involved with population-based samples, replicable convenience samples
can sometimes be used. The term "convenience" sample is not meant
to convey a naive or effortless assemblage of study participants.
Convenience samples are simply a type of nonprobability sample and can be
devised in many ways, with some designs weaker or stronger than others.
For example, "accidental" samples are drawn from subjects at hand and
are rather easy to implement, but are far from being representative of the
general population. "Purposive" samples are more carefully constructed
to reflect the researcher's best judgment about what is typical of the
larger population; they probably have a more substantive claim to an
adequate coverage of the population. Regardless of construction design,
estimates of population parameters from these sorts of convenience
samples have unknowable amounts of bias and variance, and results are not
generalizable to the constituents of the high-risk groups.
9 Rossi and Freeman (1982) suggest community surveys as often the only feasible means of
discovering nonparticipants; the panel considers this alternative more difficult than getting the important
information at intake and tracking tirelessly.
For comparing interventions, however, convenience samples may be
useful. Alternative interventions can be compared by assigning them to
randomly chosen subsets of a convenience sample of persons, or clinics,
or other relevant unit of analysis. The random assignment solves the
question of internal validity. The main hazard to external validity arises
from the possibility of qualitative interaction between treatment effects
and population subgroups. This risk cannot be dismissed, but it can be
typically expected to be less threatening than the risk of large direct
differences between population and convenience samples.
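
The random assignment of alternative interventions within a convenience sample can be sketched in a few lines of Python. The sketch below is illustrative only: the function name `randomly_assign`, the arm labels, and the clinic identifiers are hypothetical, and a real study would also document and safeguard the allocation procedure. The units are shuffled and then dealt out to the arms in turn, so that arm sizes differ by at most one.

```python
import random


def randomly_assign(units, arms=("intervention A", "intervention B"), seed=None):
    """Shuffle the units and deal them out to the arms in turn,
    so that arm sizes differ by at most one."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {arm: [] for arm in arms}
    for i, unit in enumerate(shuffled):
        assignment[arms[i % len(arms)]].append(unit)
    return assignment


# Hypothetical convenience sample of ten clinics.
clinics = [f"clinic-{i:02d}" for i in range(1, 11)]
groups = randomly_assign(clinics, seed=1990)
for arm, members in groups.items():
    print(arm, sorted(members))
```

Because chance alone decides which units receive which intervention, differences between the arms can be attributed to the interventions within this sample; the sketch does nothing to make the sample itself representative of a larger population.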
The panel did not discuss convenience samples at much length, but
the parent committee in its first report (Turner, Miller, and Moses, 1989)
did review a number of nonrandom or nonprobability sample studies of
gay and bisexual men and of drug users. Briefly, the studies can be
arranged along a spectrum of the strength of their sampling schemes.
The panel believes that it would be helpful to highlight and update these
examples here.
Sample Studies of Gay and Bisexual Men
Convenience samples recruited from narrowly circumscribed or
"accidental" sources (e.g., STD clinics or gay bars) are frequently used, but their
usefulness for generalizing to the whole population is seriously limited.
For example, gay men recruited from an STD clinic (e.g., Swarthout et
al., 1989) will quite likely have different information needs as well as a
different awareness of available testing facilities than the gay population
at large; if they are seropositive they may have different medical referral
needs as well.
Samples that recruit volunteers through public notices (e.g., the
Baltimore MACS sample10) are somewhat more useful. Although still a
form of accidental sampling, such kinds of nonprobability samples are
improved by casting a wider net, and the volunteers will probably have a
more diverse base of needs for and concerns about counseling and testing
services. Nevertheless, data derived from such a design will be biased
by the self-selection of respondents into the sample.
Nonprobability samples and probability samples of narrow universes
can be purposively enlarged to ensure differences among respondents by
including presumably representative groups in a sample. An example of
such a purposive sample is a cohort assembled by Martin (1986). The
design began with a probability sample of men belonging to at least
one gay organization; the sample was supplemented with self-selected
volunteers, recruits from a Gay Pride festival, respondents from a public
health clinic, and a snowball sampling11 from persons already enrolled
in the study. This purposive sampling example is only illustrative,
not definitive, as it likely overselects respondents with reasonably high
knowledge of available services. For the purpose of measuring barriers to
counseling and testing, it might be more helpful to diversify the sample,
e.g., by supplementing the base probability sample with patrons of gay
bars rather than users of health clinics.
10 MACS, the Multicenter AIDS Cohort Studies, are described in Chapter 6.
Finally, a probability sample of gay and bisexual men is not
impossible and of course would produce the most defensible data. One such
sample was drawn by the San Francisco Men's Health Study from the
Castro district of San Francisco, an area highly populated by gay men and
having the highest incidence of AIDS cases in the city (see Winkelstein et
al., 1987).12 The sample was representative of that community, and such a
design might be quite appropriate in other high-profile gay communities
for evaluating the accessibility of testing services.
Sample Studies of Intravenous Drug Users
Intravenous drug users are a difficult population to survey because of the
clandestine nature of drug use activities and the difficulty of defining who
is or has been a user at risk of HIV from needle-sharing. Nonetheless,
the spectrum of sampling done among intravenous drug users is similar
to that done among gay men. Surveys have been largely limited to
accidental convenience samples of subpopulations, but purposive and
probability sampling have been possible.
Members of drug treatment centers constitute the most accessible
populations for convenience samples, and numerous examples exist of
research samples drawn from methadone and detoxification clinics.
Assignment to treatment, however, is nonrandom, and results are not
representative of the general drug-using population. Researchers would face
similar problems getting information about testing and counseling
services from such a sample: treatment clientele are likely to have different
testing needs and different perceptions about the availability of services
than persons who choose to continue using drugs or cannot get into
treatment. Other frequently used convenience samples include drug users in
hospitals, emergency rooms, and health clinics.
11 A snowball sample is a sampling method in which each person interviewed is asked to suggest
additional people for interviewing.
12 The design was a clustered probability sampling of single men aged 25 to 55 in the 19 San Francisco
census tracts that comprised the area of the city with the highest AIDS incidence rate. Note, however,
that despite its scientific sampling plan, the representativeness of the survey could have been flawed
by a fairly high nonresponse rate (41 percent), although investigators judged differences between
respondents and nonrespondents to be "insufficient" to warrant that conclusion.
A more varied but still accidental population is arrestees, of whom
some 15-50 percent can be identified as drug users (Eckerman et al.,
1976). Using a sample of arrestees would probably result in a more
diverse group than a sample of clinic clients in terms of individuals'
awareness of counseling and testing services and barriers to access.
Because self-reports of drug use by arrestees may be unreliable, however,
screening such as urinalysis may be necessary, making this design more
difficult to implement because researchers cannot be sure of the
individuals' consent. Moreover, arrestees may constitute a more desperate class
of drug users (those who resort to crime to support their habit) than is
representative of the population.
Purposive, nonprobability samples using street outreach to recruit
IV drug users often attempt to draw on a broader cross-section and be
more representative of the drug-using population than samples drawn
from arrestees. For example, several studies have sampled IV drug users
recruited from the streets of neighborhoods where drug use prevalence is
high (e.g., Abdul-Quader et al., 1989; Inciardi et al., 1989; Wiebel et al.,
1989). As with gay study recruits, street user recruits are probably more
representative of the broader population than are institutional populations
and, because they are active users and vulnerable to health problems,
would likely have a variety of needs for counseling and testing services.
Still, the conclusions from these samples cannot be generalized to the
total population.
Some researchers have purposively enlarged nonprobability samples
to ensure differences among respondents. Such purposive sampling
cohorts have been assembled by researchers in Portland, Oregon (Sibthorpe
et al., 1989, recruited from a corrections facility, county health clinics,
private welfare organizations, and street outreach), at Johns Hopkins in
Baltimore (e.g., Nelson et al., 1989, recruited from street outreach, clinics
for sexually transmitted disease, emergency rooms, and drug treatment
centers), and in New York (Carballo-Dieguez et al., 1989, recruited from a
poster campaign, methadone clinics, and inpatient wards). Depending on
their final composition, these samples may provide some good results.
Nonetheless, they are not substitutes for probability sampling and will
never provide wholly representative results.
Finally, a probability sample is possible, at least of street drug users.
Such a sample can be drawn from a systematic mapping of drug-related
activity that includes the enumeration of activities and individuals as
well as the selection of potential informants (such as ex-users) to identify
active users (e.g., McAuliffe et al. [1987] used this method to deliver
AIDS education to randomized neighborhoods of intravenous drug users).
This sort of probability sampling could provide good data for analyzing
access and barriers to counseling and testing services on the part of
noninstitutionalized IV drug users.
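What separates a probability sample from the convenience and purposive samples described earlier is that each unit on an enumerated list (the frame) has a known chance of selection. The following is a minimal sketch of that idea using simple random sampling without replacement. The frame of 40 mapped locations, the sample size, and the function name are hypothetical; this is not the procedure used by McAuliffe et al.

```python
import random


def draw_simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement from an enumerated frame.

    Every unit has the same known inclusion probability, n / len(frame),
    which is the property that convenience and purposive samples lack.
    """
    frame = list(frame)
    if not 0 < n <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)  # seeded so the draw can be audited later
    sample = rng.sample(frame, n)
    inclusion_probability = n / len(frame)
    return sample, inclusion_probability


# Hypothetical frame: 40 street locations enumerated by a mapping exercise.
frame = [f"location-{i:02d}" for i in range(1, 41)]
sample, p = draw_simple_random_sample(frame, n=8, seed=1989)
print(sample)
print("inclusion probability:", p)
```

In practice a clustered or stratified design might be preferred; the point of the sketch is only that the selection probabilities are known, which is what allows results to be generalized to the enumerated population.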
RANDOMIZATION
CDC staff members expressed interest in learning more about successfully
randomized samples in the AIDS prevention arena. This section
provides some additional examples of such samples, including examples
of experiments with no-treatment control groups. The section also
readdresses the ethics of implementing randomized trials with no-treatment
controls.
Examples of Randomized Experiments
In Chapter 4, the panel recommended a strategy of evaluating health
education/risk reduction through randomized experiments in the context
of street outreach. Although few evaluation studies of this sort have
been published, the strategy is certainly feasible. One example is the
sample drawn by McAuliffe and colleagues (1987), who used ex-addicts
to deliver AIDS health education to intravenous drug users in randomly
assigned neighborhoods of Baltimore; the experimental group had
significantly more knowledge at follow-up than did control group members
who did not receive the intervention, although there were no significant
behavioral differences.
Few formal examples of randomized evaluation of street outreach
studies are available in the literature; however, anecdotal reports and
discussions with community-based providers indicate substantial opportunities
and support for systematically testing various strategies.
Community-based studies offer multiple units that can be randomized, such as street
corners, street blocks, public housing communities, and less well-defined
neighborhoods. Because of limited resources, outreach efforts in these
communities sometimes have to be employed in a delayed implementation
design. Alternative methods of outreach, such as comparing indigenous
outreach workers with health professionals, may also be possible.
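As an illustration only, the random assignment of community units to immediate or delayed outreach can be sketched as follows. The sketch assumes that units scheduled for delayed implementation serve as the comparison group until outreach reaches them; the ten street blocks, the even split, and the function name are hypothetical.

```python
import random


def assign_delayed_implementation(units, seed=None):
    """Randomly split units into 'immediate' and 'delayed' outreach groups.

    Units are shuffled and cut in half, so assignment depends only on
    chance and not on staff judgment about which units need outreach most.
    """
    units = list(units)
    if len(units) < 2:
        raise ValueError("need at least two units to randomize")
    rng = random.Random(seed)  # seeded so the allocation can be reproduced
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "immediate": sorted(shuffled[:half]),
        "delayed": sorted(shuffled[half:]),
    }


# Hypothetical units: ten street blocks eligible for outreach.
blocks = [f"block-{i:02d}" for i in range(1, 11)]
schedule = assign_delayed_implementation(blocks, seed=4)
for group, members in schedule.items():
    print(group, members)
```

Because whole units rather than individuals are randomized, an analysis would need to treat the block, corner, or housing community as the unit of assignment.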
It may be possible that only a handful of street outreach projects
will meet the criteria, discussed in Chapter 4, for randomizing to
no-treatment. Similarly, the possibilities in other community-based settings
may be few, but where they occur they should be creatively used. As an
example of the latter, Bellingham and Gillies (1989) recently reported on
a successful randomized controlled trial in Great Britain, in which six youth
training centers were randomly assigned to receive an AIDS education
comic book. No differences between the groups were detected at pretest,
but at posttest the knowledge scores of the experimental group were
significantly higher.
Randomizing alternative treatments may be easier still. A recent
example of a "natural" experiment where student classes were randomly
assigned to alternative treatments was reported by Ziffer and Ziffer (1989).
Investigators administered pretests and posttests to students enrolled in
parallel one-semester courses on AIDS. The class that received a values
and attitudes component in addition to a basic "facts" course showed
significant attitudinal change compared with the class that received facts
only.
The experience of the panelists indicates that the research community
is well aware of most of the steps that lead to a well-controlled
experiment. We wish to emphasize, however, a sometimes neglected
step, which is to involve project practitioners in the development of the
research protocol. Such involvement, we believe, ought to facilitate
cooperation and active participation in experimental studies. By designating
a particular program to help in the development of the protocol and to
serve as the test site, a prototype is created for the experimental trials.
Staff of the prototype project could also assist in the training of new
randomly selected sites in the controlled experiment.
The Ethics of No-treatment Controls
Although in Chapter 5 the panel does not recommend randomized studies
with no-treatment controls for evaluating counseling and testing, the panel
does recommend such a design in Chapter 4 for evaluating new CBO
projects. Panelists debated this issue long and hard. In not recommending
the design for counseling and testing, the panel's conclusion is based on
an ethical consideration that needs to be made more explicit, especially
because it may sometimes apply to certain CBO projects: the panel
believes that efficacious patient care is essential and, on ethical grounds,
should not be withheld for purposes of evaluation. This aspect should
be considered along with the other justifications listed in Chapter 4 for
no-treatment control conditions: scarce resources with which to provide
a service; interventions that are of unproven value; and availability of
related services elsewhere.
The panel notes an important difference between many CBO
interventions and CDC's larger counseling and testing program. Despite
CDC's characterization of counseling and testing as an AIDS prevention
program, the services can and should be distinguished. Unlike other
projects, it is not simply a behavioral intervention. Rather, it is a
program that offers HIV testing. Although testing can provide important
information to be used in decisions about sexuality, contraceptive use,
and needle-sharing, the test itself is a diagnostic tool and is an important
aspect of patient care that the panel believes should be available to all
who seek it. Because of this distinction, HIV testing should never be
withheld for purposes of evaluating its effectiveness. Similarly, although
counseling alone might be considered an intervention to encourage
behavioral change, when it is joined to testing it becomes part of an effective
medical care procedure (as a means of explaining test results and as a
psychological and social support). Thus, counseling (in the context of
counseling and HIV testing) also should not be withheld for purposes
of evaluation. The panel therefore recommends an evaluation strategy
for counseling and testing in which alternative counseling treatments are
randomized and their relative effectiveness assessed. This strategy
retains the essential parts of the service: the diagnostic technology of HIV
testing and the counseling that is part of patient care.13 At the same time,
it allows for the evaluation of alternative counseling methodologies that
may be found to have superior value in promoting behavioral change.
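One way to randomize alternative counseling treatments while ensuring that every client still receives testing and some form of counseling is sketched below. It uses permuted-block randomization, a standard technique chosen here for illustration and not one specified by the panel; the two arm names, the block size, and the function name are hypothetical.

```python
import random


def permuted_block_assignment(n_clients, arms, block_size, seed=None):
    """Assign clients, in order of arrival, to one of several counseling arms.

    Each block contains every arm equally often in random order, so the
    arms stay balanced over time. There is no no-treatment arm: every
    client is assigned to some form of counseling.
    """
    arms = list(arms)
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the allocation list can be audited
    assignments = []
    while len(assignments) < n_clients:
        block = arms * (block_size // len(arms))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_clients]


# Hypothetical arms: two alternative counseling protocols offered with testing.
arms = ["standard counseling", "enhanced counseling"]
assignments = permuted_block_assignment(12, arms, block_size=4, seed=7)
for client_number, arm in enumerate(assignments, start=1):
    print(client_number, arm)
```

Consistent with footnote 13, a client assigned to either arm would remain free to refuse counseling; the allocation determines only which protocol is offered.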
In deciding whether to withhold a given CBO service, care must
be taken to distinguish whether the service offered is an integral aspect
of patient care or is an intervention of unproven worth that is available
elsewhere. Consider, for example, a CBO project that provides bleach
to intravenous drug users. Bleach is known to be an effective agent for
reducing HIV transmission; thus, information about the utility of bleach
as a preventive tool should not be withheld. It is not known, however,
whether providing a supply of bleach is effective in getting people to
use it. Assuming that bleach is otherwise readily accessible, it would
be ethical to randomly assign the provision of bleach samples across
communities or organizations.
REFERENCES
Abdul-Quader, A., Tross, S., Des Jarlais, D. C., Kouzi, A., Friedman, S. R., and
McCoy, E. (1989) Predictors of Attempted Sexual Behavior Change in a Street
Sample of Active Male IV Drug Users in New York City. Presented at the Fifth
International Conference on AIDS, Montreal, June 4-9.
Bellingham, K. and Gillies, P. (1989) AIDS Education for Youth - A Randomised
Controlled Trial. Presented at the Fifth International Conference on AIDS,
Montreal, June 4-9.
13 Although the panel believes it would be unethical not to offer counseling as a part of patient care, it
is important to recognize that patients have the prerogative to refuse counseling.
Cannell, C. F., and Henson, R. (1974) Incentives, motives, and response bias. Annals
of Economic and Social Measurement 3(2):307-317.
Carballo-Dieguez, A., El-Sadr, W., Gorman, J., Joseph, M., McKinnon, J., and
Sorrell, S. (1989) Research with Intravenous Drug Users: Problems and Practical
Recommendations. Presented at the Fifth International Conference on AIDS,
Montreal, June 4-9.
Cohen, J. (1988) Statistical Power Analysis for the Behavioral Sciences. Rev. ed.
Hillsdale, N.J.: Lawrence Erlbaum Associates.
Cook, T. D., and Campbell, D. T. (1979) Quasi-Experimentation: Design & Analysis
Issues for Field Settings. Boston: Houghton Mifflin.
Davis, A. L., Faust, R., and Ordentlich, M. (1984) Self-help smoking cessation and
maintenance programs: A comparative study with 12-month follow-up by the
American Lung Association. American Journal of Public Health 74(11):1212-1217.
Eckerman, W. C., Rachal, J. V., Hubbard, R. L., and Poole, W. K. (1976) Methodological
issues in identifying drug users. In Drug Use and Crime. Report of the Panel
on Drug Use and Criminal Behavior (Appendix). Research Triangle Park, N.C.:
Research Triangle Institute.
Ferber, R., and Sudman, S. (1974) Effects of compensation in consumer expenditure
studies. Annals of Economic and Social Measurement 3(2):319-331.
Fox, R., and Jones, L. T. (1989) Maintaining followup in prospective epidemiologic
studies of HIV infection: Experience in the Baltimore MACS study. Presented at
the Fifth International Conference on AIDS, Montreal, June 4-9.
Inciardi, J. A., Chitwood, D., McCoy, C. B., and McBride, D. C. (1989) Needle
sharing behaviors and HIV serostatus in Miami, Florida. Presented at the Fifth
International Conference on AIDS, Montreal, June 4-9.
Kraemer, H. C., and Thiemann, S. (1987) How Many Subjects? Statistical Power
Analysis in Research. Newbury Park, Calif.: Sage Publications.
Lipsey, M. W. (1990) Design Sensitivity: Statistical Power for Experimental Research.
Newbury Park, Calif.: Sage Publications.
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New
York: Wiley.
Martin, J. L. (1986) AIDS risk reduction recommendations and sexual behavior patterns
among gay men: A multifactorial categorical approach to assessing change. Health
Education Quarterly 13(4):347-358.
McAuliffe, W. E., Doering, S., Breer, P., Silverman, H., Branson, B., and Williams, K.
(1987) An evaluation of using ex-addict outreach workers to educate intravenous
drug users about AIDS prevention. Presented at the Third International Conference
on AIDS, Washington, D.C., June 1-5.
Moore, R. P., Lessler, J. T., and Caspar, R. A. (1989) Technical Report: Results of Intensive
Interviews to Study Nonresponse in the National Household Seroprevalence Survey.
Research Triangle Park, N.C.: Research Triangle Institute.
Nelson, K. E., Vlahov, D., Solomon, L., Lindsay, A., and Chowdhury, N. (1989)
Clinical symptoms and medical histories of a cohort of IV drug users: Correlation
with HIV seroprevalence. Presented at the Fifth International Conference on
AIDS, Montreal, June 4-9.
Panel on Privacy and Confidentiality as Factors in Survey Response (1979) Privacy and
Confidentiality as Factors in Survey Response. Report of the NRC Committee on
National Statistics. Washington, D.C.: National Academy Press.
Pirie, P. L., Thomson, S. J., Mann, S. L., Peterson, A. V., Murray, D. M., Flay, B. R.,
and Best, J. A. (1989) Tracking and attrition in longitudinal school-based smoking
prevention research. Preventive Medicine 18:249-256.
Rossi, P. H., and Freeman, H. E. (1982) Evaluation: A Systematic Approach. 2nd ed.
Beverly Hills, Calif.: Sage Publications.
Sibthorpe, B. M., Fleming, D., McAlister, R., Klockner, R., and Gould, J. (1989)
Needle sharing among IVDU's where needles are available without prescription.
Presented at the Fifth International Conference on AIDS, Montreal, June 4-9.
Singer, E. (1978) Informed consent: Consequences for response rate and response
quality in social surveys. American Sociological Review 43:144-162.
Swarthout, D., Gonsiorek, J., Simpson, M., and Henry, K. (1989) A behavioral
approach to HIV prevention among sero-negative or untested gay/bisexual men
with a history of unsafe behavior. Presented at the Fifth International Conference
on AIDS, Montreal, June 4-9.
Tanur, J. M. (1982) Advances in methods for large-scale surveys and experiments. In
R. M. Adams, N. J. Smelser, and D. J. Treiman, eds., Behavioral and Social
Science Research: A National Resource. Part II. Report of the NRC Committee
on Basic Research in the Behavioral and Social Sciences. Washington, D.C.:
National Academy Press.
Turner, C. F., Miller, H. G., and Moses, L. E., eds. (1989) AIDS, Sexual Behavior, and
Intravenous Drug Use. Washington, D.C.: National Academy Press.
Wiebel, W., Altman, N., Chene, D., and Fritz, R. (1989) Risk taking and risk reduction
among IV drug users in 4 US cities. Presented at the Fifth International Conference
on AIDS, Montreal, June 4-9.
Winkelstein, W., Samuel, M., Padian, N. S., Wiley, J. A., Lang, W., Anderson, R. E.,
and Levy, J. A. (1987) The San Francisco Men's Health Study: III. Reduction
in human immunodeficiency virus transmission among homosexual/bisexual men,
1982-86. American Journal of Public Health 76(9):685-689.
Ziffer, A., and Ziffer, J. (1989) The need for psychosocial emphasis in academic courses
on AIDS. Presented at the Fifth International Conference on AIDS, Montreal, June
4-9.