Evaluating AIDS Prevention Programs: Expanded Edition (1991)

Chapter 3: Evaluating Media Campaigns

Suggested citation: "3 Evaluating Media Campaigns." National Research Council. 1991. Evaluating AIDS Prevention Programs: Expanded Edition. Washington, DC: The National Academies Press. doi: 10.17226/1535.

This chapter focuses on current and future mass media campaigns centered on the prevention of AIDS. The most visible current effort is the national, multiphase America Responds to AIDS campaign of public service announcements and a mass mailing that has been developed through CDC's National AIDS Information and Education Program (NAIEP). However, numerous other campaigns, usually more local in scope, have also been and will be conducted. The panel's suggestions for the evaluation of such media campaigns are relevant for all types, national and local. The panel focuses on the special problems of evaluating a national mass media effort in the context of an epidemic that will affect the nation for years to come.

The assumption that HIV and AIDS will beset U.S. society for the next 20 or 30 years underlies the panel's position that allocating substantial resources to rigorous evaluations at an early stage of program development makes for a wise investment in the long run. In fact, we advocate intensive evaluation at all phases of campaign development precisely to increase the chances of meaningful effects. Evaluation may also help reduce the wasting of resources on ineffective campaign activities; if a campaign phase does not produce effects, it can be withdrawn or replaced with more effective material.

In discussing the need for rigorous evaluations, the panel concluded that randomized experiments would produce the most valid account of program effects. We are well aware that this type of evaluation entails high initial costs; nonetheless, we believe that the benefits that may be derived from conducting such experiments will pay off handsomely in the long run. In addition, from a "dollars-and-sense" perspective, we think it is reasonable for CDC to want to know whether its expenditure of millions of dollars on the campaign for the public's education is having the desired effect.

Because the evaluation of mass media campaigns presents special difficulties (see Flay and Cook, 1981), this chapter adds a fourth line of inquiry to the three fundamental questions, "What is delivered?," "Does it make a difference?," and "What works better?": namely, "Can the campaign make a difference?" Another unique feature for the media campaign is that the panel recommends that the latter two questions be answered in controlled settings rather than the real world. The chapter includes discussions of methodological problems and issues related to resources and aspirations under each question. The chapter thus has five major sections:

· Background and objectives
· Formative evaluation: What works better?
· Efficacy trials: Can the campaign make a difference?
· Process evaluation: What is actually delivered?
· Outcome evaluation: Does the campaign make a difference?

BACKGROUND AND OBJECTIVES

Four phases of a national AIDS media campaign have already been launched. Three phases have channeled messages nationwide through a series of public service announcements (PSAs); a fourth phase provided a mass mailing of an informational brochure to all U.S. households. From discussions with CDC staff, it appears that future campaign phases will be focused largely on PSAs. The campaign, America Responds to AIDS, has thus far been conducted in a series of six-month phases, although the current phase (Phase IV) will probably be extended to eight months. For each of the phases, target audiences have been identified and general frameworks provided for the aims of the campaign. These objectives are stated below, as the panel understands them on the basis of extensive discussions with CDC staff.

The general objective of Phase I (October 1987 to March 1988) was to increase the general population's awareness of AIDS and to correct misperceptions about how it is and is not acquired. This phase aimed to "humanize" AIDS and reduce needless fear. The campaign consisted of PSAs aired by television and radio stations throughout the country. CDC and contract staff (the advertising firm of Ogilvy & Mather) conducted a media marketing outreach program to obtain air time more favorable than PSAs typically receive. Auditing of these PSAs, to the extent that it has been possible,1 showed that in many "markets" around the country, the PSAs for Phase I were frequently aired, although nearly 90 percent were aired in nonprime time. (It should be noted, however, that over 50 percent of the dollar value of donated air time was in prime time.)

1. The General Accounting Office (GAO) audited the airing of CDC's PSAs on network TV from December 1987 through February 1988 (GAO, 1988).

The general objectives of Phase II of the campaign (April to September 1988) were similar to those of Phase I, but this phase used a mass mailing rather than PSAs as the central channel of information. A mass mailing is expected to reach a higher proportion of the national population with more consistent and more detailed information than a PSA campaign. An eight-page brochure, Understanding AIDS, was mailed to all households in the United States during the last week of May and first two weeks of June 1988. Under a contract with CDC, Ogilvy & Mather coordinated press releases, a press conference with Surgeon General C. Everett Koop and CDC Director (now Assistant Secretary for Health) James O. Mason, and national marketing of a video news release followed by satellite news interviews and PSAs to promote the mailing.

Phase III of the campaign (October 1988 to March 1989) was directed specifically toward women at risk and sexually active adults with multiple partners. In addition, because a disproportionate number of AIDS cases have been reported among blacks and Hispanics, this phase of the campaign included a number of program elements aimed specifically at these audiences.

Phase IV (April to November 1989) emphasizes families and children. According to an oral presentation in March by NAIEP staff to the panel, this phase has four objectives: (1) to improve parent-child communication about sexual behaviors; (2) to increase the acceptability of abstinence to parents and adolescents (and also to the general population); (3) to increase the acceptability of condom use if abstinence is not practiced; and (4) to increase the availability and adoption of comprehensive school health curricula.2 Aim (1) concerns the specific target audiences only: changes should be observed only among parents of appropriately aged adolescents and the adolescents themselves and not necessarily in the rest of the population. To be useful in the long term, however, aims (2) and (3) should also influence the general population, eventually affecting social norms regarding the acceptability of abstinence; premarital, extramarital, and casual sex; condom use; etc. Aim (4) appears to be directed toward schools as much as toward parents and children: the aim is to increase the acceptability of comprehensive school health education provided by school personnel and the rate of adoption of such curricula by the nation's schools.

2. A press release (n.d.) on Phase IV casts these objectives in a slightly different voice. The first objective is said to encourage adult-child communication about HIV and AIDS; next, rather than specifying abstinence and condom use, the ensuing objectives are expressed as encouraging youth "to adopt and maintain behaviors that eliminate or reduce their risk of infection;" the final objective is to "raise public awareness of young people's vulnerability to HIV."

As discussed in the last chapter, the objectives of a campaign, or of any defined component or phase of a campaign, are not useful to an evaluation if they are stated generally. Objectives must be specific before useful evaluations can be conducted and outcomes can be measured. Several possible outcomes of the national AIDS campaign are of interest: knowledge and beliefs about AIDS and its causes, particularly the clarification of myths; beliefs about susceptibility and the severity of consequences; attitudes toward people with AIDS and toward high-risk situations and protection; intentions regarding personal protection; actual high-risk and protective behaviors; and seroprevalence. (See Chapter 2 for a discussion of these outcomes.)

Consistent with its overall recommendations, the panel urges that the desired outcomes of each future phase of the campaign (1) be made explicit, (2) be measured repeatedly, (3) be reached (or retargeted) prior to developing future phases, and (4) be monitored periodically to ensure that they are maintained once they have been achieved. Maintenance can be ensured either by rerunning some campaign phases or by introducing reinforcing messages in later phases. The panel further suggests that evaluative data from earlier phases drive the choice of target issues and audiences for future phases.

FORMATIVE EVALUATION: WHAT WORKS BETTER?

Formative evaluations of media campaigns are especially useful for obtaining detailed, documented evidence of effectiveness prior to wide or further deployment. During a formative evaluation, alternative campaigns or campaign messages can be tested on a small scale, which will help contain the costs of doing randomized experiments. This type of research partially answers the question "What works better?" (Efficacy trials provide further information on what works in an optimal situation; see the next section.)

It is an unfortunate fact that the budgets of most PSA campaigns do not allow the kind of thoughtful formative evaluation that is common in the development of commercial advertisements. Indeed, in many cases, PSA formative evaluation involves nothing more than a review of creative ideas by the officials funding the campaign (perhaps the selection of one campaign from a set of two or three that have been developed to the idea stage) and a selection of the campaign design based on simple intuition. The absence of serious research at this stage of program development can be fatal, and it is always a great disadvantage, since the track records of even the most successful advertising agencies show that changing audience behavior is a difficult enterprise.

A campaign as large and as important as the National AIDS Information and Education Program does not suffer from the same resource deficits that hamper developmental efforts in more typical PSA campaigns. It is possible to do much better than the standard approach in PSA campaign development and to adopt the kinds of research strategies that are more typical of formative evaluations for commercial advertisements. The result is a much stronger campaign and a campaign whose effects can be more easily assessed. Our whole approach to evaluation is designed to lead to the development of effective campaigns. While others have advocated extensive formative evaluation (e.g., the Office of Cancer Communications within the National Cancer Institute), hardly any campaign developers have conducted such careful developmental and evaluation research. A campaign as long as this one deserves such careful attention if we are not to waste millions of dollars.

The panel recommends the expanded use of formative evaluation or developmental research in designing media projects.

There are several standard strategies that are used in formative evaluations of mass media campaigns, and they should be part of a major national campaign such as this one. The panel details one of these approaches here; alternative strategies are provided by Flay (1986), the Office of Cancer Communications (1983, 1984), Palmer (1981), and others. The approach we propose has five basic steps: (1) idea generation, (2) concept testing, (3) the positioning statement, (4) copy testing, and (5) test marketing. The first four of these steps are described below; the fifth is discussed in the next section on efficacy trials. Some of these steps have already been taken by CDC and its contractor; however, we endorse the adoption of all five and hereby underscore their importance.

Step 1: Idea Generation

A media campaign begins with an idea about a message that is thought to have motivational power. There is now a fairly substantial body of research on the cognitive and motivational determinants of and the barriers to behavioral change: see, for example, McKusick, Horstman, and Coates (1987); Joseph and colleagues (1987); McCusker and colleagues (1988); and Valdiserri and colleagues (1988). This literature is being applied to studies of risk reduction to prevent AIDS, and it can be used to help stimulate the ideas needed to develop campaign messages.

Yet even when ideas are solidly grounded in research, the first step in formative evaluation should be to evaluate the power of the idea. This step is particularly important in cases in which the idea is developed outside of the context of the lives of the people to whom the message will be delivered. The literature on behavioral change interventions is full of examples of ideas that seemed to make perfect sense in the abstract but that failed completely because they clashed with fundamental ideas held by the target population or were not stated in ways that were relevant to the lives of those people. Preliminary evaluation must guard against these "fatal" flaws by evaluating the message's ideas in the context of the lives of the people toward whom the message will be directed. This type of evaluation is usually done in a phase of research known as concept testing.

Step 2: Concept Testing

Concept testing is an iterative process that is usually based on focused interviews and other qualitative research techniques. It attempts to determine the meanings of the behavior one wants to change in the context of the lives of the people one wants to reach. This information is used to assess the appeal of the campaign message in relation to this context. For example, if the message is that condom use helps protect against transmission of the AIDS virus, it is necessary to develop some understanding of the meanings currently associated with condom use, the kinds of reactions that occur when it is claimed that condoms are protective, and the persuasive power of the message that there is danger of infection. A fuller understanding of these issues might very well lead to a revision of the message that will make it more appealing to the selected audience. It might be discovered, for example, that teenagers are not motivated by the fear of infection, but that they can be reached by appealing to their feelings of insecurity. A revised message based on this insight might emphasize the fact that older, more experienced peers use condoms and that it is only inexperienced youngsters who fail to use them.

A word is in order about the methods used to generate insights of this sort. Qualitative approaches are the standard methods used in this phase of formative evaluation. In commercial applications, a focus group is the qualitative technique that is most commonly used. The use of groups is thought to be superior to individual interviews because group processes help stimulate ideas and uncover resistances that may remain hidden in one-on-one sessions (Morgan and Spanish, 1984). Exploratory work early in the AIDS epidemic with focus groups of gay men documented these advantages by showing that a number of important themes were raised in group discussions but were not found in personal interviews (Joseph et al., 1984). This kind of interaction occurs in part because people in groups talk to each other more than to the group moderator, and this talk tends to be insider talk rather than talk aimed at presenting only the public self. Group interviews, then, can be an extremely valuable tool in preliminary research on the contextual validity of a basic idea or message for the campaign. At the same time, group processes can sometimes create barriers to understanding when insider behavior involves a good deal of posing and protective actions aimed at presenting an appealing view of oneself.

Comprehensive concept testing should use both group and individual methods to guard against the limitations of each mode. The goal of concept testing is to develop insights into the meanings of ideas and messages, which can best be achieved by being open to multiple sources of information. Group interviews, unobtrusive observation, key informants, and focused or in-depth interviews with a cross-section of the selected population all have parts to play. Once an idea has been refined in this kind of research, it is common to carry out some type of representative survey to guarantee that the themes heard in the focused, qualitative part of the study are representative of the total target population. This research is confirmatory rather than exploratory; its goal is to validate the insights obtained from the individual and group processes.

In the development of a campaign for a new commercial product, for example, it is not uncommon for formative evaluation to include several hundred group and individual interviews directed toward refining ideas and messages. This exploratory work is often followed by a nationally representative survey of several thousand respondents to assess whether the themes developed in the exploratory research are representative of the entire selected audience. What is uncommon is to repeat this type of evaluation each time a new phase of a campaign occurs. However, a media campaign whose purpose is to prevent the transmission of HIV is far different in importance and in its different target audiences than a campaign for a commercial product. Thus far, the AIDS campaign has chosen different audiences (the general public, women, adults with multiple sexual partners, parents and youth) for each campaign phase. The panel believes that, under these circumstances, themes and messages should continue to be tested for each new phase and each new target group of the media campaign.
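The chapter does not specify how such a confirmatory survey would be analyzed. As a minimal sketch of the step just described, the following Python fragment (standard library only) estimates the share of a target population endorsing a candidate theme, with a normal-approximation 95 percent confidence interval; the survey counts are hypothetical, and a real analysis would weight the estimate to the survey design.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Point estimate and normal-approximation 95% CI for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical confirmatory survey: 1,340 of 2,000 respondents endorse the
# theme that emerged from the qualitative concept-testing work.
p, lo, hi = proportion_ci(1340, 2000)
print(f"Endorsement: {p:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```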

Step 3: The Positioning Statement

The product of concept testing is a positioning statement, that is, an outline of the messages one wants to communicate to the target audience. The positioning statement is the starting point for the development of the creative materials in the campaign. The agency that is developing the campaign begins with the positioning statement and attempts to develop commercials that convey the messages in this statement effectively.

It is conventional for the creative teams in charge of commercial design to develop several different formats for conveying these messages. In early stages, the formats may consist of rough text and descriptions of what the visual components of the advertisements would look like. As formats become more firmly established, several more polished texts are developed with sketches of camera shots. A series of these sketches conveys the entire sequence that would be produced in a final commercial. Several of these sequences, known as story boards, are typically presented to clients for preliminary approval before a more refined phase of development takes place.

In PSA advertisement development, it is not uncommon for the story boards to be the only intermediate version of the advertisement to be reviewed prior to final production. In addition, as noted earlier, it often happens that the selection of the final campaign from among several different story boards is made in a single review meeting, based on the intuitions of the public health clients and advertising executives. Cost constraints often make it difficult to carry out systematic evaluations to select the set of story boards that is likely to make the most effective advertisement. Large and important campaigns, however, should continue the process of formative evaluation at this stage to help in the selection process. By so doing, they not only increase the chances of campaign success by selecting the best set of story boards, but they also obtain information that can be used to make final changes to the story boards that are selected. This data collection is carried out in a type of evaluation known as copy testing.

Step 4: Copy Testing

Copy testing is any research that exposes a test audience to some version of an advertisement and evaluates the success of that advertisement in communicating the intended message. The most superficial kinds of copy testing use small samples of people to read and discuss the advertisements as a way of making sure they are comprehensible and adequate. Systematic copy testing, however, also attempts to assess persuasiveness and does so with experiments.
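To make the logic of such experiments concrete, the following sketch (Python, standard library only) randomly assigns respondents to two rough-cut formats and compares message-recall rates with a simple two-proportion z-test. The format names, sample sizes, and recall counts are hypothetical, and this is only one of many analyses that could be applied to the experiments described in the next paragraphs.

```python
import math
import random

# A randomized copy test in miniature: assign respondents to one of two rough
# formats at random, then compare recall rates with a two-proportion z-test.

def assign_formats(respondent_ids, formats, seed=1):
    """Simple randomization of respondents to ad formats."""
    rng = random.Random(seed)
    return {rid: rng.choice(formats) for rid in respondent_ids}

def two_proportion_z(recall_a, n_a, recall_b, n_b):
    """z statistic and two-sided p-value for a difference in recall rates."""
    p_a, p_b = recall_a / n_a, recall_b / n_b
    p_pool = (recall_a + recall_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal approximation.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

groups = assign_formats(range(300), ["storyboard_A", "storyboard_B"])
print({f: sum(1 for g in groups.values() if g == f)
       for f in ("storyboard_A", "storyboard_B")})

# Hypothetical outcome: 62 of 150 format-A viewers recalled the message,
# versus 41 of 150 for format B.
z, p = two_proportion_z(62, 150, 41, 150)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```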

The simplest kinds of experiments randomly assign a sample of respondents to a laboratory exposure of story boards, animatics (a cartoon or animated story board), or other rough production versions of an advertisement and then evaluate the effects of different formats on self-reported knowledge, attitudes, and behavioral intentions. Debriefing is typically a part of the data collection activity because it allows researchers to pinpoint aspects of the various advertisements that lead to confusion, that convey messages other than those that were intended, or that fail to have the intended persuasive impact. This information can then be used to modify the story boards for another iteration of the process, which continues until a format is developed that has the desired characteristics.

If a researcher is creative, it is possible to use this approach to evaluate real behavioral effects. Wright (1979), for example, evaluated the impact of different formats for risk disclosure about over-the-counter drugs by giving participants in his experiment a free coupon to obtain their choice of several different over-the-counter drugs from a pharmacy next to the theater in which the experiment took place. The evaluation consisted of ratings made by unobtrusive observers on the length of time participants spent reading the warning labels on the packages before choosing a product. Wright found that one version of his advertisement was superior to others in promoting careful reviews of warning statements. One could imagine similar evaluation outcomes in copy-testing experiments of proposed AIDS materials.

A somewhat different kind of copy-testing evaluation is also possible in what is known as a recruit-to-view format of rough productions. This approach recruits participants to view a television program in their homes without telling them that a rough version of the PSA being tested will be one of the commercials on the program. A postviewing interview is then conducted to determine the percentage of the audience who recall the PSA and the aspects they recall most vividly.

This kind of real-life exposure allows researchers to study the intrusiveness of the PSA, that is, its ability to catch the attention of the audience. It also allows them to assess comprehension and knowledge retention in a real-life setting rather than in a laboratory. It often happens that comprehension difficulties are detected in this kind of copy testing that were missed in laboratory research, where people are artificially placed in a situation that encourages them to pay more attention to the advertisement than they ordinarily would when watching television at home. The confusion only emerges among people who are half listening, the way many people listen to television commercials.

Sometimes, discoveries at this stage of formative evaluation lead to revisions in final production versions of the advertisements that are extremely important. These can be subtle changes (for instance, changing one word that, if misunderstood, would change the meaning of the message in an important way) or they can be dramatic changes. One might discover, for example, that a message that is easily comprehended in a laboratory situation is delivered in such a low-keyed way that audiences at home take little notice of it and, as a result, never really hear the message. This kind of discovery could lead to a total revision of the campaign and a return to another set of story boards with a more intrusive message.

Methodological Issues

The main methodological problem of formative evaluation is its lack of external validity: that is, one cannot generalize from it because the evaluation is made of a limited or pilot program and not of a program deployed under realistic operating conditions. By conducting small-scale experimental or quasi-experimental evaluations of one-time message exposure in laboratories, in forms that are not identical to the final advertisements, evaluators are testing an intervention that is quite different from the final product. Nevertheless, formative research is an important component of program development. Although it may not always result in an effective campaign, it will usually identify an unacceptable or ineffective approach.

Resources and Aspirations

Here and elsewhere in the report we discuss the resources needed to conduct the types of evaluation the panel recommends (i.e., costs, time, staffing), as well as the realistic aspirations for what will be gained from doing the evaluations. It is important to note that all of what has been discussed thus far is not a guarantee of the success of a campaign once it has been fully implemented. But carefully carrying out a systematic formative evaluation will provide clear indications of the types of effects to be expected. Following formative evaluation and subsequent program development efforts, a campaign's implementation and outcomes also need to be evaluated (as described below).

The reader should also recognize that the search for a better campaign (the evaluation of what works better) is inherently limited by the fact that the role of evaluation is to assess ideas rather than to generate them. The success of a campaign such as America Responds to AIDS hinges on the stimulation of creative ideas for reaching hard-to-reach population groups. The role of evaluation is not to provide this creativity; rather, it is to recognize it and distinguish efforts that are more likely to be effective from those that are not.

Serious attempts should be made to work through more than one creative idea. Typically, the purchaser of a media campaign will describe a product concept to an advertising agency and contract with the agency to prepare up to six campaign options from which to choose. Often, one of these ideas will be more fully developed than the others, indicating the agency's judgment about the most potentially effective option. If a client is not satisfied with the agency's work, the client may negotiate a contract with another agency.

A commercial buyer usually has on staff informed advertising consumers, often with ad agency experience, whose expertise prepares them to choose among multiple options and to decide when to bring in another agency. NAIEP in fact has a staff with substantial ad agency experience, which makes it well-positioned to select from among several creative ideas. To inaugurate the America Responds to AIDS campaign, NAIEP solicited proposals and presentations from 13 different agencies before choosing Ogilvy & Mather. That contract is due to expire this fall, and to compete for the next contract, ad agencies will be required to present a proposed campaign in story-board form. The panel endorses this plan, while recognizing that because of the expense of story boards, it may not be possible to get 13 agencies to compete again. For this reason, the panel advocates getting a minimum of three different agencies to develop a campaign to this stage. Because some agencies are unlikely to develop story boards unless they have a contract, the development of boards could be purchased on a fixed-fee basis. This purchase of ideas from multiple agencies is unusual, but the panel believes it is warranted in the present circumstances. Increasing the range of ideas that are nurtured increases the chances that an effective and creative campaign will be produced. When faced with such multiple approaches, evaluation can help determine which of them is likely to be the most effective.

In the preceding section, the panel sketched out an ambitious but not unrealistic approach for formative evaluation. Indeed, this series of activities mimics the standard practices of commercial advertisers in the formation of a major national advertising campaign. Formative evaluation budgets for commercial campaigns are often in the range of $200,000-$500,000. Formative evaluation should not be left in the hands of the advertising agencies that develop the campaigns, because it is at this critical stage in the decision-making process that it is important to make all reasonable efforts to encourage the development of many ideas for the most creative campaign possible. NAIEP should be able to take advantage of the ad agency backgrounds of its professional staff in the formative evaluation of media messages. The investment of staff time and funds in such research at this stage of campaign development will help to ensure implementation of a good program. Before implementation, however, it is also wise to determine whether the campaign can, indeed, make a difference if it were optimally implemented in markets around the country.

EFFICACY TRIALS: CAN THE CAMPAIGN MAKE A DIFFERENCE?

The ultimate effectiveness of a campaign will depend on its efficacy: with optimal implementation, could it have the desired effects? (An example of optimal implementation is the airing of a PSA during a program that is popular with a target audience.) This type of evaluation or market test should be carried out prior to the launching of the campaign and should be based on test campaigns conducted in a small sample of prototypical markets (rather than in the entire country). Given the enormous potential impact of such national campaigns as America Responds to AIDS, it is inappropriate to implement the campaign (or any of its main stages) nationally without some evaluation of whether it is likely to be effective, ineffective, or even harmful. Efficacy evaluation should be distinguished from effectiveness evaluation, which is characterized by the broad, real-world implementation of a program. It is important to remember that, even with adequate evaluation at this stage, there is no guarantee that a campaign will be effective when finally implemented: it may not be implemented adequately enough to be effective.

Ideally, program developers should separate the determination of a campaign's efficacy (that is, whether it could work if it were implemented optimally) from the determination of its effectiveness when actually implemented, that is, whether it does work as implemented (usually less than optimally).3 Otherwise, if less than optimal effects are found only after full-scale implementation, one can never be sure of the reason. Was it because the campaign as designed was not efficacious (could not have produced better effects even if implemented optimally)? Or was it because of less than optimal implementation? In this section the panel considers tests of campaign efficacy. The subsequent section considers tests of campaign effectiveness.

3. See Flay (1986) for discussion of the efficacy/effectiveness distinction.

Randomized Experiments

In keeping with its recommendations on randomized controlled trials, the panel believes that campaign efficacy should be determined through the use of randomized experiments. The proposed media campaign should be implemented with maximum integrity in several carefully selected test markets, and the data gathered there should be compared with data gathered in several other matched control (nontest) markets. The data sources should be specially designed surveys that are used only in the test and control (nontest) markets. Whenever possible, however, the questions that are to be used in subsequent effectiveness trials should also be used in tests of efficacy. It is important that the test and control markets chosen for efficacy trials be representative of the selected population one wants to reach.

Because of the difficulty of reaching a significant proportion of the audience with more than one message exposure, multiple airings are necessary. In standard test-market studies, a campaign is commonly aired from 6 to 12 months to be sure that most viewers will have been exposed to an ad at least three times. Similarly, the test period for the national AIDS media campaign should be at least six months to provide a sound evaluation of the campaign's exposure effectiveness.

Campaign tracking surveys are a common means of monitoring success in test-market studies. Such surveys might include independent monthly interviews of samples of people in the general population to monitor levels of campaign awareness, recall, understanding, persuasion, and self-reported behavioral change. With parallel surveys in nontest markets, researchers can control for the "confounding" influence of broader societal determinants of change that are unrelated to the campaign. The parallel control market survey is a particularly important design component in an evaluation of an AIDS information campaign because there are many other sources of AIDS information that may be determinants of changes in knowledge, attitudes, and behavior. An evaluation of the unique incremental influence of any new campaign requires a comparison of trends in the test markets with trends in the control markets, which are exposed to all of this other AIDS information but not to the new campaign.

In commercial applications, an important part of test-market evaluation is information about sales trends. An advertising campaign for an automobile is not considered successful unless automobile sales increase more (or decrease less) in the test markets than in the control markets. There may be opportunities for using similar outcome measures in the evaluation of public health information programs. Trends in condom sales, visits to birth control clinics, sales of books about safe sex, and phone calls to the AIDS hotline all provide independent estimates of potential outcomes. (However, the reader should note the cautions regarding the use of trend data in outcome evaluation, discussed below.)

Test-market evaluation should also include the monitoring of negative outcomes, especially in the case of a campaign like the phases of America Responds to AIDS. In some cases a message that engenders extreme fear without also offering advice on how to reduce the threat may also engender feelings of helplessness or the denial of susceptibility to the threat. In the case of a disease whose consequences are as deadly as those of AIDS, any negative outcomes resulting from an intervention should be carefully monitored and studied.

Media campaign developers should note that test marketing need not be limited to a single campaign that has been selected as the presumed best among alternatives from prior stages of evaluation. If more than one campaign still seems promising after the copy-testing stage, it is both possible and desirable to carry out randomized comparisons of the two (or more) variations across different markets. AIDS campaign sponsors must decide whether there are, indeed, several potentially effective campaign designs when they carry out their own evaluations of story boards for future stages of the campaign. If there are, the sponsors should not hesitate to produce different campaigns aimed at the same selected population groups and then evaluate the campaigns' comparative efficacy in test markets.

Randomized tests of alternatives can also be used to assess the relative impact of different versions or components of a campaign. For example, to test two approaches to altering norms regarding abstinence, a campaign directed toward parents and children might be run in some markets, and a campaign directed toward the general population might be run in others. A similar approach could be taken to establish the relative strength of different components of any one strategy. Let us take Phase IV of America Responds to AIDS as an illustration, even though it will end in a few months. Phase IV consists of different sets of messages for parents and children; one or the other set might be more acceptable to the group selected for the intervention and thus be more effective. The relative strengths of the two could be determined with an experimental test of alternatives, in which some markets received only the parent messages and some only the children messages; other test strategies might include giving the messages in different orders or both at once. Of course, this approach should be reserved only for questions that remain unresolved after formative research.

Methodological Issues

There are several methodological issues that must be addressed when undertaking efficacy trials of the type recommended by the panel. The fact that PSA campaigns must rely on donated air time makes it difficult to conduct an efficacy trial of this approach to AIDS prevention. To determine efficacy, the selected test markets must air the PSAs on an optimal schedule. To expect this kind of cooperation from randomly selected media markets is unrealistic, but it might be achieved by providing desirable incentives, for example, partial or full payment for the PSAs. Efficacy trials would certainly be feasible using paid air time. The fact that incentives or payment would make the test different from what would occur in the full implementation is not a serious concern: the intent at this stage is to determine what effects the campaign could have if it were implemented optimally (efficacy) and not to test what would happen during the real implementation (effectiveness).

Whether a paid advertising campaign would be more cost-effective than a PSA campaign is an untested question. A paid campaign might produce exposure and effects that were so much better than those produced by a PSA campaign that it would be worth the increased costs. Perhaps the exposure and effects of a PSA campaign do not justify the high costs of producing efficacious spots (and certainly nothing can justify the lower costs of producing ineffective spots). The panel recommends systematic comparative tests of paid advertising versus PSA campaigns. To conduct such tests, air time on desirable shows (those likely to be viewed by the selected audience for the campaign) must be purchased in some random markets and not in others.

A common alternative to the ideal approach of using multiple, randomly selected test and control markets is to use only one or two cooperative test markets and the same or a larger number of well-matched control sites. Trials of a campaign in "lead" or test markets before national implementation are an example of this approach. Any serious campaign that is planned to last at least six months should be launched in test markets. However, this approach is acceptable only if extensive attempts are made to ensure that the selected control markets are comparable. By conducting a test-market evaluation, researchers study only a small number of areas. These areas, however, do not provide a probability sample of the population the campaign is intended to reach.

In such test-market evaluations, it is important to make a substantive investment in the development of formative evaluation materials and interventions that are as close to the finished product as possible. Animatics or video story boards should be chosen rather than the simple story boards used in laboratory experiments. Without efforts to make the tests as realistic as possible, their implications for the full campaign will be lessened.

Experimental tests of alternatives might seem like an expensive approach to determining whether a campaign can make a difference. There will probably be many instances, however, in which one or another approach will be found to be more effective and the other can then be dropped rather than implemented on a wide scale. Without this way to discriminate among approaches, the less effective strategy may be the one to be implemented on a wide scale. Even if one strategy is no better or no worse than another, the existence of the experimental test of alternatives allows for a more solid assessment of project efficacy. The resources required for such assessments, as well as the limitations that must be accepted, are described below.

Resources and Aspirations

Without high-quality efficacy trial data, policy makers, CDC officials, other public health officials, budgeting staff (at CDC, the Office of Management and Budget, and the General Accounting Office), and the public are left without any reliable indicators of the likely value of their investments. Yet high-quality trials require professional staff with training and experience in experimental field tests of media campaigns, as well as staff to design and oversee all associated data collection and analysis. Industry typically spends as much as $200,000-$300,000 for efficacy trials (market tests) for a $10 million campaign. Using this figure as a benchmark, we estimate that the resources needed for efficacy trials of an AIDS media campaign prior to national implementation will be 2 to 3 percent of the expected total campaign cost (including donated or in-kind costs from all sources).

There may be a concern that the social context of the epidemic is subject to unexpected rapid developments; e.g., publicity surrounding Rock Hudson's death may have precipitated quick changes in the public's knowledge about AIDS. In this context, the length of time required for mounting the recommended activities (almost a year to conduct formative and efficacy research before a campaign can be fielded) may cause concern. However, it is our contention that one year to develop a campaign is not excessive within the 20-30 years presumed necessary for AIDS prevention. Only rarely are changes likely to be so dramatic and so fast that a campaign idea will have to be dropped halfway through development. However, the possibility of fairly rapid changes serves to emphasize the need for continuous formative research and efficacy and effectiveness trials and for continued monitoring of campaign effects.

The panel recognizes that there are limitations to the generalizability of test-market effects in pilot studies such as these. Of course there will be variant contexts in which campaigns are aired, but the panel believes the substance of the effects will be similar in similar groups. Randomized exposure of campaigns among audiences representative of the target populations will go a long way toward factoring out spurious effects. Then, once efficacy has been determined and the campaign has been implemented, process evaluation can begin to assess the delivery of the intervention.
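Before turning to process evaluation, a compact illustration of the test-versus-control comparison that underlies the efficacy trials recommended above may be useful. The chapter does not prescribe an estimator; a difference-in-differences contrast, sketched below in Python with entirely hypothetical tracking-survey figures, is one standard way to separate a campaign's incremental effect from the background trends (news coverage and other AIDS information) shared by all markets.

```python
# Fraction of monthly tracking-survey respondents answering a knowledge item
# correctly, before and during the test campaign. All numbers are invented.

def mean(xs):
    return sum(xs) / len(xs)

test_before,    test_during    = [0.41, 0.43, 0.42], [0.52, 0.55, 0.57]
control_before, control_during = [0.40, 0.42, 0.43], [0.45, 0.46, 0.47]

change_test    = mean(test_during)    - mean(test_before)     # campaign + background
change_control = mean(control_during) - mean(control_before)  # background only
incremental_effect = change_test - change_control              # campaign alone

print(f"Test-market change:    {change_test:+.3f}")
print(f"Control-market change: {change_control:+.3f}")
print(f"Incremental effect:    {incremental_effect:+.3f}")
```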

66 ~ EVALUATING AIDS PREVENTION PROGRAMS The pane] recognizes that there are limitations to the general~zability of test-market effects in pilot studies such as these. Of course Here will be variant contexts in which campaigns are aired, but the pane] believes the substance of the effects will be sunilar in similar groups. Randomized exposure of campaigns among audiences representative of the target populations will go a long way toward factonug out spurious effects. Then, once efficacy has been determined and the campaign has been implemented, process evaluation can begin to assess the delivery of the intervention. PROCESS EVALUATION: WHAT IS ACTUALLY DELIVERED? The questions of concern in this section involve the implementation of the campaign: whether the campaign was aired at all, how often it was aired, and to whom. There are three major approaches to process evaluations of media campaigns: audits of PSA broadcasts through Broadcast Advertis- ers Reports data; monitoring of AIDS information requests through cans to AIDS telephone hotlines; and general population surveys of campaign awareness. In the past, CDC's process evaluation efforts for media campaigns have primanly involved auditing PSA broadcasts through Broadcast Ad- vertisers Reports (BAR) data, which provide information on what PSAs were broadcast on particular stations and at what times. By merging BAR data win information on audience characteristics obtained from A.C. Nielsen, it is possible to generate analyses of the types of people who were exposed to the PSAs. .. .... . . . Another way to evaluate PSA delivery is to monitor whether viewers of a PSA make requests for more inflation, the opportunity for which is provided by the SOO-number telephone hotline associated with the PSA campaign. The AIDS hotline has been operated by the American Social Beaten Association under contract from CDC since February 1987. By September 1987, the hotline had 17 lines in place; one month later, it had increased its capacity fourfold. Figure 3-} presents data on the frequency of both taped and operator-han~ed calls to the hotline from February 1987 to February 1989. As He figure shows, there was an increase from approximately 70,000 cans per month during He summer of 1987 to approximately 200,000 calls per month In the fall of 1987, a Mend that continued through June 1988. This increase might be attributable to the PSA campaign and to the publicity associated with it. However, the proportion of media coverage that actually resulted from the PSA campaign compared win the

EVALUATING MEDIA CAMPAIGNS ~ 67 250 200 ~0 100, ~0 Thousands . _ s . . _ 111~~ ..... 1 , .......... 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 (37 ~ (38 ~ (39 Month/Year . Live Operator Taped Calls FIGURE 3-1 Calls received by He AIDS hotline by mono, from February 1987 to February, 1989. (Note: Hotline calls are reported starting February 16, 1987; there were technical problems wide taped responses in September-October, 1987). Source: National AIDS Hotline. proportion that resulted from other publicity and campaigns is unknown and difficult to estimate accurately.4 The further increase that occurred in operator-handled calls in May-June 1988 may be attributable to the national mailing and the associated media coverage; the elimination of taped calls in June 198S, however, makes this result less interpretable. Following the May-June increase, the number of hotline calls per month dropped by half from June to August 1988 and remained at that level dunng subsequent months except for a small increase in January 1989, the cause of which has not been determined. A more expensive but considerably more effective method for assess- ing exposure to a media campaign is to take general population surveys of campaign awareness. A special version of this approach is to conduct a telephone survey of a random sample of television households that limits contact to the few minutes directly following the broadcast of a PSA in a particular market. Coincidental surveys of this type are a unique way of determining how many people actually sit through a PSA when 4Although America Responds lo AIDS is the only campaign to have aired the hotline telephone number' the number is readily available through directory assistance and AIDS support groups.

68 ~ EVALUATING AIDS PREVENTION PROGRAMS it is broadcast (rather than leaving the room or shifting their attention elsewhere). Methodological Issues All of the above strategies can provide information for process evaluation, but each has methodological problems that act as counterweights to potential benefits. Some of Be problems can be corrected when, as In this instance, the approaches are used for process evaluation. However, when these approaches are used to assess effectiveness (discussed in the next section), the problems often are much harder to resolve. The Broadcast Advertisers Reports (BAR) data are limited in their ability to provide complete and accurate inflation on how often PSA messages are broadcast. Information on PSAs is not routinely collected at the same level of detail as information on commercial advertisements. PSA data are limited to major television markets and are reported only once every four weeks. Although this limitation is not a major problem for monitoring general trends (because most television stations use a standard PSA rotation scheme that is fairly constant from one week to the next over short penods of time), the limitation of BAR data to major markets is a serious constraint on the ability to cause where and how often PSAs appear. , ~ ~ ,, ,._. _ ~,~ ..~ ,, Several methods are available to remedy this situation. One common approach used by agencies that manage PSAs is to send a mad! question- na~re (or "bounce-back" postcard) on PSA airing and rotation schedules to station managers responsible for PSA scheduling. This method is not particularly accurate, however, because only a small percentage of the questionnaires or cards are returned. A better method is to conduct a systematic telephone survey with the managers. A probability sample could be selected in a rolling pane] designs that would minimize the cost of Be survey and the burden borne by respondents and yet stir] yield unbiased and efficient estimates of how often the campaign PSAs are being broadcast throughout the country. An evaluation of campaign exposure in all markets throughout the country might produce more accurate information by malting use of the AIDS hotline callers as respondents. It is known that hotline calls increase for a short period of time after the broadcast of a PSA. The monitoring system for hotline calls could easily be enhanced to record the telephone exchange numbers or area codes of all callers and the time SA rolling panel design is one in which a portion of the panel is dropped at every wave and a new representative sample added.

An evaluation of campaign exposure in all markets throughout the country might produce more accurate information by making use of the AIDS hotline callers as respondents. It is known that hotline calls increase for a short period of time after the broadcast of a PSA. The monitoring system for hotline calls could easily be enhanced to record the telephone exchange numbers or area codes of all callers and the time of their calls; alternatively, callers could be asked to provide their zip codes. These data could be used to indicate indirectly when a PSA was broadcast in a particular market. A burst of calls to the hotline from one particular market in the country between 7:15 and 7:30 p.m. on a particular evening would almost certainly mean that one of the campaign PSAs was broadcast in that market between 7:00 and 7:15 p.m. A few simple questions posed to hotline callers could verify the broadcasting station's number or call letters; this information in turn could be linked with Nielsen data to generate estimates of the size and composition of the viewing audience for that station at that time.

Unfortunately, data on viewers, which are necessary for any comprehensive assessment of implementation, are weak for most television markets because they are based only on diary samples that are obtained during designated "sweep" weeks each year. Programming options tend to be unique during these weeks; as a result, it is not clear that information obtained through these diaries is characteristic of the markets at other times of the year. Furthermore, even if these data accurately reflected the number of people viewing a particular program on a particular station, they provide no information on whether the PSAs captured the attention of the audience. As noted above, the only feasible way to resolve these problems is to carry out a coincidental telephone survey in which a random sample of television households are contacted shortly after the broadcast of a PSA. Brief interviews with respondents can determine whether they were viewing the station on which the PSA was broadcast and, if so, whether they watched the PSA.

Telephone coincidental surveys are not totally without difficulties, however. Their main drawback is that they require enormous coordination to know when a PSA will be broadcast on a particular station in a particular market. Although this kind of information is available for paid advertisements, it is not for most PSAs, making it nearly impossible to implement the strategy of coincidental phone calls nationally. Fortunately, the AIDS hotline could be used to help resolve this problem. The same strategy of monitoring hotline calls to determine indirectly when a PSA was broadcast in a particular market could allow interviewers working in conjunction with the hotline to pinpoint markets for random-digit-dialing coincidental interviews. A small team of no more than 3 or 4 interviewers added to the 68 who are already answering hotline calls could phone random households in areas with PSA broadcasts reported during the prior 30-minute period. Ongoing assessment of this sort could provide valuable information on the clarity of the message and the attentiveness of the audience. These strategies are relatively modest in terms of the resources required for their implementation.
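A minimal sketch of the burst-monitoring idea, assuming calls are logged as (market, timestamp) pairs, with the market derived from the caller's area code and exchange prefix; the per-market baseline rates and the three-times-baseline threshold are invented for illustration.

```python
from collections import Counter
from datetime import datetime

def flag_psa_bursts(calls, baseline, threshold=3.0):
    """Count hotline calls per market per 15-minute window and flag windows
    whose volume jumps well above that market's usual rate -- indirect
    evidence that a campaign PSA was just broadcast there."""
    counts = Counter()
    for market, t in calls:
        window = t.replace(minute=(t.minute // 15) * 15, second=0, microsecond=0)
        counts[(market, window)] += 1
    return [(m, w, n) for (m, w), n in counts.items()
            if n >= threshold * max(baseline.get(m, 0.5), 0.5)]

# Ten calls from one market between 7:15 and 7:30 p.m. suggest a PSA aired
# there between 7:00 and 7:15 p.m.
calls = [("212-555", datetime(1989, 2, 1, 19, 16 + s)) for s in range(10)]
print(flag_psa_bursts(calls, baseline={"212-555": 1.0}))
```

Flagged windows could then be handed to interviewers for the random-digit-dialing coincidental calls described in the text.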

Resources and Aspirations

It is important to recognize that the evaluation of an ongoing set of PSAs brings a media campaign to a critical decision point: whether to continue broadcasting these advertisements, to take them off the air, or to introduce a new set of PSAs. The strategy that has just been described, which involves parallel monitoring of trend information from the national telephone hotline and from telephone coincidental surveys, could prove to be the most useful evaluation tool for this purpose. Indications that particular PSAs have been seen so many times that they no longer provoke thoughtful attention and no longer evoke the same serious responses (as indicated by calls to the AIDS hotline) provide strong, face-valid evidence that it is time to introduce another set of messages.

To monitor the success of a media campaign's implementation, however, it is necessary to obtain more detailed information than is currently available regarding how often the campaign messages are broadcast on television stations throughout the country. Probably the best way to get this information is to supplement BAR data with an ongoing telephone survey of scheduling managers in a probability sample of television stations that have received the PSAs. A survey of this kind would require no more than the equivalent of one full-time telephone interviewer and some administrative support to help design the sampling frame and the plan for systematic sampling in a rolling panel design.

In addition, more use could be made of the AIDS hotline in the variety of ways suggested above. First, consideration should be given to augmenting the hotline protocol with a brief set of questions. We recognize that although this suggestion sounds and should be simple, the situation is complicated by current policy that disallows questioning of hotline callers. We urge that present policy be amended to allow a short, approved list of questions that are carefully crafted to maintain a caller's confidentiality. These questions should cover where and when the caller saw the PSA, which PSA the caller saw, and whether the caller would be willing to participate in a follow-up assessment of the campaign. In addition, a content code should be developed to record the nature of the questions posed by the caller. Given the enormous number of calls handled by the hotline, the selection of a random sample of callers to receive these more detailed questions would be adequate.

The resources needed for this activity would include two people: a project manager who is capable of designing a brief series of questions and a sampling scheme to embed these questions in hotline interviews, and a computer programmer capable of augmenting the telephone control system associated with the hotline. The augmented telephone system would record the time of calls and the telephone exchange of callers: that is, the area code and 3-digit prefix identifying the telephone exchange, not the entire telephone number. Both of these pieces of information can be recorded mechanically.
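As a sketch of how little the augmented system need retain, the fragment below stores only a timestamp and the first six digits of the caller's number; the sample number is fictitious.

```python
from datetime import datetime, timezone

call_log = []

def log_call(raw_number):
    """Record the call time and the caller's exchange (area code plus 3-digit
    prefix) only -- never the full telephone number -- so that no individual
    caller can be identified from the log."""
    digits = "".join(ch for ch in raw_number if ch.isdigit())[-10:]
    call_log.append({"time": datetime.now(timezone.utc), "exchange": digits[:6]})

log_call("(202) 555-0147")   # fictitious number; only "202555" is stored
```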

The panel also considers it potentially important to implement an ongoing telephone coincidental survey in parallel with the hotline system. As noted above, this survey could involve as few as 3 or 4 additional telephone interviewers working together with the current 68 hotline personnel. The same project manager recommended in the previous paragraph could develop an interview schedule and coordinate the design of this ongoing survey, as well as monitor trends in the data. The interview data obtained in such a survey could be managed by a computer-assisted telephone interview system, thus avoiding any need for data entry.

Costs for all of the resources noted in this section should not exceed 2 percent of the total cost of the campaign. The total cost includes the donated air time for PSAs as well as any other in-kind costs from all sources.

OUTCOME EVALUATION: DOES THE CAMPAIGN MAKE A DIFFERENCE?

Once a campaign has been implemented on a broad scale, any outcome evaluation that is done attempts to answer the question "Does it work?"—that is, is it effective as it was actually implemented? The answers to this question are usually less clear than the answers to the efficacy question ("Can the campaign make a difference?") because the level and integrity of a campaign's implementation may vary widely. Thus, if a campaign does not work very well, it is difficult to determine whether it was because it was not well implemented or because it was not efficacious—that is, it could not have been effective even if it had been well implemented. The previous section discussed determining the efficacy of a campaign. This section considers the panel's recommended approach to determining the effectiveness of a campaign.

Randomized Experiments

As discussed in Chapter 1, the panel recommends randomized experiments to determine the effectiveness of AIDS media campaigns. Several strategies are possible for the conduct of such trials. The ideal approach would be to implement the campaign in one-half of the national media markets, randomly selected, and delay implementation in the other half for six months or more. (This approach is analogous to wait-list controls in clinical trials.) Another alternative would be to implement a "phased roll-out" of the campaign; that is, implement the campaign in several randomly selected markets each month over a 6- to 12-month period. Still another approach would be to implement two very different campaigns with different objectives at the same time and then switch them after six months (a switching replication design). For campaign components that have different objectives, it might be possible to deliver different components to different markets for 3 to 6 months and then switch them (again, the switching replication design). This evaluation approach would be particularly valuable for determining whether specific media campaign strategies were effective in achieving their specified objectives.
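The random assignments these designs require are mechanically simple; a sketch follows, under the assumption of 24 hypothetical media markets and a six-month roll-out.

```python
import random

def assign_wait_list(markets, seed=0):
    """Wait-list design: randomly split markets into a half that receives the
    campaign immediately and a half whose implementation is delayed."""
    rng = random.Random(seed)
    shuffled = markets[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"immediate": shuffled[:half], "delayed": shuffled[half:]}

def assign_phased_rollout(markets, n_months=6, seed=0):
    """Phased roll-out: start the campaign in a random subset of markets each
    month over several months."""
    rng = random.Random(seed)
    shuffled = markets[:]
    rng.shuffle(shuffled)
    return {month: shuffled[month::n_months] for month in range(n_months)}

markets = ["MARKET-%02d" % i for i in range(24)]   # hypothetical media markets
print(assign_wait_list(markets)["immediate"])
print(assign_phased_rollout(markets)[0])           # markets starting in month 0
```

A switching replication would reuse assign_wait_list, delivering one campaign to each half and exchanging the campaigns at the six-month mark.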

Effectiveness evaluation is more complex than efficacy evaluation because it requires an assessment of the extent to which changes in measured outcomes can be linked to PSA exposure. One way to assess this linkage is by means of time-series analyses that monitor trends in these outcomes over the course of the campaign period. The National Health Interview Survey (NHIS), a weekly survey of samples of 800 respondents in the United States, has been used for this purpose specifically to evaluate the effects of the America Responds to AIDS campaign. Unfortunately, the data cannot be interpreted with any confidence because respondents were not randomly assigned to exposure to the campaign or to a nonexposure control condition.

National campaigns like America Responds to AIDS make it possible to analyze naturally occurring variations in campaign exposure by comparing aggregate trends in the outcomes across television markets that differ in their frequencies of airing the PSAs. One should recognize, however, that the determinants of this variation in exposure need to be introduced as explicitly as possible into the analysis and the interpretation of results to evaluate the possibility of spurious associations. It is plausible to expect the greatest changes in the outcomes in areas of the country in which the PSAs were shown most often; however, it might be that the decision to air the PSAs frequently was a response by local station officials to the fact that knowledge or attitudes were not changing in their communities, in which case one could well find exactly the opposite aggregate pattern. Owing to this problem, caution is needed in drawing conclusions about campaign effectiveness from simple analyses of this sort.
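If market-level airing counts and before-and-after outcome measures were available, the naturally occurring variation could be tabulated as in the sketch below; the field names are hypothetical, and for the selection reasons just noted the gradient it reports is descriptive, not causal.

```python
def change_by_exposure(markets):
    """Average outcome change within terciles of PSA airing frequency.
    `markets` is a list of dicts with keys 'airings', 'before', and 'after'.
    A rising gradient is only suggestive: heavy airing may itself be a
    station response to slow local change (a selection effect)."""
    ranked = sorted(markets, key=lambda m: m["airings"])
    k = max(len(ranked) // 3, 1)
    groups = {"low": ranked[:k], "middle": ranked[k:2 * k], "high": ranked[2 * k:]}
    return {name: sum(m["after"] - m["before"] for m in g) / len(g)
            for name, g in groups.items() if g}
```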

A number of data sources are readily available for use in effectiveness evaluations. The panel comments on three of them here: the NHIS, as an example of population surveys that can be used in evaluation; the AIDS hotline; and other archival sources.

The National Health Interview Survey

The National Health Interview Survey can be a particularly useful source of data for the analysis of aggregate trends because it has included AIDS-related items since August 1987. Data are collected from 800 adults per week. Questions are designed specifically to "provide estimates of public knowledge and attitudes about AIDS transmission and prevention and AIDS virus infection . . . for monitoring educational efforts, e.g., the series of radio and television public service announcements entitled America Responds to AIDS and the brochure Understanding AIDS" (Dawson, 1988). The current version of the NHIS includes questions on the following:

· sources of AIDS information;
· self-assessed level of AIDS knowledge;
· basic facts about the AIDS virus and how it is transmitted;
· blood donation experience;
· awareness of and experience with the blood test for the AIDS virus;
· perceived effectiveness of selected preventive measures;
· self-assessed chances of getting the AIDS virus;
· personal acquaintance with persons with AIDS or the AIDS virus;
· willingness to take part in a proposed national seroprevalence survey; and
· a general risk behavior question.

The National Center for Health Statistics publication Advance Data No. 163 provides comparisons of the August 1987 and August 1988 responses. By 1988, population knowledge and attitudes had improved significantly in many respects, although there were still misconceptions held by a large proportion of the population.

In addition to the questions that already appear on the NHIS, a small number of items could be added from time to time to assess particular knowledge, attitudes, and beliefs. For example, seven new items were added to the May, June, and July 1988 surveys to evaluate the receipt and use of the brochure mailed out in the May-June period of Phase II of the media campaign. The results indicate that the brochure was received by 63 percent of households, read by someone in more than one-half of those households, and discussed with others by about one-third of those who read it. This type of approach can constructively be used for any new campaign phase.
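Those brochure figures cascade multiplicatively; treating "more than one-half" and "about one-third" as point estimates (which the survey did not claim), a quick worked check of household reach:

```python
received = 0.63          # share of all households receiving the brochure
read = received * 0.5    # read by someone in roughly half of those households
discussed = read / 3     # discussed with others by about a third of readers
print(round(read, 3), round(discussed, 3))   # 0.315 and 0.105 of all households
```

So the brochure was read in roughly a third of all households and discussed in roughly a tenth, the kind of funnel against which a new campaign phase could be benchmarked.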

The panel recommends that items be added to the National Health Interview Survey to evaluate exposure to, recall of, responses to, and changes resulting from a new phase of the media campaign.

Hotline Calls

In a similar way, it is relatively simple to log the number of AIDS hotline calls received per month. As discussed above, additional information could be easily obtained from the hotline.

The panel recommends that CDC increase the usefulness of hotline data for media campaign assessments by collecting evaluation-related data such as the caller's geographic location, selected caller characteristics, issue(s) of concern, and counselor responses.

Each of these types of information could then be related to the issues addressed in the campaign and the regional variations that may have been used in the campaign's implementation. For Phase IV of the America Responds to AIDS campaign, possible exposure to the PSAs should also be related to whether the caller is an adolescent, the parent of an adolescent, or an educator. Because the campaign is specifically directed toward these groups, one would expect more calls from people in these groups than from other people. To the extent that calls from the general population do increase, such a trend might indicate success in reaching the general population. The collection of additional detailed data on the type of question the caller asked and the type of written material (if any) mailed to him or her would further improve the ability to link campaign activities to population behavior. Following up some callers, with their prior permission, would allow for an assessment of subsequent knowledge and behavioral changes.
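One way the enhanced logging form might be structured is sketched below; every field name and content code is invented for illustration, and any real protocol would be constrained by the confidentiality policy discussed earlier.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class HotlineCallRecord:
    """One logged call, carrying the evaluation-related data the panel
    recommends: location, caller type, issues of concern, and responses."""
    time: datetime
    exchange: str                        # area code + 3-digit prefix only
    caller_group: Optional[str] = None   # e.g., "adolescent", "parent", "educator"
    issue_codes: List[str] = field(default_factory=list)    # coded question topics
    materials_mailed: List[str] = field(default_factory=list)
    followup_consent: bool = False       # prior permission for later assessment

record = HotlineCallRecord(time=datetime(1989, 1, 5, 19, 22), exchange="404555",
                           caller_group="parent",
                           issue_codes=["TRANSMISSION", "TESTING"])
```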

Other Archival Sources

Evaluators of national AIDS prevention campaigns should collect or monitor data on several other AIDS-relevant archival indicators—such as condom sales and reports of STDs of all types—and use them to aid the interpretation of other evaluation data. These examples and others are all possible indicators of changes in societal norms and behaviors, at least when considered in combination. The limitations in their ability to indicate such changes are discussed in the following section on methodology.

Methodological Issues

As noted earlier, evaluating the effectiveness of a PSA campaign is extremely difficult. To attribute observed changes to a particular campaign requires randomized tests of alternatives or of lagged implementation. This being the case, it is useful to distinguish between problems in effectiveness evaluation of current campaigns and phases, and problems in the evaluation of future campaigns and phases.

Effectiveness Evaluation of Current Activities

There are three levels of analysis that may be undertaken to judge the effectiveness of current campaign activities: aggregate analyses, partially disaggregated analyses, and disaggregated analyses. Each has methodological problems that may limit its use. As noted above, aggregate analyses of campaign effectiveness provide trend information that could be explained on the basis of determinants other than exposure to the campaign. Partially disaggregated analyses—comparing differential trends in outcomes across geographic areas that vary in intensity of campaign exposure—are subject to biases introduced by selection factors. Disaggregated analyses are difficult to perform because measures of individual differences in exposure to the campaign are usually unreliable and often systematically biased. The best one can hope for in a situation of this sort is to make a concerted effort to examine trends on all three of these levels and obtain as much information as possible about potentially important selection factors. The documentation of consistent findings across the three different levels of analysis, coupled with evidence that the trends persist in the face of adjustments for potentially important selection factors, can be taken as strongly suggesting that an association between exposure to the campaign and outcomes of interest is likely to be causal.

The combination of trend monitoring through the AIDS telephone hotline and the parallel implementation of a telephone coincidental survey in an ongoing fashion could also help to provide evidence about effectiveness. The combination of these data would allow an evaluator to determine whether the hotline continues to be used by the same percentage of people who are exposed to the campaign over the course of time. This information is more relevant to determining whether a campaign continues to be effective in calling attention to the availability of additional information through the hotline than it is to the overall effectiveness of the campaign; nevertheless, the issue of continued effectiveness is critical in a campaign that self-consciously defines itself as having a series of phases. Given the seriousness of the AIDS epidemic as well as the expense of mounting a media campaign to slow its transmission, one type of evaluation should be used to determine how long each phase should be maintained (instead of relying on anecdotal evidence about the "typical shelf life" of PSA material, as is currently done). Monitoring hotline calls, and particularly the trends in these calls that can be associated with different PSAs, as well as trends in the types of information requested in the calls, could provide extremely useful information on when particular messages begin to lose their effect or their purpose.
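A sketch of that wear-out check: if the coincidental survey yields an estimate of how many viewers saw a given PSA in each period, the rate of attributable hotline calls per exposed viewer can be tracked per PSA. The decline threshold below is chosen purely for illustration.

```python
def wearing_out(calls_by_period, viewers_by_period, floor=0.5):
    """True when a PSA's response rate (hotline calls per thousand estimated
    exposed viewers) falls below `floor` times its initial rate -- face-valid
    evidence that the message no longer provokes the same response."""
    rates = [1000 * c / v for c, v in zip(calls_by_period, viewers_by_period) if v]
    return len(rates) >= 2 and rates[-1] < floor * rates[0]

# Invented monthly figures for one PSA: exposure holds steady, response decays.
print(wearing_out([900, 700, 400, 250], [1.2e6, 1.1e6, 1.2e6, 1.3e6]))   # True
```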

Separating the effects of any national campaign from other national or local campaigns is difficult, but it is particularly so when there are numerous other campaigns in operation, as is often the case with AIDS. It is just as important, and just as difficult, to separate the effects of a national media campaign from national news coverage about AIDS. Nevertheless, the monitoring of other associated activities needs to be an integral part of outcome evaluation for the campaign. Evaluators need to determine to what extent observed changes in knowledge, attitudes, and beliefs are the result of the national mass media campaign and not the accumulated effects of news coverage or of a great many local campaigns. For example, it is impossible to attribute changes in knowledge from August 1987 to August 1988 to either the mailed brochure or the PSA campaign.

With hindsight, it is apparent that one possibility for causal attribution would have been a nonequivalent dependent variable approach: that is, a study could have been designed so that some knowledge, attitudes, or beliefs would have been changed by the brochure but not by the PSAs or anything else; some would have been changed by the PSAs but not by the brochure or anything else; and some would not have been changed by either the brochure or the PSAs or anything else. Additions to the NHIS could easily have been designed with this approach in mind, as could future surveys, for example, to contribute to the evaluation of Phase IV of the America Responds to AIDS campaign.

Another possible method of causal attribution would entail mapping any patterns of increased knowledge collected by the NHIS along with patterns of PSA airings. If the two patterns tracked fairly well, competing explanations for change could be discounted. A further possibility for evaluating the effectiveness of the brochure is to identify geographic regions that differed significantly in the date of delivery of the brochure. A several-week lag would be required to detect similarly lagged changes in self-reports of brochure receipt and changes in knowledge, attitudes, and beliefs.
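The nonequivalent dependent variable logic can be written down directly: tag each survey item by which intervention, if any, should move it, then check the observed changes against that prediction. Item names, tags, data, and the five-point change threshold below are all fabricated for illustration.

```python
def nedv_pattern(items, min_change=5.0):
    """Check observed knowledge changes (percentage points) against the
    prediction: brochure-targeted items move, PSA-targeted items move,
    and control items targeted by neither stay flat."""
    verdict = {}
    for name, (target, before, after) in items.items():
        moved = (after - before) >= min_change
        verdict[name] = moved == (target in ("brochure", "psa"))
    return verdict

items = {   # (targeted by, % correct before, % correct after) -- invented data
    "casual_contact_myth": ("brochure", 60.0, 72.0),
    "condom_effectiveness": ("psa", 55.0, 63.0),
    "unrelated_health_item": ("none", 70.0, 71.0),
}
print(nedv_pattern(items))   # all True: pattern consistent with campaign effects
```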

Effectiveness Evaluation of Future Activities

For future phases of the America Responds to AIDS campaign or for new campaigns, a much more accurate assessment of effectiveness could be obtained by staggering the implementation of the new activity in a randomized sample of markets. The selection of early-exposure markets should be based on the degree to which they are typical of the entire country and on their lack of "spill-out" (picking up television signals in one market from adjacent markets).6 A systematic variation of this sort, if followed throughout subsequent stages of the campaign, would make it possible to use the analysis of interrupted time-series to evaluate effectiveness.

In this approach, change in the time trend of outcomes associated with the introduction of a new intervention could be interpreted as the result of the intervention rather than the result of other contemporaneous changes in the larger environment. By staggering the introduction of new campaign phases, the validity of this assumption could be assessed by determining whether parallel trend changes occurred in early-exposure and later-exposure markets (separated by perhaps 6-12 months). If parallels can be documented, an evaluator can discount the otherwise plausible concern about influences other than the campaign leading to the change. If parallels cannot be found, then the evaluator should be aware that idiosyncratic influences other than the campaign are likely to be involved. When combined with the use of naturally occurring variations (noted above), this approach maximizes an evaluator's ability to accurately assess campaign effectiveness.

6. This approach relies on the compliance of station managers to air the test PSAs. This type of evaluation would be easier to implement if air time were purchased rather than donated.
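A sketch of the staggered interrupted time-series comparison, reduced to simple before-and-after means per market group (a full analysis would also model trend, seasonality, and autocorrelation); the monthly series are invented.

```python
def level_shift(series, start):
    """Change in a market group's mean outcome when the campaign is
    introduced at index `start` (crude before/after comparison)."""
    before, after = series[:start], series[start:]
    return sum(after) / len(after) - sum(before) / len(before)

# Invented monthly knowledge scores: the campaign starts in month 6 for
# early-exposure markets and month 12 for later-exposure markets.
early = [50, 51, 50, 52, 51, 52, 58, 59, 60, 60, 61, 61, 62, 62, 63, 63, 64, 64]
late  = [50, 50, 51, 51, 52, 52, 52, 53, 53, 53, 54, 54, 60, 61, 62, 62, 63, 63]
print(level_shift(early, 6), level_shift(late, 12))   # roughly parallel shifts

# Shifts timed to each group's own start date, rather than to a common
# calendar date, argue for a campaign effect over an environmental change.
```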

Problems with Sources of Data

The NHIS and Other Surveys. Ongoing surveys have great value in that they can provide data that are useful for conducting some types of trend analysis. Their major limitation, however, concerns the relevance of the collected data: the closer the relation of the questions being asked to the campaign being evaluated, the more useful the survey. The NHIS is obviously of greatest relevance to the AIDS media campaigns in that a series of AIDS-related items has been included since August 1987. These items are limited to fairly general constructs, however, and cannot be used to evaluate focused objectives.

The one advantage of the NHIS over all other federally supported surveys is the ability to insert new items from time to time. To provide more relevant data, new AIDS-related items should collect focused attitude and behavioral information rather than exposure and immediate-use information only (the data collected on the mailed brochure). These items should also be incorporated in the NHIS for a longer period of time, including several months prior to a new campaign, to make some type of trend analysis possible. The NHIS is far too valuable a resource to be limited to items that assess exposure only over a short time frame. To maximize its usefulness, the NHIS should also be used as the vehicle for the nonequivalent dependent variable approach (described above; see Cook and Campbell, 1979). Under this approach, three types of items would be included: items that should show change as a result of a specific campaign; items that should change as a result of other concurrent campaign(s); and other items (albeit related) that should not be changed by any ongoing campaign.

Ongoing surveys are most useful when the campaign under evaluation is directed toward the general population; whenever a campaign has a more restricted focus (e.g., high-risk women, parents of adolescents, adolescents, school personnel), national surveys become less valuable. How much less valuable depends on how restrictively the selected population is defined. The narrower the population description, the less the selected population will be represented in the survey sample. In such cases, oversampling of the selected population is necessary. For example, CDC could support oversampling of the groups (adolescents, parents of adolescents) that are the focus of Phase IV of the America Responds to AIDS campaign.
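When a restricted population must be oversampled, design weights restore population proportions at analysis time; a minimal sketch, with stratum sizes and interview counts that are purely illustrative.

```python
def design_weights(strata):
    """Base weight = population count / interviews completed, per stratum,
    so that an oversampled group does not dominate population estimates."""
    return {name: pop / done for name, (pop, done) in strata.items()}

strata = {   # (population size, interviews completed) -- illustrative only
    "adolescents": (20_000_000, 2_000),            # oversampled target group
    "parents_of_adolescents": (30_000_000, 2_000), # oversampled target group
    "other_adults": (130_000_000, 1_000),
}
print(design_weights(strata))
```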

Hotline Calls and Other Archival Data. Currently, the logging forms for AIDS telephone hotline calls permit only the assessment of the number of calls from month to month, possibly by region. This limitation could and should be removed.

There are two major inferential issues with other archival data: What do observed changes mean? Can they be attributed to a particular campaign? Changes in the indicators listed above need to be interpreted with care. For example, increased sales of condoms have been viewed as an indication of a reduction in unsafe sex; however, the validity of this indication depends on how many of the purchased condoms were actually used, a fact that remains unknown. The meaning of fewer condom sales would be similarly unclear: if the new campaign succeeded in increasing the acceptability of abstinence, it might lead to a decrease in overall sexual activity (provided such a decrease had not already occurred) and to a concurrent decrease in condom sales.

Reductions in reported STDs of all types, in pregnancies among teenagers, and in the number of babies born with HIV could all indicate lower rates of sexual activity or safer sexual practices. As discussed in Chapter 2, these measures do not, by themselves, indicate which is occurring, although in combination with figures on sales of prophylactic and contraceptive devices or materials (e.g., condoms and spermicides), they might. Lower rates of STDs together with lower sales would suggest less sexual activity. In the absence of improved STD treatment, lower rates of STDs together with higher or unchanging condom and spermicide sales would suggest the increased use of safer sexual practices. Archival data must be used with great caution and in combination, without relying on any single indicator.
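That combination logic amounts to a small decision table; a sketch follows, with the obvious caveat that real indicators are noisy and no single cell settles the question.

```python
def interpret(std_trend, condom_sales_trend):
    """Joint reading of two archival indicators, each 'down', 'flat', or
    'up', following the combinations discussed in the text."""
    if std_trend == "down" and condom_sales_trend == "down":
        return "suggests less sexual activity"
    if std_trend == "down" and condom_sales_trend in ("up", "flat"):
        return "suggests safer practices (absent improved STD treatment)"
    return "ambiguous -- corroborate with additional indicators"

print(interpret("down", "up"))   # suggests safer practices ...
```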

Despite the valuable information provided by archival data, they offer no evidence on whether observed changes are due to a particular media campaign. Are the changes suggested by such indicators the result of the CDC campaign, some other campaign, or combined campaigns? Would they have occurred even without any campaign? From a public health perspective, it might not matter: the important point would be that sexual practices were becoming safer (and HIV infection should subsequently decrease). From a policy or cost-effectiveness perspective, however, it is critical to determine what caused the changes. If they would have occurred without the CDC campaign, the cost of the campaign was then a waste of resources that might have been used for something more valuable (the opportunity costs). From a policy perspective, it is obviously important to determine whether a particular campaign produced the effects attributed to it. The limitations that must be confronted and the resources necessary to make such a determination are discussed below.

Resources and Aspirations

Outcome evaluation of an ongoing national campaign such as America Responds to AIDS is limited by a number of factors. First, the analysis is based on a sample of one case. Second, the environment may vary in ways that induce changes in the target population that are parallel to the effects sought by the campaign, thus making it difficult to attribute trends in outcomes uniquely to campaign effects. Third, CDC does not control variations in exposure to the campaign either over time or over different television markets. NAIEP staff have indicated to the panel that the national media campaign is expected to continue into the foreseeable future; the panel believes strongly that unless randomized tests of alternatives or of lagged implementations are conducted, there is little hope for anything more than educated guesses about the meanings of any observed trends.

Outcome evaluation may also be limited by funding and staff availability. Although CDC has substantial resources for the media campaigns, resources for the evaluation of these activities are much more difficult to determine. Although special AIDS-related items have been and will be added to the NHIS, CDC needs to have adequate staff resources to design these added items to be of optimal value in an outcome evaluation and to analyze the resulting data in depth. Another example involves the AIDS hotline. CDC established the national AIDS hotline in 1982 and contracted with the American Social Health Association in 1987 to operate it, but there is little evidence of efforts to determine its effects or effectiveness beyond showing how many calls came in. Information on the number of calls received is useless without knowledge of who is calling, what types of questions are asked, how they were answered, what other information or services were provided to callers, and with what effects. The National Cancer Institute provides an example of better ongoing evaluation of a hotline (Stein, 1986). Another evaluation opportunity involves regional variations in media implementation and exposure. There have been no attempts to analyze the effects of these natural variations.

Conducting randomized tests of media campaigns is expensive but not prohibitively so. For example, a randomized test of delayed implementation would require surveys of randomly selected samples from media markets that receive a campaign first and from markets in which implementation is delayed. A national survey is appropriate as long as items about a respondent's region and exposure are included for all participants. As noted above, this could be done through added items on the NHIS, although a separate national survey would be preferable. A national survey of 5,000 randomly selected respondents, repeated three times, might cost from $500,000 to $750,000. For some campaigns or some phases of campaigns that are directed toward more specific populations, costs might double or triple because of the extensive screening that would be necessary to identify the sample. Randomized tests of alternative campaign strategies are more expensive, but only marginally so. The only additional cost is the production of alternative campaign materials; all other costs of campaign implementation and evaluation are the same as for the delayed implementation model.

Even for the most expensive approaches, the costs of outcome evaluations for media campaigns will not exceed the costs of program development and will be only a small fraction of the total costs of the campaign. Adequate resources for determining campaign effectiveness are estimated to be at least 5 percent of the expected total campaign cost (including donated or in-kind costs from all sources).
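The survey figures imply a per-interview cost that is easy to check, as is the 5 percent budget floor; a worked sketch in which the $40 million campaign total is a placeholder, not a figure from this report.

```python
respondents, waves = 5_000, 3
low, high = 500_000, 750_000
per_interview = (low / (respondents * waves), high / (respondents * waves))
print(per_interview)             # about $33 to $50 per completed interview

campaign_total = 40_000_000      # hypothetical total, including donated air time
print(0.05 * campaign_total)     # $2,000,000: the panel's 5 percent floor
```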

A final note is in order on the potential impact of changes in AIDS treatments and social phenomena. Any changes in treatments for AIDS, prevalence rates or infection patterns, public attitudes, or legislation regarding the social treatment of people with AIDS will have implications for public education. The panel urges that the developers of national AIDS prevention campaigns continuously monitor new developments and news reports and consider the implications of new findings, treatments, or vocabularies for future surveys and media campaigns.

REFERENCES

Cook, T. D., and Campbell, D. T. (1979) Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston: Houghton Mifflin.

Dawson, D. (1988) AIDS knowledge and attitudes for May and June 1988. NCHS Advance Data 160:1-24.

Flay, B. R. (1986) Efficacy and effectiveness trials (and other phases of research) in the development of health promotion programs. Preventive Medicine 15:451-474.

Flay, B. R., and Cook, T. D. (1981) Evaluation of mass media prevention campaigns. In R. E. Rice and W. Paisley, eds., Public Communication Campaigns. Beverly Hills, Calif.: Sage Publications.

General Accounting Office (GAO) (1988) AIDS Education: Activities Aimed at the General Public Implemented Slowly. GAO/HRD-89-21. December. Washington, D.C.: U.S. Government Printing Office.

Joseph, J. G., Emmons, C. A., Kessler, R. C., Wortman, C. B., O'Brien, K., Hocker, W. T., and Schaefer, C. (1984) Coping with the threat of AIDS: An approach to psychosocial assessment. American Psychologist 39:1297-1302.

Joseph, J. G., Montgomery, S. B., Emmons, C. A., Kessler, R. C., Ostrow, D. G., Wortman, C. B., O'Brien, K., Eller, M., and Eshleman, S. (1987) Magnitude and determinants of behavioral risk reduction: Longitudinal analysis of a cohort at risk for AIDS. Psychology and Health 1:73-96.

McCusker, J., Stoddard, A. M., Mayer, K. H., Zapka, J., Morrison, C., and Saltzman, S. P. (1988) Effects of HIV antibody test knowledge on subsequent sexual behaviors in a cohort of homosexual men. American Journal of Public Health 78:462-467.

McKusick, L., Horstman, W., and Coates, T. J. (1985) AIDS and sexual behavior reported by gay men in San Francisco. American Journal of Public Health 75:493-496.

Morgan, D. L., and Spanish, M. T. (1984) Focus groups: A new tool of qualitative research. Qualitative Sociology 7:253-270.

Office of Cancer Communications (1983) Making PSAs Work. Washington, D.C.: National Cancer Institute.

Office of Cancer Communications (1984) Pretesting Television PSAs. Washington, D.C.: National Cancer Institute.

Palmer, E. (1981) Shaping persuasive messages with formative research. In R. E. Rice and W. Paisley, eds., Public Communication Campaigns. Beverly Hills, Calif.: Sage Publications.

Stein, J. (1986) The Cancer Information Service: Marketing a large-scale national information program through the media. In D. S. Leathar, G. B. Hastings, K. M. O'Reilly, and J. K. Davies, eds., Health Education and the Media, Vol. 2. London: Pergamon.

Valdiserri, R. O., Lyter, D., Leviton, L. C., Callahan, C. M., Kingsley, L. A., and Rinaldo, C. R. (1988) Variables influencing condom use in a cohort of gay and bisexual men. American Journal of Public Health 78:801-805.

Wright, P. (1979) Concrete action plans and TV messages to increase reading of drug warnings. Journal of Consumer Research 6:256-269.
