PART III
Evaluation and Social Indicator Data



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 197
Community Programs to Promote Youth Development PART III Evaluation and Social Indicator Data

As more funding becomes available for community programs designed to promote youth development, higher expectations are being placed on programs to demonstrate (and not just to proclaim) that they do indeed promote the healthy development of youth. Well-established programs that draw on public funds are being adopted in cities and communities throughout the United States. Increasingly, these programs are expected to demonstrate that they actually do make a difference in young people’s lives. It is precisely because these programs are being established in a wide variety of communities that it is important to know if they make a difference and, if so, under what conditions and for whom. Moreover, given the recent call for significant investments in public resources, it would betray public trust not to document the steps taken to implement these programs and to provide evidence of their effectiveness.

How should one think about evaluation of community programs for youth in the future? What social indicators exist that help us understand community programs for youth? What else is needed in order to better understand and evaluate these programs? Part III explores the various methods and tools available to evaluate these programs, including experimental, quasi-experimental, and nonexperimental methods. Each method involves different data collection techniques, and each affords a different degree of causal inference, that is, a different degree of confidence about whether a particular variable or treatment actually causes changes in outcomes. Findings from evaluations using these methods were incorporated into Part II. We turn now to exploring evaluation methodologies in more detail, looking specifically at the role for evaluation (Chapter 7) and data collection (Chapter 8) for the future of these programs.

There are particular challenges inherent to evaluating community programs for youth. Many of these programs tend to be relatively new and are continually changing in response to growing interest and investments on the part of foundations and federal, state, and local policy makers. In addition, the elements of community programs for youth rarely remain stable and consistent over time, given that program staff are always trying to improve the services and the manner in which they are delivered. Moreover, some programs struggle to overcome barriers during the implementation phase, for example, to receive a license or permits, to acquire appropriate space or renovate a facility, or to recruit appropriate staff and program participants. As a result, early implementation of programs may not follow the specific plan.

Evaluation involves asking many questions and often requires an eclectic array of methods. Deciding what questions to ask and the methods to use for the development of a comprehensive evaluation is important to the wide range of stakeholders of community programs for youth.


CHAPTER 7
Generating New Information

This chapter explores the role of program evaluation in generating new information about community programs for youth. Evaluation and ongoing program study can provide important insights to inform program decisions. For example, evaluation can be used to ensure that programs are acceptable, assessable, developmentally appropriate, and culturally relevant according to the needs of the population being served. The desire to conduct high-quality evaluation can help program staff clarify their objectives and decide which types of evidence will be most useful in determining if these objectives have been met. Ongoing program study and evaluation can also be used by program staff, program users, and funders to track program objectives; this is typically done by establishing a system for ongoing data collection that measures the extent to which various aspects of the programs are being delivered, how they are delivered, who is providing these services, and who is receiving them. In some circles, this is referred to as reflective practice. Such data can provide very useful information to program staff to help them make changes to improve program effectiveness. And finally, program evaluation can test new or very well-developed program designs by assessing the immediate, observable results of the program outcomes and benefits associated with participation in the program. Such summative evaluation can be done in conjunction with strong theory-based evaluation (see later discussion) or as a more preliminary assessment of the potential usefulness of novel programs and quite complex social experiments in which there is no well-specified theory of change.¹ An example of the latter is the Moving to Opportunity program, in which poor families were randomly assigned to new housing in an entirely new neighborhood.

¹ Summative evaluations can be used with such programs to assess impact and estimate its size. Such evaluations are often called for when the government has invested very large sums of money in a novel omnibus social experiment, such as those linked to recent welfare-to-work reform or large-scale school reform efforts. Often such evaluations focus only on outcome differences between the treatment and the control groups. However, without some theory of change and measures of the processes assumed to mediate the impact of the program, such evaluations leave many questions unanswered and do not provide good guidance for either modifying or replicating the program in the future. We do not discuss such evaluations in any detail in this report. Interested readers are encouraged to look at the more standard textbooks on program evaluation for additional information, such as Shadish et al. (2001).

In other words, program evaluation and study can help foster accountability, determine whether programs make a difference, and provide staff with the information they need to improve service delivery. They can generate new information about community programs for youth and the population these programs serve and help stakeholders know how to best support the growth and development of programs.

Different types of evaluation are used according to the specific questions to be addressed by evaluation. In general, there are two different types of evaluations relevant to this report:

• Process evaluation (formative evaluation) describes and assesses how a program operates, the services it delivers, and the activities it carries out; and
• Outcome evaluation (summative evaluation) identifies the results of a program’s efforts and considers what difference the program made to the young people after participating.

These two evaluative approaches can be thought of as a set of assessment options that build on one another, allowing program staff to increase their knowledge about the activities that they undertake as they incorporate more options or activities into their evaluation (see Box 7–1 for elaboration). They can also serve as the foundation from which programs justify the allocation of public and private resources.

Box 7–1 Process and Outcome Evaluation

Process evaluation helps explain how a program works and its strengths and weaknesses, often focusing on program implementation. It may, for example, document what actually transpires in a program and how closely it resembles the program’s goals. Process evaluation is also used to document changes in the overall program and the manner in which services are delivered over time. For example, a program administrator might observe that attendance at a particular activity targeted to both youth and their parents is low. After sitting down and talking with staff, the program administrator discovers that the activity has not been sufficiently advertised to the parents of the youth participating in the program. Moreover, the project team realizes that the times in which the activity is scheduled are inconvenient for parents. Increased and targeted outreach to parents and a change in the day and time of the activity result in increased attendance. Thus, process evaluation is an important aspect of any evaluation, because it can be used to monitor program activities and help program staff to make decisions as needed.

Outcome evaluation facilitates asking if a program is actually producing changes in the outcomes believed to be associated with the program’s design. For example, does participation in an after-school program designed to improve social and communication skills actually lead to increased engagement with peers and participation in program activities? Are the participants more likely to take on leadership roles and participate in planning program activities than they were before participating?
Process and outcome evaluations both rely on the collection of two types of data—qualitative and quantitative data (see Box 7–2 for elaboration). Quantitative data refer to numeric-based information, such as descriptive statistics that can be measured, compared, and analyzed. Qualitative data refer to attributes that have labels or names rather than numbers; they tend to rely on words and narrative and are commonly used to describe program services and characterize what people “say” about the programs.

Box 7–2 Quantitative and Qualitative Data

Quantitative data are commonly used to compare outcomes that may be associated with an intervention or service program for two different populations (i.e., those who received the service, or participated in the program, and those who did not) or from a single population but at multiple time points (e.g., before, during, and after participation in the program). For example, assume that an organization wants to compare two different types of tutoring programs. To accomplish this goal, it gives a structured survey with numerically coded responses (such as a series of mathematical problems or a history or English knowledge test) to participants at the time they enter the program, and again 3 and 6 months after the start of the program. The data collected by this survey (for example, scores on the tests of mathematical, historical, or English knowledge) are statistically analyzed to determine differences between the participants from each group.

Qualitative data are often derived from narratives and unstructured interviews or participant observations. A common misconception is that qualitative methods lack rigor and therefore are not scientific. In fact, qualitative methods can be just as scientific (meaning objective and empirical) as quantitative methods. They are the basis for much descriptive and classification work in both the social and natural sciences. They provide an opportunity to systematically examine organizations, groups, and individuals in an effort to extract meaning from situations; understand meaning ascribed to behaviors; clarify or further explore quantitative findings; understand how a service program operates; and determine whether a program or services can be adapted for use in other contexts and with other populations.
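The tutoring comparison described in Box 7–2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure prescribed by the report: the scores are invented, the names program_a and program_b are hypothetical, and the use of Welch's t statistic to summarize the difference between groups is one reasonable choice among several.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical test scores (entry, 3 months, 6 months) for each participant.
# All numbers are invented for illustration only.
program_a = [(52, 58, 63), (47, 50, 58), (60, 66, 71), (55, 57, 62)]
program_b = [(50, 52, 55), (49, 50, 54), (61, 63, 64), (53, 56, 56)]


def gains(scores):
    """Change from entry to the 6-month follow-up for each participant."""
    return [six_month - entry for entry, _three_month, six_month in scores]


def welch_t(x, y):
    """Welch's t statistic for a difference in means (unequal variances)."""
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


gains_a, gains_b = gains(program_a), gains(program_b)
mean_difference = mean(gains_a) - mean(gains_b)
t_statistic = welch_t(gains_a, gains_b)

print(f"mean gain A = {mean(gains_a):.2f}, mean gain B = {mean(gains_b):.2f}")
print(f"difference = {mean_difference:.2f}, Welch t = {t_statistic:.2f}")
```

A comparison of this kind only describes how the two groups differ. Whether the difference can be attributed to the programs themselves depends on how participants came to be in each group, which is the subject of the evaluation designs discussed in this chapter.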
EVALUATING COMMUNITY PROGRAMS FOR YOUTH

In our review of studies and evaluations that have been conducted on community programs serving youth, we found that a wide range of evaluation methods and study designs are being used, including experimental, quasi-experimental, and nonexperimental methods (see Box 7–3). Part II provided examples of community programs for youth using both experimental and quasi-experimental methods, as well as a range of other nonexperimental methods of study, including interviews, focus groups, ethnographic studies, and case studies. It also summarized some findings from studies using both nonexperimental and experimental methods.

Box 7–3 Evaluation Design Methodologies

Experimental design involves the random assignment of individuals to either a treatment group (in this case participation in the program being assessed) or a control group (a group that is not given the treatment). Many believe that the experimental design provides some of the strongest, clearest evidence in research evaluation. This design also affords the highest degree of causal inference, since the randomized assignment of individuals to an intervention condition restricts the opportunity to bias estimates of the treatment effectiveness.

Quasi-experimental design has all the elements of an experiment, except that subjects are not randomly assigned to groups. In this case, the group being compared with the individuals receiving the treatment (participating in the program) is referred to as the comparison group rather than the control group, and this method relies on naturally occurring variations in exposure to treatment. Evaluations using this design cannot be relied on to yield unbiased estimates of the effects of interventions because the individuals are not assigned randomly. Although quasi-experimental study designs can provide evidence that a causal relationship exists between participation in the intervention and the outcome, the magnitude of the effect and the causation are more difficult to determine.

Nonexperimental design does not involve either random assignment or the use of control or comparison groups. Nonexperimental designs rely more heavily on qualitative data. These designs gather information through such methods as interviews, observations, and focus groups in an effort to learn more about the individuals receiving the treatment (participating in the program) or the effects of the treatment on these individuals. This type of research often consists of detailed histories of participant experiences with an intervention. Although they may contain a wealth of information, nonexperimental studies cannot provide a strong basis for estimating the size of an effect or for unequivocally testing causal hypotheses because they are unable to control for such factors as maturation, self-selection, attrition, or the interaction of such influences on program outcomes. They are, however, particularly useful for generating new hypotheses, for developing classification systems, and for gaining insight into people’s understandings.

There are differing opinions among service practitioners, researchers, policy makers, and funders about the most appropriate and useful methods for the evaluation of community programs for youth (Catalano et al., 1999; McLaughlin, 2000; Kirby, 2001; Connell et al., 1995; Gambone, 1998). Not surprisingly, there was even some disagreement among committee members about the standards for evaluation of community programs for youth. Through consideration of our review of various programs, the basic science of evaluation, and a set of experimental evaluations, quasi-experimental evaluations, and nonexperimental studies of community programs for youth, the committee agreed that no specific evaluation method is well suited to address every important question. Rather, comprehensive evaluation requires asking and answering many questions using a number of different evaluation models. What is most important is to agree to, and rely on, a set of standards that help determine the conditions under which different evaluation methods should be employed and to evaluate programs using the greatest rigor possible given the circumstances of the program being evaluated. At the end of this chapter, we present a set of important questions to be addressed in order to achieve comprehensive evaluation, whether it be by way of experiments or other methods.

QUESTIONS ASKED IN COMPREHENSIVE EVALUATION

To fully understand the overall effectiveness and impact of a program on its participants, a comprehensive evaluation should be conducted. A comprehensive evaluation addresses six fundamental questions:

1. Is the theory of the program that is being evaluated explicit and plausible?
2. How well has the program theory been implemented in the sites studied?
3. In general, is the program effective and, in particular, is it effective with specific subpopulations of young people?
4. Whether it is or is not effective, why is this the case?
5. What is the value of the program?
6. What recommendations about action should be made?

These questions have a logical ordering. For instance, answers to the middle questions depend on high-quality answers to the first two. To give an example, if one is not sure that a program has been well implemented, what good does it do to ask whether it is effective? The appropriate questions to answer in any specific evaluation depend in large part on the state of previous research on a program and hence often on the program’s maturity. The more advanced these matters are, the more likely it is that one can go on to answer the later questions in the sequence above.

While answering all six questions may be the ultimate goal, it is important to note that it is difficult to answer all questions well in any one study. Consequently, evaluations should not routinely be expected to do so. However, with multiple evaluation studies, one would hope and even expect that all of these questions will be addressed. Answers to all of these questions may also be obtained by synthesizing the findings from many evaluations conducted with the same organization, program, or element. They may also be answered by using various evaluation methods. We now take a brief look at these questions, realizing that they are interdependent.

Is the Theory of the Program Plausible and Explicit?

Good causal analysis is facilitated by a clear, substantive theory that explicitly states the processes by which change will occur. Ideally, this theory should be the driving force in program development and should guide the decision of what to measure and how. The theory should explicitly state the components that are thought to be necessary for the expected effects to occur (i.e., the specific aspects of the programs, such as good adult role models, that account for the programs’ effects on such outcomes as increasing self-confidence). It should also detail the various characteristics of the program, the youth, and the community that are likely to influence just how effective the program is with specific individuals (i.e., the moderators of the program’s effectiveness, such as ethnicity or disability status). These components must then be measured to evaluate whether they are in place in the form and quantity expected. At a more abstract level, the theory must also be plausible, taking into account current knowledge in the relevant basic sciences, the nature of the target population, and the setting in which the program intervention takes place. We provided an initial review of the existing literatures in Chapters 2, 3, and 4. Obviously, a program needs no further evalua-

Theory-based evaluation acknowledges the importance of substantive theory, quantitative assessment, and causal modeling, but it does not require experimental or even quasi-experimental design. Instead it focuses on causal modeling derived from a well-specified theory of the processes that take place between program inputs and individual change. If the causal modeling analyses suggest that the obtained data are consistent with what the program theory predicts, then it is presumed that the theory is valid and success of the program has been demonstrated. If time does not permit assessing all the postulated causal links, information on the quality of initial program implementation nonetheless will be gathered because implementation variables are usually the first constructs in the causal model of the program. If the first steps in the theory are proceeding as predicted, the evaluators can then recommend that further evaluations be conducted when sufficient time has passed for the proposed mediating mechanisms to have their full effect on the proposed outcomes. This should prevent any inclinations toward premature termination of programs, even though it does not demonstrate that the ultimate outcomes have been reached.

It is without question that the analysis of substantive theory is necessary for high-quality evaluation. Otherwise, analyses of implementation and causal processes cannot be carried out. However, the key question is whether a theory-based model can substitute for experiments rather than be built into them. It is not logical that long-term effects will come about just because the proposed mediators have been put into place. Without control groups and sophisticated controls for selection, it is impossible to conclude with any reasonable degree of confidence that the program is responsible for the changes observed rather than some co-occurring set of events or circumstances. Theory-based experimental approaches provide more confidence that it is the program itself that is accounting for the effects, and they also provide strong clues as to whether the program will continue to be effective in other sites and settings that can recreate the demonstrably effective causal processes. We support the use of program theory in evaluation, but as an adjunct to experiments rather than as a substitute.

Qualitative Case Studies

Some argue that qualitative case studies are sufficient for producing usable causal conclusions about a program’s effects. These are studies that focus on gaining in-depth knowledge of a program through interviews with various individuals involved, on-site observation, analysis of program records, and knowledge of the literature on programs like the one under review. Such studies involve little or no quantitative data collection and hence little manipulation of the data. Moreover, although some case studies include multiple program sites both for purposes of generalization and for comparison of different kinds of programs, case studies often concentrate on only a few sites, sometimes only one, because resources are usually too limited to collect in-depth information on many sites. The result is therefore in-depth knowledge of only a few sites.

Case studies, done singly or at a sample of sites, are useful for describing and analyzing a program’s theory, for studying program implementation, for surfacing possible unintended side effects, and for helping explain why the program had the effects it did. They are also very useful at the stage of theory development. Some programs are thought to work: sometimes this conclusion is based on randomized treatment designs of omnibus programs; at other times, it is based on more subjective criteria, such as user satisfaction or continued evidence of high performance by its users on criteria valued in the community. High-quality case studies can help one understand more about these programs. In turn, this information can be used to design and then evaluate the effectiveness of these newly designed programs using the more quantitative experimental designs discussed earlier in this chapter. Qualitative case studies can also help reduce some of the uncertainty about whether a program has had specific effects, since they often rule out some of the possible spurious reasons for any changes noted in an outcome. Finally, qualitative case studies often provide exactly the kinds of information that are useful to policy makers, journalists, and the public.
These studies provide the kind of rich detail about what it means to youth and their families to participate in particular programs. Of course, such information can be misused, particularly if it is not gathered with rigorous methods; such misuse of information is also possible using quantitative experimental methods. But when done with scientific rigor, particularly if done in conjunction with rigorous experimental and quantitative quasi-experimental studies, qualitative information can provide very important insights into the effectiveness of youth programs.

A separate issue is whether such studies reduce enough uncertainty about cause to be useful. The answer is complex and depends in part on how this information is to be used. In the committee’s view, such information can often be useful to program practitioners who want information that will help them improve what is going on at their site. Case studies are probably a better source of information than having no information about change and its causes, and program personnel want any edge they can get to improve what they do. Leavened by knowledge from the existing relevant literature, the data collected on site, and the staff’s other sources of program knowledge, the results of a case study can help local staff improve what they do.

However, when the purpose of the evaluation is to estimate the effectiveness of a program in helping participants, we have doubts about the utility of case studies. There are three reasons for this. First, it is often difficult to assess how much each participant has changed between entry into the program and a later point. Second, and perhaps most importantly, there is no counterfactual against which to assess how much the participants would have changed had they not been in the program. And third, conclusions about effectiveness always involve a high degree of generalization (e.g., across persons, service practitioners, program inputs), and many qualitative researchers are reluctant to make such generalizations. They prefer to detail many known factors that make some difference to an outcome, and this is not the same as noting what average effect the program has or how its effectiveness varies by just a few carefully chosen factors.

Identifying the Most Appropriate Method

Experimental evaluation methods are often considered the gold standard of program evaluation and are recommended by many social scientists as the best method for assessing whether a program influences developmental outcomes in young people. However, many question the feasibility, cost, and time intensiveness of such methods for evaluating community programs for youth.
In order to generate new, important information about community programs for youth, the committee recommends that the kind of comprehensive experimental evaluation discussed in this chapter be used under certain circumstances (see also footnote 1):

• The object of study is a program component that repeatedly occurs across many of the organizations currently providing community services to youth;
• An established national organization provides the program being evaluated through a system of local affiliates; and
• Theoretically sound ideas for a new demonstration program or project emerge, and pilot work indicates that these ideas can be implemented in other contexts.

Comprehensive experimental evaluation is useful when the focus of evaluation is on elements that can be found in many organizations providing services to youth: how recruitment should be done and continued participation should be supported; how youth and parent involvement should be supported; how youth’s maturity and growing expertise should be recognized and incorporated into programming; how staff members should be recruited and then trained; how coordination with schools should be structured; how recreational and instructional activities should be balanced; how mentoring should be carried out; how service learning should be structured and supported; and how doing homework should be supported. Typically, these are components in any single organization, but they are extremely important because they are common issues in organizations across the country. In our view, such elements are best examined by way of experimental research in which a sample of organizations is assigned to different ways of solving the problem identified. Although such work has the flavor of basic research on organizational development, it is central to an effectiveness-based practice of community youth programming.

Equally critical to an effectiveness-based practice approach is knowledge of the individual- and community-level factors that influence the effectiveness of these practices for different groups of individuals or communities. Consequently, it is also important that experimental methods be used to assess the replicability of such practices in different communities and with different populations of youth.
Particular attention here needs to be paid to such individual and community differences as culture, age and maturity, sex, disability status, social class, educational needs, and other available community resources.

Comprehensive experimental evaluations are also called for in two other contexts. The first is when the target of evaluation is a national organization that has affiliates in many locations across the United States. The best model of this is the evaluation completed on Big Brothers Big Sisters to assess the effects of membership and of participation across the affiliates included in the sampling design. Many of these national organizations have been providing services to youth for many years and carry a disproportionate burden of current service provision. As a result, even when programs are still developing at the margins, many have mature program designs. Since the total effect of these organizations is amplified across their affiliates, experimental evaluation with random assignment helps illuminate how effective these programs are.

The final context for comprehensive experimental evaluation is when some bold new idea for a new kind of service surfaces and critical examination shows that the substantive theory behind the idea is reasonable and that it is indeed likely to be able to be implemented. This situation is often called a demonstration project. Such demonstrations provide the substantive new ideas out of which the next generation of superior services is likely to emerge. As such, they deserve to be taken very seriously and to be evaluated by rigorous experiments.

Programs that meet the following criteria should be studied on a regular, ongoing basis with a variety of either nonexperimental methods or more focused experimental, quasi-experimental, and interrupted time-series designs, such as those advocated by Box (1958):

• An organization, program, project, or program element has not sufficiently matured in terms of its philosophy and implementation;
• The evaluation has to be conducted by the staff of the program under evaluation;
• The major questions of interest pertain to the quality of the program theory, implementation of that theory, or the nature of its participants, staff, or surrounding context;
• The program is quite broad, involving multiple agencies in the same community; and
• The program or organization is interested in reflective practice and continuing improvement.

If Effective, Why?

An explanation of the reasons why a program is effective is important because it identifies the processes that are thought to be present for effectiveness to occur (Cook, 2000). This knowledge is crucial for replication of program effects at new sites because of the uniqueness inherent in delivering the program to new populations and in new settings.
Causal or explanatory knowledge not only identifies the critical components of program effectiveness, but also specifies whether these components are moderator variables or mediator variables. Moderator variables change the relation between an intervention and an outcome; common moderators include all types of individual difference constructs, such as sex, ethnic group, disability status, age, and social class. Mediator variables carry the impact of an intervention through to specific outcome variables; common mediators are the many personal and social assets discussed in Chapter 3, which are often hypothesized to mediate the impact of program features on adolescent outcomes, such as school achievement, avoidance of problematic behaviors, and conditions such as very early pregnancy. Supported by a clear understanding of the causal processes underlying program effectiveness, practitioners at new sites can decide how the processes can best be implemented with their unique target population and their unique community characteristics.

Mixed methods are the most appropriate way to answer the question of why a program is effective. Theory-based evaluation is especially appropriate here and depends on the following steps (Cook, in press):

Clearly stating the theory a program is following in order to bring about change. This theory should explicitly detail the program constructs of both mediator and moderator relations that are supposed to occur if the intended program intervention is to impact major target outcomes. Chapters 2 and 3 can serve as an initial basis for developing elaborated theories of change.
Collecting both qualitative and quantitative data over time to measure all of the constructs specified in the program’s theory of change.
Analyzing the data to assess the extent to which the predicted relations among the treatment and the outcome variables have actually occurred in the predicted time sequence. If the data collection is limited to only part of the postulated causal chain, then only part of the model can be tested. The goal, however, should be to test the complete program theory.
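The distinction between moderator and mediator relations drawn above can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: the data are simulated, the variable names (`treat`, `group`, `asset`, `outcome`) and all effect sizes are hypothetical, and simple least-squares estimates stand in for whatever analysis a real theory-based evaluation would use. A mediator is examined by splitting the total program effect into a direct part and an indirect part that runs through the asset; a moderator is examined by comparing the program effect across two subgroups.

```python
import random
from statistics import mean


def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def slope(x, y):
    """Slope from a least-squares regression of y on x alone."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """Slopes on x1 and x2 from a least-squares regression of y on both."""
    v1, v2, c12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    c1y, c2y = cov(x1, y), cov(x2, y)
    det = v1 * v2 - c12 ** 2
    return (c1y * v2 - c2y * c12) / det, (c2y * v1 - c1y * c12) / det


def mediation(treat, mediator, outcome):
    """Split the total program effect into direct and indirect parts."""
    a = slope(treat, mediator)  # program -> mediator
    direct, b = two_predictor_slopes(treat, mediator, outcome)
    return {
        "indirect": a * b,  # effect carried through the mediator
        "direct": direct,   # effect not carried through the mediator
        "total": slope(treat, outcome),
    }


def moderation(treat, group, outcome):
    """Program effect in group 1 minus program effect in group 0.

    Assumes treat and group are coded 0/1 and every cell is non-empty.
    """
    def effect(g):
        treated = [y for t, k, y in zip(treat, group, outcome) if k == g and t == 1]
        control = [y for t, k, y in zip(treat, group, outcome) if k == g and t == 0]
        return mean(treated) - mean(control)
    return effect(1) - effect(0)


def simulate(n=400, seed=0):
    """Hypothetical data: program raises an asset, which raises the outcome;
    the program also has an extra effect in group 1 (all values assumed)."""
    rng = random.Random(seed)
    treat = [i % 2 for i in range(n)]
    group = [(i // 2) % 2 for i in range(n)]
    asset = [0.8 * t + rng.gauss(0, 1) for t in treat]
    outcome = [0.5 * m + 0.4 * t * g + rng.gauss(0, 1)
               for t, g, m in zip(treat, group, asset)]
    return treat, group, asset, outcome


if __name__ == "__main__":
    treat, group, asset, outcome = simulate()
    print(mediation(treat, asset, outcome))
    print(moderation(treat, group, outcome))
```

In this linear setup the total effect equals the direct effect plus the indirect effect, which is one way to check that the pieces of a postulated causal chain add up. With real program data, the estimates would also need standard errors and attention to how participants were assigned, neither of which this sketch addresses.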
A qualitative approach to theory-based evaluation collects and synthesizes data on why an effect came about; through this process, it provides the basis for deriving subsequent theories of why the change occurred. The qualitative data are used to rule out as many alternative theories as possible. The theory is revised until it explains as much of the phenomenon as possible. Both quantitative and qualitative implementation data can also tell us a great deal about why programs fail. In addition, these studies make it clear how the programs are nested into larger social systems that need to be taken into account. When adequate supports are not available in these larger social systems, it is unlikely that specific programs can be implemented well and sustained over time.

If Effective, How Valuable?

If a youth program is found effective, a comprehensive evaluation can then ask: Is it more valuable than other opportunities that could be pursued with the resources devoted to the program? Or, less comprehensively, is it more valuable than other programs that pursue the same objective? The techniques of benefit-cost analysis and cost-effectiveness analysis can offer partial but informative answers to these questions.

The fundamental idea of benefit-cost analysis is straightforward: comprehensively identify and measure the benefits and costs of a program, including those that arise in the longer term, after youth leave the program, as well as those occurring while they participate. If the benefits exceed the costs, the program improves economic efficiency (the value of the output exceeds the cost of producing it) and makes society better off. If the costs exceed the benefits, society would be better off devoting the scarce resources used to run the program to other programs with the same goal that do pass a benefit-cost test, or to other purposes.

Choices among competing uses of scarce public and nonprofit resources inherently embody judgments about relative benefits and costs. Benefit-cost analysis seeks to make the basis of such choices explicit so that difficult trade-offs can be better weighed. At the same time, benefit-cost analysis neither can nor should be the sole determinant of funding decisions. Aside from the limitations of any specific study, this technique cannot take into account moral, ethical, or political factors that are crucial in determining youth program policy and funding.

Any benefit-cost analysis must consider several key issues. What counts as a benefit? A cost?
How can one measure their monetary value? If a benefit or cost is not measurable in monetary terms, how can it enter the analysis? How can one extrapolate benefits or costs after a youth leaves a program and any follow-up period when impact data are gathered? The costs of youth programs mostly occur at the outset, while the benefits may be realized many years later. How should benefits and costs at different times be valued to reflect the fact that a dollar of benefit received in the far future is worth less than a dollar received in the near future, and that both are worth less than a dollar of cost incurred in the present? How can one assess benefits and costs to youth who participate in the program, to taxpayers or other program funders, and to society as a whole? Persons who bear the costs of a program may well differ from those who share in the benefits. How can one incorporate these distributional impacts into the analysis? An enormous literature has arisen to address these issues. There are several excellent texts on the subject (e.g., Boardman et al., 1996; Zerbe and Dively, 1997).

If the principal benefit expected from a youth program cannot be given monetary values, cost-effectiveness analysis can be an alternative to benefit-cost analysis (Boardman et al., 1996). Suppose, for example, that the primary goal is to increase volunteer activity in community groups and that other possible program impacts are of little import to decision makers. In such a case, programs might be compared in terms of the number of volunteer hours they inspire per dollar of cost. Decision makers will want to fund the program that produces the largest increase in hours per dollar spent.

Focusing on one goal is a strength in that it obviates the need to express the value of the outcome in monetary terms. Yet when interventions have multiple goals and no single goal has clear priority, cost-effectiveness data may not offer much guidance. If one youth program increases voluntary activity by 20 percent and reduces drug use by 15 percent and an alternative, equally costly program has an increase of 12 percent and a reduction of 20 percent, which is better? When there are multiple types of benefits, none of which dominates, and when some can be cast in monetary terms, a benefit-cost analysis that considers both monetary and nonmonetary benefits will usually provide more useful information than a cost-effectiveness analysis.

Systematic benefit-cost analysis has hardly been applied to youth programs, except for those likely to reduce juvenile crime (Aos et al., 2001).
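The arithmetic behind the discounting and cost-effectiveness questions raised above can be sketched briefly. The example below is a minimal illustration, not a method prescribed by the committee: the function names, the dollar amounts, the volunteer-hour figures, and the 10 percent discount rate are all hypothetical. Only the two-program comparison (20 and 15 percent versus 12 and 20 percent) is taken from the text.

```python
def present_value(flows, rate):
    """Discount yearly amounts (year 0 first) back to the present."""
    return sum(amount / (1 + rate) ** year for year, amount in enumerate(flows))


def net_present_value(benefits, costs, rate):
    """Discounted benefits minus discounted costs; positive passes the test."""
    return present_value(benefits, rate) - present_value(costs, rate)


def cost_effectiveness(outcome_units, cost):
    """Units of a single outcome (e.g., volunteer hours) per dollar spent."""
    return outcome_units / cost


def dominates(a, b):
    """True if program a is at least as good as b on every outcome
    and strictly better on at least one (larger values are better)."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


if __name__ == "__main__":
    # Hypothetical program: $1,000 spent now, $1,500 of benefit in year 5.
    benefits = [0, 0, 0, 0, 0, 1500]
    costs = [1000]
    print(net_present_value(benefits, costs, 0.0))   # undiscounted: positive
    print(net_present_value(benefits, costs, 0.10))  # discounted: negative

    # Hypothetical single-goal comparison: volunteer hours per dollar.
    print(cost_effectiveness(5000, 10000), cost_effectiveness(3600, 9000))

    # Two equally costly programs from the text, as
    # (percent increase in volunteering, percent reduction in drug use).
    first, second = (20, 15), (12, 20)
    print(dominates(first, second), dominates(second, first))
```

In this hypothetical case the program passes a benefit-cost test when future benefits are not discounted and fails it at a 10 percent rate, which illustrates why the choice of discount rate deserves explicit attention. The final comparison shows neither program dominating the other, the situation in which cost-effectiveness ratios alone offer little guidance.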
While application of this methodology to youth development programs is complex, it is no more so than in other areas of social policy, in which it has made significant contributions to research and policy analysis (e.g., health and mental health, early childhood education, job training, welfare-to-work programs). To advance youth program evaluation in this direction will require more rigorous evaluations with adequate follow-up periods and suitable data on a broad set of impacts.

Analysts and practitioners may be concerned that benefit-cost analysis will lead decision makers to focus too narrowly on financial values and downplay or ignore other important program impacts that cannot be translated into financial terms. However, a careful analysis will discuss nonmonetary benefits and emphasize that a complete assessment of programs when important social values are at stake, such as in the area of youth development, must weigh such benefits along with the monetary ones. Analyses that fail to do so can be criticized for presenting an incomplete picture.

As with other evaluation methods, any benefit-cost analysis has limitations. It can be questioned because its results rest on judgments about which impacts to quantify and various other assumptions needed to conduct an analysis. Time and resource constraints prevent investigation of all possible benefits and costs. Some effects may be inherently unquantifiable or impossible to assess in financial terms yet considered crucial to a program’s success or political viability. Nonetheless, when carefully done with attention to the findings’ sensitivity to different assumptions, benefit-cost analysis can improve the basis on which youth development policy decisions rest.

SUMMARY

In this chapter, we reviewed fundamentals of evaluation and important questions for the development of a comprehensive evaluation strategy. Several conclusions emerge from this discussion.

First, there are many different questions that can be asked about a program. A priority for program practitioners, policy makers, program evaluators, and others studying programs is to determine the most important questions and the most useful methods to evaluate each program. It is very difficult to understand every aspect of a program in a single evaluation study. Like other forms of research, evaluation is cumulative.

The committee identified six fundamental questions that should be considered in comprehensive evaluation:

Is the theory of the program that is being evaluated explicit and plausible?
How well has the program theory been implemented in the sites studied?
In general, is the program effective and, in particular, is it effective with specific subpopulations of young people?
Whether it is or is not effective, why is this the case?
What is the value of the program?
What recommendations about action should be made?

While it is difficult to answer all six questions well in one study, multiple studies and evaluations could be expected to address all of these questions. Comprehensive evaluation requires asking and answering many of these questions through various methods.

Opinions differ among program stakeholders (e.g., service practitioners, researchers, policy makers, and funders) about the most appropriate and useful methods for the evaluation of community programs for youth. No specific evaluation method is well suited to address every important question. And while there is tension between different approaches, the committee agrees that there are circumstances that are appropriate for the use of each of these methods. The method used depends primarily on the program’s maturity and the question being asked.

Comprehensive evaluations are rare, and they are probably most warranted for mature programs that attract wide interest. The committee concluded that studying program effectiveness should be a regular part of all programs. However, not all programs require the most extensive comprehensive experimental evaluation outlined in this chapter. In order to generate the kind of information about community programs for youth needed to justify large-scale expenditures on programs and to further fundamental understanding of the role of community programs in youth development, comprehensive experimental program evaluations should be used when:

the object of study is a program component that repeatedly occurs across many of the organizations currently providing community services to youth;
an established national organization provides the program being evaluated through many local affiliates; and
theoretically sound ideas for a new demonstration program or project emerge, and pilot work indicates that these ideas can be implemented in other contexts.
Such evaluations need to pay special attention to the individual- and community-level factors that influence the effectiveness of various practices and programs with particular individuals and particular communities.

The committee also discussed the need for more ongoing collaborative teams of practitioners, policy makers, and researchers/theoreticians in program design and evaluation. We conclude from case study materials on high-quality comprehensive evaluation efforts that the odds of putting together a successful high-quality comprehensive evaluation are increased if there is an ongoing collaboration between researchers, policy makers, and practitioners. Yet such collaborations are hard to create and maintain.

When experiments are not called for, a variety of nonexperimental methods and more focused experimental and quasi-experimental studies offer ways to understand and assess these types of community programs for youth. Such studies help program planners and program staff build internal knowledge and skills, and they can highlight theoretical issues about the developmental qualities of programs. Such systematic program study should be a regular part of program operation.

Comprehensive evaluation is dependent on the availability, accessibility, and quality of both data about the population of young people who participate in these programs and instruments to track aspects of youth development at the community and program levels. The next chapter explores social indicators and data instruments to support these needs.