Preparing for Terrorism: Tools for Evaluating the Metropolitan Medical Response System Program

5
Measurement and Data Collection in Evaluation

This chapter provides an overview of evaluation concepts and introduces some principles for their application to the Metropolitan Medical Response System (MMRS) program. Evaluation has been defined in numerous ways, but all of the definitions refer in some way to a systematic assessment to reach a judgment about value or worth (Scriven, 1991). The entity being evaluated can be a program, product, policy, or personnel. In the MMRS program context, the judgment about value might apply to

  1. individual elements or components of an individual city’s MMRS,

  2. the capacity and overall performance of a city’s MMRS,

  3. the capacity and performance of the aggregate of city MMRS across the nation,

  4. administration of the federal MMRS program or related agencies, and

  5. federal- or state-level policies as they affect the adequate development of an MMRS.

Systematic assessment is a means to distinguish evaluation from subjective impressions or anecdotal evidence. Systematic assessment may be qualitative or quantitative in nature, but in all cases it is self-conscious about the need for validity and reliability in the assessment. Validity means (1) that independent assessors can agree on the relevance and appropriateness of criteria for judging value and on evidence that reflects those criteria and (2) that safeguards are in place to control potential bias
in measurement, data collection, analysis, and the drawing of conclusions (Shadish et al., 2001). Reliability means that different assessors would reach similar conclusions on the basis of the evaluation methods used.

The context of the MMRS program presents some special challenges for evaluation. First, the MMRS program involves a web of planning activities, resources, intergovernmental agreements, and exercises at multiple levels of government. This web of activities is seen in Figure 5-1. Any one of a number of policy instruments, development activities, emergency capacity functions, and follow-up activities might be evaluated, or sets of them might be evaluated. Second, any MMRS itself represents an effort to coordinate multiple entities and activities that are independently funded and that receive their authority from other sources. Third, evaluation of the MMRS program is inferential, because even after September 11, 2001, incidents of domestic terrorism have occurred in only a few cities, and so the adequacies of most MMRSs have never been tested directly. Fourth, evaluation of the MMRS program is also inferential because, of necessity, assumptions must be made about how the component parts should work together.

EVALUATIONS OF VARIOUS TYPES

Evaluation can focus on a variety of entities and questions, such as the following:

Inputs. Inputs are an individual city’s resources, personnel, and political and logistic agreements committed on behalf of the MMRS.

Processes. Also known as implementation, processes include the variety of activities designed to achieve a specific level of capacity to detect an attack, to deal with the crisis phase, and to manage the aftermath. Such activities lead to the intermediate results required to achieve preparedness. These might include, for example, growth in decision makers’ knowledge and experience with the variety of events in question, training programs under way in various units, designation and assumption of responsibilities, purchase of necessary equipment and supplies, and periodic testing of communications.

Outputs. Because true terrorist attacks are, fortunately, still rare, an assessment of the ultimate outcome of the MMRS is not likely to be available for many cities. Instead, intermediate outcomes, referred to hereafter as outputs, are more feasible and are represented by progress of various elements of the system in response to exercises, false alarms, and nonterrorism events. Immediate outputs might include, for example, the number or percentage of personnel passing specialized tests on chemical or biological weapons.
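The reliability criterion stated earlier in this chapter (different assessors reaching similar conclusions) can be checked numerically once two assessors have rated the same items. The sketch below is illustrative only: the chapter does not name a statistic, so Cohen's kappa is an assumption chosen here because it corrects raw agreement for agreement expected by chance. The function name, the rating labels, and the ten ratings are all hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two assessors, corrected for chance agreement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Share of items on which the two assessors gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Agreement expected if each assessor rated at random with the same marginals.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten MMRS components by two independent assessors.
assessor_1 = ["met", "met", "not met", "met", "not met",
              "met", "met", "not met", "met", "met"]
assessor_2 = ["met", "met", "not met", "met", "met",
              "met", "met", "not met", "met", "not met"]
kappa = cohens_kappa(assessor_1, assessor_2)
print(f"kappa = {kappa:.2f}")
```

A value near 1 would suggest that the assessment method yields similar conclusions regardless of who applies it; a value near 0 would suggest that agreement is no better than chance.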

FIGURE 5-1 MMRS program participants, policy instruments, development activities, emergency capacity, and follow-up activities.

Other types of evaluations are sometimes distinguished, such as cost-benefit and cost-effectiveness analyses. These are less central to the questions posed by the Office of Emergency Preparedness (OEP) of the U.S. Department of Health and Human Services (DHHS) and so are not addressed here.

MANAGEMENT FUNCTIONS OF EVALUATIONS IN THE MMRS PROGRAM CONTEXT

The MMRS program provides a planning and coordination mechanism for cities’ responses to chemical, biological, or radiological (CBR) terrorism that is otherwise lacking in the national capacity to deal with terrorist attacks. Because it aims to assist cities in coordinating a complex set of activities and capacities, the questions asked during evaluations of the MMRS program should focus on the federal, state, and city levels. Decision makers at all these levels can use these evaluations to improve the operation of the system and for accountability purposes. At the federal level, the primary aim is to ensure the maximum level of preparedness feasible in cities that vary greatly in terms of their resources and the levels of cooperation between participating agencies. At the city level, the need is to ensure a coordinated response among disparate agencies and units that are accountable to and funded by a wide array of federal, state, and local decision-making bodies.

Quite commonly, funding agencies, legislative overseers, and other stakeholders use program evaluation as a device to hold program managers or grantee agencies accountable. Evaluation results can help determine, for example, whether the program is producing expected substantive outputs, carrying out planned activities, and using grant funds for allowable purposes. This information can then be used to make future decisions about the program or an individual grantee.

However, holding program managers or grantees accountable for their stewardship of a public mission or resources is not the sole purpose for which the results of evaluations can be used. Evaluation results can also be usefully applied to improve management of the program as a whole or of individual contracts or grants. Here the objective is to diagnose how well the program as a whole or individual grantees are performing with the objective of remedying shortcomings and identifying and replicating best practices. For a program that is likely to continue irrespective of current levels of performance because its substantive purpose is regarded as a critical public need, the management improvement function of program evaluations may well be as important as or more important than accountability.

Accountability Function

In the case of a specific program (e.g., the MMRS program), the key accountability relationship may extend from the federal agency or unit responsible for administering the program as a whole to higher-level department executives, the President and the federal Office of Management and Budget (OMB), and federal congressional authorizing and appropriations committees; more generally, it may extend to stakeholder groups and the public. Alternatively, the primary accountability relationship may extend from individual grantees back to the federal agency (see Figure 5-2).

The accountability function is often the prime motivation for evaluation requirements in federal grant programs. The information gathered during evaluations to assess accountability is useful in determining whether future funding commitments should be made in an agency’s internal budget preparation process or in the congressional appropriations process, showing stakeholder groups that a problem is being effectively handled, or determining whether to reward a specific grantee with additional resources (or, possibly, not to renew or complete funding). In each of these cases, the information generated during an evaluation is primarily used by external program overseers or stakeholders. The term often used for this is summative evaluation (Scriven, 1991). These recipients are rating or judging the adequacy of the program or the grantee’s performance and deciding whether to act on the basis of this information. Although a grantee could benefit from a favorable rating (through an enhanced reputation or supplementary support), the evaluation performed for accountability purposes is not primarily oriented toward helping the program or the grantee. The program or the individual grantee is potentially at risk. Unfavorable judgments may bring negative consequences.

FIGURE 5-2 Accountability relationships for federal grantees and grant-making agencies.

Management Improvement Function

Accountability is not the sole purpose of an evaluation. Instead, evaluations can be designed with the aim of helping the evaluated entity (whether it is the program as a whole or an individual grantee) assess its strengths and weaknesses and take appropriate future actions. This type of evaluation is termed a formative evaluation (Scriven, 1991). It can be closely related to technical assistance efforts (e.g., when OEP provides data to a city MMRS) and also to continuous quality improvement efforts (e.g., when a city MMRS makes use of locally collected data for local purposes). The key characteristic of this approach is to emphasize feedback from the evaluators to those who have been evaluated, either individually or as part of a professional community that can share lessons and ideas to improve the overall performance of that community.

When the evaluation reveals shortcomings, the follow-up step would be for the program or grantee or for outsiders to develop a course of action that would improve the program’s capacity to at least satisfactory levels. When strengths are discovered, analysis can suggest whether there are lessons to be learned (either general lessons or lessons limited to certain contingencies) that might assist other grantees or similar entities. In that case, efforts can be made to use this information to work with less capable grantees or to disseminate information about best practices more widely.

Compatibility of Approach

Accountability and management improvement are not antagonistic conceptions of the purpose of evaluations. In principle, one could design procedures that would serve both functions. However, there are some definite tensions between these purposes:

Managers in grantee organizations are more likely to feel wary of and are less likely to be cooperative with evaluations undertaken primarily for accountability purposes than with those aimed at management improvement.

Evaluations aimed at management improvement are necessarily more customized, focusing on specific features of a particular environment, whereas accountability evaluations are likely to emphasize features standardized across jurisdictions.

Limited resources may require the evaluating entity to set priorities among alternative purposes if implementation of a more wide-ranging evaluation is fiscally or administratively infeasible.

In what follows, the need to serve various evaluation audiences at the federal, state, and local levels will be termed “the layering problem,” and the strategy used to inform these audiences through systematic data collection will be termed “the layering strategy.” One feature of such a strategy might be to collect information that can be used by more than one audience. Some suggestions for the layering strategy will be offered in Chapter 9. However, the reality is that federal managers are likely to motivate most of the evaluations that occur and to push for the use of most of the data from those evaluations at other levels.

SUMMATIVE AND FORMATIVE USES OF VARIOUS EVALUATION TYPES

Many misconceptions arise in using the terms summative and formative evaluations for evaluation functions. In the case of the MMRS program, as with many programs, both formative and summative evaluations are ideally ongoing or are occurring on a cyclical basis. Evaluations can and do serve both formative and summative purposes at the same time. As noted previously, the same evaluation information can be used at both the federal (or state) and local levels (Leviton and Boruch, 1983). In addition, information from both formative and summative evaluations could be used in rhetorical or persuasive fashion, for example, to argue for additional funds or coordination at the federal level or to alert a key unit in the city’s overall response plan that its personnel need improved training.

There is no hard-and-fast rule concerning the functions themselves. For example, federal decision makers often request information intending to make summative judgments, but they often use the information in a formative fashion to “tinker” with programs (Cronbach et al., 1980; Leviton, 1987; Leviton and Boruch, 1983). In the federal MMRS program context, OEP might indicate to a city that it had made satisfactory progress in building MMRS capacity, a summative judgment of worth, and then go on to indicate areas that still required improvement, a formative judgment.

A formative evaluation is sometimes equated with a process evaluation, and a summative evaluation is sometimes equated with an outcome or output evaluation. This conflates the function of an evaluation with the entity being evaluated. Processes and even inputs can be evaluated in a summative fashion, whereas outcomes are frequently used for formative purposes. Consider, for example, the following potential evaluations:

OEP might indicate to a city that it had all of the requisite units and responsible parties in place, an evaluation of inputs and a summative judgment. At the aggregate level, OEP might report to the U.S. Congress the number of MMRS program cities that had these requisite inputs in place, a summative judgment. Yet, Congress might then request information about how to overcome barriers to getting these requisite inputs in place, a formative question.

OEP might also indicate that an MMRS program city had not engaged in a satisfactory number of situational exercises or drills within the past year, an evaluation of process and a summative judgment. However, visitors to the site for the purposes of peer review might point out this obstacle and offer suggestions for overcoming it. For all MMRS program cities, OEP might report to DHHS or the Congress on the number of cities engaging in a satisfactory number of exercises, a summative statement, or it might report on typical barriers to achieving a satisfactory number of exercises, a formative statement.

OEP or state- or city-level evaluators might determine that a component system in a city’s MMRS does not meet the standard for speed of mobilization. This is a summative judgment about outputs. For all MMRS program cities, OEP might indicate the number of cities whose systems do not meet the standard, also a summative statement. However, peer reviewers working in the spirit of quality improvement might probe in-depth with cities not meeting the standard to determine barriers and to make suggestions on how failure to meet the standard can be overcome. This is a formative evaluation activity.

WHY AN ADEQUATE WRITTEN PLAN IS NOT SUFFICIENT ASSURANCE OF PREPAREDNESS

To date, OEP staff have used a checklist format to assess whether cities’ plans for their MMRS included the component parts that would be necessary to achieve preparedness (see Appendix D). Assessment with the checklist has been followed by personal contact and observation in many cases, and this has permitted OEP staff to use their substantial experience in judging more directly whether a city’s plans were adequate. Although this format has been an important starting point for evaluations of local systems, it has served primarily to ascertain whether local planners had included the variety of important inputs in their written plans.

Changing the format from a simple yes/no checklist to a graded instrument with potential answers ranging from “absent” to “exceptional” would help both OEP and contractor communities identify weaknesses to be shored up and strengths worth sharing with other communities. However, as municipal systems develop further and more contract experience is gained, OEP has recognized that the checklist alone is insufficient grounds for concluding that an MMRS program city is in fact prepared to cope with the consequences of an act of terrorism involving a CBR agent. The checklist is described below, followed by some reasons why it is not sufficient. The following section then examines a variety of alternative ways to measure the performance and outcomes of an MMRS.

A checklist format is insensitive to matters of degree. As such, it does not adequately reflect reality. It measures variables in a categorical, dichotomous fashion, assessing whether or not a system or function meets a standard or has adequate capacity. In reality, there are degrees of preparedness. Each community would cope with a terrorist attack to the best of its ability, and even if all standards were met, “success” would still be a matter of degree. Also, the MMRS program can set and encourage standards, but in the event of an attack, more can be learned from assessing the degree to which performance standards were achieved than from merely noting whether a standard was met.

A checklist does not permit recording of the information that can be used to improve an MMRS. Although the checklist indicates functions that are not available and that need to be addressed, it does not indicate the actions that a city should be taking to further increase its capabilities in specific areas.

The MMRS program cities must have the capacities to deal with a great variety of potential incidents. The sheer number of possible variations in terrorist incidents (the weapon used, the mode of delivery, the range of crisis capacities that are essential or that could be used, the targets of the terrorist attack, and relevant aspects of the urban situation) means that even though the preparations may be perfect for one incident, they may have limited applicability to another incident. The checklist format captures the range of capacities that might be needed only in the broadest way.

A checklist to assess a plan on paper is seldom an adequate reflection of local reality. In general, a small number of individuals write the existing plans. Even with the best will in the world, these individuals may not fully appreciate either the limitations or the actual crisis capacities possessed by other agencies that need to be part of the system. Local politics that undermines coordination may not be properly understood, or it may not be reflected in the plan.
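The difference between a yes/no checklist and a graded instrument, discussed above, can be made concrete with a small sketch. Everything in it is hypothetical: the chapter names only the endpoints “absent” and “exceptional,” so the three intermediate grades, the passing threshold, and the item names are assumptions made for illustration.

```python
# Hypothetical five-point scale; only "absent" and "exceptional" come from the text.
GRADES = ["absent", "weak", "adequate", "strong", "exceptional"]
PASSING = GRADES.index("adequate")  # assumed threshold for a checklist "yes"


def checklist_view(ratings):
    """Collapse graded ratings to the yes/no answers a checklist would record."""
    return {item: GRADES.index(grade) >= PASSING for item, grade in ratings.items()}


def weakest_and_strongest(ratings):
    """Return items to shore up (below passing) and items worth sharing (top grade)."""
    weak = sorted(i for i, g in ratings.items() if GRADES.index(g) < PASSING)
    strong = sorted(i for i, g in ratings.items() if g == "exceptional")
    return weak, strong


# Hypothetical graded ratings for one city's plan.
city_ratings = {
    "hospital surge plan": "adequate",
    "mass prophylaxis": "weak",
    "communications testing": "exceptional",
    "decontamination": "strong",
}

yes_no = checklist_view(city_ratings)
weak, strong = weakest_and_strongest(city_ratings)
print(yes_no)
print("shore up:", weak, "| share with others:", strong)
```

In the checklist view, three of the four items look identical (all “yes”), whereas the graded view distinguishes a merely adequate function from an exceptional one that other communities might learn from.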

A written plan does not present a test of operational conditions. Observations under several operational conditions are vital to achieving confidence in the eventual success of an MMRS: (a) a test of the plan under exercise conditions, (b) a test under a variety of real emergencies (both terrorist and nonterrorist in nature), and (c) a test under conditions in which the MMRS is confronted with unexpected conditions or when parts of the plan fail. Murphy’s Law should be assumed, and the availability of backup plans for each component is desirable.

A checklist to assess a plan on paper is vulnerable to “corruption of indicators.” It has long been understood in evaluations of health and social programs that when rewards and punishments result from people’s apparent performance on an indicator, that indicator can sometimes change in ways that have no bearing on the actual outcomes of a governmental system (Blau, 1963; Campbell, 1988). This produces a lack of validity (increase in bias) in reporting to OEP from the field. In the context of the MMRS program, at least two possible forces can lead to a corruption of indicators. First, municipalities may believe that continued federal funding is contingent on features of their plan; through self-reporting, the writers of the proposal for a grant may make the situation appear better than it is in reality. Second, even in the absence of such contingencies, no city manager wants the public to believe that the city is not prepared for emergencies. The committee heard from congressional staff of glowing reports about the results of exercises in several cities; in actuality, the cities had failed rather badly to protect their populations from potential harm from a weapon of mass destruction.

Capacity or preparedness for each of the functions and components of the MMRS program is inferred; it cannot be directly observed because there have been very few actual terrorist incidents in the United States. Even if abundant examples of terrorist events were available, the “crisis capacity” itself could be inferred only from the responses of many different players at the federal, state, and local levels operating within complex systems and reacting to complex events.

However, OEP personnel already infer crisis capacity in ways beyond the scope of the checklist that they use. On the basis of their expert understanding of what is required to address a terrorist incident with CBR weapons, OEP officials can assess a variety of features of a city’s MMRS plan beyond what is on paper. What is needed is a way of systematizing those judgments cost-effectively, at a level of accuracy or validity that is justified by the cost, by the response burden for local contractors (the time required to fill out paperwork, participate in a site visit, and undergo interviews), and by the demand on the time of scarce federal personnel who assess the MMRS programs. A checklist format offers one of a range of potential ways to infer the degree of crisis capacity for various MMRS functions.

Two conclusions follow from this description of the existing instrument. First, MMRS capacity or preparedness is more usefully viewed as a complex policy goal than as an absolute set of conditions that can be directly ascertained. In this respect, MMRS capacity or preparedness is like other big policy goals discussed in the United States, for example, policies related to access to medical care, privacy, and child health and well-being. No single indicator can assess it directly. Second, these big policy goals are best achieved through the use of multiple measurement strategies since any individual measurement strategy is inevitably flawed. In what follows, the advantages and disadvantages of some of these measurement strategies are described. Later chapters outline specific recommendations for measurement, data collection, and analysis.

EVALUATION MEASUREMENT FOR LOW-FREQUENCY, HIGH-STAKES EVENTS

As of this writing, a chemical, nuclear, or biological terrorist attack is a low-frequency, high-stakes event within the United States. As such, a terrorist attack presents challenges both to the maintenance of crisis capacity and to its improvement. In this respect, MMRS programs are similar to other systems and sets of skills with which it is a challenge to gain sufficient practice. For example, maintaining satisfactory skills for cardiopulmonary resuscitation requires refresher training (Baessler, 2000; Broomfield, 1996).

A central challenge in maintaining crisis capacity in a system that is preparing for a low-frequency event is the need for some form of ongoing feedback to address flaws in the component parts. Other systems have faced similar problems and have discovered various methods to assess preparedness for low-frequency events through the assessment of higher-frequency, more proximal events. For example, worker safety faces this challenge because, thankfully, serious injuries and fatalities do not happen very often in most workplaces in the United States. Therefore, those charged with preventing worker injuries have turned to proxies for accidents, such as monitoring behaviors and conditions on the job that pose a risk of injury. Dangerous behaviors and conditions are then addressed as soon as possible since it is well understood that learning is most effective when the behavior and its consequence are closely paired in time (Feldman, 1985; Komaki et al., 1978; Samways, 1983).

Two recommendations stand out as central:

OCR for page 75
Preparing for Terrorism: Tools for Evaluating the Metropolitan Medical Response System Program For evaluation, as for feedback systems generally, measurement of the more frequent, more proximal indicators is a superior strategy. Therefore, the committee recommends the evaluation of outputs instead of outcomes. When outcomes are available, either in the wake of a true terrorist event or through proxy measures, they need thorough study as a basis for learning and improvement. They are not, however, the basis of an evaluation system. Multiple kinds of indicators are likely to be necessary to give an adequate picture of the performances of an MMRS: Multiple indicators (whether they are outputs, processes, inputs, or proxy measures for actual incidents) permit quantification of the degree of preparedness or the capacities of the MMRS. Low-frequency events such as terrorist attacks offer measures that are not sensitive to real improvements because the level of measurement (disaster versus no disaster) provides far less information than a cumulative examination of proxy incidents over time. Proxy measures are important because substantial preparations may be made before an attack, but without a proxy measure their effectiveness will only become apparent in the aftermath of an attack, too late for corrective action. Indicators and proxy measures describe trends in performance over time. Thus, in keeping with the analogy of worker safety presented above, a supervisor tracking incidents of a worker’s behavior that pose a danger to the worker or others can examine whether his or her approach to improving the safety of employees is working. In the same way, certain indicators recommended later in this report assess the speed with which component systems in a city are notified and involved in drills or situational exercises. 
If the trend is for the time to notification to be reduced, the city MMRS managers will have some confidence in the methods that they have chosen to improve responsiveness. Multiple kinds of indicators and liberal use of proxy measures are important because many and various types of terrorist incidents may occur. Success with dealing with one kind of attack, whether it is real or simulated, offers less assurance than one would like about whether other potential attacks would also be dealt with successfully. However, the use of many different kinds of incidents, events, and simulations as proxies for the variety of potential attacks that may occur offers more information on how these might be handled. EVALUATION MEASUREMENT: PERFORMANCE MEASURES AND PROXIES Provided that several assumptions are accepted, it is feasible to create

OCR for page 75
Preparing for Terrorism: Tools for Evaluating the Metropolitan Medical Response System Program two classes of measures for the MMRS program that are higher in frequency than actual incidents and that can be monitored more closely than an actual incident could. The two classes are proximal measures related to likely performance and proxies for actual incidents. In terms of performance, one of the theories underlying the MMRS program is the assumption that the component parts of the systems (e.g., the police, firefighting personnel, emergency medical services personnel, and epidemiologists) must perform adequately or the MMRS program will not perform adequately. Because the skills of the individuals who make up these component parts are periodically assessed, to the degree that their functions are relevant to the performance of the program in a terrorist incident, these assessments can partially represent the capacity of the MMRS. The performances of the component parts of the MMRS program in response to various actual emergencies and incidents that bear some resemblance to what might occur in a terrorist attack could be assessed. Such incidents range from chemical spills (which would test the capacity to deal with hazardous materials), to medical system responsiveness to flu epidemics, to pranks (e.g., the release of pepper spray at a local mall or the mailing of letters falsely claiming to contain anthrax spores), to the turnaround time until the state epidemiology office is notified about the appearance of a cluster of suspicious cases of an infectious disease. Different components of the system are involved in many such incidents, and the assumption is that the performance of those components is relevant to what might happen in likely terrorist incidents. In this respect, after action reports about the performances of the various components of the system become crucial components for evaluation, especially for local system improvement. 
Furthermore, such monitoring can also provide a means for establishing accountability of the MMRS to OEP, in line with the layering strategy for evaluation outlined above. Accountability can be achieved if OEP is satisfied that MMRS leadership is addressing problems identified either by OEP or by the local MMRS leadership. Alternatively, OEP may want to assess progress and establish a timetable to address a problem with one of the components of the MMRS. It is reasonable to hold the MMRS accountable for progress in addressing potential flaws. It is not reasonable to hold the MMRS accountable for all possible outcomes that might result from the endless variety of potential terrorist scenarios. In general, the use of these proximal measures and proxies will cost less than the use of some of the alternative strategies.

CRITERIA FOR SELECTION OF EVALUATION METHODS

A variety of methods might be of use in evaluating MMRS program cities. The choice of methods should depend on cost, feasibility, and the criteria presented below.

Resources and Skills in the National Program Office

The management improvement approach to evaluation may represent an additional responsibility for the national program office for which resources are lacking or inadequate. Conducting evaluations to spur management improvement, moreover, requires skills that the national MMRS program staff may not currently have. Follow-up technical assistance depends on operational activities that are quite different from the contract writing and fiscal management tasks that characterize much of current program stewardship. Whether these skills are sufficiently developed in the existing national program office is an important question that cannot be answered with the information available.

Developmental Phases

Because an MMRS is typically started from scratch in a particular jurisdiction, because its growth is then nurtured through a process that takes at least several years, and because it must then be sustained at a high level of readiness and competence for an indefinite period, each MMRS can be presumed to go through a series of developmental steps. There may be a high degree of similarity in the developmental phases encountered in each jurisdiction; alternatively, there may be idiosyncratic features in some or all jurisdictions that make it difficult to prescribe or foresee how development will proceed. If evaluations are done across the board on a regular schedule, not all grantees will be at the same stage of development during any particular evaluation period. If evaluations are done for each grantee at specified intervals after a grant is given, the evaluation instrument and process can be designed to be sensitive to the developmental trajectory of the particular grantee.
The Federal Emergency Management Agency’s National Urban Search and Rescue Team program may provide a model for the evaluation of an MMRS at different stages of development. The Urban Search and Rescue Team program uses a three-stage process designed with the idea of a progression of developmental phases. These stages are (1) a self-assessment in which a checklist of equipment and training steps is used, (2) a peer visit by recognized experts in the field who consider a set of operational guidelines, not just a checklist, and (3) a deployment exercise that helps determine whether the urban search and rescue team is ready for action.

Variations Among MMRS Program Cities’ Resources

Evaluations of MMRS program cities must face the problem that some municipalities will never be able to afford the assets that protect other municipalities. In addition, municipalities often have special considerations: vulnerable targets that differ from those of other municipalities. For example, Washington, D.C., needs to anticipate likely attacks on numerous federal facilities and embassies, whereas Baton Rouge, Louisiana, has a variety of chemical plants that are vulnerable to attack. One implication of these problems is that MMRS capacity and its assessment will need to be tailored, at least to some degree. Another implication of the variations in resources available to deal with a terrorist attack is that plans should have a hierarchy of methods to approach an incident; that is, backup plans should be available. These might be assessed at the time of application for the MMRS program contract, they might be suggested at the time of site visits, and they might be tested over time in much the same way that the more preferred plans are.

Timeliness of Feedback

The time between the gathering of evaluation information and the provision of feedback (whether it is formal and conclusive or informal and provisional) should be relatively short. Long delays mean that shortcomings may not be addressed as quickly as they could be or that the conclusions reached may be outmoded by further events or developments that have not been taken into account by the evaluation.

Communications Channels

Evaluations undertaken for management improvement purposes have less defined boundaries than evaluations undertaken for accountability purposes.
One of the key side effects of such evaluations is therefore the establishment and densification of lines of communication from a grantee to other grantees and outside experts. These channels can be used for operational improvement purposes even after the evaluation is nominally complete. To the extent that peer review is used (see Chapter 8), it should be noted that the learning is two-way and is not limited to the jurisdiction being reviewed. Best-practice ideas may emerge and may deserve to be disseminated. The national meetings of MMRS officials could be incorporated in some fashion into the process of communicating insights from the evaluation process.

Measurement Characteristics

Most evaluations are a combination of qualitative and quantitative approaches, each of which brings strengths and weaknesses. Qualitative studies provide greater depth of understanding about a small number of cases or subjects and often identify new variables for study or new relationships among variables, but that understanding may not generalize beyond the few cases studied. Quantitative measures typically provide greater breadth of understanding and, depending on the research design, may allow for strong inferences about causation, but the depth of knowledge will be limited (Cronbach, 1982; Francisco et al., 2001). Both approaches must deal with the issues of reliability (Will the measures yield the same result in the hands of different evaluators?) and validity (Do the measures provide an accurate picture of reality, or make accurate predictions about future events?). The former can generally be assessed by comparing measurements of a small sample of cases by two or more evaluators, and this should be possible for any MMRS evaluation tools as well. Validity testing, on the other hand, relies on comparison of a condition or event predicted by the measurement instrument to an actual condition or event. In the case of MMRS preparedness, the actual event would be an effective response to a large-scale CBR terrorism incident, so validation of any preparedness measurement would depend on the occurrence of a large number of such incidents. In practice, the most important measurement characteristic may be response cost: the money, time, and energy required of the organizations being evaluated. Without the enthusiasm, or at least the willing cooperation, of those organizations, the evaluation is liable to be meaningless.
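
The reliability check described above, in which two or more evaluators measure the same small sample of cases, can be illustrated with a short calculation. The sketch below is only one possible approach and is not drawn from this chapter: it assumes two evaluators have each rated the same eight preparedness items as "met" or "not met" (the ratings are hypothetical), and it summarizes their agreement with Cohen's kappa, a common statistic that corrects raw agreement for the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two evaluators rating the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both evaluators must rate the same non-empty set of cases.")
    n = len(ratings_a)

    # Observed agreement: share of cases where both evaluators gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected agreement by chance, from each evaluator's own rating frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    # If both evaluators used one identical category throughout, kappa is
    # undefined (0/0); this sketch reports that case as perfect agreement.
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight checklist items by two independent evaluators.
evaluator_1 = ["met", "met", "not met", "met", "not met", "met", "met", "not met"]
evaluator_2 = ["met", "not met", "not met", "met", "not met", "met", "met", "met"]

kappa = cohens_kappa(evaluator_1, evaluator_2)
print(f"{kappa:.2f}")  # prints 0.47
```

In this hypothetical sample the evaluators agree on six of eight items (75 percent), but the kappa of about 0.47 shows that a good part of that agreement could arise by chance. A low value would suggest that the instrument's criteria need clearer definitions before it is used across cities.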